commit 356d8c6fb2a7cf49e836742738a8b9a47e77cfea Author: Greg Kroah-Hartman Date: Wed Feb 27 09:22:04 2013 -0800 Linux 3.7.10 commit 314561f4fe77ffad1b560a10cdfb3b6fdc731a74 Author: Mathias Krause Date: Sat Feb 23 01:13:47 2013 +0000 sock_diag: Fix out-of-bounds access to sock_diag_handlers[] [ Upstream commit 6e601a53566d84e1ffd25e7b6fe0b6894ffd79c0 ] Userland can send a netlink message requesting SOCK_DIAG_BY_FAMILY with a family greater or equal then AF_MAX -- the array size of sock_diag_handlers[]. The current code does not test for this condition therefore is vulnerable to an out-of-bound access opening doors for a privilege escalation. Signed-off-by: Mathias Krause Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ca2656dccef64c437d6717468bffd3762b11816e Author: Nicholas Bellinger Date: Tue Feb 19 03:15:14 2013 +0000 target: Fix divide by zero bug in fabric_max_sectors for unconfigured devices commit 7a3cf6ca1ab2a2f7161c6dec5a787fc7a5de864e upstream This patch fixes a possible divide by zero bug when the fabric_max_sectors device attribute is written and backend se_device failed to be successfully configured -> enabled. Go ahead and use block_size=512 within se_dev_set_fabric_max_sectors() in the event of a target_configure_device() failure case, as no valid dev->dev_attrib.block_size value will have been setup yet. Signed-off-by: Nicholas Bellinger Cc: Herton Ronaldo Krzesinski Signed-off-by: Greg Kroah-Hartman commit 732b2a1a4bb58677ba9c7be3fac0960ceccc9b8f Author: Adam Jackson Date: Wed Feb 20 11:54:17 2013 -0500 drm/i915: Fix up mismerge of 3490ea5d in 3.7.y The 3.7.y version of this seems to have missed a hunk in i9xx_update_wm. Tested-by: Glen Gray Signed-off-by: Adam Jackson Signed-off-by: Greg Kroah-Hartman commit 53aa75aa360992e07141bbfa2c6062110e678605 Author: David S. Miller Date: Wed Feb 20 12:38:40 2013 -0800 sparc64: Fix huge PMD to PTE translation for sun4u in TLB miss handler. [ Upstream commit 76968ad2eac6456270353de168b21f04f4b3d1d3 ] When we set the sun4u version of the PTE execute bit, it's: or REG, _PAGE_EXEC_4U, REG _PAGE_EXEC_4U is 0x1000, unfortunately the immedate field of the 'or' instruction is a signed 13-bit value. So the above actually assembles into: or REG, -4096, REG completely corrupting the final PTE value. Set it with a: sethi %hi(_PAGE_EXEC_4U), TMP or REG, TMP, REG sequence instead. This fixes "git gc" crashes on sun4u machines. Reported-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d13cdf91b4ccc961b496a72d80b549dc158f970e Author: David S. Miller Date: Tue Feb 19 22:34:10 2013 -0800 sparc64: Fix tsb_grow() in atomic context. [ Upstream commit 0fbebed682ff2788dee58e8d7f7dda46e33aa10b ] If our first THP installation for an MM is via the set_pmd_at() done during khugepaged's collapsing we'll end up in tsb_grow() trying to do a GFP_KERNEL allocation with several locks held. Simply using GFP_ATOMIC in this situation is not the best option because we really can't have this fail, so we'd really like to keep this an order 0 GFP_KERNEL allocation if possible. Also, doing the TSB allocation from khugepaged is a really bad idea because we'll allocate it potentially from the wrong NUMA node in that context. So what we do is defer the hugepage TSB allocation until the first TLB miss we take on a hugepage. This is slightly tricky because we have to handle two unusual cases: 1) Taking the first hugepage TLB miss in the window trap handler. We'll call the winfix_trampoline when that is detected. 2) An initial TSB allocation via TLB miss races with a hugetlb fault on another cpu running the same MM. We handle this by unconditionally loading the TSB we see into the current cpu even if it's non-NULL at hugetlb_setup time. Reported-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f39d6de4af78fb70abbf7f43a2ecbff2a1e8d795 Author: David S. Miller Date: Tue Feb 19 13:20:08 2013 -0800 sparc64: Handle hugepage TSB being NULL. [ Upstream commit bcd896bae0166b4443503482a26ecf84d9ba60ab ] Accomodate the possibility that the TSB might be NULL at the point that update_mmu_cache() is invoked. This is necessary because we will sometimes need to defer the TSB allocation to the first fault that happens in the 'mm'. Seperate out the hugepage PTE test into a seperate function so that the logic is clearer. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ab3f3a4fbfaec974f34b636d9bf978f4d3cc0b28 Author: David S. Miller Date: Tue Feb 19 12:56:18 2013 -0800 sparc64: Fix gfp_flags setting in tsb_grow(). [ Upstream commit a55ee1ff751f88252207160087d8197bb7538d4c ] We should "|= more_flags" rather than "= more_flags". Reported-by: David Rientjes Acked-by: David Rientjes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1f751f60a108d559a55fab8a916144da0fb1bb88 Author: David S. Miller Date: Wed Feb 13 12:21:06 2013 -0800 sparc64: Fix get_user_pages_fast() wrt. THP. [ Upstream commit 89a77915e0f56dc7b9f9082ba787895b6a83f809 ] Mostly mirrors the s390 logic, as unlike x86 we don't need the SetPageReferenced() bits. On sparc64 we also lack a user/privileged bit in the huge PMDs. In order to make this work for THP and non-THP builds, some header file adjustments were necessary. Namely, provide the PMD_HUGE_* bit defines and the pmd_large() inline unconditionally rather than protected by TRANSPARENT_HUGEPAGE. Reported-by: Aneesh Kumar K.V Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 15c67c392af6a7b1ad5b8b93c7a6613b15f0bfda Author: David S. Miller Date: Wed Feb 13 12:15:08 2013 -0800 sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE. [ Upstream commit b9156ebb7beef015745f917f373abc137efc3400 ] This got missed in the cleanups done for the S390 THP support. CC: Gerald Schaefer Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c8c7c761580de1576c7fe03a8aae5009983107a1 Author: David S. Miller Date: Thu Feb 14 11:49:01 2013 -0800 sunvdc: Fix off-by-one in generic_request(). [ Upstream commit f4d9605434c0fd4cc8639bf25cfc043418c52362 ] The 'operations' bitmap corresponds one-for-one with the operation codes, no adjustment is necessary. Reported-by: Mark Kettenis Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 496064cd2a5ec740671a522e5fa154ce91dfb012 Author: Bob Peterson Date: Fri Feb 1 12:03:02 2013 -0500 GFS2: Get a block reservation before resizing a file commit d2b47cfb26fe06002b8011707baac71a9ae8166f upstream. This patch allocates a block reservation structure before growing or shrinking a file. Without this structure, the grow or shink code can reference the bad pointer. Signed-off-by: Bob Peterson Signed-off-by: Steven Whitehouse Signed-off-by: Greg Kroah-Hartman commit a56de15779ba561c66c9fa04e6339f9a044b2001 Author: David Henningsson Date: Tue Feb 19 16:11:22 2013 +0100 ALSA: hda - hdmi: ELD shouldn't be valid after unplug commit bbfd8a19b6913f50a362457c34d49bfafe5e456e upstream. Currently, eld_valid is never set to false, except at kernel module load time. This patch makes sure that eld is no longer valid when the cable is (hot-)unplugged. Signed-off-by: David Henningsson Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 3eb1912b8557cff58a1fd1497cc4908a0af7fbc3 Author: Fernando Luis Vazquez Cao Date: Tue Feb 12 16:47:44 2013 +0900 ALSA: hda - Workaround for silent output on Sony Vaio VGC-LN51JGB with ALC889 commit 12e31a78c70dc12897fda2489113f445c0e94a18 upstream. Some Vaio all-in-one desktop PCs (for example VGC-LN51JGB) are affected by the same issue that caused Vaio Z laptops to become silent: the speaker pin must be connected to the first DAC even though the codec itself advertises flexible routing through any of the DACs. Use the no-primary-hp fixup for choosing the speaker pin as the primary so that the right DAC is assigned on this device. Signed-off-by: Fernando Luis Vazquez Cao Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 97acde03f1aeaf0470ad481e49a6171a6a025e30 Author: Anssi Hannula Date: Sun Feb 3 17:55:45 2013 +0200 ALSA: hda - Fix default multichannel HDMI mapping regression commit 20608731f479d48be6bcb88e727f360ddf98ddaf upstream. Commit d45e6889ee69456a4d5b1bbb32252f460cd48fa9 ("ALSA: hda - Provide the proper channel mapping for generic HDMI driver") added support for custom channel maps in the HDA HDMI driver. Due to a mistake in an 'if' condition the custom map is always used even when no such map has been set. This causes incorrect channel mapping for multichannel audio by default. Pass per_pin->chmap_set to hdmi_setup_channel_mapping() as a parameter so that it can use it for detecting if a custom map has been set instead of checking if map is NULL (which is never the case). Reported-by: Staffan Lindberg Signed-off-by: Anssi Hannula Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 32a08c1e1115ae59bea13d6e354d0d7fc5fbe2d7 Author: Takashi Iwai Date: Fri Feb 1 14:01:27 2013 +0100 ALSA: hda - Release assigned pin/cvt at error path of hdmi_pcm_open() commit 2ad779b7329d6894a80df94e693e72eaa0d56790 upstream. If the driver detects and invalid ELD, it gives an open error. But it forgot to release the assigned pin, converter and spdif ctls before returning. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d451fb49065183643cc048ab9ee905e5c312219e Author: Pawel Moll Date: Thu Feb 21 01:55:50 2013 +0000 ALSA: usb: Fix Processing Unit Descriptor parsers commit b531f81b0d70ffbe8d70500512483227cc532608 upstream. Commit 99fc86450c439039d2ef88d06b222fd51a779176 "ALSA: usb-mixer: parse descriptors with structs" introduced a set of useful parsers for descriptors. Unfortunately the parses for the Processing Unit Descriptor came with a very subtle bug... Functions uac_processing_unit_iProcessing() and uac_processing_unit_specific() were indexing the baSourceID array forgetting the fields before the iProcessing and process-specific descriptors. The problem was observed with Sound Blaster Extigy mixer, where nNrModes in Up/Down-mix Processing Unit Descriptor was accessed at offset 10 of the descriptor (value 0) instead of offset 15 (value 7). In result the resulting control had interesting limit values: Simple mixer control 'Channel Routing Mode Select',0 Capabilities: volume volume-joined penum Playback channels: Mono Capture channels: Mono Limits: 0 - -1 Mono: -1 [100%] Fixed by starting from the bmControls, which was calculated correctly, instead of baSourceID. Now the mentioned control is fine: Simple mixer control 'Channel Routing Mode Select',0 Capabilities: volume volume-joined penum Playback channels: Mono Capture channels: Mono Limits: 0 - 6 Mono: 0 [0%] Signed-off-by: Pawel Moll Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d910f80d9926cfe4f247d3ead54f037c31b3e398 Author: Clemens Ladisch Date: Thu Jan 31 21:14:33 2013 +0100 ALSA: usb-audio: fix Roland A-PRO support commit 7da58046482fceb17c4a0d4afefd9507ec56de7f upstream. The quirk for the Roland/Cakewalk A-PRO keyboards accidentally used the wrong interface number, which prevented the driver from attaching to the device. Signed-off-by: Clemens Ladisch Signed-off-by: Greg Kroah-Hartman commit da89d452d6552a96216f79a823807b1887ef32bd Author: Tomasz Guszkowski Date: Tue Feb 5 22:10:31 2013 +0100 p54usb: corrected USB ID for T-Com Sinus 154 data II commit 008e33f733ca51acb2dd9d88ea878693b04d1d2a upstream. Corrected USB ID for T-Com Sinus 154 data II. ISL3887-based. The device was tested in managed mode with no security, WEP 128 bit and WPA-PSK (TKIP) with firmware 2.13.1.0.lm87.arm (md5sum: 7d676323ac60d6e1a3b6d61e8c528248). It works. Signed-off-by: Tomasz Guszkowski Acked-By: Christian Lamparter Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit eee7e23e91079b87fe81a90fd4405de90e538a2c Author: Weston Andros Adamson Date: Fri Feb 15 16:03:46 2013 -0500 NFSv4.1: Don't decode skipped layoutgets commit 085b7a45c63d3da5be155faab9249a5cab224561 upstream. layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0). Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean that the RPC was successfully sent, received and parsed. To fix this, use the result's len member to see if parsing took place. This fixes the following OOPS -- calling xdr_init_decode() with a buffer length 0 doesn't set the stream's 'p' member and ends up using uninitialized memory in filelayout_decode_layout. BUG: unable to handle kernel paging request at 0000000000008050 IP: [] memcpy+0x18/0x120 PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq CPU 1 Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib] Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform RIP: 0010:[] [] memcpy+0x18/0x120 RSP: 0018:ffff88003dfab588 EFLAGS: 00010206 RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009 RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000 RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024 R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0 FS: 0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540) Stack: ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0 ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70 000000508116713b 0000000000000000 0000000000000000 0000000000000000 Call Trace: [] ? xdr_inline_decode+0xb1/0x120 [sunrpc] [] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files] [] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files] [] ? __wait_on_bit+0x7e/0x90 Signed-off-by: Weston Andros Adamson Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 58e772eae11a0569b80b8e2eebdee10ce0caa93d Author: Trond Myklebust Date: Tue Feb 19 12:04:42 2013 -0500 NLM: Ensure that we resend all pending blocking locks after a reclaim commit 666b3d803a511fbc9bc5e5ea8ce66010cf03ea13 upstream. Currently, nlmclnt_lock will break out of the for(;;) loop when the reclaimer wakes up the blocking lock thread by setting nlm_lck_denied_grace_period. This causes the lock request to fail with an ENOLCK error. The intention was always to ensure that we resend the lock request after the grace period has expired. Reported-by: Wangyuan Zhang Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 0a41d4134695a7059c1cc5d0ac57f44c9e0a54a1 Author: fanchaoting Date: Mon Feb 4 21:15:02 2013 +0800 umount oops when remove blocklayoutdriver first commit 5a12cca697aca5dfba42a7d4c3356acc0445a2b0 upstream. now pnfs client uses block layout, maybe we can remove blocklayoutdriver first. if we umount later, it can cause oops in unset_pnfs_layoutdriver. because nfss->pnfs_curr_ld->clear_layoutdriver is invalid. reproduce it: modprobe blocklayoutdriver mount -t nfs4 -o minorversion=1 pnfsip:/ /mnt/ rmmod blocklayoutdriver umount /mnt then you can see following CPU 0 Pid: 17023, comm: umount.nfs4 Tainted: GF O 3.7.0-rc6-pnfs #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform RIP: 0010:[] [] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4] RSP: 0018:ffff8800022d9e48 EFLAGS: 00010286 RAX: ffffffffa04a1b00 RBX: ffff88000b013800 RCX: 0000000000000001 RDX: ffffffff81ae8ee0 RSI: ffff880001ee94b8 RDI: ffff88000b013800 RBP: ffff8800022d9e58 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff880001ee9400 R13: ffff8800105978c0 R14: 00007fff25846c08 R15: 0000000001bba550 FS: 00007f45ae7f0700(0000) GS:ffff880012c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffffa04a1b38 CR3: 0000000002c0c000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process umount.nfs4 (pid: 17023, threadinfo ffff8800022d8000, task ffff880006e48aa0) Stack: ffff8800105978c0 ffff88000b013800 ffff8800022d9e78 ffffffffa04cd0ce ffff8800022d9e78 ffff88000b013800 ffff8800022d9ea8 ffffffffa04755a7 ffff8800022d9ea8 ffff880002f96400 ffff88000b013800 ffff880002f96400 Call Trace: [] nfs4_destroy_server+0x1e/0x30 [nfsv4] [] nfs_free_server+0xb7/0x150 [nfs] [] nfs_kill_super+0x35/0x40 [nfs] [] deactivate_locked_super+0x45/0x70 [] deactivate_super+0x4a/0x70 [] mntput_no_expire+0xd2/0x130 [] sys_umount+0x72/0xe0 [] system_call_fastpath+0x16/0x1b Code: 06 e1 b8 ea ff ff ff eb 9e 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 80 03 00 00 48 89 fb 48 85 c0 74 29 <48> 8b 40 38 48 85 c0 74 02 ff d0 48 8b 03 3e ff 48 04 0f 94 c2 RIP [] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4] RSP CR2: ffffffffa04a1b38 ---[ end trace 29f75aaedda058bf ]--- Signed-off-by: fanchaoting Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 6f0b214adfb522285ea42fe697c561c80b79193a Author: Grant Likely Date: Thu Feb 14 18:14:27 2013 +0000 drivercore: Fix ordering between deferred_probe and exiting initcalls commit d72cca1eee5b26e313da2a380d4862924e271031 upstream. One of the side effects of deferred probe is that some drivers which used to be probed before initcalls completed are now happening slightly later. This causes two problems. - If a console driver gets deferred, then it may not be ready when userspace starts. For example, if a uart depends on pinctrl, then the uart will get deferred and /dev/console will not be available - __init sections will be discarded before built-in drivers are probed. Strictly speaking, __init functions should not be called in a drivers __probe path, but there are a lot of drivers (console stuff again) that do anyway. In the past it was perfectly safe to do so because all built-in drivers got probed before the end of initcalls. This patch fixes the problem by forcing the first pass of the deferred list to complete at late_initcall time. This is late enough to catch the drivers that are known to have the above issues. Signed-off-by: Grant Likely Tested-by: Haojian Zhuang Cc: Arnd Bergmann Cc: Russell King Cc: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit fef9589e069e0aafec0f0b378e95baa939bd1272 Author: Jan Beulich Date: Wed Feb 6 10:30:38 2013 -0500 xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}() commit 51ac8893a7a51b196501164e645583bf78138699 upstream. ... as being guest triggerable (e.g. by invoking XEN_PCI_OP_enable_msi{,x} on a device not being MSI/MSI-X capable). This is CVE-2013-0231 / XSA-43. Also make the two messages uniform in both their wording and severity. Signed-off-by: Jan Beulich Acked-by: Ian Campbell Reviewed-by: Konrad Rzeszutek Wilk Signed-off-by: Greg Kroah-Hartman commit 2cc7be2ed22d47ca2bbb3d85ec61ae66d6d78d77 Author: Mel Gorman Date: Fri Feb 22 16:35:59 2013 -0800 mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages commit 67d46b296a1ba1477c0df8ff3bc5e0167a0b0732 upstream. Rob van der Heij reported the following (paraphrased) on private mail. The scenario is that I want to avoid backups to fill up the page cache and purge stuff that is more likely to be used again (this is with s390x Linux on z/VM, so I don't give it as much memory that we don't care anymore). So I have something with LD_PRELOAD that intercepts the close() call (from tar, in this case) and issues a posix_fadvise() just before closing the file. This mostly works, except for small files (less than 14 pages) that remains in page cache after the face. Unfortunately Rob has not had a chance to test this exact patch but the test program below should be reproducing the problem he described. The issue is the per-cpu pagevecs for LRU additions. If the pages are added by one CPU but fadvise() is called on another then the pages remain resident as the invalidate_mapping_pages() only drains the local pagevecs via its call to pagevec_release(). The user-visible effect is that a program that uses fadvise() properly is not obeyed. A possible fix for this is to put the necessary smarts into invalidate_mapping_pages() to globally drain the LRU pagevecs if a pagevec page could not be discarded. The downside with this is that an inode cache shrink would send a global IPI and memory pressure potentially causing global IPI storms is very undesirable. Instead, this patch adds a check during fadvise(POSIX_FADV_DONTNEED) to check if invalidate_mapping_pages() discarded all the requested pages. If a subset of pages are discarded it drains the LRU pagevecs and tries again. If the second attempt fails, it assumes it is due to the pages being mapped, locked or dirty and does not care. With this patch, an application using fadvise() correctly will be obeyed but there is a downside that a malicious application can force the kernel to send global IPIs and increase overhead. If accepted, I would like this to be considered as a -stable candidate. It's not an urgent issue but it's a system call that is not working as advertised which is weak. The following test program demonstrates the problem. It should never report that pages are still resident but will without this patch. It assumes that CPU 0 and 1 exist. int main() { int fd; int pagesize = getpagesize(); ssize_t written = 0, expected; char *buf; unsigned char *vec; int resident, i; cpu_set_t set; /* Prepare a buffer for writing */ expected = FILESIZE_PAGES * pagesize; buf = malloc(expected + 1); if (buf == NULL) { printf("ENOMEM\n"); exit(EXIT_FAILURE); } buf[expected] = 0; memset(buf, 'a', expected); /* Prepare the mincore vec */ vec = malloc(FILESIZE_PAGES); if (vec == NULL) { printf("ENOMEM\n"); exit(EXIT_FAILURE); } /* Bind ourselves to CPU 0 */ CPU_ZERO(&set); CPU_SET(0, &set); if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) { perror("sched_setaffinity"); exit(EXIT_FAILURE); } /* open file, unlink and write buffer */ fd = open("fadvise-test-file", O_CREAT|O_EXCL|O_RDWR); if (fd == -1) { perror("open"); exit(EXIT_FAILURE); } unlink("fadvise-test-file"); while (written < expected) { ssize_t this_write; this_write = write(fd, buf + written, expected - written); if (this_write == -1) { perror("write"); exit(EXIT_FAILURE); } written += this_write; } free(buf); /* * Force ourselves to another CPU. If fadvise only flushes the local * CPUs pagevecs then the fadvise will fail to discard all file pages */ CPU_ZERO(&set); CPU_SET(1, &set); if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) { perror("sched_setaffinity"); exit(EXIT_FAILURE); } /* sync and fadvise to discard the page cache */ fsync(fd); if (posix_fadvise(fd, 0, expected, POSIX_FADV_DONTNEED) == -1) { perror("posix_fadvise"); exit(EXIT_FAILURE); } /* map the file and use mincore to see which parts of it are resident */ buf = mmap(NULL, expected, PROT_READ, MAP_SHARED, fd, 0); if (buf == NULL) { perror("mmap"); exit(EXIT_FAILURE); } if (mincore(buf, expected, vec) == -1) { perror("mincore"); exit(EXIT_FAILURE); } /* Check residency */ for (i = 0, resident = 0; i < FILESIZE_PAGES; i++) { if (vec[i]) resident++; } if (resident != 0) { printf("Nr unexpected pages resident: %d\n", resident); exit(EXIT_FAILURE); } munmap(buf, expected); close(fd); free(vec); exit(EXIT_SUCCESS); } Signed-off-by: Mel Gorman Reported-by: Rob van der Heij Tested-by: Rob van der Heij Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 95558dce307f5ac203cdd15192b8d9f028c0b6c4 Author: Greg Thelen Date: Fri Feb 22 16:36:01 2013 -0800 tmpfs: fix use-after-free of mempolicy object commit 5f00110f7273f9ff04ac69a5f85bb535a4fd0987 upstream. The tmpfs remount logic preserves filesystem mempolicy if the mpol=M option is not specified in the remount request. A new policy can be specified if mpol=M is given. Before this patch remounting an mpol bound tmpfs without specifying mpol= mount option in the remount request would set the filesystem's mempolicy object to a freed mempolicy object. To reproduce the problem boot a DEBUG_PAGEALLOC kernel and run: # mkdir /tmp/x # mount -t tmpfs -o size=100M,mpol=interleave nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=102400k,mpol=interleave:0-3 0 0 # mount -o remount,size=200M nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=204800k,mpol=??? 0 0 # note ? garbage in mpol=... output above # dd if=/dev/zero of=/tmp/x/f count=1 # panic here Panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [< (null)>] (null) [...] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC Call Trace: mpol_shared_policy_init+0xa5/0x160 shmem_get_inode+0x209/0x270 shmem_mknod+0x3e/0xf0 shmem_create+0x18/0x20 vfs_create+0xb5/0x130 do_last+0x9a1/0xea0 path_openat+0xb3/0x4d0 do_filp_open+0x42/0xa0 do_sys_open+0xfe/0x1e0 compat_sys_open+0x1b/0x20 cstar_dispatch+0x7/0x1f Non-debug kernels will not crash immediately because referencing the dangling mpol will not cause a fault. Instead the filesystem will reference a freed mempolicy object, which will cause unpredictable behavior. The problem boils down to a dropped mpol reference below if shmem_parse_options() does not allocate a new mpol: config = *sbinfo shmem_parse_options(data, &config, true) mpol_put(sbinfo->mpol) sbinfo->mpol = config.mpol /* BUG: saves unreferenced mpol */ This patch avoids the crash by not releasing the mempolicy if shmem_parse_options() doesn't create a new mpol. How far back does this issue go? I see it in both 2.6.36 and 3.3. I did not look back further. Signed-off-by: Greg Thelen Acked-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b1bfc78bccc85bd063de5d0733d7531ad3578fe7 Author: Lars-Peter Clausen Date: Thu Feb 21 16:44:04 2013 -0800 drivers/video/backlight/adp88?0_bl.c: fix resume commit 5eb02c01bd1f3ef195989ab05e835e2b0711b5a9 upstream. Clearing the NSTBY bit in the control register also automatically clears the BLEN bit. So we need to make sure to set it again during resume, otherwise the backlight will stay off. Signed-off-by: Lars-Peter Clausen Acked-by: Michael Hennerich Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 89c97cf1a83d2ec5b38164752a7ac2c95e2b5a0d Author: Junxiao Bi Date: Thu Feb 21 16:42:45 2013 -0800 ocfs2: unlock super lock if lockres refresh failed commit 3278bb748d2437eb1464765f36429e5d6aa91c38 upstream. If lockres refresh failed, the super lock will never be released which will cause some processes on other cluster nodes hung forever. Signed-off-by: Junxiao Bi Cc: Joel Becker Cc: Mark Fasheh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 9e3b619d0108cc911ac62eec181945800a4a9114 Author: MITSUNARI Shigeo Date: Thu Feb 21 16:42:01 2013 -0800 fs/block_dev.c: page cache wrongly left invalidated after revalidate_disk() commit 7630b661da330b35dd57b6f5d6d62b386f2dd751 upstream. We found that bdev->bd_invalidated was left set once revalidate_disk() is called, which results in page cache flush every time that device is open. Specifically, we found this problem in MD block device. Once we resize a MD device, mdadm --monitor periodically flush all page cache for that device every 60 or 1000 seconds when it opens the device. This bug lies since at least 3.2.0 till the latest kernel(3.6.2). Patch is attached. The following steps will reproduce the problem. 1. prepair a block device (eg /dev/sdb). 2. create two partitions: sudo parted /dev/sdb mklabel gpt mkpart primary 0% 50% mkpart primary 50% 100% 3. create a md device. sudo mdadm -C /dev/md/hoge -l 1 -n 2 -e 1.2 --assume-clean --auto=md --symlink=no /dev/sdb1 /dev/sdb2 4. create file system and mount it sudo mkfs.ext3 /dev/md/hoge sudo mkdir /mnt/test sudo mount /dev/md/hoge /mnt/test 5. try to resize the device sudo mdadm -G /dev/md/hoge --size=max 6. create a file to fill file cache. sudo dd if=/dev/urandom of=/mnt/test/data bs=1M count=10 and verify the current status of file by free command. 7. mdadm monitor will open the md device every 1000 seconds and you will find all file cache on the device are cleared. The timing can be reduced by the following steps. a) kill mdadm and restart it with --delay option /sbin/mdadm --monitor --delay=30 --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog or open the md device directly. sudo dd if=/dev/md/hoge of=/dev/null bs=4096 count=1 Signed-off-by: MITSUNARI Shigeo Cc: Al Viro Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 1ec442c25a6a2b10ec45d7436fc2471dfc65d017 Author: Jim Somerville Date: Thu Feb 21 16:41:59 2013 -0800 inotify: remove broken mask checks causing unmount to be EINVAL commit 676a0675cf9200ac047fb50825f80867b3bb733b upstream. Running the command: inotifywait -e unmount /mnt/disk immediately aborts with a -EINVAL return code. This is however a valid parameter. This abort occurs only if unmount is the sole event parameter. If other event parameters are supplied, then the unmount event wait will work. The problem was introduced by commit 44b350fc23e ("inotify: Fix mask checks"). In that commit, it states: The mask checks in inotify_update_existing_watch() and inotify_new_watch() are useless because inotify_arg_to_mask() sets FS_IN_IGNORED and FS_EVENT_ON_CHILD bits anyway. But instead of removing the useless checks, it did this: mask = inotify_arg_to_mask(arg); - if (unlikely(!mask)) + if (unlikely(!(mask & IN_ALL_EVENTS))) return -EINVAL; The problem is that IN_ALL_EVENTS doesn't include IN_UNMOUNT, and other parts of the code keep IN_UNMOUNT separate from IN_ALL_EVENTS. So the check should be: if (unlikely(!(mask & (IN_ALL_EVENTS | IN_UNMOUNT)))) But inotify_arg_to_mask(arg) always sets the IN_UNMOUNT bit in the mask anyway, so the check is always going to pass and thus should simply be removed. Also note that inotify_arg_to_mask completely controls what mask bits get set from arg, there's no way for invalid bits to get enabled there. Lets fix it by simply removing the useless broken checks. Signed-off-by: Jim Somerville Signed-off-by: Paul Gortmaker Cc: Jerome Marchand Cc: John McCutchan Cc: Robert Love Cc: Eric Paris Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4c51f3e0edde752f2f85b9b84f8201c0be6f5337 Author: Thomas Gleixner Date: Mon Feb 18 09:52:08 2013 +0100 futex: Revert "futex: Mark get_robust_list as deprecated" commit fe2b05f7ca9f906be61dced5489f63b8b4d7c770 upstream. This reverts commit ec0c4274e33c0373e476b73e01995c53128f1257. get_robust_list() is in use and a removal would break existing user space. With the permission checks in place it's not longer a security hole. Remove the deprecation warnings. Signed-off-by: Thomas Gleixner Cc: Cyrill Gorcunov Cc: Richard Weinberger Cc: akpm@linux-foundation.org Cc: paul.gortmaker@windriver.com Cc: davej@redhat.com Cc: keescook@chromium.org Cc: ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman commit 0872527e83ef693cbdf0edab72f24e360539f65b Author: Christian Borntraeger Date: Fri Jan 25 15:34:15 2013 +0100 s390/kvm: Fix store status for ACRS/FPRS commit 15bc8d8457875f495c59d933b05770ba88d1eacb upstream. On store status we need to copy the current state of registers into a save area. Currently we might save stale versions: The sie state descriptor doesnt have fields for guest ACRS,FPRS, those registers are simply stored in the host registers. The host program must copy these away if needed. We do that in vcpu_put/load. If we now do a store status in KVM code between vcpu_put/load, the saved values are not up-to-date. Lets collect the ACRS/FPRS before saving them. This also fixes some strange problems with hotplug and virtio-ccw, since the low level machine check handler (on hotplug a machine check will happen) will revalidate all registers with the content of the save area. Signed-off-by: Christian Borntraeger Signed-off-by: Gleb Natapov Signed-off-by: Greg Kroah-Hartman commit 2241467e3f4d1f8f9c4720b41f70231df59f3518 Author: Cornelia Huck Date: Fri Dec 14 17:02:16 2012 +0100 KVM: s390: Handle hosts not supporting s390-virtio. commit 55c171a6d90dc0574021f9c836127cfd1a7d2e30 upstream. Running under a kvm host does not necessarily imply the presence of a page mapped above the main memory with the virtio information; however, the code includes a hard coded access to that page. Instead, check for the presence of the page and exit gracefully before we hit an addressing exception if it does not exist. Reviewed-by: Marcelo Tosatti Reviewed-by: Alexander Graf Signed-off-by: Cornelia Huck Signed-off-by: Gleb Natapov Signed-off-by: Greg Kroah-Hartman commit cf63242fbe56cd6ea1cfb7d3536013eb549ed331 Author: Robin Holt Date: Fri Feb 22 16:35:34 2013 -0800 mmu_notifier_unregister NULL Pointer deref and multiple ->release() callouts commit 751efd8610d3d7d67b7bdf7f62646edea7365dd7 upstream. There is a race condition between mmu_notifier_unregister() and __mmu_notifier_release(). Assume two tasks, one calling mmu_notifier_unregister() as a result of a filp_close() ->flush() callout (task A), and the other calling mmu_notifier_release() from an mmput() (task B). A B t1 srcu_read_lock() t2 if (!hlist_unhashed()) t3 srcu_read_unlock() t4 srcu_read_lock() t5 hlist_del_init_rcu() t6 synchronize_srcu() t7 srcu_read_unlock() t8 hlist_del_rcu() <--- NULL pointer deref. Additionally, the list traversal in __mmu_notifier_release() is not protected by the by the mmu_notifier_mm->hlist_lock which can result in callouts to the ->release() notifier from both mmu_notifier_unregister() and __mmu_notifier_release(). -stable suggestions: The stable trees prior to 3.7.y need commits 21a92735f660 and 70400303ce0c cherry-picked in that order prior to cherry-picking this commit. The 3.7.y tree already has those two commits. Signed-off-by: Robin Holt Cc: Andrea Arcangeli Cc: Wanpeng Li Cc: Xiao Guangrong Cc: Avi Kivity Cc: Hugh Dickins Cc: Marcelo Tosatti Cc: Sagi Grimberg Cc: Haggai Eran Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4ea103ddc124c66eec8df1d31566e9bda76b4b96 Author: Bjorn Helgaas Date: Tue Jan 29 16:44:27 2013 -0700 Driver core: treat unregistered bus_types as having no devices commit 4fa3e78be7e985ca814ce2aa0c09cbee404efcf7 upstream. A bus_type has a list of devices (klist_devices), but the list and the subsys_private structure that contains it are not initialized until the bus_type is registered with bus_register(). The panic/reboot path has fixups that look up devices in pci_bus_type. If we panic before registering pci_bus_type, the bus_type exists but the list does not, so mach_reboot_fixups() trips over a null pointer and panics again: mach_reboot_fixups pci_get_device .. bus_find_device(&pci_bus_type, ...) bus->p is NULL Joonsoo reported a problem when panicking before PCI was initialized. I think this patch should be sufficient to replace the patch he posted here: https://lkml.org/lkml/2012/12/28/75 ("[PATCH] x86, reboot: skip reboot_fixups in early boot phase") Reported-by: Joonsoo Kim Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit 00d0e81db8dd3751a437199f3a2105df57308ff7 Author: Minchan Kim Date: Wed Jan 30 11:41:39 2013 +0900 zram: Fix deadlock bug in partial read/write commit 7e5a5104c6af709a8d97d5f4711e7c917761d464 upstream. Now zram allocates new page with GFP_KERNEL in zram I/O path if IO is partial. Unfortunately, It may cause deadlock with reclaim path like below. write_page from fs fs_lock allocation(GFP_KERNEL) reclaim pageout write_page from fs fs_lock <-- deadlock This patch fixes it by using GFP_NOIO. In read path, we reorganize code flow so that kmap_atomic is called after the GFP_NOIO allocation. Acked-by: Jerome Marchand Acked-by: Nitin Gupta [ penberg@kernel.org: don't use GFP_ATOMIC ] Signed-off-by: Pekka Enberg Signed-off-by: Minchan Kim Signed-off-by: Greg Kroah-Hartman commit 0cb034ded5f80d5f549ffb4b3e13fa6f2dcde951 Author: Wei Liu Date: Mon Feb 18 14:57:58 2013 +0000 xen: close evtchn port if binding to irq fails commit e7e44e444876478d50630f57b0c31d29f6725020 upstream. Signed-off-by: Wei Liu Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Greg Kroah-Hartman commit ae2599d5acd2bae89ef41dbd2978ab773e03b0d9 Author: Stefan Bader Date: Fri Feb 15 09:48:52 2013 +0100 xen: Send spinlock IPI to all waiters commit 76eaca031f0af2bb303e405986f637811956a422 upstream. There is a loophole between Xen's current implementation of pv-spinlocks and the scheduler. This was triggerable through a testcase until v3.6 changed the TLB flushing code. The problem potentially is still there just not observable in the same way. What could happen was (is): 1. CPU n tries to schedule task x away and goes into a slow wait for the runq lock of CPU n-# (must be one with a lower number). 2. CPU n-#, while processing softirqs, tries to balance domains and goes into a slow wait for its own runq lock (for updating some records). Since this is a spin_lock_irqsave in softirq context, interrupts will be re-enabled for the duration of the poll_irq hypercall used by Xen. 3. Before the runq lock of CPU n-# is unlocked, CPU n-1 receives an interrupt (e.g. endio) and when processing the interrupt, tries to wake up task x. But that is in schedule and still on_cpu, so try_to_wake_up goes into a tight loop. 4. The runq lock of CPU n-# gets unlocked, but the message only gets sent to the first waiter, which is CPU n-# and that is busily stuck. 5. CPU n-# never returns from the nested interruption to take and release the lock because the scheduler uses a busy wait. And CPU n never finishes the task migration because the unlock notification only went to CPU n-#. To avoid this and since the unlocking code has no real sense of which waiter is best suited to grab the lock, just send the IPI to all of them. This causes the waiters to return from the hyper- call (those not interrupted at least) and do active spinlocking. BugLink: http://bugs.launchpad.net/bugs/1011792 Acked-by: Jan Beulich Signed-off-by: Stefan Bader Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Greg Kroah-Hartman commit d86fa025b333c72656b3342ea7089b5b278d5d39 Author: Nicolas Pitre Date: Sun Feb 24 20:06:09 2013 -0500 tty vt: fix character insertion overflow commit a883b70d8e0a88278c0a1f80753b4dc99962b541 upstream. Commit 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition") broke insert_char() in multiple ways. Then commit b1a925f44a3a ("tty vt: Fix a regression in command line edition") partially fixed it. However, the buffer being moved is still too large and overflowing beyond the end of the current line, corrupting existing characters on the next line. Example test case: echo -e "abc\nde\x1b[A\x1b[4h \x1b[4l\x1b[B" Expected result: ab c de Current result: ab c e Needless to say that this is very annoying when inserting words in the middle of paragraphs with certain text editors. Signed-off-by: Nicolas Pitre Acked-by: Jean-François Moine Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d3b1b29819924e0224648d9c33bb15ec768cfe33 Author: Jussi Kivilinna Date: Mon Feb 18 10:29:30 2013 +0200 rtlwifi: usb: allocate URB control message setup_packet and data buffer separately commit bc6b89237acb3dee6af6e64e51a18255fef89cc2 upstream. rtlwifi allocates both setup_packet and data buffer of control message urb, using shared kmalloc in _usbctrl_vendorreq_async_write. Structure used for allocating is: struct { u8 data[254]; struct usb_ctrlrequest dr; }; Because 'struct usb_ctrlrequest' is __packed, setup packet is unaligned and DMA mapping of both 'data' and 'dr' confuses ARM/sunxi, leading to memory corruptions and freezes. Patch changes setup packet to be allocated separately. [v2]: - Use WARN_ON_ONCE instead of WARN_ON Signed-off-by: Jussi Kivilinna Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 25813769478011277a394700e001b91892d393a5 Author: Larry Finger Date: Fri Feb 8 12:28:18 2013 -0600 rtlwifi: rtl8192cu: Add new USB ID commit 8708aac79e4572ba673d7a21e94ddca9f3abb7fc upstream. A new model of the RTL8188CUS has appeared. Reported-and-tested-by: Thomas Rosenkrantz Signed-off-by: Larry Finger Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 5ef7f3fcbd361463db3e205484343df0d64c6b31 Author: Larry Finger Date: Sun Feb 17 17:01:20 2013 +0000 b43: Increase number of RX DMA slots commit ccae0e50c16a7f7adb029c169147400d1ce9f703 upstream. Bastian Bittorf reported that some of the silent freezes on a Linksys WRT54G were due to overflow of the RX DMA ring buffer, which was created with 64 slots. That finding reminded me that I was seeing similar crashed on a netbook, which also has a relatively slow processor. After increasing the number of slots to 128, runs on the netbook that previously failed now worked; however, I found that 109 slots had been used in one test. For that reason, the number of slots is being increased to 256. Signed-off-by: Larry Finger Cc: Bastian Bittorf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3f1aaeede7bdd955b530522fe15cb663ee4f24d8 Author: Michael Chan Date: Tue Jan 29 17:54:44 2013 -0800 serial_core: Fix type definition for PORT_BRCM_TRUMANAGE. commit 85f024401bf80746ae08b7fd5809a9b16accf0b1 upstream. It was mistakenly defined to be 24 instead of the next higher number 25. Reported-by: Alexander Shishkin Cc: Stephen Hurd Signed-off-by: Michael Chan Signed-off-by: Greg Kroah-Hartman commit cf9b490b5f6137023d01cf284b3fbe2a5bf5190a Author: Thomas Gleixner Date: Thu Feb 14 21:01:06 2013 +0100 serial: imx: Fix recursive locking bug commit 677fe555cbfb188af58cce105f4dae9505e58c31 upstream. commit 9ec1882df2 (tty: serial: imx: console write routing is unsafe on SMP) introduced a recursive locking bug in imx_console_write(). The callchain is: imx_rxint() spin_lock_irqsave(&sport->port.lock,flags); ... uart_handle_sysrq_char(); sysrq_function(); printk(); imx_console_write(); spin_lock_irqsave(&sport->port.lock,flags); <--- DEAD The bad news is that the kernel debugging facilities can dectect the problem, but the printks never surface on the serial console for obvious reasons. There is a similar issue with oops_in_progress. If the kernel crashes we really don't want to be stuck on the lock and unable to tell what happened. In general most UP originated drivers miss these checks and nobody ever notices because CONFIG_PROVE_LOCKING seems to be still ignored by a large number of developers. The solution is to avoid locking in the sysrq case and trylock in the oops_in_progress case. This scheme is used in other drivers as well and it would be nice if we could move this to a common place, so the usual copy/paste/modify bugs can be avoided. Now there is another issue with this scheme: CPU0 CPU1 printk() rxint() sysrq_detection() -> sets port->sysrq return from interrupt console_write() if (port->sysrq) avoid locking port->sysrq is reset with the next receive character. So as long as the port->sysrq is not reset and this can take an endless amount of time if after the break no futher receive character follows, all console writes happen unlocked. While the current writer is protected against other console writers by the console sem, it's unprotected against open/close or other operations which fiddle with the port. That's what the above mentioned commit tried to solve. That's an issue in all drivers which use that scheme and unfortunately there is no easy workaround. The only solution is to have a separate indicator port->sysrq_cpu. uart_handle_sysrq_char() then sets it to smp_processor_id() before calling into handle_sysrq() and resets it to -1 after that. Then change the locking check to: if (port->sysrq_cpu == smp_processor_id()) locked = 0; else if (oops_in_progress) locked = spin_trylock_irqsave(port->lock, flags); else spin_lock_irqsave(port->lock, flags); That would force all other cpus into the spin_lock path. Problem solved, but that's way beyond the scope of this fix and really wants to be implemented in a common function which calls the uart specific write function to avoid another gazillion of hard to debug copy/paste/modify bugs. Reported-and-tested-by: Tim Sander Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 96e2118869ca0134bd7370c060a36b8b4f722395 Author: Johan Hovold Date: Wed Feb 13 17:53:28 2013 +0100 USB: serial: fix null-pointer dereferences on disconnect commit b2ca699076573c94fee9a73cb0d8645383b602a0 upstream. Make sure serial-driver dtr_rts is called with disc_mutex held after checking the disconnected flag. Due to a bug in the tty layer, dtr_rts may get called after a device has been disconnected and the tty-device unregistered. Some drivers have had individual checks for disconnect to make sure the disconnected interface was not accessed, but this should really be handled in usb-serial core (at least until the long-standing tty-bug has been fixed). Note that the problem has been made more acute with commit 0998d0631001 ("device-core: Ensure drvdata = NULL when no driver is bound") as the port data is now also NULL when dtr_rts is called resulting in further oopses. Reported-by: Chris Ruehl Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 7b5659b0d4287dbef7e6808bb9e5bef1e6e86abe Author: Oleg Nesterov Date: Tue Jan 29 20:07:41 2013 +0100 tty: set_termios/set_termiox should not return -EINTR commit 183d95cdd834381c594d3aa801c1f9f9c0c54fa9 upstream. See https://bugzilla.redhat.com/show_bug.cgi?id=904907 read command causes bash to abort with double free or corruption (out). A simple test-case from Roman: // Compile the reproducer and send sigchld ti that process. // EINTR occurs even if SA_RESTART flag is set. void handler(int sig) { } main() { struct sigaction act; act.sa_handler = handler; act.sa_flags = SA_RESTART; sigaction (SIGCHLD, &act, 0); struct termio ttp; ioctl(0, TCGETA, &ttp); while(1) { if (ioctl(0, TCSETAW, ttp) < 0) { if (errno == EINTR) { fprintf(stderr, "BUG!"); return(1); } } } } Change set_termios/set_termiox to return -ERESTARTSYS to fix this particular problem. I didn't dare to change other EINTR's in drivers/tty/, but they look equally wrong. Reported-by: Roman Rakus Reported-by: Lingzhu Xiang Signed-off-by: Oleg Nesterov Cc: Jiri Slaby Signed-off-by: Greg Kroah-Hartman commit 9a484d2d694b90c658ee99d54d227f1d7e2e5a75 Author: Dirkjan Bussink Date: Wed Jan 30 11:44:50 2013 +0100 tty: Prevent deadlock in n_gsm driver commit 4d9b109060f690f5c835130ff54165ae157b3087 upstream. This change fixes a deadlock when the multiplexer is closed while there are still client side ports open. When the multiplexer is closed and there are active tty's it tries to close them with tty_vhangup. This has a problem though, because tty_vhangup needs the tty_lock. This patch changes it to unlock the tty_lock before attempting the hangup and relocks afterwards. The additional call to tty_port_tty_set is needed because otherwise the port stays active because of the reference counter. This change also exposed another problem that other code paths don't expect that the multiplexer could have been closed. This patch also adds checks for these cases in the gsmtty_ class of function that could be called. The documentation explicitly states that "first close all virtual ports before closing the physical port" but we've found this to not always reality in our field situations. The GPRS / UTMS modem sometimes crashes and needs a power cycle in that case which means cleanly shutting down everything is not always possible. This change makes it much more robust for our situation where at least the system is recoverable with this patch and doesn't hang in a deadlock situation inside the kernel. The patch is against the long term support kernel (3.4.27) and should apply cleanly to more recent branches. Tested with a Telit GE864-QUADV2 and Telit HE910 modem. Signed-off-by: Dirkjan Bussink Signed-off-by: Greg Kroah-Hartman commit 9e7f31fcfc861560748ec523d26c53ef9bf812ba Author: Denis Efremov Date: Mon Feb 11 19:04:06 2013 +0400 ALSA: rme32.c irq enabling after spin_lock_irq commit f49a59c4471d81a233e09dda45187cc44fda009d upstream. According to the other code in this driver and similar code in rme96 it seems, that spin_lock_irq in snd_rme32_capture_close function should be paired with spin_unlock_irq. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 4c38f4f53ce49902ccac03d031255c11f5317225 Author: Denis Efremov Date: Mon Feb 11 19:49:48 2013 +0400 ALSA: ali5451: remove irq enabling in pointer callback commit dacae5a19b4cbe1b5e3a86de23ea74cbe9ec9652 upstream. snd_ali_pointer function is called with local interrupts disabled. However it seems very strange to reenable them in such way. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 97c6309382a9ed6bda8abfef7fa8f748779b3df7 Author: Konstantin Khlebnikov Date: Thu Jan 24 16:36:31 2013 +0400 workqueue: un-GPL function delayed_work_timer_fn() commit 1438ade5670b56d5386c220e1ad4b5a824a1e585 upstream. commit d8e794dfd51c368ed3f686b7f4172830b60ae47b ("workqueue: set delayed_work->timer function on initialization") exports function delayed_work_timer_fn() only for GPL modules. This makes delayed-works unusable for non-GPL modules, because initialization macro now requires GPL symbol. For example schedule_delayed_work() available for non-GPL. Signed-off-by: Konstantin Khlebnikov Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit cb2dec77f98bc2b92d60ffb9b2cb702ca85adda3 Author: Olaf Hering <[mailto:olaf@aepfle.de]> Date: Sun Feb 3 17:22:37 2013 -0800 x86: Hyper-V: register clocksource only if its advertised commit 32068f6527b8f1822a30671dedaf59c567325026 upstream. Enable hyperv_clocksource only if its advertised as a feature. XenServer 6 returns the signature which is checked in ms_hyperv_platform(), but it does not offer all features. Currently the clocksource is enabled unconditionally in ms_hyperv_init_platform(), and the result is a hanging guest. Hyper-V spec Bit 1 indicates the availability of Partition Reference Counter. Register the clocksource only if this bit is set. The guest in question prints this in dmesg: [ 0.000000] Hypervisor detected: Microsoft HyperV [ 0.000000] HyperV: features 0x70, hints 0x0 This bug can be reproduced easily be setting 'viridian=1' in a HVM domU .cfg file. A workaround without this patch is to boot the HVM guest with 'clocksource=jiffies'. Signed-off-by: Olaf Hering Link: http://lkml.kernel.org/r/1359940959-32168-1-git-send-email-kys@microsoft.com Signed-off-by: K. Y. Srinivasan Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit 682c421f6757fab1d2721852de6a29f0496db460 Author: Leonid Shatz Date: Mon Feb 4 14:33:37 2013 +0200 hrtimer: Prevent hrtimer_enqueue_reprogram race commit b22affe0aef429d657bc6505aacb1c569340ddd2 upstream. hrtimer_enqueue_reprogram contains a race which could result in timer.base switch during unlock/lock sequence. hrtimer_enqueue_reprogram is releasing the lock protecting the timer base for calling raise_softirq_irqsoff() due to a lock ordering issue versus rq->lock. If during that time another CPU calls __hrtimer_start_range_ns() on the same hrtimer, the timer base might switch, before the current CPU can lock base->lock again and therefor the unlock_timer_base() call will unlock the wrong lock. [ tglx: Added comment and massaged changelog ] Signed-off-by: Leonid Shatz Signed-off-by: Izik Eidus Cc: Andrea Arcangeli Link: http://lkml.kernel.org/r/1359981217-389-1-git-send-email-izik.eidus@ravellosystems.com Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 15ec80f4bcf415309745fa77b1db9564ecedac67 Author: Stanislaw Gruszka Date: Fri Feb 15 11:08:11 2013 +0100 posix-cpu-timers: Fix nanosleep task_struct leak commit e6c42c295e071dd74a66b5a9fcf4f44049888ed8 upstream. The trinity fuzzer triggered a task_struct reference leak via clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls posic_cpu_timer_create(), but misses a corresponding posix_cpu_timer_del() which leads to the task_struct reference leak. Reported-and-tested-by: Tommi Rantala Signed-off-by: Stanislaw Gruszka Cc: Dave Jones Cc: John Stultz Cc: Oleg Nesterov Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 54b292f6f51ebdf214d68a9d5e9b7e65354d9b62 Author: Thomas Gleixner Date: Fri Nov 23 10:08:44 2012 +0100 genirq: Avoid deadlock in spurious handling commit e716efde75267eab919cdb2bef5b2cb77f305326 upstream. commit 52553ddf(genirq: fix regression in irqfixup, irqpoll) introduced a potential deadlock by calling the action handler with the irq descriptor lock held. Remove the call and let the handling code run even for an interrupt where only a single action is registered. That matches the goal of the above commit and avoids the deadlock. Document the confusing action = desc->action reload in the handling loop while at it. Reported-and-tested-by: "Wang, Warner" Tested-by: Edward Donovan Cc: "Wang, Song-Bo (Stoney)" Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 83b53b8cc693fe3d1216f21796da009fc8809e9f Author: H. Peter Anvin Date: Thu Feb 7 17:14:08 2013 -0800 timeconst.pl: Eliminate Perl warning commit 63a3f603413ffe82ad775f2d62a5afff87fd94a0 upstream. defined(@array) is deprecated in Perl and gives off a warning. Restructure the code to remove that warning. [ hpa: it would be interesting to revert to the timeconst.bc script. It appears that the failures reported by akpm during testing of that script was due to a known broken version of make, not a problem with bc. The Makefile rules could probably be restructured to avoid the make bug, or it is probably old enough that it doesn't matter. ] Reported-by: Andi Kleen Signed-off-by: H. Peter Anvin Cc: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 51fb74af376cb553293ba03237a1cae4f4a6cf3c Author: Linus Torvalds Date: Mon Feb 18 09:58:02 2013 -0800 mm: fix pageblock bitmap allocation commit 7c45512df987c5619db041b5c9b80d281e26d3db upstream. Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx calculation") fixed out calculation of the index into the pageblock bitmap when a !SPARSEMEM zome was not aligned to pageblock_nr_pages. However, the _allocation_ of that bitmap had never taken this alignment requirement into accout, so depending on the exact size and alignment of the zone, the use of that index could then access past the allocation, resulting in some very subtle memory corruption. This was reported (and bisected) by Ingo Molnar: one of his random config builds would hang with certain very specific kernel command line options. In the meantime, commit c060f943d092 has been marked for stable, so this fix needs to be back-ported to the stable kernels that backported the commit to use the right alignment. Bisected-and-tested-by: Ingo Molnar Acked-by: Mel Gorman Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 3b2f159aa7ab05890e521778bf085507b08f575b Author: Jiri Olsa Date: Sat Oct 20 22:14:10 2012 +0200 perf hists: Fix period symbol_conf.field_sep display commit c0d246b85fc7d42688d7a5d999ea671777caf65b upstream. Currently we don't properly display hist data with symbol_conf.field_sep separator. We need to display either space or separator. Signed-off-by: Jiri Olsa Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Paul Mackerras Cc: Corey Ashford Cc: Frederic Weisbecker Cc: Namhyung Kim Link: http://lkml.kernel.org/n/tip-cyggwys0bz5kqdowwvfd8h72@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo Cc: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit 091c211bd466af959767efabf1edd3f6efd74440 Author: Vinson Lee Date: Wed Feb 13 13:48:58 2013 -0800 perf tools: Fix build with bison 2.3 and older. commit 85df3b3769222894e9692b383c7af124b7721086 upstream. The %name-prefix "prefix" syntax is not available on bison 2.3 and older. Substitute with the -p "prefix" command-line option for compatibility with older versions of bison. This patch fixes this build error with older versions of bison. CC util/sysfs.o BISON util/pmu-bison.c util/pmu.y:2.14-24: syntax error, unexpected string, expecting = make: *** [util/pmu-bison.c] Error 1 Signed-off-by: Vinson Lee Tested-by: Li Zefan Cc: Ingo Molnar Cc: Jiri Olsa Cc: Li Zefan Cc: Namhyung Kim Cc: Paul Mackerras Cc: Pekka Enberg Link: http://lkml.kernel.org/r/1360792138-29186-1-git-send-email-vlee@twitter.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 0d269c9ac33124bd9eb6137be39b559bd4724fd2 Author: H. Peter Anvin Date: Thu Jan 31 14:00:48 2013 -0800 x86-32, mm: Remove reference to alloc_remap() commit 07f4207a305c834f528d08428df4531744e25678 upstream. We have removed the remap allocator for x86-32, and x86-64 never had it (and doesn't need it). Remove residual reference to it. Reported-by: Yinghai Lu Signed-off-by: H. Peter Anvin Cc: Dave Hansen Link: http://lkml.kernel.org/r/CAE9FiQVn6_QZi3fNQ-JHYiR-7jeDJ5hT0SyT_%2BzVvfOj=PzF3w@mail.gmail.com Signed-off-by: Greg Kroah-Hartman commit 5d0acfb94c9fd79ec3ea10c249b8e707728f20fa Author: H. Peter Anvin Date: Thu Jan 31 13:53:10 2013 -0800 x86-32, mm: Remove reference to resume_map_numa_kva() commit bb112aec5ee41427e9b9726e3d57b896709598ed upstream. Remove reference to removed function resume_map_numa_kva(). Signed-off-by: H. Peter Anvin Cc: Dave Hansen Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com Signed-off-by: Greg Kroah-Hartman commit ee53f7173bdeb9c1517c27299dd84d0d4f5e8b38 Author: Dave Hansen Date: Wed Jan 30 16:56:16 2013 -0800 x86-32, mm: Rip out x86_32 NUMA remapping code commit f03574f2d5b2d6229dcdf2d322848065f72953c7 upstream. This code was an optimization for 32-bit NUMA systems. It has probably been the cause of a number of subtle bugs over the years, although the conditions to excite them would have been hard to trigger. Essentially, we remap part of the kernel linear mapping area, and then sometimes part of that area gets freed back in to the bootmem allocator. If those pages get used by kernel data structures (say mem_map[] or a dentry), there's no big deal. But, if anyone ever tried to use the linear mapping for these pages _and_ cared about their physical address, bad things happen. For instance, say you passed __GFP_ZERO to the page allocator and then happened to get handed one of these pages, it zero the remapped page, but it would make a pte to the _old_ page. There are probably a hundred other ways that it could screw with things. We don't need to hang on to performance optimizations for these old boxes any more. All my 32-bit NUMA systems are long dead and buried, and I probably had access to more than most people. This code is causing real things to break today: https://lkml.org/lkml/2013/1/9/376 I looked in to actually fixing this, but it requires surgery to way too much brittle code, as well as stuff like per_cpu_ptr_to_phys(). [ hpa: Cc: this for -stable, since it is a memory corruption issue. However, an alternative is to simply mark NUMA as depends BROKEN rather than EXPERIMENTAL in the X86_32 subclause... ] Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin Signed-off-by: Greg Kroah-Hartman commit 56c8d200f7974c2d2bef905ef3d67f6f3eaba71c Author: Marcin Slusarz Date: Mon Dec 10 21:30:51 2012 +0100 drm/nouveau/vm: fix memory corruption when pgt allocation fails commit cfd376b6bfccf33782a0748a9c70f7f752f8b869 upstream. If we return freed vm, nouveau_drm_open will happily call nouveau_cli_destroy, which will try to free it again. Reported-by: Peter Hurley Signed-off-by: Marcin Slusarz Signed-off-by: Ben Skeggs Signed-off-by: Greg Kroah-Hartman