commit de84d7b22bd4690a1f3d3e36fb938ec2284f0b13
Author: Greg Kroah-Hartman
Date: Wed Jul 13 05:31:47 2011 +0200

Linux 2.6.33.16

commit 7cf46ba5d43e2596e559df7988eaf7ada966a6c8
Author: Miklos Szeredi
Date: Wed Feb 23 13:49:47 2011 +0100

mm: prevent concurrent unmap_mapping_range() on the same inode

commit 2aa15890f3c191326678f1bd68af61ec6b8753ec upstream.

Michael Leun reported that running parallel opens on a fuse filesystem can trigger a "kernel BUG at mm/truncate.c:475".

Gurudas Pai reported the same bug on NFS.

The reason is, unmap_mapping_range() is not prepared for more than one concurrent invocation per inode. For example:

thread1: going through a big range, stops in the middle of a vma and stores the restart address in vm_truncate_count.

thread2: comes in with a small (e.g. single page) unmap request on the same vma, somewhere before restart_address, finds that the vma was already unmapped up to the restart address and happily returns without doing anything.

Another scenario would be two big unmap requests, both having to restart the unmapping and each one setting vm_truncate_count to its own value. This could go on forever without any of them being able to finish.

Truncate and hole punching already serialize with i_mutex. Other callers of unmap_mapping_range() do not, and it's difficult to get i_mutex protection for all callers. In particular ->d_revalidate(), which calls invalidate_inode_pages2_range() in fuse, may be called with or without i_mutex.

This patch adds a new mutex to 'struct address_space' to prevent running multiple concurrent unmap_mapping_range() on the same mapping.

[ We'll hopefully get rid of all this with the upcoming mm preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex lockbreak" patch in particular.
But that is for 2.6.39 ]

Signed-off-by: Miklos Szeredi
Reported-by: Michael Leun
Reported-by: Gurudas Pai
Tested-by: Gurudas Pai
Acked-by: Hugh Dickins
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit fae5c27e842fef2c59ea3ab4a9afaf903cbd267e
Author: Xufeng Zhang
Date: Tue Jun 21 10:43:40 2011 +0000

udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet

[ Upstream commit 9cfaa8def1c795a512bc04f2aec333b03724ca2e ]

Consider this scenario: When the size of the first received udp packet is bigger than the receive buffer, the MSG_TRUNC bit is set in msg->msg_flags. However, if a checksum error happens and this is a blocking socket, it will go to the try_again loop to receive the next packet. If the next udp packet is smaller than the receive buffer, the MSG_TRUNC flag should not be set, but because the MSG_TRUNC bit is not cleared in msg->msg_flags before receiving the next packet, MSG_TRUNC is still set, which is wrong.

Fix this problem by clearing the MSG_TRUNC flag when starting over for a new packet.

Signed-off-by: Xufeng Zhang
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 05c824cea44ddb773182ac849276a2e0419e3ab2
Author: Xufeng Zhang
Date: Tue Jun 21 10:43:39 2011 +0000

ipv6/udp: Use the correct variable to determine non-blocking condition

[ Upstream commit 32c90254ed4a0c698caa0794ebb4de63fcc69631 ]

The udpv6_recvmsg() function is not using the correct variable to determine whether or not the socket is in non-blocking operation. This leads to unexpected behavior when a UDP checksum error occurs.

Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is called by sock_common_recvmsg(), the MSG_DONTWAIT bit of the flags variable in udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:

    err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
                               flags & ~MSG_DONTWAIT, &addr_len);

i.e. with udpv6_recvmsg() getting these values:

    int noblock = flags & MSG_DONTWAIT
    int flags = flags & ~MSG_DONTWAIT

So, when a udp checksum error occurs, the execution will go to csum_copy_err, and then the problem happens:

    csum_copy_err:
            ...............
            if (flags & MSG_DONTWAIT)
                    return -EAGAIN;
            goto try_again;
            ...............

But it will always go to try_again as MSG_DONTWAIT has been cleared from flags at call time -- only noblock contains the original value of MSG_DONTWAIT, so the test should be:

    if (noblock)
            return -EAGAIN;

This is also consistent with what the ipv4/udp code does.

Signed-off-by: Xufeng Zhang
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit bcdafe04648854ae6bb9bb2caea7f3c50d2df855
Author: Marcus Meissner
Date: Wed Jun 1 21:05:22 2011 -0700

net/ipv4: Check for mistakenly passed in non-IPv4 address

[ Upstream commit d0733d2e29b652b2e7b1438ececa732e4eed98eb ]

Check against mistakenly passing in IPv6 addresses (which would result in an INADDR_ANY bind) or similar incompatible sockaddrs.

Signed-off-by: Marcus Meissner
Cc: Reinhard Max
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 26d3cbdece9463926d553dbeed3df10af22bb161
Author: Eric Dumazet
Date: Mon Jun 6 22:42:06 2011 -0700

af_packet: prevent information leak

[ Upstream commit 13fcb7bd322164c67926ffe272846d4860196dc6 ]

In 2.6.27, commit 393e52e33c6c2 (packet: deliver VLAN TCI to userspace) added a small information leak.

Add a padding field and make sure it's zeroed before the copy to user.

Signed-off-by: Eric Dumazet
CC: Patrick McHardy
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 5952a72b61eee5675ef9f878fda98e5719ee4f35
Author: Joe Perches
Date: Sat May 21 07:48:40 2011 +0000

net: filter: Use WARN_RATELIMIT

[ Upstream commit 6c4a5cb219520c7bc937ee186ca53f03733bd09f ]

A mis-configured filter can spam the logs with lots of stack traces.
Rate-limit the warnings and add printout of the bogus filter information.

Original-patch-by: Ben Greear
Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 7cad013caa9a0f16f47b1b5139b67bdf1bf5c7d1
Author: Joe Perches
Date: Sat May 21 07:48:39 2011 +0000

bug.h: Add WARN_RATELIMIT

[ Upstream commit b3eec79b0776e5340a3db75b34953977c7e5086e ]

Add a generic mechanism to ratelimit WARN(foo, fmt, ...) messages using a hidden per call site static struct ratelimit_state.

Also add an __WARN_RATELIMIT variant to be able to use a specific struct ratelimit_state.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 386f12d6c880e3c7b9a31da1c7b5cca3500470d0
Author: Rafael J. Wysocki
Date: Wed Jul 6 20:15:23 2011 +0200

PM / Hibernate: Fix free_unnecessary_pages()

commit 4d4cf23cdde2f8f9324f5684a7f349e182039529 upstream.

There is a bug in free_unnecessary_pages() that causes it to attempt to free too many pages in some cases, which triggers the BUG_ON() in memory_bm_clear_bit() for copy_bm. Namely, if count_data_pages() is initially greater than alloc_normal, we get to_free_normal equal to 0 and "save" greater than 0. In that case, if the sum of "save" and count_highmem_pages() is greater than alloc_highmem, we subtract a positive number from to_free_normal. Hence, since to_free_normal was 0 before the subtraction and is an unsigned int, the result is converted to a huge positive number that is used as the number of pages to free.

Fix this bug by checking if to_free_normal is actually greater than or equal to the number we're going to subtract from it.

Signed-off-by: Rafael J. Wysocki
Reported-and-tested-by: Matthew Garrett
Signed-off-by: Greg Kroah-Hartman

commit 87f97028802669d593a5f8b99249d47b02a08a30
Author: Rafael J. Wysocki
Date: Sat Sep 11 20:58:27 2010 +0200

PM / Hibernate: Avoid hitting OOM during preallocation of memory

commit 6715045ddc7472a22be5e49d4047d2d89b391f45 upstream.
The problem in hibernate_preallocate_memory() is that it calls preallocate_image_memory() with an argument that may be greater than the total number of available non-highmem memory pages. If that's the case, the OOM condition is guaranteed to trigger, which in turn can cause a significant slowdown during hibernation.

To avoid that, make preallocate_image_memory() adjust its argument before calling preallocate_image_pages(), so that the total number of saveable non-highmem pages left is not less than the minimum size of a hibernation image. Change hibernate_preallocate_memory() to try to allocate from highmem if the number of pages allocated by preallocate_image_memory() is too low.

Modify free_unnecessary_pages() to take all possible memory allocation patterns into account.

Reported-by: KOSAKI Motohiro
Signed-off-by: Rafael J. Wysocki
Tested-by: M. Vefa Bicakci
Signed-off-by: Greg Kroah-Hartman

commit ec15d0b0d6d43e8f9daf6282c7f22dacf5e2f089
Author: Eric Dumazet
Date: Fri Jun 17 16:25:39 2011 -0400

inet_diag: fix inet_diag_bc_audit()

[ Upstream commit eeb1497277d6b1a0a34ed36b97e18f2bd7d6de0d ]

A malicious user or buggy application can inject code and trigger an infinite loop in inet_diag_bc_audit().

Also make sure each instruction is aligned on a 4-byte boundary, to avoid unaligned accesses.

Reported-by: Dan Rosenberg
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 29ee67cbbaac453bea60fcf843e91b9cb3f49874
Author: Nelson Elhage
Date: Wed Nov 3 16:35:40 2010 +0000

netlink: Make nlmsg_find_attr take a const nlmsghdr*.

commit 6b8c92ba07287578718335ce409de8e8d7217e40 upstream.

This will let us use it on a nlmsghdr stored inside a netlink_callback.

Signed-off-by: Nelson Elhage
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 27957444d79357335093639f4781afffc8fd1a43
Author: Liu Aleaxander
Date: Tue Jun 29 15:05:40 2010 -0700

um: os-linux/mem.c needs sys/stat.h

commit fb967ecc584c20c74a007de749ca597068b0fcac upstream.

The os-linux/mem.c file calls the fchmod function, which is declared in the sys/stat.h header file, so include it. Fixes build breakage under FC13.

Signed-off-by: Liu Aleaxander
Acked-by: Boaz Harrosh
Cc: Jeff Dike
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit e054ffd990a1bb888eb443b8fb351656d4e78751
Author: Roland McGrath
Date: Tue Oct 26 14:22:19 2010 -0700

uml: fix CONFIG_STATIC_LINK=y build failure with newer glibc

commit aa5fb4dbfd121296ca97c68cf90043a7ea97579d upstream.

With glibc 2.11 or later that was built with --enable-multi-arch, the UML link fails with undefined references to __rel_iplt_start and similar symbols. In recent binutils, the default linker script defines these symbols (see ld --verbose). Fix the UML linker scripts to match the new defaults for these sections.

Signed-off-by: Roland McGrath
Cc: Jeff Dike
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit 2cfbb3cde023531e0254623197de85e43c9fdfc1
Author: Alan Stern
Date: Wed Jun 15 16:29:16 2011 -0400

USB: don't let the hub driver prevent system sleep

commit cbb330045e5df8f665ac60227ff898421fc8fb92 upstream.

This patch (as1465) continues implementation of the policy that errors during suspend or hibernation should not prevent the system from going to sleep.

In this case, failure to turn on the Suspend feature for a hub port shouldn't be reported as an error. There are situations where this does actually occur (such as when the device plugged into that port was disconnected in the recent past), and it turns out to be harmless. There's no reason for it to prevent a system sleep.
Also, don't allow the hub driver to fail a system suspend if the downstream ports aren't all suspended. This is also harmless (and should never happen, given the change mentioned above); printing a warning message in the kernel log is all we really need to do.

Signed-off-by: Alan Stern
Signed-off-by: Greg Kroah-Hartman

commit 8ae176cfe1080b18f2023957118af249bd4ef1bd
Author: Alan Stern
Date: Wed Jun 15 16:27:43 2011 -0400

USB: don't let errors prevent system sleep

commit 0af212ba8f123c2eba151af7726c34a50b127962 upstream.

This patch (as1464) implements the recommended policy that most errors during suspend or hibernation should not prevent the system from going to sleep. In particular, failure to suspend a USB driver or a USB device should not prevent the sleep from succeeding:

    Failure to suspend a device won't matter, because the device will automatically go into suspend mode when the USB bus stops carrying packets. (This might be less true for USB-3.0 devices, but let's not worry about them now.)

    Failure of a driver to suspend might lead to trouble later on when the system wakes up, but it isn't sufficient reason to prevent the system from going to sleep.

Signed-off-by: Alan Stern
Signed-off-by: Greg Kroah-Hartman

commit dbc6022c6cf86812bc622378913fc73fa511201c
Author: Vasiliy Kulikov
Date: Mon Jun 27 16:18:11 2011 -0700

taskstats: don't allow duplicate entries in listener mode

commit 26c4caea9d697043cc5a458b96411b86d7f6babd upstream.

Currently a single process may register exit handlers an unlimited number of times. This may lead to a bloated listeners chain and very slow process terminations.

E.g. after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs, ~300 Mb of kernel memory is stolen for the handlers chain and "time id" shows 2-7 seconds instead of the normal 0.003. This makes it possible to exhaust all kernel memory and to eat much CPU time by triggering numerous exits on a single CPU.
The patch limits the number of times a single process may register itself on a single CPU to one.

One little issue is kept unfixed - as taskstats_exit() is called before exit_files() in do_exit(), the orphaned listener entry (if it was not explicitly deregistered) is kept until someone else's next exit() and the implicit deregistration in send_cpu_listeners(). So, if a process that registered itself as a listener exits and the next spawned process gets the same pid, it would inherit taskstats attributes.

Signed-off-by: Vasiliy Kulikov
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit 34e32c9cf37e250c797917fd8529504e55023f86
Author: Arnd Bergmann
Date: Fri Jul 1 17:30:00 2011 -0700

6pack,mkiss: fix lock inconsistency

commit 6e4e2f811bade330126d4029c88c831784a7efd9 upstream.

Lockdep found a locking inconsistency in the mkiss_close function:

> kernel: [ INFO: inconsistent lock state ]
> kernel: 2.6.39.1 #3
> kernel: ---------------------------------
> kernel: inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
> kernel: ax25ipd/2813 [HC0[0]:SC0[0]:HE1:SE1] takes:
> kernel: (disc_data_lock){+++?.-}, at: [] mkiss_close+0x1b/0x90 [mkiss]
> kernel: {IN-SOFTIRQ-R} state was registered at:

The message hints that disc_data_lock is acquired with softirqs disabled, but does not itself disable softirqs, which can in rare circumstances lead to a deadlock. The same problem is present in the 6pack driver. This patch fixes both by using write_lock_bh instead of write_lock.

Reported-by: Bernard F6BVP
Tested-by: Bernard F6BVP
Signed-off-by: Arnd Bergmann
Acked-by: Ralf Baechle
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

commit 273f86df6960cfc9e6402f0eca7d5b77204c5dd8
Author: Trond Myklebust
Date: Fri Jun 17 10:14:59 2011 -0400

SUNRPC: Ensure the RPC client only quits on fatal signals

commit 5afa9133cfe67f1bfead6049a9640c9262a7101c upstream.
Fix a couple of instances where we were exiting the RPC client on arbitrary signals. We should only do so on fatal signals.

Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

commit 60a967848c94dc3f4c4a29e0e40c325be10c1919
Author: NeilBrown
Date: Tue Jun 28 16:59:42 2011 +1000

md: avoid endless recovery loop when waiting for fail device to complete.

commit 4274215d24633df7302069e51426659d4759c5ed upstream.

If a device fails in a way that causes pending requests to take a while to complete, md will not be able to immediately remove it from the array in remove_and_add_spares. It will then incorrectly look like a spare device and md will try to recover it even though it is failed. This leads to a recovery process starting and instantly aborting over and over again.

We should check if the device is faulty before considering it to be a spare. This will avoid trying to start a recovery that cannot proceed.

This bug was introduced in 2.6.26 so this patch is suitable for any kernel since then.

Reported-by: Jim Paradis
Signed-off-by: NeilBrown
Signed-off-by: Greg Kroah-Hartman

commit 1ab4ae89594c47df6f9b4cd397f3281a30c059f2
Author: Jean Delvare
Date: Wed Jun 29 11:36:10 2011 +0200

i2c-taos-evm: Fix log messages

commit 9b640f2e154268cb516efcaf9c434f2e73c6783e upstream.

* Print all error and information messages even when debugging is disabled.
* Don't use adapter device to log messages before it is ready.

Signed-off-by: Jean Delvare
Signed-off-by: Greg Kroah-Hartman

commit 2eaa27dbf562d1990397db5a1022beacecedc3b8
Author: Shaohua Li
Date: Mon Jun 27 09:03:47 2011 +0200

cfq-iosched: fix a rcu warning

commit 3181faa85bda3dc3f5e630a1846526c9caaa38e3 upstream.

I got an rcu warning at boot. The ioc->ioc_data is rcu_dereferenced, but the caller doesn't hold rcu_read_lock.
Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

commit d234df5a1ed1bab59b88660548c30ff1af69de5b
Author: Jens Axboe
Date: Sun Jun 5 06:01:13 2011 +0200

cfq-iosched: fix locking around ioc->ioc_data assignment

commit ab4bd22d3cce6977dc039664cc2d052e3147d662 upstream.

Since we are modifying this RCU pointer, we need to hold the lock protecting it around it.

This fixes a potential reuse and double free of a cfq io_context structure. The bug has been in CFQ for a long time. It hit very few people, but those it did hit seemed to see it a lot.

Tracked in RH bugzilla here:

https://bugzilla.redhat.com/show_bug.cgi?id=577968

Credit goes to Paul Bolle for figuring out that the issue was around the one-hit ioc->ioc_data cache. Thanks to his hard work the issue is now fixed.

Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

commit b115036a750c9bba4b9a598683eb5105744fe196
Author: Marcin Slusarz
Date: Sat May 28 13:23:42 2011 +0200

debugobjects: Fix boot crash when kmemleak and debugobjects enabled

commit 161b6ae0e067e421b20bb35caf66bdb405c929ac upstream.

The order of initialization looks like this:
...
debugobjects
kmemleak
...(lots of other subsystems)...
workqueues (through early initcall)
...

debugobjects uses schedule_work for batch freeing of its data and kmemleak heavily uses debugobjects, so when it comes to freeing and workqueues have not been initialized yet, the kernel crashes:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] __queue_work+0x29/0x41a
 [] queue_work_on+0x16/0x1d
 [] queue_work+0x29/0x55
 [] schedule_work+0x13/0x15
 [] free_object+0x90/0x95
 [] debug_check_no_obj_freed+0x187/0x1d3
 [] ? _raw_spin_unlock_irqrestore+0x30/0x4d
 [] ? free_object_rcu+0x68/0x6d
 [] kmem_cache_free+0x64/0x12c
 [] free_object_rcu+0x68/0x6d
 [] __rcu_process_callbacks+0x1b6/0x2d9
...

because system_wq is NULL.

Fix it by checking if the workqueues subsystem was initialized before using it.
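
The guard described in the debugobjects entry can be illustrated with a small user-space sketch. Every name in it (sketch_wq, sketch_system_wq, sketch_init_workqueues, sketch_schedule_free) is hypothetical and only mimics the situation; it is not the kernel code and not the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a work queue; not the kernel's struct. */
struct sketch_wq {
	int queued;	/* number of work items handed to the queue */
};

/* Stays NULL until the (simulated) workqueue initcall has run. */
static struct sketch_wq *sketch_system_wq;

/* Simulates workqueue initialization, which happens late in boot. */
static void sketch_init_workqueues(struct sketch_wq *wq)
{
	wq->queued = 0;
	sketch_system_wq = wq;
}

/*
 * Try to schedule deferred freeing. Returns true if work was queued and
 * false if the queue does not exist yet, in which case the caller would
 * keep the objects around and retry on a later free.
 */
static bool sketch_schedule_free(void)
{
	if (sketch_system_wq == NULL)	/* the guard: never touch a NULL queue */
		return false;
	sketch_system_wq->queued++;
	return true;
}
```

Calling sketch_schedule_free() before sketch_init_workqueues() simply reports that nothing was queued instead of dereferencing a NULL pointer; after initialization the same call queues the work.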
Signed-off-by: Marcin Slusarz
Cc: Catalin Marinas
Cc: Tejun Heo
Cc: Dipankar Sarma
Cc: Paul E. McKenney
Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

commit 72b6e8a847914a76cce6f0741a894872a55ec5f3
Author: Florian Fainelli
Date: Wed Jun 15 19:15:23 2011 +0200

watchdog: mtx1-wdt: request gpio before using it

commit 9b19d40aa3ebaf1078779da10555da2ab8512422 upstream.

Otherwise, the gpiolib autorequest feature will produce a WARN_ON():

WARNING: at drivers/gpio/gpiolib.c:101 0x8020ec6c()
autorequest GPIO-215
[...]

Signed-off-by: Florian Fainelli
Signed-off-by: Wim Van Sebroeck
Signed-off-by: Greg Kroah-Hartman

commit 5c0a75906a5ee6152e0e604aef32788b683a1815
Author: Sjoerd Simons
Date: Tue May 24 12:22:03 2011 -0300

uvcvideo: Remove buffers from the queues when freeing

commit 8ca2c80b170c47eeb55f0c2a0f2b8edf85f35d49 upstream.

When freeing memory for the video buffers also remove them from the irq & main queues.

This fixes an oops when doing the following:

open ("/dev/video", ..)
VIDIOC_REQBUFS
VIDIOC_QBUF
VIDIOC_REQBUFS
close ()

The second VIDIOC_REQBUFS causes the list entries of the buffers to be cleared while they still hang around on the main and irq queues.

Signed-off-by: Sjoerd Simons
Acked-by: Laurent Pinchart
Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Greg Kroah-Hartman

commit 98f9d144d13f337660c040ea830b7331a8afd542
Author: Rafael Aquini
Date: Wed Jun 15 15:08:39 2011 -0700

mm: fix negative commitlimit when gigantic hugepages are allocated

commit b0320c7b7d1ac1bd5c2d9dff3258524ab39bad32 upstream.

When 1GB hugepages are allocated on a system, free(1) reports less available memory than what really is installed in the box. Also, if the total size of hugepages allocated on a system is over half of the total memory size, CommitLimit becomes a negative number.
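
The arithmetic behind the negative value can be sketched in user space, assuming (as the explanation in this entry states) that boot-time gigantic pages are missing from totalram_pages but present in hugetlb_total_pages(). The function names, the use of signed long, and the sample numbers are assumptions made for illustration; only the formula itself comes from the fs/proc/meminfo.c calculation quoted in this entry.

```c
/*
 * CommitLimit in pages, following the formula quoted from
 * fs/proc/meminfo.c. Signed long is an assumption of this sketch so
 * that a negative result stays visible.
 */
static long sketch_commit_limit(long totalram_pages, long hugetlb_pages,
				long overcommit_ratio, long swap_pages)
{
	return (totalram_pages - hugetlb_pages) * overcommit_ratio / 100
		+ swap_pages;
}

/* totalram_pages as seen when boot-time gigantic pages are not counted. */
static long sketch_totalram_without_gigantic(long installed_pages,
					     long gigantic_pages)
{
	return installed_pages - gigantic_pages;
}
```

Take 4194304 installed pages (16 GiB of 4 KiB pages), 2621440 of them (10 GiB) allocated as gigantic hugepages, an overcommit ratio of 50 and no swap. The uncounted variant yields totalram_pages = 1572864, so the hugepages are effectively subtracted twice and the limit comes out as -524288 pages. If the gigantic pages are counted in totalram_pages, the same formula gives 786432 pages.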
The problem is that gigantic hugepages (order > MAX_ORDER) can only be allocated at boot with bootmem, thus their frames are not accounted to 'totalram_pages'. However, they are accounted to hugetlb_total_pages().

What happens to turn CommitLimit into a negative number is this calculation, in fs/proc/meminfo.c:

    allowed = ((totalram_pages - hugetlb_total_pages())
            * sysctl_overcommit_ratio / 100) + total_swap_pages;

A similar calculation occurs in __vm_enough_memory() in mm/mmap.c.

Also, every vm statistic which depends on 'totalram_pages' will render confusing values, as if the system were 'missing' some part of its memory.

Impact of this bug:

When gigantic hugepages are allocated and sysctl_overcommit_memory == OVERCOMMIT_NEVER. In such a situation, __vm_enough_memory() goes through the mentioned 'allowed' calculation and might end up mistakenly returning -ENOMEM, thus forcing the system to start reclaiming pages earlier than it would be usual, and this could cause detrimental impact to overall system's performance, depending on the workload.

Besides the aforementioned scenario, I can only think of this causing annoyances with memory reports from /proc/meminfo and free(1).

[akpm@linux-foundation.org: standardize comment layout]
Reported-by: Russ Anderson
Signed-off-by: Rafael Aquini
Acked-by: Russ Anderson
Cc: Andrea Arcangeli
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit 44252341f94309e76aeba0b7c805fb46b2d72fdc
Author: Eugene A. Shatokhin
Date: Tue Jun 28 23:04:51 2011 -0400

ath5k: fix memory leak when fewer than N_PD_CURVES are in use

commit a0b8de350be458b33248e48b2174d9af8a4c4798 upstream.

We would free the proper number of curves, but in the wrong slots, due to a missing level of indirection through the pdgain_idx table.

It's simpler just to try to free all four slots, so do that.

Signed-off-by: Bob Copeland
Signed-off-by: John W. Linville
Signed-off-by: Greg Kroah-Hartman

commit 92e438c204e4ae51e706725b6c007591e62504f2
Author: Michal Kubecek
Date: Sat Jun 18 20:34:01 2011 +0200

PM: Free memory bitmaps if opening /dev/snapshot fails

commit 8440f4b19494467883f8541b7aa28c7bbf6ac92b upstream.

When opening the /dev/snapshot device, snapshot_open() creates memory bitmaps which are freed in snapshot_release(). But if any of the callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open() fails, snapshot_release() is never called and the bitmaps are not freed. The next attempt to open /dev/snapshot then triggers the BUG_ON() check in create_basic_memory_bitmaps(). This happens e.g. when the vmwatchdog module is active on s390x.

Signed-off-by: Michal Kubecek
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

commit 20fea53565eb54fba2acd1820864002b6b26b669
Author: Sarah Sharp
Date: Sun Jun 5 23:10:04 2011 -0700

xhci: Reject double add of active endpoints.

commit fa75ac379e63c2864e9049b5e8615e40f65c1e70 upstream.

While trying to switch a UAS device from the BOT configuration to the UAS configuration via the bConfigurationValue file, Tanya ran into an issue in the USB core. usb_disable_device() sets entries in udev->ep_in and udev->ep_out to NULL, but doesn't call into the xHCI bandwidth management functions to remove the BOT configuration endpoints from the xHCI host's internal structures.

The USB core would then attempt to add endpoints for the UAS configuration, and some of the endpoints had the same address as endpoints in the BOT configuration. The xHCI driver blindly added the endpoints again, but the xHCI host controller rejected the Configure Endpoint command because active endpoints were added without being dropped.

Make the xHCI driver reject calls to xhci_add_endpoint() that attempt to add active endpoints without first calling xhci_drop_endpoint().

This should be backported to kernels as old as 2.6.31.
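
The "reject a double add" rule from the xhci entry can be modelled with a bitmask in user space. This is only a sketch under stated assumptions: sketch_dev, sketch_add_endpoint, sketch_drop_endpoint, the 31-endpoint limit and the -EINVAL return value are all hypothetical and are not taken from the xHCI driver.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_EPS 31	/* assumption: endpoint indexes 0..30 */

/* Hypothetical per-device bookkeeping: one bit per active endpoint. */
struct sketch_dev {
	uint32_t active_eps;
};

/* Drop an endpoint so that its index may be added again later. */
static int sketch_drop_endpoint(struct sketch_dev *dev, unsigned int ep_index)
{
	if (ep_index >= SKETCH_MAX_EPS)
		return -EINVAL;
	dev->active_eps &= ~(UINT32_C(1) << ep_index);
	return 0;
}

/* Add an endpoint, refusing one that is still active (not dropped). */
static int sketch_add_endpoint(struct sketch_dev *dev, unsigned int ep_index)
{
	uint32_t bit;

	if (ep_index >= SKETCH_MAX_EPS)
		return -EINVAL;
	bit = UINT32_C(1) << ep_index;
	if (dev->active_eps & bit)
		return -EINVAL;	/* double add: reject it up front */
	dev->active_eps |= bit;
	return 0;
}
```

In this model, adding index 3 twice in a row fails on the second call, while add, drop, add succeeds. Failing early in the driver keeps a request the host controller would refuse from ever being issued.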
Signed-off-by: Sarah Sharp
Reported-by: Tanya Brokhman
Signed-off-by: Greg Kroah-Hartman

commit 75fe2ba05b86b971290cbc5f731681251ebb88c6
Author: Jiri Slaby
Date: Sun Jun 5 14:16:16 2011 +0200

TTY: ldisc, do not close until there are readers

commit 92f6fa09bd453ffe3351fa1f1377a1b7cfa911e6 upstream.

We restored tty_ldisc_wait_idle in 100eeae2c5c (TTY: restore tty_ldisc_wait_idle). We used it in the ldisc changing path to fix the case where there are tasks in n_tty_read waiting for data and somebody tries to change ldisc.

Similar to the case above, there may also be tasks waiting in n_tty_read while hangup is performed. As 65b770468e98 (tty-ldisc: turn ldisc user count into a proper refcount) removed the wait-until-idle from all paths, the hangup path won't wait for them to disappear either now. So add it back even to the hangup path.

There is a difference: we need uninterruptible sleep as there is obviously a HUP signal pending. So tty_ldisc_wait_idle now sleeps without the possibility of being interrupted. This is what the original tty_ldisc_wait_idle did. After the wait idle reintroduction (100eeae2c5c), we have had interruptible sleeps for the ldisc changing path. But as there is a 5s timeout anyway, we don't allow it to be interrupted from now on. It's not worth the added complexity of deciding what kind of sleep we want.

Before 65b770468e98 tty_ldisc_wait_idle was called also from tty_ldisc_release. It is called from tty_release, so I don't think we need to restore that one.

This is nicely reproducible after constifying the timing when drivers/tty/n_tty.c is patched as follows ("TTY: ntty, add one more sanity check" patch is needed to actually see it explode):

%% -1548,6 +1549,7 @@ static int n_tty_open(struct tty_struct *tty)
 	/* These are ugly. Currently a malloc failure here can panic */
 	if (!tty->read_buf) {
+		msleep(100);
 		tty->read_buf = kzalloc(N_TTY_BUF_SIZE, GFP_KERNEL);
 		if (!tty->read_buf)
 			return -ENOMEM;
%% -1785,6 +1788,7 @@ do_it_again:
 				break;
 			}
 			timeout = schedule_timeout(timeout);
+			msleep(20);
 			continue;
 		}
 		__set_current_state(TASK_RUNNING);

===== With a process: =====
	while (1) {
		int fd = open(argv[1], O_RDWR);
		read(fd, buf, sizeof(buf));
		close(fd);
	}
===== and its child: =====
	setsid();
	while (1) {
		int fd = open(tty, O_RDWR|O_NOCTTY);
		ioctl(fd, TIOCSCTTY, 1);
		vhangup();
		close(fd);
		usleep(100 * (10 + random() % 1000));
	}
===== EOF =====

References: https://bugzilla.novell.com/show_bug.cgi?id=693374
References: https://bugzilla.novell.com/show_bug.cgi?id=694509
Signed-off-by: Jiri Slaby
Cc: Alan Cox
Cc: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit 43b951e4185f3132e77e6340d1aed42e90618e4b
Author: Thomas Gleixner
Date: Thu Jun 16 16:22:08 2011 +0200

clocksource: Make watchdog robust vs. interruption

commit b5199515c25cca622495eb9c6a8a1d275e775088 upstream.

The clocksource watchdog code is interruptible and it has been observed that this can trigger false positives which disable the TSC.

The reason is that an interrupt storm or a long running interrupt handler between the read of the watchdog source and the read of the TSC brings the two far enough apart that the delta is larger than the unstable threshold. Move both reads into a short interrupt disabled region to avoid that.

Reported-and-tested-by: Vernon Mauery
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

commit 6961b18f7cdd1357b3e31a6b97807034204220c6
Author: Stefano Stabellini
Date: Fri Jun 3 09:51:34 2011 +0000

xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"

commit a91d92875ee94e4703fd017ccaadb48cfb344994 upstream.

We only need to set max_pfn_mapped to the last pfn mapped on x86_64 to make sure that cleanup_highmap doesn't remove important mappings at _end.
We don't need to do this on x86_32 because cleanup_highmap is not called on x86_32. Besides, lowering max_pfn_mapped on x86_32 has the unwanted side effect of limiting the amount of memory available for the 1:1 kernel pagetable allocation.

This patch reverts the x86_32 part of the original patch.

Signed-off-by: Stefano Stabellini
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Greg Kroah-Hartman

commit c3ce2d3992b7028a545b5dd9d25f892ef066a62f
Author: Andrea Arcangeli
Date: Thu Jun 16 12:56:19 2011 -0700

migrate: don't account swapcache as shmem

commit 99a15e21d96f6857dafab1e5167e5e8183215c9c upstream.

swapcache will reach the below code path in migrate_page_move_mapping, and swapcache is accounted as NR_FILE_PAGES but it's not accounted as NR_SHMEM.

Hugh pointed out we must use PageSwapCache instead of comparing mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n.

Signed-off-by: Andrea Arcangeli
Acked-by: Hugh Dickins
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

commit 0aa6e32c7cb5e80a0e381df2c6beb3e4f971d5f2
Author: Hugh Dickins
Date: Wed Jun 15 15:08:58 2011 -0700

ksm: fix NULL pointer dereference in scan_get_next_rmap_item()

commit 2b472611a32a72f4a118c069c2d62a1a3f087afd upstream.

Andrea Righi reported a case where an exiting task can race against ksmd::scan_get_next_rmap_item (http://lkml.org/lkml/2011/6/1/742) easily triggering a NULL pointer dereference in ksmd.

ksm_scan.mm_slot == &ksm_mm_head with only one registered mm

CPU 1 (__ksm_exit)          CPU 2 (scan_get_next_rmap_item)
                            list_empty() is false
lock                        slot == &ksm_mm_head
list_del(slot->mm_list)
(list now empty)
unlock
                            lock
                            slot = list_entry(slot->mm_list.next)
                            (list is empty, so slot is still ksm_mm_head)
                            unlock
                            slot->mm == NULL ... Oops

Close this race by revalidating that the new slot is not simply the list head again.
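
The revalidation step in the ksm entry can be illustrated with a minimal circular list in user space. The sketch_* names are hypothetical and the list is only modelled on the kernel's list_head; this is not the ksm code itself.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct sketch_slot {
	struct sketch_slot *next, *prev;
	void *mm;	/* NULL for the list head, non-NULL for real slots */
};

static void sketch_list_init(struct sketch_slot *head)
{
	head->next = head->prev = head;
	head->mm = NULL;
}

static void sketch_list_add_tail(struct sketch_slot *slot,
				 struct sketch_slot *head)
{
	slot->prev = head->prev;
	slot->next = head;
	head->prev->next = slot;
	head->prev = slot;
}

static void sketch_list_del(struct sketch_slot *slot)
{
	slot->prev->next = slot->next;
	slot->next->prev = slot->prev;
	slot->next = slot->prev = slot;
}

/*
 * Advance from cur to the next slot. Returns NULL when the walk lands
 * on the head again, which is what happens if the list was emptied
 * between an earlier emptiness check and this call.
 */
static struct sketch_slot *sketch_next_slot(struct sketch_slot *cur,
					    struct sketch_slot *head)
{
	struct sketch_slot *slot = cur->next;

	if (slot == head)	/* revalidate: the head carries no mm */
		return NULL;
	return slot;
}
```

With one slot on the list, stepping from the head returns that slot. Once the slot has been deleted, the same step returns NULL, so the caller never sees the head and never reads its NULL mm.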
Andrea's test case:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define BUFSIZE getpagesize()

int main(int argc, char **argv)
{
	void *ptr;

	if (posix_memalign(&ptr, getpagesize(), BUFSIZE) < 0) {
		perror("posix_memalign");
		exit(1);
	}
	if (madvise(ptr, BUFSIZE, MADV_MERGEABLE) < 0) {
		perror("madvise");
		exit(1);
	}
	*(char *)NULL = 0;

	return 0;
}

Reported-by: Andrea Righi
Tested-by: Andrea Righi
Cc: Andrea Arcangeli
Signed-off-by: Hugh Dickins
Signed-off-by: Chris Wright
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman