commit d0a11cb1254411fd1d60084cd8498e8512cd6075 Author: Greg Kroah-Hartman Date: Wed Jul 13 05:29:43 2011 +0200 Linux 2.6.32.43 commit ffdd12eabed263a487ddc05fdf65be6e4fc717b4 Author: Miklos Szeredi Date: Wed Feb 23 13:49:47 2011 +0100 mm: prevent concurrent unmap_mapping_range() on the same inode commit 2aa15890f3c191326678f1bd68af61ec6b8753ec upstream. Michael Leun reported that running parallel opens on a fuse filesystem can trigger a "kernel BUG at mm/truncate.c:475" Gurudas Pai reported the same bug on NFS. The reason is, unmap_mapping_range() is not prepared for more than one concurrent invocation per inode. For example: thread1: going through a big range, stops in the middle of a vma and stores the restart address in vm_truncate_count. thread2: comes in with a small (e.g. single page) unmap request on the same vma, somewhere before restart_address, finds that the vma was already unmapped up to the restart address and happily returns without doing anything. Another scenario would be two big unmap requests, both having to restart the unmapping and each one setting vm_truncate_count to its own value. This could go on forever without any of them being able to finish. Truncate and hole punching already serialize with i_mutex. Other callers of unmap_mapping_range() do not, and it's difficult to get i_mutex protection for all callers. In particular ->d_revalidate(), which calls invalidate_inode_pages2_range() in fuse, may be called with or without i_mutex. This patch adds a new mutex to 'struct address_space' to prevent running multiple concurrent unmap_mapping_range() on the same mapping. [ We'll hopefully get rid of all this with the upcoming mm preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex lockbreak" patch in particular. But that is for 2.6.39 ] Signed-off-by: Miklos Szeredi Reported-by: Michael Leun Reported-by: Gurudas Pai Tested-by: Gurudas Pai Acked-by: Hugh Dickins Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 453d61c441601d3f4ed9df7cfade4c5c35e98526 Author: Xufeng Zhang Date: Tue Jun 21 10:43:40 2011 +0000 udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet [ Upstream commit 9cfaa8def1c795a512bc04f2aec333b03724ca2e ] Consider this scenario: When the size of the first received udp packet is bigger than the receive buffer, MSG_TRUNC bit is set in msg->msg_flags. However, if checksum error happens and this is a blocking socket, it will goto try_again loop to receive the next packet. But if the size of the next udp packet is smaller than receive buffer, MSG_TRUNC flag should not be set, but because MSG_TRUNC bit is not cleared in msg->msg_flags before receive the next packet, MSG_TRUNC is still set, which is wrong. Fix this problem by clearing MSG_TRUNC flag when starting over for a new packet. Signed-off-by: Xufeng Zhang Signed-off-by: Paul Gortmaker Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ac7573b57eab6975cf4d15bb79f9733639b437a3 Author: Xufeng Zhang Date: Tue Jun 21 10:43:39 2011 +0000 ipv6/udp: Use the correct variable to determine non-blocking condition [ Upstream commit 32c90254ed4a0c698caa0794ebb4de63fcc69631 ] udpv6_recvmsg() function is not using the correct variable to determine whether or not the socket is in non-blocking operation, this will lead to unexpected behavior when a UDP checksum error occurs. Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call: err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); i.e. with udpv6_recvmsg() getting these values: int noblock = flags & MSG_DONTWAIT int flags = flags & ~MSG_DONTWAIT So, when udp checksum error occurs, the execution will go to csum_copy_err, and then the problem happens: csum_copy_err: ............... if (flags & MSG_DONTWAIT) return -EAGAIN; goto try_again; ............... But it will always go to try_again as MSG_DONTWAIT has been cleared from flags at call time -- only noblock contains the original value of MSG_DONTWAIT, so the test should be: if (noblock) return -EAGAIN; This is also consistent with what the ipv4/udp code does. Signed-off-by: Xufeng Zhang Signed-off-by: Paul Gortmaker Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4e754b40ebbd638984e33eb8509acb63b9c7f253 Author: Marcus Meissner Date: Wed Jun 1 21:05:22 2011 -0700 net/ipv4: Check for mistakenly passed in non-IPv4 address [ Upstream commit d0733d2e29b652b2e7b1438ececa732e4eed98eb ] Check against mistakenly passing in IPv6 addresses (which would result in an INADDR_ANY bind) or similar incompatible sockaddrs. Signed-off-by: Marcus Meissner Cc: Reinhard Max Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cc2c6564492b2a5f570d08d127c1496cc185a5f5 Author: Eric Dumazet Date: Mon Jun 6 22:42:06 2011 -0700 af_packet: prevent information leak [ Upstream commit 13fcb7bd322164c67926ffe272846d4860196dc6 ] In 2.6.27, commit 393e52e33c6c2 (packet: deliver VLAN TCI to userspace) added a small information leak. Add padding field and make sure its zeroed before copy to user. Signed-off-by: Eric Dumazet CC: Patrick McHardy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5eabe872301b2754122ce37fc4d1adfafbc94182 Author: Joe Perches Date: Sat May 21 07:48:40 2011 +0000 net: filter: Use WARN_RATELIMIT [ Upstream commit 6c4a5cb219520c7bc937ee186ca53f03733bd09f ] A mis-configured filter can spam the logs with lots of stack traces. Rate-limit the warnings and add printout of the bogus filter information. Original-patch-by: Ben Greear Signed-off-by: Joe Perches Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 46f40799c48c1a2452d06054b861ae6e7cf191f3 Author: Joe Perches Date: Sat May 21 07:48:39 2011 +0000 bug.h: Add WARN_RATELIMIT [ Upstream commit b3eec79b0776e5340a3db75b34953977c7e5086e ] Add a generic mechanism to ratelimit WARN(foo, fmt, ...) messages using a hidden per call site static struct ratelimit_state. Also add an __WARN_RATELIMIT variant to be able to use a specific struct ratelimit_state. Signed-off-by: Joe Perches Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e6c768e36cfa59595a3c72823c900aa34d7f5be8 Author: Rafael J. Wysocki Date: Wed Jul 6 20:15:23 2011 +0200 PM / Hibernate: Fix free_unnecessary_pages() commit 4d4cf23cdde2f8f9324f5684a7f349e182039529 upstream. There is a bug in free_unnecessary_pages() that causes it to attempt to free too many pages in some cases, which triggers the BUG_ON() in memory_bm_clear_bit() for copy_bm. Namely, if count_data_pages() is initially greater than alloc_normal, we get to_free_normal equal to 0 and "save" greater from 0. In that case, if the sum of "save" and count_highmem_pages() is greater than alloc_highmem, we subtract a positive number from to_free_normal. Hence, since to_free_normal was 0 before the subtraction and is an unsigned int, the result is converted to a huge positive number that is used as the number of pages to free. Fix this bug by checking if to_free_normal is actually greater than or equal to the number we're going to subtract from it. Signed-off-by: Rafael J. Wysocki Reported-and-tested-by: Matthew Garrett Signed-off-by: Greg Kroah-Hartman commit 7af74e7852333102fb02dd17d37aabf4d5f1220b Author: Rafael J. Wysocki Date: Sat Sep 11 20:58:27 2010 +0200 PM / Hibernate: Avoid hitting OOM during preallocation of memory commit 6715045ddc7472a22be5e49d4047d2d89b391f45 upstream. There is a problem in hibernate_preallocate_memory() that it calls preallocate_image_memory() with an argument that may be greater than the total number of available non-highmem memory pages. If that's the case, the OOM condition is guaranteed to trigger, which in turn can cause significant slowdown to occur during hibernation. To avoid that, make preallocate_image_memory() adjust its argument before calling preallocate_image_pages(), so that the total number of saveable non-highem pages left is not less than the minimum size of a hibernation image. Change hibernate_preallocate_memory() to try to allocate from highmem if the number of pages allocated by preallocate_image_memory() is too low. Modify free_unnecessary_pages() to take all possible memory allocation patterns into account. Reported-by: KOSAKI Motohiro Signed-off-by: Rafael J. Wysocki Tested-by: M. Vefa Bicakci Signed-off-by: Greg Kroah-Hartman commit ea573723636c6e7ede7cc876046a7027a02bcf2c Author: Eric Dumazet Date: Fri Jun 17 16:25:39 2011 -0400 inet_diag: fix inet_diag_bc_audit() [ Upstream commit eeb1497277d6b1a0a34ed36b97e18f2bd7d6de0d ] A malicious user or buggy application can inject code and trigger an infinite loop in inet_diag_bc_audit() Also make sure each instruction is aligned on 4 bytes boundary, to avoid unaligned accesses. Reported-by: Dan Rosenberg Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d91b1971d6f5f72e638123f30684332473c5e07a Author: Nelson Elhage Date: Wed Nov 3 16:35:40 2010 +0000 netlink: Make nlmsg_find_attr take a const nlmsghdr*. commit 6b8c92ba07287578718335ce409de8e8d7217e40 upstream. This will let us use it on a nlmsghdr stored inside a netlink_callback. Signed-off-by: Nelson Elhage Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2b71587b0da22ec843730baceae7dca082650bd6 Author: Liu Aleaxander Date: Tue Jun 29 15:05:40 2010 -0700 um: os-linux/mem.c needs sys/stat.h commit fb967ecc584c20c74a007de749ca597068b0fcac upstream. The os-linux/mem.c file calls fchmod function, which is declared in sys/stat.h header file, so include it. Fixes build breakage under FC13. Signed-off-by: Liu Aleaxander Acked-by: Boaz Harrosh Cc: Jeff Dike Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 3bbcb3edf0ede7fc35639b1c73434b55f969f5b0 Author: Roland McGrath Date: Tue Oct 26 14:22:19 2010 -0700 uml: fix CONFIG_STATIC_LINK=y build failure with newer glibc commit aa5fb4dbfd121296ca97c68cf90043a7ea97579d upstream. With glibc 2.11 or later that was built with --enable-multi-arch, the UML link fails with undefined references to __rel_iplt_start and similar symbols. In recent binutils, the default linker script defines these symbols (see ld --verbose). Fix the UML linker scripts to match the new defaults for these sections. Signed-off-by: Roland McGrath Cc: Jeff Dike Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 1e9c04fa407587a6aaf72b13219802565d22e5a0 Author: Alan Stern Date: Wed Jun 15 16:29:16 2011 -0400 USB: don't let the hub driver prevent system sleep commit cbb330045e5df8f665ac60227ff898421fc8fb92 upstream. This patch (as1465) continues implementation of the policy that errors during suspend or hibernation should not prevent the system from going to sleep. In this case, failure to turn on the Suspend feature for a hub port shouldn't be reported as an error. There are situations where this does actually occur (such as when the device plugged into that port was disconnected in the recent past), and it turns out to be harmless. There's no reason for it to prevent a system sleep. Also, don't allow the hub driver to fail a system suspend if the downstream ports aren't all suspended. This is also harmless (and should never happen, given the change mentioned above); printing a warning message in the kernel log is all we really need to do. Signed-off-by: Alan Stern Signed-off-by: Greg Kroah-Hartman commit ae6fe57b28b609c59f6e04ecf65d1e8f6fe470bc Author: Alan Stern Date: Wed Jun 15 16:27:43 2011 -0400 USB: don't let errors prevent system sleep commit 0af212ba8f123c2eba151af7726c34a50b127962 upstream. This patch (as1464) implements the recommended policy that most errors during suspend or hibernation should not prevent the system from going to sleep. In particular, failure to suspend a USB driver or a USB device should not prevent the sleep from succeeding: Failure to suspend a device won't matter, because the device will automatically go into suspend mode when the USB bus stops carrying packets. (This might be less true for USB-3.0 devices, but let's not worry about them now.) Failure of a driver to suspend might lead to trouble later on when the system wakes up, but it isn't sufficient reason to prevent the system from going to sleep. Signed-off-by: Alan Stern Signed-off-by: Greg Kroah-Hartman commit 68e62de3e747b9ae5e811a77ccece7c425dae830 Author: Vasiliy Kulikov Date: Mon Jun 27 16:18:11 2011 -0700 taskstats: don't allow duplicate entries in listener mode commit 26c4caea9d697043cc5a458b96411b86d7f6babd upstream. Currently a single process may register exit handlers unlimited times. It may lead to a bloated listeners chain and very slow process terminations. Eg after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs ~300 Mb of kernel memory is stolen for the handlers chain and "time id" shows 2-7 seconds instead of normal 0.003. It makes it possible to exhaust all kernel memory and to eat much of CPU time by triggerring numerous exits on a single CPU. The patch limits the number of times a single process may register itself on a single CPU to one. One little issue is kept unfixed - as taskstats_exit() is called before exit_files() in do_exit(), the orphaned listener entry (if it was not explicitly deregistered) is kept until the next someone's exit() and implicit deregistration in send_cpu_listeners(). So, if a process registered itself as a listener exits and the next spawned process gets the same pid, it would inherit taskstats attributes. Signed-off-by: Vasiliy Kulikov Cc: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 06ce414d4a0811a979a82cddc3e429e6bd704c98 Author: Arnd Bergmann Date: Fri Jul 1 17:30:00 2011 -0700 6pack,mkiss: fix lock inconsistency commit 6e4e2f811bade330126d4029c88c831784a7efd9 upstream. Lockdep found a locking inconsistency in the mkiss_close function: > kernel: [ INFO: inconsistent lock state ] > kernel: 2.6.39.1 #3 > kernel: --------------------------------- > kernel: inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage. > kernel: ax25ipd/2813 [HC0[0]:SC0[0]:HE1:SE1] takes: > kernel: (disc_data_lock){+++?.-}, at: [] mkiss_close+0x1b/0x90 [mkiss] > kernel: {IN-SOFTIRQ-R} state was registered at: The message hints that disc_data_lock is aquired with softirqs disabled, but does not itself disable softirqs, which can in rare circumstances lead to a deadlock. The same problem is present in the 6pack driver, this patch fixes both by using write_lock_bh instead of write_lock. Reported-by: Bernard F6BVP Tested-by: Bernard F6BVP Signed-off-by: Arnd Bergmann Acked-by: Ralf Baechle Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d2c7e885624f03cdbba80f124096e5073156969e Author: Trond Myklebust Date: Fri Jun 17 10:14:59 2011 -0400 SUNRPC: Ensure the RPC client only quits on fatal signals commit 5afa9133cfe67f1bfead6049a9640c9262a7101c upstream. Fix a couple of instances where we were exiting the RPC client on arbitrary signals. We should only do so on fatal signals. Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 1ca39696ba621b0737c78af2c104939c60b29ce4 Author: NeilBrown Date: Tue Jun 28 16:59:42 2011 +1000 md: avoid endless recovery loop when waiting for fail device to complete. commit 4274215d24633df7302069e51426659d4759c5ed upstream. If a device fails in a way that causes pending request to take a while to complete, md will not be able to immediately remove it from the array in remove_and_add_spares. It will then incorrectly look like a spare device and md will try to recover it even though it is failed. This leads to a recovery process starting and instantly aborting over and over again. We should check if the device is faulty before considering it to be a spare. This will avoid trying to start a recovery that cannot proceed. This bug was introduced in 2.6.26 so that patch is suitable for any kernel since then. Reported-by: Jim Paradis Signed-off-by: NeilBrown Signed-off-by: Greg Kroah-Hartman commit 48984ada7416c0f533bc2a0f886ccb6c3646fe9a Author: Jean Delvare Date: Wed Jun 29 11:36:10 2011 +0200 i2c-taos-evm: Fix log messages commit 9b640f2e154268cb516efcaf9c434f2e73c6783e upstream. * Print all error and information messages even when debugging is disabled. * Don't use adapter device to log messages before it is ready. Signed-off-by: Jean Delvare Signed-off-by: Greg Kroah-Hartman commit 1e03bb2a2712d29db6f3ab5c642d007658a91dd6 Author: Shaohua Li Date: Mon Jun 27 09:03:47 2011 +0200 cfq-iosched: fix a rcu warning commit 3181faa85bda3dc3f5e630a1846526c9caaa38e3 upstream. I got a rcu warnning at boot. the ioc->ioc_data is rcu_deferenced, but doesn't hold rcu_read_lock. Signed-off-by: Shaohua Li Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 1ff36a0e02f978e533b13ce6a86ad6a73444cec8 Author: Jens Axboe Date: Sun Jun 5 06:01:13 2011 +0200 cfq-iosched: fix locking around ioc->ioc_data assignment commit ab4bd22d3cce6977dc039664cc2d052e3147d662 upstream. Since we are modifying this RCU pointer, we need to hold the lock protecting it around it. This fixes a potential reuse and double free of a cfq io_context structure. The bug has been in CFQ for a long time, it hit very few people but those it did hit seemed to see it a lot. Tracked in RH bugzilla here: https://bugzilla.redhat.com/show_bug.cgi?id=577968 Credit goes to Paul Bolle for figuring out that the issue was around the one-hit ioc->ioc_data cache. Thanks to his hard work the issue is now fixed. Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 336fca97910371e16506e5448ad0ca9daae9a59c Author: Marcin Slusarz Date: Sat May 28 13:23:42 2011 +0200 debugobjects: Fix boot crash when kmemleak and debugobjects enabled commit 161b6ae0e067e421b20bb35caf66bdb405c929ac upstream. Order of initialization look like this: ... debugobjects kmemleak ...(lots of other subsystems)... workqueues (through early initcall) ... debugobjects use schedule_work for batch freeing of its data and kmemleak heavily use debugobjects, so when it comes to freeing and workqueues were not initialized yet, kernel crashes: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] __queue_work+0x29/0x41a [] queue_work_on+0x16/0x1d [] queue_work+0x29/0x55 [] schedule_work+0x13/0x15 [] free_object+0x90/0x95 [] debug_check_no_obj_freed+0x187/0x1d3 [] ? _raw_spin_unlock_irqrestore+0x30/0x4d [] ? free_object_rcu+0x68/0x6d [] kmem_cache_free+0x64/0x12c [] free_object_rcu+0x68/0x6d [] __rcu_process_callbacks+0x1b6/0x2d9 ... because system_wq is NULL. Fix it by checking if workqueues susbystem was initialized before using. Signed-off-by: Marcin Slusarz Cc: Catalin Marinas Cc: Tejun Heo Cc: Dipankar Sarma Cc: Paul E. McKenney Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 6d86a0ee7e2ffac7c94f738b5d9a74148034df52 Author: Florian Fainelli Date: Wed Jun 15 19:15:23 2011 +0200 watchdog: mtx1-wdt: request gpio before using it commit 9b19d40aa3ebaf1078779da10555da2ab8512422 upstream. Otherwise, the gpiolib autorequest feature will produce a WARN_ON(): WARNING: at drivers/gpio/gpiolib.c:101 0x8020ec6c() autorequest GPIO-215 [...] Signed-off-by: Florian Fainelli Signed-off-by: Wim Van Sebroeck Signed-off-by: Greg Kroah-Hartman commit 986e0f6528a12e9425989098083b7ccf1d846fa9 Author: Sjoerd Simons Date: Tue May 24 12:22:03 2011 -0300 uvcvideo: Remove buffers from the queues when freeing commit 8ca2c80b170c47eeb55f0c2a0f2b8edf85f35d49 upstream. When freeing memory for the video buffers also remove them from the irq & main queues. This fixes an oops when doing the following: open ("/dev/video", ..) VIDIOC_REQBUFS VIDIOC_QBUF VIDIOC_REQBUFS close () As the second VIDIOC_REQBUFS will cause the list entries of the buffers to be cleared while they still hang around on the main and irc queues Signed-off-by: Sjoerd Simons Acked-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit ae3862c401f5ebfaa5d3c92bc87147e7dad0e615 Author: Rafael Aquini Date: Wed Jun 15 15:08:39 2011 -0700 mm: fix negative commitlimit when gigantic hugepages are allocated commit b0320c7b7d1ac1bd5c2d9dff3258524ab39bad32 upstream. When 1GB hugepages are allocated on a system, free(1) reports less available memory than what really is installed in the box. Also, if the total size of hugepages allocated on a system is over half of the total memory size, CommitLimit becomes a negative number. The problem is that gigantic hugepages (order > MAX_ORDER) can only be allocated at boot with bootmem, thus its frames are not accounted to 'totalram_pages'. However, they are accounted to hugetlb_total_pages() What happens to turn CommitLimit into a negative number is this calculation, in fs/proc/meminfo.c: allowed = ((totalram_pages - hugetlb_total_pages()) * sysctl_overcommit_ratio / 100) + total_swap_pages; A similar calculation occurs in __vm_enough_memory() in mm/mmap.c. Also, every vm statistic which depends on 'totalram_pages' will render confusing values, as if system were 'missing' some part of its memory. Impact of this bug: When gigantic hugepages are allocated and sysctl_overcommit_memory == OVERCOMMIT_NEVER. In a such situation, __vm_enough_memory() goes through the mentioned 'allowed' calculation and might end up mistakenly returning -ENOMEM, thus forcing the system to start reclaiming pages earlier than it would be ususal, and this could cause detrimental impact to overall system's performance, depending on the workload. Besides the aforementioned scenario, I can only think of this causing annoyances with memory reports from /proc/meminfo and free(1). [akpm@linux-foundation.org: standardize comment layout] Reported-by: Russ Anderson Signed-off-by: Rafael Aquini Acked-by: Russ Anderson Cc: Andrea Arcangeli Cc: Christoph Lameter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6d40246c4ed85bcc88f195cbc5c613673a73306a Author: Eugene A. Shatokhin Date: Tue Jun 28 23:04:51 2011 -0400 ath5k: fix memory leak when fewer than N_PD_CURVES are in use commit a0b8de350be458b33248e48b2174d9af8a4c4798 upstream. We would free the proper number of curves, but in the wrong slots, due to a missing level of indirection through the pdgain_idx table. It's simpler just to try to free all four slots, so do that. Signed-off-by: Bob Copeland Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 1588e8230601e70f79743755c2e031392e6a9a53 Author: Michal Kubecek Date: Sat Jun 18 20:34:01 2011 +0200 PM: Free memory bitmaps if opening /dev/snapshot fails commit 8440f4b19494467883f8541b7aa28c7bbf6ac92b upstream. When opening /dev/snapshot device, snapshot_open() creates memory bitmaps which are freed in snapshot_release(). But if any of the callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open() fails, snapshot_release() is never called and bitmaps are not freed. Next attempt to open /dev/snapshot then triggers BUG_ON() check in create_basic_memory_bitmaps(). This happens e.g. when vmwatchdog module is active on s390x. Signed-off-by: Michal Kubecek Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit ddaa6a0bb301607d0e2a3c0414115a596a6f7c14 Author: Sarah Sharp Date: Sun Jun 5 23:10:04 2011 -0700 xhci: Reject double add of active endpoints. commit fa75ac379e63c2864e9049b5e8615e40f65c1e70 upstream. While trying to switch a UAS device from the BOT configuration to the UAS configuration via the bConfigurationValue file, Tanya ran into an issue in the USB core. usb_disable_device() sets entries in udev->ep_out and udev->ep_out to NULL, but doesn't call into the xHCI bandwidth management functions to remove the BOT configuration endpoints from the xHCI host's internal structures. The USB core would then attempt to add endpoints for the UAS configuration, and some of the endpoints had the same address as endpoints in the BOT configuration. The xHCI driver blindly added the endpoints again, but the xHCI host controller rejected the Configure Endpoint command because active endpoints were added without being dropped. Make the xHCI driver reject calls to xhci_add_endpoint() that attempt to add active endpoints without first calling xhci_drop_endpoint(). This should be backported to kernels as old as 2.6.31. Signed-off-by: Sarah Sharp Reported-by: Tanya Brokhman Signed-off-by: Greg Kroah-Hartman commit 9bb7bdfbc6e0699e6daeda46b245ca7964c8bfd4 Author: Jiri Slaby Date: Sun Jun 5 14:16:16 2011 +0200 TTY: ldisc, do not close until there are readers commit 92f6fa09bd453ffe3351fa1f1377a1b7cfa911e6 upstream. We restored tty_ldisc_wait_idle in 100eeae2c5c (TTY: restore tty_ldisc_wait_idle). We used it in the ldisc changing path to fix the case where there are tasks in n_tty_read waiting for data and somebody tries to change ldisc. Similar to the case above, there may be also tasks waiting in n_tty_read while hangup is performed. As 65b770468e98 (tty-ldisc: turn ldisc user count into a proper refcount) removed the wait-until-idle from all paths, hangup path won't wait for them to disappear either now. So add it back even to the hangup path. There is a difference, we need uninterruptible sleep as there is obviously HUP signal pending. So tty_ldisc_wait_idle now sleeps without possibility to be interrupted. This is what original tty_ldisc_wait_idle did. After the wait idle reintroduction (100eeae2c5c), we have had interruptible sleeps for the ldisc changing path. But as there is a 5s timeout anyway, we don't allow it to be interrupted from now on. It's not worth the added complexity of deciding what kind of sleep we want. Before 65b770468e98 tty_ldisc_release was called also from tty_ldisc_release. It is called from tty_release, so I don't think we need to restore that one. This is nicely reproducible after constifying the timing when drivers/tty/n_tty.c is patched as follows ("TTY: ntty, add one more sanity check" patch is needed to actually see it explode): %% -1548,6 +1549,7 @@ static int n_tty_open(struct tty_struct *tty) /* These are ugly. Currently a malloc failure here can panic */ if (!tty->read_buf) { + msleep(100); tty->read_buf = kzalloc(N_TTY_BUF_SIZE, GFP_KERNEL); if (!tty->read_buf) return -ENOMEM; %% -1785,6 +1788,7 @@ do_it_again: break; } timeout = schedule_timeout(timeout); + msleep(20); continue; } __set_current_state(TASK_RUNNING); ===== With a process: ===== while (1) { int fd = open(argv[1], O_RDWR); read(fd, buf, sizeof(buf)); close(fd); } ===== and its child: ===== setsid(); while (1) { int fd = open(tty, O_RDWR|O_NOCTTY); ioctl(fd, TIOCSCTTY, 1); vhangup(); close(fd); usleep(100 * (10 + random() % 1000)); } ===== EOF ===== References: https://bugzilla.novell.com/show_bug.cgi?id=693374 References: https://bugzilla.novell.com/show_bug.cgi?id=694509 Signed-off-by: Jiri Slaby Cc: Alan Cox Cc: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b63010f54cdcb456c3a29e242a0769e5b412d785 Author: Thomas Gleixner Date: Thu Jun 16 16:22:08 2011 +0200 clocksource: Make watchdog robust vs. interruption commit b5199515c25cca622495eb9c6a8a1d275e775088 upstream. The clocksource watchdog code is interruptible and it has been observed that this can trigger false positives which disable the TSC. The reason is that an interrupt storm or a long running interrupt handler between the read of the watchdog source and the read of the TSC brings the two far enough apart that the delta is larger than the unstable treshold. Move both reads into a short interrupt disabled region to avoid that. Reported-and-tested-by: Vernon Mauery Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 86df348681f6fe4c63a1511634ca50e148fb442f Author: Stefano Stabellini Date: Fri Jun 3 09:51:34 2011 +0000 xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped" commit a91d92875ee94e4703fd017ccaadb48cfb344994 upstream. We only need to set max_pfn_mapped to the last pfn mapped on x86_64 to make sure that cleanup_highmap doesn't remove important mappings at _end. We don't need to do this on x86_32 because cleanup_highmap is not called on x86_32. Besides lowering max_pfn_mapped on x86_32 has the unwanted side effect of limiting the amount of memory available for the 1:1 kernel pagetable allocation. This patch reverts the x86_32 part of the original patch. Signed-off-by: Stefano Stabellini Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Greg Kroah-Hartman commit f55a989b78ac67a318f256d64d7fdb64fac44a65 Author: Andrea Arcangeli Date: Thu Jun 16 12:56:19 2011 -0700 migrate: don't account swapcache as shmem commit 99a15e21d96f6857dafab1e5167e5e8183215c9c upstream. swapcache will reach the below code path in migrate_page_move_mapping, and swapcache is accounted as NR_FILE_PAGES but it's not accounted as NR_SHMEM. Hugh pointed out we must use PageSwapCache instead of comparing mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n. Signed-off-by: Andrea Arcangeli Acked-by: Hugh Dickins Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4553fbdf55fab298a213812d21988670a4bd0edd Author: Hugh Dickins Date: Wed Jun 15 15:08:58 2011 -0700 ksm: fix NULL pointer dereference in scan_get_next_rmap_item() commit 2b472611a32a72f4a118c069c2d62a1a3f087afd upstream. Andrea Righi reported a case where an exiting task can race against ksmd::scan_get_next_rmap_item (http://lkml.org/lkml/2011/6/1/742) easily triggering a NULL pointer dereference in ksmd. ksm_scan.mm_slot == &ksm_mm_head with only one registered mm CPU 1 (__ksm_exit) CPU 2 (scan_get_next_rmap_item) list_empty() is false lock slot == &ksm_mm_head list_del(slot->mm_list) (list now empty) unlock lock slot = list_entry(slot->mm_list.next) (list is empty, so slot is still ksm_mm_head) unlock slot->mm == NULL ... Oops Close this race by revalidating that the new slot is not simply the list head again. Andrea's test case: #include #include #include #include #define BUFSIZE getpagesize() int main(int argc, char **argv) { void *ptr; if (posix_memalign(&ptr, getpagesize(), BUFSIZE) < 0) { perror("posix_memalign"); exit(1); } if (madvise(ptr, BUFSIZE, MADV_MERGEABLE) < 0) { perror("madvise"); exit(1); } *(char *)NULL = 0; return 0; } Reported-by: Andrea Righi Tested-by: Andrea Righi Cc: Andrea Arcangeli Signed-off-by: Hugh Dickins Signed-off-by: Chris Wright Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman