commit 5878e067ecac4bd2320e933ec485c01190a5b881 Author: Paul Gortmaker Date: Mon Feb 10 17:31:40 2014 -0500 Linux 2.6.34.15 Signed-off-by: Paul Gortmaker commit dbe686b56faa362865b54f2e1dfa3ecc88977dd8 Author: Wang YanQing Date: Sun Apr 1 08:54:02 2012 +0800 video:uvesafb: Fix oops that uvesafb try to execute NX-protected page commit b78f29ca0516266431688c5eb42d39ce42ec039a upstream. This patch fix the oops below that catched in my machine [ 81.560602] uvesafb: NVIDIA Corporation, GT216 Board - 0696a290, Chip Rev , OEM: NVIDIA, VBE v3.0 [ 81.609384] uvesafb: protected mode interface info at c000:d350 [ 81.609388] uvesafb: pmi: set display start = c00cd3b3, set palette = c00cd40e [ 81.609390] uvesafb: pmi: ports = 3b4 3b5 3ba 3c0 3c1 3c4 3c5 3c6 3c7 3c8 3c9 3cc 3ce 3cf 3d0 3d1 3d2 3d3 3d4 3d5 3da [ 81.614558] uvesafb: VBIOS/hardware doesn't support DDC transfers [ 81.614562] uvesafb: no monitor limits have been set, default refresh rate will be used [ 81.614994] uvesafb: scrolling: ypan using protected mode interface, yres_virtual=4915 [ 81.744147] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 81.744153] BUG: unable to handle kernel paging request at c00cd3b3 [ 81.744159] IP: [] 0xc00cd3b2 [ 81.744167] *pdpt = 00000000016d6001 *pde = 0000000001c7b067 *pte = 80000000000cd163 [ 81.744171] Oops: 0011 [#1] SMP [ 81.744174] Modules linked in: uvesafb(+) cfbcopyarea cfbimgblt cfbfillrect [ 81.744178] [ 81.744181] Pid: 3497, comm: modprobe Not tainted 3.3.0-rc4NX+ #71 Acer Aspire 4741 /Aspire 4741 [ 81.744185] EIP: 0060:[] EFLAGS: 00010246 CPU: 0 [ 81.744187] EIP is at 0xc00cd3b3 [ 81.744189] EAX: 00004f07 EBX: 00000000 ECX: 00000000 EDX: 00000000 [ 81.744191] ESI: f763f000 EDI: f763f6e8 EBP: f57f3a0c ESP: f57f3a00 [ 81.744192] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 81.744195] Process modprobe (pid: 3497, ti=f57f2000 task=f748c600 task.ti=f57f2000) [ 81.744196] Stack: [ 81.744197] f82512c5 f759341c 00000000 f57f3a30 c124a9bc 00000001 00000001 000001e0 [ 81.744202] f8251280 f763f000 f7593400 00000000 f57f3a40 c12598dd f5c0c000 00000000 [ 81.744206] f57f3b10 c1255efe c125a21a 00000006 f763f09c 00000000 c1c6cb60 f7593400 [ 81.744210] Call Trace: [ 81.744215] [] ? uvesafb_pan_display+0x45/0x60 [uvesafb] [ 81.744222] [] fb_pan_display+0x10c/0x160 [ 81.744226] [] ? uvesafb_vbe_find_mode+0x180/0x180 [uvesafb] [ 81.744230] [] bit_update_start+0x1d/0x50 [ 81.744232] [] fbcon_switch+0x39e/0x550 [ 81.744235] [] ? bit_cursor+0x4ea/0x560 [ 81.744240] [] redraw_screen+0x12b/0x220 [ 81.744245] [] ? tty_do_resize+0x3b/0xc0 [ 81.744247] [] vc_do_resize+0x3d2/0x3e0 [ 81.744250] [] vc_resize+0x14/0x20 [ 81.744253] [] fbcon_init+0x29d/0x500 [ 81.744255] [] ? set_inverse_trans_unicode+0xe4/0x110 [ 81.744258] [] visual_init+0xb8/0x150 [ 81.744261] [] bind_con_driver+0x16c/0x360 [ 81.744264] [] ? register_con_driver+0x6e/0x190 [ 81.744267] [] take_over_console+0x41/0x50 [ 81.744269] [] fbcon_takeover+0x6a/0xd0 [ 81.744272] [] fbcon_event_notify+0x758/0x790 [ 81.744277] [] notifier_call_chain+0x42/0xb0 [ 81.744280] [] __blocking_notifier_call_chain+0x60/0x90 [ 81.744283] [] blocking_notifier_call_chain+0x1a/0x20 [ 81.744285] [] fb_notifier_call_chain+0x11/0x20 [ 81.744288] [] register_framebuffer+0x1d9/0x2b0 [ 81.744293] [] ? ioremap_wc+0x33/0x40 [ 81.744298] [] uvesafb_probe+0xaba/0xc40 [uvesafb] [ 81.744302] [] platform_drv_probe+0xf/0x20 [ 81.744306] [] driver_probe_device+0x68/0x170 [ 81.744309] [] __device_attach+0x41/0x50 [ 81.744313] [] bus_for_each_drv+0x48/0x70 [ 81.744316] [] device_attach+0x83/0xa0 [ 81.744319] [] ? __driver_attach+0x90/0x90 [ 81.744321] [] bus_probe_device+0x6f/0x90 [ 81.744324] [] device_add+0x5e5/0x680 [ 81.744329] [] ? kvasprintf+0x43/0x60 [ 81.744332] [] ? kobject_set_name_vargs+0x64/0x70 [ 81.744335] [] ? kobject_set_name_vargs+0x64/0x70 [ 81.744339] [] platform_device_add+0xff/0x1b0 [ 81.744343] [] uvesafb_init+0x50/0x9b [uvesafb] [ 81.744346] [] do_one_initcall+0x2f/0x170 [ 81.744350] [] ? uvesafb_is_valid_mode+0x66/0x66 [uvesafb] [ 81.744355] [] sys_init_module+0xf4/0x1410 [ 81.744359] [] ? vfsmount_lock_local_unlock_cpu+0x30/0x30 [ 81.744363] [] sysenter_do_call+0x12/0x36 [ 81.744365] Code: f5 00 00 00 32 f6 66 8b da 66 d1 e3 66 ba d4 03 8a e3 b0 1c 66 ef b0 1e 66 ef 8a e7 b0 1d 66 ef b0 1f 66 ef e8 fa 00 00 00 61 c3 <60> e8 c8 00 00 00 66 8b f3 66 8b da 66 ba d4 03 b0 0c 8a e5 66 [ 81.744388] EIP: [] 0xc00cd3b3 SS:ESP 0068:f57f3a00 [ 81.744391] CR2: 00000000c00cd3b3 [ 81.744393] ---[ end trace 18b2c87c925b54d6 ]--- Signed-off-by: Wang YanQing Cc: Michal Januszewski Cc: Alan Cox Signed-off-by: Florian Tobias Schandinat Signed-off-by: Paul Gortmaker commit 9407270c29e59a669c77b8e31fb901561b7d7682 Author: Kent Yoder Date: Thu Apr 5 20:34:20 2012 +0800 crypto: sha512 - Fix byte counter overflow in SHA-512 commit 25c3d30c918207556ae1d6e663150ebdf902186b upstream. The current code only increments the upper 64 bits of the SHA-512 byte counter when the number of bytes hashed happens to hit 2^64 exactly. This patch increments the upper 64 bits whenever the lower 64 bits overflows. Signed-off-by: Kent Yoder Signed-off-by: Herbert Xu Signed-off-by: Paul Gortmaker commit f126b9e7d62a13fea31fdfeab2e471935326ee09 Author: Thomas Jarosch Date: Wed Dec 7 22:08:11 2011 +0100 PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge GPUs commit cdb1f35dc7de42802527140a3613871c394548e1 upstream. commit f67fd55fa96f7d7295b43ffbc4a97d8f55e473aa upstream. Some BIOS implementations leave the Intel GPU interrupts enabled, even though no one is handling them (f.e. i915 driver is never loaded). Additionally the interrupt destination is not set up properly and the interrupt ends up -somewhere-. These spurious interrupts are "sticky" and the kernel disables the (shared) interrupt line after 100.000+ generated interrupts. Fix it by disabling the still enabled interrupts. This resolves crashes often seen on monitor unplug. Tested on the following boards: - Intel DH61CR: Affected - Intel DH67BL: Affected - Intel S1200KP server board: Affected - Asus P8H61-M LE: Affected, but system does not crash. Probably the IRQ ends up somewhere unnoticed. According to reports on the net, the Intel DH61WW board is also affected. Many thanks to Jesse Barnes from Intel for helping with the register configuration and to Intel in general for providing public hardware documentation. Signed-off-by: Thomas Jarosch Tested-by: Charlie Suffin Signed-off-by: Jesse Barnes Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau Signed-off-by: Paul Gortmaker commit 00d23e71b7712ea744b6cd3117a75b96bfc27a82 Author: Sasha Levin Date: Thu Apr 5 12:07:45 2012 +0000 phonet: Check input from user before allocating commit bcf1b70ac6eb0ed8286c66e6bf37cb747cbaa04c upstream. A phonet packet is limited to USHRT_MAX bytes, this is never checked during tx which means that the user can specify any size he wishes, and the kernel will attempt to allocate that size. In the good case, it'll lead to the following warning, but it may also cause the kernel to kick in the OOM and kill a random task on the server. [ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730() [ 8921.749770] Pid: 5081, comm: trinity Tainted: G W 3.4.0-rc1-next-20120402-sasha #46 [ 8921.756672] Call Trace: [ 8921.758185] [] warn_slowpath_common+0x87/0xb0 [ 8921.762868] [] warn_slowpath_null+0x15/0x20 [ 8921.765399] [] __alloc_pages_slowpath+0x65/0x730 [ 8921.769226] [] ? zone_watermark_ok+0x1a/0x20 [ 8921.771686] [] ? get_page_from_freelist+0x625/0x660 [ 8921.773919] [] __alloc_pages_nodemask+0x1f8/0x240 [ 8921.776248] [] kmalloc_large_node+0x70/0xc0 [ 8921.778294] [] __kmalloc_node_track_caller+0x34/0x1c0 [ 8921.780847] [] ? sock_alloc_send_pskb+0xbc/0x260 [ 8921.783179] [] __alloc_skb+0x75/0x170 [ 8921.784971] [] sock_alloc_send_pskb+0xbc/0x260 [ 8921.787111] [] ? release_sock+0x7e/0x90 [ 8921.788973] [] sock_alloc_send_skb+0x10/0x20 [ 8921.791052] [] pep_sendmsg+0x60/0x380 [ 8921.792931] [] ? pn_socket_bind+0x156/0x180 [ 8921.794917] [] ? pn_socket_autobind+0x3f/0x90 [ 8921.797053] [] pn_socket_sendmsg+0x4f/0x70 [ 8921.798992] [] sock_aio_write+0x187/0x1b0 [ 8921.801395] [] ? sub_preempt_count+0xae/0xf0 [ 8921.803501] [] ? __lock_acquire+0x42c/0x4b0 [ 8921.805505] [] ? __sock_recv_ts_and_drops+0x140/0x140 [ 8921.807860] [] do_sync_readv_writev+0xbc/0x110 [ 8921.809986] [] ? might_fault+0x97/0xa0 [ 8921.811998] [] ? security_file_permission+0x1e/0x90 [ 8921.814595] [] do_readv_writev+0xe2/0x1e0 [ 8921.816702] [] ? do_setitimer+0x1ac/0x200 [ 8921.818819] [] ? get_parent_ip+0x11/0x50 [ 8921.820863] [] ? sub_preempt_count+0xae/0xf0 [ 8921.823318] [] vfs_writev+0x46/0x60 [ 8921.825219] [] sys_writev+0x4f/0xb0 [ 8921.827127] [] system_call_fastpath+0x16/0x1b [ 8921.829384] ---[ end trace dffe390f30db9eb7 ]--- Signed-off-by: Sasha Levin Acked-by: Rémi Denis-Courmont Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit bcbb4d014a85dcbb499b9a9cfcfbd9f779e44edf Author: Pavel Shilovsky Date: Thu May 10 19:49:38 2012 +0400 fuse: fix stat call on 32 bit platforms commit 45c72cd73c788dd18c8113d4a404d6b4a01decf1 upstream. Now we store attr->ino at inode->i_ino, return attr->ino at the first time and then return inode->i_ino if the attribute timeout isn't expired. That's wrong on 32 bit platforms because attr->ino is 64 bit and inode->i_ino is 32 bit in this case. Fix this by saving 64 bit ino in fuse_inode structure and returning it every time we call getattr. Also squash attr->ino into inode->i_ino explicitly. Signed-off-by: Pavel Shilovsky Signed-off-by: Miklos Szeredi Signed-off-by: Paul Gortmaker commit a02302d63a1a2cda7bad8391b9da621a56f7765d Author: Tyler Hicks Date: Tue Jun 12 11:17:01 2012 -0700 eCryptfs: Properly check for O_RDONLY flag before doing privileged open commit 9fe79d7600497ed8a95c3981cbe5b73ab98222f0 upstream. If the first attempt at opening the lower file read/write fails, eCryptfs will retry using a privileged kthread. However, the privileged retry should not happen if the lower file's inode is read-only because a read/write open will still be unsuccessful. The check for determining if the open should be retried was intended to be based on the access mode of the lower file's open flags being O_RDONLY, but the check was incorrectly performed. This would cause the open to be retried by the privileged kthread, resulting in a second failed open of the lower file. This patch corrects the check to determine if the open request should be handled by the privileged kthread. Signed-off-by: Tyler Hicks Reported-by: Dan Carpenter Acked-by: Dan Carpenter Signed-off-by: Paul Gortmaker commit 177fa7d76336190ee4ac544ab45fb0dde5b499b2 Author: Dan Williams Date: Thu Jun 21 23:25:32 2012 -0700 fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations) commit 57fc2e335fd3c2f898ee73570dc81426c28dc7b4 upstream. Rapid ata hotplug on a libsas controller results in cases where libsas is waiting indefinitely on eh to perform an ata probe. A race exists between scsi_schedule_eh() and scsi_restart_operations() in the case when scsi_restart_operations() issues i/o to other devices in the sas domain. When this happens the host state transitions from SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and ->host_busy is non-zero so we put the eh thread to sleep even though ->host_eh_scheduled is active. Before putting the error handler to sleep we need to check if the host_state needs to return to SHOST_RECOVERY for another trip through eh. Since i/o that is released by scsi_restart_operations has been blocked for at least one eh cycle, this implementation allows those i/o's to run before another eh cycle starts to discourage hung task timeouts. Reported-by: Tom Jackson Tested-by: Tom Jackson Signed-off-by: Dan Williams Signed-off-by: James Bottomley Signed-off-by: Paul Gortmaker commit c3997b4070ddc798cd9906f3e64de2d184af6086 Author: Dan Williams Date: Thu Jun 21 23:36:20 2012 -0700 SCSI: libsas: fix sas_discover_devices return code handling commit e69e5d3d25d6b58543f782a515baeda064e2b601 upstream. commit b17caa174a7e1fd2e17b26e210d4ee91c4c28b37 upstream. commit 198439e4 [SCSI] libsas: do not set res = 0 in sas_ex_discover_dev() commit 19252de6 [SCSI] libsas: fix wide port hotplug issues The above commits seem to have confused the return value of sas_ex_discover_dev which is non-zero on failure and sas_ex_join_wide_port which just indicates short circuiting discovery on already established ports. The result is random discovery failures depending on configuration. Calls to sas_ex_join_wide_port are the source of the trouble as its return value is errantly assigned to 'res'. Convert it to bool and stop returning its result up the stack. Tested-by: Dan Melnic Reported-by: Dan Melnic Signed-off-by: Dan Williams Reviewed-by: Jack Wang Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau Signed-off-by: Paul Gortmaker commit 7888dfe3f5d02cce3c82b1f99cedb460983e4b4b Author: Dan Williams Date: Thu Jun 21 23:36:15 2012 -0700 libsas: continue revalidation commit 26f2f199ff150d8876b2641c41e60d1c92d2fb81 upstream. Continue running revalidation until no more broadcast devices are discovered. Fixes cases where re-discovery completes too early in a domain with multiple expanders with pending re-discovery events. Servicing BCNs can get backed up behind error recovery. Signed-off-by: Dan Williams Signed-off-by: James Bottomley Signed-off-by: Paul Gortmaker commit 425f11f628631c26ebc4dfe0dabb77e0eb4643af Author: Bart Van Assche Date: Fri Jun 29 15:34:26 2012 +0000 Avoid dangling pointer in scsi_requeue_command() commit 940f5d47e2f2e1fa00443921a0abf4822335b54d upstream. When we call scsi_unprep_request() the command associated with the request gets destroyed and therefore drops its reference on the device. If this was the only reference, the device may get released and we end up with a NULL pointer deref when we call blk_requeue_request. Reported-by: Mike Christie Signed-off-by: Bart Van Assche Reviewed-by: Mike Christie Reviewed-by: Tejun Heo [jejb: enhance commend and add commit log for stable] Signed-off-by: James Bottomley Signed-off-by: Paul Gortmaker commit 1f16241465b7dbe61a8abe70f732bf031c4f41c2 Author: Paul Moore Date: Tue Jul 17 11:07:47 2012 +0000 cipso: don't follow a NULL pointer when setsockopt() is called commit a9d0acf8d157c30374af76d43e7f05b5b108be0c upstream. [ Upstream commit 89d7ae34cdda4195809a5a987f697a517a2a3177 ] As reported by Alan Cox, and verified by Lin Ming, when a user attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL tag the kernel dies a terrible death when it attempts to follow a NULL pointer (the skb argument to cipso_v4_validate() is NULL when called via the setsockopt() syscall). This patch fixes this by first checking to ensure that the skb is non-NULL before using it to find the incoming network interface. In the unlikely case where the skb is NULL and the user attempts to add a CIPSO option with the _TAG_LOCAL tag we return an error as this is not something we want to allow. A simple reproducer, kindly supplied by Lin Ming, although you must have the CIPSO DOI #3 configure on the system first or you will be caught early in cipso_v4_validate(): #include #include #include #include #include struct local_tag { char type; char length; char info[4]; }; struct cipso { char type; char length; char doi[4]; struct local_tag local; }; int main(int argc, char **argv) { int sockfd; struct cipso cipso = { .type = IPOPT_CIPSO, .length = sizeof(struct cipso), .local = { .type = 128, .length = sizeof(struct local_tag), }, }; memset(cipso.doi, 0, 4); cipso.doi[3] = 3; sockfd = socket(AF_INET, SOCK_DGRAM, 0); #define SOL_IP 0 setsockopt(sockfd, SOL_IP, IP_OPTIONS, &cipso, sizeof(struct cipso)); return 0; } CC: Lin Ming Reported-by: Alan Cox Signed-off-by: Paul Moore Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman Signed-off-by: Willy Tarreau Signed-off-by: Paul Gortmaker commit b945852dc87c7a995f3f092b84fb824c984fcb6c Author: Darren Hart Date: Fri Jul 20 11:53:29 2012 -0700 futex: Test for pi_mutex on fault in futex_wait_requeue_pi() commit b6070a8d9853eda010a549fa9a09eb8d7269b929 upstream. If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test for pi_mutex != NULL before testing the owner against current and possibly unlocking it. Signed-off-by: Darren Hart Cc: Dave Jones Cc: Dan Carpenter Link: http://lkml.kernel.org/r/dc59890338fc413606f04e5c5b131530734dae3d.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit 2d080ad431b7550b103c7fbac16c8d30dcf9f823 Author: Darren Hart Date: Fri Jul 20 11:53:30 2012 -0700 futex: Fix bug in WARN_ON for NULL q.pi_state commit f27071cb7fe3e1d37a9dbe6c0dfc5395cd40fa43 upstream. The WARN_ON in futex_wait_requeue_pi() for a NULL q.pi_state was testing the address (&q.pi_state) of the pointer instead of the value (q.pi_state) of the pointer. Correct it accordingly. Signed-off-by: Darren Hart Cc: Dave Jones Link: http://lkml.kernel.org/r/1c85d97f6e5f79ec389a4ead3e367363c74bd09a.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit d2ee189d2e71d4ce73e31207f0a6474167adb0a3 Author: Darren Hart Date: Fri Jul 20 11:53:31 2012 -0700 futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi() commit 6f7b0a2a5c0fb03be7c25bd1745baa50582348ef upstream. If uaddr == uaddr2, then we have broken the rule of only requeueing from a non-pi futex to a pi futex with this call. If we attempt this, as the trinity test suite manages to do, we miss early wakeups as q.key is equal to key2 (because they are the same uaddr). We will then attempt to dereference the pi_mutex (which would exist had the futex_q been properly requeued to a pi futex) and trigger a NULL pointer dereference. Signed-off-by: Darren Hart Cc: Dave Jones Link: http://lkml.kernel.org/r/ad82bfe7f7d130247fbe2b5b4275654807774227.1342809673.git.dvhart@linux.intel.com Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit a4e83523250937f22d079b483ac313e71f85881a Author: Greg Pearson Date: Mon Jul 30 14:39:05 2012 -0700 pcdp: use early_ioremap/early_iounmap to access pcdp table commit 6c4088ac3a4d82779903433bcd5f048c58fb1aca upstream. efi_setup_pcdp_console() is called during boot to parse the HCDP/PCDP EFI system table and setup an early console for printk output. The routine uses ioremap/iounmap to setup access to the HCDP/PCDP table information. The call to ioremap is happening early in the boot process which leads to a panic on x86_64 systems: panic+0x01ca do_exit+0x043c oops_end+0x00a7 no_context+0x0119 __bad_area_nosemaphore+0x0138 bad_area_nosemaphore+0x000e do_page_fault+0x0321 page_fault+0x0020 reserve_memtype+0x02a1 __ioremap_caller+0x0123 ioremap_nocache+0x0012 efi_setup_pcdp_console+0x002b setup_arch+0x03a9 start_kernel+0x00d4 x86_64_start_reservations+0x012c x86_64_start_kernel+0x00fe This replaces the calls to ioremap/iounmap in efi_setup_pcdp_console() with calls to early_ioremap/early_iounmap which can be called during early boot. This patch was tested on an x86_64 prototype system which uses the HCDP/PCDP table for early console setup. Signed-off-by: Greg Pearson Acked-by: Khalid Aziz Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 05cbfc11ad3e22b60b1b44b8dd2181c0be4e100d Author: Zach Brown Date: Tue Jul 24 12:10:11 2012 -0700 fuse: verify all ioctl retry iov elements commit fb6ccff667712c46b4501b920ea73a326e49626a upstream. Commit 7572777eef78ebdee1ecb7c258c0ef94d35bad16 attempted to verify that the total iovec from the client doesn't overflow iov_length() but it only checked the first element. The iovec could still overflow by starting with a small element. The obvious fix is to check all the elements. The overflow case doesn't look dangerous to the kernel as the copy is limited by the length after the overflow. This fix restores the intention of returning an error instead of successfully copying less than the iovec represented. I found this by code inspection. I built it but don't have a test case. I'm cc:ing stable because the initial commit did as well. Signed-off-by: Zach Brown Signed-off-by: Miklos Szeredi Signed-off-by: Paul Gortmaker commit 55d06a71c12b8e7dd05517d3893cc6b99cd3c189 Author: Al Viro Date: Mon Aug 20 15:28:00 2012 +0100 vfs: missed source of ->f_pos races commit 0e665d5d1125f9f4ccff56a75e814f10f88861a2 upstream. compat_sys_{read,write}v() need the same "pass a copy of file->f_pos" thing as sys_{read,write}{,v}(). Signed-off-by: Al Viro Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 2b718efa69f2431de9df4b35e725316fa25d5310 Author: J. Bruce Fields Date: Fri Aug 17 17:31:53 2012 -0400 svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping commit d10f27a750312ed5638c876e4bd6aa83664cccd8 upstream. The rpc server tries to ensure that there will be room to send a reply before it receives a request. It does this by tracking, in xpt_reserved, an upper bound on the total size of the replies that is has already committed to for the socket. Currently it is adding in the estimate for a new reply *before* it checks whether there is space available. If it finds that there is not space, it then subtracts the estimate back out. This may lead the subsequent svc_xprt_enqueue to decide that there is space after all. The results is a svc_recv() that will repeatedly return -EAGAIN, causing server threads to loop without doing any actual work. Reported-by: Michael Tokarev Tested-by: Michael Tokarev Signed-off-by: J. Bruce Fields Signed-off-by: Paul Gortmaker commit c8af16321375a4d6b4bc2cf874292d5943586d91 Author: J. Bruce Fields Date: Mon Aug 20 16:04:40 2012 -0400 svcrpc: sends on closed socket should stop immediately commit f06f00a24d76e168ecb38d352126fd203937b601 upstream. svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply. However, the XPT_CLOSE won't be acted on immediately. Meanwhile other threads could send further replies before the socket is really shut down. This can manifest as data corruption: for example, if a truncated read reply is followed by another rpc reply, that second reply will look to the client like further read data. Symptoms were data corruption preceded by svc_tcp_sendto logging something like kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket Reported-by: Malahal Naineni Tested-by: Malahal Naineni Signed-off-by: J. Bruce Fields Signed-off-by: Paul Gortmaker commit 0b139c76d03dfe13f15492f8ce2d39f006b68a86 Author: Dave Jones Date: Thu Sep 6 12:01:00 2012 -0400 Remove user-triggerable BUG from mpol_to_str commit 80de7c3138ee9fd86a98696fd2cf7ad89b995d0a upstream. Trivially triggerable, found by trinity: kernel BUG at mm/mempolicy.c:2546! Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670) Call Trace: show_numa_map+0xd5/0x450 show_pid_numa_map+0x13/0x20 traverse+0xf2/0x230 seq_read+0x34b/0x3e0 vfs_read+0xac/0x180 sys_pread64+0xa2/0xc0 system_call_fastpath+0x1a/0x1f RIP: mpol_to_str+0x156/0x360 Signed-off-by: Dave Jones Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit aae13dc959d0c0bf80836b7b5d540f95ecab7c9c Author: Mathias Krause Date: Wed Aug 15 11:31:54 2012 +0000 dccp: check ccid before dereferencing commit 276bdb82dedb290511467a5a4fdbe9f0b52dce6f upstream. ccid_hc_rx_getsockopt() and ccid_hc_tx_getsockopt() might be called with a NULL ccid pointer leading to a NULL pointer dereference. This could lead to a privilege escalation if the attacker is able to map page 0 and prepare it with a fake ccid_ops pointer. Signed-off-by: Mathias Krause Cc: Gerrit Renker Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 55a95b59f4728dd3739accd091cd9a775870abc0 Author: Dan Carpenter Date: Sat Jun 9 19:08:25 2012 +0300 mtd: cafe_nand: fix an & vs | mistake commit 48f8b641297df49021093763a3271119a84990a2 upstream. The intent here was clearly to set result to true if the 0x40000000 flag was set. But instead there was a | vs & typo and we always set result to true. Artem: check the spec at wiki.laptop.org/images/5/5c/88ALP01_Datasheet_July_2007.pdf and this fix looks correct. Signed-off-by: Dan Carpenter Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Paul Gortmaker commit 14424d71693c91f25d74ac09793d9f4aae43c104 Author: Andi Kleen Date: Thu Oct 28 13:16:13 2010 +0100 Fix install_process_keyring error handling commit 27d6379894be4a81984da4d48002196a83939ca9 upstream. Fix an incorrect error check that returns 1 for error instead of the expected error code. Signed-off-by: Andi Kleen Signed-off-by: David Howells Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 000e338c2808b8f416df36f43074a46eebdffff0 Author: Nikola Pajkovsky Date: Wed Aug 15 00:38:08 2012 +0200 udf: fix retun value on error path in udf_load_logicalvol commit 68766a2edcd5cd744262a70a2f67a320ac944760 upstream. In case we detect a problem and bail out, we fail to set "ret" to a nonzero value, and udf_load_logicalvol will mistakenly report success. Signed-off-by: Nikola Pajkovsky Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit b8c528e117de6c2250ffa550c05504b4580f4210 Author: Jan Kara Date: Wed Sep 5 15:48:23 2012 +0200 udf: Fix data corruption for files in ICB commit 9c2fc0de1a6e638fe58c354a463f544f42a90a09 upstream. When a file is stored in ICB (inode), we overwrite part of the file, and the page containing file's data is not in page cache, we end up corrupting file's data by overwriting them with zeros. The problem is we use simple_write_begin() which simply zeroes parts of the page which are not written to. The problem has been introduced by be021ee4 (udf: convert to new aops). Fix the problem by providing a ->write_begin function which makes the page properly uptodate. Reported-by: Ian Abbott Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit 8d0ad4e77355f6e8beb2adc421c9b8586a1071fc Author: Jan Kara Date: Tue Feb 5 13:59:56 2013 +0100 udf: Fix bitmap overflow on large filesystems with small block size commit 89b1f39eb4189de745fae554b0d614d87c8d5c63 upstream. For large UDF filesystems with 512-byte blocks the number of necessary bitmap blocks is larger than 2^16 so s_nr_groups in udf_bitmap overflows (the number will overflow for filesystems larger than 128 GB with 512-byte blocks). That results in ENOSPC errors despite the filesystem has plenty of free space. Fix the problem by changing s_nr_groups' type to 'int'. That is enough even for filesystems 2^32 blocks (UDF maximum) and 512-byte blocksize. Reported-and-tested-by: v10lator@myway.de Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit cd50b96209b07d1e650f037dcdcb5ceb862c4445 Author: Mathias Krause Date: Thu Jul 12 08:46:55 2012 +0200 udf: avoid info leak on export commit 0143fc5e9f6f5aad4764801015bc8d4b4a278200 upstream. For type 0x51 the udf.parent_partref member in struct fid gets copied uninitialized to userland. Fix this by initializing it to 0. Signed-off-by: Mathias Krause Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit 3cb827b381dc2aace21720eaef0dee832f3ffd46 Author: Namjae Jeon Date: Wed Oct 10 00:08:56 2012 +0900 udf: fix memory leak while allocating blocks during write commit 2fb7d99d0de3fd8ae869f35ab682581d8455887a upstream. Need to brelse the buffer_head stored in cur_epos and next_epos. Signed-off-by: Namjae Jeon Signed-off-by: Ashish Sangwan Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit c6577c482cfa98c8171c99d48606edf83736f150 Author: Chris Mason Date: Wed Jul 25 15:57:13 2012 -0400 Btrfs: call the ordered free operation without any locks held commit e9fbcb42201c862fd6ab45c48ead4f47bb2dea9d upstream. Each ordered operation has a free callback, and this was called with the worker spinlock held. Josef made the free callback also call iput, which we can't do with the spinlock. This drops the spinlock for the free operation and grabs it again before moving through the rest of the list. We'll circle back around to this and find a cleaner way that doesn't bounce the lock around so much. Signed-off-by: Chris Mason Signed-off-by: Paul Gortmaker commit 232efe16823d169342d8638b06b56ce15f3e1c14 Author: Eric Sandeen Date: Sat Mar 9 15:18:39 2013 +0000 btrfs: use rcu_barrier() to wait for bdev puts at unmount commit bc178622d40d87e75abc131007342429c9b03351 upstream. Doing this would reliably fail with -EBUSY for me: # mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2 ... unable to open /dev/sdb2: Device or resource busy because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it. Using systemtap to track bdev gets & puts shows a kworker thread doing a blkdev put after mkfs attempts a get; this is left over from the unmount path: btrfs_close_devices __btrfs_close_devices call_rcu(&device->rcu, free_device); free_device INIT_WORK(&device->rcu_work, __free_device); schedule_work(&device->rcu_work); so unmount might complete before __free_device fires & does its blkdev_put. Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait until all blkdev_put()s are done, and the device is truly free once unmount completes. Signed-off-by: Eric Sandeen Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Paul Gortmaker commit 1d0a00e9979b465cb6d5ca22829437fd96e0f76e Author: Jan Kara Date: Sun Dec 18 17:37:02 2011 -0500 ext4: fix error handling on inode bitmap corruption commit acd6ad83517639e8f09a8c5525b1dccd81cd2a10 upstream. When insert_inode_locked() fails in ext4_new_inode() it most likely means inode bitmap got corrupted and we allocated again inode which is already in use. Also doing unlock_new_inode() during error recovery is wrong since the inode does not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:) which declares filesystem error and does not call unlock_new_inode(). Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 45c2d5ba97f2187ce6155d7e9f0963a628f2c0ce Author: Theodore Ts'o Date: Thu Dec 27 01:42:50 2012 -0500 ext4: avoid hang when mounting non-journal filesystems with orphan list commit 0e9a9a1ad619e7e987815d20262d36a2f95717ca upstream. When trying to mount a file system which does not contain a journal, but which does have a orphan list containing an inode which needs to be truncated, the mount call with hang forever in ext4_orphan_cleanup() because ext4_orphan_del() will return immediately without removing the inode from the orphan list, leading to an uninterruptible loop in kernel code which will busy out one of the CPU's on the system. This can be trivially reproduced by trying to mount the file system found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs source tree. If a malicious user were to put this on a USB stick, and mount it on a Linux desktop which has automatic mounts enabled, this could be considered a potential denial of service attack. (Not a big deal in practice, but professional paranoids worry about such things, and have even been known to allocate CVE numbers for such problems.) Signed-off-by: "Theodore Ts'o" Reviewed-by: Zheng Liu Signed-off-by: Paul Gortmaker commit d910eb981287e3c4a64ce452e207566cdbd7d239 Author: Anatol Pomozov Date: Tue Sep 18 13:38:59 2012 -0400 ext4: make orphan functions be no-op in no-journal mode commit c9b92530a723ac5ef8e352885a1862b18f31b2f5 upstream. Instead of checking whether the handle is valid, we check if journal is enabled. This avoids taking the s_orphan_lock mutex in all cases when there is no journal in use, including the error paths where ext4_orphan_del() is called with a handle set to NULL. Signed-off-by: Anatol Pomozov Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 823f7ea93a21cb9d4403ce193cae60b950ae415d Author: Lachlan McIlroy Date: Sun May 5 23:10:00 2013 -0400 ext4: limit group search loop for non-extent files commit e6155736ad76b2070652745f9e54cdea3f0d8567 upstream. In the case where we are allocating for a non-extent file, we must limit the groups we allocate from to those below 2^32 blocks, and ext4_mb_regular_allocator() attempts to do this initially by putting a cap on ngroups for the subsequent search loop. However, the initial target group comes in from the allocation context (ac), and it may already be beyond the artificially limited ngroups. In this case, the limit if (group == ngroups) group = 0; at the top of the loop is never true, and the loop will run away. Catch this case inside the loop and reset the search to start at group 0. [sandeen@redhat.com: add commit msg & comments] Signed-off-by: Lachlan McIlroy Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 1adc9064553cd3efd382d49e3198988af5462c29 Author: Niu Yawei Date: Fri Feb 1 21:31:27 2013 -0500 ext4: fix race in ext4_mb_add_n_trim() commit f1167009711032b0d747ec89a632a626c901a1ad upstream. In ext4_mb_add_n_trim(), lg_prealloc_lock should be taken when changing the lg_prealloc_list. Signed-off-by: Niu Yawei Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 0fd8d8773fd8c2a136efbc0f0e705e51ee181be9 Author: Theodore Ts'o Date: Thu Dec 27 01:42:48 2012 -0500 ext4: lock i_mutex when truncating orphan inodes commit 721e3eba21e43532e438652dd8f1fcdfce3187e7 upstream. Commit c278531d39 added a warning when ext4_flush_unwritten_io() is called without i_mutex being taken. It had previously not been taken during orphan cleanup since races weren't possible at that point in the mount process, but as a result of this c278531d39, we will now see a kernel WARN_ON in this case. Take the i_mutex in ext4_orphan_cleanup() to suppress this warning. Reported-by: Alexander Beregalov Signed-off-by: "Theodore Ts'o" Reviewed-by: Zheng Liu Signed-off-by: Paul Gortmaker commit 34de201cb95e93565d55a5bb5e12997133eeffd5 Author: Jan Kara Date: Wed Sep 26 21:52:20 2012 -0400 ext4: fix fdatasync() for files with only i_size changes commit b71fc079b5d8f42b2a52743c8d2f1d35d655b1c5 upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit c26795029709e433ff88a436ca935b215e81c8f2 Author: Bernd Schubert Date: Wed Sep 26 21:24:57 2012 -0400 ext4: always set i_op in ext4_mknod() commit 6a08f447facb4f9e29fcc30fb68060bb5a0d21c2 upstream. ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR to mask those methods. And ext4_iget also always sets it, so there is an inconsistency. Signed-off-by: Bernd Schubert Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit faf16058ac73346057ac1c4f5126c0769c1f75e7 Author: Dmitry Monakhov Date: Wed Sep 26 12:32:54 2012 -0400 ext4: online defrag is not supported for journaled files commit f066055a3449f0e5b0ae4f3ceab4445bead47638 upstream. Proper block swap for inodes with full journaling enabled is truly non obvious task. In order to be on a safe side let's explicitly disable it for now. Signed-off-by: Dmitry Monakhov Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 41d4b0bae2fffc6558ac846199caee5baa8dd7f2 Author: Eugene Shatokhin Date: Thu Nov 8 15:11:11 2012 -0500 ext4: fix memory leak in ext4_xattr_set_acl()'s error path commit 24ec19b0ae83a385ad9c55520716da671274b96c upstream. In ext4_xattr_set_acl(), if ext4_journal_start() returns an error, posix_acl_release() will not be called for 'acl' which may result in a memory leak. This patch fixes that. Reviewed-by: Lukas Czerner Signed-off-by: Eugene Shatokhin Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 3b73beed27b6ad3877b7257ae17419930b8ae90c Author: Allison Henderson Date: Sun May 15 00:19:41 2011 -0400 ext4: don't dereference null pointer when make_indexed_dir() fails commit 6976a6f2acde2b0443cd64f1d08af90630e4ce81 upstream. Fix for a null pointer bug found while running punch hole tests Signed-off-by: Allison Henderson Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 20341d16ad9e8df56cbeac40471a6883ca60d751 Author: Jan Kara Date: Tue May 3 11:05:55 2011 -0400 ext4: Fix fs corruption when make_indexed_dir() fails commit 7ad8e4e6ae2a7c95445ee1715b1714106fb95037 upstream. When make_indexed_dir() fails (e.g. because of ENOSPC) after it has allocated block for index tree root, we did not properly mark all changed buffers dirty. This lead to only some of these buffers being written out and thus effectively corrupting the directory. Fix the issue by marking all changed data dirty even in the error failure case. Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 9e1326beaffcf5ccd9f177e1c8d45e5925637fab Author: Brian Foster Date: Sun Jul 22 23:59:40 2012 -0400 ext4: don't let i_reserved_meta_blocks go negative commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream. If we hit a condition where we have allocated metadata blocks that were not appropriately reserved, we risk underflow of ei->i_reserved_meta_blocks. In turn, this can throw sbi->s_dirtyclusters_counter significantly out of whack and undermine the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this occurs and set i_allocated_meta_blocks to avoid this problem. This condition is reproduced by xfstests 270 against ext2 with delalloc enabled: Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28 Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost 270 ultimately fails with an inconsistent filesystem and requires an fsck to repair. The cause of the error is an underflow in ext4_da_update_reserve_space() due to an unreserved meta block allocation. Signed-off-by: Brian Foster Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 8a6682199359240e66450999a79ae7c66a95bb42 Author: Jan Kara Date: Thu Dec 8 21:13:46 2011 +0100 ext3: Fix error handling on inode bitmap corruption commit 1415dd8705394399d59a3df1ab48d149e1e41e77 upstream. When insert_inode_locked() fails in ext3_new_inode() it most likely means inode bitmap got corrupted and we allocated again inode which is already in use. Also doing unlock_new_inode() during error recovery is wrong since inode does not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:) which declares filesystem error and does not call unlock_new_inode(). Reviewed-by: Eric Sandeen Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit 8840b297ba5508ee30452e5b1a876a0dcd09708b Author: Jan Kara Date: Mon Sep 3 16:50:42 2012 +0200 ext3: Fix fdatasync() for files with only i_size changes commit 156bddd8e505b295540f3ca0e27dda68cb0d49aa upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit e3d9f94025ecac934f29da1190ab8518201a46f7 Author: Tyler Hicks Date: Tue Feb 7 17:55:40 2012 -0600 eCryptfs: Copy up lower inode attrs after setting lower xattr commit 545d680938be1e86a6c5250701ce9abaf360c495 upstream. After passing through a ->setxattr() call, eCryptfs needs to copy the inode attributes from the lower inode to the eCryptfs inode, as they may have changed in the lower filesystem's ->setxattr() path. One example is if an extended attribute containing a POSIX Access Control List is being set. The new ACL may cause the lower filesystem to modify the mode of the lower inode and the eCryptfs inode would need to be updated to reflect the new mode. https://launchpad.net/bugs/926292 Signed-off-by: Tyler Hicks Reported-by: Sebastien Bacher Cc: John Johansen Signed-off-by: Paul Gortmaker commit 79de2d735050cc831de871470ddf1a67d95f0faa Author: Roberto Sassu Date: Tue Oct 5 18:53:45 2010 +0200 ecryptfs: call vfs_setxattr() in ecryptfs_setxattr() commit 48b512e6857139393cdfce26348c362b87537018 upstream. Ecryptfs is a stackable filesystem which relies on lower filesystems the ability of setting/getting extended attributes. If there is a security module enabled on the system it updates the 'security' field of inodes according to the owned extended attribute set with the function vfs_setxattr(). When this function is performed on a ecryptfs filesystem the 'security' field is not updated for the lower filesystem since the call security_inode_post_setxattr() is missing for the lower inode. Further, the call security_inode_setxattr() is missing for the lower inode, leading to policy violations in the security module because specific checks for this hook are not performed (i. e. filesystem 'associate' permission on SELinux is not checked for the lower filesystem). This patch replaces the call of the setxattr() method of the lower inode in the function ecryptfs_setxattr() with vfs_setxattr(). Signed-off-by: Roberto Sassu Cc: Dustin Kirkland Acked-by: James Morris Signed-off-by: Tyler Hicks Signed-off-by: Paul Gortmaker commit e5febf17bb287575969adf6c7ff14117adab7465 Author: Eric Sandeen Date: Mon Feb 20 17:53:01 2012 -0500 jbd2: clear BH_Delay & BH_Unwritten in journal_unmap_buffer commit 15291164b22a357cb211b618adfef4fa82fc0de3 upstream. journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head state ala discard_buffer(), but does not touch _Delay or _Unwritten as discard_buffer() does. This can be problematic in some areas of the ext4 code which assume that if they have found a buffer marked unwritten or delay, then it's a live one. Perhaps those spots should check whether it is mapped as well, but if jbd2 is going to tear down a buffer, let's really tear it down completely. Without this I get some fsx failures on sub-page-block filesystems up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go away, because buried within that large change is some more flag clearing. I still think it's worth doing in jbd2, since ->invalidatepage leads here directly, and it's the right place to clear away these flags. Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit 1213acec13a9011a5cd8b90e66b748871319851f Author: Jan Kara Date: Fri Nov 23 14:03:04 2012 +0100 jbd: Fix lock ordering bug in journal_unmap_buffer() commit 25389bb207987b5774182f763b9fb65ff08761c8 upstream. Commit 09e05d48 introduced a wait for transaction commit into journal_unmap_buffer() in the case we are truncating a buffer undergoing commit in the page stradding i_size on a filesystem with blocksize < pagesize. Sadly we forgot to drop buffer lock before waiting for transaction commit and thus deadlock is possible when kjournald wants to lock the buffer. Fix the problem by dropping the buffer lock before waiting for transaction commit. Since we are still holding page lock (and that is OK), buffer cannot disappear under us. Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit 2d15ac56c76d8c78eb46b45831c4eed1ba5d500e Author: Jan Kara Date: Wed Jul 11 23:16:25 2012 +0200 jbd: Fix assertion failure in commit code due to lacking transaction credits commit 09e05d4805e6c524c1af74e524e5d0528bb3fef3 upstream. ext3 users of data=journal mode with blocksize < pagesize were occasionally hitting assertion failure in journal_commit_transaction() checking whether the transaction has at least as many credits reserved as buffers attached. The core of the problem is that when a file gets truncated, buffers that still need checkpointing or that are attached to the committing transaction are left with buffer_mapped set. When this happens to buffers beyond i_size attached to a page stradding i_size, subsequent write extending the file will see these buffers and as they are mapped (but underlying blocks were freed) things go awry from here. The assertion failure just coincidentally (and in this case luckily as we would start corrupting filesystem) triggers due to journal_head not being properly cleaned up as well. Under some rare circumstances this bug could even hit data=ordered mode users. There the assertion won't trigger and we would end up corrupting the filesystem. We fix the problem by unmapping buffers if possible (in lots of cases we just need a buffer attached to a transaction as a place holder but it must not be written out anyway). And in one case, we just have to bite the bullet and wait for transaction commit to finish. Reviewed-by: Josef Bacik Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit 7ea27a8916dccec0b5854266febb09414c3de588 Author: Mathias Krause Date: Thu Jul 12 08:46:54 2012 +0200 isofs: avoid info leak on export commit fe685aabf7c8c9f138e5ea900954d295bf229175 upstream. For type 1 the parent_offset member in struct isofs_fid gets copied uninitialized to userland. Fix this by initializing it to 0. Signed-off-by: Mathias Krause Signed-off-by: Jan Kara Signed-off-by: Paul Gortmaker commit ab3487d189499530df7af9acdfbc0972aeb4f400 Author: Cong Ding Date: Tue Jan 22 19:20:58 2013 -0500 fs/cifs/cifs_dfs_ref.c: fix potential memory leakage commit 10b8c7dff5d3633b69e77f57d404dab54ead3787 upstream. When it goes to error through line 144, the memory allocated to *devname is not freed, and the caller doesn't free it either in line 250. So we free the memroy of *devname in function cifs_compose_mount_options() when it goes to error. Signed-off-by: Cong Ding Reviewed-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Paul Gortmaker commit 6918382fea8962b86f724f035d14db541887fd48 Author: Greg Thelen Date: Fri Feb 22 16:36:01 2013 -0800 tmpfs: fix use-after-free of mempolicy object commit 5f00110f7273f9ff04ac69a5f85bb535a4fd0987 upstream. The tmpfs remount logic preserves filesystem mempolicy if the mpol=M option is not specified in the remount request. A new policy can be specified if mpol=M is given. Before this patch remounting an mpol bound tmpfs without specifying mpol= mount option in the remount request would set the filesystem's mempolicy object to a freed mempolicy object. To reproduce the problem boot a DEBUG_PAGEALLOC kernel and run: # mkdir /tmp/x # mount -t tmpfs -o size=100M,mpol=interleave nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=102400k,mpol=interleave:0-3 0 0 # mount -o remount,size=200M nodev /tmp/x # grep /tmp/x /proc/mounts nodev /tmp/x tmpfs rw,relatime,size=204800k,mpol=??? 0 0 # note ? garbage in mpol=... output above # dd if=/dev/zero of=/tmp/x/f count=1 # panic here Panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [< (null)>] (null) [...] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC Call Trace: mpol_shared_policy_init+0xa5/0x160 shmem_get_inode+0x209/0x270 shmem_mknod+0x3e/0xf0 shmem_create+0x18/0x20 vfs_create+0xb5/0x130 do_last+0x9a1/0xea0 path_openat+0xb3/0x4d0 do_filp_open+0x42/0xa0 do_sys_open+0xfe/0x1e0 compat_sys_open+0x1b/0x20 cstar_dispatch+0x7/0x1f Non-debug kernels will not crash immediately because referencing the dangling mpol will not cause a fault. Instead the filesystem will reference a freed mempolicy object, which will cause unpredictable behavior. The problem boils down to a dropped mpol reference below if shmem_parse_options() does not allocate a new mpol: config = *sbinfo shmem_parse_options(data, &config, true) mpol_put(sbinfo->mpol) sbinfo->mpol = config.mpol /* BUG: saves unreferenced mpol */ This patch avoids the crash by not releasing the mempolicy if shmem_parse_options() doesn't create a new mpol. How far back does this issue go? I see it in both 2.6.36 and 3.3. I did not look back further. Signed-off-by: Greg Thelen Acked-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 393762b54f9ea3bdb0a4392da85063c6b7ab79fd Author: Geert Uytterhoeven Date: Sat Sep 29 22:23:19 2012 +0200 sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat() commit 66081a72517a131430dcf986775f3268aafcb546 upstream. The warning check for duplicate sysfs entries can cause a buffer overflow when printing the warning, as strcat() doesn't check buffer sizes. Use strlcat() instead. Since strlcat() doesn't return a pointer to the passed buffer, unlike strcat(), I had to convert the nested concatenation in sysfs_add_one() to an admittedly more obscure comma operator construct, to avoid emitting code for the concatenation if CONFIG_BUG is disabled. Signed-off-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 93925c373315e778707417d2c696326e163da83b Author: Anurup m Date: Mon Apr 29 15:05:52 2013 -0700 fs/fscache/stats.c: fix memory leak commit ec686c9239b4d472052a271c505d04dae84214cc upstream. There is a kernel memory leak observed when the proc file /proc/fs/fscache/stats is read. The reason is that in fscache_stats_open, single_open is called and the respective release function is not called during release. Hence fix with correct release function - single_release(). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=57101 Signed-off-by: Anurup m Cc: shyju pv Cc: Sanil kumar Cc: Nataraj m Cc: Li Zefan Cc: David Howells Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit fabe66bd6a4683f802b3cc6ec59579effb1afc41 Author: Kees Cook Date: Thu Oct 25 13:38:16 2012 -0700 fs/compat_ioctl.c: VIDEO_SET_SPU_PALETTE missing error check commit 12176503366885edd542389eed3aaf94be163fdb upstream. The compat ioctl for VIDEO_SET_SPU_PALETTE was missing an error check while converting ioctl arguments. This could lead to leaking kernel stack contents into userspace. Patch extracted from existing fix in grsecurity. Signed-off-by: Kees Cook Cc: David Miller Cc: Brad Spengler Cc: PaX Team Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 356ad39592cfcb537a512b2f88ed44380ae5cd78 Author: Eric Wong Date: Tue Jan 1 21:20:27 2013 +0000 epoll: prevent missed events on EPOLL_CTL_MOD commit 128dd1759d96ad36c379240f8b9463e8acfd37a1 upstream. EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to ensure events are not missed. Since the modifications to the interest mask are not protected by the same lock as ep_poll_callback, we need to ensure the change is visible to other CPUs calling ep_poll_callback. We also need to ensure f_op->poll() has an up-to-date view of past events which occured before we modified the interest mask. So this barrier also pairs with the barrier in wq_has_sleeper(). This should guarantee either ep_poll_callback or f_op->poll() (or both) will notice the readiness of a recently-ready/modified item. This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in: http://thread.gmane.org/gmane.linux.kernel/1408782/ Signed-off-by: Eric Wong Cc: Hans Verkuil Cc: Jiri Olsa Cc: Jonathan Corbet Cc: Al Viro Cc: Davide Libenzi Cc: Hans de Goede Cc: Mauro Carvalho Chehab Cc: David Miller Cc: Eric Dumazet Cc: Andrew Morton Cc: Andreas Voellmy Tested-by: "Junchang(Jason) Wang" Cc: netdev@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit d4a80d611a9b7d9c8b605325b17d422fb0cd6fe7 Author: Hans de Goede Date: Wed Jul 4 09:18:01 2012 +0200 usbdevfs: Correct amount of data copied to user in processcompl_compat commit 2102e06a5f2e414694921f23591f072a5ba7db9f upstream. iso data buffers may have holes in them if some packets were short, so for iso urbs we should always copy the entire buffer, just like the regular processcompl does. Signed-off-by: Hans de Goede Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 93206d0617751590deab8fd4615280117fea8cc5 Author: Johan Hovold Date: Tue Mar 20 16:59:33 2012 +0100 USB: serial: fix race between probe and open commit a65a6f14dc24a90bde3f5d0073ba2364476200bf upstream. Fix race between probe and open by making sure that the disconnected flag is not cleared until all ports have been registered. A call to tty_open while probe is running may get a reference to the serial structure in serial_install before its ports have been registered. This may lead to usb_serial_core calling driver open before port is fully initialised. With ftdi_sio this result in the following NULL-pointer dereference as the private data has not been initialised at open: [ 199.698286] IP: [] ftdi_open+0x59/0xe0 [ftdi_sio] [ 199.698297] *pde = 00000000 [ 199.698303] Oops: 0000 [#1] PREEMPT SMP [ 199.698313] Modules linked in: ftdi_sio usbserial [ 199.698323] [ 199.698327] Pid: 1146, comm: ftdi_open Not tainted 3.2.11 #70 Dell Inc. Vostro 1520/0T816J [ 199.698339] EIP: 0060:[] EFLAGS: 00010286 CPU: 0 [ 199.698344] EIP is at ftdi_open+0x59/0xe0 [ftdi_sio] [ 199.698348] EAX: 0000003e EBX: f5067000 ECX: 00000000 EDX: 80000600 [ 199.698352] ESI: f48d8800 EDI: 00000001 EBP: f515dd54 ESP: f515dcfc [ 199.698356] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 199.698361] Process ftdi_open (pid: 1146, ti=f515c000 task=f481e040 task.ti=f515c000) [ 199.698364] Stack: [ 199.698368] f811a9fe f811a9e0 f811b3ef 00000000 00000000 00001388 00000000 f4a86800 [ 199.698387] 00000002 00000000 f806e68e 00000000 f532765c f481e040 00000246 22222222 [ 199.698479] 22222222 22222222 22222222 f5067004 f5327600 f5327638 f515dd74 f806e6ab [ 199.698496] Call Trace: [ 199.698504] [] ? serial_activate+0x2e/0x70 [usbserial] [ 199.698511] [] serial_activate+0x4b/0x70 [usbserial] [ 199.698521] [] tty_port_open+0x7c/0xd0 [ 199.698527] [] ? serial_set_termios+0xa0/0xa0 [usbserial] [ 199.698534] [] serial_open+0x2f/0x70 [usbserial] [ 199.698540] [] tty_open+0x20c/0x510 [ 199.698546] [] chrdev_open+0xe7/0x230 [ 199.698553] [] __dentry_open+0x1f2/0x390 [ 199.698559] [] ? _raw_spin_unlock+0x2c/0x50 [ 199.698565] [] nameidata_to_filp+0x66/0x80 [ 199.698570] [] ? cdev_put+0x20/0x20 [ 199.698576] [] do_last+0x198/0x730 [ 199.698581] [] path_openat+0xa0/0x350 [ 199.698587] [] do_filp_open+0x35/0x80 [ 199.698593] [] ? _raw_spin_unlock+0x2c/0x50 [ 199.698599] [] ? alloc_fd+0xc0/0x100 [ 199.698605] [] ? getname_flags+0x72/0x120 [ 199.698611] [] do_sys_open+0xf0/0x1c0 [ 199.698617] [] ? trace_hardirqs_on_thunk+0xc/0x10 [ 199.698623] [] sys_open+0x2e/0x40 [ 199.698628] [] sysenter_do_call+0x12/0x36 [ 199.698632] Code: 85 89 00 00 00 8b 16 8b 4d c0 c1 e2 08 c7 44 24 14 88 13 00 00 81 ca 00 00 00 80 c7 44 24 10 00 00 00 00 c7 44 24 0c 00 00 00 00 <0f> b7 41 78 31 c9 89 44 24 08 c7 44 24 04 00 00 00 00 c7 04 24 [ 199.698884] EIP: [] ftdi_open+0x59/0xe0 [ftdi_sio] SS:ESP 0068:f515dcfc [ 199.698893] CR2: 0000000000000078 [ 199.698925] ---[ end trace 77c43ec023940cff ]--- Reported-and-tested-by: Ken Huang Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit b2d347d067ae2624ad63067a846dd680ae283668 Author: Bjørn Mork Date: Mon Jul 2 10:33:14 2012 +0200 USB: cdc-wdm: fix lockup on error in wdm_read commit b086b6b10d9f182cd8d2f0dcfd7fd11edba93fc9 upstream. Clear the WDM_READ flag on empty reads to avoid running forever in an infinite tight loop, causing lockups: Jul 1 21:58:11 nemi kernel: [ 3658.898647] qmi_wwan 2-1:1.2: Unexpected error -71 Jul 1 21:58:36 nemi kernel: [ 3684.072021] BUG: soft lockup - CPU#0 stuck for 23s! [qmi.pl:12235] Jul 1 21:58:36 nemi kernel: [ 3684.072212] CPU 0 Jul 1 21:58:36 nemi kernel: [ 3684.072355] Jul 1 21:58:36 nemi kernel: [ 3684.072367] Pid: 12235, comm: qmi.pl Tainted: P O 3.5.0-rc2+ #13 LENOVO 2776LEG/2776LEG Jul 1 21:58:36 nemi kernel: [ 3684.072383] RIP: 0010:[] [] spin_unlock_irq+0x8/0xc [cdc_wdm] Jul 1 21:58:36 nemi kernel: [ 3684.072388] RSP: 0018:ffff88022dca1e70 EFLAGS: 00000282 Jul 1 21:58:36 nemi kernel: [ 3684.072393] RAX: ffff88022fc3f650 RBX: ffffffff811c56f7 RCX: 00000001000ce8c1 Jul 1 21:58:36 nemi kernel: [ 3684.072398] RDX: 0000000000000010 RSI: 000000000267d810 RDI: ffff88022fc3f650 Jul 1 21:58:36 nemi kernel: [ 3684.072403] RBP: ffff88022dca1eb0 R08: ffffffffa063578e R09: 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072407] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000002 Jul 1 21:58:36 nemi kernel: [ 3684.072412] R13: 0000000000000246 R14: ffffffff00000002 R15: ffff8802281d8c88 Jul 1 21:58:36 nemi kernel: [ 3684.072418] FS: 00007f666a260700(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jul 1 21:58:36 nemi kernel: [ 3684.072428] CR2: 000000000270d9d8 CR3: 000000022e865000 CR4: 00000000000007f0 Jul 1 21:58:36 nemi kernel: [ 3684.072433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jul 1 21:58:36 nemi kernel: [ 3684.072444] Process qmi.pl (pid: 12235, threadinfo ffff88022dca0000, task ffff88022ff76380) Jul 1 21:58:36 nemi kernel: [ 3684.072448] Stack: Jul 1 21:58:36 nemi kernel: [ 3684.072458] ffffffffa063592e 0000000100020000 ffff88022fc3f650 ffff88022fc3f6a8 Jul 1 21:58:36 nemi kernel: [ 3684.072466] 0000000000000200 0000000100000000 000000000267d810 0000000000000000 Jul 1 21:58:36 nemi kernel: [ 3684.072475] 0000000000000000 ffff880212cfb6d0 0000000000000200 ffff880212cfb6c0 Jul 1 21:58:36 nemi kernel: [ 3684.072479] Call Trace: Jul 1 21:58:36 nemi kernel: [ 3684.072489] [] ? wdm_read+0x1a0/0x263 [cdc_wdm] Jul 1 21:58:36 nemi kernel: [ 3684.072500] [] ? vfs_read+0xa1/0xfb Jul 1 21:58:36 nemi kernel: [ 3684.072509] [] ? alarm_setitimer+0x35/0x64 Jul 1 21:58:36 nemi kernel: [ 3684.072517] [] ? sys_read+0x45/0x6e Jul 1 21:58:36 nemi kernel: [ 3684.072525] [] ? system_call_fastpath+0x16/0x1b Jul 1 21:58:36 nemi kernel: [ 3684.072557] Code: <66> 66 90 c3 83 ff ed 89 f8 74 16 7f 06 83 ff a1 75 0a c3 83 ff f4 The WDM_READ flag is normally cleared by wdm_int_callback before resubmitting the read urb, and set by wdm_in_callback when this urb returns with data or an error. But a crashing device may cause both a read error and cancelling all urbs. Make sure that the flag is cleared by wdm_read if the buffer is empty. We don't clear the flag on errors, as there may be pending data in the buffer which should be processed. The flag will instead be cleared on the next wdm_read call. Signed-off-by: Bjørn Mork Acked-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 208aeaad26586d1ca242139764b70d43fe112324 Author: Dan Carpenter Date: Fri Jul 27 01:46:51 2012 +0000 USB: kaweth.c: use GFP_ATOMIC under spin_lock commit e4c7f259c5be99dcfc3d98f913590663b0305bf8 upstream. The problem is that we call this with a spin lock held. The call tree is: kaweth_start_xmit() holds kaweth->device_lock. -> kaweth_async_set_rx_mode() -> kaweth_control() -> kaweth_internal_control_msg() The kaweth_internal_control_msg() function is only called from kaweth_control() which used GFP_ATOMIC for its allocations. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 7387a31d0ed4fe2e6bf1f22a3dac5d0c8a9ff3dd Author: Colin Ian King Date: Mon Jul 30 16:06:42 2012 +0100 USB: echi-dbgp: increase the controller wait time to come out of halt. commit f96a4216e85050c0a9d41a41ecb0ae9d8e39b509 upstream. The default 10 microsecond delay for the controller to come out of halt in dbgp_ehci_startup is too short, so increase it to 1 millisecond. This is based on emperical testing on various USB debug ports on modern machines such as a Lenovo X220i and an Ivybridge development platform that needed to wait ~450-950 microseconds. Signed-off-by: Colin Ian King Signed-off-by: Jason Wessel Signed-off-by: Paul Gortmaker commit d3eeb6997e5fa986cf4606f4ddb30140e3331be5 Author: Mark Ferrell Date: Tue Jul 24 14:15:13 2012 -0500 usb: serial: mos7840: Fixup mos7840_chars_in_buffer() commit 5c263b92f828af6a8cf54041db45ceae5af8f2ab upstream. * Use the buffer content length as opposed to the total buffer size. This can be a real problem when using the mos7840 as a usb serial-console as all kernel output is truncated during boot. Signed-off-by: Mark Ferrell Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 49bfc0de47432b653b20413fb4241586bcbf294d Author: Sven Schnelle Date: Fri Aug 17 21:43:43 2012 +0200 USB: CDC ACM: Fix NULL pointer dereference commit 99f347caa4568cb803862730b3b1f1942639523f upstream. If a device specifies zero endpoints in its interface descriptor, the kernel oopses in acm_probe(). Even though that's clearly an invalid descriptor, we should test wether we have all endpoints. This is especially bad as this oops can be triggered by just plugging a USB device in. Signed-off-by: Sven Schnelle Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 8c59f0a9b89d27df04c174f589a0e85043fea642 Author: Andrew Worsley Date: Tue Nov 22 20:00:19 2011 +1100 USB: serial: ftdi_sio: Handle the old_termios == 0 case e.g. uart_resume_port() commit c515598e0f5769916c31c00392cc2bfe6af74e55 upstream. Handle null old_termios in ftdi_set_termios() calls from uart_resume_port(). Signed-off-by: Andrew Worsley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit a30f1701a01b2f64e6d497d3ee0144811b479600 Author: Oliver Neukum Date: Tue Mar 12 14:52:42 2013 +0100 USB: cdc-wdm: fix buffer overflow commit c0f5ecee4e741667b2493c742b60b6218d40b3aa upstream. The buffer for responses must not overflow. If this would happen, set a flag, drop the data and return an error after user space has read all remaining data. Signed-off-by: Oliver Neukum Signed-off-by: Greg Kroah-Hartman [PG: minor adjustment since RESET from 880442027569 isn't in .34] Signed-off-by: Paul Gortmaker commit d04eaae1d63ba7f89d54ea80df8f79ecba14ea53 Author: Wolfgang Frisch Date: Thu Jan 17 01:07:02 2013 +0100 USB: io_ti: Fix NULL dereference in chase_port() commit 1ee0a224bc9aad1de496c795f96bc6ba2c394811 upstream. The tty is NULL when the port is hanging up. chase_port() needs to check for this. This patch is intended for stable series. The behavior was observed and tested in Linux 3.2 and 3.7.1. Johan Hovold submitted a more elaborate patch for the mainline kernel. [ 56.277883] usb 1-1: edge_bulk_in_callback - nonzero read bulk status received: -84 [ 56.278811] usb 1-1: USB disconnect, device number 3 [ 56.278856] usb 1-1: edge_bulk_in_callback - stopping read! [ 56.279562] BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8 [ 56.280536] IP: [] _raw_spin_lock_irqsave+0x19/0x35 [ 56.281212] PGD 1dc1b067 PUD 1e0f7067 PMD 0 [ 56.282085] Oops: 0002 [#1] SMP [ 56.282744] Modules linked in: [ 56.283512] CPU 1 [ 56.283512] Pid: 25, comm: khubd Not tainted 3.7.1 #1 innotek GmbH VirtualBox/VirtualBox [ 56.283512] RIP: 0010:[] [] _raw_spin_lock_irqsave+0x19/0x35 [ 56.283512] RSP: 0018:ffff88001fa99ab0 EFLAGS: 00010046 [ 56.283512] RAX: 0000000000000046 RBX: 00000000000001c8 RCX: 0000000000640064 [ 56.283512] RDX: 0000000000010000 RSI: ffff88001fa99b20 RDI: 00000000000001c8 [ 56.283512] RBP: ffff88001fa99b20 R08: 0000000000000000 R09: 0000000000000000 [ 56.283512] R10: 0000000000000000 R11: ffffffff812fcb4c R12: ffff88001ddf53c0 [ 56.283512] R13: 0000000000000000 R14: 00000000000001c8 R15: ffff88001e19b9f4 [ 56.283512] FS: 0000000000000000(0000) GS:ffff88001fd00000(0000) knlGS:0000000000000000 [ 56.283512] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 56.283512] CR2: 00000000000001c8 CR3: 000000001dc51000 CR4: 00000000000006e0 [ 56.283512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 56.283512] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 56.283512] Process khubd (pid: 25, threadinfo ffff88001fa98000, task ffff88001fa94f80) [ 56.283512] Stack: [ 56.283512] 0000000000000046 00000000000001c8 ffffffff810578ec ffffffff812fcb4c [ 56.283512] ffff88001e19b980 0000000000002710 ffffffff812ffe81 0000000000000001 [ 56.283512] ffff88001fa94f80 0000000000000202 ffffffff00000001 0000000000000296 [ 56.283512] Call Trace: [ 56.283512] [] ? add_wait_queue+0x12/0x3c [ 56.283512] [] ? usb_serial_port_work+0x28/0x28 [ 56.283512] [] ? chase_port+0x84/0x2d6 [ 56.283512] [] ? try_to_wake_up+0x199/0x199 [ 56.283512] [] ? tty_ldisc_hangup+0x222/0x298 [ 56.283512] [] ? edge_close+0x64/0x129 [ 56.283512] [] ? __wake_up+0x35/0x46 [ 56.283512] [] ? should_resched+0x5/0x23 [ 56.283512] [] ? tty_port_shutdown+0x39/0x44 [ 56.283512] [] ? usb_serial_port_work+0x28/0x28 [ 56.283512] [] ? __tty_hangup+0x307/0x351 [ 56.283512] [] ? usb_hcd_flush_endpoint+0xde/0xed [ 56.283512] [] ? _raw_spin_lock_irqsave+0x14/0x35 [ 56.283512] [] ? usb_serial_disconnect+0x57/0xc2 [ 56.283512] [] ? usb_unbind_interface+0x5c/0x131 [ 56.283512] [] ? __device_release_driver+0x7f/0xd5 [ 56.283512] [] ? device_release_driver+0x1a/0x25 [ 56.283512] [] ? bus_remove_device+0xd2/0xe7 [ 56.283512] [] ? device_del+0x119/0x167 [ 56.283512] [] ? usb_disable_device+0x6a/0x180 [ 56.283512] [] ? usb_disconnect+0x81/0xe6 [ 56.283512] [] ? hub_thread+0x577/0xe82 [ 56.283512] [] ? __schedule+0x490/0x4be [ 56.283512] [] ? abort_exclusive_wait+0x79/0x79 [ 56.283512] [] ? usb_remote_wakeup+0x2f/0x2f [ 56.283512] [] ? usb_remote_wakeup+0x2f/0x2f [ 56.283512] [] ? kthread+0x81/0x89 [ 56.283512] [] ? __kthread_parkme+0x5c/0x5c [ 56.283512] [] ? ret_from_fork+0x7c/0xb0 [ 56.283512] [] ? __kthread_parkme+0x5c/0x5c [ 56.283512] Code: 8b 7c 24 08 e8 17 0b c3 ff 48 8b 04 24 48 83 c4 10 c3 53 48 89 fb 41 50 e8 e0 0a c3 ff 48 89 04 24 e8 e7 0a c3 ff ba 00 00 01 00 0f c1 13 48 8b 04 24 89 d1 c1 ea 10 66 39 d1 74 07 f3 90 66 [ 56.283512] RIP [] _raw_spin_lock_irqsave+0x19/0x35 [ 56.283512] RSP [ 56.283512] CR2: 00000000000001c8 [ 56.283512] ---[ end trace 49714df27e1679ce ]--- Signed-off-by: Wolfgang Frisch Cc: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 94213dca0f01147c7407aa64916e4c6c0205e64d Author: Johan Hovold Date: Tue Mar 19 09:21:07 2013 +0100 USB: garmin_gps: fix memory leak on disconnect commit 618aa1068df29c37a58045fe940f9106664153fd upstream. Remove bogus disconnect test introduced by 95bef012e ("USB: more serial drivers writing after disconnect") which prevented queued data from being freed on disconnect. The possible IO it was supposed to prevent is long gone. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit ba8439dd599756c5364419f173fdfe3ee27e319d Author: Johan Hovold Date: Thu Oct 25 13:35:10 2012 +0200 USB: mos7840: fix port-device leak in error path commit 3eb55cc4ed88eee3b5230f66abcdbd2a91639eda upstream. The driver set the usb-serial port pointers to NULL on errors in attach, effectively preventing usb-serial core from decrementing the port ref counters and releasing the port devices and associated data. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit ef5ebb463321e2b61e9f88a2ce44eb183a68478d Author: Johan Hovold Date: Thu Oct 25 13:35:09 2012 +0200 USB: mos7840: fix urb leak at release commit 65a4cdbb170e4ec1a7fa0e94936d47e24a17b0e8 upstream. Make sure control urb is freed at release. Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit fa6a45bdce7042309bafd1a777033eef998f5452 Author: Lennart Sorensen Date: Wed Oct 24 10:23:09 2012 -0400 USB: serial: Fix memory leak in sierra_release() commit f7bc5051667b74c3861f79eed98c60d5c3b883f7 upstream. I found a memory leak in sierra_release() (well sierra_probe() I guess) that looses 8 bytes each time the driver releases a device. Signed-off-by: Len Sorensen Acked-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit c899734ce7fb4a9ccb42efd647d81eabfb008275 Author: Johan Hovold Date: Thu Oct 25 10:29:01 2012 +0200 USB: whiteheat: fix memory leak in error path commit c129197c99550d356cf5f69b046994dd53cd1b9d upstream. Make sure command buffer is deallocated in case of errors during attach. Cc: Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 7947fee39844605ece99d8a4f0be8de93513bae2 Author: Alan Stern Date: Tue Jul 5 12:34:05 2011 -0400 USB: EHCI: go back to using the system clock for QH unlinks commit 004c19682884d4f40000ce1ded53f4a1d0b18206 upstream. This patch (as1477) fixes a problem affecting a few types of EHCI controller. Contrary to what one might expect, these controllers automatically stop their internal frame counter when no ports are enabled. Since ehci-hcd currently relies on the frame counter for determining when it should unlink QHs from the async schedule, those controllers run into trouble: The frame counter stops and the QHs never get unlinked. Some systems have also experienced other problems traced back to commit b963801164618e25fbdc0cd452ce49c3628b46c8 (USB: ehci-hcd unlink speedups), which made the original switch from using the system clock to using the frame counter. It never became clear what the reason was for these problems, but evidently it is related to use of the frame counter. To fix all these problems, this patch more or less reverts that commit and goes back to using the system clock. But this can't be done cleanly because other changes have since been made to the scan_async() subroutine. One of these changes involved the tricky logic that tries to avoid rescanning QHs that have already been seen when the scanning loop is restarted, which happens whenever an URB is given back. Switching back to clock-based unlinks would make this logic even more complicated. Therefore the new code doesn't rescan the entire async list whenever a giveback occurs. Instead it rescans only the current QH and continues on from there. This requires the use of a separate pointer to keep track of the next QH to scan, since the current QH may be unlinked while the scanning is in progress. That new pointer must be global, so that it can be adjusted forward whenever the _next_ QH gets unlinked. (uhci-hcd uses this same trick.) Simplification of the scanning loop removes a level of indentation, which accounts for the size of the patch. The amount of code changed is relatively small, and it isn't exactly a reversion of the b963801164 commit. This fixes Bugzilla #32432. Signed-off-by: Alan Stern Tested-by: Matej Kenda Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 2b46f2b71df80632b1f5295fc564a5386a0d9b70 Author: Sarah Sharp Date: Fri Mar 16 13:09:39 2012 -0700 xhci: Don't write zeroed pointers to xHC registers. commit 159e1fcc9a60fc7daba23ee8fcdb99799de3fe84 upstream. When xhci_mem_cleanup() is called, we can't be sure if the xHC is actually halted. We can ask the xHC to halt by writing to the RUN bit in the command register, but that might timeout due to a HW hang. If the host controller is still running, we should not write zeroed values to the event ring dequeue pointers or base tables, the DCBAA pointers, or the command ring pointers. Eric Fu reports his VIA VL800 host accesses the event ring pointers after a failed register restore on resume from suspend. The hypothesis is that the host never actually halted before the register write to change the event ring pointer to zero. Remove all writes of zeroed values to pointer registers in xhci_mem_cleanup(). Instead, make all callers of the function reset the host controller first, which will reset those registers to zero. xhci_mem_init() is the only caller that doesn't first halt and reset the host controller before calling xhci_mem_cleanup(). This should be backported to kernels as old as 2.6.32. Signed-off-by: Sarah Sharp Tested-by: Elric Fu Signed-off-by: Paul Gortmaker commit 617734b48a50fbcb8b5a9eba56686570d7a4bb6c Author: Alex He Date: Fri Mar 30 10:21:38 2012 +0800 xHCI: Correct the #define XHCI_LEGACY_DISABLE_SMI commit 95018a53f7653e791bba1f54c8d75d9cb700d1bd upstream. Re-define XHCI_LEGACY_DISABLE_SMI and used it in right way. All SMI enable bits will be cleared to zero and flag bits 29:31 are also cleared to zero. Other bits should be presvered as Table 146. This patch should be backported to kernels as old as 2.6.31. Signed-off-by: Alex He Signed-off-by: Sarah Sharp Signed-off-by: Paul Gortmaker commit a4ad381723f28cef4b50f302801d9081e83f6ede Author: Sarah Sharp Date: Tue May 8 07:09:26 2012 -0700 xhci: Reset reserved command ring TRBs on cleanup. commit 33b2831ac870d50cc8e01c317b07fb1e69c13fe1 upstream. When the xHCI driver needs to clean up memory (perhaps due to a failed register restore on resume from S3 or resume from S4), it needs to reset the number of reserved TRBs on the command ring to zero. Otherwise, several resume cycles (about 30) with a UAS device attached will continually increment the number of reserved TRBs, until all command submissions fail because there isn't enough room on the command ring. This patch should be backported to kernels as old as 2.6.32, that contain the commit 913a8a344ffcaf0b4a586d6662a2c66a7106557d "USB: xhci: Change how xHCI commands are handled." Signed-off-by: Sarah Sharp Signed-off-by: Paul Gortmaker commit 4b95b8ee9bb1051c0f81922f970ed1e0977ed9b9 Author: Sarah Sharp Date: Mon Jul 23 16:06:08 2012 -0700 xhci: Increase reset timeout for Renesas 720201 host. commit 22ceac191211cf6688b1bf6ecd93c8b6bf80ed9b upstream. The NEC/Renesas 720201 xHCI host controller does not complete its reset within 250 milliseconds. In fact, it takes about 9 seconds to reset the host controller, and 1 second for the host to be ready for doorbell rings. Extend the reset and CNR polling timeout to 10 seconds each. This patch should be backported to kernels as old as 2.6.31, that contain the commit 66d4eadd8d067269ea8fead1a50fe87c2979a80d "USB: xhci: BIOS handoff and HW initialization." Signed-off-by: Sarah Sharp Reported-by: Edwin Klein Mentink Signed-off-by: Paul Gortmaker commit 8bcc9b593131db101cf21e09c9711f239e3008ef Author: Matthew Garrett Date: Tue Aug 14 16:44:49 2012 -0400 xhci: Make handover code more robust commit e955a1cd086de4d165ae0f4c7be7289d84b63bdc upstream. My test platform (Intel DX79SI) boots reliably under BIOS, but frequently crashes when booting via UEFI. I finally tracked this down to the xhci handoff code. It seems that reads from the device occasionally just return 0xff, resulting in xhci_find_next_cap_offset generating a value that's larger than the resource region. We then oops when attempting to read the value. Sanity checking that value lets us avoid the crash. I've no idea what's causing the underlying problem, and xhci still doesn't actually *work* even with this, but the machine at least boots which will probably make further debugging easier. This should be backported to kernels as old as 2.6.31, that contain the commit 66d4eadd8d067269ea8fead1a50fe87c2979a80d "USB: xhci: BIOS handoff and HW initialization." Signed-off-by: Matthew Garrett Signed-off-by: Sarah Sharp Signed-off-by: Paul Gortmaker commit 4eb33c261aacccb531fb996fa023716d3fddc1bd Author: Johan Hovold Date: Thu Mar 15 14:48:40 2012 +0100 Bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close commit 33b69bf80a3704d45341928e4ff68b6ebd470686 upstream. Do not close protocol driver until device has been unregistered. This fixes a race between tty_close and hci_dev_open which can result in a NULL-pointer dereference. The line discipline closes the protocol driver while we may still have hci_dev_open sleeping on the req_lock mutex resulting in a NULL-pointer dereference when lock is acquired and hci_init_req called. Bug is 100% reproducible using hciattach and a disconnected serial port: 0. # hciattach -n ttyO1 any noflow 1. hci_dev_open called from hci_power_on grabs req lock 2. hci_init_req executes but device fails to initialise (times out eventually) 3. hci_dev_open is called from hci_sock_ioctl and sleeps on req lock 4. hci_uart_tty_close detaches protocol driver and cancels init req 5. hci_dev_open (1) releases req lock 6. hci_dev_open (3) grabs req lock, calls hci_init_req, which triggers oops when request is prepared in hci_uart_send_frame [ 137.201263] Unable to handle kernel NULL pointer dereference at virtual address 00000028 [ 137.209838] pgd = c0004000 [ 137.212677] [00000028] *pgd=00000000 [ 137.216430] Internal error: Oops: 17 [#1] [ 137.220642] Modules linked in: [ 137.223846] CPU: 0 Tainted: G W (3.3.0-rc6-dirty #406) [ 137.230529] PC is at __lock_acquire+0x5c/0x1ab0 [ 137.235290] LR is at lock_acquire+0x9c/0x128 [ 137.239776] pc : [] lr : [] psr: 20000093 [ 137.239776] sp : cf869dd8 ip : c0529554 fp : c051c730 [ 137.251800] r10: 00000000 r9 : cf8673c0 r8 : 00000080 [ 137.257293] r7 : 00000028 r6 : 00000002 r5 : 00000000 r4 : c053fd70 [ 137.264129] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : 00000001 [ 137.270965] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel [ 137.278717] Control: 10c5387d Table: 8f0f4019 DAC: 00000015 [ 137.284729] Process kworker/u:1 (pid: 7, stack limit = 0xcf8682e8) [ 137.291229] Stack: (0xcf869dd8 to 0xcf86a000) [ 137.295776] 9dc0: c0529554 00000000 [ 137.304351] 9de0: cf8673c0 cf868000 d03ea1ef cf868000 000001ef 00000470 00000000 00000002 [ 137.312927] 9e00: cf8673c0 00000001 c051c730 c00716ec 0000000c 00000440 c0529554 00000001 [ 137.321533] 9e20: c051c730 cf868000 d03ea1f3 00000000 c053b978 00000000 00000028 cf868000 [ 137.330078] 9e40: 00000000 00000000 00000002 00000000 00000000 c00733f8 00000002 00000080 [ 137.338684] 9e60: 00000000 c02a1d50 00000000 00000001 60000013 c0969a1c 60000093 c053b96c [ 137.347259] 9e80: 00000002 00000018 20000013 c02a1d50 cf0ac000 00000000 00000002 cf868000 [ 137.355834] 9ea0: 00000089 c0374130 00000002 00000000 c02a1d50 cf0ac000 0000000c cf0fc540 [ 137.364410] 9ec0: 00000018 c02a1d50 cf0fc540 00000000 cf0fc540 c0282238 c028220c cf178d80 [ 137.372985] 9ee0: 127525d8 c02821cc 9a1fa451 c032727c 9a1fa451 127525d8 cf0fc540 cf0ac4ec [ 137.381561] 9f00: cf0ac000 cf0fc540 cf0ac584 c03285f4 c0328580 cf0ac4ec cf85c740 c05510cc [ 137.390136] 9f20: ce825400 c004c914 00000002 00000000 c004c884 ce8254f5 cf869f48 00000000 [ 137.398712] 9f40: c0328580 ce825415 c0a7f914 c061af64 00000000 c048cf3c cf8673c0 cf85c740 [ 137.407287] 9f60: c05510cc c051a66c c05510ec c05510c4 cf85c750 cf868000 00000089 c004d6ac [ 137.415863] 9f80: 00000000 c0073d14 00000001 cf853ed8 cf85c740 c004d558 00000013 00000000 [ 137.424438] 9fa0: 00000000 00000000 00000000 c00516b0 00000000 00000000 cf85c740 00000000 [ 137.433013] 9fc0: 00000001 dead4ead ffffffff ffffffff c0551674 00000000 00000000 c0450aa4 [ 137.441589] 9fe0: cf869fe0 cf869fe0 cf853ed8 c005162c c0013b30 c0013b30 00ffff00 00ffff00 [ 137.450164] [] (__lock_acquire+0x5c/0x1ab0) from [] (lock_acquire+0x9c/0x128) [ 137.459503] [] (lock_acquire+0x9c/0x128) from [] (_raw_spin_lock_irqsave+0x44/0x58) [ 137.469360] [] (_raw_spin_lock_irqsave+0x44/0x58) from [] (skb_queue_tail+0x18/0x48) [ 137.479339] [] (skb_queue_tail+0x18/0x48) from [] (h4_enqueue+0x2c/0x34) [ 137.488189] [] (h4_enqueue+0x2c/0x34) from [] (hci_uart_send_frame+0x34/0x68) [ 137.497497] [] (hci_uart_send_frame+0x34/0x68) from [] (hci_send_frame+0x50/0x88) [ 137.507171] [] (hci_send_frame+0x50/0x88) from [] (hci_cmd_work+0x74/0xd4) [ 137.516204] [] (hci_cmd_work+0x74/0xd4) from [] (process_one_work+0x1a0/0x4ec) [ 137.525604] [] (process_one_work+0x1a0/0x4ec) from [] (worker_thread+0x154/0x344) [ 137.535278] [] (worker_thread+0x154/0x344) from [] (kthread+0x84/0x90) [ 137.543975] [] (kthread+0x84/0x90) from [] (kernel_thread_exit+0x0/0x8) [ 137.552734] Code: e59f4e5c e5941000 e3510000 0a000031 (e5971000) [ 137.559234] ---[ end trace 1b75b31a2719ed1e ]--- Signed-off-by: Johan Hovold Acked-by: Marcel Holtmann Signed-off-by: Johan Hedberg Signed-off-by: Paul Gortmaker commit 2f5e81c11b5d490009d2c51e929e4aaa317df121 Author: Jun Nie Date: Tue Dec 7 14:03:38 2010 +0800 Bluetooth: add NULL pointer check in HCI commit d9319560b86839506c2011346b1f2e61438a3c73 upstream. If we fail to find a hci device pointer in hci_uart, don't try to deref the NULL one we do have. Signed-off-by: Jun Nie Signed-off-by: Gustavo F. Padovan Signed-off-by: Paul Gortmaker commit 6dffe152f694412b91577145f7bbdf2dc5dd2be4 Author: Mathias Krause Date: Sun Apr 7 01:51:49 2013 +0000 Bluetooth: fix possible info leak in bt_sock_recvmsg() commit 4683f42fde3977bdb4e8a09622788cc8b5313778 upstream. In case the socket is already shutting down, bt_sock_recvmsg() returns with 0 without updating msg_namelen leading to net/socket.c leaking the local, uninitialized sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. Fix this by moving the msg_namelen assignment in front of the shutdown test. Cc: Marcel Holtmann Cc: Gustavo Padovan Cc: Johan Hedberg Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 7987696301f2b6623ead28bcc8fefa2202c7cf9b Author: Mathias Krause Date: Wed Aug 15 11:31:51 2012 +0000 Bluetooth: L2CAP - Fix info leak via getsockname() commit 792039c73cf176c8e39a6e8beef2c94ff46522ed upstream. The L2CAP code fails to initialize the l2_bdaddr_type member of struct sockaddr_l2 and the padding byte added for alignment. It that for leaks two bytes kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Cc: Marcel Holtmann Cc: Gustavo Padovan Cc: Johan Hedberg Signed-off-by: David S. Miller [PG: net/bluetooth/l2cap_sock.c --> net/bluetooth/l2cap.c in .34] Signed-off-by: Paul Gortmaker commit 5f9cd45f92653b4cc8493a47c61a36f5b7426cf1 Author: Mathias Krause Date: Sun Apr 7 01:51:50 2013 +0000 Bluetooth: RFCOMM - Fix missing msg_namelen update in rfcomm_sock_recvmsg() commit e11e0455c0d7d3d62276a0c55d9dfbc16779d691 upstream. If RFCOMM_DEFER_SETUP is set in the flags, rfcomm_sock_recvmsg() returns early with 0 without updating the possibly set msg_namelen member. This, in turn, leads to a 128 byte kernel stack leak in net/socket.c. Fix this by updating msg_namelen in this case. For all other cases it will be handled in bt_sock_stream_recvmsg(). Cc: Marcel Holtmann Cc: Gustavo Padovan Cc: Johan Hedberg Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 8ad4a5c58c7192de527ff2fd818fc362eb270013 Author: Mathias Krause Date: Wed Aug 15 11:31:50 2012 +0000 Bluetooth: RFCOMM - Fix info leak via getsockname() commit 9344a972961d1a6d2c04d9008b13617bcb6ec2ef upstream. The RFCOMM code fails to initialize the trailing padding byte of struct sockaddr_rc added for alignment. It that for leaks one byte kernel stack via the getsockname() syscall. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Cc: Marcel Holtmann Cc: Gustavo Padovan Cc: Johan Hedberg Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit d1b1f8bc674b942c4ed1ec324260cb81dfd60801 Author: Mathias Krause Date: Wed Aug 15 11:31:46 2012 +0000 Bluetooth: HCI - Fix info leak in getsockopt(HCI_FILTER) commit e15ca9a0ef9a86f0477530b0f44a725d67f889ee upstream. The HCI code fails to initialize the two padding bytes of struct hci_ufilter before copying it to userland -- that for leaking two bytes kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Cc: Marcel Holtmann Cc: Gustavo Padovan Cc: Johan Hedberg Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 0410beec773e5ea9c97486dda67620fbbb9e7189 Author: Anderson Lizardo Date: Sun Jan 6 18:28:53 2013 -0400 Bluetooth: Fix incorrect strncpy() in hidp_setup_hid() commit 0a9ab9bdb3e891762553f667066190c1d22ad62b upstream. The length parameter should be sizeof(req->name) - 1 because there is no guarantee that string provided by userspace will contain the trailing '\0'. Can be easily reproduced by manually setting req->name to 128 non-zero bytes prior to ioctl(HIDPCONNADD) and checking the device name setup on input subsystem: $ cat /sys/devices/pnp0/00\:04/tty/ttyS0/hci0/hci0\:1/input8/name AAAAAA[...]AAAAAAAAf0:af:f0:af:f0:af ("f0:af:f0:af:f0:af" is the device bluetooth address, taken from "phys" field in struct hid_device due to overflow.) Signed-off-by: Anderson Lizardo Acked-by: Marcel Holtmann Signed-off-by: Gustavo Padovan Signed-off-by: Paul Gortmaker commit 90162eab0fd28b731d67b0b83db3cf434583bcdf Author: Patrick McHardy Date: Thu Aug 30 07:01:30 2012 +0000 IPoIB: Fix use-after-free of multicast object commit bea1e22df494a729978e7f2c54f7bda328f74bc3 upstream. Fix a crash in ipoib_mcast_join_task(). (with help from Or Gerlitz) Commit c8c2afe360b7 ("IPoIB: Use rtnl lock/unlock when changing device flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which is run from the ipoib_workqueue, and hence the workqueue can't be flushed from the context of ipoib_stop(). In the current code, ipoib_stop() (which doesn't flush the workqueue) calls ipoib_mcast_dev_flush(), which goes and deletes all the multicast entries. This takes place without any synchronization with a possible running instance of ipoib_mcast_join_task() for the same ipoib device, leading to a crash due to NULL pointer dereference. Fix this by making sure that the workqueue is flushed before ipoib_mcast_dev_flush() is called. To make that possible, we move the RTNL-lock wrapped code to ipoib_mcast_join_finish(). Signed-off-by: Patrick McHardy Signed-off-by: Roland Dreier Signed-off-by: Paul Gortmaker commit 54f421b07dbda67ee57c5e5e90cba4dd65279242 Author: Jussi Kivilinna Date: Sun Oct 21 20:42:28 2012 +0300 crypto: cryptd - disable softirqs in cryptd_queue_worker to prevent data corruption commit 9efade1b3e981f5064f9db9ca971b4dc7557ae42 upstream. cryptd_queue_worker attempts to prevent simultaneous accesses to crypto workqueue by cryptd_enqueue_request using preempt_disable/preempt_enable. However cryptd_enqueue_request might be called from softirq context, so add local_bh_disable/local_bh_enable to prevent data corruption and panics. Bug report at http://marc.info/?l=linux-crypto-vger&m=134858649616319&w=2 v2: - Disable software interrupts instead of hardware interrupts Reported-by: Gurucharan Shetty Signed-off-by: Jussi Kivilinna Signed-off-by: Herbert Xu Signed-off-by: Paul Gortmaker commit 3e7f0a70958c7f5ed9bc684bc6601cf398c8c948 Author: David Howells Date: Tue Mar 12 16:44:31 2013 +1100 keys: fix race with concurrent install_user_keyrings() commit 0da9dfdd2cd9889201bc6f6f43580c99165cd087 upstream. This fixes CVE-2013-1792. There is a race in install_user_keyrings() that can cause a NULL pointer dereference when called concurrently for the same user if the uid and uid-session keyrings are not yet created. It might be possible for an unprivileged user to trigger this by calling keyctl() from userspace in parallel immediately after logging in. Assume that we have two threads both executing lookup_user_key(), both looking for KEY_SPEC_USER_SESSION_KEYRING. THREAD A THREAD B =============================== =============================== ==>call install_user_keyrings(); if (!cred->user->session_keyring) ==>call install_user_keyrings() ... user->uid_keyring = uid_keyring; if (user->uid_keyring) return 0; <== key = cred->user->session_keyring [== NULL] user->session_keyring = session_keyring; atomic_inc(&key->usage); [oops] At the point thread A dereferences cred->user->session_keyring, thread B hasn't updated user->session_keyring yet, but thread A assumes it is populated because install_user_keyrings() returned ok. The race window is really small but can be exploited if, for example, thread B is interrupted or preempted after initializing uid_keyring, but before doing setting session_keyring. This couldn't be reproduced on a stock kernel. However, after placing systemtap probe on 'user->session_keyring = session_keyring;' that introduced some delay, the kernel could be crashed reliably. Fix this by checking both pointers before deciding whether to return. Alternatively, the test could be done away with entirely as it is checked inside the mutex - but since the mutex is global, that may not be the best way. Signed-off-by: David Howells Reported-by: Mateusz Guzik Signed-off-by: Andrew Morton Signed-off-by: James Morris Signed-off-by: Paul Gortmaker commit 149646e70aa6a325b5e9656a4a70b82f1d7d4014 Author: Eddie Wai Date: Tue Aug 21 10:35:53 2012 -0700 bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload commit d6532207116307eb7ecbfa7b9e02c53230096a50 upstream. This patch fixes the following kernel panic invoked by uninitialized fields in the chip initialization for the 1G bnx2 iSCSI offload. One of the bits in the chip initialization is being used by the latest firmware to control overflow packets. When this control bit gets enabled erroneously, it would ultimately result in a bad packet placement which would cause the bnx2 driver to dereference a NULL ptr in the placement handler. This can happen under certain stress I/O environment under the Linux iSCSI offload operation. This change only affects Broadcom's 5709 chipset. Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP: [] :bnx2:bnx2_poll_work+0xd0d/0x13c5 Pid: 0, comm: swapper Tainted: G ---- 2.6.18-333.el5debug #2 RIP: 0010:[] [] :bnx2:bnx2_poll_work+0xd0d/0x13c5 RSP: 0018:ffff8101b575bd50 EFLAGS: 00010216 RAX: 0000000000000005 RBX: ffff81007c5fb180 RCX: 0000000000000000 RDX: 0000000000000ffc RSI: 00000000817e8000 RDI: 0000000000000220 RBP: ffff81015bbd7ec0 R08: ffff8100817e9000 R09: 0000000000000000 R10: ffff81007c5fb180 R11: 00000000000000c8 R12: 000000007a25a010 R13: 0000000000000000 R14: 0000000000000005 R15: ffff810159f80558 FS: 0000000000000000(0000) GS:ffff8101afebc240(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006a0 Process swapper (pid: 0, threadinfo ffff8101b5754000, task ffff8101afebd820) Stack: 000000000000000b ffff810159f80000 0000000000000040 ffff810159f80520 ffff810159f80500 00cf00cf8008e84b ffffc200100939e0 ffff810009035b20 0000502900000000 000000be00000001 ffff8100817e7810 00d08101b575bea8 Call Trace: [] show_schedstat+0x1c2/0x25b [] :bnx2:bnx2_poll+0xf6/0x231 [] net_rx_action+0xac/0x1b1 [] __do_softirq+0x89/0x133 [] call_softirq+0x1c/0x28 [] do_softirq+0x2c/0x7d [] do_IRQ+0xee/0xf7 [] ret_from_intr+0x0/0xa [] acpi_processor_idle_simple+0x1c5/0x341 [] acpi_processor_idle_simple+0x182/0x341 [] acpi_processor_idle_simple+0x0/0x341 [] cpu_idle+0x95/0xb8 [] start_secondary+0x479/0x488 Signed-off-by: Eddie Wai Reviewed-by: Mike Christie Signed-off-by: James Bottomley Signed-off-by: Paul Gortmaker commit a9c9bdba7ae42affa08d66116ccc5f6c443478b9 Author: James Bottomley Date: Thu Jul 7 15:45:40 2011 -0500 fix crash in scsi_dispatch_cmd() commit bfe159a51203c15d23cb3158fffdc25ec4b4dda1 upstream. USB surprise removal of sr is triggering an oops in scsi_dispatch_command(). What seems to be happening is that USB is hanging on to a queue reference until the last close of the upper device, so the crash is caused by surprise remove of a mounted CD followed by attempted unmount. The problem is that USB doesn't issue its final commands as part of the SCSI teardown path, but on last close when the block queue is long gone. The long term fix is probably to make sr do the teardown in the same way as sd (so remove all the lower bits on ejection, but keep the upper disk alive until last close of user space). However, the current oops can be simply fixed by not allowing any commands to be sent to a dead queue. Signed-off-by: James Bottomley Signed-off-by: Paul Gortmaker commit 55c9bcbaeb4179aa27c436623943cd03a4b2deb6 Author: Xiaotian Feng Date: Thu Dec 13 16:12:18 2012 +0800 fix Null pointer dereference on disk error commit 26cd4d65deba587f3cf2329b6869ce02bcbe68ec upstream. Following oops were observed when disk error happened: [ 4272.896937] sd 0:0:0:0: [sda] Unhandled error code [ 4272.896939] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 4272.896942] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 5a de a7 00 00 08 00 [ 4272.896951] end_request: I/O error, dev sda, sector 5955239 [ 4291.574947] BUG: unable to handle kernel NULL pointer dereference at (null) [ 4291.658305] IP: [] ahci_activity_show+0x1/0x40 [ 4291.730090] PGD 76dbbc067 PUD 6c4fba067 PMD 0 [ 4291.783408] Oops: 0000 [#1] SMP [ 4291.822100] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/sw_activity [ 4291.934235] CPU 9 [ 4291.958301] Pid: 27942, comm: hwinfo ...... ata_scsi_find_dev could return NULL, so ata_scsi_activity_{show,store} should check if atadev is NULL. Signed-off-by: Xiaotian Feng Cc: James Bottomley Signed-off-by: Jeff Garzik Signed-off-by: Paul Gortmaker commit 15e2803f7829f3077845bcfeee5564e87dd9a316 Author: Marcin Jurkowski Date: Sat Mar 2 14:50:15 2013 +0100 w1: fix oops when w1_search is called from netlink connector commit 9d1817cab2f030f6af360e961cc69bb1da8ad765 upstream. On Sat, Mar 02, 2013 at 10:45:10AM +0100, Sven Geggus wrote: > This is the bad commit I found doing git bisect: > 04f482faf50535229a5a5c8d629cf963899f857c is the first bad commit > commit 04f482faf50535229a5a5c8d629cf963899f857c > Author: Patrick McHardy > Date: Mon Mar 28 08:39:36 2011 +0000 Good job. I was too lazy to bisect for bad commit;) Reading the code I found problematic kthread_should_stop call from netlink connector which causes the oops. After applying a patch, I've been testing owfs+w1 setup for nearly two days and it seems to work very reliable (no hangs, no memleaks etc). More detailed description and possible fix is given below: Function w1_search can be called from either kthread or netlink callback. While the former works fine, the latter causes oops due to kthread_should_stop invocation. This patch adds a check if w1_search is serving netlink command, skipping kthread_should_stop invocation if so. Signed-off-by: Marcin Jurkowski Acked-by: Evgeniy Polyakov Cc: Josh Boyer Tested-by: Sven Geggus Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 435e609d29e2ce578202b21fc6ba9a7d026061a6 Author: Chen Gang Date: Thu May 16 14:04:25 2013 -0500 drivers/char/ipmi: memcpy, need additional 2 bytes to avoid memory overflow commit a5f2b3d6a738e7d4180012fe7b541172f8c8dcea upstream. When calling memcpy, read_data and write_data need additional 2 bytes. write_data: for checking: "if (size > IPMI_MAX_MSG_LENGTH)" for operating: "memcpy(bt->write_data + 3, data + 1, size - 1)" read_data: for checking: "if (msg_len < 3 || msg_len > IPMI_MAX_MSG_LENGTH)" for operating: "memcpy(data + 2, bt->read_data + 4, msg_len - 2)" Signed-off-by: Chen Gang Signed-off-by: Corey Minyard Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 197a5a154c73e0a8cdd5af21c25d0064db32af52 Author: Jiri Slaby Date: Sun Jun 5 22:51:49 2011 +0200 serial: 8250, increase PASS_LIMIT commit e7328ae1848966181a7ac47e8ae6cddbd2cf55f3 upstream. With virtual machines like qemu, it's pretty common to see "too much work for irq4" messages nowadays. This happens when a bunch of output is printed on the emulated serial console. This is caused by too low PASS_LIMIT. When ISR loops more than the limit, it spits the message. I've been using a kernel with doubled the limit and I couldn't see no problems. Maybe it's time to get rid of the message now? Signed-off-by: Jiri Slaby Cc: Alan Cox Signed-off-by: Greg Kroah-Hartman [PG: drivers/tty/serial/8250.c ---> drivers/serial/8250.c in 2.6.34] Signed-off-by: Paul Gortmaker commit b27a97e2d64df2dcb74c00ae0068b0dd93f03555 Author: Konrad Rzeszutek Wilk Date: Wed Jan 16 23:40:07 2013 +0100 ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled commit b88a634a903d9670aa5f2f785aa890628ce0dece upstream. If cpuidle is disabled, that means that: per_cpu(acpi_cpuidle_device, pr->id) is set to NULL as the acpi_processor_power_init ends up failing at retval = cpuidle_register_driver(&acpi_idle_driver) (in acpi_processor_power_init) and never sets the per_cpu idle device. So when acpi_processor_hotplug on CPU online notification tries to reference said device it crashes: cpu 3 spinlock event irq 62 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [] acpi_processor_setup_cpuidle_cx+0x3f/0x105 PGD a259b067 PUD ab38b067 PMD 0 Oops: 0002 [#1] SMP odules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c nouveau mxm_wmi wmi radeon ttm sg sr_mod sd_mod cdrom ata_generic ata_piix libata crc32c_intel scsi_mod atl1c i915 fbcon tileblit font bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf CPU 1 Pid: 3047, comm: bash Not tainted 3.8.0-rc3upstream-00250-g165c029 #1 MSI MS-7680/H61M-P23 (MS-7680) RIP: e030:[] [] acpi_processor_setup_cpuidle_cx+0x3f/0x105 RSP: e02b:ffff88001742dca8 EFLAGS: 00010202 RAX: 0000000000010be9 RBX: ffff8800a0a61800 RCX: ffff880105380000 RDX: 0000000000000003 RSI: 0000000000000200 RDI: ffff8800a0a61800 RBP: ffff88001742dce8 R08: ffffffff81812360 R09: 0000000000000200 R10: aaaaaaaaaaaaaaaa R11: 0000000000000001 R12: ffff8800a0a61800 R13: 00000000ffffff01 R14: 0000000000000000 R15: ffffffff81a907a0 FS: 00007fd6942f7700(0000) GS:ffff880105280000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000004 CR3: 00000000a6773000 CR4: 0000000000042660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process bash (pid: 3047, threadinfo ffff88001742c000, task ffff880017944000) Stack: 0000000000000150 ffff880100f59e00 ffff88001742dcd8 ffff8800a0a61800 0000000000000000 00000000ffffff01 0000000000000000 ffffffff81a907a0 ffff88001742dd18 ffffffff813815b1 ffff88001742dd08 ffffffff810ae336 Call Trace: [] acpi_processor_hotplug+0x7c/0x9f [] ? schedule_delayed_work_on+0x16/0x20 [] acpi_cpu_soft_notify+0x90/0xca [] notifier_call_chain+0x4d/0x70 [] __raw_notifier_call_chain+0x9/0x10 [] __cpu_notify+0x1b/0x30 [] _cpu_up+0x103/0x14b [] cpu_up+0xd9/0xec [] store_online+0x94/0xd0 [] dev_attr_store+0x1b/0x20 [] sysfs_write_file+0xf4/0x170 This patch fixes it. Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Rafael J. Wysocki Signed-off-by: Paul Gortmaker commit 1e593418edb16863504a6914dc4c771294c10768 Author: Andi Kleen Date: Fri Nov 19 13:16:22 2010 +0100 MCE: Fix vm86 handling for 32bit mce handler commit a129a7c84582629741e5fa6f40026efcd7a65bd4 upstream. When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. [ Backport to 3.0 notes: Things changed there slightly: - move mce_get_rip() up. It fills up m->cs and m->ip values which are evaluated in mce_severity(). Therefore move it up right before the mce_severity call. This seem to be another bug in 3.0? - Place the backport (fix m->cs in V86 case) to where m->cs gets filled which is mce_get_rip() in 3.0 ] Signed-off-by: Andi Kleen Signed-off-by: Tony Luck Signed-off-by: Thomas Renninger Reviewed-by: Tony Luck Signed-off-by: Greg Kroah-Hartman [PG: commit 8ef8fa7479fff9313387b873413f5ae233a2bd04 in v3.0.44] Signed-off-by: Paul Gortmaker commit ca4d293a73047922c85e03698396c2dd80634789 Author: Andy Honig Date: Wed Feb 20 14:49:16 2013 -0800 KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) commit a2c118bfab8bc6b8bb213abfc35201e441693d55 upstream. If the guest specifies a IOAPIC_REG_SELECT with an invalid value and follows that with a read of the IOAPIC_REG_WINDOW KVM does not properly validate that request. ioapic_read_indirect contains an ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in non-debug builds. In recent kernels this allows a guest to cause a kernel oops by reading invalid memory. In older kernels (pre-3.3) this allows a guest to read from large ranges of host memory. Tested: tested against apic unit tests. Signed-off-by: Andrew Honig Signed-off-by: Marcelo Tosatti Signed-off-by: Paul Gortmaker commit 1c46ee51d3f9aa8061c7db55b1b70d08db440862 Author: Andy Honig Date: Mon Mar 11 09:34:52 2013 -0700 KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) commit c300aa64ddf57d9c5d9c898a64b36877345dd4a9 upstream. If the guest sets the GPA of the time_page so that the request to update the time straddles a page then KVM will write onto an incorrect page. The write is done byusing kmap atomic to get a pointer to the page for the time structure and then performing a memcpy to that page starting at an offset that the guest controls. Well behaved guests always provide a 32-byte aligned address, however a malicious guest could use this to corrupt host kernel memory. Tested: Tested against kvmclock unit test. Signed-off-by: Andrew Honig Signed-off-by: Marcelo Tosatti Signed-off-by: Paul Gortmaker commit 15cb729f98d2ca74b5670a8c13f8262bbd0398e7 Author: Konrad Rzeszutek Wilk Date: Wed Oct 10 13:25:48 2012 -0400 xen/bootup: allow {read|write}_cr8 pvops call. commit 1a7bbda5b1ab0e02622761305a32dc38735b90b2 upstream. We actually do not do anything about it. Just return a default value of zero and if the kernel tries to write anything but 0 we BUG_ON. This fixes the case when an user tries to suspend the machine and it blows up in save_processor_state b/c 'read_cr8' is set to NULL and we get: kernel BUG at /home/konrad/ssd/linux/arch/x86/include/asm/paravirt.h:100! invalid opcode: 0000 [#1] SMP Pid: 2687, comm: init.late Tainted: G O 3.6.0upstream-00002-gac264ac-dirty #4 Bochs Bochs RIP: e030:[] [] save_processor_state+0x212/0x270 .. snip.. Call Trace: [] do_suspend_lowlevel+0xf/0xac [] ? x86_acpi_suspend_lowlevel+0x10c/0x150 [] acpi_suspend_enter+0x57/0xd5 Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Paul Gortmaker commit 7c43c08e17266918c10816708fbce8fe944f73ee Author: Konrad Rzeszutek Wilk Date: Wed Oct 10 13:30:47 2012 -0400 xen/bootup: allow read_tscp call for Xen PV guests. commit cd0608e71e9757f4dae35bcfb4e88f4d1a03a8ab upstream. The hypervisor will trap it. However without this patch, we would crash as the .read_tscp is set to NULL. This patch fixes it and sets it to the native_read_tscp call. Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Paul Gortmaker commit f46337bfb74202370f582d2d7f949eb41cc16ada Author: Samu Kallio Date: Sat Mar 23 09:36:35 2013 -0400 x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream. In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops when lazy MMU updates are enabled, because set_pgd effects are being deferred. One instance of this problem is during process mm cleanup with memory cgroups enabled. The chain of events is as follows: - zap_pte_range enables lazy MMU updates - zap_pte_range eventually calls mem_cgroup_charge_statistics, which accesses the vmalloc'd mem_cgroup per-cpu stat area - vmalloc_fault is triggered which tries to sync the corresponding PGD entry with set_pgd, but the update is deferred - vmalloc_fault oopses due to a mismatch in the PUD entries The OOPs usually looks as so: ------------[ cut here ]------------ kernel BUG at arch/x86/mm/fault.c:396! invalid opcode: 0000 [#1] SMP .. snip .. CPU 1 Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1 RIP: e030:[] [] vmalloc_fault+0x11f/0x208 .. snip .. Call Trace: [] do_page_fault+0x399/0x4b0 [] ? xen_mc_extend_args+0xec/0x110 [] page_fault+0x25/0x30 [] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50 [] __mem_cgroup_uncharge_common+0xd8/0x350 [] mem_cgroup_uncharge_page+0x57/0x60 [] page_remove_rmap+0xe0/0x150 [] ? vm_normal_page+0x1a/0x80 [] unmap_single_vma+0x531/0x870 [] unmap_vmas+0x52/0xa0 [] ? pte_mfn_to_pfn+0x72/0x100 [] exit_mmap+0x98/0x170 [] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [] mmput+0x83/0xf0 [] exit_mm+0x104/0x130 [] do_exit+0x15a/0x8c0 [] do_group_exit+0x3f/0xa0 [] sys_exit_group+0x17/0x20 [] system_call_fastpath+0x16/0x1b Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the changes visible to the consistency checks. RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737 Tested-by: Josh Boyer Reported-and-Tested-by: Krishna Raman Signed-off-by: Samu Kallio Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com Tested-by: Konrad Rzeszutek Wilk Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: H. Peter Anvin Signed-off-by: Paul Gortmaker commit 09ac8848d288b3484653a916adfe128c8634c6e8 Author: Mel Gorman Date: Mon Feb 11 14:52:36 2013 +0000 x86/mm: Check if PUD is large when validating a kernel address commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream. A user reported the following oops when a backup process reads /proc/kcore: BUG: unable to handle kernel paging request at ffffbb00ff33b000 IP: [] kern_addr_valid+0xbe/0x110 [...] Call Trace: [] read_kcore+0x17a/0x370 [] proc_reg_read+0x77/0xc0 [] vfs_read+0xc7/0x130 [] sys_read+0x53/0xa0 [] system_call_fastpath+0x16/0x1b Investigation determined that the bug triggered when reading system RAM at the 4G mark. On this system, that was the first address using 1G pages for the virt->phys direct mapping so the PUD is pointing to a physical address, not a PMD page. The problem is that the page table walker in kern_addr_valid() is not checking pud_large() and treats the physical address as if it was a PMD. If it happens to look like pmd_none then it'll silently fail, probably returning zeros instead of real data. If the data happens to look like a present PMD though, it will be walked resulting in the oops above. This patch adds the necessary pud_large() check. Unfortunately the problem was not readily reproducible and now they are running the backup program without accessing /proc/kcore so the patch has not been validated but I think it makes sense. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Reviewed-by: Michal Hocko Acked-by: Johannes Weiner Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar Signed-off-by: Paul Gortmaker commit 014875a2a96ad418d5ec9ef473fe8da83d0431a1 Author: Dan Carpenter Date: Sat Mar 24 10:52:50 2012 +0300 x86, tls: Off by one limit check commit 8f0750f19789cf352d7e24a6cc50f2ab1b4f1372 upstream. These are used as offsets into an array of GDT_ENTRY_TLS_ENTRIES members so GDT_ENTRY_TLS_ENTRIES is one past the end of the array. Signed-off-by: Dan Carpenter Link: http://lkml.kernel.org/r/20120324075250.GA28258@elgon.mountain Signed-off-by: H. Peter Anvin Signed-off-by: Paul Gortmaker commit 3dc43f219f391f0e5ae02605b53843d9aec9bdaa Author: Alan Cox Date: Thu Nov 15 13:06:22 2012 +0000 x86/msr: Add capabilities check commit c903f0456bc69176912dee6dd25c6a66ee1aed00 upstream. At the moment the MSR driver only relies upon file system checks. This means that anything as root with any capability set can write to MSRs. Historically that wasn't very interesting but on modern processors the MSRs are such that writing to them provides several ways to execute arbitary code in kernel space. Sample code and documentation on doing this is circulating and MSR attacks are used on Windows 64bit rootkits already. In the Linux case you still need to be able to open the device file so the impact is fairly limited and reduces the security of some capability and security model based systems down towards that of a generic "root owns the box" setup. Therefore they should require CAP_SYS_RAWIO to prevent an elevation of capabilities. The impact of this is fairly minimal on most setups because they don't have heavy use of capabilities. Those using SELinux, SMACK or AppArmor rules might want to consider if their rulesets on the MSR driver could be tighter. Signed-off-by: Alan Cox Cc: Linus Torvalds Cc: Andrew Morton Cc: Peter Zijlstra Signed-off-by: Ingo Molnar Signed-off-by: Paul Gortmaker commit 6156022ac5b6fc988d1829d0fcdda7044e36c806 Author: Jan Beulich Date: Thu Jan 24 13:11:10 2013 +0000 x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS. commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream. This fixes CVE-2013-0228 / XSA-42 Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user in 32bit PV guest can use to crash the > guest with the panic like this: ------------- general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/vbd-51712/block/xvda/dev Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1 EIP: 0061:[] EFLAGS: 00010086 CPU: 0 EIP is at xen_iret+0x12/0x2b EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010 ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069 Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000) Stack: 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000 Call Trace: Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02 EIP: [] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0 general protection fault: 0000 [#2] ---[ end trace ab0d29a492dcd330 ]--- Kernel panic - not syncing: Fatal exception Pid: 1250, comm: r Tainted: G D --------------- 2.6.32-356.el6.i686 #1 Call Trace: [] ? panic+0x6e/0x122 [] ? oops_end+0xbc/0xd0 [] ? do_general_protection+0x0/0x210 [] ? error_code+0x73/ ------------- Petr says: " I've analysed the bug and I think that xen_iret() cannot cope with mangled DS, in this case zeroed out (null selector/descriptor) by either xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT entry was invalidated by the reproducer. " Jan took a look at the preliminary patch and came up a fix that solves this problem: "This code gets called after all registers other than those handled by IRET got already restored, hence a null selector in %ds or a non-null one that got loaded from a code or read-only data descriptor would cause a kernel mode fault (with the potential of crashing the kernel as a whole, if panic_on_oops is set)." The way to fix this is to realize that the we can only relay on the registers that IRET restores. The two that are guaranteed are the %cs and %ss as they are always fixed GDT selectors. Also they are inaccessible from user mode - so they cannot be altered. This is the approach taken in this patch. Another alternative option suggested by Jan would be to relay on the subtle realization that using the %ebp or %esp relative references uses the %ss segment. In which case we could switch from using %eax to %ebp and would not need the %ss over-rides. That would also require one extra instruction to compensate for the one place where the register is used as scaled index. However Andrew pointed out that is too subtle and if further work was to be done in this code-path it could escape folks attention and lead to accidents. Reviewed-by: Petr Matousek Reported-by: Petr Matousek Reviewed-by: Andrew Cooper Signed-off-by: Jan Beulich Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Paul Gortmaker commit 1ed63eb7d7efed253099c851359e39be850165e5 Author: Takashi Iwai Date: Fri Mar 8 18:11:17 2013 +0100 ALSA: seq: Fix missing error handling in snd_seq_timer_open() commit 66efdc71d95887b652a742a5dae51fa834d71465 upstream. snd_seq_timer_open() didn't catch the whole error path but let through if the timer id is a slave. This may lead to Oops by accessing the uninitialized pointer. BUG: unable to handle kernel NULL pointer dereference at 00000000000002ae IP: [] snd_seq_timer_open+0xe7/0x130 PGD 785cd067 PUD 76964067 PMD 0 Oops: 0002 [#4] SMP CPU 0 Pid: 4288, comm: trinity-child7 Tainted: G D W 3.9.0-rc1+ #100 Bochs Bochs RIP: 0010:[] [] snd_seq_timer_open+0xe7/0x130 RSP: 0018:ffff88006ece7d38 EFLAGS: 00010246 RAX: 0000000000000286 RBX: ffff88007851b400 RCX: 0000000000000000 RDX: 000000000000ffff RSI: ffff88006ece7d58 RDI: ffff88006ece7d38 RBP: ffff88006ece7d98 R08: 000000000000000a R09: 000000000000fffe R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8800792c5400 R14: 0000000000e8f000 R15: 0000000000000007 FS: 00007f7aaa650700(0000) GS:ffff88007f800000(0000) GS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000002ae CR3: 000000006efec000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process trinity-child7 (pid: 4288, threadinfo ffff88006ece6000, task ffff880076a8a290) Stack: 0000000000000286 ffffffff828f2be0 ffff88006ece7d58 ffffffff810f354d 65636e6575716573 2065756575712072 ffff8800792c0030 0000000000000000 ffff88006ece7d98 ffff8800792c5400 ffff88007851b400 ffff8800792c5520 Call Trace: [] ? trace_hardirqs_on+0xd/0x10 [] snd_seq_queue_timer_open+0x29/0x70 [] snd_seq_ioctl_set_queue_timer+0xda/0x120 [] snd_seq_do_ioctl+0x9b/0xd0 [] snd_seq_ioctl+0x10/0x20 [] do_vfs_ioctl+0x522/0x570 [] ? file_has_perm+0x83/0xa0 [] ? trace_hardirqs_on+0xd/0x10 [] sys_ioctl+0x5d/0xa0 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] system_call_fastpath+0x16/0x1b Reported-and-tested-by: Tommi Rantala Signed-off-by: Takashi Iwai Signed-off-by: Paul Gortmaker commit c8d141b1aac4631511b9e886e6bf98b425444f9f Author: Mel Gorman Date: Mon Oct 8 16:29:17 2012 -0700 mempolicy: fix a race in shared_policy_replace() commit b22d127a39ddd10d93deee3d96e643657ad53a49 upstream. shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot be dereferenced if sp->lock is not held and 2) another thread can modify sp_node between spin_unlock for allocating a new sp node and next spin_lock. The bug was introduced before 2.6.12-rc2. Kosaki's original patch for this problem was to allocate an sp node and policy within shared_policy_replace and initialise it when the lock is reacquired. I was not keen on this approach because it partially duplicates sp_alloc(). As the paths were sp->lock is taken are not that performance critical this patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc(). [kosaki.motohiro@jp.fujitsu.com: Original patch] Signed-off-by: Mel Gorman Acked-by: KOSAKI Motohiro Reviewed-by: Christoph Lameter Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit acc12b1b2744d0c7d0f462f9057bbca8427498c4 Author: Andy Lutomirski Date: Thu Jul 5 16:00:11 2012 -0700 mm: Hold a file reference in madvise_remove commit 9ab4233dd08036fe34a89c7dc6f47a8bf2eb29eb upstream. Otherwise the code races with munmap (causing a use-after-free of the vma) or with close (causing a use-after-free of the struct file). The bug was introduced by commit 90ed52ebe481 ("[PATCH] holepunch: fix mmap_sem i_mutex deadlock") [bwh: Backported to 3.2: - Adjust context - madvise_remove() calls vmtruncate_range(), not do_fallocate()] [luto: Backported to 3.0: Adjust context] Cc: Hugh Dickins Cc: Miklos Szeredi Cc: Badari Pulavarty Cc: Nick Piggin Signed-off-by: Ben Hutchings Signed-off-by: Andy Lutomirski Signed-off-by: Greg Kroah-Hartman [PG: commit e12fcd38abe8a869cbabd77724008f1cf812a3e7 in v3.0.37] Signed-off-by: Paul Gortmaker commit 0e3446e8bb3ae078d66434c1cb236b4d1546bfe8 Author: Xiao Guangrong Date: Tue Jul 31 16:45:52 2012 -0700 mm: mmu_notifier: fix freed page still mapped in secondary MMU commit 3ad3d901bbcfb15a5e4690e55350db0899095a68 upstream. mmu_notifier_release() is called when the process is exiting. It will delete all the mmu notifiers. But at this time the page belonging to the process is still present in page tables and is present on the LRU list, so this race will happen: CPU 0 CPU 1 mmu_notifier_release: try_to_unmap: hlist_del_init_rcu(&mn->hlist); ptep_clear_flush_notify: mmu nofifler not found free page !!!!!! /* * At the point, the page has been * freed, but it is still mapped in * the secondary MMU. */ mn->ops->release(mn, mm); Then the box is not stable and sometimes we can get this bug: [ 738.075923] BUG: Bad page state in process migrate-perf pfn:03bec [ 738.075931] page:ffffea00000efb00 count:0 mapcount:0 mapping: (null) index:0x8076 [ 738.075936] page flags: 0x20000000000014(referenced|dirty) The same issue is present in mmu_notifier_unregister(). We can call ->release before deleting the notifier to ensure the page has been unmapped from the secondary MMU before it is freed. Signed-off-by: Xiao Guangrong Cc: Avi Kivity Cc: Marcelo Tosatti Cc: Paul Gortmaker Cc: Andrea Arcangeli Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit c5d2113433298bd0fb05e99df0d0185215311e7f Author: Hugh Dickins Date: Mon Oct 8 16:33:14 2012 -0700 mm: fix invalidate_complete_page2() lock ordering commit ec4d9f626d5908b6052c2973f37992f1db52e967 upstream. In fuzzing with trinity, lockdep protested "possible irq lock inversion dependency detected" when isolate_lru_page() reenabled interrupts while still holding the supposedly irq-safe tree_lock: invalidate_inode_pages2 invalidate_complete_page2 spin_lock_irq(&mapping->tree_lock) clear_page_mlock isolate_lru_page spin_unlock_irq(&zone->lru_lock) isolate_lru_page() is correct to enable interrupts unconditionally: invalidate_complete_page2() is incorrect to call clear_page_mlock() while holding tree_lock, which is supposed to nest inside lru_lock. Both truncate_complete_page() and invalidate_complete_page() call clear_page_mlock() before taking tree_lock to remove page from radix_tree. I guess invalidate_complete_page2() preferred to test PageDirty (again) under tree_lock before committing to the munlock; but since the page has already been unmapped, its state is already somewhat inconsistent, and no worse if clear_page_mlock() moved up. Reported-by: Sasha Levin Deciphered-by: Andrew Morton Signed-off-by: Hugh Dickins Acked-by: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michel Lespinasse Cc: Ying Han Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit f0db044d2f407c9f4f058d5528e66e52fe1be2af Author: Takamori Yamaguchi Date: Thu Nov 8 15:53:39 2012 -0800 mm: bugfix: set current->reclaim_state to NULL while returning from kswapd() commit b0a8cc58e6b9aaae3045752059e5e6260c0b94bc upstream. In kswapd(), set current->reclaim_state to NULL before returning, as current->reclaim_state holds reference to variable on kswapd()'s stack. In rare cases, while returning from kswapd() during memory offlining, __free_slab() and freepages() can access the dangling pointer of current->reclaim_state. Signed-off-by: Takamori Yamaguchi Signed-off-by: Aaditya Kumar Acked-by: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit e440541697f4343ee06384c693d1904005735120 Author: Christoffer Dall Date: Fri Dec 21 13:03:50 2012 -0500 mm: Fix PageHead when !CONFIG_PAGEFLAGS_EXTENDED commit ad4b3fb7ff9940bcdb1e4cd62bd189d10fa636ba upstream. Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and (PageHead) is true, for tail pages. If this is indeed the intended behavior, which I doubt because it breaks cache cleaning on some ARM systems, then the nomenclature is highly problematic. This patch makes sure PageHead is only true for head pages and PageTail is only true for tail pages, and neither is true for non-compound pages. [ This buglet seems ancient - seems to have been introduced back in Apr 2008 in commit 6a1e7f777f61: "pageflags: convert to the use of new macros". And the reason nobody noticed is because the PageHead() tests are almost all about just sanity-checking, and only used on pages that are actual page heads. The fact that the old code returned true for tail pages too was thus not really noticeable. - Linus ] Signed-off-by: Christoffer Dall Acked-by: Andrea Arcangeli Cc: Andrew Morton Cc: Will Deacon Cc: Steve Capper Cc: Christoph Lameter Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit f14c1bdff9f14b997b7cd3897cf4c0857a48a1c8 Author: Dave Hansen Date: Wed May 30 07:51:07 2012 -0700 mm: fix vma_resv_map() NULL pointer commit 4523e1458566a0e8ecfaff90f380dd23acc44d27 upstream. hugetlb_reserve_pages() can be used for either normal file-backed hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous mode, there is not a VMA around. The new call to resv_map_put() assumed that there was, and resulted in a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: vma_resv_map+0x9/0x30 PGD 141453067 PUD 1421e1067 PMD 0 Oops: 0000 [#1] PREEMPT SMP ... Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ #36 RIP: vma_resv_map+0x9/0x30 ... Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0) Call Trace: resv_map_put+0xe/0x40 hugetlb_reserve_pages+0xa6/0x1d0 hugetlb_file_setup+0x102/0x2c0 newseg+0x115/0x360 ipcget+0x1ce/0x310 sys_shmget+0x5a/0x60 system_call_fastpath+0x16/0x1b This was reported by Dave Jones, but was reproducible with the libhugetlbfs test cases, so shame on me for not running them in the first place. With this, the oops is gone, and the output of libhugetlbfs's run_tests.py is identical to plain 3.4 again. [ Marked for stable, since this was introduced by commit c50ac050811d ("hugetlb: fix resv_map leak in error path") which was also marked for stable ] Reported-by: Dave Jones Cc: Mel Gorman Cc: KOSAKI Motohiro Cc: Christoph Lameter Cc: Andrea Arcangeli Cc: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 840ad2cef3c0b891676ffdfbed0cda7f114f8ee4 Author: Dave Hansen Date: Tue May 29 15:06:46 2012 -0700 hugetlb: fix resv_map leak in error path commit c50ac050811d6485616a193eb0f37bfbd191cc89 upstream. When called for anonymous (non-shared) mappings, hugetlb_reserve_pages() does a resv_map_alloc(). It depends on code in hugetlbfs's vm_ops->close() to release that allocation. However, in the mmap() failure path, we do a plain unmap_region() without the remove_vma() which actually calls vm_ops->close(). This is a decent fix. This leak could get reintroduced if new code (say, after hugetlb_reserve_pages() in hugetlbfs_file_mmap()) decides to return an error. But, I think it would have to unroll the reservation anyway. Christoph's test case: http://marc.info/?l=linux-mm&m=133728900729735 This patch applies to 3.4 and later. A version for earlier kernels is at https://lkml.org/lkml/2012/5/22/418. Signed-off-by: Dave Hansen Acked-by: Mel Gorman Acked-by: KOSAKI Motohiro Reported-by: Christoph Lameter Tested-by: Christoph Lameter Cc: Andrea Arcangeli Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 89077494341242331c68f92918e1a9b115c82eb6 Author: Namhyung Kim Date: Mon Apr 1 21:46:23 2013 +0900 tracing: Fix double free when function profile init failed commit 83e03b3fe4daffdebbb42151d5410d730ae50bd1 upstream. On the failure path, stat->start and stat->pages will refer same page. So it'll attempt to free the same page again and get kernel panic. Link: http://lkml.kernel.org/r/1364820385-32027-1-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker Cc: Namhyung Kim Signed-off-by: Namhyung Kim Signed-off-by: Steven Rostedt Signed-off-by: Paul Gortmaker commit d9e695874f17fe8c75c60b8093596d16b67a486f Author: Wen Congyang Date: Thu Sep 20 14:04:47 2012 +0800 tracing: Don't call page_to_pfn() if page is NULL commit 85f2a2ef1d0ab99523e0b947a2b723f5650ed6aa upstream. When allocating memory fails, page is NULL. page_to_pfn() will cause the kernel panicked if we don't use sparsemem vmemmap. Link: http://lkml.kernel.org/r/505AB1FF.8020104@cn.fujitsu.com Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Andrew Morton Acked-by: Mel Gorman Reviewed-by: Minchan Kim Signed-off-by: Wen Congyang Signed-off-by: Steven Rostedt Signed-off-by: Paul Gortmaker commit 0338ecc627173b943e928da889d4f456e1f6b5f8 Author: Daniel J Blueman Date: Mon Jul 23 12:22:37 2012 +0800 Prevent interface errors with Seagate FreeAgent GoFlex commit c531077f40abc9f2129c4c83a30b3f8d6ce1c0e7 upstream. When using my Seagate FreeAgent GoFlex eSATAp external disk enclosure, interface errors are always seen until 1.5Gbps is negotiated [1]. This occurs using any disk in the enclosure, and when the disk is connected directly with a generic passive eSATAp cable, we see stable 3Gbps operation as expected. Blacklist 3Gbps mode to avoid dataloss and the ~30s delay bus reset and renegotiation incurs. Signed-off-by: Daniel J Blueman Signed-off-by: Jeff Garzik Signed-off-by: Paul Gortmaker commit 533753370fbcbcce151c0e81c833b32aadf252d7 Author: Li Zhong Date: Tue Jul 24 15:02:49 2012 -0700 Fix a dead loop in async_synchronize_full() [commit 45516ddc16abc923104d78bb3eb772ac0a09e33e in v3.0.44 - paulg ] [Fixed upstream by commits 2955b47d2c1983998a8c5915cb96884e67f7cb53 and a4683487f90bfe3049686fc5c566bdc1ad03ace6 from Dan Williams, but they are much more intrusive than this tiny fix, according to Andrew - gregkh] This patch tries to fix a dead loop in async_synchronize_full(), which could be seen when preemption is disabled on a single cpu machine. void async_synchronize_full(void) { do { async_synchronize_cookie(next_cookie); } while (!list_empty(&async_running) || ! list_empty(&async_pending)); } async_synchronize_cookie() calls async_synchronize_cookie_domain() with &async_running as the default domain to synchronize. However, there might be some works in the async_pending list from other domains. On a single cpu system, without preemption, there is no chance for the other works to finish, so async_synchronize_full() enters a dead loop. It seems async_synchronize_full() wants to synchronize all entries in all running lists(domains), so maybe we could just check the entry_count to know whether all works are finished. Currently, async_synchronize_cookie_domain() expects a non-NULL running list ( if NULL, there would be NULL pointer dereference ), so maybe a NULL pointer could be used as an indication for the functions to synchronize all works in all domains. Reported-by: Paul E. McKenney Signed-off-by: Li Zhong Tested-by: Paul E. McKenney Tested-by: Christian Kujau Cc: Andrew Morton Cc: Dan Williams Cc: Christian Kujau Cc: Andrew Morton Cc: Cong Wang Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 6a5af3b8e1b4f0ca35c39c08c2571ed00074c298 Author: Tejun Heo Date: Mon Nov 19 08:13:35 2012 -0800 cgroup: remove incorrect dget/dput() pair in cgroup_create_dir() commit 175431635ec09b1d1bba04979b006b99e8305a83 upstream. cgroup_create_dir() does weird dancing with dentry refcnt. On success, it gets and then puts it achieving nothing. On failure, it puts but there isn't no matching get anywhere leading to the following oops if cgroup_create_file() fails for whatever reason. ------------[ cut here ]------------ kernel BUG at /work/os/work/fs/dcache.c:552! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU 2 Pid: 697, comm: mkdir Not tainted 3.7.0-rc4-work+ #3 Bochs Bochs RIP: 0010:[] [] dput+0x1dc/0x1e0 RSP: 0018:ffff88001a3ebef8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88000e5b1ef8 RCX: 0000000000000403 RDX: 0000000000000303 RSI: 2000000000000000 RDI: ffff88000e5b1f58 RBP: ffff88001a3ebf18 R08: ffffffff82c76960 R09: 0000000000000001 R10: ffff880015022080 R11: ffd9bed70f48a041 R12: 00000000ffffffea R13: 0000000000000001 R14: ffff88000e5b1f58 R15: 00007fff57656d60 FS: 00007ff05fcb3800(0000) GS:ffff88001fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004046f0 CR3: 000000001315f000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process mkdir (pid: 697, threadinfo ffff88001a3ea000, task ffff880015022080) Stack: ffff88001a3ebf48 00000000ffffffea 0000000000000001 0000000000000000 ffff88001a3ebf38 ffffffff811cc889 0000000000000001 ffff88000e5b1ef8 ffff88001a3ebf68 ffffffff811d1fc9 ffff8800198d7f18 ffff880019106ef8 Call Trace: [] done_path_create+0x19/0x50 [] sys_mkdirat+0x59/0x80 [] sys_mkdir+0x19/0x20 [] system_call_fastpath+0x16/0x1b Code: 00 48 8d 90 18 01 00 00 48 89 93 c0 00 00 00 4c 89 a0 18 01 00 00 48 8b 83 a0 00 00 00 83 80 28 01 00 00 01 e8 e6 6f a0 00 eb 92 <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 RIP [] dput+0x1dc/0x1e0 RSP ---[ end trace 1277bcfd9561ddb0 ]--- Fix it by dropping the unnecessary dget/dput() pair. Signed-off-by: Tejun Heo Acked-by: Li Zefan Signed-off-by: Paul Gortmaker commit b5aaf6a1f4e4143c9b97658ab44a837e92b1ce14 Author: Bjorn Helgaas Date: Tue Jan 29 16:44:27 2013 -0700 Driver core: treat unregistered bus_types as having no devices commit 4fa3e78be7e985ca814ce2aa0c09cbee404efcf7 upstream. A bus_type has a list of devices (klist_devices), but the list and the subsys_private structure that contains it are not initialized until the bus_type is registered with bus_register(). The panic/reboot path has fixups that look up devices in pci_bus_type. If we panic before registering pci_bus_type, the bus_type exists but the list does not, so mach_reboot_fixups() trips over a null pointer and panics again: mach_reboot_fixups pci_get_device .. bus_find_device(&pci_bus_type, ...) bus->p is NULL Joonsoo reported a problem when panicking before PCI was initialized. I think this patch should be sufficient to replace the patch he posted here: https://lkml.org/lkml/2012/12/28/75 ("[PATCH] x86, reboot: skip reboot_fixups in early boot phase") Reported-by: Joonsoo Kim Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman Signed-off-by: Paul Gortmaker commit 788f0a7b94d2e2f372a6e1929baea4fad835f03a Author: T Makphaibulchoke Date: Thu Oct 4 17:16:55 2012 -0700 kernel/resource.c: fix stack overflow in __reserve_region_with_split() commit 4965f5667f36a95b41cda6638875bc992bd7d18b upstream. Using a recursive call add a non-conflicting region in __reserve_region_with_split() could result in a stack overflow in the case that the recursive calls are too deep. Convert the recursive calls to an iterative loop to avoid the problem. Tested on a machine containing 135 regions. The kernel no longer panicked with stack overflow. Also tested with code arbitrarily adding regions with no conflict, embedding two consecutive conflicts and embedding two non-consecutive conflicts. Signed-off-by: T Makphaibulchoke Reviewed-by: Ram Pai Cc: Paul Gortmaker Cc: Wei Yang Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit a99809112f1c7736e61e2688171da34e487a4430 Author: Shawn Guo Date: Thu Oct 4 17:12:23 2012 -0700 kernel/sys.c: call disable_nonboot_cpus() in kernel_restart() commit f96972f2dc6365421cf2366ebd61ee4cf060c8d5 upstream. As kernel_power_off() calls disable_nonboot_cpus(), we may also want to have kernel_restart() call disable_nonboot_cpus(). Doing so can help machines that require boot cpu be the last alive cpu during reboot to survive with kernel restart. This fixes one reboot issue seen on imx6q (Cortex-A9 Quad). The machine requires that the restart routine be run on the primary cpu rather than secondary ones. Otherwise, the secondary core running the restart routine will fail to come to online after reboot. Signed-off-by: Shawn Guo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 8d0ca59d21d373fa58d8819c1a5d07d2318fa95e Author: Denys Vlasenko Date: Wed Sep 26 11:34:50 2012 +1000 coredump: prevent double-free on an error path in core dumper commit f34f9d186df35e5c39163444c43b4fc6255e39c5 upstream. In !CORE_DUMP_USE_REGSET case, if elf_note_info_init fails to allocate memory for info->fields, it frees already allocated stuff and returns error to its caller, fill_note_info. Which in turn returns error to its caller, elf_core_dump. Which jumps to cleanup label and calls free_note_info, which will happily try to free all info->fields again. BOOM. This is the fix. Signed-off-by: Oleg Nesterov Signed-off-by: Denys Vlasenko Cc: Venu Byravarasu Signed-off-by: Andrew Morton Signed-off-by: Paul Gortmaker commit e9bf40599d67972c356eb7c024d36ba9093f363c Author: Oleg Nesterov Date: Mon Jan 21 20:48:17 2013 +0100 wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task commit 9067ac85d533651b98c2ff903182a20cbb361fcb upstream. wake_up_process() should never wakeup a TASK_STOPPED/TRACED task. Change it to use TASK_NORMAL and add the WARN_ON(). TASK_ALL has no other users, probably can be killed. Signed-off-by: Oleg Nesterov Signed-off-by: Linus Torvalds [PG: kernel/sched/core.c --> kernel/sched.c on 2.6.34] Signed-off-by: Paul Gortmaker commit 8f0d69d53f4c0fe576d7dfedeaa281e19145f363 Author: Emese Revfy Date: Wed Apr 17 15:58:36 2013 -0700 kernel/signal.c: stop info leak via the tkill and the tgkill syscalls commit b9e146d8eb3b9ecae5086d373b50fa0c1f3e7f0f upstream. This fixes a kernel memory contents leak via the tkill and tgkill syscalls for compat processes. This is visible in the siginfo_t->_sifields._rt.si_sigval.sival_ptr field when handling signals delivered from tkill. The place of the infoleak: int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from) { ... put_user_ex(ptr_to_compat(from->si_ptr), &to->si_ptr); ... } Signed-off-by: Emese Revfy Reviewed-by: PaX Team Signed-off-by: Kees Cook Cc: Al Viro Cc: Oleg Nesterov Cc: "Eric W. Biederman" Cc: Serge Hallyn Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 04cfd22bbd7e70c39680456d417f927ee9a70831 Author: Oleg Nesterov Date: Wed May 25 19:20:21 2011 +0200 ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread commit 0666fb51b1483f27506e212cc7f7b2645b5c7acc upstream. It is not clear why ptrace_resume() does wake_up_process(). Unless the caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do not need the extra and potentionally spurious wakeup. If the caller is PTRACE_KILL, wake_up_process() is even more wrong. The tracee can sleep in any state in any place, and if we have a buggy code which doesn't handle a spurious wakeup correctly PTRACE_KILL can be used to exploit it. For example: int main(void) { int child, status; child = fork(); if (!child) { int ret; assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); ret = pause(); printf("pause: %d %m\n", ret); return 0x23; } sleep(1); assert(ptrace(PTRACE_KILL, child, 0,0) == 0); assert(child == wait(&status)); printf("wait: %x\n", status); return 0; } prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the userland. In this case sys_pause() is buggy as well and should be fixed. I do not know what was the original rationality behind PTRACE_KILL. The man page is simply wrong and afaics it was always wrong. Imho it should be deprecated, or may be it should do send_sig(SIGKILL) as Denys suggests, but in any case I do not think that the current behaviour was intentional. Note: there is another problem, ptrace_resume() changes ->exit_code and this can race with SIGKILL too. Eventually we should change ptrace to not use ->exit_code. Signed-off-by: Oleg Nesterov Signed-off-by: Paul Gortmaker commit 12ae141c110748fc4ea08aeb7a320d06c42177ba Author: Kees Cook Date: Mon Dec 17 16:03:20 2012 -0800 exec: use -ELOOP for max recursion depth commit d740269867021faf4ce38a449353d2b986c34a67 upstream. To avoid an explosion of request_module calls on a chain of abusive scripts, fail maximum recursion with -ELOOP instead of -ENOEXEC. As soon as maximum recursion depth is hit, the error will fail all the way back up the chain, aborting immediately. This also has the side-effect of stopping the user's shell from attempting to reexecute the top-level file as a shell script. As seen in the dash source: if (cmd != path_bshell && errno == ENOEXEC) { *argv-- = cmd; *argv = cmd = path_bshell; goto repeat; } The above logic was designed for running scripts automatically that lacked the "#!" header, not to re-try failed recursion. On a legitimate -ENOEXEC, things continue to behave as the shell expects. Additionally, when tracking recursion, the binfmt handlers should not be involved. The recursion being tracked is the depth of calls through search_binary_handler(), so that function should be exclusively responsible for tracking the depth. Signed-off-by: Kees Cook Cc: halfdog Cc: P J P Cc: Alexander Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 341c80d7ac4063f774397de6dd671b6cf28c6ab2 Author: Kees Cook Date: Thu Dec 20 15:05:16 2012 -0800 exec: do not leave bprm->interp on stack commit b66c5984017533316fd1951770302649baf1aa33 upstream. If a series of scripts are executed, each triggering module loading via unprintable bytes in the script header, kernel stack contents can leak into the command line. Normally execution of binfmt_script and binfmt_misc happens recursively. However, when modules are enabled, and unprintable bytes exist in the bprm->buf, execution will restart after attempting to load matching binfmt modules. Unfortunately, the logic in binfmt_script and binfmt_misc does not expect to get restarted. They leave bprm->interp pointing to their local stack. This means on restart bprm->interp is left pointing into unused stack memory which can then be copied into the userspace argv areas. After additional study, it seems that both recursion and restart remains the desirable way to handle exec with scripts, misc, and modules. As such, we need to protect the changes to interp. This changes the logic to require allocation for any changes to the bprm->interp. To avoid adding a new kmalloc to every exec, the default value is left as-is. Only when passing through binfmt_script or binfmt_misc does an allocation take place. For a proof of concept, see DoTest.sh from: http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/ Signed-off-by: Kees Cook Cc: halfdog Cc: P J P Cc: Alexander Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit cbc6371912c8071719487089ef6c50aacf4b7542 Author: Kees Cook Date: Thu Oct 25 13:38:14 2012 -0700 gen_init_cpio: avoid stack overflow when expanding commit 20f1de659b77364d55d4e7fad2ef657e7730323f upstream. Fix possible overflow of the buffer used for expanding environment variables when building file list. In the extremely unlikely case of an attacker having control over the environment variables visible to gen_init_cpio, control over the contents of the file gen_init_cpio parses, and gen_init_cpio was built without compiler hardening, the attacker can gain arbitrary execution control via a stack buffer overflow. $ cat usr/crash.list file foo ${BIG}${BIG}${BIG}${BIG}${BIG}${BIG} 0755 0 0 $ BIG=$(perl -e 'print "A" x 4096;') ./usr/gen_init_cpio usr/crash.list *** buffer overflow detected ***: ./usr/gen_init_cpio terminated This also replaces the space-indenting with tabs. Patch based on existing fix extracted from grsecurity. Signed-off-by: Kees Cook Cc: Michal Marek Cc: Brad Spengler Cc: PaX Team Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 0590600d49c61ed83de323779f37a94f5b362d38 Author: Thomas Gleixner Date: Fri May 3 15:02:50 2013 +0200 tick: Cleanup NOHZ per cpu data on cpu down commit 4b0c0f294f60abcdd20994a8341a95c8ac5eeb96 upstream. Prarit reported a crash on CPU offline/online. The reason is that on CPU down the NOHZ related per cpu data of the dead cpu is not cleaned up. If at cpu online an interrupt happens before the per cpu tick device is registered the irq_enter() check potentially sees stale data and dereferences a NULL pointer. Cleanup the data after the cpu is dead. Reported-by: Prarit Bhargava Cc: Mike Galbraith Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305031451561.2886@ionos Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit 019a7f285ef811e1cc5e30045c99774ad6507526 Author: Tirupathi Reddy Date: Tue May 14 13:59:02 2013 +0530 timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE commit 42a5cf46cd56f46267d2a9fcf2655f4078cd3042 upstream. An inactive timer's base can refer to a offline cpu's base. In the current code, cpu_base's lock is blindly reinitialized each time a CPU is brought up. If a CPU is brought online during the period that another thread is trying to modify an inactive timer on that CPU with holding its timer base lock, then the lock will be reinitialized under its feet. This leads to following SPIN_BUG(). <0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466 <0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1 <4> [] (unwind_backtrace+0x0/0x11c) from [] (do_raw_spin_unlock+0x40/0xcc) <4> [] (do_raw_spin_unlock+0x40/0xcc) from [] (_raw_spin_unlock+0x8/0x30) <4> [] (_raw_spin_unlock+0x8/0x30) from [] (mod_timer+0x294/0x310) <4> [] (mod_timer+0x294/0x310) from [] (queue_delayed_work_on+0x104/0x120) <4> [] (queue_delayed_work_on+0x104/0x120) from [] (sdhci_msm_bus_voting+0x88/0x9c) <4> [] (sdhci_msm_bus_voting+0x88/0x9c) from [] (sdhci_disable+0x40/0x48) <4> [] (sdhci_disable+0x40/0x48) from [] (mmc_release_host+0x4c/0xb0) <4> [] (mmc_release_host+0x4c/0xb0) from [] (mmc_sd_detect+0x90/0xfc) <4> [] (mmc_sd_detect+0x90/0xfc) from [] (mmc_rescan+0x7c/0x2c4) <4> [] (mmc_rescan+0x7c/0x2c4) from [] (process_one_work+0x27c/0x484) <4> [] (process_one_work+0x27c/0x484) from [] (worker_thread+0x210/0x3b0) <4> [] (worker_thread+0x210/0x3b0) from [] (kthread+0x80/0x8c) <4> [] (kthread+0x80/0x8c) from [] (kernel_thread_exit+0x0/0x8) As an example, this particular crash occurred when CPU #3 is executing mod_timer() on an inactive timer whose base is refered to offlined CPU #2. The code locked the timer_base corresponding to CPU #2. Before it could proceed, CPU #2 came online and reinitialized the spinlock corresponding to its base. Thus now CPU #3 held a lock which was reinitialized. When CPU #3 finally ended up unlocking the old cpu_base corresponding to CPU #2, we hit the above SPIN_BUG(). CPU #0 CPU #3 CPU #2 ------ ------- ------- ..... ...... mod_timer() lock_timer_base spin_lock_irqsave(&base->lock) cpu_up(2) ..... ...... init_timers_cpu() .... ..... spin_lock_init(&base->lock) ..... spin_unlock_irqrestore(&base->lock) ...... Allocation of per_cpu timer vector bases is done only once under "tvec_base_done[]" check. In the current code, spinlock_initialization of base->lock isn't under this check. When a CPU is up each time the base lock is reinitialized. Move base spinlock initialization under the check. Signed-off-by: Tirupathi Reddy Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit 90a0e9ae88829f5fa1e5998efe266b530b858f50 Author: Stanislaw Gruszka Date: Fri Feb 15 11:08:11 2013 +0100 posix-cpu-timers: Fix nanosleep task_struct leak commit e6c42c295e071dd74a66b5a9fcf4f44049888ed8 upstream. The trinity fuzzer triggered a task_struct reference leak via clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls posic_cpu_timer_create(), but misses a corresponding posix_cpu_timer_del() which leads to the task_struct reference leak. Reported-and-tested-by: Tommi Rantala Signed-off-by: Stanislaw Gruszka Cc: Dave Jones Cc: John Stultz Cc: Oleg Nesterov Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit 003cad75f07b923d1cc75d032a931f461930d28c Author: Mark Rutland Date: Thu Mar 7 15:09:24 2013 +0000 clockevents: Don't allow dummy broadcast timers commit a7dc19b8652c862d5b7c4d2339bd3c428bd29c4a upstream. Currently tick_check_broadcast_device doesn't reject clock_event_devices with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real hardware if they have a higher rating value. In this situation, the dummy timer is responsible for broadcasting to itself, and the core clockevents code may attempt to call non-existent callbacks for programming the dummy, eventually leading to a panic. This patch makes tick_check_broadcast_device always reject dummy timers, preventing this problem. Signed-off-by: Mark Rutland Cc: linux-arm-kernel@lists.infradead.org Cc: Jon Medhurst (Tixy) Signed-off-by: Thomas Gleixner Signed-off-by: Paul Gortmaker commit 95bbc6312c0a3e1dc9bfb2dea65bfc186578b81f Author: Vyacheslav Dubeyko Date: Wed Apr 17 15:58:33 2013 -0700 hfsplus: fix potential overflow in hfsplus_file_truncate() commit 12f267a20aecf8b84a2a9069b9011f1661c779b4 upstream. Change a u32 to loff_t hfsplus_file_truncate(). Signed-off-by: Vyacheslav Dubeyko Cc: Christoph Hellwig Cc: Al Viro Cc: Hin-Tak Leung Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 529f9e4e99ad59a789803620d7a66a6ee7e9e2ab Author: Trond Myklebust Date: Mon Dec 20 21:19:26 2010 +0000 kernel panic when mount NFSv4 commit beb0f0a9fba1fa98b378329a9a5b0a73f25097ae upstream. On Tue, 2010-12-14 at 16:58 +0800, Mi Jinlong wrote: > Hi, > > When testing NFSv4 at RHEL6 with kernel 2.6.32, I got a kernel panic > at NFS client's __rpc_create_common function. > > The panic place is: > rpc_mkpipe > __rpc_lookup_create() <=== find pipefile *idmap* > __rpc_mkpipe() <=== pipefile is *idmap* > __rpc_create_common() > ****** BUG_ON(!d_unhashed(dentry)); ****** *panic* > > It means that the dentry's d_flags have be set DCACHE_UNHASHED, > but it should not be set here. > > Is someone known this bug? or give me some idea? > > A reproduce program is append, but it can't reproduce the bug every time. > the export is: "/nfsroot *(rw,no_root_squash,fsid=0,insecure)" > > And the panic message is append. > > ============================================================================ > #!/bin/sh > > LOOPTOTAL=768 > LOOPCOUNT=0 > ret=0 > > while [ $LOOPCOUNT -ne $LOOPTOTAL ] > do > ((LOOPCOUNT += 1)) > service nfs restart > /usr/sbin/rpc.idmapd > mount -t nfs4 127.0.0.1:/ /mnt|| return 1; > ls -l /var/lib/nfs/rpc_pipefs/nfs/*/ > umount /mnt > echo $LOOPCOUNT > done > > =============================================================================== > Code: af 60 01 00 00 89 fa 89 f0 e8 64 cf 89 f0 e8 5c 7c 64 cf 31 c0 8b 5c 24 10 8b > 74 24 14 8b 7c 24 18 8b 6c 24 1c 83 c4 20 c3 <0f> 0b eb fc 8b 46 28 c7 44 24 08 20 > de ee f0 c7 44 24 04 56 ea > EIP:[] __rpc_create_common+0x8a/0xc0 [sunrpc] SS:ESP 0068:eccb5d28 > ---[ end trace 8f5606cd08928ed2]--- > Kernel panic - not syncing: Fatal exception > Pid:7131, comm: mount.nfs4 Tainted: G D -------------------2.6.32 #1 > Call Trace: > [] ? panic+0x42/0xed > [] ? oops_end+0xbc/0xd0 > [] ? do_invalid_op+0x0/0x90 > [] ? do_invalid_op+0x7f/0x90 > [] ? __rpc_create_common+0x8a/0xc0[sunrpc] > [] ? rpc_free_task+0x33/0x70[sunrpc] > [] ? prc_call_sync+0x48/0x60[sunrpc] > [] ? rpc_ping+0x4e/0x60[sunrpc] > [] ? rpc_create+0x38f/0x4f0[sunrpc] > [] ? error_code+0x73/0x78 > [] ? __rpc_create_common+0x8a/0xc0[sunrpc] > [] ? d_lookup+0x2a/0x40 > [] ? rpc_mkpipe+0x111/0x1b0[sunrpc] > [] ? nfs_create_rpc_client+0xb4/0xf0[nfs] > [] ? nfs_fscache_get_client_cookie+0x1d/0x50[nfs] > [] ? nfs_idmap_new+0x7b/0x140[nfs] > [] ? strlcpy+0x3a/0x60 > [] ? nfs4_set_client+0xea/0x2b0[nfs] > [] ? nfs4_create_server+0xac/0x1b0[nfs] > [] ? krealloc+0x40/0x50 > [] ? nfs4_remote_get_sb+0x6b/0x250[nfs] > [] ? kstrdup+0x3c/0x60 > [] ? vfs_kern_mount+0x69/0x170 > [] ? nfs_do_root_mount+0x6c/0xa0[nfs] > [] ? nfs4_try_mount+0x37/0xa0[nfs] > [] ? nfs4_validate_text_mount_data+-x7d/0xf0[nfs] > [] ? nfs4_get_sb+0x92/0x2f0 > [] ? vfs_kern_mount+0x69/0x170 > [] ? get_fs_type+0x32/0xb0 > [] ? do_kern_mount+0x3f/0xe0 > [] ? do_mount+0x2ef/0x740 > [] ? copy_mount_options+0xb0/0x120 > [] ? sys_mount+0x6e/0xa0 Hi, Does the following patch fix the problem? Cheers Trond -------------------------- SUNRPC: Fix a BUG in __rpc_create_common From: Trond Myklebust Mi Jinlong reports: When testing NFSv4 at RHEL6 with kernel 2.6.32, I got a kernel panic at NFS client's __rpc_create_common function. The panic place is: rpc_mkpipe __rpc_lookup_create() <=== find pipefile *idmap* __rpc_mkpipe() <=== pipefile is *idmap* __rpc_create_common() ****** BUG_ON(!d_unhashed(dentry)); ****** *panic* The test is wrong: we can find ourselves with a hashed negative dentry here if the idmapper tried to look up the file before we got round to creating it. Just replace the BUG_ON() with a d_drop(dentry). Reported-by: Mi Jinlong Signed-off-by: Trond Myklebust Signed-off-by: Paul Gortmaker commit 609024d9e60eea802cf6affa04466c09287d670d Author: Jonathan Nieder Date: Fri May 11 04:20:20 2012 -0500 NFSv4: Revalidate uid/gid after open This is a shorter (and more appropriate for stable kernels) analog to the following upstream commit: commit 6926afd1925a54a13684ebe05987868890665e2b Author: Trond Myklebust Date: Sat Jan 7 13:22:46 2012 -0500 NFSv4: Save the owner/group name string when doing open ...so that we can do the uid/gid mapping outside the asynchronous RPC context. This fixes a bug in the current NFSv4 atomic open code where the client isn't able to determine what the true uid/gid fields of the file are, (because the asynchronous nature of the OPEN call denies it the ability to do an upcall) and so fills them with default values, marking the inode as needing revalidation. Unfortunately, in some cases, the VFS will do some additional sanity checks on the file, and may override the server's decision to allow the open because it sees the wrong owner/group fields. Signed-off-by: Trond Myklebust Without this patch, logging into two different machines with home directories mounted over NFS4 and then running "vim" and typing ":q" in each reliably produces the following error on the second machine: E137: Viminfo file is not writable: /users/system/rtheys/.viminfo This regression was introduced by 80e52aced138 ("NFSv4: Don't do idmapper upcalls for asynchronous RPC calls", merged during the 2.6.32 cycle) --- after the OPEN call, .viminfo has the default values for st_uid and st_gid (0xfffffffe) cached because we do not want to let rpciod wait for an idmapper upcall to fill them in. The fix used in mainline is to save the owner and group as strings and perform the upcall in _nfs4_proc_open outside the rpciod context, which takes about 600 lines. For stable, we can do something similar with a one-liner: make open check for the stale fields and make a (synchronous) GETATTR call to fill them when needed. Trond dictated the patch, I typed it in, and Rik tested it. Addresses http://bugs.debian.org/659111 and https://bugzilla.redhat.com/789298 Reported-by: Rik Theys Explained-by: David Flyn Signed-off-by: Jonathan Nieder Tested-by: Rik Theys Signed-off-by: Greg Kroah-Hartman [PG: commit 19165bdbb3622cfca0ff66e8b30248d469b849d6 in v3.0.32] Signed-off-by: Paul Gortmaker commit 57dc95eebafd7e0eebbac5f177ac3e60008342ae Author: Trond Myklebust Date: Mon Aug 20 12:42:15 2012 -0400 NFSv3: Ensure that do_proc_get_root() reports errors correctly commit 086600430493e04b802bee6e5b3ce0458e4eb77f upstream. If the rpc call to NFS3PROC_FSINFO fails, then we need to report that error so that the mount fails. Otherwise we can end up with a superblock with completely unusable values for block sizes, maxfilesize, etc. Reported-by: Yuanming Chen Signed-off-by: Trond Myklebust Signed-off-by: Paul Gortmaker commit dd17b5298be2e182556ef25ce9108343089e77e5 Author: J. Bruce Fields Date: Tue Dec 4 18:25:10 2012 -0500 nfsd4: fix oops on unusual readlike compound commit d5f50b0c290431c65377c4afa1c764e2c3fe5305 upstream. If the argument and reply together exceed the maximum payload size, then a reply with a read-like operation can overlow the rq_pages array. Signed-off-by: J. Bruce Fields Signed-off-by: Paul Gortmaker commit cbdfb1a8b4d54cfe3ee17d0dba60f41ac87ce068 Author: Nithin Nayak Sujir Date: Mon Jan 14 17:10:59 2013 +0000 tg3: Avoid null pointer dereference in tg3_interrupt in netconsole mode commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 upstream. When netconsole is enabled, logging messages generated during tg3_open can result in a null pointer dereference for the uninitialized tg3 status block. Use the irq_sync flag to disable polling in the early stages. irq_sync is cleared when the driver is enabling interrupts after all initialization is completed. Signed-off-by: Nithin Nayak Sujir Signed-off-by: Michael Chan Signed-off-by: David S. Miller [PG: drivers/net/ethernet/broadcom/tg3.c --> drivers/net/tg3.c in .34] Signed-off-by: Paul Gortmaker commit bd39784b505cd491d5ec8641f57fe35aeec32b4b Author: Larry Finger Date: Wed Sep 26 12:32:02 2012 -0500 b43legacy: Fix crash on unload when firmware not available commit 2d838bb608e2d1f6cb4280e76748cb812dc822e7 upstream. When b43legacy is loaded without the firmware being available, a following unload generates a kernel NULL pointer dereference BUG as follows: [ 214.330789] BUG: unable to handle kernel NULL pointer dereference at 0000004c [ 214.330997] IP: [] drain_workqueue+0x15/0x170 [ 214.331179] *pde = 00000000 [ 214.331311] Oops: 0000 [#1] SMP [ 214.331471] Modules linked in: b43legacy(-) ssb pcmcia mac80211 cfg80211 af_packet mperf arc4 ppdev sr_mod cdrom sg shpchp yenta_socket pcmcia_rsrc pci_hotplug pcmcia_core battery parport_pc parport floppy container ac button edd autofs4 ohci_hcd ehci_hcd usbcore usb_common thermal processor scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh fan thermal_sys hwmon ata_generic pata_ali libata [last unloaded: cfg80211] [ 214.333421] Pid: 3639, comm: modprobe Not tainted 3.6.0-rc6-wl+ #163 Source Technology VIC 9921/ALI Based Notebook [ 214.333580] EIP: 0060:[] EFLAGS: 00010246 CPU: 0 [ 214.333687] EIP is at drain_workqueue+0x15/0x170 [ 214.333788] EAX: c162ac40 EBX: cdfb8360 ECX: 0000002a EDX: 00002a2a [ 214.333890] ESI: 00000000 EDI: 00000000 EBP: cd767e7c ESP: cd767e5c [ 214.333957] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 214.333957] CR0: 8005003b CR2: 0000004c CR3: 0c96a000 CR4: 00000090 [ 214.333957] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 214.333957] DR6: ffff0ff0 DR7: 00000400 [ 214.333957] Process modprobe (pid: 3639, ti=cd766000 task=cf802e90 task.ti=cd766000) [ 214.333957] Stack: [ 214.333957] 00000292 cd767e74 c12c5e09 00000296 00000296 cdfb8360 cdfb9220 00000000 [ 214.333957] cd767e90 c104c4fd cdfb8360 cdfb9220 cd682800 cd767ea4 d0c10184 cd682800 [ 214.333957] cd767ea4 cba31064 cd767eb8 d0867908 cba31064 d087e09c cd96f034 cd767ec4 [ 214.333957] Call Trace: [ 214.333957] [] ? skb_dequeue+0x49/0x60 [ 214.333957] [] destroy_workqueue+0xd/0x150 [ 214.333957] [] ieee80211_unregister_hw+0xc4/0x100 [mac80211] [ 214.333957] [] b43legacy_remove+0x78/0x80 [b43legacy] [ 214.333957] [] ssb_device_remove+0x1d/0x30 [ssb] [ 214.333957] [] __device_release_driver+0x5a/0xb0 [ 214.333957] [] driver_detach+0x87/0x90 [ 214.333957] [] bus_remove_driver+0x6c/0xe0 [ 214.333957] [] driver_unregister+0x40/0x70 [ 214.333957] [] ssb_driver_unregister+0xb/0x10 [ssb] [ 214.333957] [] b43legacy_exit+0xd/0xf [b43legacy] [ 214.333957] [] sys_delete_module+0x14e/0x2b0 [ 214.333957] [] ? vfs_write+0xf7/0x150 [ 214.333957] [] ? tty_write_lock+0x50/0x50 [ 214.333957] [] ? sys_write+0x38/0x70 [ 214.333957] [] syscall_call+0x7/0xb [ 214.333957] Code: bc 27 00 00 00 00 a1 74 61 56 c1 55 89 e5 e8 a3 fc ff ff 5d c3 90 55 89 e5 57 56 89 c6 53 b8 40 ac 62 c1 83 ec 14 e8 bb b7 34 00 <8b> 46 4c 8d 50 01 85 c0 89 56 4c 75 03 83 0e 40 80 05 40 ac 62 [ 214.333957] EIP: [] drain_workqueue+0x15/0x170 SS:ESP 0068:cd767e5c [ 214.333957] CR2: 000000000000004c [ 214.341110] ---[ end trace c7e90ec026d875a6 ]---Index: wireless-testing/drivers/net/wireless/b43legacy/main.c The problem is fixed by making certain that the ucode pointer is not NULL before deregistering the driver in mac80211. Signed-off-by: Larry Finger Signed-off-by: John W. Linville Signed-off-by: Paul Gortmaker commit 0eac40376e83251cc200c9a8091609176b9f6540 Author: Mathias Krause Date: Fri Sep 14 09:58:32 2012 +0000 xfrm_user: return error pointer instead of NULL #2 commit c25463722509fef0ed630b271576a8c9a70236f3 upstream. When dump_one_policy() returns an error, e.g. because of a too small buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns NULL instead of an error pointer. But its caller expects an error pointer and therefore continues to operate on a NULL skbuff. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit e19ec85a0593236c14563f7fdb0560c1cf1e9eef Author: Mathias Krause Date: Thu Sep 13 11:41:26 2012 +0000 xfrm_user: return error pointer instead of NULL commit 864745d291b5ba80ea0bd0edcbe67273de368836 upstream. When dump_one_state() returns an error, e.g. because of a too small buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL instead of an error pointer. But its callers expect an error pointer and therefore continue to operate on a NULL skbuff. This could lead to a privilege escalation (execution of user code in kernel context) if the attacker has CAP_NET_ADMIN and is able to map address 0. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f19e36c418a5e2845b60f96c979f10e46e19418a Author: Mathias Krause Date: Wed Sep 19 11:33:41 2012 +0000 xfrm_user: fix info leak in copy_to_user_tmpl() commit 1f86840f897717f86d523a13e99a447e6a5d2fa5 upstream. The memory used for the template copy is a local stack variable. As struct xfrm_user_tmpl contains multiple holes added by the compiler for alignment, not initializing the memory will lead to leaking stack bytes to userland. Add an explicit memset(0) to avoid the info leak. Initial version of the patch by Brad Spengler. Cc: Brad Spengler Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f667e4c4889827f10f518c3b99f509a7ff602b66 Author: Mathias Krause Date: Wed Sep 19 11:33:40 2012 +0000 xfrm_user: fix info leak in copy_to_user_policy() commit 7b789836f434c87168eab067cfbed1ec4783dffd upstream. The memory reserved to dump the xfrm policy includes multiple padding bytes added by the compiler for alignment (padding bytes in struct xfrm_selector and struct xfrm_userpolicy_info). Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 59bed844deb2e6fc82fbf8bf98531b86192474e2 Author: Mathias Krause Date: Wed Sep 19 11:33:39 2012 +0000 xfrm_user: fix info leak in copy_to_user_state() commit f778a636713a435d3a922c60b1622a91136560c1 upstream. The memory reserved to dump the xfrm state includes the padding bytes of struct xfrm_usersa_info added by the compiler for alignment (7 for amd64, 3 for i386). Add an explicit memset(0) before filling the buffer to avoid the info leak. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 19c389a4254b2e0194a5430c8889acc0ef479386 Author: Eric Dumazet Date: Mon Jun 4 00:18:19 2012 +0000 drop_monitor: dont sleep in atomic context commit bec4596b4e6770c7037f21f6bd27567b152dc0d6 upstream. drop_monitor calls several sleeping functions while in atomic context. BUG: sleeping function called from invalid context at mm/slub.c:943 in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2 Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55 Call Trace: [] __might_sleep+0xca/0xf0 [] kmem_cache_alloc_node+0x1b3/0x1c0 [] ? queue_delayed_work_on+0x11c/0x130 [] __alloc_skb+0x4b/0x230 [] ? reset_per_cpu_data+0x160/0x160 [drop_monitor] [] reset_per_cpu_data+0x2f/0x160 [drop_monitor] [] send_dm_alert+0x4b/0xb0 [drop_monitor] [] process_one_work+0x130/0x4c0 [] worker_thread+0x159/0x360 [] ? manage_workers.isra.27+0x240/0x240 [] kthread+0x93/0xa0 [] kernel_thread_helper+0x4/0x10 [] ? kthread_freezable_should_stop+0x80/0x80 [] ? gs_change+0xb/0xb Rework the logic to call the sleeping functions in right context. Use standard timer/workqueue api to let system chose any cpu to perform the allocation and netlink send. Also avoid a loop if reset_per_cpu_data() cannot allocate memory : use mod_timer() to wait 1/10 second before next try. Signed-off-by: Eric Dumazet Cc: Neil Horman Reviewed-by: Neil Horman Signed-off-by: David S. Miller [PG: diffstat here is less by one line due to blank line removal] Signed-off-by: Paul Gortmaker commit 01913e6800f49e6d6204761edaf473fa8db1ebca Author: Neil Horman Date: Tue May 1 08:18:02 2012 +0000 drop_monitor: prevent init path from scheduling on the wrong cpu commit 4fdcfa12843bca38d0c9deff70c8720e4e8f515f upstream. I just noticed after some recent updates, that the init path for the drop monitor protocol has a minor error. drop monitor maintains a per cpu structure, that gets initalized from a single cpu. Normally this is fine, as the protocol isn't in use yet, but I recently made a change that causes a failed skb allocation to reschedule itself . Given the current code, the implication is that this workqueue reschedule will take place on the wrong cpu. If drop monitor is used early during the boot process, its possible that two cpus will access a single per-cpu structure in parallel, possibly leading to data corruption. This patch fixes the situation, by storing the cpu number that a given instance of this per-cpu data should be accessed from. In the case of a need for a reschedule, the cpu stored in the struct is assigned the rescheule, rather than the currently executing cpu Tested successfully by myself. Signed-off-by: Neil Horman CC: David Miller Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit ae41105eb938740268be8a6fc355aa74090780b5 Author: Neil Horman Date: Fri Apr 27 10:11:49 2012 +0000 drop_monitor: Make updating data->skb smp safe commit 3885ca785a3618593226687ced84f3f336dc3860 upstream. Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in its smp protections. Specifically, its possible to replace data->skb while its being written. This patch corrects that by making data->skb an rcu protected variable. That will prevent it from being overwritten while a tracepoint is modifying it. Signed-off-by: Neil Horman Reported-by: Eric Dumazet CC: David Miller Acked-by: Eric Dumazet Signed-off-by: David S. Miller [PG: The "__rcu" sparse addition is only present in 2.6.37+ kernels.] Signed-off-by: Paul Gortmaker commit b350f3ad5b483c943c3f0fa5ac62e530ea3dbc01 Author: Neil Horman Date: Fri Apr 27 10:11:48 2012 +0000 drop_monitor: fix sleeping in invalid context warning commit cde2e9a651b76d8db36ae94cd0febc82b637e5dd upstream. Eric Dumazet pointed out this warning in the drop_monitor protocol to me: [ 38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85 [ 38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch [ 38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71 [ 38.352582] Call Trace: [ 38.352592] [] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352599] [] __might_sleep+0xca/0xf0 [ 38.352606] [] mutex_lock+0x26/0x50 [ 38.352610] [] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352616] [] tracepoint_probe_register+0x29/0x90 [ 38.352621] [] set_all_monitor_traces+0x105/0x170 [ 38.352625] [] net_dm_cmd_trace+0x2a/0x40 [ 38.352630] [] genl_rcv_msg+0x21a/0x2b0 [ 38.352636] [] ? zone_statistics+0x99/0xc0 [ 38.352640] [] ? genl_rcv+0x30/0x30 [ 38.352645] [] netlink_rcv_skb+0xa9/0xd0 [ 38.352649] [] genl_rcv+0x20/0x30 [ 38.352653] [] netlink_unicast+0x1ae/0x1f0 [ 38.352658] [] netlink_sendmsg+0x2b6/0x310 [ 38.352663] [] sock_sendmsg+0x10f/0x130 [ 38.352668] [] ? move_addr_to_kernel+0x60/0xb0 [ 38.352673] [] ? verify_iovec+0x64/0xe0 [ 38.352677] [] __sys_sendmsg+0x386/0x390 [ 38.352682] [] ? handle_mm_fault+0x139/0x210 [ 38.352687] [] ? do_page_fault+0x1ec/0x4f0 [ 38.352693] [] ? set_next_entity+0x9d/0xb0 [ 38.352699] [] ? tty_ldisc_deref+0x9/0x10 [ 38.352703] [] ? pick_next_task_fair+0x63/0x140 [ 38.352708] [] sys_sendmsg+0x44/0x80 [ 38.352713] [] system_call_fastpath+0x16/0x1b It stems from holding a spinlock (trace_state_lock) while attempting to register or unregister tracepoint hooks, making in_atomic() true in this context, leading to the warning when the tracepoint calls might_sleep() while its taking a mutex. Since we only use the trace_state_lock to prevent trace protocol state races, as well as hardware stat list updates on an rcu write side, we can just convert the spinlock to a mutex to avoid this problem. Signed-off-by: Neil Horman Reported-by: Eric Dumazet CC: David Miller Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 403634aa9e309659cc82bb6917fd6feb718f0727 Author: Willy Tarreau Date: Thu May 17 11:14:14 2012 +0000 tcp: do_tcp_sendpages() must try to push data out on oom conditions commit bad115cfe5b509043b684d3a007ab54b80090aa1 upstream. Since recent changes on TCP splicing (starting with commits 2f533844 "tcp: allow splice() to build full TSO packets" and 35f9c09f "tcp: tcp_sendpages() should call tcp_push() once"), I started seeing massive stalls when forwarding traffic between two sockets using splice() when pipe buffers were larger than socket buffers. Latest changes (net: netdev_alloc_skb() use build_skb()) made the problem even more apparent. The reason seems to be that if do_tcp_sendpages() fails on out of memory condition without being able to send at least one byte, tcp_push() is not called and the buffers cannot be flushed. After applying the attached patch, I cannot reproduce the stalls at all and the data rate it perfectly stable and steady under any condition which previously caused the problem to be permanent. The issue seems to have been there since before the kernel migrated to git, which makes me think that the stalls I occasionally experienced with tux during stress-tests years ago were probably related to the same issue. This issue was first encountered on 3.0.31 and 3.2.17, so please backport to -stable. Signed-off-by: Willy Tarreau Acked-by: Eric Dumazet Signed-off-by: Paul Gortmaker commit cc5e45629c6783c2b77d881f61ff52a68b8f0c8e Author: Eric Dumazet Date: Fri Dec 2 23:41:42 2011 +0000 tcp: drop SYN+FIN messages commit fdf5af0daf8019cec2396cdef8fb042d80fe71fa upstream. Denys Fedoryshchenko reported that SYN+FIN attacks were bringing his linux machines to their limits. Dont call conn_request() if the TCP flags includes SYN flag Reported-by: Denys Fedoryshchenko Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 450958657a8d2b4a50f7e72a565f014bfa464ca9 Author: Jiri Kosina Date: Fri Jul 27 10:38:50 2012 +0000 tcp: perform DMA to userspace only if there is a task waiting for it commit 59ea33a68a9083ac98515e4861c00e71efdc49a1 upstream. Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT") added support for receive offloading to IOAT dma engine if available. The code in tcp_rcv_established() tries to perform early DMA copy if applicable. It however does so without checking whether the userspace task is actually expecting the data in the buffer. This is not a problem under normal circumstances, but there is a corner case where this doesn't work -- and that's when MSG_TRUNC flag to recvmsg() is used. If the IOAT dma engine is not used, the code properly checks whether there is a valid ucopy.task and the socket is owned by userspace, but misses the check in the dmaengine case. This problem can be observed in real trivially -- for example 'tbench' is a good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they have been already early-copied in tcp_rcv_established() using dma engine. This patch introduces the same check we are performing in the simple iovec copy case to the IOAT case as well. It fixes the indefinite recvmsg(MSG_TRUNC) hangs. Signed-off-by: Jiri Kosina Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f3e0756522c9f15dd6cba70feb9aa760cb05a028 Author: David S. Miller Date: Mon Jul 30 14:52:48 2012 -0700 tun: Fix formatting. commit 8bbb181308bc348e02bfdbebdedd4e4ec9d452ce upstream. Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit c1968274b39f8e5696baebcba04076678449628d Author: Mathias Krause Date: Sun Jul 29 19:45:14 2012 +0000 net/tun: fix ioctl() based info leaks commit a117dacde0288f3ec60b6e5bcedae8fa37ee0dfc upstream. The tun module leaks up to 36 bytes of memory by not fully initializing a structure located on the stack that gets copied to user memory by the TUNGETIFF and SIOCGIFHWADDR ioctl()s. Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 197365095b482ed4f3ffef8d4534b4708b261137 Author: Alexey Khoroshilov Date: Wed Aug 8 00:33:25 2012 +0000 net/core: Fix potential memory leak in dev_set_alias() commit 7364e445f62825758fa61195d237a5b8ecdd06ec upstream. Do not leak memory by updating pointer with potentially NULL realloc return value. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 57f8570a279005ca08905f68fa0ca3eeb86213e7 Author: Eric Dumazet Date: Tue Mar 5 07:15:13 2013 +0000 net: reduce net_rx_action() latency to 2 HZ commit d1f41b67ff7735193bc8b418b98ac99a448833e2 upstream. We should use time_after_eq() to get maximum latency of two ticks, instead of three. Bug added in commit 24f8b2385 (net: increase receive packet quantum) Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f0fc2c43f34a037e01ef25be55e841207bf7c7bc Author: Eric Dumazet Date: Thu Jan 10 15:26:34 2013 -0800 softirq: reduce latencies commit c10d73671ad30f54692f7f69f0e09e75d3a8926a upstream. In various network workloads, __do_softirq() latencies can be up to 20 ms if HZ=1000, and 200 ms if HZ=100. This is because we iterate 10 times in the softirq dispatcher, and some actions can consume a lot of cycles. This patch changes the fallback to ksoftirqd condition to : - A time limit of 2 ms. - need_resched() being set on current task When one of this condition is met, we wakeup ksoftirqd for further softirq processing if we still have pending softirqs. Using need_resched() as the only condition can trigger RCU stalls, as we can keep BH disabled for too long. I ran several benchmarks and got no significant difference in throughput, but a very significant reduction of latencies (one order of magnitude) : In following bench, 200 antagonist "netperf -t TCP_RR" are started in background, using all available cpus. Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC IRQ (hard+soft) Before patch : # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind RT_LATENCY=550110.424 MIN_LATENCY=146858 MAX_LATENCY=997109 P50_LATENCY=305000 P90_LATENCY=550000 P99_LATENCY=710000 MEAN_LATENCY=376989.12 STDDEV_LATENCY=184046.92 After patch : # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind RT_LATENCY=40545.492 MIN_LATENCY=9834 MAX_LATENCY=78366 P50_LATENCY=33583 P90_LATENCY=59000 P99_LATENCY=69000 MEAN_LATENCY=38364.67 STDDEV_LATENCY=12865.26 Signed-off-by: Eric Dumazet Cc: David Miller Cc: Tom Herbert Cc: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit bd9fed146ee8c0212a42154108ec7ee8ad3561db Author: Eric Dumazet Date: Thu Apr 5 22:17:46 2012 +0000 netlink: fix races after skb queueing commit 4a7e7c2ad540e54c75489a70137bf0ec15d3a127 upstream. As soon as an skb is queued into socket receive_queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 64d53a6d07366e26ae0c750b2eabb857b72e627a Author: stephen hemminger Date: Thu Dec 22 08:52:03 2011 +0000 netlink: wake up netlink listeners sooner (v2) commit 2c64580046a122fa15bb586d8ca4fd5e4b69a1e7 upstream. This patch changes it to yield sooner at halfway instead. Still not a cure-all for listener overrun if listner is slow, but works much reliably. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f0fb4f06b9f82f4ffee41bd19e3da44deabed78e Author: Eric Dumazet Date: Fri Apr 6 10:49:10 2012 +0200 net: fix a race in sock_queue_err_skb() commit 110c43304db6f06490961529536c362d9ac5732f upstream. As soon as an skb is queued into socket error queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller [PG: net/core/skbuff.c --> include/net/sock.h on 2.6.34 baseline] Signed-off-by: Paul Gortmaker commit 5ef210ddc80203dc81dbbbfd8fe2f1481185186c Author: David Ward Date: Sun Apr 15 12:31:45 2012 +0000 net_sched: gred: Fix oops in gred_dump() in WRED mode commit 244b65dbfede788f2fa3fe2463c44d0809e97c6b upstream. A parameter set exists for WRED mode, called wred_set, to hold the same values for qavg and qidlestart across all VQs. The WRED mode values had been previously held in the VQ for the default DP. After these values were moved to wred_set, the VQ for the default DP was no longer created automatically (so that it could be omitted on purpose, to have packets in the default DP enqueued directly to the device without using RED). However, gred_dump() was overlooked during that change; in WRED mode it still reads qavg/qidlestart from the VQ for the default DP, which might not even exist. As a result, this command sequence will cause an oops: tc qdisc add dev $DEV handle $HANDLE parent $PARENT gred setup \ DPs 3 default 2 grio tc qdisc change dev $DEV handle $HANDLE gred DP 0 prio 8 $RED_OPTIONS tc qdisc change dev $DEV handle $HANDLE gred DP 1 prio 8 $RED_OPTIONS This fixes gred_dump() in WRED mode to use the values held in wred_set. Signed-off-by: David Ward Signed-off-by: David S. Miller [PG: in 2.6.34 it is tab[]->parms vs. tab[]->vars of 3.4 baseline] Signed-off-by: Paul Gortmaker commit 0c4b80040717fd36afa06186ff676b6a831aa009 Author: Eric Dumazet Date: Sun Apr 29 09:08:22 2012 +0000 netem: fix possible skb leak commit 116a0fc31c6c9b8fc821be5a96e5bf0b43260131 upstream. skb_checksum_help(skb) can return an error, we must free skb in this case. qdisc_drop(skb, sch) can also be feeded with a NULL skb (if skb_unshare() failed), so lets use this generic helper. Signed-off-by: Eric Dumazet Cc: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 7daf25800c9235aad1abc71f7cacb3f6e7b24d06 Author: danborkmann@iogearbox.net Date: Fri Aug 10 22:48:54 2012 +0000 af_packet: remove BUG statement in tpacket_destruct_skb commit 7f5c3e3a80e6654cf48dfba7cf94f88c6b505467 upstream. Here's a quote of the comment about the BUG macro from asm-generic/bug.h: Don't use BUG() or BUG_ON() unless there's really no way out; one example might be detecting data structure corruption in the middle of an operation that can't be backed out of. If the (sub)system can somehow continue operating, perhaps with reduced functionality, it's probably not BUG-worthy. If you're tempted to BUG(), think again: is completely giving up really the *only* solution? There are usually better options, where users don't need to reboot ASAP and can mostly shut down cleanly. In our case, the status flag of a ring buffer slot is managed from both sides, the kernel space and the user space. This means that even though the kernel side might work as expected, the user space screws up and changes this flag right between the send(2) is triggered when the flag is changed to TP_STATUS_SENDING and a given skb is destructed after some time. Then, this will hit the BUG macro. As David suggested, the best solution is to simply remove this statement since it cannot be used for kernel side internal consistency checks. I've tested it and the system still behaves /stable/ in this case, so in accordance with the above comment, we should rather remove it. Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 3f59f82650136ab39b7998439750199a747781e5 Author: Stephen Hemminger Date: Mon Feb 11 08:22:22 2013 +0000 bridge: set priority of STP packets commit 547b4e718115eea74087e28d7fa70aec619200db upstream. Spanning Tree Protocol packets should have always been marked as control packets, this causes them to get queued in the high prirority FIFO. As Radia Perlman mentioned in her LCA talk, STP dies if bridge gets overloaded and can't communicate. This is a long-standing bug back to the first versions of Linux bridge. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 7a2fc86bccc5b4b48957a9d4d7abf92634fcf1f1 Author: Hiroaki SHIMODA Date: Fri Aug 3 19:57:52 2012 +0900 net_sched: gact: Fix potential panic in tcf_gact(). commit 696ecdc10622d86541f2e35cc16e15b6b3b1b67e upstream. gact_rand array is accessed by gact->tcfg_ptype whose value is assumed to less than MAX_RAND, but any range checks are not performed. So add a check in tcf_gact_init(). And in tcf_gact(), we can reduce a branch. Signed-off-by: Hiroaki SHIMODA Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit c724c8701d2da28e7c5d1cd7ada4a769104d6854 Author: Stefan Hasko Date: Fri Dec 21 15:04:59 2012 +0000 net: sched: integer overflow fix commit d2fe85da52e89b8012ffad010ef352a964725d5f upstream. Fixed integer overflow in function htb_dequeue Signed-off-by: Stefan Hasko Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit a897438606f9805a77ca424bae3573e7624c0bfa Author: Cong Wang Date: Mon Jan 7 21:17:00 2013 +0000 net: prevent setting ttl=0 via IP_TTL commit c9be4a5c49cf51cc70a993f004c5bb30067a65ce upstream. A regression is introduced by the following commit: commit 4d52cfbef6266092d535237ba5a4b981458ab171 Author: Eric Dumazet Date: Tue Jun 2 00:42:16 2009 -0700 net: ipv4/ip_sockglue.c cleanups Pure cleanups but it is not a pure cleanup... - if (val != -1 && (val < 1 || val>255)) + if (val != -1 && (val < 0 || val > 255)) Since there is no reason provided to allow ttl=0, change it back. Reported-by: nitin padalia Cc: nitin padalia Cc: Eric Dumazet Cc: David S. Miller Signed-off-by: Cong Wang Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 1a81087bf57fb43c1dc08e18c44b75ca1bcf4925 Author: Jesper Dangaard Brouer Date: Wed Oct 31 02:45:32 2012 +0000 net: fix divide by zero in tcp algorithm illinois commit 8f363b77ee4fbf7c3bbcf5ec2c5ca482d396d664 upstream. Reading TCP stats when using TCP Illinois congestion control algorithm can cause a divide by zero kernel oops. The division by zero occur in tcp_illinois_info() at: do_div(t, ca->cnt_rtt); where ca->cnt_rtt can become zero (when rtt_reset is called) Steps to Reproduce: 1. Register tcp_illinois: # sysctl -w net.ipv4.tcp_congestion_control=illinois 2. Monitor internal TCP information via command "ss -i" # watch -d ss -i 3. Establish new TCP conn to machine Either it fails at the initial conn, or else it needs to wait for a loss or a reset. This is only related to reading stats. The function avg_delay() also performs the same divide, but is guarded with a (ca->cnt_rtt > 0) at its calling point in update_params(). Thus, simply fix tcp_illinois_info(). Function tcp_illinois_info() / get_info() is called without socket lock. Thus, eliminate any race condition on ca->cnt_rtt by using a local stack variable. Simply reuse info.tcpv_rttcnt, as its already set to ca->cnt_rtt. Function avg_delay() is not affected by this race condition, as its called with the socket lock. Cc: Petr Matousek Signed-off-by: Jesper Dangaard Brouer Acked-by: Eric Dumazet Acked-by: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 5b377aafbb0b58c2a87f9fdae44c649aa4d79272 Author: Eric Dumazet Date: Mon Sep 24 07:00:11 2012 +0000 net: guard tcp_set_keepalive() to tcp sockets commit 3e10986d1d698140747fcfc2761ec9cb64c1d582 upstream. Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 705388467e4393920abddc71dd864e0cb15e6459 Author: Mathias Krause Date: Wed Aug 15 11:31:57 2012 +0000 net: fix info leak in compat dev_ifconf() commit 43da5f2e0d0c69ded3d51907d9552310a6b545e8 upstream. The implementation of dev_ifconf() for the compat ioctl interface uses an intermediate ifc structure allocated in userland for the duration of the syscall. Though, it fails to initialize the padding bytes inserted for alignment and that for leaks four bytes of kernel stack. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 6acdcc4aa78ffbeded08e25c3831047bb5461fbd Author: Eric Dumazet Date: Thu Mar 21 17:36:09 2013 +0000 tcp: preserve ACK clocking in TSO commit f4541d60a449afd40448b06496dcd510f505928e upstream. A long standing problem with TSO is the fact that tcp_tso_should_defer() rearms the deferred timer, while it should not. Current code leads to following bad bursty behavior : 20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119 20:11:24.484337 IP B > A: . ack 263721 win 1117 20:11:24.485086 IP B > A: . ack 265241 win 1117 20:11:24.485925 IP B > A: . ack 266761 win 1117 20:11:24.486759 IP B > A: . ack 268281 win 1117 20:11:24.487594 IP B > A: . ack 269801 win 1117 20:11:24.488430 IP B > A: . ack 271321 win 1117 20:11:24.489267 IP B > A: . ack 272841 win 1117 20:11:24.490104 IP B > A: . ack 274361 win 1117 20:11:24.490939 IP B > A: . ack 275881 win 1117 20:11:24.491775 IP B > A: . ack 277401 win 1117 20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119 20:11:24.492620 IP B > A: . ack 278921 win 1117 20:11:24.493448 IP B > A: . ack 280441 win 1117 20:11:24.494286 IP B > A: . ack 281961 win 1117 20:11:24.495122 IP B > A: . ack 283481 win 1117 20:11:24.495958 IP B > A: . ack 285001 win 1117 20:11:24.496791 IP B > A: . ack 286521 win 1117 20:11:24.497628 IP B > A: . ack 288041 win 1117 20:11:24.498459 IP B > A: . ack 289561 win 1117 20:11:24.499296 IP B > A: . ack 291081 win 1117 20:11:24.500133 IP B > A: . ack 292601 win 1117 20:11:24.500970 IP B > A: . ack 294121 win 1117 20:11:24.501388 IP B > A: . ack 295641 win 1117 20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119 While the expected behavior is more like : 20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119 20:19:49.260446 IP B > A: . ack 154281 win 1212 20:19:49.261282 IP B > A: . ack 155801 win 1212 20:19:49.262125 IP B > A: . ack 157321 win 1212 20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119 20:19:49.262958 IP B > A: . ack 158841 win 1212 20:19:49.263795 IP B > A: . ack 160361 win 1212 20:19:49.264628 IP B > A: . ack 161881 win 1212 20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119 20:19:49.265465 IP B > A: . ack 163401 win 1212 20:19:49.265886 IP B > A: . ack 164921 win 1212 20:19:49.266722 IP B > A: . ack 166441 win 1212 20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119 20:19:49.267559 IP B > A: . ack 167961 win 1212 20:19:49.268394 IP B > A: . ack 169481 win 1212 20:19:49.269232 IP B > A: . ack 171001 win 1212 20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119 Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Cc: Van Jacobson Cc: Neal Cardwell Cc: Nandita Dukkipati Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit fb883a89a99481dca7cd6b9ef942c35165ff2d54 Author: Eric Dumazet Date: Sun Jan 6 18:21:49 2013 +0000 tcp: fix MSG_SENDPAGE_NOTLAST logic commit ae62ca7b03217be5e74759dc6d7698c95df498b3 upstream. commit 35f9c09fe9c72e (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag : MSG_SENDPAGE_NOTLAST meant to be set on all frags but the last one for a splice() call. The condition used to set the flag in pipe_to_sendpage() relied on splice() user passing the exact number of bytes present in the pipe, or a smaller one. But some programs pass an arbitrary high value, and the test fails. The effect of this bug is a lack of tcp_push() at the end of a splice(pipe -> socket) call, and possibly very slow or erratic TCP sessions. We should both test sd->total_len and fact that another fragment is in the pipe (pipe->nrbufs > 1) Many thanks to Willy for providing very clear bug report, bisection and test programs. Reported-by: Willy Tarreau Bisected-by: Willy Tarreau Tested-by: Willy Tarreau Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 9e1847c09be4a53358f5b5188cafe8b395a774cc Author: Eric Dumazet Date: Thu Apr 5 03:05:35 2012 +0000 tcp: tcp_sendpages() should call tcp_push() once commit 35f9c09fe9c72eb8ca2b8e89a593e1c151f28fc2 upstream. commit 2f533844242 (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: Tom Herbert Cc: Nandita Dukkipati Cc: Neal Cardwell Cc: Tom Herbert Cc: Yuchung Cheng Cc: H.K. Jerry Chu Cc: Maciej Żenczykowski Cc: Mahesh Bandewar Cc: Ilpo Järvinen Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit c37c78f51f66e519724c27a291f434df079fc8ee Author: Eric Dumazet Date: Tue Apr 3 09:37:01 2012 +0000 tcp: allow splice() to build full TSO packets commit 2f53384424251c06038ae612e56231b96ab610ee upstream. vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a time, adding at most 4096 bytes to an skb. (assuming PAGE_SIZE=4096) The call to tcp_push() at the end of do_tcp_sendpages() forces an immediate xmit when pipe is not already filled, and tso_fragment() try to split these skb to MSS multiples. 4096 bytes are usually split in a skb with 2 MSS, and a remaining sub-mss skb (assuming MTU=1500) This makes slow start suboptimal because many small frames are sent to qdisc/driver layers instead of big ones (constrained by cwnd and packets in flight of course) In fact, applications using sendmsg() (adding an additional memory copy) instead of vmsplice()/splice()/sendfile() are a bit faster because of this anomaly, especially if serving small files in environments with large initial [c]wnd. Call tcp_push() only if MSG_MORE is not set in the flags parameter. This bit is automatically provided by splice() internals but for the last page, or on all pages if user specified SPLICE_F_MORE splice() flag. In some workloads, this can reduce number of sent logical packets by an order of magnitude, making zero-copy TCP actually faster than one-copy :) Reported-by: Tom Herbert Cc: Nandita Dukkipati Cc: Neal Cardwell Cc: Tom Herbert Cc: Yuchung Cheng Cc: H.K. Jerry Chu Cc: Maciej Żenczykowski Cc: Mahesh Bandewar Cc: Ilpo Järvinen Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit b34a6044b346f2277fe6e2318db1da2abfd1bf65 Author: Paul Moore Date: Mon Mar 25 03:18:33 2013 +0000 unix: fix a race condition in unix_release() commit ded34e0fe8fe8c2d595bfa30626654e4b87621e0 upstream. As reported by Jan, and others over the past few years, there is a race condition caused by unix_release setting the sock->sk pointer to NULL before properly marking the socket as dead/orphaned. This can cause a problem with the LSM hook security_unix_may_send() if there is another socket attempting to write to this partially released socket in between when sock->sk is set to NULL and it is marked as dead/orphaned. This patch fixes this by only setting sock->sk to NULL after the socket has been marked as dead; I also take the opportunity to make unix_release_sock() a void function as it only ever returned 0/success. Dave, I think this one should go on the -stable pile. Special thanks to Jan for coming up with a reproducer for this problem. Reported-by: Jan Stancek Signed-off-by: Paul Moore Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit eb8896864a982080bb25627bdabcfce8561adce4 Author: Tommi Rantala Date: Tue Nov 27 04:01:46 2012 +0000 sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails commit be364c8c0f17a3dd42707b5a090b318028538eb9 upstream. Trinity (the syscall fuzzer) discovered a memory leak in SCTP, reproducible e.g. with the sendto() syscall by passing invalid user space pointer in the second argument: #include #include #include int main(void) { int fd; struct sockaddr_in sa; fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/); if (fd < 0) return 1; memset(&sa, 0, sizeof(sa)); sa.sin_family = AF_INET; sa.sin_addr.s_addr = inet_addr("127.0.0.1"); sa.sin_port = htons(11111); sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa)); return 0; } As far as I can tell, the leak has been around since ~2003. Signed-off-by: Tommi Rantala Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 051c3342813e7bda856fc43534e29c9023c648fc Author: Daniel Borkmann Date: Fri Feb 8 03:04:34 2013 +0000 net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree commit 6ba542a291a5e558603ac51cda9bded347ce7627 upstream. In sctp_setsockopt_auth_key, we create a temporary copy of the user passed shared auth key for the endpoint or association and after internal setup, we free it right away. Since it's sensitive data, we should zero out the key before returning the memory back to the allocator. Thus, use kzfree instead of kfree, just as we do in sctp_auth_key_put(). Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit f85a52045dda90826bbb59b6b68f6332529eadab Author: Daniel Borkmann Date: Fri Feb 8 03:04:35 2013 +0000 net: sctp: sctp_endpoint_free: zero out secret key data commit b5c37fe6e24eec194bb29d22fdd55d73bcc709bf upstream. On sctp_endpoint_destroy, previously used sensitive keying material should be zeroed out before the memory is returned, as we already do with e.g. auth keys when released. Signed-off-by: Daniel Borkmann Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit c9ea1069d2243c74c812dc91a3f87903823dcc65 Author: Daniel Borkmann Date: Thu Feb 7 00:55:37 2013 +0000 net: sctp: sctp_auth_key_put: use kzfree instead of kfree commit 586c31f3bf04c290dc0a0de7fc91d20aa9a5ee53 upstream. For sensitive data like keying material, it is common practice to zero out keys before returning the memory back to the allocator. Thus, use kzfree instead of kfree. Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 5eb0345a28351e4e2b785e57f2552bf7c497625a Author: Jozsef Kadlecsik Date: Tue Apr 3 22:02:01 2012 +0200 netfilter: nf_ct_ipv4: packets with wrong ihl are invalid commit 07153c6ec074257ade76a461429b567cff2b3a1e upstream. It was reported that the Linux kernel sometimes logs: klogd: [2629147.402413] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 447! klogd: [1072212.887368] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 392 ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in nf_conntrack_proto_tcp.c should catch malformed packets, so the errors at the indicated lines - TCP options parsing - should not happen. However, tcp_error() relies on the "dataoff" offset to the TCP header, calculated by ipv4_get_l4proto(). But ipv4_get_l4proto() does not check bogus ihl values in IPv4 packets, which then can slip through tcp_error() and get caught at the TCP options parsing routines. The patch fixes ipv4_get_l4proto() by invalidating packets with bogus ihl value. The patch closes netfilter bugzilla id 771. Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Paul Gortmaker commit 5600ff4a1b9f92a2030bd04da55866a4f937b1cf Author: Mathias Krause Date: Wed Aug 15 11:31:56 2012 +0000 ipvs: fix info leak in getsockopt(IP_VS_SO_GET_TIMEOUT) commit 2d8a041b7bfe1097af21441cb77d6af95f4f4680 upstream. If at least one of CONFIG_IP_VS_PROTO_TCP or CONFIG_IP_VS_PROTO_UDP is not set, __ip_vs_get_timeouts() does not fully initialize the structure that gets copied to userland and that for leaks up to 12 bytes of kernel stack. Add an explicit memset(0) before passing the structure to __ip_vs_get_timeouts() to avoid the info leak. Signed-off-by: Mathias Krause Cc: Wensong Zhang Cc: Simon Horman Cc: Julian Anastasov Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit be217bff917544cda30abac24fab22f175d1044e Author: Mathias Krause Date: Sun Apr 7 01:51:47 2013 +0000 atm: update msg_namelen in vcc_recvmsg() commit 9b3e617f3df53822345a8573b6d358f6b9e5ed87 upstream. The current code does not fill the msg_name member in case it is set. It also does not set the msg_namelen member to 0 and therefore makes net/socket.c leak the local, uninitialized sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. Fix that by simply setting msg_namelen to 0 as obviously nobody cared about vcc_recvmsg() not filling the msg_name in case it was set. Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit b34d3db52e94795a1b92563fde8c5de1be28906e Author: Mathias Krause Date: Wed Aug 15 11:31:45 2012 +0000 atm: fix info leak via getsockname() commit 3c0c5cfdcd4d69ffc4b9c0907cec99039f30a50a upstream. The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 105253a4736e72ea28cffdd7143f204837b1a2e6 Author: Mathias Krause Date: Wed Aug 15 11:31:44 2012 +0000 atm: fix info leak in getsockopt(SO_ATMPVC) commit e862f1a9b7df4e8196ebec45ac62295138aa3fc2 upstream. The ATM code fails to initialize the two padding bytes of struct sockaddr_atmpvc inserted for alignment. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit c327aec64e7ddd03b70e357331de5350ed2fe220 Author: Mathias Krause Date: Sun Apr 7 01:51:48 2013 +0000 ax25: fix info leak via msg_name in ax25_recvmsg() commit ef3313e84acbf349caecae942ab3ab731471f1a1 upstream. When msg_namelen is non-zero the sockaddr info gets filled out, as requested, but the code fails to initialize the padding bytes of struct sockaddr_ax25 inserted by the compiler for alignment. Additionally the msg_namelen value is updated to sizeof(struct full_sockaddr_ax25) but is not always filled up to this size. Both issues lead to the fact that the code will leak uninitialized kernel stack bytes in net/socket.c. Fix both issues by initializing the memory with memset(0). Cc: Ralf Baechle Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit a3d88061efd02e2d4884118c8d094e1d568c980a Author: Wu Fengguang Date: Thu Aug 2 23:10:01 2012 +0000 isdnloop: fix and simplify isdnloop_init() commit 77f00f6324cb97cf1df6f9c4aaeea6ada23abdb2 upstream. Fix a buffer overflow bug by removing the revision and printk. [ 22.016214] isdnloop-ISDN-driver Rev 1.11.6.7 [ 22.097508] isdnloop: (loop0) virtual card added [ 22.174400] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff83244972 [ 22.174400] [ 22.436157] Pid: 1, comm: swapper Not tainted 3.5.0-bisect-00018-gfa8bbb1-dirty #129 [ 22.624071] Call Trace: [ 22.720558] [] ? CallcNew+0x56/0x56 [ 22.815248] [] panic+0x110/0x329 [ 22.914330] [] ? isdnloop_init+0xaf/0xb1 [ 23.014800] [] ? CallcNew+0x56/0x56 [ 23.090763] [] __stack_chk_fail+0x2b/0x30 [ 23.185748] [] isdnloop_init+0xaf/0xb1 Signed-off-by: Fengguang Wu Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 6109d00cfac64a954dc155d4e4d1bde5c432a9e8 Author: Mathias Krause Date: Sun Apr 7 01:51:54 2013 +0000 iucv: Fix missing msg_namelen update in iucv_sock_recvmsg() commit a5598bd9c087dc0efc250a5221e5d0e6f584ee88 upstream. The current code does not fill the msg_name member in case it is set. It also does not set the msg_namelen member to 0 and therefore makes net/socket.c leak the local, uninitialized sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. Fix that by simply setting msg_namelen to 0 as obviously nobody cared about iucv_sock_recvmsg() not filling the msg_name in case it was set. Cc: Ursula Braun Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 89b258e930ecf55c2c9a7c0fb0c428dc45f87980 Author: Mathias Krause Date: Sun Apr 7 01:51:56 2013 +0000 llc: Fix missing msg_namelen update in llc_ui_recvmsg() commit c77a4b9cffb6215a15196ec499490d116dfad181 upstream. For stream sockets the code misses to update the msg_namelen member to 0 and therefore makes net/socket.c leak the local, uninitialized sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. The msg_namelen update is also missing for datagram sockets in case the socket is shutting down during receive. Fix both issues by setting msg_namelen to 0 early. It will be updated later if we're going to fill the msg_name member. Cc: Arnaldo Carvalho de Melo Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit d798b105cee33543b235410d2510219fa887b4d2 Author: Mathias Krause Date: Wed Aug 15 11:31:53 2012 +0000 llc: fix info leak via getsockname() commit 3592aaeb80290bda0f2cf0b5456c97bfc638b192 upstream. The LLC code wrongly returns 0, i.e. "success", when the socket is zapped. Together with the uninitialized uaddrlen pointer argument from sys_getsockname this leads to an arbitrary memory leak of up to 128 bytes kernel stack via the getsockname() syscall. Return an error instead when the socket is zapped to prevent the info leak. Also remove the unnecessary memset(0). We don't directly write to the memory pointed by uaddr but memcpy() a local structure at the end of the function that is properly initialized. Signed-off-by: Mathias Krause Cc: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit dbaaf2eb9d0955c2ed5aa9cdd4d8607204ceec22 Author: Weiping Pan Date: Mon Jul 23 10:37:48 2012 +0800 rds: set correct msg_namelen commit 06b6a1cf6e776426766298d055bb3991957d90a7 upstream. Jay Fenlason (fenlason@redhat.com) found a bug, that recvfrom() on an RDS socket can return the contents of random kernel memory to userspace if it was called with a address length larger than sizeof(struct sockaddr_in). rds_recvmsg() also fails to set the addr_len paramater properly before returning, but that's just a bug. There are also a number of cases wher recvfrom() can return an entirely bogus address. Anything in rds_recvmsg() that returns a non-negative value but does not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path at the end of the while(1) loop will return up to 128 bytes of kernel memory to userspace. And I write two test programs to reproduce this bug, you will see that in rds_server, fromAddr will be overwritten and the following sock_fd will be destroyed. Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is better to make the kernel copy the real length of address to user space in such case. How to run the test programs ? I test them on 32bit x86 system, 3.5.0-rc7. 1 compile gcc -o rds_client rds_client.c gcc -o rds_server rds_server.c 2 run ./rds_server on one console 3 run ./rds_client on another console 4 you will see something like: server is waiting to receive data... old socket fd=3 server received data from client:data from client msg.msg_namelen=32 new socket fd=-1067277685 sendmsg() : Bad file descriptor /***************** rds_client.c ********************/ int main(void) { int sock_fd; struct sockaddr_in serverAddr; struct sockaddr_in toAddr; char recvBuffer[128] = "data from client"; struct msghdr msg; struct iovec iov; sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0); if (sock_fd < 0) { perror("create socket error\n"); exit(1); } memset(&serverAddr, 0, sizeof(serverAddr)); serverAddr.sin_family = AF_INET; serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1"); serverAddr.sin_port = htons(4001); if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) { perror("bind() error\n"); close(sock_fd); exit(1); } memset(&toAddr, 0, sizeof(toAddr)); toAddr.sin_family = AF_INET; toAddr.sin_addr.s_addr = inet_addr("127.0.0.1"); toAddr.sin_port = htons(4000); msg.msg_name = &toAddr; msg.msg_namelen = sizeof(toAddr); msg.msg_iov = &iov; msg.msg_iovlen = 1; msg.msg_iov->iov_base = recvBuffer; msg.msg_iov->iov_len = strlen(recvBuffer) + 1; msg.msg_control = 0; msg.msg_controllen = 0; msg.msg_flags = 0; if (sendmsg(sock_fd, &msg, 0) == -1) { perror("sendto() error\n"); close(sock_fd); exit(1); } printf("client send data:%s\n", recvBuffer); memset(recvBuffer, '\0', 128); msg.msg_name = &toAddr; msg.msg_namelen = sizeof(toAddr); msg.msg_iov = &iov; msg.msg_iovlen = 1; msg.msg_iov->iov_base = recvBuffer; msg.msg_iov->iov_len = 128; msg.msg_control = 0; msg.msg_controllen = 0; msg.msg_flags = 0; if (recvmsg(sock_fd, &msg, 0) == -1) { perror("recvmsg() error\n"); close(sock_fd); exit(1); } printf("receive data from server:%s\n", recvBuffer); close(sock_fd); return 0; } /***************** rds_server.c ********************/ int main(void) { struct sockaddr_in fromAddr; int sock_fd; struct sockaddr_in serverAddr; unsigned int addrLen; char recvBuffer[128]; struct msghdr msg; struct iovec iov; sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0); if(sock_fd < 0) { perror("create socket error\n"); exit(0); } memset(&serverAddr, 0, sizeof(serverAddr)); serverAddr.sin_family = AF_INET; serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1"); serverAddr.sin_port = htons(4000); if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) { perror("bind error\n"); close(sock_fd); exit(1); } printf("server is waiting to receive data...\n"); msg.msg_name = &fromAddr; /* * I add 16 to sizeof(fromAddr), ie 32, * and pay attention to the definition of fromAddr, * recvmsg() will overwrite sock_fd, * since kernel will copy 32 bytes to userspace. * * If you just use sizeof(fromAddr), it works fine. * */ msg.msg_namelen = sizeof(fromAddr) + 16; /* msg.msg_namelen = sizeof(fromAddr); */ msg.msg_iov = &iov; msg.msg_iovlen = 1; msg.msg_iov->iov_base = recvBuffer; msg.msg_iov->iov_len = 128; msg.msg_control = 0; msg.msg_controllen = 0; msg.msg_flags = 0; while (1) { printf("old socket fd=%d\n", sock_fd); if (recvmsg(sock_fd, &msg, 0) == -1) { perror("recvmsg() error\n"); close(sock_fd); exit(1); } printf("server received data from client:%s\n", recvBuffer); printf("msg.msg_namelen=%d\n", msg.msg_namelen); printf("new socket fd=%d\n", sock_fd); strcat(recvBuffer, "--data from server"); if (sendmsg(sock_fd, &msg, 0) == -1) { perror("sendmsg()\n"); close(sock_fd); exit(1); } } close(sock_fd); return 0; } Signed-off-by: Weiping Pan Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 937eabb356279bd9d74de453f0052ba92f5733c5 Author: Mathias Krause Date: Sun Apr 7 01:51:59 2013 +0000 rose: fix info leak via msg_name in rose_recvmsg() commit 4a184233f21645cf0b719366210ed445d1024d72 upstream. The code in rose_recvmsg() does not initialize all of the members of struct sockaddr_rose/full_sockaddr_rose when filling the sockaddr info. Nor does it initialize the padding bytes of the structure inserted by the compiler for alignment. This will lead to leaking uninitialized kernel stack bytes in net/socket.c. Fix the issue by initializing the memory used for sockaddr info with memset(0). Cc: Ralf Baechle Signed-off-by: Mathias Krause Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 9c3c9ff3e60d351dfcd0e291eb7391fe9d3332a3 Author: Kees Cook Date: Wed Sep 11 21:56:54 2013 +0200 HID: LG: validate HID output report details commit 0fb6bd06e06792469acc15bbe427361b56ada528 upstream. A HID device could send a malicious output report that would cause the lg, lg3, and lg4 HID drivers to write beyond the output report allocation during an event, causing a heap overflow: [ 325.245240] usb 1-1: New USB device found, idVendor=046d, idProduct=c287 ... [ 414.518960] BUG kmalloc-4096 (Not tainted): Redzone overwritten Additionally, while lg2 did correctly validate the report details, it was cleaned up and shortened. CVE-2013-2893 Signed-off-by: Kees Cook Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina [PG: drop hid-lg4ff.c chunk; file not present in 2.6.34 baseline.] Signed-off-by: Paul Gortmaker commit 625162a5188da5dcf1d65afa2307d22f58e0e4cc Author: Kees Cook Date: Wed Sep 11 21:56:51 2013 +0200 HID: zeroplus: validate output report details commit 78214e81a1bf43740ce89bb5efda78eac2f8ef83 upstream. The zeroplus HID driver was not checking the size of allocated values in fields it used. A HID device could send a malicious output report that would cause the driver to write beyond the output report allocation during initialization, causing a heap overflow: [ 1442.728680] usb 1-1: New USB device found, idVendor=0c12, idProduct=0005 ... [ 1466.243173] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten CVE-2013-2889 Signed-off-by: Kees Cook Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Paul Gortmaker commit 8e8f904083cd716a0244e89020b26a7346202a3b Author: Kees Cook Date: Wed Sep 11 21:56:50 2013 +0200 HID: provide a helper for validating hid reports commit 331415ff16a12147d57d5c953f3a961b7ede348b upstream. Many drivers need to validate the characteristics of their HID report during initialization to avoid misusing the reports. This adds a common helper to perform validation of the report exisitng, the field existing, and the expected number of values within the field. Signed-off-by: Kees Cook Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina [PG: in 2.6.34 it is err_hid(), original baseline had hid_err().] Signed-off-by: Paul Gortmaker commit 88b152e6f128b050bc15e16d55c003994be352ce Author: Kees Cook Date: Wed Aug 28 22:30:49 2013 +0200 HID: pantherlord: validate output report details commit 412f30105ec6735224535791eed5cdc02888ecb4 upstream. A HID device could send a malicious output report that would cause the pantherlord HID driver to write beyond the output report allocation during initialization, causing a heap overflow: [ 310.939483] usb 1-1: New USB device found, idVendor=0e8f, idProduct=0003 ... [ 315.980774] BUG kmalloc-192 (Tainted: G W ): Redzone overwritten CVE-2013-2892 Signed-off-by: Kees Cook Signed-off-by: Jiri Kosina Signed-off-by: Paul Gortmaker commit a04155e1d8343be1dee5e52979cae7a5afef1d88 Author: Kees Cook Date: Wed Aug 28 22:29:55 2013 +0200 HID: validate HID report id size commit 43622021d2e2b82ea03d883926605bdd0525e1d1 upstream. The "Report ID" field of a HID report is used to build indexes of reports. The kernel's index of these is limited to 256 entries, so any malicious device that sets a Report ID greater than 255 will trigger memory corruption on the host: [ 1347.156239] BUG: unable to handle kernel paging request at ffff88094958a878 [ 1347.156261] IP: [] hid_register_report+0x2a/0x8b CVE-2013-2888 Signed-off-by: Kees Cook Signed-off-by: Jiri Kosina [PG: hid_err() --> dbg_hid() in 2.6.34 baseline] Signed-off-by: Paul Gortmaker commit 00537e385f55251e291af246dfd72643acf58011 Author: Neil Horman Date: Tue Sep 17 08:33:11 2013 -0400 crypto: ansi_cprng - Fix off by one error in non-block size request commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream. Stephan Mueller reported to me recently a error in random number generation in the ansi cprng. If several small requests are made that are less than the instances block size, the remainder for loop code doesn't increment rand_data_valid in the last iteration, meaning that the last bytes in the rand_data buffer gets reused on the subsequent smaller-than-a-block request for random data. The fix is pretty easy, just re-code the for loop to make sure that rand_data_valid gets incremented appropriately Signed-off-by: Neil Horman Reported-by: Stephan Mueller CC: Stephan Mueller CC: Petr Matousek CC: Herbert Xu CC: "David S. Miller" Signed-off-by: Herbert Xu Signed-off-by: Paul Gortmaker commit 693270aa404c77f69a27cef959f097fdb70dcbd1 Author: Eryu Guan Date: Tue Nov 1 19:04:59 2011 -0400 jbd/jbd2: validate sb->s_first in journal_get_superblock() commit 8762202dd0d6e46854f786bdb6fb3780a1625efe upstream. I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3 image has s_first = 0 in journal superblock, and the 0 is passed to journal->j_head in journal_reset(), then to blocknr in cleanup_journal_tail(), in the end the J_ASSERT failed. So validate s_first after reading journal superblock from disk in journal_get_superblock() to ensure s_first is valid. The following script could reproduce it: fstype=ext3 blocksize=1024 img=$fstype.img offset=0 found=0 magic="c0 3b 39 98" dd if=/dev/zero of=$img bs=1M count=8 mkfs -t $fstype -b $blocksize -F $img filesize=`stat -c %s $img` while [ $offset -lt $filesize ] do if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then echo "Found journal: $offset" found=1 break fi offset=`echo "$offset+$blocksize" | bc` done if [ $found -ne 1 ];then echo "Magic \"$magic\" not found" exit 1 fi dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1 mkdir -p ./mnt mount -o loop $img ./mnt Cc: Jan Kara Signed-off-by: Eryu Guan Signed-off-by: "Theodore Ts'o" Signed-off-by: Paul Gortmaker commit e109421436308ef2ffb4f91db428cce7b86de0fb Author: Hannes Frederic Sowa Date: Mon Jul 1 20:21:30 2013 +0200 ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1 upstream. We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev: ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[] [] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [] skb_push+0x3a/0x40 [] ip6_push_pending_frames+0x1f6/0x4d0 [] ? mark_held_locks+0xbb/0x140 [] udp_v6_push_pending_frames+0x2b9/0x3d0 [] ? udplite_getfrag+0x20/0x20 [] udp_lib_setsockopt+0x1aa/0x1f0 [] ? fget_light+0x387/0x4f0 [] udpv6_setsockopt+0x34/0x40 [] sock_common_setsockopt+0x14/0x20 [] SyS_setsockopt+0x71/0xd0 [] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [] skb_panic+0x63/0x65 RSP This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones Cc: YOSHIFUJI Hideaki Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller [PG: The line "flowi6 *fl6 = &inet->cork.fl.u.ip6" was "flowi *fl = &inet->cork.fl" in 2.6.34 kernel.] Signed-off-by: Paul Gortmaker commit 859ca6fe36694a71a32952aacc29be742e04a3e0 Author: Tyler Hicks Date: Thu Jun 20 13:13:59 2013 -0700 libceph: Fix NULL pointer dereference in auth client code commit 2cb33cac622afde897aa02d3dcd9fbba8bae839e upstream. A malicious monitor can craft an auth reply message that could cause a NULL function pointer dereference in the client's kernel. To prevent this, the auth_none protocol handler needs an empty ceph_auth_client_ops->build_request() function. CVE-2013-1059 Signed-off-by: Tyler Hicks Reported-by: Chanam Park Reviewed-by: Seth Arnold Reviewed-by: Sage Weil [PG: in v2.6.34, file is fs/ceph and not net/ceph] Signed-off-by: Paul Gortmaker commit a11a914060471392c91a7f92fc334df220abdf64 Author: Paolo Bonzini Date: Thu Jan 12 16:01:29 2012 +0100 dm: do not forward ioctls from logical volumes to the underlying device commit ec8013beddd717d1740cfefb1a9b900deef85462 upstream. A logical volume can map to just part of underlying physical volume. In this case, it must be treated like a partition. Based on a patch from Alasdair G Kergon. Cc: Alasdair G Kergon Cc: dm-devel@redhat.com Signed-off-by: Paolo Bonzini Signed-off-by: Linus Torvalds [PG: drop drivers/md/dm-flakey.c chunk; file not present in 2.6.34] Signed-off-by: Paul Gortmaker commit 444f3ec5479aa771543435351460ac06eca0aeb0 Author: Paolo Bonzini Date: Thu Jan 12 16:01:28 2012 +0100 block: fail SCSI passthrough ioctls on partition devices commit 0bfc96cb77224736dfa35c3c555d37b3646ef35e upstream. Linux allows executing the SG_IO ioctl on a partition or LVM volume, and will pass the command to the underlying block device. This is well-known, but it is also a large security problem when (via Unix permissions, ACLs, SELinux or a combination thereof) a program or user needs to be granted access only to part of the disk. This patch lets partitions forward a small set of harmless ioctls; others are logged with printk so that we can see which ioctls are actually sent. In my tests only CDROM_GET_CAPABILITY actually occurred. Of course it was being sent to a (partition on a) hard disk, so it would have failed with ENOTTY and the patch isn't changing anything in practice. Still, I'm treating it specially to avoid spamming the logs. In principle, this restriction should include programs running with CAP_SYS_RAWIO. If for example I let a program access /dev/sda2 and /dev/sdb, it still should not be able to read/write outside the boundaries of /dev/sda2 independent of the capabilities. However, for now programs with CAP_SYS_RAWIO will still be allowed to send the ioctls. Their actions will still be logged. This patch does not affect the non-libata IDE driver. That driver however already tests for bd != bd->bd_contains before issuing some ioctl; it could be restricted further to forbid these ioctls even for programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO. Cc: linux-scsi@vger.kernel.org Cc: Jens Axboe Cc: James Bottomley Signed-off-by: Paolo Bonzini [ Make it also print the command name when warning - Linus ] Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit d559304967eb039340f645dd4292c39618369e1f Author: Paolo Bonzini Date: Thu Jan 12 16:01:27 2012 +0100 block: add and use scsi_blk_cmd_ioctl commit 577ebb374c78314ac4617242f509e2f5e7156649 upstream. Introduce a wrapper around scsi_cmd_ioctl that takes a block device. The function will then be enhanced to detect partition block devices and, in that case, subject the ioctls to whitelisting. Cc: linux-scsi@vger.kernel.org Cc: Jens Axboe Cc: James Bottomley Signed-off-by: Paolo Bonzini Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit b0d623c5095ffabae406dbdeb27f1c9188646c78 Author: Herbert Xu Date: Fri Feb 11 12:36:55 2011 +0000 bridge: Fix mglist corruption that leads to memory corruption commit 6b0d6a9b4296fa16a28d10d416db7a770fc03287 upstream. The list mp->mglist is used to indicate whether a multicast group is active on the bridge interface itself as opposed to one of the constituent interfaces in the bridge. Unfortunately the operation that adds the mp->mglist node to the list neglected to check whether it has already been added. This leads to list corruption in the form of nodes pointing to itself. Normally this would be quite obvious as it would cause an infinite loop when walking the list. However, as this list is never actually walked (which means that we don't really need it, I'll get rid of it in a subsequent patch), this instead is hidden until we perform a delete operation on the affected nodes. As the same node may now be pointed to by more than one node, the delete operations can then cause modification of freed memory. This was observed in practice to cause corruption in 512-byte slabs, most commonly leading to crashes in jbd2. Thanks to Josef Bacik for pointing me in the right direction. Reported-by: Ian Page Hands Signed-off-by: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Paul Gortmaker commit 624a0120be37b570a4f931933e48e3874fb9fdfb Author: Alex Williamson Date: Tue Apr 17 21:46:44 2012 -0600 KVM: lock slots_lock around device assignment commit 21a1416a1c945c5aeaeaf791b63c64926018eb77 upstream. As pointed out by Jason Baron, when assigning a device to a guest we first set the iommu domain pointer, which enables mapping and unmapping of memory slots to the iommu. This leaves a window where this path is enabled, but we haven't synchronized the iommu mappings to the existing memory slots. Thus a slot being removed at that point could send us down unexpected code paths removing non-existent pinnings and iommu mappings. Take the slots_lock around creating the iommu domain and initial mappings as well as around iommu teardown to avoid this race. Signed-off-by: Alex Williamson Signed-off-by: Marcelo Tosatti [PG: drop goto for EPERM check, 2.6.34 doesn't have that code] Signed-off-by: Paul Gortmaker commit 232fc56294ae61346b56a84d54417619fb187d41 Author: Alex Williamson Date: Wed Apr 11 09:51:49 2012 -0600 KVM: unmap pages from the iommu when slots are removed commit 32f6daad4651a748a58a3ab6da0611862175722f upstream. We've been adding new mappings, but not destroying old mappings. This can lead to a page leak as pages are pinned using get_user_pages, but only unpinned with put_page if they still exist in the memslots list on vm shutdown. A memslot that is destroyed while an iommu domain is enabled for the guest will therefore result in an elevated page reference count that is never cleared. Additionally, without this fix, the iommu is only programmed with the first translation for a gpa. This can result in peer-to-peer errors if a mapping is destroyed and replaced by a new mapping at the same gpa as the iommu will still be pointing to the original, pinned memory address. Signed-off-by: Alex Williamson Signed-off-by: Marcelo Tosatti [PG: minor tweak since 2.6.34 doesnt have kvm_for_each_memslot] Signed-off-by: Paul Gortmaker commit 323dc946757629b8f1798b23f749920de6f08dd4 Author: Eric Paris Date: Tue Apr 5 17:20:50 2011 -0400 inotify: fix double free/corruption of stuct user commit d0de4dc584ec6aa3b26fffea320a8457827768fc upstream. On an error path in inotify_init1 a normal user can trigger a double free of struct user. This is a regression introduced by a2ae4cc9a16e ("inotify: stop kernel memory leak on file creation failure"). We fix this by making sure that if a group exists the user reference is dropped when the group is cleaned up. We should not explictly drop the reference on error and also drop the reference when the group is cleaned up. The new lifetime rules are that an inotify group lives from inotify_new_group to the last fsnotify_put_group. Since the struct user and inotify_devs are directly tied to this lifetime they are only changed/updated in those two locations. We get rid of all special casing of struct user or user->inotify_devs. Signed-off-by: Eric Paris Signed-off-by: Linus Torvalds Signed-off-by: Paul Gortmaker commit 01f69c1211fdcbcb6f7f5624234499264687b258 Author: Eric Dumazet Date: Thu Apr 21 09:45:37 2011 +0000 inet: add RCU protection to inet->opt commit f6d8bd051c391c1c0458a30b2a7abcd939329259 upstream. We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet Cc: Herbert Xu Signed-off-by: David S. Miller [dannf/bwh: backported to Debian's 2.6.32] Signed-off-by: Ben Hutchings Signed-off-by: Willy Tarreau [PG: use 2.6.32 patch, since it is closer to 2.6.34 than original baseline; drop net/l2tp/l2tp_ip.c chunk as we don't have that file] Signed-off-by: Paul Gortmaker commit e7c0580a0fc5a0415e0462437417cad04f97141e Author: Paul Gortmaker Date: Mon Jan 27 19:19:40 2014 -0500 Revert "percpu: fix chunk range calculation" This reverts commit 264266e6897dd81c894d1c5cbd90b133707b32f3. The backport had dependencies on other mm/percpu.c restructurings, like those in commit 020ec6537aa65c18e9084c568d7b94727f2026fd ("percpu: factor out pcpu_addr_in_first/reserved_chunk() and update per_cpu_ptr_to_phys()"). Rather than drag in more changes, we simply revert the incomplete backport. Reported-by: George G. Davis Signed-off-by: Paul Gortmaker commit 54c0b8dd3b71ae9d23ce78b7606e233b00e08b0e Author: George G. Davis Date: Fri Apr 26 19:00:53 2013 -0400 udf: fix udf_error build warnings The v2.6.34.14 commit "d7542a6 udf: Avoid run away loop when partition table length is corrupted" introduced the following build warning due to a change in the number of args in udf_error/udf_err for v2.6.34.14: CC fs/udf/super.o fs/udf/super.c: In function 'udf_load_sparable_map': fs/udf/super.c:1259: warning: passing argument 3 of 'udf_error' makes pointer from integer without a cast fs/udf/super.c:1265: warning: passing argument 3 of 'udf_error' makes pointer from integer without a cast fs/udf/super.c: In function 'udf_load_logicalvol': fs/udf/super.c:1313: warning: passing argument 3 of 'udf_error' makes pointer from integer without a cast The above warnings are due to a missing __func__ argument in the above udf_error function calls. This is because of commit 8076c363da15e7 ("udf: Rename udf_error to udf_err") which removed the __func__ arg. Restore the missing __func__ argument to fix the build warnings. Signed-off-by: George G. Davis Signed-off-by: Paul Gortmaker commit 0398dd1747cb3525fd4b140bd1a05fe6dcda477f Author: Romain Francoise Date: Thu Jan 24 11:35:42 2013 -0500 x86, random: make ARCH_RANDOM prompt if EMBEDDED, not EXPERT Before v2.6.38 CONFIG_EXPERT was known as CONFIG_EMBEDDED but the Kconfig entry was not changed to match when upstream commit 628c6246d47b85f5357298601df2444d7f4dd3fd ("x86, random: Architectural inlines to get random integers with RDRAND") was backported. Signed-off-by: Romain Francoise Signed-off-by: Paul Gortmaker