commit 7cdefde351b6911ec5ef39322980296c091f6c52 Author: Greg Kroah-Hartman Date: Wed Jan 29 16:43:27 2020 +0100 Linux 4.19.100 commit 86834898d5a5e5aef9ae6d285201f2d99a4eb300 Author: David Hildenbrand Date: Tue Jan 28 10:50:21 2020 +0100 mm/memory_hotplug: shrink zones when offlining memory commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream. -- snip -- - Missing arm64 hot(un)plug support - Missing some vmem_altmap_offset() cleanups - Missing sub-section hotadd support - Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Michal Hocko Cc: "Matthew Wilcox (Oracle)" Cc: "Aneesh Kumar K.V" Cc: Pavel Tatashin Cc: Greg Kroah-Hartman Cc: Dan Williams Cc: Logan Gunthorpe Cc: [5.0+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit d98d053efa65169c4e29b8b477cd5bc2b7aef513 Author: David Hildenbrand Date: Tue Jan 28 10:50:20 2020 +0100 mm/memory_hotplug: fix try_offline_node() commit 2c91f8fc6c999fe10185d8ad99fda1759f662f70 upstream. -- snip -- Only contextual issues: - Unrelated check_and_unmap_cpu_on_node() changes are missing. - Unrelated walk_memory_blocks() has not been moved/refactored yet. -- snip -- try_offline_node() is pretty much broken right now: - The node span is updated when onlining memory, not when adding it. We ignore memory that was mever onlined. Bad. - We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily trigger a kernel panic. Bad for memory that is offline but also bad for subsection hotadd with ZONE_DEVICE, whereby the memmap of the first PFN of a section might contain garbage. - Sections belonging to mixed nodes are not properly considered. As memory blocks might belong to multiple nodes, we would have to walk all pageblocks (or at least subsections) within present sections. However, we don't have a way to identify whether a memmap that is not online was initialized (relevant for ZONE_DEVICE). This makes things more complicated. Luckily, we can piggy pack on the node span and the nid stored in memory blocks. Currently, the node span is grown when calling move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when removing memory, before calling try_offline_node(). Sysfs links are created via link_mem_sections(), e.g., during boot or when adding memory. If the node still spans memory or if any memory block belongs to the nid, we don't set the node offline. As memory blocks that span multiple nodes cannot get offlined, the nid stored in memory blocks is reliable enough (for such online memory blocks, the node still spans the memory). Introduce for_each_memory_block() to efficiently walk all memory blocks. Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span when removing ZONE_DEVICE memory to fix similar issues (access of garbage memmaps) - until we have a reliable way to identify whether these memmaps were properly initialized. This implies later, that once a node had ZONE_DEVICE memory, we won't be able to set a node offline - which should be acceptable. Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") memory that is added is not assoziated with a zone/node (memmap not initialized). The introducing commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") already missed that we could have multiple nodes for a section and that the zone/node span is updated when onlining pages, not when adding them. I tested this by hotplugging two DIMMs to a memory-less and cpu-less NUMA node. The node is properly onlined when adding the DIMMs. When removing the DIMMs, the node is properly offlined. Masayoshi Mizuma reported: : Without this patch, memory hotplug fails as panic: : : BUG: kernel NULL pointer dereference, address: 0000000000000000 : ... : Call Trace: : remove_memory_block_devices+0x81/0xc0 : try_remove_memory+0xb4/0x130 : __remove_memory+0xa/0x20 : acpi_memory_device_remove+0x84/0x100 : acpi_bus_trim+0x57/0x90 : acpi_bus_trim+0x2e/0x90 : acpi_device_hotplug+0x2b2/0x4d0 : acpi_hotplug_work_fn+0x1a/0x30 : process_one_work+0x171/0x380 : worker_thread+0x49/0x3f0 : kthread+0xf8/0x130 : ret_from_fork+0x35/0x40 [david@redhat.com: v3] Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node") Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visiable after d0dc12e86b319 Signed-off-by: David Hildenbrand Tested-by: Masayoshi Mizuma Cc: Tang Chen Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Keith Busch Cc: Jiri Olsa Cc: "Peter Zijlstra (Intel)" Cc: Jani Nikula Cc: Nayna Jain Cc: Michal Hocko Cc: Oscar Salvador Cc: Stephen Rothwell Cc: Dan Williams Cc: Pavel Tatashin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit f291080659d2bba2a33e65de57e1c2ce11473dac Author: Aneesh Kumar K.V Date: Tue Jan 28 10:50:19 2020 +0100 mm/memunmap: don't access uninitialized memmap in memunmap_pages() commit 77e080e7680e1e615587352f70c87b9e98126d03 upstream. -- snip -- - Missing mm/hmm.c and kernel/memremap.c unification. -- hmm code does not need fixes (no altmap) - Missing 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap") -- snip -- Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6. This series fixes the access of uninitialized memmaps when shrinking zones/nodes and when removing memory. Also, it contains all fixes for crashes that can be triggered when removing certain namespace using memunmap_pages() - ZONE_DEVICE, reported by Aneesh. We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be more involved (we don't have SECTION_IS_ONLINE as an indicator), and shrinking is only of limited use (set_zone_contiguous() cannot detect the ZONE_DEVICE as contiguous). We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of code to a minimum. Shrinking is especially necessary to keep zone->contiguous set where possible, especially, on memory unplug of DIMMs at zone boundaries. -------------------------------------------------------------------------- Zones are now properly shrunk when offlining memory blocks or when onlining failed. This allows to properly shrink zones on memory unplug even if the separate memory blocks of a DIMM were onlined to different zones or re-onlined to a different zone after offlining. Example: :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 :/# echo "online_movable" > /sys/devices/system/memory/memory41/state :/# echo "online_movable" > /sys/devices/system/memory/memory43/state :/# cat /proc/zoneinfo Node 1, zone Movable spanned 98304 present 65536 managed 65536 :/# echo 0 > /sys/devices/system/memory/memory43/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 32768 present 32768 managed 32768 :/# echo 0 > /sys/devices/system/memory/memory41/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 This patch (of 10): With an altmap, the memmap falling into the reserved altmap space are not initialized and, therefore, contain a garbage NID and a garbage zone. Make sure to read the NID/zone from a memmap that was initialized. This fixes a kernel crash that is observed when destroying a namespace: kernel BUG at include/linux/mm.h:1107! cpu 0x1: Vector: 700 (Program Check) at [c000000274087890] pc: c0000000004b9728: memunmap_pages+0x238/0x340 lr: c0000000004b9724: memunmap_pages+0x234/0x340 ... pid = 3669, comm = ndctl kernel BUG at include/linux/mm.h:1107! devm_action_release+0x30/0x50 release_nodes+0x268/0x2d0 device_release_driver_internal+0x174/0x240 unbind_store+0x13c/0x190 drv_attr_store+0x44/0x60 sysfs_kf_write+0x70/0xa0 kernfs_fop_write+0x1ac/0x290 __vfs_write+0x3c/0x70 vfs_write+0xe4/0x200 ksys_write+0x7c/0x140 system_call+0x5c/0x68 The "page_zone(pfn_to_page(pfn)" was introduced by 69324b8f4833 ("mm, devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I think we will never have driver reserved memory with MEMORY_DEVICE_PRIVATE (no altmap AFAIKS). [david@redhat.com: minimze code changes, rephrase description] Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory") Signed-off-by: Aneesh Kumar K.V Signed-off-by: David Hildenbrand Cc: Dan Williams Cc: Jason Gunthorpe Cc: Logan Gunthorpe Cc: Ira Weiny Cc: Damian Tometzki Cc: Alexander Duyck Cc: Alexander Potapenko Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Gerald Schaefer Cc: Greg Kroah-Hartman Cc: Halil Pasic Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jun Yao Cc: Mark Rutland Cc: Masahiro Yamada Cc: "Matthew Wilcox (Oracle)" Cc: Mel Gorman Cc: Michael Ellerman Cc: Michal Hocko Cc: Mike Rapoport Cc: Oscar Salvador Cc: Pankaj Gupta Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: Rich Felker Cc: Robin Murphy Cc: Steve Capper Cc: Thomas Gleixner Cc: Tom Lendacky Cc: Tony Luck Cc: Vasily Gorbik Cc: Vlastimil Babka Cc: Wei Yang Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Cc: [5.0+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit d830a11c6279c416424e0981b241e58bf51c7c8b Author: David Hildenbrand Date: Tue Jan 28 10:50:18 2020 +0100 drivers/base/node.c: simplify unregister_memory_block_under_nodes() commit d84f2f5a755208da3f93e17714631485cb3da11c upstream. We don't allow to offline memory block devices that belong to multiple numa nodes. Therefore, such devices can never get removed. It is sufficient to process a single node when removing the memory block. No need to iterate over each and every PFN. We already have the nid stored for each memory block. Make sure that the nid always has a sane value. Please note that checking for node_online(nid) is not required. If we would have a memory block belonging to a node that is no longer offline, then we would have a BUG in the node offlining code. Link: http://lkml.kernel.org/r/20190719135244.15242-1-david@redhat.com Signed-off-by: David Hildenbrand Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: David Hildenbrand Cc: Stephen Rothwell Cc: Pavel Tatashin Cc: Michal Hocko Cc: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit b9cda6501a340c5d15487575a68454748e3132a4 Author: Dan Williams Date: Tue Jan 28 10:50:17 2020 +0100 mm/hotplug: kill is_dev_zone() usage in __remove_pages() commit 96da4350000973ef9310a10d077d65bbc017f093 upstream. -- snip -- Minor conflict, keep the altmap check. -- snip -- The zone type check was a leftover from the cleanup that plumbed altmap through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the vmem_altmap to arch_remove_memory and __remove_pages". Link: http://lkml.kernel.org/r/156092352642.979959.6664333788149363039.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams Reviewed-by: David Hildenbrand Reviewed-by: Oscar Salvador Tested-by: Aneesh Kumar K.V [ppc64] Cc: Michal Hocko Cc: Logan Gunthorpe Cc: Pavel Tatashin Cc: Jane Chu Cc: Jeff Moyer Cc: Jérôme Glisse Cc: Jonathan Corbet Cc: Mike Rapoport Cc: Toshi Kani Cc: Vlastimil Babka Cc: Wei Yang Cc: Jason Gunthorpe Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit dc6be8597c8c2959a7ba0c7cc809094d5dd8d99a Author: David Hildenbrand Date: Tue Jan 28 10:50:16 2020 +0100 mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section commit b9bf8d342d9b443c0d19aa57883d8ddb38d965de upstream. The parameter is unused, so let's drop it. Memory removal paths should never care about zones. This is the job of memory offlining and will require more refactorings. Link: http://lkml.kernel.org/r/20190527111152.16324-12-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Dan Williams Reviewed-by: Wei Yang Reviewed-by: Oscar Salvador Acked-by: Michal Hocko Cc: Alex Deucher Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Greg Kroah-Hartman Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 030d045dc067c6b67e4fb4a69e56d7cbda1b1f65 Author: David Hildenbrand Date: Tue Jan 28 10:50:15 2020 +0100 mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail commit a31b264c2b415b29660da0bc2ba291a98629ce51 upstream. We really don't want anything during memory hotunplug to fail. We always pass a valid memory block device, that check can go. Avoid allocating memory and eventually failing. As we are always called under lock, we can use a static piece of memory. This avoids having to put the structure onto the stack, having to guess about the stack size of callers. Patch inspired by a patch from Oscar Salvador. In the future, there might be no need to iterate over nodes at all. mem->nid should tell us exactly what to remove. Memory block devices with mixed nodes (added during boot) should properly fenced off and never removed. Link: http://lkml.kernel.org/r/20190527111152.16324-11-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Wei Yang Reviewed-by: Oscar Salvador Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Alex Deucher Cc: "David S. Miller" Cc: Mark Brown Cc: Chris Wilson Cc: David Hildenbrand Cc: Jonathan Cameron Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Dan Williams Cc: Dave Hansen Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit d883abbc097e9e86ffba1342c6972ebdda06a127 Author: David Hildenbrand Date: Tue Jan 28 10:50:14 2020 +0100 mm/memory_hotplug: remove memory block devices before arch_remove_memory() commit 4c4b7f9ba9486c565aead99a198ceeef73ae81f6 upstream. Let's factor out removing of memory block devices, which is only necessary for memory added via add_memory() and friends that created memory block devices. Remove the devices before calling arch_remove_memory(). This finishes factoring out memory block device handling from arch_add_memory() and arch_remove_memory(). Link: http://lkml.kernel.org/r/20190527111152.16324-10-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Dan Williams Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: David Hildenbrand Cc: "mike.travis@hpe.com" Cc: Andrew Banman Cc: Ingo Molnar Cc: Alex Deucher Cc: "David S. Miller" Cc: Mark Brown Cc: Chris Wilson Cc: Oscar Salvador Cc: Jonathan Cameron Cc: Arun KS Cc: Mathieu Malaterre Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Rutland Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit aa49b6abcefc9baf8d254f75ed19c38b6866951f Author: David Hildenbrand Date: Tue Jan 28 10:50:13 2020 +0100 mm/memory_hotplug: create memory block devices after arch_add_memory() commit db051a0dac13db24d58470d75cee0ce7c6b031a1 upstream. Only memory to be added to the buddy and to be onlined/offlined by user space using /sys/devices/system/memory/... needs (and should have!) memory block devices. Factor out creation of memory block devices. Create all devices after arch_add_memory() succeeded. We can later drop the want_memblock parameter, because it is now effectively stale. Only after memory block devices have been added, memory can be onlined by user space. This implies, that memory is not visible to user space at all before arch_add_memory() succeeded. While at it - use WARN_ON_ONCE instead of BUG_ON in moved unregister_memory() - introduce find_memory_block_by_id() to search via block id - Use find_memory_block_by_id() in init_memory_block() to catch duplicates Link: http://lkml.kernel.org/r/20190527111152.16324-8-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: David Hildenbrand Cc: "mike.travis@hpe.com" Cc: Ingo Molnar Cc: Andrew Banman Cc: Oscar Salvador Cc: Qian Cai Cc: Wei Yang Cc: Arun KS Cc: Mathieu Malaterre Cc: Alex Deucher Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dan Williams Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 97c60869dbf351d7d9a7dbdeeebdcdd53418c706 Author: David Hildenbrand Date: Tue Jan 28 10:50:12 2020 +0100 drivers/base/memory: pass a block_id to init_memory_block() commit 1811582587c43bdf13d690d83345610d4df433bb upstream. We'll rework hotplug_memory_register() shortly, so it no longer consumes pass a section. [cai@lca.pw: fix a compilation warning] Link: http://lkml.kernel.org/r/1559320186-28337-1-git-send-email-cai@lca.pw Link: http://lkml.kernel.org/r/20190527111152.16324-6-david@redhat.com Signed-off-by: David Hildenbrand Signed-off-by: Qian Cai Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Alex Deucher Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dan Williams Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Oscar Salvador Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 000a1d59cfe9d6e875462ed72de32770322c282b Author: David Hildenbrand Date: Tue Jan 28 10:50:11 2020 +0100 mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE commit 80ec922dbd87fd38d15719c86a94457204648aeb upstream. -- snip -- Missing arm64 memory hot(un)plug support. -- snip -- We want to improve error handling while adding memory by allowing to use arch_remove_memory() and __remove_pages() even if CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like: arch_add_memory() rc = do_something(); if (rc) { arch_remove_memory(); } We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require quite some dependencies for memory offlining. Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Cc: Tony Luck Cc: Fenghua Yu Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Heiko Carstens Cc: Yoshinori Sato Cc: Rich Felker Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Michal Hocko Cc: David Hildenbrand Cc: Oscar Salvador Cc: "Kirill A. Shutemov" Cc: Alex Deucher Cc: "David S. Miller" Cc: Mark Brown Cc: Chris Wilson Cc: Christophe Leroy Cc: Nicholas Piggin Cc: Vasily Gorbik Cc: Rob Herring Cc: Masahiro Yamada Cc: "mike.travis@hpe.com" Cc: Andrew Banman Cc: Arun KS Cc: Qian Cai Cc: Mathieu Malaterre Cc: Baoquan He Cc: Logan Gunthorpe Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Catalin Marinas Cc: Chintan Pandya Cc: Dan Williams Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: Mark Rutland Cc: Mike Rapoport Cc: Oscar Salvador Cc: Robin Murphy Cc: Wei Yang Cc: Will Deacon Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 817edd2bb385aa4bc96f287081bac0d9c99bbf9a Author: David Hildenbrand Date: Tue Jan 28 10:50:10 2020 +0100 s390x/mm: implement arch_remove_memory() commit 18c86506c80f6b6b5e67d95bf0d6f7e665de5239 upstream. Will come in handy when wanting to handle errors after arch_add_memory(). Link: http://lkml.kernel.org/r/20190527111152.16324-4-david@redhat.com Signed-off-by: David Hildenbrand Cc: Heiko Carstens Cc: Michal Hocko Cc: Mike Rapoport Cc: David Hildenbrand Cc: Vasily Gorbik Cc: Oscar Salvador Cc: Alex Deucher Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dan Williams Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 5163b1ec3a0c3a2e1e53b7794b64866cd6ba8697 Author: David Hildenbrand Date: Tue Jan 28 10:50:09 2020 +0100 mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail commit ac5c94264580f498e484c854031d0226b3c1038f upstream. -- snip -- Minor conflict in arch/powerpc/mm/mem.c -- snip -- All callers of arch_remove_memory() ignore errors. And we should really try to remove any errors from the memory removal path. No more errors are reported from __remove_pages(). BUG() in s390x code in case arch_remove_memory() is triggered. We may implement that properly later. WARN in case powerpc code failed to remove the section mapping, which is better than ignoring the error completely right now. Link: http://lkml.kernel.org/r/20190409100148.24703-5-david@redhat.com Signed-off-by: David Hildenbrand Cc: Tony Luck Cc: Fenghua Yu Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Martin Schwidefsky Cc: Heiko Carstens Cc: Yoshinori Sato Cc: Rich Felker Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Michal Hocko Cc: Mike Rapoport Cc: Oscar Salvador Cc: "Kirill A. Shutemov" Cc: Christophe Leroy Cc: Stefan Agner Cc: Nicholas Piggin Cc: Pavel Tatashin Cc: Vasily Gorbik Cc: Arun KS Cc: Geert Uytterhoeven Cc: Masahiro Yamada Cc: Rob Herring Cc: Joonsoo Kim Cc: Wei Yang Cc: Qian Cai Cc: Mathieu Malaterre Cc: Andrew Banman Cc: Greg Kroah-Hartman Cc: Ingo Molnar Cc: Mike Travis Cc: Oscar Salvador Cc: "Rafael J. Wysocki" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 58ddf0b0eff2a6cb536082fc6b046f5eb51c240c Author: Aneesh Kumar K.V Date: Tue Jan 28 10:50:08 2020 +0100 powerpc/mm: Fix section mismatch warning commit 26ad26718dfaa7cf49d106d212ebf2370076c253 upstream. This patch fix the below section mismatch warnings. WARNING: vmlinux.o(.text+0x2d1f44): Section mismatch in reference from the function devm_memremap_pages_release() to the function .meminit.text:arch_remove_memory() WARNING: vmlinux.o(.text+0x2d265c): Section mismatch in reference from the function devm_memremap_pages() to the function .meminit.text:arch_add_memory() Signed-off-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit efaa8fb877a081d82c4dcd14e904bdb97f074eeb Author: David Hildenbrand Date: Tue Jan 28 10:50:07 2020 +0100 mm/memory_hotplug: make __remove_section() never fail commit 9d1d887d785b4fe0590bd3c5e71acaa3908044e2 upstream. Let's just warn in case a section is not valid instead of failing to remove somewhere in the middle of the process, returning an error that will be mostly ignored by callers. Link: http://lkml.kernel.org/r/20190409100148.24703-4-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Michal Hocko Cc: David Hildenbrand Cc: Pavel Tatashin Cc: Qian Cai Cc: Wei Yang Cc: Arun KS Cc: Mathieu Malaterre Cc: Andrew Banman Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Greg Kroah-Hartman Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Ingo Molnar Cc: Joonsoo Kim Cc: "Kirill A. Shutemov" Cc: Martin Schwidefsky Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Mike Travis Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Stefan Agner Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Yoshinori Sato Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 36976713c4d781e42b27eddd16d91ce023caf669 Author: David Hildenbrand Date: Tue Jan 28 10:50:06 2020 +0100 mm/memory_hotplug: make unregister_memory_section() never fail commit cb7b3a3685b20d3b5900ff24b2cb96d002960189 upstream. Failing while removing memory is mostly ignored and cannot really be handled. Let's treat errors in unregister_memory_section() in a nice way, warning, but continuing. Link: http://lkml.kernel.org/r/20190409100148.24703-3-david@redhat.com Signed-off-by: David Hildenbrand Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Ingo Molnar Cc: Andrew Banman Cc: Mike Travis Cc: David Hildenbrand Cc: Oscar Salvador Cc: Michal Hocko Cc: Pavel Tatashin Cc: Qian Cai Cc: Wei Yang Cc: Arun KS Cc: Mathieu Malaterre Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joonsoo Kim Cc: "Kirill A. Shutemov" Cc: Martin Schwidefsky Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Rich Felker Cc: Rob Herring Cc: Stefan Agner Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Yoshinori Sato Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 8893b51a89600e6a8d7d79f397b6427edb508f7b Author: Dan Carpenter Date: Tue Jan 28 10:50:05 2020 +0100 mm, memory_hotplug: update a comment in unregister_memory() commit 16df1456aa858a86f398dbc7d27649eb6662b0cc upstream. The remove_memory_block() function was renamed to in commit cc292b0b4302 ("drivers/base/memory.c: rename remove_memory_block() to remove_memory_section()"). Signed-off-by: Dan Carpenter Signed-off-by: Greg Kroah-Hartman Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 9e59baa2da6d025851b130e9f294cb3605395a2e Author: Baoquan He Date: Tue Jan 28 10:50:04 2020 +0100 drivers/base/memory.c: clean up relics in function parameters commit 063b8a4cee8088224bcdb79bcd08db98df16178e upstream. The input parameter 'phys_index' of memory_block_action() is actually the section number, but not the phys_index of memory_block. This is a relic from the past when one memory block could only contain one section. Rename it to start_section_nr. And also in remove_memory_section(), the 'node_id' and 'phys_device' arguments are not used by anyone. Remove them. Link: http://lkml.kernel.org/r/20190329144250.14315-2-bhe@redhat.com Signed-off-by: Baoquan He Acked-by: Michal Hocko Reviewed-by: Rafael J. Wysocki Reviewed-by: Mukesh Ojha Reviewed-by: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 2ad264f6887334550a8137c04d955992391e3e4c Author: David Hildenbrand Date: Tue Jan 28 10:50:03 2020 +0100 mm/memory_hotplug: release memory resource after arch_remove_memory() commit d9eb1417c77df7ce19abd2e41619e9dceccbdf2a upstream. Patch series "mm/memory_hotplug: Better error handling when removing memory", v1. Error handling when removing memory is somewhat messed up right now. Some errors result in warnings, others are completely ignored. Memory unplug code can essentially not deal with errors properly as of now. remove_memory() will never fail. We have basically two choices: 1. Allow arch_remov_memory() and friends to fail, propagating errors via remove_memory(). Might be problematic (e.g. DIMMs consisting of multiple pieces added/removed separately). 2. Don't allow the functions to fail, handling errors in a nicer way. It seems like most errors that can theoretically happen are really corner cases and mostly theoretical (e.g. "section not valid"). However e.g. aborting removal of sections while all callers simply continue in case of errors is not nice. If we can gurantee that removal of memory always works (and WARN/skip in case of theoretical errors so we can figure out what is going on), we can go ahead and implement better error handling when adding memory. E.g. via add_memory(): arch_add_memory() ret = do_stuff() if (ret) { arch_remove_memory(); goto error; } Handling here that arch_remove_memory() might fail is basically impossible. So I suggest, let's avoid reporting errors while removing memory, warning on theoretical errors instead and continuing instead of aborting. This patch (of 4): __add_pages() doesn't add the memory resource, so __remove_pages() shouldn't remove it. Let's factor it out. Especially as it is a special case for memory used as system memory, added via add_memory() and friends. We now remove the resource after removing the sections instead of doing it the other way around. I don't think this change is problematic. add_memory() register memory resource arch_add_memory() remove_memory arch_remove_memory() release memory resource While at it, explain why we ignore errors and that it only happeny if we remove memory in a different granularity as we added it. [david@redhat.com: fix printk warning] Link: http://lkml.kernel.org/r/20190417120204.6997-1-david@redhat.com Link: http://lkml.kernel.org/r/20190409100148.24703-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Michal Hocko Cc: David Hildenbrand Cc: Pavel Tatashin Cc: Wei Yang Cc: Qian Cai Cc: Arun KS Cc: Mathieu Malaterre Cc: Andrew Banman Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Greg Kroah-Hartman Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Ingo Molnar Cc: Joonsoo Kim Cc: "Kirill A. Shutemov" Cc: Martin Schwidefsky Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Mike Travis Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Stefan Agner Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Yoshinori Sato Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 5c1f8f5358e8cd501245ee0e954dc0c0b231d6a2 Author: Oscar Salvador Date: Tue Jan 28 10:50:02 2020 +0100 mm, memory_hotplug: add nid parameter to arch_remove_memory commit 2c2a5af6fed20cf74401c9d64319c76c5ff81309 upstream. -- snip -- Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- Patch series "Do not touch pages in hot-remove path", v2. This patchset aims for two things: 1) A better definition about offline and hot-remove stage 2) Solving bugs where we can access non-initialized pages during hot-remove operations [2] [3]. This is achieved by moving all page/zone handling to the offline stage, so we do not need to access pages when hot-removing memory. [1] https://patchwork.kernel.org/cover/10691415/ [2] https://patchwork.kernel.org/patch/10547445/ [3] https://www.spinics.net/lists/linux-mm/msg161316.html This patch (of 5): This is a preparation for the following-up patches. The idea of passing the nid is that it will allow us to get rid of the zone parameter afterwards. Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de Signed-off-by: Oscar Salvador Reviewed-by: David Hildenbrand Reviewed-by: Pavel Tatashin Cc: Michal Hocko Cc: Dan Williams Cc: Jerome Glisse Cc: Jonathan Cameron Cc: "Rafael J. Wysocki" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 4149c8693a8c23947cb60d7c1d177e9d93e6481e Author: Wei Yang Date: Tue Jan 28 10:50:01 2020 +0100 drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS commit 3b6fd6ffb27c2efa003c6d4d15ca72c054b71d7c upstream. In cb5e39b8038b ("drivers: base: refactor add_memory_section() to add_memory_block()"), add_memory_block() is introduced, which is only invoked in memory_dev_init(). When combining these two loops in memory_dev_init() and add_memory_block(), they looks like this: for (i = 0; i < NR_MEM_SECTIONS; i += sections_per_block) for (j = i; (j < i + sections_per_block) && j < NR_MEM_SECTIONS; j++) Since it is sure the (i < NR_MEM_SECTIONS) and j sits in its own memory block, the check of (j < NR_MEM_SECTIONS) is not necessary. This patch just removes this check. Link: http://lkml.kernel.org/r/20181123222811.18216-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang Reviewed-by: Andrew Morton Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Seth Jennings Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit aa2e8b68f24462ceb1c495cbde642a174c80fe55 Author: Wei Yang Date: Tue Jan 28 10:50:00 2020 +0100 mm, sparse: pass nid instead of pgdat to sparse_add_one_section() commit 4e0d2e7ef14d9e1c900dac909db45263822b824f upstream. Since the information needed in sparse_add_one_section() is node id to allocate proper memory, it is not necessary to pass its pgdat. This patch changes the prototype of sparse_add_one_section() to pass node id directly. This is intended to reduce misleading that sparse_add_one_section() would touch pgdat. Link: http://lkml.kernel.org/r/20181204085657.20472-2-richard.weiyang@gmail.com Signed-off-by: Wei Yang Reviewed-by: David Hildenbrand Acked-by: Michal Hocko Cc: Dave Hansen Cc: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit b1dbaa19162857eeb8cd751e58394d1c8c34b12e Author: Wei Yang Date: Tue Jan 28 10:49:59 2020 +0100 mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() commit 83af658898cb292a32d8b6cd9b51266d7cfc4b6a upstream. pgdat_resize_lock is used to protect pgdat's memory region information like: node_start_pfn, node_present_pages, etc. While in function sparse_add/remove_one_section(), pgdat_resize_lock is used to protect initialization/release of one mem_section. This looks not proper. These code paths are currently protected by mem_hotplug_lock currently but should there ever be any reason for locking at the sparse layer a dedicated lock should be introduced. Following is the current call trace of sparse_add/remove_one_section() mem_hotplug_begin() arch_add_memory() add_pages() __add_pages() __add_section() sparse_add_one_section() mem_hotplug_done() mem_hotplug_begin() arch_remove_memory() __remove_pages() __remove_section() sparse_remove_one_section() mem_hotplug_done() The comment above the pgdat_resize_lock also mentions "Holding this will also guarantee that any pfn_valid() stays that way.", which is true with the current implementation and false after this patch. But current implementation doesn't meet this comment. There isn't any pfn walkers to take the lock so this looks like a relict from the past. This patch also removes this comment. [richard.weiyang@gmail.com: v4] Link: http://lkml.kernel.org/r/20181204085657.20472-1-richard.weiyang@gmail.com [mhocko@suse.com: changelog suggestion] Link: http://lkml.kernel.org/r/20181128091243.19249-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang Reviewed-by: David Hildenbrand Acked-by: Michal Hocko Cc: Dave Hansen Cc: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit a3cf10bf73fdddbf0e9f0ceb1e7a3bb487dfc3fa Author: David Hildenbrand Date: Tue Jan 28 10:49:58 2020 +0100 mm/memory_hotplug: make remove_memory() take the device_hotplug_lock commit d15e59260f62bd5e0f625cf5f5240f6ffac78ab6 upstream. Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3. Reading through the code and studying how mem_hotplug_lock is to be used, I noticed that there are two places where we can end up calling device_online()/device_offline() - online_pages()/offline_pages() without the mem_hotplug_lock. And there are other places where we call device_online()/device_offline() without the device_hotplug_lock. While e.g. echo "online" > /sys/devices/system/memory/memory9/state is fine, e.g. echo 1 > /sys/devices/system/memory/memory9/online Will not take the mem_hotplug_lock. However the device_lock() and device_hotplug_lock. E.g. via memory_probe_store(), we can end up calling add_memory()->online_pages() without the device_hotplug_lock. So we can have concurrent callers in online_pages(). We e.g. touch in online_pages() basically unprotected zone->present_pages then. Looks like there is a longer history to that (see Patch #2 for details), and fixing it to work the way it was intended is not really possible. We would e.g. have to take the mem_hotplug_lock in device/base/core.c, which sounds wrong. Summary: We had a lock inversion on mem_hotplug_lock and device_lock(). More details can be found in patch 3 and patch 6. I propose the general rules (documentation added in patch 6): 1. add_memory/add_memory_resource() must only be called with device_hotplug_lock. 2. remove_memory() must only be called with device_hotplug_lock. This is already documented and holds for all callers. 3. device_online()/device_offline() must only be called with device_hotplug_lock. This is already documented and true for now in core code. Other callers (related to memory hotplug) have to be fixed up. 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/ online_pages/offline_pages. To me, this looks way cleaner than what we have right now (and easier to verify). And looking at the documentation of remove_memory, using lock_device_hotplug also for add_memory() feels natural. This patch (of 6): remove_memory() is exported right now but requires the device_hotplug_lock, which is not exported. So let's provide a variant that takes the lock and only export that one. The lock is already held in arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c arch/powerpc/platforms/powernv/memtrace.c Apart from that, there are not other users in the tree. Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Reviewed-by: Rafael J. Wysocki Reviewed-by: Rashmica Gupta Reviewed-by: Oscar Salvador Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: Rashmica Gupta Cc: Michael Neuling Cc: Balbir Singh Cc: Nathan Fontenot Cc: John Allen Cc: Michal Hocko Cc: Dan Williams Cc: Joonsoo Kim Cc: Vlastimil Babka Cc: Greg Kroah-Hartman Cc: YASUAKI ISHIMATSU Cc: Mathieu Malaterre Cc: Boris Ostrovsky Cc: Haiyang Zhang Cc: Heiko Carstens Cc: Jonathan Corbet Cc: Juergen Gross Cc: Kate Stewart Cc: "K. Y. Srinivasan" Cc: Martin Schwidefsky Cc: Philippe Ombredanne Cc: Stephen Hemminger Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand Signed-off-by: Greg Kroah-Hartman commit 868f9e509e8f774681e387a0f39f850c06560c7e Author: Martin Schiller Date: Thu Jan 9 07:31:14 2020 +0100 net/x25: fix nonblocking connect commit e21dba7a4df4d93da237da65a096084b4f2e87b4 upstream. This patch fixes 2 issues in x25_connect(): 1. It makes absolutely no sense to reset the neighbour and the connection state after a (successful) nonblocking call of x25_connect. This prevents any connection from being established, since the response (call accept) cannot be processed. 2. Any further calls to x25_connect() while a call is pending should simply return, instead of creating new Call Request (on different logical channels). This patch should also fix the "KASAN: null-ptr-deref Write in x25_connect" and "BUG: unable to handle kernel NULL pointer dereference in x25_connect" bugs reported by syzbot. Signed-off-by: Martin Schiller Reported-by: syzbot+429c200ffc8772bfe070@syzkaller.appspotmail.com Reported-by: syzbot+eec0c87f31a7c3b66f7b@syzkaller.appspotmail.com Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1f7a1bcd27c388b4cc286e943218c69db2d3ba71 Author: Pablo Neira Ayuso Date: Tue Jan 21 16:07:00 2020 +0100 netfilter: nf_tables: add __nft_chain_type_get() commit 826035498ec14b77b62a44f0cb6b94d45530db6f upstream. This new helper function validates that unknown family and chain type coming from userspace do not trigger an out-of-bound array access. Bail out in case __nft_chain_type_get() returns NULL from nft_chain_parse_hook(). Fixes: 9370761c56b6 ("netfilter: nf_tables: convert built-in tables/chains to chain types") Reported-by: syzbot+156a04714799b1d480bc@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 5b0d87620bbe943e12c802255b76f4356de1093b Author: Kadlecsik József Date: Sun Jan 19 22:06:49 2020 +0100 netfilter: ipset: use bitmap infrastructure completely commit 32c72165dbd0e246e69d16a3ad348a4851afd415 upstream. The bitmap allocation did not use full unsigned long sizes when calculating the required size and that was triggered by KASAN as slab-out-of-bounds read in several places. The patch fixes all of them. Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit a76e62517465d984f672b9e85e7d96e02a451e6b Author: Bo Wu Date: Wed Nov 20 13:26:17 2019 +0000 scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func commit bba340c79bfe3644829db5c852fdfa9e33837d6d upstream. In iscsi_if_rx func, after receiving one request through iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to reply to the request in a do-while loop. If the iscsi_if_send_reply function keeps returning -EAGAIN, a deadlock will occur. For example, a client only send msg without calling recvmsg func, then it will result in the watchdog soft lockup. The details are given as follows: sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI); retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr); while (1) { state_msg = sendmsg(sock_fd, &msg, 0); //Note: recvmsg(sock_fd, &msg, 0) is not processed here. } close(sock_fd); watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat: curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0 deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq: TIMER: 992 SCHED: 8 Sample irqstat: irq 2: delta 1003, curr: 3103802, arch_timer CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G OE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 40400005 (nZcv daif +PAN -UAO) pc : __alloc_skb+0x104/0x1b0 lr : __alloc_skb+0x9c/0x1b0 sp : ffff000033603a30 x29: ffff000033603a30 x28: 00000000000002dd x27: ffff800b34ced810 x26: ffff800ba7569f00 x25: 00000000ffffffff x24: 0000000000000000 x23: ffff800f7c43f600 x22: 0000000000480020 x21: ffff0000091d9000 x20: ffff800b34eff200 x19: ffff800ba7569f00 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0001000101000100 x13: 0000000101010000 x12: 0101000001010100 x11: 0001010101010001 x10: 00000000000002dd x9 : ffff000033603d58 x8 : ffff800b34eff400 x7 : ffff800ba7569200 x6 : ffff800b34eff400 x5 : 0000000000000000 x4 : 00000000ffffffff x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace: __alloc_skb+0x104/0x1b0 iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi] netlink_unicast+0x1e0/0x258 netlink_sendmsg+0x310/0x378 sock_sendmsg+0x4c/0x70 sock_write_iter+0x90/0xf0 __vfs_write+0x11c/0x190 vfs_write+0xac/0x1c0 ksys_write+0x6c/0xd8 __arm64_sys_write+0x24/0x30 el0_svc_common+0x78/0x130 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com Signed-off-by: Bo Wu Reviewed-by: Zhiqiang Liu Reviewed-by: Lee Duncan Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit f008896751c611aae2db5e0607ae59218462d553 Author: Hans Verkuil Date: Sun Nov 10 07:27:04 2019 +0100 media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT commit ee8951e56c0f960b9621636603a822811cef3158 upstream. v4l2_vbi_format, v4l2_sliced_vbi_format and v4l2_sdr_format have a reserved array at the end that should be zeroed by drivers as per the V4L2 spec. Older drivers often do not do this, so just handle this in the core. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit cbd56515be5a8ea97134ef762b7a2923b94cb9c4 Author: Wen Huang Date: Thu Nov 28 18:51:04 2019 +0800 libertas: Fix two buffer overflows at parsing bss descriptor commit e5e884b42639c74b5b57dc277909915c0aefc8bb upstream. add_ie_rates() copys rates without checking the length in bss descriptor from remote AP.when victim connects to remote attacker, this may trigger buffer overflow. lbs_ibss_join_existing() copys rates without checking the length in bss descriptor from remote IBSS node.when victim connects to remote attacker, this may trigger buffer overflow. Fix them by putting the length check before performing copy. This fix addresses CVE-2019-14896 and CVE-2019-14897. This also fix build warning of mixed declarations and code. Reported-by: kbuild test robot Signed-off-by: Wen Huang Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit cb75ab69193287893f4a2d55e29923b67de27d71 Author: Suzuki K Poulose Date: Thu Jun 20 16:12:35 2019 -0600 coresight: tmc-etf: Do not call smp_processor_id from preemptible commit 024c1fd9dbcc1d8a847f1311f999d35783921b7f upstream. During a perf session we try to allocate buffers on the "node" associated with the CPU the event is bound to. If it is not bound to a CPU, we use the current CPU node, using smp_processor_id(). However this is unsafe in a pre-emptible context and could generate the splats as below : BUG: using smp_processor_id() in preemptible [00000000] code: perf/2544 caller is tmc_alloc_etf_buffer+0x5c/0x60 CPU: 2 PID: 2544 Comm: perf Not tainted 5.1.0-rc6-147786-g116841e #344 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb 1 2019 Call trace: dump_backtrace+0x0/0x150 show_stack+0x14/0x20 dump_stack+0x9c/0xc4 debug_smp_processor_id+0x10c/0x110 tmc_alloc_etf_buffer+0x5c/0x60 etm_setup_aux+0x1c4/0x230 rb_alloc_aux+0x1b8/0x2b8 perf_mmap+0x35c/0x478 mmap_region+0x34c/0x4f0 do_mmap+0x2d8/0x418 vm_mmap_pgoff+0xd0/0xf8 ksys_mmap_pgoff+0x88/0xf8 __arm64_sys_mmap+0x28/0x38 el0_svc_handler+0xd8/0x138 el0_svc+0x8/0xc Use NUMA_NO_NODE hint instead of using the current node for events not bound to CPUs. Fixes: 2e499bbc1a929ac ("coresight: tmc: implementing TMC-ETF AUX space API") Cc: Mathieu Poirier Signed-off-by: Suzuki K Poulose Cc: stable # 4.7+ Signed-off-by: Mathieu Poirier Link: https://lore.kernel.org/r/20190620221237.3536-4-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman commit 63906caff4a8a24e6c5a55c1b81ec0d308a152fd Author: Suzuki K Poulose Date: Thu Jun 20 16:12:36 2019 -0600 coresight: etb10: Do not call smp_processor_id from preemptible commit 730766bae3280a25d40ea76a53dc6342e84e6513 upstream. During a perf session we try to allocate buffers on the "node" associated with the CPU the event is bound to. If it is not bound to a CPU, we use the current CPU node, using smp_processor_id(). However this is unsafe in a pre-emptible context and could generate the splats as below : BUG: using smp_processor_id() in preemptible [00000000] code: perf/2544 Use NUMA_NO_NODE hint instead of using the current node for events not bound to CPUs. Fixes: 2997aa4063d97fdb39 ("coresight: etb10: implementing AUX API") Cc: Mathieu Poirier Signed-off-by: Suzuki K Poulose Cc: stable # 4.6+ Signed-off-by: Mathieu Poirier Link: https://lore.kernel.org/r/20190620221237.3536-5-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman commit 03e520dcdc0ab793074503bc1422ccc948aad254 Author: Ard Biesheuvel Date: Sat Oct 5 11:11:10 2019 +0200 crypto: geode-aes - switch to skcipher for cbc(aes) fallback commit 504582e8e40b90b8f8c58783e2d1e4f6a2b71a3a upstream. Commit 79c65d179a40e145 ("crypto: cbc - Convert to skcipher") updated the generic CBC template wrapper from a blkcipher to a skcipher algo, to get away from the deprecated blkcipher interface. However, as a side effect, drivers that instantiate CBC transforms using the blkcipher as a fallback no longer work, since skciphers can wrap blkciphers but not the other way around. This broke the geode-aes driver. So let's fix it by moving to the sync skcipher interface when allocating the fallback. At the same time, align with the generic API for ECB and CBC by rejecting inputs that are not a multiple of the AES block size. Fixes: 79c65d179a40e145 ("crypto: cbc - Convert to skcipher") Cc: # v4.20+ ONLY Signed-off-by: Ard Biesheuvel Signed-off-by: Florian Bezdeka Signed-off-by: Herbert Xu Signed-off-by: Florian Bezdeka Signed-off-by: Greg Kroah-Hartman commit 8d9aa36cc7acdd106225d323da17e6015a6e4d2f Author: Masato Suzuki Date: Mon Jan 27 14:07:46 2020 +0900 sd: Fix REQ_OP_ZONE_REPORT completion handling ZBC/ZAC report zones command may return less bytes than requested if the number of matching zones for the report request is small. However, unlike read or write commands, the remainder of incomplete report zones commands cannot be automatically requested by the block layer: the start sector of the next report cannot be known, and the report reply may not be 512B aligned for SAS drives (a report zone reply size is always a multiple of 64B). The regular request completion code executing bio_advance() and restart of the command remainder part currently causes invalid zone descriptor data to be reported to the caller if the report zone size is smaller than 512B (a case that can happen easily for a report of the last zones of a SAS drive for example). Since blkdev_report_zones() handles report zone command processing in a loop until completion (no more zones are being reported), we can safely avoid that the block layer performs an incorrect bio_advance() call and restart of the remainder of incomplete report zone BIOs. To do so, always indicate a full completion of REQ_OP_ZONE_REPORT by setting good_bytes to the request buffer size and by setting the command resid to 0. This does not affect the post processing of the report zone reply done by sd_zbc_complete() since the reply header indicates the number of zones reported. Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Cc: # 4.19 Cc: # 4.14 Signed-off-by: Masato Suzuki Reviewed-by: Damien Le Moal Acked-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit ce28d664054df01997baace61d1defca77689798 Author: Steven Rostedt (VMware) Date: Mon Jan 20 13:07:31 2020 -0500 tracing: Fix histogram code when expression has same var as value commit 8bcebc77e85f3d7536f96845a0fe94b1dddb6af0 upstream. While working on a tool to convert SQL syntex into the histogram language of the kernel, I discovered the following bug: # echo 'first u64 start_time u64 end_time pid_t pid u64 delta' >> synthetic_events # echo 'hist:keys=pid:start=common_timestamp' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp-$start,start2=$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger Would not display any histograms in the sched_switch histogram side. But if I were to swap the location of "delta=common_timestamp-$start" with "start2=$start" Such that the last line had: # echo 'hist:keys=next_pid:start2=$start,delta=common_timestamp-$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger The histogram works as expected. What I found out is that the expressions clear out the value once it is resolved. As the variables are resolved in the order listed, when processing: delta=common_timestamp-$start The $start is cleared. When it gets to "start2=$start", it errors out with "unresolved symbol" (which is silent as this happens at the location of the trace), and the histogram is dropped. When processing the histogram for variable references, instead of adding a new reference for a variable used twice, use the same reference. That way, not only is it more efficient, but the order will no longer matter in processing of the variables. From Tom Zanussi: "Just to clarify some more about what the problem was is that without your patch, we would have two separate references to the same variable, and during resolve_var_refs(), they'd both want to be resolved separately, so in this case, since the first reference to start wasn't part of an expression, it wouldn't get the read-once flag set, so would be read normally, and then the second reference would do the read-once read and also be read but using read-once. So everything worked and you didn't see a problem: from: start2=$start,delta=common_timestamp-$start In the second case, when you switched them around, the first reference would be resolved by doing the read-once, and following that the second reference would try to resolve and see that the variable had already been read, so failed as unset, which caused it to short-circuit out and not do the trigger action to generate the synthetic event: to: delta=common_timestamp-$start,start2=$start With your patch, we only have the single resolution which happens correctly the one time it's resolved, so this can't happen." Link: https://lore.kernel.org/r/20200116154216.58ca08eb@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reviewed-by: Tom Zanuss Tested-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit cbb042fd8794b7db5f8bfb978c761b6b9fb4c4f5 Author: Tom Zanussi Date: Tue Dec 18 14:33:23 2018 -0600 tracing: Remove open-coding of hist trigger var_ref management commit de40f033d4e84e843d6a12266e3869015ea9097c upstream. Have create_var_ref() manage the hist trigger's var_ref list, rather than having similar code doing it in multiple places. This cleans up the code and makes sure var_refs are always accounted properly. Also, document the var_ref-related functions to make what their purpose clearer. Link: http://lkml.kernel.org/r/05ddae93ff514e66fc03897d6665231892939913.1545161087.git.tom.zanussi@linux.intel.com Acked-by: Namhyung Kim Reviewed-by: Masami Hiramatsu Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 836717841a30d7845cdf9cb806ee0ff3bfa7135a Author: Tom Zanussi Date: Tue Dec 18 14:33:24 2018 -0600 tracing: Use hist trigger's var_ref array to destroy var_refs commit 656fe2ba85e81d00e4447bf77b8da2be3c47acb2 upstream. Since every var ref for a trigger has an entry in the var_ref[] array, use that to destroy the var_refs, instead of piecemeal via the field expressions. This allows us to avoid having to keep and treat differently separate lists for the action-related references, which future patches will remove. Link: http://lkml.kernel.org/r/fad1a164f0e257c158e70d6eadbf6c586e04b2a2.1545161087.git.tom.zanussi@linux.intel.com Acked-by: Namhyung Kim Reviewed-by: Masami Hiramatsu Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 90042a53980b1453ba7ec78218a743374b59c6eb Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Prevent tx watchdog timeout commit 686f85d71d095f1d26b807e23b0f0bfd22042c45 upstream. Section 5.5.3.2 of the datasheet says, If FIFO Underrun, Byte Count Mismatch, Excessive Collision, or Excessive Deferral (if enabled) errors occur, transmission ceases. In this situation, the chip asserts a TXER interrupt rather than TXDN. But the handler for the TXDN is the only way that the transmit queue gets restarted. Hence, an aborted transmission can result in a watchdog timeout. This problem can be reproduced on congested link, as that can result in excessive transmitter collisions. Another way to reproduce this is with a FIFO Underrun, which may be caused by DMA latency. In event of a TXER interrupt, prevent a watchdog timeout by restarting transmission. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 85d1250227b0ae48a5111e48a6c1769402992f9b Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Fix CAM initialization commit 772f66421d5aa0b9f256056f513bbc38ac132271 upstream. Section 4.3.1 of the datasheet says, This bit [TXP] must not be set if a Load CAM operation is in progress (LCAM is set). The SONIC will lock up if both bits are set simultaneously. Testing has shown that the driver sometimes attempts to set LCAM while TXP is set. Avoid this by waiting for command completion before and after giving the LCAM command. After issuing the Load CAM command, poll for !SONIC_CR_LCAM rather than SONIC_INT_LCD, because the SONIC_CR_TXP bit can't be used until !SONIC_CR_LCAM. When in reset mode, take the opportunity to reset the CAM Enable register. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6382bb92be25292c84de7d470457ce3728657a71 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Fix command register usage commit 27e0c31c5f27c1d1a1d9d135c123069f60dcf97b upstream. There are several issues relating to command register usage during chip initialization. Firstly, the SONIC sometimes comes out of software reset with the Start Timer bit set. This gets logged as, macsonic macsonic eth0: sonic_init: status=24, i=101 Avoid this by giving the Stop Timer command earlier than later. Secondly, the loop that waits for the Read RRA command to complete has the break condition inverted. That's why the for loop iterates until its termination condition. Call the helper for this instead. Finally, give the Receiver Enable command after clearing interrupts, not before, to avoid the possibility of losing an interrupt. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fc590dcb62e64cc07d654881c1b23358449a0050 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Quiesce SONIC before re-initializing descriptor memory commit 3f4b7e6a2be982fd8820a2b54d46dd9c351db899 upstream. Make sure the SONIC's DMA engine is idle before altering the transmit and receive descriptors. Add a helper for this as it will be needed again. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7d8c24e07569503ef0f853292b7546cda60a85f6 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Fix receive buffer replenishment commit 89ba879e95582d3bba55081e45b5409e883312ca upstream. As soon as the driver is finished with a receive buffer it allocs a new one and overwrites the corresponding RRA entry with a new buffer pointer. Problem is, the buffer pointer is split across two word-sized registers. It can't be updated in one atomic store. So this operation races with the chip while it stores received packets and advances its RRP register. This could result in memory corruption by a DMA write. Avoid this problem by adding buffers only at the location given by the RWP register, in accordance with the National Semiconductor datasheet. Re-factor this code into separate functions to calculate a RRA pointer and to update the RWP. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1de65d2cadb9bf6c126a69bce017b3c1162c1689 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Improve receive descriptor status flag check commit 94b166349503957079ef5e7d6f667f157aea014a upstream. After sonic_tx_timeout() calls sonic_init(), it can happen that sonic_rx() will subsequently encounter a receive descriptor with no flags set. Remove the comment that says that this can't happen. When giving a receive descriptor to the SONIC, clear the descriptor status field. That way, any rx descriptor with flags set can only be a newly received packet. Don't process a descriptor without the LPKT bit set. The buffer is still in use by the SONIC. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6f1355914bfb70d7093ca1bbb6730e749ce938e0 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Avoid needless receive descriptor EOL flag updates commit eaabfd19b2c787bbe88dc32424b9a43d67293422 upstream. The while loop in sonic_rx() traverses the rx descriptor ring. It stops when it reaches a descriptor that the SONIC has not used. Each iteration advances the EOL flag so the SONIC can keep using more descriptors. Therefore, the while loop has no definite termination condition. The algorithm described in the National Semiconductor literature is quite different. It consumes descriptors up to the one with its EOL flag set (which will also have its "in use" flag set). All freed descriptors are then returned to the ring at once, by adjusting the EOL flags (and link pointers). Adopt the algorithm from datasheet as it's simpler, terminates quickly and avoids a lot of pointless descriptor EOL flag changes. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 75f91ec93567ac7000c3df1a2b07b39add50a4e2 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Fix receive buffer handling commit 9e311820f67e740f4fb8dcb82b4c4b5b05bdd1a5 upstream. The SONIC can sometimes advance its rx buffer pointer (RRP register) without advancing its rx descriptor pointer (CRDA register). As a result the index of the current rx descriptor may not equal that of the current rx buffer. The driver mistakenly assumes that they are always equal. This assumption leads to incorrect packet lengths and possible packet duplication. Avoid this by calling a new function to locate the buffer corresponding to a given descriptor. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 04b5473a21a4dc68690ea4e341f26cb8053e8a3a Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Fix interface error stats collection commit 427db97df1ee721c20bdc9a66db8a9e1da719855 upstream. The tx_aborted_errors statistic should count packets flagged with EXD, EXC, FU, or BCM bits because those bits denote an aborted transmission. That corresponds to the bitmask 0x0446, not 0x0642. Use macros for these constants to avoid mistakes. Better to leave out FIFO Underruns (FU) as there's a separate counter for that purpose. Don't lump all these errors in with the general tx_errors counter as that's used for tx timeout events. On the rx side, don't count RDE and RBAE interrupts as dropped packets. These interrupts don't indicate a lost packet, just a lack of resources. When a lack of resources results in a lost packet, this gets reported in the rx_missed_errors counter (along with RFO events). Don't double-count rx_frame_errors and rx_crc_errors. Don't use the general rx_errors counter for events that already have special counters. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5205e9b20840a3f58a095b03503bba82dac56c45 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Use MMIO accessors commit e3885f576196ddfc670b3d53e745de96ffcb49ab upstream. The driver accesses descriptor memory which is simultaneously accessed by the chip, so the compiler must not be allowed to re-order CPU accesses. sonic_buf_get() used 'volatile' to prevent that. sonic_buf_put() should have done so too but was overlooked. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b9ef3fe67d108b329e448d9dc8d23272a7495c4f Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Clear interrupt flags immediately commit 5fedabf5a70be26b19d7520f09f12a62274317c6 upstream. The chip can change a packet's descriptor status flags at any time. However, an active interrupt flag gets cleared rather late. This allows a race condition that could theoretically lose an interrupt. Fix this by clearing asserted interrupt flags immediately. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 655fb220909222cc7176da6e048c69c1d6c27350 Author: Finn Thain Date: Thu Jan 23 09:07:26 2020 +1100 net/sonic: Add mutual exclusion for accessing shared state commit 865ad2f2201dc18685ba2686f13217f8b3a9c52c upstream. The netif_stop_queue() call in sonic_send_packet() races with the netif_wake_queue() call in sonic_interrupt(). This causes issues like "NETDEV WATCHDOG: eth0 (macsonic): transmit queue 0 timed out". Fix this by disabling interrupts when accessing tx_skb[] and next_tx. Update a comment to clarify the synchronization properties. Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Tested-by: Stan Johnson Signed-off-by: Finn Thain Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 752f72edea55f9b7c6fd019e71365def13a0f2b6 Author: Al Viro Date: Sun Jan 26 09:29:34 2020 -0500 do_last(): fetch directory ->i_mode and ->i_uid before it's too late commit d0cb50185ae942b03c4327be322055d622dc79f6 upstream. may_create_in_sticky() call is done when we already have dropped the reference to dir. Fixes: 30aba6656f61e (namei: allow restricted O_CREAT of FIFOs and regular files) Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 05f010d2ff4ba3969337cf0495b4849f184e3cd4 Author: Changbin Du Date: Sun Jan 12 11:42:31 2020 +0800 tracing: xen: Ordered comparison of function pointers commit d0695e2351102affd8efae83989056bc4b275917 upstream. Just as commit 0566e40ce7 ("tracing: initcall: Ordered comparison of function pointers"), this patch fixes another remaining one in xen.h found by clang-9. In file included from arch/x86/xen/trace.c:21: In file included from ./include/trace/events/xen.h:475: In file included from ./include/trace/define_trace.h:102: In file included from ./include/trace/trace_events.h:473: ./include/trace/events/xen.h:69:7: warning: ordered comparison of function \ pointers ('xen_mc_callback_fn_t' (aka 'void (*)(void *)') and 'xen_mc_callback_fn_t') [-Wordered-compare-function-pointers] __field(xen_mc_callback_fn_t, fn) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/trace/trace_events.h:421:29: note: expanded from macro '__field' ^ ./include/trace/trace_events.h:407:6: note: expanded from macro '__field_ext' is_signed_type(type), filter_type); \ ^ ./include/linux/trace_events.h:554:44: note: expanded from macro 'is_signed_type' ^ Fixes: c796f213a6934 ("xen/trace: add multicall tracing") Signed-off-by: Changbin Du Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 5ce5ebfa007baa2fc22c889bb813e3eeea03c6bc Author: Bart Van Assche Date: Wed Jan 15 20:47:37 2020 -0800 scsi: RDMA/isert: Fix a recently introduced regression related to logout commit 04060db41178c7c244f2c7dcd913e7fd331de915 upstream. iscsit_close_connection() calls isert_wait_conn(). Due to commit e9d3009cb936 both functions call target_wait_for_sess_cmds() although that last function should be called only once. Fix this by removing the target_wait_for_sess_cmds() call from isert_wait_conn() and by only calling isert_wait_conn() after target_wait_for_sess_cmds(). Fixes: e9d3009cb936 ("scsi: target: iscsi: Wait for all commands to finish before freeing a session"). Link: https://lore.kernel.org/r/20200116044737.19507-1-bvanassche@acm.org Reported-by: Rahul Kundu Signed-off-by: Bart Van Assche Tested-by: Mike Marciniszyn Acked-by: Sagi Grimberg Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 7bf1558012e06f54bb65b084424255c68f0f5e41 Author: Gilles Buloz Date: Wed Nov 27 18:09:34 2019 +0100 hwmon: (nct7802) Fix voltage limits to wrong registers commit 7713e62c8623c54dac88d1fa724aa487a38c3efb upstream. in0 thresholds are written to the in2 thresholds registers in2 thresholds to in3 thresholds in3 thresholds to in4 thresholds in4 thresholds to in0 thresholds Signed-off-by: Gilles Buloz Link: https://lore.kernel.org/r/5de0f509.rc0oEvPOMjbfPW1w%gilles.buloz@kontron.com Fixes: 3434f3783580 ("hwmon: Driver for Nuvoton NCT7802Y") Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 666a530b2e022bc4bd90fe1770f75fae9b1c15b9 Author: Florian Westphal Date: Sat Jan 18 11:27:25 2020 +0100 netfilter: nft_osf: add missing check for DREG attribute commit 7eaecf7963c1c8f62d62c6a8e7c439b0e7f2d365 upstream. syzbot reports just another NULL deref crash because of missing test for presence of the attribute. Reported-by: syzbot+cf23983d697c26c34f60@syzkaller.appspotmail.com Fixes: b96af92d6eaf9fadd ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit f5cdfc16faa80d3ce82b811826526d3a481d09e7 Author: Chuhong Yuan Date: Fri Jan 10 10:30:04 2020 -0800 Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register commit 97e24b095348a15ec08c476423c3b3b939186ad7 upstream. The driver misses a check for devm_thermal_zone_of_sensor_register(). Add a check to fix it. Fixes: e28d0c9cd381 ("input: convert sun4i-ts to use devm_thermal_zone_of_sensor_register") Signed-off-by: Chuhong Yuan Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit f6d8ff75271069b9df82949adc8ff2f533ef535d Author: Johan Hovold Date: Fri Jan 10 11:55:47 2020 -0800 Input: pegasus_notetaker - fix endpoint sanity check commit bcfcb7f9b480dd0be8f0df2df17340ca92a03b98 upstream. The driver was checking the number of endpoints of the first alternate setting instead of the current one, something which could be used by a malicious device (or USB descriptor fuzzer) to trigger a NULL-pointer dereference. Fixes: 1afca2b66aac ("Input: add Pegasus Notetaker tablet driver") Signed-off-by: Johan Hovold Acked-by: Martin Kepplinger Acked-by: Vladis Dronov Link: https://lore.kernel.org/r/20191210113737.4016-2-johan@kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit d6ca8b03fd808b0fa05ced36b6a5e75a337fad8d Author: Johan Hovold Date: Fri Jan 10 11:59:32 2020 -0800 Input: aiptek - fix endpoint sanity check commit 3111491fca4f01764e0c158c5e0f7ced808eef51 upstream. The driver was checking the number of endpoints of the first alternate setting instead of the current one, something which could lead to the driver binding to an invalid interface. This in turn could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: 8e20cf2bce12 ("Input: aiptek - fix crash on detecting device without endpoints") Signed-off-by: Johan Hovold Acked-by: Vladis Dronov Link: https://lore.kernel.org/r/20191210113737.4016-3-johan@kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit 20ae16280a6dfc3a0f546e281e085701cebf70c9 Author: Johan Hovold Date: Fri Jan 10 12:00:18 2020 -0800 Input: gtco - fix endpoint sanity check commit a8eeb74df5a6bdb214b2b581b14782c5f5a0cf83 upstream. The driver was checking the number of endpoints of the first alternate setting instead of the current one, something which could lead to the driver binding to an invalid interface. This in turn could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: 162f98dea487 ("Input: gtco - fix crash on detecting device without endpoints") Signed-off-by: Johan Hovold Acked-by: Vladis Dronov Link: https://lore.kernel.org/r/20191210113737.4016-5-johan@kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit 0c022c4a2391b35a6717818b9cb108f8b2362099 Author: Johan Hovold Date: Fri Jan 10 12:01:27 2020 -0800 Input: sur40 - fix interface sanity checks commit 6b32391ed675827f8425a414abbc6fbd54ea54fe upstream. Make sure to use the current alternate setting when verifying the interface descriptors to avoid binding to an invalid interface. This in turn could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: bdb5c57f209c ("Input: add sur40 driver for Samsung SUR40 (aka MS Surface 2.0/Pixelsense)") Signed-off-by: Johan Hovold Acked-by: Vladis Dronov Link: https://lore.kernel.org/r/20191210113737.4016-8-johan@kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit c694050c96794ce9d28a2399bdac85da10460052 Author: Stephan Gerhold Date: Fri Jan 17 13:40:36 2020 -0800 Input: pm8xxx-vib - fix handling of separate enable register commit 996d5d5f89a558a3608a46e73ccd1b99f1b1d058 upstream. Setting the vibrator enable_mask is not implemented correctly: For regmap_update_bits(map, reg, mask, val) we give in either regs->enable_mask or 0 (= no-op) as mask and "val" as value. But "val" actually refers to the vibrator voltage control register, which has nothing to do with the enable_mask. So we usually end up doing nothing when we really wanted to enable the vibrator. We want to set or clear the enable_mask (to enable/disable the vibrator). Therefore, change the call to always modify the enable_mask and set the bits only if we want to enable the vibrator. Fixes: d4c7c5c96c92 ("Input: pm8xxx-vib - handle separate enable register") Signed-off-by: Stephan Gerhold Link: https://lore.kernel.org/r/20200114183442.45720-1-stephan@gerhold.net Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit a243850af3fb3e748694073fa6679edfbbdc3f3a Author: Jeremy Linton Date: Fri Jan 25 12:07:00 2019 -0600 Documentation: Document arm64 kpti control commit de19055564c8f8f9d366f8db3395836da0b2176c upstream. For a while Arm64 has been capable of force enabling or disabling the kpti mitigations. Lets make sure the documentation reflects that. Signed-off-by: Jeremy Linton Reviewed-by: Andre Przywara Signed-off-by: Jonathan Corbet Signed-off-by: Greg Kroah-Hartman commit 6491a9dd3cf9d92f058d96698d22bf6eb87b9da8 Author: Michał Mirosław Date: Wed Jan 15 10:54:35 2020 +0100 mmc: sdhci: fix minimum clock rate for v3 controller commit 2a187d03352086e300daa2044051db00044cd171 upstream. For SDHCIv3+ with programmable clock mode, minimal clock frequency is still base clock / max(divider). Minimal programmable clock frequency is always greater than minimal divided clock frequency. Without this patch, SDHCI uses out-of-spec initial frequency when multiplier is big enough: mmc1: mmc_rescan_try_freq: trying to init card at 468750 Hz [for 480 MHz source clock divided by 1024] The code in sdhci_calc_clk() already chooses a correct SDCLK clock mode. Fixes: c3ed3877625f ("mmc: sdhci: add support for programmable clock mode") Cc: # 4f6aa3264af4: mmc: tegra: Only advertise UHS modes if IO regulator is present Cc: Signed-off-by: Michał Mirosław Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/ffb489519a446caffe7a0a05c4b9372bd52397bb.1579082031.git.mirq-linux@rere.qmqm.pl Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 3018dc1af52460a4b780dc1e1942470fe1d8ab03 Author: Michał Mirosław Date: Tue Jan 7 10:47:34 2020 +0100 mmc: tegra: fix SDR50 tuning override commit f571389c0b015e76f91c697c4c1700aba860d34f upstream. Commit 7ad2ed1dfcbe inadvertently mixed up a quirk flag's name and broke SDR50 tuning override. Use correct NVQUIRK_ name. Fixes: 7ad2ed1dfcbe ("mmc: tegra: enable UHS-I modes") Cc: Acked-by: Adrian Hunter Reviewed-by: Thierry Reding Tested-by: Thierry Reding Signed-off-by: Michał Mirosław Link: https://lore.kernel.org/r/9aff1d859935e59edd81e4939e40d6c55e0b55f6.1578390388.git.mirq-linux@rere.qmqm.pl Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit ddb2f192d723b2961f3f71dc5d93646095edf9a1 Author: Alex Sverdlin Date: Wed Jan 8 15:57:47 2020 +0100 ARM: 8950/1: ftrace/recordmcount: filter relocation types commit 927d780ee371d7e121cea4fc7812f6ef2cea461c upstream. Scenario 1, ARMv7 ================= If code in arch/arm/kernel/ftrace.c would operate on mcount() pointer the following may be generated: 00000230 : 230: b5f8 push {r3, r4, r5, r6, r7, lr} 232: b500 push {lr} 234: f7ff fffe bl 0 <__gnu_mcount_nc> 234: R_ARM_THM_CALL __gnu_mcount_nc 238: f240 0600 movw r6, #0 238: R_ARM_THM_MOVW_ABS_NC __gnu_mcount_nc 23c: f8d0 1180 ldr.w r1, [r0, #384] ; 0x180 FTRACE currently is not able to deal with it: WARNING: CPU: 0 PID: 0 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1ad/0x230() ... CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.116-... #1 ... [] (unwind_backtrace) from [] (show_stack+0x11/0x14) [] (show_stack) from [] (dump_stack+0x81/0xa8) [] (dump_stack) from [] (warn_slowpath_common+0x69/0x90) [] (warn_slowpath_common) from [] (warn_slowpath_null+0x17/0x1c) [] (warn_slowpath_null) from [] (ftrace_bug+0x1ad/0x230) [] (ftrace_bug) from [] (ftrace_process_locs+0x27d/0x444) [] (ftrace_process_locs) from [] (ftrace_init+0x91/0xe8) [] (ftrace_init) from [] (start_kernel+0x34b/0x358) [] (start_kernel) from [<00308095>] (0x308095) ---[ end trace cb88537fdc8fa200 ]--- ftrace failed to modify [] prealloc_fixed_plts+0x8/0x60 actual: 44:f2:e1:36 ftrace record flags: 0 (0) expected tramp: c03143e9 Scenario 2, ARMv4T ================== ftrace: allocating 14435 entries in 43 pages ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2029 ftrace_bug+0x204/0x310 CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.5 #1 Hardware name: Cirrus Logic EDB9302 Evaluation Board [] (unwind_backtrace) from [] (show_stack+0x20/0x2c) [] (show_stack) from [] (dump_stack+0x20/0x30) [] (dump_stack) from [] (__warn+0xdc/0x104) [] (__warn) from [] (warn_slowpath_null+0x4c/0x5c) [] (warn_slowpath_null) from [] (ftrace_bug+0x204/0x310) [] (ftrace_bug) from [] (ftrace_init+0x3b4/0x4d4) [] (ftrace_init) from [] (start_kernel+0x20c/0x410) [] (start_kernel) from [<00000000>] ( (null)) ---[ end trace 0506a2f5dae6b341 ]--- ftrace failed to modify [] perf_trace_sys_exit+0x5c/0xe8 actual: 1e:ff:2f:e1 Initializing ftrace call sites ftrace record flags: 0 (0) expected tramp: c000fb24 The analysis for this problem has been already performed previously, refer to the link below. Fix the above problems by allowing only selected reloc types in __mcount_loc. The list itself comes from the legacy recordmcount.pl script. Link: https://lore.kernel.org/lkml/56961010.6000806@pengutronix.de/ Cc: stable@vger.kernel.org Fixes: ed60453fa8f8 ("ARM: 6511/1: ftrace: add ARM support for C version of recordmcount") Signed-off-by: Alexander Sverdlin Acked-by: Steven Rostedt (VMware) Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 76ac84d52720e35395d352dada760afc03fff317 Author: Hans Verkuil Date: Thu Jan 16 20:12:27 2020 -0800 Revert "Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers" commit 8ff771f8c8d55d95f102cf88a970e541a8bd6bcf upstream. This reverts commit a284e11c371e446371675668d8c8120a27227339. This causes problems (drifting cursor) with at least the F11 function that reads more than 32 bytes. The real issue is in the F54 driver, and so this should be fixed there, and not in rmi_smbus.c. So first revert this bad commit, then fix the real problem in F54 in another patch. Signed-off-by: Hans Verkuil Reported-by: Timo Kaufmann Fixes: a284e11c371e ("Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200115124819.3191024-2-hverkuil-cisco@xs4all.nl Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit ef2f9f37c3929c500335c4d3ff2a61c126be325f Author: Johan Hovold Date: Mon Jan 13 10:38:57 2020 -0800 Input: keyspan-remote - fix control-message timeouts commit ba9a103f40fc4a3ec7558ec9b0b97d4f92034249 upstream. The driver was issuing synchronous uninterruptible control requests without using a timeout. This could lead to the driver hanging on probe due to a malfunctioning (or malicious) device until the device is physically disconnected. While sleeping in probe the driver prevents other devices connected to the same hub from being added to (or removed from) the bus. The USB upper limit of five seconds per request should be more than enough. Fixes: 99f83c9c9ac9 ("[PATCH] USB: add driver for Keyspan Digital Remote") Signed-off-by: Johan Hovold Reviewed-by: Greg Kroah-Hartman Cc: stable # 2.6.13 Link: https://lore.kernel.org/r/20200113171715.30621-1-johan@kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit 47eb3574d0ab0c2962af1f4cc608c842654f1ca4 Author: Masami Hiramatsu Date: Fri Dec 20 11:31:43 2019 +0900 tracing: trigger: Replace unneeded RCU-list traversals commit aeed8aa3874dc15b9d82a6fe796fd7cfbb684448 upstream. With CONFIG_PROVE_RCU_LIST, I had many suspicious RCU warnings when I ran ftracetest trigger testcases. ----- # dmesg -c > /dev/null # ./ftracetest test.d/trigger ... # dmesg | grep "RCU-list traversed" | cut -f 2 -d ] | cut -f 2 -d " " kernel/trace/trace_events_hist.c:6070 kernel/trace/trace_events_hist.c:1760 kernel/trace/trace_events_hist.c:5911 kernel/trace/trace_events_trigger.c:504 kernel/trace/trace_events_hist.c:1810 kernel/trace/trace_events_hist.c:3158 kernel/trace/trace_events_hist.c:3105 kernel/trace/trace_events_hist.c:5518 kernel/trace/trace_events_hist.c:5998 kernel/trace/trace_events_hist.c:6019 kernel/trace/trace_events_hist.c:6044 kernel/trace/trace_events_trigger.c:1500 kernel/trace/trace_events_trigger.c:1540 kernel/trace/trace_events_trigger.c:539 kernel/trace/trace_events_trigger.c:584 ----- I investigated those warnings and found that the RCU-list traversals in event trigger and hist didn't need to use RCU version because those were called only under event_mutex. I also checked other RCU-list traversals related to event trigger list, and found that most of them were called from event_hist_trigger_func() or hist_unregister_trigger() or register/unregister functions except for a few cases. Replace these unneeded RCU-list traversals with normal list traversal macro and lockdep_assert_held() to check the event_mutex is held. Link: http://lkml.kernel.org/r/157680910305.11685.15110237954275915782.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: 30350d65ac567 ("tracing: Add variable support to hist triggers") Reviewed-by: Tom Zanussi Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit b48fea52b951f0652c39ea8b3362338d909fca79 Author: Alex Deucher Date: Tue Jan 14 17:09:28 2020 -0600 PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken commit 5e89cd303e3a4505752952259b9f1ba036632544 upstream. To account for parts of the chip that are "harvested" (disabled) due to silicon flaws, caches on some AMD GPUs must be initialized before ATS is enabled. ATS is normally enabled by the IOMMU driver before the GPU driver loads, so this cache initialization would have to be done in a quirk, but that's too complex to be practical. For Navi14 (device ID 0x7340), this initialization is done by the VBIOS, but apparently some boards went to production with an older VBIOS that doesn't do it. Disable ATS for those boards. Link: https://lore.kernel.org/r/20200114205523.1054271-3-alexander.deucher@amd.com Bug: https://gitlab.freedesktop.org/drm/amd/issues/1015 See-also: d28ca864c493 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken") See-also: 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken") [bhelgaas: squash into one patch, simplify slightly, commit log] Signed-off-by: Alex Deucher Signed-off-by: Bjorn Helgaas Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 4235c1e802856b559a7229d89acfd58a69e636da Author: Guenter Roeck Date: Thu Jan 16 10:44:17 2020 -0800 hwmon: (core) Do not use device managed functions for memory allocations commit 3bf8bdcf3bada771eb12b57f2a30caee69e8ab8d upstream. The hwmon core uses device managed functions, tied to the hwmon parent device, for various internal memory allocations. This is problematic since hwmon device lifetime does not necessarily match its parent's device lifetime. If there is a mismatch, memory leaks will accumulate until the parent device is released. Fix the problem by managing all memory allocations internally. The only exception is memory allocation for thermal device registration, which can be tied to the hwmon device, along with thermal device registration itself. Fixes: d560168b5d0f ("hwmon: (core) New hwmon registration API") Cc: stable@vger.kernel.org # v4.14.x: 47c332deb8e8: hwmon: Deal with errors from the thermal subsystem Cc: stable@vger.kernel.org # v4.14.x: 74e3512731bd: hwmon: (core) Fix double-free in __hwmon_device_register() Cc: stable@vger.kernel.org # v4.9.x: 3a412d5e4a1c: hwmon: (core) Simplify sysfs attribute name allocation Cc: stable@vger.kernel.org # v4.9.x: 47c332deb8e8: hwmon: Deal with errors from the thermal subsystem Cc: stable@vger.kernel.org # v4.9.x: 74e3512731bd: hwmon: (core) Fix double-free in __hwmon_device_register() Cc: stable@vger.kernel.org # v4.9+ Cc: Martin K. Petersen Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit c84732496ce7fab7e8abd3c8cb216b4059289f28 Author: Luuk Paulussen Date: Fri Dec 6 12:16:59 2019 +1300 hwmon: (adt7475) Make volt2reg return same reg as reg2volt input commit cf3ca1877574a306c0207cbf7fdf25419d9229df upstream. reg2volt returns the voltage that matches a given register value. Converting this back the other way with volt2reg didn't return the same register value because it used truncation instead of rounding. This meant that values read from sysfs could not be written back to sysfs to set back the same register value. With this change, volt2reg will return the same value for every voltage previously returned by reg2volt (for the set of possible input values) Signed-off-by: Luuk Paulussen Link: https://lore.kernel.org/r/20191205231659.1301-1-luuk.paulussen@alliedtelesis.co.nz cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 881c9706ebf625df24c29d183c5f8589f7ca62ec Author: David Howells Date: Sun Jan 26 01:02:53 2020 +0000 afs: Fix characters allowed into cell names commit a45ea48e2bcd92c1f678b794f488ca0bda9835b8 upstream. The afs filesystem needs to prohibit certain characters from cell names, such as '/', as these are used to form filenames in procfs, leading to the following warning being generated: WARNING: CPU: 0 PID: 3489 at fs/proc/generic.c:178 Fix afs_alloc_cell() to disallow nonprintable characters, '/', '@' and names that begin with a dot. Remove the check for "@cell" as that is then redundant. This can be tested by running: echo add foo/.bar 1.2.3.4 >/proc/fs/afs/cells Note that we will also need to deal with: - Names ending in ".invalid" shouldn't be passed to the DNS. - Names that contain non-valid domainname chars shouldn't be passed to the DNS. - DNS replies that say "your-dns-needs-immediate-attention." and replies containing A records that say 127.0.53.53 should be considered invalid. [https://www.icann.org/en/system/files/files/name-collision-mitigation-01aug14-en.pdf] but these need to be dealt with by the kafs-client DNS program rather than the kernel. Reported-by: syzbot+b904ba7c947a37b4b291@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: David Howells Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8f50a05dd6fe2372ac0d5c67645e4e480254ce30 Author: Eric Dumazet Date: Wed Jan 22 09:07:35 2020 -0800 tun: add mutex_unlock() call and napi.skb clearing in tun_get_user() [ Upstream commit 1efba987c48629c0c64703bb4ea76ca1a3771d17 ] If both IFF_NAPI_FRAGS mode and XDP are enabled, and the XDP program consumes the skb, we need to clear the napi.skb (or risk a use-after-free) and release the mutex (or risk a deadlock) WARNING: lock held when returning to user space! 5.5.0-rc6-syzkaller #0 Not tainted ------------------------------------------------ syz-executor.0/455 is leaving the kernel with locks still held! 1 lock held by syz-executor.0/455: #0: ffff888098f6e748 (&tfile->napi_mutex){+.+.}, at: tun_get_user+0x1604/0x3fc0 drivers/net/tun.c:1835 Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Petar Penkov Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9bbde0825846002c6931f41fbbd71eeb848ca0e1 Author: Eric Dumazet Date: Wed Jan 22 21:03:00 2020 -0800 tcp: do not leave dangling pointers in tp->highest_sack [ Upstream commit 2bec445f9bf35e52e395b971df48d3e1e5dc704a ] Latest commit 853697504de0 ("tcp: Fix highest_sack and highest_sack_seq") apparently allowed syzbot to trigger various crashes in TCP stack [1] I believe this commit only made things easier for syzbot to find its way into triggering use-after-frees. But really the bugs could lead to bad TCP behavior or even plain crashes even for non malicious peers. I have audited all calls to tcp_rtx_queue_unlink() and tcp_rtx_queue_unlink_and_free() and made sure tp->highest_sack would be updated if we are removing from rtx queue the skb that tp->highest_sack points to. These updates were missing in three locations : 1) tcp_clean_rtx_queue() [This one seems quite serious, I have no idea why this was not caught earlier] 2) tcp_rtx_queue_purge() [Probably not a big deal for normal operations] 3) tcp_send_synack() [Probably not a big deal for normal operations] [1] BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1864 [inline] BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1856 [inline] BUG: KASAN: use-after-free in tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891 Read of size 4 at addr ffff8880a488d068 by task ksoftirqd/1/16 CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:134 tcp_highest_sack_seq include/net/tcp.h:1864 [inline] tcp_highest_sack_seq include/net/tcp.h:1856 [inline] tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891 tcp_try_undo_partial net/ipv4/tcp_input.c:2730 [inline] tcp_fastretrans_alert+0xf74/0x23f0 net/ipv4/tcp_input.c:2847 tcp_ack+0x2577/0x5bf0 net/ipv4/tcp_input.c:3710 tcp_rcv_established+0x6dd/0x1e90 net/ipv4/tcp_input.c:5706 tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619 tcp_v4_rcv+0x307f/0x3b40 net/ipv4/tcp_ipv4.c:2001 ip_protocol_deliver_rcu+0x5a/0x880 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x23b/0x380 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x1db/0x2f0 net/ipv4/ip_input.c:428 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:538 __netif_receive_skb_one_core+0x113/0x1a0 net/core/dev.c:5148 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5262 process_backlog+0x206/0x750 net/core/dev.c:6093 napi_poll net/core/dev.c:6530 [inline] net_rx_action+0x508/0x1120 net/core/dev.c:6598 __do_softirq+0x262/0x98c kernel/softirq.c:292 run_ksoftirqd kernel/softirq.c:603 [inline] run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595 smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Allocated by task 10091: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] __kasan_kmalloc mm/kasan/common.c:513 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:521 slab_post_alloc_hook mm/slab.h:584 [inline] slab_alloc_node mm/slab.c:3263 [inline] kmem_cache_alloc_node+0x138/0x740 mm/slab.c:3575 __alloc_skb+0xd5/0x5e0 net/core/skbuff.c:198 alloc_skb_fclone include/linux/skbuff.h:1099 [inline] sk_stream_alloc_skb net/ipv4/tcp.c:875 [inline] sk_stream_alloc_skb+0x113/0xc90 net/ipv4/tcp.c:852 tcp_sendmsg_locked+0xcf9/0x3470 net/ipv4/tcp.c:1282 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1432 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 __sys_sendto+0x262/0x380 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto net/socket.c:2006 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:2006 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 10095: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:335 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483 __cache_free mm/slab.c:3426 [inline] kmem_cache_free+0x86/0x320 mm/slab.c:3694 kfree_skbmem+0x178/0x1c0 net/core/skbuff.c:645 __kfree_skb+0x1e/0x30 net/core/skbuff.c:681 sk_eat_skb include/net/sock.h:2453 [inline] tcp_recvmsg+0x1252/0x2930 net/ipv4/tcp.c:2166 inet_recvmsg+0x136/0x610 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:886 [inline] sock_recvmsg net/socket.c:904 [inline] sock_recvmsg+0xce/0x110 net/socket.c:900 __sys_recvfrom+0x1ff/0x350 net/socket.c:2055 __do_sys_recvfrom net/socket.c:2073 [inline] __se_sys_recvfrom net/socket.c:2069 [inline] __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:2069 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8880a488d040 which belongs to the cache skbuff_fclone_cache of size 456 The buggy address is located 40 bytes inside of 456-byte region [ffff8880a488d040, ffff8880a488d208) The buggy address belongs to the page: page:ffffea0002922340 refcount:1 mapcount:0 mapping:ffff88821b057000 index:0x0 raw: 00fffe0000000200 ffffea00022a5788 ffffea0002624a48 ffff88821b057000 raw: 0000000000000000 ffff8880a488d040 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880a488cf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8880a488cf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8880a488d000: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8880a488d080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880a488d100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 853697504de0 ("tcp: Fix highest_sack and highest_sack_seq") Fixes: 50895b9de1d3 ("tcp: highest_sack fix") Fixes: 737ff314563c ("tcp: use sequence distance to detect reordering") Signed-off-by: Eric Dumazet Cc: Cambda Zhu Cc: Yuchung Cheng Cc: Neal Cardwell Acked-by: Neal Cardwell Acked-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 33dba56493e7ebd586a62e2a982c5b796b9c3d88 Author: Wen Yang Date: Mon Jan 20 18:04:56 2020 +0800 tcp_bbr: improve arithmetic division in bbr_update_bw() [ Upstream commit 5b2f1f3070b6447b76174ea8bfb7390dc6253ebd ] do_div() does a 64-by-32 division. Use div64_long() instead of it if the divisor is long, to avoid truncation to 32-bit. And as a nice side effect also cleans up the function a bit. Signed-off-by: Wen Yang Cc: Eric Dumazet Cc: "David S. Miller" Cc: Alexey Kuznetsov Cc: Hideaki YOSHIFUJI Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4c1c35c01531cda945cd1e0ae994a6c91c88de81 Author: Paolo Abeni Date: Tue Jan 21 16:50:49 2020 +0100 Revert "udp: do rmem bulk free even if the rx sk queue is empty" [ Upstream commit d39ca2590d10712f412add7a88e1dd467a7246f4 ] This reverts commit 0d4a6608f68c7532dcbfec2ea1150c9761767d03. Willem reported that after commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") the memory allocated by an almost idle system with many UDP sockets can grow a lot. For stable kernel keep the solution as simple as possible and revert the offending commit. Reported-by: Willem de Bruijn Diagnosed-by: Eric Dumazet Fixes: 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty") Signed-off-by: Paolo Abeni Acked-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c74b3d128d57df60bd3ff746bf3f2025597b2d90 Author: James Hughes Date: Mon Jan 20 11:12:40 2020 +0000 net: usb: lan78xx: Add .ndo_features_check [ Upstream commit ce896476c65d72b4b99fa09c2f33436b4198f034 ] As reported by Eric Dumazet, there are still some outstanding cases where the driver does not handle TSO correctly when skb's are over a certain size. Most cases have been fixed, this patch should ensure that forwarded SKB's that are greater than MAX_SINGLE_PACKET_SIZE - TX_OVERHEAD are software segmented and handled correctly. Signed-off-by: James Hughes Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b4b0f1fc194614859486b4fd19bd5885a3c8818f Author: Jouni Hogander Date: Mon Jan 20 09:51:03 2020 +0200 net-sysfs: Fix reference count leak [ Upstream commit cb626bf566eb4433318d35681286c494f04fedcc ] Netdev_register_kobject is calling device_initialize. In case of error reference taken by device_initialize is not given up. Drivers are supposed to call free_netdev in case of error. In non-error case the last reference is given up there and device release sequence is triggered. In error case this reference is kept and the release sequence is never started. Fix this by setting reg_state as NETREG_UNREGISTERED if registering fails. This is the rootcause for couple of memory leaks reported by Syzkaller: BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256): comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280 [<000000002340019b>] device_add+0x882/0x1750 [<000000001d588c3a>] netdev_register_kobject+0x128/0x380 [<0000000011ef5535>] register_netdevice+0xa1b/0xf00 [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0 [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40 [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510 [<00000000fba062ea>] ksys_ioctl+0x99/0xb0 [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0 [<00000000984cabb9>] do_syscall_64+0x16f/0x580 [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<00000000e6ca2d9f>] 0xffffffffffffffff BUG: memory leak unreferenced object 0xffff8880668ba588 (size 8): comm "kobject_set_nam", pid 286, jiffies 4294725297 (age 9.871s) hex dump (first 8 bytes): 6e 72 30 00 cc be df 2b nr0....+ backtrace: [<00000000a322332a>] __kmalloc_track_caller+0x16e/0x290 [<00000000236fd26b>] kstrdup+0x3e/0x70 [<00000000dd4a2815>] kstrdup_const+0x3e/0x50 [<0000000049a377fc>] kvasprintf_const+0x10e/0x160 [<00000000627fc711>] kobject_set_name_vargs+0x5b/0x140 [<0000000019eeab06>] dev_set_name+0xc0/0xf0 [<0000000069cb12bc>] netdev_register_kobject+0xc8/0x320 [<00000000f2e83732>] register_netdevice+0xa1b/0xf00 [<000000009e1f57cc>] __tun_chr_ioctl+0x20d5/0x3dd0 [<000000009c560784>] tun_chr_ioctl+0x2f/0x40 [<000000000d759e02>] do_vfs_ioctl+0x1c7/0x1510 [<00000000351d7c31>] ksys_ioctl+0x99/0xb0 [<000000008390040a>] __x64_sys_ioctl+0x78/0xb0 [<0000000052d196b7>] do_syscall_64+0x16f/0x580 [<0000000019af9236>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<00000000bc384531>] 0xffffffffffffffff v3 -> v4: Set reg_state to NETREG_UNREGISTERED if registering fails v2 -> v3: * Replaced BUG_ON with WARN_ON in free_netdev and netdev_release v1 -> v2: * Relying on driver calling free_netdev rather than calling put_device directly in error path Reported-by: syzbot+ad8ca40ecd77896d51e2@syzkaller.appspotmail.com Cc: David Miller Cc: Greg Kroah-Hartman Cc: Lukas Bulwahn Signed-off-by: Jouni Hogander Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 33c540f663d1dc95461380e2e2ca83cc9b25dd59 Author: Jouni Hogander Date: Tue Dec 17 13:46:34 2019 +0200 net-sysfs: Call dev_hold always in rx_queue_add_kobject commit ddd9b5e3e765d8ed5a35786a6cb00111713fe161 upstream. Dev_hold has to be called always in rx_queue_add_kobject. Otherwise usage count drops below 0 in case of failure in kobject_init_and_add. Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject") Reported-by: syzbot Cc: Tetsuo Handa Cc: David Miller Cc: Lukas Bulwahn Signed-off-by: Jouni Hogander Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f8862bc44fadf9d96c7819bf12b9025240f065c9 Author: Jouni Hogander Date: Thu Dec 5 15:57:07 2019 +0200 net-sysfs: Call dev_hold always in netdev_queue_add_kobject commit e0b60903b434a7ee21ba8d8659f207ed84101e89 upstream. Dev_hold has to be called always in netdev_queue_add_kobject. Otherwise usage count drops below 0 in case of failure in kobject_init_and_add. Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject") Reported-by: Hulk Robot Cc: Tetsuo Handa Cc: David Miller Cc: Lukas Bulwahn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7070695e6077e2c3bb3a67432682cf4b3c258942 Author: Eric Dumazet Date: Wed Nov 20 19:19:07 2019 -0800 net-sysfs: fix netdev_queue_add_kobject() breakage commit 48a322b6f9965b2f1e4ce81af972f0e287b07ed0 upstream. kobject_put() should only be called in error path. Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject") Signed-off-by: Eric Dumazet Cc: Jouni Hogander Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 60e715466109b76073c02e02b50df1c56ea4aac9 Author: Jouni Hogander Date: Wed Nov 20 09:08:16 2019 +0200 net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject commit b8eb718348b8fb30b5a7d0a8fce26fb3f4ac741b upstream. kobject_init_and_add takes reference even when it fails. This has to be given up by the caller in error handling. Otherwise memory allocated by kobject_init_and_add is never freed. Originally found by Syzkaller: BUG: memory leak unreferenced object 0xffff8880679f8b08 (size 8): comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s) hex dump (first 8 bytes): 72 78 2d 30 00 36 20 d4 rx-0.6 . backtrace: [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290 [<000000001f2e4e49>] kvasprintf+0xb1/0x140 [<000000007f313394>] kvasprintf_const+0x56/0x160 [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140 [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170 [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560 [<000000006be5f104>] netdev_register_kobject+0x210/0x380 [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00 [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0 [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40 [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510 [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0 [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0 [<0000000038d946e5>] do_syscall_64+0x16f/0x580 [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<00000000285b3d1a>] 0xffffffffffffffff Cc: David Miller Cc: Lukas Bulwahn Signed-off-by: Jouni Hogander Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 66ac8ee96faa582a252ae19510f35529c9143670 Author: Cong Wang Date: Wed Jan 22 15:42:02 2020 -0800 net_sched: fix datalen for ematch [ Upstream commit 61678d28d4a45ef376f5d02a839cc37509ae9281 ] syzbot reported an out-of-bound access in em_nbyte. As initially analyzed by Eric, this is because em_nbyte sets its own em->datalen in em_nbyte_change() other than the one specified by user, but this value gets overwritten later by its caller tcf_em_validate(). We should leave em->datalen untouched to respect their choices. I audit all the in-tree ematch users, all of those implement ->change() set em->datalen, so we can just avoid setting it twice in this case. Reported-and-tested-by: syzbot+5af9a90dad568aa9f611@syzkaller.appspotmail.com Reported-by: syzbot+2f07903a5b05e7f36410@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Eric Dumazet Signed-off-by: Cong Wang Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit be1a2be7a7b0ed5a758fd8decc39386ba3b5d556 Author: Eric Dumazet Date: Tue Jan 21 22:47:29 2020 -0800 net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() [ Upstream commit d836f5c69d87473ff65c06a6123e5b2cf5e56f5b ] rtnl_create_link() needs to apply dev->min_mtu and dev->max_mtu checks that we apply in do_setlink() Otherwise malicious users can crash the kernel, for example after an integer overflow : BUG: KASAN: use-after-free in memset include/linux/string.h:365 [inline] BUG: KASAN: use-after-free in __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238 Write of size 32 at addr ffff88819f20b9c0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192 memset+0x24/0x40 mm/kasan/common.c:108 memset include/linux/string.h:365 [inline] __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238 alloc_skb include/linux/skbuff.h:1049 [inline] alloc_skb_with_frags+0x93/0x590 net/core/skbuff.c:5664 sock_alloc_send_pskb+0x7ad/0x920 net/core/sock.c:2242 sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2259 mld_newpack+0x1d7/0x7f0 net/ipv6/mcast.c:1609 add_grhead.isra.0+0x299/0x370 net/ipv6/mcast.c:1713 add_grec+0x7db/0x10b0 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:1970 [inline] mld_ifc_timer_expire+0x3d3/0x950 net/ipv6/mcast.c:2477 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61 Code: 98 6b ea f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 44 1c 60 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 34 1c 60 00 fb f4 cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 4e 5d 9a f9 e8 79 RSP: 0018:ffffffff89807ce8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffffffff13266ae RBX: ffffffff8987a1c0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffffffff8987aa54 RBP: ffffffff89807d18 R08: ffffffff8987a1c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000 R13: ffffffff8a799980 R14: 0000000000000000 R15: 0000000000000000 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:690 default_idle_call+0x84/0xb0 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361 rest_init+0x23b/0x371 init/main.c:451 arch_call_rest_init+0xe/0x1b start_kernel+0x904/0x943 init/main.c:784 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x77/0x7b arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242 The buggy address belongs to the page: page:ffffea00067c82c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 raw: 057ffe0000000000 ffffea00067c82c8 ffffea00067c82c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88819f20b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88819f20b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff88819f20b980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88819f20ba00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88819f20ba80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1d3b53f716b56be2afd794b1fc47633e7621f018 Author: William Dauchy Date: Tue Jan 21 15:26:24 2020 +0100 net, ip_tunnel: fix namespaces move [ Upstream commit d0f418516022c32ecceaf4275423e5bd3f8743a9 ] in the same manner as commit 690afc165bb3 ("net: ip6_gre: fix moving ip6gre between namespaces"), fix namespace moving as it was broken since commit 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata."). Indeed, the ip6_gre commit removed the local flag for collect_md condition, so there is no reason to keep it for ip_gre/ip_tunnel. this patch will fix both ip_tunnel and ip_gre modules. Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: William Dauchy Acked-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fddb6ea5143a5d92247ffab103d1eaa659de740b Author: William Dauchy Date: Tue Jan 21 21:49:54 2020 +0100 net, ip6_tunnel: fix namespaces move [ Upstream commit 5311a69aaca30fa849c3cc46fb25f75727fb72d0 ] in the same manner as commit d0f418516022 ("net, ip_tunnel: fix namespaces move"), fix namespace moving as it was broken since commit 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnel"), but for ipv6 this time; there is no reason to keep it for ip6_tunnel. Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnel") Signed-off-by: William Dauchy Acked-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d0201d2405dac8d9b16773e97709925e397552d0 Author: Niko Kortstrom Date: Thu Jan 16 11:43:27 2020 +0200 net: ip6_gre: fix moving ip6gre between namespaces [ Upstream commit 690afc165bb314354667f67157c1a1aea7dc797a ] Support for moving IPv4 GRE tunnels between namespaces was added in commit b57708add314 ("gre: add x-netns support"). The respective change for IPv6 tunnels, commit 22f08069e8b4 ("ip6gre: add x-netns support") did not drop NETIF_F_NETNS_LOCAL flag so moving them from one netns to another is still denied in IPv6 case. Drop NETIF_F_NETNS_LOCAL flag from ip6gre tunnels to allow moving ip6gre tunnel endpoints between network namespaces. Signed-off-by: Niko Kortstrom Acked-by: Nicolas Dichtel Acked-by: William Tu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 404d333fd36172ee7730c9a17746d3e35a167f5d Author: Michael Ellerman Date: Fri Jan 24 20:41:44 2020 +1100 net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM [ Upstream commit 3546d8f1bbe992488ed91592cf6bf76e7114791a = The cxgb3 driver for "Chelsio T3-based gigabit and 10Gb Ethernet adapters" implements a custom ioctl as SIOCCHIOCTL/SIOCDEVPRIVATE in cxgb_extension_ioctl(). One of the subcommands of the ioctl is CHELSIO_GET_MEM, which appears to read memory directly out of the adapter and return it to userspace. It's not entirely clear what the contents of the adapter memory contains, but the assumption is that it shouldn't be accessible to all users. So add a CAP_NET_ADMIN check to the CHELSIO_GET_MEM case. Put it after the is_offload() check, which matches two of the other subcommands in the same function which also check for is_offload() and CAP_NET_ADMIN. Found by Ilja by code inspection, not tested as I don't have the required hardware. Reported-by: Ilja Van Sprundel Signed-off-by: Michael Ellerman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0705c8d7aae518e4cb22c68749eaac917e9902b8 Author: Florian Fainelli Date: Thu Jan 23 09:49:34 2020 -0800 net: bcmgenet: Use netif_tx_napi_add() for TX NAPI [ Upstream commit 148965df1a990af98b2c84092c2a2274c7489284 ] Before commit 7587935cfa11 ("net: bcmgenet: move NAPI initialization to ring initialization") moved the code, this used to be netif_tx_napi_add(), but we lost that small semantic change in the process, restore that. Fixes: 7587935cfa11 ("net: bcmgenet: move NAPI initialization to ring initialization") Signed-off-by: Florian Fainelli Acked-by: Doug Berger Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d3c0a8be8bbbf3f67ccd88909993b802886b6b18 Author: Yuki Taguchi Date: Mon Jan 20 13:48:37 2020 +0900 ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions [ Upstream commit 62ebaeaedee7591c257543d040677a60e35c7aec ] After LRO/GRO is applied, SRv6 encapsulated packets have SKB_GSO_IPXIP6 feature flag, and this flag must be removed right after decapulation procedure. Currently, SKB_GSO_IPXIP6 flag is not removed on End.D* actions, which creates inconsistent packet state, that is, a normal TCP/IP packets have the SKB_GSO_IPXIP6 flag. This behavior can cause unexpected fallback to GSO on routing to netdevices that do not support SKB_GSO_IPXIP6. For example, on inter-VRF forwarding, decapsulated packets separated into small packets by GSO because VRF devices do not support TSO for packets with SKB_GSO_IPXIP6 flag, and this degrades forwarding performance. This patch removes encapsulation related GSO flags from the skb right after the End.D* action is applied. Fixes: d7a669dd2f8b ("ipv6: sr: add helper functions for seg6local") Signed-off-by: Yuki Taguchi Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d3b5ecceea7dc3ce36c5306b3e45bd75cd192291 Author: Eric Dumazet Date: Tue Jan 21 23:17:14 2020 -0800 gtp: make sure only SOCK_DGRAM UDP sockets are accepted [ Upstream commit 940ba14986657a50c15f694efca1beba31fa568f ] A malicious user could use RAW sockets and fool GTP using them as standard SOCK_DGRAM UDP sockets. BUG: KMSAN: uninit-value in udp_tunnel_encap_enable include/net/udp_tunnel.h:174 [inline] BUG: KMSAN: uninit-value in setup_udp_tunnel_sock+0x45e/0x6f0 net/ipv4/udp_tunnel.c:85 CPU: 0 PID: 11262 Comm: syz-executor613 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 udp_tunnel_encap_enable include/net/udp_tunnel.h:174 [inline] setup_udp_tunnel_sock+0x45e/0x6f0 net/ipv4/udp_tunnel.c:85 gtp_encap_enable_socket+0x37f/0x5a0 drivers/net/gtp.c:827 gtp_encap_enable drivers/net/gtp.c:844 [inline] gtp_newlink+0xfb/0x1e50 drivers/net/gtp.c:666 __rtnl_newlink net/core/rtnetlink.c:3305 [inline] rtnl_newlink+0x2973/0x3920 net/core/rtnetlink.c:3363 rtnetlink_rcv_msg+0x1153/0x1570 net/core/rtnetlink.c:5424 netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x451/0x5f0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x441359 Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff1cd0ac28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441359 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000246 R12: 00000000004020d0 R13: 0000000000402160 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags+0x3c/0x90 mm/kmsan/kmsan.c:144 kmsan_internal_alloc_meta_for_pages mm/kmsan/kmsan_shadow.c:307 [inline] kmsan_alloc_page+0x12a/0x310 mm/kmsan/kmsan_shadow.c:336 __alloc_pages_nodemask+0x57f2/0x5f60 mm/page_alloc.c:4800 alloc_pages_current+0x67d/0x990 mm/mempolicy.c:2207 alloc_pages include/linux/gfp.h:534 [inline] alloc_slab_page+0x111/0x12f0 mm/slub.c:1511 allocate_slab mm/slub.c:1656 [inline] new_slab+0x2bc/0x1130 mm/slub.c:1722 new_slab_objects mm/slub.c:2473 [inline] ___slab_alloc+0x1533/0x1f30 mm/slub.c:2624 __slab_alloc mm/slub.c:2664 [inline] slab_alloc_node mm/slub.c:2738 [inline] slab_alloc mm/slub.c:2783 [inline] kmem_cache_alloc+0xb23/0xd70 mm/slub.c:2788 sk_prot_alloc+0xf2/0x620 net/core/sock.c:1597 sk_alloc+0xf0/0xbe0 net/core/sock.c:1657 inet_create+0x7c7/0x1370 net/ipv4/af_inet.c:321 __sock_create+0x8eb/0xf00 net/socket.c:1420 sock_create net/socket.c:1471 [inline] __sys_socket+0x1a1/0x600 net/socket.c:1513 __do_sys_socket net/socket.c:1522 [inline] __se_sys_socket+0x8d/0xb0 net/socket.c:1520 __x64_sys_socket+0x4a/0x70 net/socket.c:1520 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Eric Dumazet Cc: Pablo Neira Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8e360d7c42788acf1e3e0196f702e8f51c60b1c7 Author: Wenwen Wang Date: Sat Jan 25 14:33:29 2020 +0000 firestream: fix memory leaks [ Upstream commit fa865ba183d61c1ec8cbcab8573159c3b72b89a4 ] In fs_open(), 'vcc' is allocated through kmalloc() and assigned to 'atm_vcc->dev_data.' In the following execution, if an error occurs, e.g., there is no more free channel, an error code EBUSY or ENOMEM will be returned. However, 'vcc' is not deallocated, leading to memory leaks. Note that, in normal cases where fs_open() returns 0, 'vcc' will be deallocated in fs_close(). But, if fs_open() fails, there is no guarantee that fs_close() will be invoked. To fix this issue, deallocate 'vcc' before the error code is returned. Signed-off-by: Wenwen Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bd1448458c6a67782d4e82c181b5540b5727546b Author: Richard Palethorpe Date: Tue Jan 21 14:42:58 2020 +0100 can, slip: Protect tty->disc_data in write_wakeup and close with RCU [ Upstream commit 0ace17d56824165c7f4c68785d6b58971db954dd ] write_wakeup can happen in parallel with close/hangup where tty->disc_data is set to NULL and the netdevice is freed thus also freeing disc_data. write_wakeup accesses disc_data so we must prevent close from freeing the netdev while write_wakeup has a non-NULL view of tty->disc_data. We also need to make sure that accesses to disc_data are atomic. Which can all be done with RCU. This problem was found by Syzkaller on SLCAN, but the same issue is reproducible with the SLIP line discipline using an LTP test based on the Syzkaller reproducer. A fix which didn't use RCU was posted by Hillf Danton. Fixes: 661f7fda21b1 ("slip: Fix deadlock in write_wakeup") Fixes: a8e83b17536a ("slcan: Port write_wakeup deadlock fix from slip") Reported-by: syzbot+017e491ae13c0068598a@syzkaller.appspotmail.com Signed-off-by: Richard Palethorpe Cc: Wolfgang Grandegger Cc: Marc Kleine-Budde Cc: "David S. Miller" Cc: Tyler Hall Cc: linux-can@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: syzkaller@googlegroups.com Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman