aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-01-05Merge tag 'for-4.15-rc7-tag' of ↵HEADmasterLinus Torvalds2-12/+34
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We have two more fixes for 4.15, both aimed for stable. The leak fix is obvious, the second patch fixes a bug revealed by the refcount API, when it behaves differently than previous atomic_t and reports refs going from 0 to 1 in one case" * tag 'for-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes btrfs: Fix flush bio leak
2018-01-05Merge tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds3-19/+33
Pull XFS fixes from Darrick Wong: "I have just a few fixes for bugs and resource cleanup problems this week: - Fix resource cleanup of failed quota initialization - Fix integer overflow problems wrt s_maxbytes" * tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix s_maxbytes overflow problems xfs: quota: check result of register_shrinker() xfs: quota: fix missed destroy of qi_tree_lock
2018-01-05Merge tag 'mfd-fixes-4.15-1' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD fix from Lee Jones: "Late bugfix to plug a leak in rtsx_pcr" * tag 'mfd-fixes-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: rtsx: Release IRQ during shutdown
2018-01-05Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds10-43/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 pti fixes from Thomas Gleixner: "Another small stash of fixes for fallout from the PTI work: - Fix the modules vs. KASAN breakage which was caused by making MODULES_END depend of the fixmap size. That was done when the cpu entry area moved into the fixmap, but now that we have a separate map space for that this is causing more issues than it solves. - Use the proper cache flush methods for the debugstore buffers as they are mapped/unmapped during runtime and not statically mapped at boot time like the rest of the cpu entry area. - Make the map layout of the cpu_entry_area consistent for 4 and 5 level paging and fix the KASLR vaddr_end wreckage. - Use PER_CPU_EXPORT for per cpu variable and while at it unbreak nvidia gfx drivers by dropping the GPL export. The subject line of the commit tells it the other way around, but I noticed that too late. - Fix the ASM alternative macros so they can be used in the middle of an inline asm block. - Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack vector is properly identified. The Spectre mitigations will come with their own bug bits later" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm x86/tlb: Drop the _GPL from the cpu_tlbstate export x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers x86/kaslr: Fix the vaddr_end mess x86/mm: Map cpu_entry_area at the same place on 4/5 level x86/mm: Set MODULES_END to 0xffffffffff000000
2018-01-05Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds5-17/+57
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Thomas Gleixner: - A fix for a add_efi_memmap parameter regression which ensures that the parameter is parsed before it is used. - Reinstate the virtual capsule mapping as the cached copy turned out to break Quark and other things - Remove Matt Fleming as EFI co-maintainer. He stepped back a few days ago. Thanks Matt for all your great work! * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove Matt Fleming as EFI co-maintainer efi/capsule-loader: Reinstate virtual capsule mapping x86/efi: Fix kernel param add_efi_memmap regression
2018-01-05Merge branch 'for-linus' of ↵Linus Torvalds5-3/+35
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Four bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: fix wrongly assigned configuration data s390: fix preemption race in disable_sacf_uaccess s390/sclp: disable FORTIFY_SOURCE for early sclp code s390/pci: handle insufficient resources during dma tlb flush
2018-01-05Merge tag 'for-linus-4.15-rc7-tag' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "One minor fix adjusting the kmalloc flags in the new pvcalls driver added in rc1" * tag 'for-linus-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvcalls: use GFP_ATOMIC under spin lock
2018-01-05Merge branch 'linus' of ↵Linus Torvalds12-62/+173
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - racy use of ctx->rcvused in af_alg - algif_aead crash in chacha20poly1305 - freeing bogus pointer in pcrypt - build error on MIPS in mpi - memory leak in inside-secure - memory overwrite in inside-secure - NULL pointer dereference in inside-secure - state corruption in inside-secure - build error without CRYPTO_GF128MUL in chelsio - use after free in n2" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - do not use areq->result for partial results crypto: inside-secure - fix request allocations in invalidation path crypto: inside-secure - free requests even if their handling failed crypto: inside-secure - per request invalidation lib/mpi: Fix umul_ppmm() for MIPS64r6 crypto: pcrypt - fix freeing pcrypt instances crypto: n2 - cure use after free crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t crypto: chacha20poly1305 - validate the digest size crypto: chelsio - select CRYPTO_GF128MUL
2018-01-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds12-23/+43
Merge misc fixes from Andrew Morton: "9 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mailmap: update Mark Yao's email address userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails mm/sparse.c: wrong allocation for mem_section mm/zsmalloc.c: include fs.h mm/debug.c: provide useful debugging information for VM_BUG kernel/exit.c: export abort() to modules mm/mprotect: add a cond_resched() inside change_pmd_range() kernel/acct.c: fix the acct->needcheck check in check_free_space() mm: check pfn_valid first in zero_resv_unavail
2018-01-05x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWNThomas Gleixner3-5/+5
Use the name associated with the particular attack which needs page table isolation for mitigation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jiri Koshina <jikos@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Lutomirski <luto@amacapital.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Greg KH <gregkh@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
2018-01-05x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asmDavid Woodhouse1-2/+2
Where an ALTERNATIVE is used in the middle of an inline asm block, this would otherwise lead to the following instruction being appended directly to the trailing ".popsection", and a failed compile. Fixes: 9cebed423c84 ("x86, alternative: Use .pushsection/.popsection") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: ak@linux.intel.com Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
2018-01-05mfd: rtsx: Release IRQ during shutdownSinan Kaya1-0/+3
'Commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")' revealed a resource leak in rtsx_pci driver during shutdown. Issue shows up as a warning during shutdown as follows: remove_proc_entry: removing non-empty directory 'irq/17', leaking at least 'rtsx_pci' WARNING: CPU: 0 PID: 1578 at fs/proc/generic.c:572 remove_proc_entry+0x11d/0x130 Modules linked in <long list but none that are out-of-tree> ... Call Trace: unregister_irq_proc free_desc irq_free_descs mp_unmap_irq acpi_unregister_gsi_apic acpi_pci_irq_disable do_pci_disable_device pci_disable_device device_shutdown kernel_restart Sys_reboot Even though rtsx_pci driver implements a shutdown callback, it is not releasing the interrupt that it registered during probe. This is causing the ACPI layer to complain that the shared IRQ is in use while freeing IRQ. This code releases the IRQ to prevent resource leak and eliminate the warning. Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown") Link: https://bugzilla.kernel.org/show_bug.cgi?id=198141 Reported-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2018-01-04Merge tag 'drm-fixes-for-v4.15-rc7' of ↵Linus Torvalds16-106/+127
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Just collecting some fixes to finish my hoildays :-). A few fixes for i915 (one documentation build fix), one ttm fix, one AMD display fix, one omapdrm fix, and a set of armada fixes from Russell. All seem pretty small, you can now return to your latest security news site" * tag 'drm-fixes-for-v4.15-rc7' of git://people.freedesktop.org/~airlied/linux: drm/i915: Apply Display WA #1183 on skl, kbl, and cfl drm/ttm: check the return value of kzalloc drm/amd/display: call set csc_default if enable adjustment is false docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c omapdrm/dss/hdmi4_cec: fix interrupt handling documentation/gpu/i915: fix docs build error after file rename drm/i915: Put all non-blocking modesets onto an ordered wq drm/i915: Disable DC states around GMBUS on GLK drm/i915/psr: Fix register name mess up. drm/armada: fix YUV planar format framebuffer offsets drm/armada: improve efficiency of armada_drm_plane_calc_addrs() drm/armada: fix UV swap code drm/armada: fix SRAM powerdown drm/armada: fix leak of crtc structure
2018-01-04mailmap: update Mark Yao's email addressJeffy Chen1-0/+1
Change the previous employers email addresses to the current email address. Link: http://lkml.kernel.org/r/20171229121726.31589-1-jeffy.chen@rock-chips.com Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Acked-by: Martin Kepplinger <martink@posteo.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK failsAndrea Arcangeli1-2/+18
The previous fix in commit 384632e67e08 ("userfaultfd: non-cooperative: fix fork use after free") corrected the refcounting in case of UFFD_EVENT_FORK failure for the fork userfault paths. That still didn't clear the vma->vm_userfaultfd_ctx of the vmas that were set to point to the aborted new uffd ctx earlier in dup_userfaultfd. Link: http://lkml.kernel.org/r/20171223002505.593-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04mm/sparse.c: wrong allocation for mem_sectionBaoquan He1-1/+1
In commit 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y") mem_section is allocated at runtime to save memory. It allocates the first dimension of array with sizeof(struct mem_section). It costs extra memory, should be sizeof(struct mem_section *). Fix it. Link: http://lkml.kernel.org/r/1513932498-20350-1-git-send-email-bhe@redhat.com Fixes: 83e3c48729 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y") Signed-off-by: Baoquan He <bhe@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04mm/zsmalloc.c: include fs.hSergey Senozhatsky1-0/+1
`struct file_system_type' and alloc_anon_inode() function are defined in fs.h, include it directly. Link: http://lkml.kernel.org/r/20171219104219.3017-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04mm/debug.c: provide useful debugging information for VM_BUGMatthew Wilcox1-14/+14
With the recent addition of hashed kernel pointers, places which need to produce useful debug output have to specify %px, not %p. This patch fixes all the VM debug to use %px. This is appropriate because it's debug output that the user should never be able to trigger, and kernel developers need to see the actual pointers. Link: http://lkml.kernel.org/r/20171219133236.GE13680@bombadil.infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: "Tobin C. Harding" <me@tobin.cc> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04kernel/exit.c: export abort() to modulesAndrew Morton4-3/+1
gcc -fisolate-erroneous-paths-dereference can generate calls to abort() from modular code too. [arnd@arndb.de: drop duplicate exports of abort()] Link: http://lkml.kernel.org/r/20180102103311.706364-1-arnd@arndb.de Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04mm/mprotect: add a cond_resched() inside change_pmd_range()Anshuman Khandual1-2/+4
While testing on a large CPU system, detected the following RCU stall many times over the span of the workload. This problem is solved by adding a cond_resched() in the change_pmd_range() function. INFO: rcu_sched detected stalls on CPUs/tasks: 154-....: (670 ticks this GP) idle=022/140000000000000/0 softirq=2825/2825 fqs=612 (detected by 955, t=6002 jiffies, g=4486, c=4485, q=90864) Sending NMI from CPU 955 to CPUs 154: NMI backtrace for cpu 154 CPU: 154 PID: 147071 Comm: workload Not tainted 4.15.0-rc3+ #3 NIP: c0000000000b3f64 LR: c0000000000b33d4 CTR: 000000000000aa18 REGS: 00000000a4b0fb44 TRAP: 0501 Not tainted (4.15.0-rc3+) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22422082 XER: 00000000 CFAR: 00000000006cf8f0 SOFTE: 1 GPR00: 0010000000000000 c00003ef9b1cb8c0 c0000000010cc600 0000000000000000 GPR04: 8e0000018c32b200 40017b3858fd6e00 8e0000018c32b208 40017b3858fd6e00 GPR08: 8e0000018c32b210 40017b3858fd6e00 8e0000018c32b218 40017b3858fd6e00 GPR12: ffffffffffffffff c00000000fb25100 NIP [c0000000000b3f64] plpar_hcall9+0x44/0x7c LR [c0000000000b33d4] pSeries_lpar_flush_hash_range+0x384/0x420 Call Trace: flush_hash_range+0x48/0x100 __flush_tlb_pending+0x44/0xd0 hpte_need_flush+0x408/0x470 change_protection_range+0xaac/0xf10 change_prot_numa+0x30/0xb0 task_numa_work+0x2d0/0x3e0 task_work_run+0x130/0x190 do_notify_resume+0x118/0x120 ret_from_except_lite+0x70/0x74 Instruction dump: 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378 e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008 Link: http://lkml.kernel.org/r/20171214140551.5794-1-khandual@linux.vnet.ibm.com Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Suggested-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04kernel/acct.c: fix the acct->needcheck check in check_free_space()Oleg Nesterov1-1/+1
As Tsukada explains, the time_is_before_jiffies(acct->needcheck) check is very wrong, we need time_is_after_jiffies() to make sys_acct() work. Ignoring the overflows, the code should "goto out" if needcheck > jiffies, while currently it checks "needcheck < jiffies" and thus in the likely case check_free_space() does nothing until jiffies overflow. In particular this means that sys_acct() is simply broken, acct_on() sets acct->needcheck = jiffies and expects that check_free_space() should set acct->active = 1 after the free-space check, but this won't happen if jiffies increments in between. This was broken by commit 32dc73086015 ("get rid of timer in kern/acct.c") in 2011, then another (correct) commit 795a2f22a8ea ("acct() should honour the limits from the very beginning") made the problem more visible. Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com Fixes: 32dc73086015 ("get rid of timer in kern/acct.c") Reported-by: TSUKADA Koutaro <tsukada@ascade.co.jp> Suggested-by: TSUKADA Koutaro <tsukada@ascade.co.jp> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-04mm: check pfn_valid first in zero_resv_unavailDave Young1-0/+2
With latest kernel I get below bug while testing kdump: BUG: unable to handle kernel paging request at ffffea00034b1040 IP: zero_resv_unavail+0xbd/0x126 PGD 37b98067 P4D 37b98067 PUD 37b97067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1+ #316 Hardware name: LENOVO 20ARS1BJ02/20ARS1BJ02, BIOS GJET92WW (2.42 ) 03/03/2017 task: ffffffff81a0e4c0 task.stack: ffffffff81a00000 RIP: 0010:zero_resv_unavail+0xbd/0x126 RSP: 0000:ffffffff81a03d88 EFLAGS: 00010006 RAX: 0000000000000000 RBX: ffffea00034b1040 RCX: 0000000000000010 RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffffea00034b1040 RBP: 00000000000d2c41 R08: 00000000000000c0 R09: 0000000000000a0d R10: 0000000000000002 R11: 0000000000007f01 R12: ffffffff81a03d90 R13: ffffea0000000000 R14: 0000000000000063 R15: 0000000000000062 FS: 0000000000000000(0000) GS:ffffffff81c73000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffea00034b1040 CR3: 0000000037609000 CR4: 00000000000606b0 Call Trace: ? free_area_init_nodes+0x640/0x664 ? zone_sizes_init+0x58/0x72 ? setup_arch+0xb50/0xc6c ? start_kernel+0x64/0x43d ? secondary_startup_64+0xa5/0xb0 Code: c1 e8 0c 48 39 d8 76 27 48 89 de 48 c1 e3 06 48 c7 c7 7a 87 79 81 e8 b0 c0 3e ff 4c 01 eb b9 10 00 00 00 31 c0 48 89 df 49 ff c6 <f3> ab eb bc 6a 00 49 c7 c0 f0 93 d1 81 31 d2 83 ce ff 41 54 49 RIP: zero_resv_unavail+0xbd/0x126 RSP: ffffffff81a03d88 CR2: ffffea00034b1040 ---[ end trace f5ba9e8f73c7ee26 ]--- This is introduced by commit a4a3ede2132a ("mm: zero reserved and unavailable struct pages"). The reason is some efi reserved boot ranges is not reported in E820 ram. In my case it is a bgrt buffer: efi: mem00: [Boot Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x00000000d2c41000-0x00000000d2c85fff] (0MB) Use "add_efi_memmap" can workaround the problem with another fix: http://lkml.kernel.org/r/20171130052327.GA3500@dhcp-128-65.nay.redhat.com In zero_resv_unavail it would be better to check pfn_valid first before zero the page struct. This fixes the problem and potential other similar problems. Also as Pavel Tatashin suggested checks pfn_valid at the beginning of the section. The range is backed by real memory. The memory range is efi "Boot Service Data", that means after ExitBootServices() these ranges can be used as system ram. But some of them need to be reserved, for example the bgrt image address in an acpi table, if the image memory is freed then kexec reboot will fail because kexec inherit same acpi table to initialize the driver. Link: http://lkml.kernel.org/r/20171201095048.GA3084@dhcp-128-65.nay.redhat.com Fixes: a4a3ede2132a ("mm: zero reserved and unavailable struct pages") Signed-off-by: Dave Young <dyoung@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-05x86/tlb: Drop the _GPL from the cpu_tlbstate exportThomas Gleixner1-1/+1
The recent changes for PTI touch cpu_tlbstate from various tlb_flush inlines. cpu_tlbstate is exported as GPL symbol, so this causes a regression when building out of tree drivers for certain graphics cards. Aside of that the export was wrong since it was introduced as it should have been EXPORT_PER_CPU_SYMBOL_GPL(). Use the correct PER_CPU export and drop the _GPL to restore the previous state which allows users to utilize the cards they payed for. As always I'm really thrilled to make this kind of change to support the #friends (or however the hot hashtag of today is spelled) from that closet sauce graphics corp. Fixes: 1e02ce4cccdc ("x86: Store a per-cpu shadow copy of CR4") Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches") Reported-by: Kees Cook <keescook@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org
2018-01-05x86/events/intel/ds: Use the proper cache flush method for mapping ds buffersPeter Zijlstra1-0/+16
Thomas reported the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498 caller is native_flush_tlb_single+0x57/0xc0 native_flush_tlb_single+0x57/0xc0 __set_pte_vaddr+0x2d/0x40 set_pte_vaddr+0x2f/0x40 cea_set_pte+0x30/0x40 ds_update_cea.constprop.4+0x4d/0x70 reserve_ds_buffers+0x159/0x410 x86_reserve_hardware+0x150/0x160 x86_pmu_event_init+0x3e/0x1f0 perf_try_init_event+0x69/0x80 perf_event_alloc+0x652/0x740 SyS_perf_event_open+0x3f6/0xd60 do_syscall_64+0x5c/0x190 set_pte_vaddr is used to map the ds buffers into the cpu entry area, but there are two problems with that: 1) The resulting flush is not supposed to be called in preemptible context 2) The cpu entry area is supposed to be per CPU, but the debug store buffers are mapped for all CPUs so these mappings need to be flushed globally. Add the necessary preemption protection across the mapping code and flush TLBs globally. Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area") Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
2018-01-05x86/kaslr: Fix the vaddr_end messThomas Gleixner3-24/+22
vaddr_end for KASLR is only documented in the KASLR code itself and is adjusted depending on config options. So it's not surprising that a change of the memory layout causes KASLR to have the wrong vaddr_end. This can map arbitrary stuff into other areas causing hard to understand problems. Remove the whole ifdef magic and define the start of the cpu_entry_area to be the end of the KASLR vaddr range. Add documentation to that effect. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com>, Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-05Merge tag 'drm-intel-fixes-2018-01-04' of ↵Dave Airlie7-24/+62
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v4.15-rc7 - couple of documentation build fixes - serialize non-blocking modesets - prevent DMC from messing up GMBUS transfers - PSR regression fix * tag 'drm-intel-fixes-2018-01-04' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915: Apply Display WA #1183 on skl, kbl, and cfl docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c documentation/gpu/i915: fix docs build error after file rename drm/i915: Put all non-blocking modesets onto an ordered wq drm/i915: Disable DC states around GMBUS on GLK drm/i915/psr: Fix register name mess up.
2018-01-05Merge branch 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie5-6/+8
into drm-fixes - backport of a DC change which fixes a greenish tint on some RV hw - properly handle kzalloc fail in ttm * 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux: drm/ttm: check the return value of kzalloc drm/amd/display: call set csc_default if enable adjustment is false
2018-01-05Merge branch 'drm-armada-fixes-4.15' of ↵Dave Airlie3-39/+48
git://git.armlinux.org.uk/~rmk/linux-arm into drm-fixes Armada fixes. * 'drm-armada-fixes-4.15' of git://git.armlinux.org.uk/~rmk/linux-arm: drm/armada: fix YUV planar format framebuffer offsets drm/armada: improve efficiency of armada_drm_plane_calc_addrs() drm/armada: fix UV swap code drm/armada: fix SRAM powerdown drm/armada: fix leak of crtc structure
2018-01-05Merge tag 'omapdrm-4.15-fixes' of ↵Dave Airlie1-37/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-fixes omapdrm fixes for 4.15 * Fix OMAP4 HDMI CEC interrupt handling and a possible buffer overflow * tag 'omapdrm-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: omapdrm/dss/hdmi4_cec: fix interrupt handling
2018-01-04x86/mm: Map cpu_entry_area at the same place on 4/5 levelThomas Gleixner3-6/+7
There is no reason for 4 and 5 level pagetables to have a different layout. It just makes determining vaddr_end for KASLR harder than necessary. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com>, Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-04x86/mm: Set MODULES_END to 0xffffffffff000000Andrey Ryabinin2-5/+2
Since f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size") kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary. So passing page unaligned address to kasan_populate_zero_shadow() have two possible effects: 1) It may leave one page hole in supposed to be populated area. After commit 21506525fb8d ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that hole happens to be in the shadow covering fixmap area and leads to crash: BUG: unable to handle kernel paging request at fffffbffffe8ee04 RIP: 0010:check_memory_region+0x5c/0x190 Call Trace: <NMI> memcpy+0x1f/0x50 ghes_copy_tofrom_phys+0xab/0x180 ghes_read_estatus+0xfb/0x280 ghes_notify_nmi+0x2b2/0x410 nmi_handle+0x115/0x2c0 default_do_nmi+0x57/0x110 do_nmi+0xf8/0x150 end_repeat_nmi+0x1a/0x1e Note, the crash likely disappeared after commit 92a0f81d8957, which changed kasan_populate_zero_shadow() call the way it was before commit 21506525fb8d. 2) Attempt to load module near MODULES_END will fail, because __vmalloc_node_range() called from kasan_module_alloc() will hit the WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error. To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned which means that MODULES_END should be 8*PAGE_SIZE aligned. The whole point of commit f06bdd4001c2 was to move MODULES_END down if NR_CPUS is big, so the cpu_entry_area takes a lot of space. But since 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") the cpu_entry_area is no longer in fixmap, so we could just set MODULES_END to a fixed 8*PAGE_SIZE aligned address. Fixes: f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size") Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
2018-01-04Merge tag 'armsoc-fixes' of ↵Linus Torvalds27-54/+54
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Fixes this time include mostly device tree changes, as usual, the notable ones include: - A number of patches to fix most of the remaining DTC warnings that got introduced when DTC started warning about some obvious mistakes. We still have some remaining warnings that probably may have to wait until 4.16 to get fixed while we try to figure out what the correct contents should be. - On Allwinner A64, Ethernet PHYs need a fix after a mistake in coordination between patches merged through multiple branches. - Various fixes for PMICs on allwinner based boards - Two fixes for ethernet link detection on some Renesas machines - Two stability fixes for rockchip based boards Aside from device-tree, two other areas got fixes for older problems: - For TI Davinci DM365, a couple of fixes were needed to repair the MMC DMA engine support, apparently this has been broken for a while. - One important fix for all Allwinner chips with the PMIC driver as a loadable module" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits) arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property ARM: dts: tango4: remove bogus interrupt-controller property ARM: dts: ls1021a: fix incorrect clock references ARM: dts: aspeed-g4: Correct VUART IRQ number ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine ARM: dts: sun8i: a711: Reinstate the PMIC compatible ARM: davinci: fix mmc entries in dm365's dma_slave_map ARM: dts: da850-lego-ev3: Fix battery voltage gpio ARM: davinci: Add dma_mask to dm365's eDMA device ARM: davinci: Use platform_device_register_full() to create pdev for dm365's eDMA arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now arm64: dts: rockchip: remove vdd_log from rk3399-puma arm64: dts: orange-pi-zero-plus2: fix sdcard detect arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3 ARM: dts: sunxi: Convert to CCU index macros for HDMI controller sunxi-rsb: Include OF based modalias in device uevent ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850 arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts ...
2018-01-04arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoCMasahiro Yamada1-2/+2
This is probably a copy-paste mistake. The gpio-ranges of PXs3 is different from that of LD20. Fixes: 277b51e7050f ("arm64: dts: uniphier: add GPIO controller nodes") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-01-04Merge tag 'sunxi-fixes-for-4.15' of ↵Arnd Bergmann11-20/+16
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Pull "Allwinner fixes for 4.15" from Chen-Yu Tsai: First, one fix that adds proper regulator references for the EMAC external PHYs on A64 boards. The EMAC bindings were developed for 4.13, but reverted at the last minute. They were finalized and brought back for 4.15. However in the time between, regulator support for the A64 boards was merged. When EMAC device tree changes were reintroduced, this was not taken into account. Second, a patch that adds OF based modalias uevent for RSB slave devices. This has been missing since the introduction of RSB, and recently with PMIC regulator support introduced for the A64, has been seen affecting distributions, which have the all-important PMIC mfd drivers built as modules, which then don't get loaded. Other minor cleanups include final conversion of raw indices to CCU binding macros for sun[4567]i HDMI, cleanup of dummy regulators on the A64 SOPINE, a SD card detection polarity fix for the Orange Pi Zero Plus2, and adding a missing compatible for the PMIC on the TBS A711 tablet. * tag 'sunxi-fixes-for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: ARM: dts: sun8i: a711: Reinstate the PMIC compatible arm64: dts: orange-pi-zero-plus2: fix sdcard detect arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3 ARM: dts: sunxi: Convert to CCU index macros for HDMI controller sunxi-rsb: Include OF based modalias in device uevent arm64: allwinner: a64: add Ethernet PHY regulator for several boards
2018-01-04Merge tag 'renesas-fixes-for-v4.15' of ↵Arnd Bergmann2-2/+0
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Pull "Renesas ARM Based SoC Fixes for v4.15" from Simon Horman: Vladimir Zapolskiy says: The present change is a bug fix for AVB link iteratively up/down. Steps to reproduce: - start AVB TX stream (Using aplay via MSE), - disconnect+reconnect the eth cable, - after a reconnection the eth connection goes iteratively up/down without user interaction, - this may heal after some seconds or even stay for minutes. As the documentation specifies, the "renesas,no-ether-link" option should be used when a board does not provide a proper AVB_LINK signal. There is no need for this option enabled on RCAR H3/M3 Salvator-X/XS and ULCB starter kits since the AVB_LINK is correctly handled by HW. Choosing to keep or remove the "renesas,no-ether-link" option will have impact on the code flow in the following ways: - keeping this option enabled may lead to unexpected behavior since the RX & TX are enabled/disabled directly from adjust_link function without any HW interrogation, - removing this option, the RX & TX will only be enabled/disabled after HW interrogation. The HW check is made through the LMON pin in PSR register which specifies AVB_LINK signal value (0 - at low level; 1 - at high level). In conclusion, the change is also a safety improvement because it removes the "renesas,no-ether-link" option leading to a proper way of detecting the link state based on HW interrogation and not on software heuristic. Note that DTS files for V3M Starter Kit, Draak and Eagle boards contain the same property, the files are untouched due to unavailable schematics to verify if the fix applies to these boards as well. * tag 'renesas-fixes-for-v4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
2018-01-04drm/i915: Apply Display WA #1183 on skl, kbl, and cflLucas De Marchi3-9/+38
Display WA #1183 was recently added to workaround "Failures when enabling DPLL0 with eDP link rate 2.16 or 4.32 GHz and CD clock frequency 308.57 or 617.14 MHz (CDCLK_CTL CD Frequency Select 10b or 11b) used in this enabling or in previous enabling." This workaround was designed to minimize the impact only to save the bad case with that link rates. But HW engineers indicated that it should be safe to apply broadly, although they were expecting the DPLL0 link rate to be unchanged on runtime. We need to cover 2 cases: when we are in fact enabling DPLL0 and when we are just changing the frequency with small differences. This is based on previous patch by Rodrigo Vivi with suggestions from Ville Syrjälä. Cc: Arthur J Runyan <arthur.j.runyan@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171204232210.4958-1-lucas.demarchi@intel.com (cherry picked from commit 53421c2fe99ce16838639ad89d772d914a119a49) [ Lucas: Backport to 4.15 adding back variable that has been removed on commits not meant to be backported ] Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180102201837.6812-1-lucas.demarchi@intel.com
2018-01-03Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds8-25/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation fixes from Thomas Gleixner: "A couple of urgent fixes for PTI: - Fix a PTE mismatch between user and kernel visible mapping of the cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch MCE on older AMD K8 machines - Fix the misplaced CR3 switch in the SYSCALL compat entry code which causes access to unmapped kernel memory resulting in double faults. - Fix the section mismatch of the cpu_tss_rw percpu storage caused by using a different mechanism for declaration and definition. - Two fixes for dumpstack which help to decode entry stack issues better - Enable PTI by default in Kconfig. We should have done that earlier, but it slipped through the cracks. - Exclude AMD from the PTI enforcement. Not necessarily a fix, but if AMD is so confident that they are not affected, then we should not burden users with the overhead" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/process: Define cpu_tss_rw in same section as declaration x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat() x86/dumpstack: Print registers for first stack frame x86/dumpstack: Fix partial register dumps x86/pti: Make sure the user/kernel PTEs match x86/cpu, x86/pti: Do not enable PTI on AMD processors x86/pti: Enable PTI by default
2018-01-03x86/process: Define cpu_tss_rw in same section as declarationNick Desaulniers1-1/+1
cpu_tss_rw is declared with DECLARE_PER_CPU_PAGE_ALIGNED but then defined with DEFINE_PER_CPU_SHARED_ALIGNED leading to section mismatch warnings. Use DEFINE_PER_CPU_PAGE_ALIGNED consistently. This is necessary because it's mapped to the cpu entry area and must be page aligned. [ tglx: Massaged changelog a bit ] Fixes: 1a935bc3d4ea ("x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: thomas.lendacky@amd.com Cc: Borislav Petkov <bpetkov@suse.de> Cc: tklauser@distanz.ch Cc: minipli@googlemail.com Cc: me@kylehuey.com Cc: namit@vmware.com Cc: luto@kernel.org Cc: jpoimboe@redhat.com Cc: tj@kernel.org Cc: cl@linux.com Cc: bp@suse.de Cc: thgarnie@google.com Cc: kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180103203954.183360-1-ndesaulniers@google.com
2018-01-03x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()Thomas Gleixner1-7/+6
The preparation for PTI which added CR3 switching to the entry code misplaced the CR3 switch in entry_SYSCALL_compat(). With PTI enabled the entry code tries to access a per cpu variable after switching to kernel GS. This fails because that variable is not mapped to user space. This results in a double fault and in the worst case a kernel crash. Move the switch ahead of the access and clobber RSP which has been saved already. Fixes: 8a09317b895f ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching") Reported-by: Lars Wendler <wendler.lars@web.de> Reported-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Betkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org>, Cc: Dave Hansen <dave.hansen@linux.intel.com>, Cc: Peter Zijlstra <peterz@infradead.org>, Cc: Greg KH <gregkh@linuxfoundation.org>, , Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>, Cc: Juergen Gross <jgross@suse.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031949200.1957@nanos
2018-01-03Merge branch 'for-linus' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull pid allocation bug fix from Eric Biederman: "The replacement of the pid hash table and the pid bitmap with an idr resulted in an implementation that now fails more often in low memory situations. Allowing fuzzers to observe bad behavior from a memory allocation failure during pid allocation. This is a small change to fix this by making the kernel more robust in the case of error. The non-error paths are left alone so the only danger is to the already broken error path. I have manually injected errors and verified that this new error handling works" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: pid: Handle failure to allocate the first pid in a pid namespace
2018-01-03Merge branch 'afs-fixes' of ↵Linus Torvalds5-13/+40
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs/fscache fixes from David Howells: - Fix the default return of fscache_maybe_release_page() when a cache isn't in use - it prevents a filesystem from releasing pages. This can cause a system to OOM. - Fix a potential uninitialised variable in AFS. - Fix AFS unlink's handling of the nlink count. It needs to use the nlink manipulation functions so that inode structs of deleted inodes actually get scheduled for destruction. - Fix error handling in afs_write_end() so that the page gets unlocked and put if we can't fill the unwritten portion. * 'afs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix missing error handling in afs_write_end() afs: Fix unlink afs: Potential uninitialized variable in afs_extract_data() fscache: Fix the default for fscache_maybe_release_page()
2018-01-03Merge branch 'for-linus' of ↵Linus Torvalds1-12/+9
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull capabilities fix from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: capabilities: fix buffer overread on very short xattr
2018-01-03exec: Weaken dumpability for secureexecKees Cook1-2/+7
This is a logical revert of commit e37fdb785a5f ("exec: Use secureexec for setting dumpability") This weakens dumpability back to checking only for uid/gid changes in current (which is useless), but userspace depends on dumpability not being tied to secureexec. https://bugzilla.redhat.com/show_bug.cgi?id=1528633 Reported-by: Tom Horsley <horsley1953@gmail.com> Fixes: e37fdb785a5f ("exec: Use secureexec for setting dumpability") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-03x86/dumpstack: Print registers for first stack frameJosh Poimboeuf1-1/+2
In the stack dump code, if the frame after the starting pt_regs is also a regs frame, the registers don't get printed. Fix that. Reported-by: Andy Lutomirski <luto@amacapital.net> Tested-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@vger.kernel.org Fixes: 3b3fa11bc700 ("x86/dumpstack: Print any pt_regs found on the stack") Link: http://lkml.kernel.org/r/396f84491d2f0ef64eda4217a2165f5712f6a115.1514736742.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/dumpstack: Fix partial register dumpsJosh Poimboeuf3-13/+34
The show_regs_safe() logic is wrong. When there's an iret stack frame, it prints the entire pt_regs -- most of which is random stack data -- instead of just the five registers at the end. show_regs_safe() is also poorly named: the on_stack() checks aren't for safety. Rename the function to show_regs_if_on_stack() and add a comment to explain why the checks are needed. These issues were introduced with the "partial register dump" feature of the following commit: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully") That patch had gone through a few iterations of development, and the above issues were artifacts from a previous iteration of the patch where 'regs' pointed directly to the iret frame rather than to the (partially empty) pt_regs. Tested-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@vger.kernel.org Fixes: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully") Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/pti: Make sure the user/kernel PTEs matchThomas Gleixner1-1/+2
Meelis reported that his K8 Athlon64 emits MCE warnings when PTI is enabled: [Hardware Error]: Error Addr: 0x0000ffff81e000e0 [Hardware Error]: MC1 Error: L1 TLB multimatch. [Hardware Error]: cache level: L1, tx: INSN The address is in the entry area, which is mapped into kernel _AND_ user space. That's special because we switch CR3 while we are executing there. User mapping: 0xffffffff81e00000-0xffffffff82000000 2M ro PSE GLB x pmd Kernel mapping: 0xffffffff81000000-0xffffffff82000000 16M ro PSE x pmd So the K8 is complaining that the TLB entries differ. They differ in the GLB bit. Drop the GLB bit when installing the user shared mapping. Fixes: 6dc72c3cbca0 ("x86/mm/pti: Share entry text PMD") Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Meelis Roos <mroos@linux.ee> Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031407180.1957@nanos
2018-01-03x86/cpu, x86/pti: Do not enable PTI on AMD processorsTom Lendacky1-2/+2
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault. Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdoffice.net
2018-01-03x86/pti: Enable PTI by defaultThomas Gleixner1-0/+1
This really want's to be enabled by default. Users who know what they are doing can disable it either in the config or on the kernel command line. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2018-01-03MAINTAINERS: Remove Matt Fleming as EFI co-maintainerMatt Fleming1-4/+3
Instate Ard Biesheuvel as the sole EFI maintainer and leave other folks as maintainers for the EFI test driver and efivarfs file system. Also add Ard Biesheuvel as the EFI test driver and efivarfs maintainer. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ivan Hu <ivan.hu@canonical.com> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180103094417.6353-1-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03efi/capsule-loader: Reinstate virtual capsule mappingArd Biesheuvel3-10/+52
Commit: 82c3768b8d68 ("efi/capsule-loader: Use a cached copy of the capsule header") ... refactored the capsule loading code that maps the capsule header, to avoid having to map it several times. However, as it turns out, the vmap() call we ended up removing did not just map the header, but the entire capsule image, and dropping this virtual mapping breaks capsules that are processed by the firmware immediately (i.e., without a reboot). Unfortunately, that change was part of a larger refactor that allowed a quirk to be implemented for Quark, which has a non-standard memory layout for capsules, and we have slightly painted ourselves into a corner by allowing quirk code to mangle the capsule header and memory layout. So we need to fix this without breaking Quark. Fortunately, Quark does not appear to care about the virtual mapping, and so we can simply do a partial revert of commit: 2a457fb31df6 ("efi/capsule-loader: Use page addresses rather than struct page pointers") ... and create a vmap() mapping of the entire capsule (including header) based on the reinstated struct page array, unless running on Quark, in which case we pass the capsule header copy as before. Reported-by: Ge Song <ge.song@hxt-semitech.com> Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Tested-by: Ge Song <ge.song@hxt-semitech.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: 82c3768b8d68 ("efi/capsule-loader: Use a cached copy of the capsule header") Link: http://lkml.kernel.org/r/20180102172110.17018-3-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/efi: Fix kernel param add_efi_memmap regressionDave Young1-3/+2
'add_efi_memmap' is an early param, but do_add_efi_memmap() has no chance to run because the code path is before parse_early_param(). I believe it worked when the param was introduced but probably later some other changes caused the wrong order and nobody noticed it. Move efi_memblock_x86_reserve_range() after parse_early_param() to fix it. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Ge Song <ge.song@hxt-semitech.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180102172110.17018-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-02drm/ttm: check the return value of kzallocXiongwei Song1-0/+2
In the function ttm_page_alloc_init, kzalloc call is made for variable _manager, we need to check its return value, it may return NULL. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-02drm/amd/display: call set csc_default if enable adjustment is falseYue Hin Lau4-6/+6
Fixes a greenish tint on RV displays. Signed-off-by: Yue Hin Lau <Yuehin.Lau@amd.com> Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [drake@endlessm.com: backport to 4.15] Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-02xfs: fix s_maxbytes overflow problemsDarrick J. Wong2-3/+3
Fix some integer overflow problems if offset + count happen to be large enough to cause an integer overflow. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-02xfs: quota: check result of register_shrinker()Aliaksei Karaliou1-16/+29
xfs_qm_init_quotainfo() does not check result of register_shrinker() which was tagged as __must_check recently, reported by sparse. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> [darrick: move xfs_qm_destroy_quotainos nearer xfs_qm_init_quotainos] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-02xfs: quota: fix missed destroy of qi_tree_lockAliaksei Karaliou1-0/+1
xfs_qm_destroy_quotainfo() does not destroy quotainfo->qi_tree_lock while destroys quotainfo->qi_quotaofflock. Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-02btrfs: fix refcount_t usage when deleting btrfs_delayed_nodesChris Mason1-11/+34
refcounts have a generic implementation and an asm optimized one. The generic version has extra debugging to make sure that once a refcount goes to zero, refcount_inc won't increase it. The btrfs delayed inode code wasn't expecting this, and we're tripping over the warnings when the generic refcounts are used. We ended up with this race: Process A Process B btrfs_get_delayed_node() spin_lock(root->inode_lock) radix_tree_lookup() __btrfs_release_delayed_node() refcount_dec_and_test(&delayed_node->refs) our refcount is now zero refcount_add(2) <--- warning here, refcount unchanged spin_lock(root->inode_lock) radix_tree_delete() With the generic refcounts, we actually warn again when process B above tries to release his refcount because refcount_add() turned into a no-op. We saw this in production on older kernels without the asm optimized refcounts. The fix used here is to use refcount_inc_not_zero() to detect when the object is in the middle of being freed and return NULL. This is almost always the right answer anyway, since we usually end up pitching the delayed_node if it didn't have fresh data in it. This also changes __btrfs_release_delayed_node() to remove the extra check for zero refcounts before radix tree deletion. btrfs_get_delayed_node() was the only path that was allowing refcounts to go from zero to one. Fixes: 6de5f18e7b0da ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node") CC: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-02btrfs: Fix flush bio leakNikolay Borisov1-1/+0
Commit e0ae99941423 ("btrfs: preallocate device flush bio") reworked the way the flush bio is allocated and used. Concretely it allocates the bio in __alloc_device and then re-uses it multiple times with a very simple endio routine that just calls complete() without consuming a reference. Allocated bios by default come with a ref count of 1, which is then consumed by the endio routine (or not, in which case they should be bio_put by the caller). The way the impleementation works now is that the flush bio has a refcount of 2 and we only ever bio_put it once, leaving it to hang indefinitely. Fix this by removing the extra bio_get in __alloc_device. Fixes: e0ae99941423 ("btrfs: preallocate device flush bio") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-02docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.cMarkus Heiser1-3/+0
With commit d9e2e0143c the 'GuC-specific firmware loader' doc section was removed from intel_guc_loader.c without a replacement. So lets remove it from the Kernel-doc:: .. kernel-doc:: drivers/gpu/drm/i915/intel_guc_loader.c :doc: GuC-specific firmware loader With commit e8668bbcb0 intel_guc_loader.c was renamed to to intel_guc_fw.c and to name just one, intel_guc_init_hw() was renamed to intel_guc_fw_upload(). Since we get errors in the Sphinx build like: - Error: Cannot open file ./drivers/gpu/drm/i915/intel_guc_loader.c Change the kernel-doc directive from intel_guc_loader.c to intel_guc_fw.c Signed-off-by: Markus Heiser <markus.heiser@darmarit.de> [danvet: Rebase onto the partial fix 006c23327f8d ("documentation/gpu/i915: fix docs build error after file rename")] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1513078717-12373-1-git-send-email-markus.heiser@darmarit.de (cherry picked from commit 0132a1a5d44d2cd32a249dbe999a88c2134a6bd1) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-01-02xen/pvcalls: use GFP_ATOMIC under spin lockWei Yongjun1-1/+1
A spin lock is taken here so we should use GFP_ATOMIC. Fixes: 9774c6cca266 ("xen/pvcalls: implement accept command") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-01-02s390/dasd: fix wrongly assigned configuration dataStefan Haberland1-0/+10
We store per path and per device configuration data to identify the path or device correctly. The per path configuration data might get mixed up if the original request gets into error recovery and is started with a random path mask. This would lead to a wrong identification of a path in case of a CUIR event for example. Fix by copying the path mask from the original request to the error recovery request in case it is a path verification request. Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-01-02omapdrm/dss/hdmi4_cec: fix interrupt handlingHans Verkuil1-37/+9
The omap4 CEC hardware cannot tell a Nack from a Low Drive from an Arbitration Lost error, so just report a Nack, which is almost certainly the reason for the error anyway. This also simplifies the implementation. The only three interrupts that need to be enabled are: Transmit Buffer Full/Empty Change event: triggered when the transmit finished successfully and cleared the buffer. Receiver FIFO Not Empty event: triggered when a message was received. Frame Retransmit Count Exceeded event: triggered when a transmit failed repeatedly, usually due to the message being Nacked. Other reasons are possible (Low Drive, Arbitration Lost) but there is no way to know. If this happens the TX buffer needs to be cleared manually. While testing various error conditions I noticed that the hardware can receive messages up to 18 bytes in total, which exceeds the legal maximum of 16. This could cause a buffer overflow, so we check for this and constrain the size to 16 bytes. The old incorrect interrupt handler could cause the CEC framework to enter into a bad state because it mis-detected the "Start Bit Irregularity event" as an ARB_LOST transmit error when it actually is a receive error which should be ignored. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reported-by: Henrik Austad <haustad@cisco.com> Tested-by: Henrik Austad <haustad@cisco.com> Tested-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2018-01-02documentation/gpu/i915: fix docs build error after file renameRandy Dunlap1-2/+2
Fix documentation build errors after intel_guc_loader.c was renamed to intel_guc_fw.c. Error: Cannot open file ../drivers/gpu/drm/i915/intel_guc_loader.c WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -function GuC-specific firmware loader ../drivers/gpu/drm/i915/intel_guc_loader.c' failed with return code 1 Error: Cannot open file ../drivers/gpu/drm/i915/intel_guc_loader.c Error: Cannot open file ../drivers/gpu/drm/i915/intel_guc_loader.c WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -internal ../drivers/gpu/drm/i915/intel_guc_loader.c' failed with return code 2 Fixes: e8668bbcb0f9 ("drm/i915/guc: Rename intel_guc_loader.c to intel_guc_fw.c") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1b214f53-47f5-bef3-f58e-8136de5678ed@infradead.org (cherry picked from commit 006c23327f8de8575508c458131b304188d426f7) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-01-02drm/i915: Put all non-blocking modesets onto an ordered wqVille Syrjälä2-3/+14
We have plenty of global registers and whatnot programmed without any further locking by the modeset code. Currently non-bocking modesets are allowed to execute in parallel which could corrupt said registers. To avoid the problem let's run all non-blocking modesets on an ordered workqueue. We still put page flips etc. to system_unbound_wq allowing page flips on one pipe to execute in parallel with page flips or a modeset on a another pipe (assuming no known state is shared between them, at which point they would have been added to the same atomic commit and serialized that way). Blocking modesets are already serialized with each other by connection_mutex, and thus are safe. To serialize them with non-blocking modesets we just flush the workqueue before executing blocking modesets. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: 94f050246b42 ("drm/i915: nonblocking commit") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171113133622.8593-1-ville.syrjala@linux.intel.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit 757fffcfdffb6c0dd46c1b264091c36b4e5a86ae) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-01-02drm/i915: Disable DC states around GMBUS on GLKVille Syrjälä1-0/+1
Prevent the DMC from destroying GMBUS transfers on GLK. GMBUS lives in PG1 so DC off is all we need. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171208213739.16388-1-ville.syrjala@linux.intel.com Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> (cherry picked from commit 156961ae7bdf6feb72778e8da83d321b273343fd) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-01-02drm/i915/psr: Fix register name mess up.Dhinakaran Pandiyan1-8/+8
Commit 77affa31722b ("drm/i915/psr: Fix compiler warnings for hsw_psr_disable()") swapped status and control registers while fixing indentation. The _ctl at the end of the status register name must have to led to this. Fixes: 77affa31722b ("drm/i915/psr: Fix compiler warnings for hsw_psr_disable()") References: https://www.mrc-cbu.cam.ac.uk/people/matt.davis/cmabridge/ Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171220043520.2599-1-dhinakaran.pandiyan@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 14c6547d6df641d3e41fa4f4164f6e267ebfab89) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2018-01-02afs: Fix missing error handling in afs_write_end()David Howells1-3/+5
afs_write_end() is missing page unlock and put if afs_fill_page() fails. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com>
2018-01-02afs: Fix unlinkDavid Howells2-8/+33
Repeating creation and deletion of a file on an afs mount will run the box out of memory, e.g.: dd if=/dev/zero of=/afs/scratch/m0 bs=$((1024*1024)) count=512 rm /afs/scratch/m0 The problem seems to be that it's not properly decrementing the nlink count so that the inode can be scrapped. Note that this doesn't fix local creation followed by remote deletion. That's harder to handle and will require a separate patch as we're not told that the file has been deleted - only that the directory has changed. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-01-02afs: Potential uninitialized variable in afs_extract_data()Dan Carpenter1-1/+1
Smatch warns that: fs/afs/rxrpc.c:922 afs_extract_data() error: uninitialized symbol 'remote_abort'. Smatch is right that "remote_abort" might be uninitialized when we pass it to afs_set_call_complete(). I don't know if that function uses the uninitialized variable. Anyway, the comment for rxrpc_kernel_recv_data(), says that "*_abort should also be initialised to 0." and this patch does that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>
2018-01-02fscache: Fix the default for fscache_maybe_release_page()David Howells1-1/+1
Fix the default for fscache_maybe_release_page() for when the cookie isn't valid or the page isn't cached. It mustn't return false as that indicates the page cannot yet be freed. The problem with the default is that if, say, there's no cache, but a network filesystem's pages are using up almost all the available memory, a system can OOM because the filesystem ->releasepage() op will not allow them to be released as fscache_maybe_release_page() incorrectly prevents it. This can be tested by writing a sequence of 512MiB files to an AFS mount. It does not affect NFS or CIFS because both of those wrap the call in a check of PG_fscache and it shouldn't bother Ceph as that only has PG_private set whilst writeback is in progress. This might be an issue for 9P, however. Note that the pages aren't entirely stuck. Removing a file or unmounting will clear things because that uses ->invalidatepage() instead. Fixes: 201a15428bd5 ("FS-Cache: Handle pages pending storage that get evicted under OOM conditions") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: stable@vger.kernel.org # 2.6.32+
2018-01-02capabilities: fix buffer overread on very short xattrEric Biggers1-12/+9
If userspace attempted to set a "security.capability" xattr shorter than 4 bytes (e.g. 'setfattr -n security.capability -v x file'), then cap_convert_nscap() read past the end of the buffer containing the xattr value because it accessed the ->magic_etc field without verifying that the xattr value is long enough to contain that field. Fix it by validating the xattr value size first. This bug was found using syzkaller with KASAN. The KASAN report was as follows (cleaned up slightly): BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498 Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852 CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xe3/0x195 lib/dump_stack.c:53 print_address_description+0x73/0x260 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x235/0x350 mm/kasan/report.c:409 cap_convert_nscap+0x514/0x630 security/commoncap.c:498 setxattr+0x2bd/0x350 fs/xattr.c:446 path_setxattr+0x168/0x1b0 fs/xattr.c:472 SYSC_setxattr fs/xattr.c:487 [inline] SyS_setxattr+0x36/0x50 fs/xattr.c:483 entry_SYSCALL_64_fastpath+0x18/0x85 Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities") Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-12-31Linux 4.15-rc6v4.15-rc6Linus Torvalds1-1/+1
2017-12-31Merge branch 'x86/urgent' of ↵Linus Torvalds14-39/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A couple of fixlets for x86: - Fix the ESPFIX double fault handling for 5-level pagetables - Fix the commandline parsing for 'apic=' on 32bit systems and update documentation - Make zombie stack traces reliable - Fix kexec with stack canary - Fix the delivery mode for APICs which was missed when the x86 vector management was converted to single target delivery. Caused a regression due to the broken hardware which ignores affinity settings in lowest prio delivery mode. - Unbreak modules when AMD memory encryption is enabled - Remove an unused parameter of prepare_switch_to" * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Switch all APICs to Fixed delivery mode x86/apic: Update the 'apic=' description of setting APIC driver x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR) x86: Remove unused parameter of prepare_switch_to x86/stacktrace: Make zombie stack traces reliable x86/mm: Unbreak modules that use the DMA API x86/build: Make isoimage work on Debian x86/espfix/64: Fix espfix double-fault handling on 5-level systems
2017-12-31Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds3-16/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation fixes from Thomas Gleixner: "Four patches addressing the PTI fallout as discussed and debugged yesterday: - Remove stale and pointless TLB flush invocations from the hotplug code - Remove stale preempt_disable/enable from __native_flush_tlb() - Plug the memory leak in the write_ldt() error path" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Make LDT pgtable free conditional x86/ldt: Plug memory leak in error path x86/mm: Remove preempt_disable/enable() from __native_flush_tlb() x86/smpboot: Remove stale TLB flush invocations
2017-12-31Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds6-20/+52
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A pile of fixes for long standing issues with the timer wheel and the NOHZ code: - Prevent timer base confusion accross the nohz switch, which can cause unlocked access and data corruption - Reinitialize the stale base clock on cpu hotplug to prevent subtle side effects including rollovers on 32bit - Prevent an interrupt storm when the timer softirq is already pending caused by tick_nohz_stop_sched_tick() - Move the timer start tracepoint to a place where it actually makes sense - Add documentation to timerqueue functions as they caused confusion several times now" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timerqueue: Document return values of timerqueue_add/del() timers: Invoke timer_start_debug() where it makes sense nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() timers: Reinitialize per cpu bases on hotplug timers: Use deferrable base independent of base::nohz_active
2017-12-31Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp fixlet from Thomas Gleixner: "A trivial build warning fix for newer compilers" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Move inline keyword at the beginning of declaration
2017-12-31Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds4-3/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Three patches addressing the fallout of the CPU_ISOLATION changes especially with NO_HZ_FULL plus documentation of boot parameter dependency" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
2017-12-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds9-43/+190
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: - plug a memory leak in the intel pmu init code - clang fixes - tooling fix to avoid including kernel headers - a fix for jvmti to generate correct debug information for inlined code - replace backtick with a regular shell function - fix the build in hardened environments * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Plug memory leak in intel_pmu_init() x86/asm: Allow again using asm.h when building for the 'bpf' clang target tools arch s390: Do not include header files from the kernel sources perf jvmti: Generate correct debug information for inlined code perf tools: Fix up build in hardened environments perf tools: Use shell function for perl cflags retrieval
2017-12-31Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds28-72/+196
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A rather large update after the kaisered maintainer finally found time to handle regression reports. - The larger part addresses a regression caused by the x86 vector management rework. The reservation based model does not work reliably for MSI interrupts, if they cannot be masked (yes, yet another hw engineering trainwreck). The reason is that the reservation mode assigns a dummy vector when the interrupt is allocated and switches to a real vector when the interrupt is requested. If the MSI entry cannot be masked then the initialization might raise an interrupt before the interrupt is requested, which ends up as spurious interrupt and causes device malfunction and worse. The fix is to exclude MSI interrupts which do not support masking from reservation mode and assign a real vector right away. - Extend the extra lockdep class setup for nested interrupts with a class for the recently added irq_desc::request_mutex so lockdep can differeniate and does not emit false positive warnings. - A ratelimit guard for the bad irq printout so in case a bad irq comes back immediately the system does not drown in dmesg spam" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI genirq/irqdomain: Rename early argument of irq_domain_activate_irq() x86/vector: Use IRQD_CAN_RESERVE flag genirq: Introduce IRQD_CAN_RESERVE flag genirq/msi: Handle reactivation only on success gpio: brcmstb: Make really use of the new lockdep class genirq: Guard handle_bad_irq log messages kernel/irq: Extend lockdep class for request mutex
2017-12-31Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds3-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Thomas Gleixner: "Three fixlets for objtool: - Address two segfaults related to missing parameter and clang objects - Make it compile clean with clang" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix seg fault with clang-compiled objects objtool: Fix seg fault caused by missing parameter objtool: Fix Clang enum conversion warning
2017-12-31Merge tag 'char-misc-4.15-rc6' of ↵Linus Torvalds6-17/+38
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are six small fixes of some of the char/misc drivers that have been sent in to resolve reported issues. Nothing major, a binder use-after-free fix, some thunderbolt bugfixes, a hyper-v bugfix, and an nvmem driver fix. All of these have been in linux-next with no reported issues for a while" * tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: nvmem: meson-mx-efuse: fix reading from an offset other than 0 binder: fix proc->files use-after-free vmbus: unregister device_obj->channels_kset thunderbolt: Mask ring interrupt properly when polling starts MAINTAINERS: Add thunderbolt.rst to the Thunderbolt driver entry thunderbolt: Make pathname to force_power shorter
2017-12-31Merge tag 'driver-core-4.15-rc6' of ↵Linus Torvalds2-4/+25
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core fixes for 4.15-rc6, resolving some reported issues. The first is a cacheinfo fix for DT based systems to resolve a reported issue that has been around for a while, and the other is to resolve a regression in the kobject uevent code that showed up in 4.15-rc1. Both have been in linux-next for a while with no reported issues" * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kobject: fix suppressing modalias in uevents delivered over netlink drivers: base: cacheinfo: fix cache type for non-architected system cache
2017-12-31Merge tag 'staging-4.15-rc6' of ↵Linus Torvalds4-20/+24
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging fixes from Greg KH: "Here are three staging driver fixes for 4.15-rc6 The first resolves a bug in the lustre driver that came about due to a broken cleanup patch, due to crazy list usage in that codebase. The remaining two are ion driver fixes, finally getting the CMA interaction to work properly, resolving two regressions in that area of the code. All have been in linux-next with no reported issues for a while" * tag 'staging-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device staging: ion: Fix ion_cma_heap allocations staging: lustre: lnet: Fix recent breakage from list_for_each conversion
2017-12-31Merge tag 'tty-4.15-rc6' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull TTY fix from Greg KH: "Here is a single tty fix for a reported issue that you wrote the patch for :) It's been in linux-next for a week or so with no reported issues" * tag 'tty-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
2017-12-31Merge tag 'usb-4.15-rc6' of ↵Linus Torvalds23-104/+117
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY fixes from Greg KH: "Here are a number of small USB and PHY driver fixes for 4.15-rc6. Nothing major, but there are a number of regression fixes in here that resolve issues that have been reported a bunch. There are also the usual xhci fixes as well as a number of new usb serial device ids. All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201 xhci: Fix use-after-free in xhci debugfs xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate USB: serial: ftdi_sio: add id for Airbus DS P8GR usb: Add device quirk for Logitech HD Pro Webcam C925e usb: add RESET_RESUME for ELSA MicroLink 56K usbip: fix usbip bind writing random string after command in match_busid usbip: stub_rx: fix static checker warning on unnecessary checks usbip: prevent leaking socket pointer address in messages usbip: stub: stop printing kernel pointer addresses in messages usbip: vhci: stop printing kernel pointer addresses in messages USB: Fix off by one in type-specific length check of BOS SSP capability USB: serial: option: adding support for YUGA CLM920-NC5 phy: rcar-gen3-usb2: select USB_COMMON phy: rockchip-typec: add pm_runtime_disable in err case phy: cpcap-usb: Fix platform_get_irq_byname's error checking. phy: tegra: fix device-tree node lookups USB: serial: qcserial: add Sierra Wireless EM7565 USB: serial: option: add support for Telit ME910 PID 0x1101 USB: chipidea: msm: fix ulpi-node lookup
2017-12-31MAINTAINERS: mark arch/blackfin/ and its gubbins as orphanedAdam Borowski1-9/+7
The blackfin architecture has seen no maintainer action of any kind since April 2015. No new code, no pull requests, no acks to patches, no response to mails, nothing. The web site has an expired certificate (expiration Sep 2017, issued in 2013), the mailing list sees no answers either, with one exception: https://sourceforge.net/p/adi-buildroot/mailman/adi-buildroot-devel/ > > Steven is no longer working on this for ADI. Acked by me if this works. Thanks. > > Best regards, > Aaron Wu > Analog Devices Inc. But, Aaron doesn't seem to respond to queries either. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Acked-by: Linus Walleij <linus.walleij@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds1-2/+2
Pull sparc bugfix from David Miller. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: repair calling incorrect hweight function from stubs
2017-12-31x86/ldt: Make LDT pgtable free conditionalThomas Gleixner1-1/+2
Andy prefers to be paranoid about the pagetable free in the error path of write_ldt(). Make it conditional and warn whenever the installment of a secondary LDT fails. Requested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-12-31x86/ldt: Plug memory leak in error pathThomas Gleixner1-1/+7
The error path in write_ldt() tries to free 'old_ldt' instead of the newly allocated 'new_ldt', resulting in a memory leak. It also misses to clean up a half populated LDT pagetable, which is not a leak as it gets cleaned up when the process exits. Free both the potentially half populated LDT pagetable and the newly allocated LDT struct. This can be done unconditionally because once an LDT is mapped subsequent maps will succeed, because the PTE page is already populated and the two LDTs fit into that single page. Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()Thomas Gleixner1-6/+8
The preempt_disable/enable() pair in __native_flush_tlb() was added in commit: 5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write") ... to protect the UP variant of flush_tlb_mm_range(). That preempt_disable/enable() pair should have been added to the UP variant of flush_tlb_mm_range() instead. The UP variant was removed with commit: ce4a4e565f52 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code") ... but the preempt_disable/enable() pair stayed around. The latest change to __native_flush_tlb() in commit: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches") ... added an access to a per CPU variable outside the preempt disabled regions, which makes no sense at all. __native_flush_tlb() must always be called with at least preemption disabled. Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch bad callers independent of the smp_processor_id() debugging. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31x86/smpboot: Remove stale TLB flush invocationsThomas Gleixner1-9/+0
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector() invoke local_flush_tlb() for no obvious reason. Digging in history revealed that the original code in the 2.1 era added those because the code manipulated a swapper_pg_dir pagetable entry. The pagetable manipulation was removed long ago in the 2.3 timeframe, but the TLB flush invocations stayed around forever. Remove them along with the pointless pr_debug()s which come from the same 2.1 change. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-30Merge tag 'scsi-fixes' of ↵Linus Torvalds2-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two simple fixes, both of which cause I/O hangs. The storvsc one is from the hyper-v which can hang under certain hot add/remove conditions and the other is generally, where removing a target and a device in close proximity can result in the release method being executed twice (and subsequent list and other corruption and an eventual panic)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error scsi: core: check for device state in __scsi_remove_target()
2017-12-30objtool: Fix seg fault with clang-compiled objectsSimon Ser1-0/+2
Fix a seg fault which happens when an input file provided to 'objtool orc generate' doesn't have a '.shstrtab' section (for instance, object files produced by clang don't have this section). Signed-off-by: Simon Ser <contact@emersion.fr> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/c0f2231683e9bed40fac1f13ce2c33b8389854bc.1514666459.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-30objtool: Fix seg fault caused by missing parameterSimon Ser1-1/+3
Fix a seg fault when no parameter is provided to 'objtool orc'. Signed-off-by: Simon Ser <contact@emersion.fr> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/9172803ec7ebb72535bcd0b7f966ae96d515968e.1514666459.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-30Merge branch 'for-linus' of ↵Linus Torvalds3-7/+18
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - two cosmetic fixes from Daniel Axtens and Hans de Goede - fix for I2C command mismatch fix for cp2112 driver from Eudean Sun * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: core: lower log level for unknown main item tags to warnings HID: holtekff: move MODULE_* parameters out of #ifdef block HID: cp2112: Fix I2C_BLOCK_DATA transactions
2017-12-30kbuild: add '-fno-stack-check' to kernel build optionsLinus Torvalds1-0/+3
It appears that hardened gentoo enables "-fstack-check" by default for gcc. That doesn't work _at_all_ for the kernel, because the kernel stack doesn't act like a user stack at all: it's much smaller, and it doesn't auto-expand on use. So the extra "probe one page below the stack" code generated by -fstack-check just breaks the kernel in horrible ways, causing infinite double faults etc. [ I have to say, that the particular code gcc generates looks very stupid even for user space where it works, but that's a separate issue. ] Reported-and-tested-by: Alexander Tsoy <alexander@tsoy.me> Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-29Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds45-202/+1636
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation updates from Thomas Gleixner: "This is the final set of enabling page table isolation on x86: - Infrastructure patches for handling the extra page tables. - Patches which map the various bits and pieces which are required to get in and out of user space into the user space visible page tables. - The required changes to have CR3 switching in the entry/exit code. - Optimizations for the CR3 switching along with documentation how the ASID/PCID mechanism works. - Updates to dump pagetables to cover the user space page tables for W+X scans and extra debugfs files to analyze both the kernel and the user space visible page tables The whole functionality is compile time controlled via a config switch and can be turned on/off on the command line as well" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) x86/ldt: Make the LDT mapping RO x86/mm/dump_pagetables: Allow dumping current pagetables x86/mm/dump_pagetables: Check user space page table for WX pages x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy x86/mm/pti: Add Kconfig x86/dumpstack: Indicate in Oops whether PTI is configured and enabled x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming x86/mm: Use INVPCID for __native_flush_tlb_single() x86/mm: Optimize RESTORE_CR3 x86/mm: Use/Fix PCID to optimize user/kernel switches x86/mm: Abstract switching CR3 x86/mm: Allow flushing for future ASID switches x86/pti: Map the vsyscall page if needed x86/pti: Put the LDT in its own PGD if PTI is on x86/mm/64: Make a full PGD-entry size hole in the memory map x86/events/intel/ds: Map debug buffers in cpu_entry_area x86/cpu_entry_area: Add debugstore entries to cpu_entry_area x86/mm/pti: Map ESPFIX into user space x86/mm/pti: Share entry text PMD x86/entry: Align entry text section to PMD boundary ...
2017-12-29timerqueue: Document return values of timerqueue_add/del()Thomas Gleixner1-3/+5
The return values of timerqueue_add/del() are not documented in the kernel doc comment. Add proper documentation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
2017-12-29timers: Invoke timer_start_debug() where it makes senseThomas Gleixner1-2/+2
The timer start debug function is called before the proper timer base is set. As a consequence the trace data contains the stale CPU and flags values. Call the debug function after setting the new base and flags. Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
2017-12-29nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()Thomas Gleixner1-2/+17
The conditions in irq_exit() to invoke tick_nohz_irq_exit() which subsequently invokes tick_nohz_stop_sched_tick() are: if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) If need_resched() is not set, but a timer softirq is pending then this is an indication that the softirq code punted and delegated the execution to softirqd. need_resched() is not true because the current interrupted task takes precedence over softirqd. Invoking tick_nohz_irq_exit() in this case can cause an endless loop of timer interrupts because the timer wheel contains an expired timer, but softirqs are not yet executed. So it returns an immediate expiry request, which causes the timer to fire immediately again. Lather, rinse and repeat.... Prevent that by adding a check for a pending timer soft interrupt to the conditions in tick_nohz_stop_sched_tick() which avoid calling get_next_timer_interrupt(). That keeps the tick sched timer on the tick and prevents a repetitive programming of an already expired timer. Reported-by: Sebastian Siewior <bigeasy@linutronix.d> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
2017-12-29timers: Reinitialize per cpu bases on hotplugThomas Gleixner4-4/+21
The timer wheel bases are not (re)initialized on CPU hotplug. That leaves them with a potentially stale clk and next_expiry valuem, which can cause trouble then the CPU is plugged. Add a prepare callback which forwards the clock, sets next_expiry to far in the future and reset the control flags to a known state. Set base->must_forward_clk so the first timer which is queued will try to forward the clock to current jiffies. Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
2017-12-29timers: Use deferrable base independent of base::nohz_activeAnna-Maria Gleixner1-9/+7
During boot and before base::nohz_active is set in the timer bases, deferrable timers are enqueued into the standard timer base. This works correctly as long as base::nohz_active is false. Once it base::nohz_active is set and a timer which was enqueued before that is accessed the lock selector code choses the lock of the deferred base. This causes unlocked access to the standard base and in case the timer is removed it does not clear the pending flag in the standard base bitmap which causes get_next_timer_interrupt() to return bogus values. To prevent that, the deferrable timers must be enqueued in the deferrable base, even when base::nohz_active is not set. Those deferrable timers also need to be expired unconditional. Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
2017-12-29genirq/msi, x86/vector: Prevent reservation mode for non maskable MSIThomas Gleixner2-5/+44
The new reservation mode for interrupts assigns a dummy vector when the interrupt is allocated and assigns a real vector when the interrupt is requested. The reservation mode prevents vector pressure when devices with a large amount of queues/interrupts are initialized, but only a minimal subset of those queues/interrupts is actually used. This mode has an issue with MSI interrupts which cannot be masked. If the driver is not careful or the hardware emits an interrupt before the device irq is requestd by the driver then the interrupt ends up on the dummy vector as a spurious interrupt which can cause malfunction of the device or in the worst case a lockup of the machine. Change the logic for the reservation mode so that the early activation of MSI interrupts checks whether: - the device is a PCI/MSI device - the reservation mode of the underlying irqdomain is activated - PCI/MSI masking is globally enabled - the PCI/MSI device uses either MSI-X, which supports masking, or MSI with the maskbit supported. If one of those conditions is false, then clear the reservation mode flag in the irq data of the interrupt and invoke irq_domain_activate_irq() with the reserve argument cleared. In the x86 vector code, clear the can_reserve flag in the vector allocation data so a subsequent free_irq() won't create the same situation again. The interrupt stays assigned to a real vector until pci_disable_msi() is invoked and all allocations are undone. Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode") Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com> Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
2017-12-29genirq/irqdomain: Rename early argument of irq_domain_activate_irq()Thomas Gleixner13-28/+29
The 'early' argument of irq_domain_activate_irq() is actually used to denote reservation mode. To avoid confusion, rename it before abuse happens. No functional change. Fixes: 72491643469a ("genirq/irqdomain: Update irq_domain_ops.activate() signature") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandru Chirvasitu <achirvasub@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29x86/vector: Use IRQD_CAN_RESERVE flagThomas Gleixner1-0/+2
Set the new CAN_RESERVE flag when the initial reservation for an interrupt happens. The flag is used in a subsequent patch to disable reservation mode for a certain class of MSI devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29genirq: Introduce IRQD_CAN_RESERVE flagThomas Gleixner2-0/+18
Add a new flag to mark interrupts which can use reservation mode. This is going to be used in subsequent patches to disable reservation mode for a certain class of MSI devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29genirq/msi: Handle reactivation only on successThomas Gleixner1-8/+27
When analyzing the fallout of the x86 vector allocation rework it turned out that the error handling in msi_domain_alloc_irqs() is broken. If MSI_FLAG_MUST_REACTIVATE is set for a MSI domain then it clears the activation flag for a successfully initialized msi descriptor. If a subsequent initialization fails then the error handling code path does not deactivate the interrupt because the activation flag got cleared. Move the clearing of the activation flag outside of the initialization loop so that an eventual failure can be cleaned up correctly. Fixes: 22d0b12f3560 ("genirq/irqdomain: Add force reactivation flag to irq domains") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29Merge tag 'pm-4.15-rc6' of ↵Linus Torvalds3-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This fixes a schedutil cpufreq governor regression from the 4.14 cycle that may cause a CPU idleness check to return incorrect results in some cases which leads to suboptimal decisions (Joel Fernandes)" * tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: schedutil: Use idle_calls counter of the remote CPU
2017-12-29gpio: brcmstb: Make really use of the new lockdep classThomas Gleixner1-1/+1
The recent extension of irq_set_lockdep_class() with a second argument added the new lockdep class to the mrcmstb driver, but used the already existing lockdep class as second argument, which leaves the new lockdep class defined but unused. Use the new lockdep class as that's what the change intended to do. Fixes: 39c3fd58952d ("kernel/irq: Extend lockdep class for request mutex") Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: linus.walleij@linaro.org
2017-12-29x86/apic: Switch all APICs to Fixed delivery modeThomas Gleixner6-16/+8
Some of the APIC incarnations are operating in lowest priority delivery mode. This worked as long as the vector management code allocated the same vector on all possible CPUs for each interrupt. Lowest priority delivery mode does not necessarily respect the affinity setting and may redirect to some other online CPU. This was documented somewhere in the old code and the conversion to single target delivery missed to update the delivery mode of the affected APIC drivers which results in spurious interrupts on some of the affected CPU/Chipset combinations. Switch the APIC drivers over to Fixed delivery mode and remove all leftovers of lowest priority delivery mode. Switching to Fixed delivery mode is not a problem on these CPUs because the kernel already uses Fixed delivery mode for IPIs. The reason for this is that th SDM explicitely forbids lowest prio mode for IPIs. The reason is obvious: If the irq routing does not honor destination targets in lowest prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be a fatal problem in many cases. As a consequence of this change, the apic::irq_delivery_mode field is now pointless, but this needs to be cleaned up in a separate patch. Fixes: fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity") Reported-by: vcaputo@pengaru.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: vcaputo@pengaru.com Cc: Pavel Machek <pavel@ucw.cz> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos
2017-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds30-121/+298
Pull networking fixes from David Miller: 1) IPv6 gre tunnels end up with different default features enabled depending upon whether netlink or ioctls are used to bring them up. Fix from Alexey Kodanev. 2) Fix read past end of user control message in RDS< from Avinash Repaka. 3) Missing RCU barrier in mini qdisc code, from Cong Wang. 4) Missing policy put when reusing per-cpu route entries, from Florian Westphal. 5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G. Piccoli. 6) Run nested transport mode IPSEC packets via tasklet, from Herbert Xu. 7) Fix handling poll() for stream sockets in tipc, from Parthasarathy Bhuvaragan. 8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert. 9) Another zerocopy ubuf handling fix, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) strparser: Call sock_owned_by_user_nocheck sock: Add sock_owned_by_user_nocheck skbuff: in skb_copy_ubufs unclone before releasing zerocopy tipc: fix hanging poll() for stream sockets sctp: Replace use of sockets_allocated with specified macro. bnx2x: Improve reliability in case of nested PCI errors tg3: Enable PHY reset in MTU change path for 5720 tg3: Add workaround to restrict 5762 MRRS to 2048 tg3: Update copyright net: fec: unmap the xmit buffer that are not transferred by DMA tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path tipc: error path leak fixes in tipc_enable_bearer() RDS: Check cmsg_len before dereferencing CMSG_DATA tcp: Avoid preprocessor directives in tracepoint macro args tipc: fix memory leak of group member when peer node is lost net: sched: fix possible null pointer deref in tcf_block_put tipc: base group replicast ack counter on number of actual receivers net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap() net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround ip6_gre: fix device features for ioctl setup ...
2017-12-28Merge tag 'drm-fixes-for-v4.15-rc6' of ↵Linus Torvalds3-4/+5
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "nouveau and i915 regression fixes" * tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux: drm/nouveau: fix race when adding delayed work items i915: Reject CCS modifiers for pipe C on Geminilake drm/i915/gvt: Fix pipe A enable as default for vgpu
2017-12-28Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fix from Stephen Boyd: "One more fix for the runtime PM clk patches. We're calling a runtime PM API that may schedule from somewhere that we can't do that. We change to the async version of pm_runtime_put() to fix it" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: use atomic runtime pm api in clk_core_is_enabled
2017-12-28Merge tag 'led_fixes_for_4.15-rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fix from Jacek Anaszewski: "A single LED fix for brightness setting when delay_off is 0" * tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: led: core: Fix brightness setting when setting delay_off=0
2017-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds22-105/+208
Pull rdma fixes from Jason Gunthorpe: "This is the next batch of for-rc patches from RDMA. It includes the fix for the ipoib regression I mentioned last time, and the result of a fairly major debugging effort to get iser working reliably on cxgb4 hardware - it turns out the cxgb4 driver was not handling QP error flushing properly causing iser to fail. - cxgb4 fix for an iser testing failure as debugged by Steve and Sagi. The problem was a driver bug in the handling of shutting down a QP. - Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on error unwind and a use after free bug - Improper congestion counter values on mlx5 when link aggregation is enabled - ipoib lockdep regression introduced in this merge window - hfi1 regression supporting the device in a VM introduced in a recent patch - Typo that breaks future uAPI compatibility in the verbs core - More SELinux related oops fixing - Fix an oops during error unwind in mlx5" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/mlx5: Fix mlx5_ib_alloc_mr error flow IB/core: Verify that QP is security enabled in create and destroy IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp() IB/mlx5: Serialize access to the VMA list IB/hfi: Only read capability registers if the capability exists IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush IB/mlx5: Fix congestion counters in LAG mode RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path iw_cxgb4: when flushing, complete all wrs in a chain iw_cxgb4: reflect the original WR opcode in drain cqes iw_cxgb4: Only validate the MSN for successful completions
2017-12-28Merge branch 'strparser-Fix-lockdep-issue'David S. Miller2-1/+6
Tom Herbert says: ==================== strparser: Fix lockdep issue When sock_owned_by_user returns true in strparser. Fix is to add and call sock_owned_by_user_nocheck since the check for owned by user is not an error condition in this case. ==================== Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28strparser: Call sock_owned_by_user_nocheckTom Herbert1-1/+1
strparser wants to check socket ownership without producing any warnings. As indicated by the comment in the code, it is permissible for owned_by_user to return true. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28sock: Add sock_owned_by_user_nocheckTom Herbert1-0/+5
This allows checking socket lock ownership with producing lockdep warnings. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28skbuff: in skb_copy_ubufs unclone before releasing zerocopyWillem de Bruijn1-3/+3
skb_copy_ubufs must unclone before it is safe to modify its skb_shared_info with skb_zcopy_clear. Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") ensures that all skbs release their zerocopy state, even those without frags. But I forgot an edge case where such an skb arrives that is cloned. The stack does not build such packets. Vhost/tun skbs have their frags orphaned before cloning. TCP skbs only attach zerocopy state when a frag is added. But if TCP packets can be trimmed or linearized, this might occur. Tracing the code I found no instance so far (e.g., skb_linearize ends up calling skb_zcopy_clear if !skb->data_len). Still, it is non-obvious that no path exists. And it is fragile to rely on this. Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28tipc: fix hanging poll() for stream socketsParthasarathy Bhuvaragan1-1/+1
In commit 42b531de17d2f6 ("tipc: Fix missing connection request handling"), we replaced unconditional wakeup() with condtional wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND. This breaks the applications which do a connect followed by poll with POLLOUT flag. These applications are not woken when the connection is ESTABLISHED and hence sleep forever. In this commit, we fix it by including the POLLOUT event for sockets in TIPC_CONNECTING state. Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28objtool: Fix Clang enum conversion warningLukas Bulwahn1-1/+1
Fix the following Clang enum conversion warning: arch/x86/decode.c:141:20: error: implicit conversion from enumeration type 'enum op_src_type' to different enumeration type 'enum op_dest_type' [-Werror,-Wenum-conversion] op->dest.type = OP_SRC_REG; ~ ^~~~~~~~~~ It just happened to work before because OP_SRC_REG and OP_DEST_REG have the same value. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Nicholas Mc Guire <der.herr@hofr.at> Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0") Link: http://lkml.kernel.org/r/b4156c5738bae781c392e7a3691aed4514ebbdf2.1514323568.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-28x86/apic: Update the 'apic=' description of setting APIC driverDou Liyang1-1/+5
There are two consumers of apic=: the APIC debug level and the low level generic architecture code, but Linux just documented the first one. Append the second description. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: rdunlap@infradead.org Cc: corbet@lwn.net Link: https://lkml.kernel.org/r/20171204040313.24824-2-douly.fnst@cn.fujitsu.com
2017-12-28x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 caseDou Liyang1-0/+2
There are two consumers of apic=: apic_set_verbosity() for setting the APIC debug level; parse_apic() for registering APIC driver by hand. X86-32 supports both of them, but sometimes, kernel issues a weird warning. eg: when kernel was booted up with 'apic=bigsmp' in command line, early_param would warn like that: ... [ 0.000000] APIC Verbosity level bigsmp not recognised use apic=verbose or apic=debug [ 0.000000] Malformed early option 'apic' ... Wrap the warning code in CONFIG_X86_64 case to avoid this. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: rdunlap@infradead.org Cc: corbet@lwn.net Link: https://lkml.kernel.org/r/20171204040313.24824-1-douly.fnst@cn.fujitsu.com
2017-12-28genirq: Guard handle_bad_irq log messagesGuenter Roeck1-0/+5
An interrupt storm on a bad interrupt will cause the kernel log to be clogged. [ 60.089234] ->handle_irq(): ffffffffbe2f803f, [ 60.090455] 0xffffffffbf2af380 [ 60.090510] handle_bad_irq+0x0/0x2e5 [ 60.090522] ->irq_data.chip(): ffffffffbf2af380, [ 60.090553] IRQ_NOPROBE set [ 60.090584] ->handle_irq(): ffffffffbe2f803f, [ 60.090590] handle_bad_irq+0x0/0x2e5 [ 60.090596] ->irq_data.chip(): ffffffffbf2af380, [ 60.090602] 0xffffffffbf2af380 [ 60.090608] ->action(): (null) [ 60.090779] handle_bad_irq+0x0/0x2e5 This was seen when running an upstream kernel on Acer Chromebook R11. The system was unstable as result. Guard the log message with __printk_ratelimit to reduce the impact. This won't prevent the interrupt storm from happening, but at least the system remains stable. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dtor@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953 Link: https://lkml.kernel.org/r/1512234784-21038-1-git-send-email-linux@roeck-us.net
2017-12-28cpufreq: schedutil: Use idle_calls counter of the remote CPUJoel Fernandes3-1/+15
Since the recent remote cpufreq callback work, its possible that a cpufreq update is triggered from a remote CPU. For single policies however, the current code uses the local CPU when trying to determine if the remote sg_cpu entered idle or is busy. This is incorrect. To remedy this, compare with the nohz tick idle_calls counter of the remote CPU. Fixes: 674e75411fc2 (sched: cpufreq: Allow remote cpufreq callbacks) Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-28kernel/irq: Extend lockdep class for request mutexAndrew Lunn11-35/+75
The IRQ code already has support for lockdep class for the lock mutex in an interrupt descriptor. Extend this to add a second class for the request mutex in the descriptor. Not having a class is resulting in false positive splats in some code paths. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: linus.walleij@linaro.org Cc: grygorii.strashko@ti.com Cc: f.fainelli@gmail.com Link: https://lkml.kernel.org/r/1512234664-21555-1-git-send-email-andrew@lunn.ch
2017-12-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller3-4/+8
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-28 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Two small fixes for bpftool. Fix otherwise broken output if any of the system calls failed when listing maps in json format and instead of bailing out, skip maps or progs that disappeared between fetching next id and getting an fd for that id, both from Jakub. 2) Small fix in BPF selftests to respect LLC passed from command line when testing for -mcpu=probe presence, from Quentin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27sparc64: repair calling incorrect hweight function from stubsJan Engelhardt1-2/+2
Commit v4.12-rc4-1-g9289ea7f952b introduced a mistake that made the 64-bit hweight stub call the 16-bit hweight function. Fixes: 9289ea7f952b ("sparc64: Use indirect calls in hamming weight stubs") Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27IB/mlx5: Fix mlx5_ib_alloc_mr error flowNitzan Carmi1-0/+1
ibmr.device is being set only after ib_alloc_mr() is (successfully) complete. Therefore, in case mlx5_core_create_mkey() return with error, the error flow calls mlx5_free_priv_descs() which uses ibmr.device (which doesn't exist yet), causing a NULL dereference oops. To fix this, the IB device should be set in the mr struct earlier stage (e.g. prior to calling mlx5_core_create_mkey()). Fixes: 8a187ee52b04 ("IB/mlx5: Support the new memory registration API") Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27IB/core: Verify that QP is security enabled in create and destroyMoni Shoua2-1/+5
The XRC target QP create flow sets up qp_sec only if there is an IB link with LSM security enabled. However, several other related uAPI entry points blindly follow the qp_sec NULL pointer, resulting in a possible oops. Check for NULL before using qp_sec. Cc: <stable@vger.kernel.org> # v4.12 Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()Moni Shoua1-2/+2
If the input command length is larger than the kernel supports an error should be returned in case the unsupported bytes are not cleared, instead of the other way aroudn. This matches what all other callers of ib_is_udata_cleared do and will avoid user ABI problems in the future. Cc: <stable@vger.kernel.org> # v4.10 Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27IB/mlx5: Serialize access to the VMA listMajd Dibbiny2-0/+12
User-space applications can do mmap and munmap directly at any time. Since the VMA list is not protected with a mutex, concurrent accesses to the VMA list from the mmap and munmap can cause data corruption. Add a mutex around the list. Cc: <stable@vger.kernel.org> # v4.7 Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27Merge tag 'trace-v4.15-rc4' of ↵Linus Torvalds2-10/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "While doing tests on tracing over the network, I found that the packets were getting corrupted. In the process I found three bugs. One was the culprit, but the other two scared me. After deeper investigation, they were not as major as I thought they were, due to a signed compared to an unsigned that prevented a negative number from doing actual harm. The two bigger bugs: - Mask the ring buffer data page length. There are data flags at the high bits of the length field. These were not cleared via the length function, and the length could return a negative number. (Although the number returned was unsigned, but was assigned to a signed number) Luckily, this value was compared to PAGE_SIZE which is unsigned and kept it from entering the path that could have caused damage. - Check the page usage before reusing the ring buffer reader page. TCP increments the page ref when passing the page off to the network. The page is passed back to the ring buffer for use on free. But the page could still be in use by the TCP stack. Minor bugs: - Related to the first bug. No need to clear out the unused ring buffer data before sending to user space. It is now done by the ring buffer code itself. - Reset pointers after free on error path. There were some cases in the error path that pointers were freed but not set to NULL, and could have them freed again, having a pointer freed twice" * tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix possible double free on failure of allocating trace buffer tracing: Fix crash when it fails to alloc ring buffer ring-buffer: Do no reuse reader page if still in use tracing: Remove extra zeroing out of the ring buffer page ring-buffer: Mask out the info bits when returning buffer page length
2017-12-27Merge tag 'sound-4.15-rc6' of ↵Linus Torvalds31-143/+173
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "It seems that Santa overslept with a bunch of gifts; the majority of changes here are various device-specific ASoC fixes, most notably the revert of rcar IOMMU support and fsl_ssi AC97 fixes, but also lots of small fixes for codecs. Besides that, the usual HD-audio quirks and fixes are included, too" * tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits) ALSA: hda - Fix missing COEF init for ALC225/295/299 ALSA: hda: Drop useless WARN_ON() ALSA: hda - change the location for one mic on a Lenovo machine ALSA: hda - fix headset mic detection issue on a Dell machine ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines ASoC: rsnd: fixup ADG register mask ASoC: rt5514-spi: only enable wakeup when fully initialized ASoC: nau8825: fix issue that pop noise when start capture ASoC: rt5663: Fix the wrong result of the first jack detection ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update ASoC: Intel: Change kern log level to avoid unwanted messages ASoC: atmel-classd: select correct Kconfig symbol ASoC: wm_adsp: Fix validation of firmware and coeff lengths ASoC: Intel: Skylake: Do not check dev_type for dmic link type ASoC: rockchip: disable clock on error ASoC: tlv320aic31xx: Fix GPIO1 register definition ASoC: codecs: msm8916-wcd: Fix supported formats ASoC: fsl_asrc: Fix typo in a field define ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes ASoC: da7218: Correct IRQ level in DT binding example ...
2017-12-27x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)Linus Torvalds1-3/+1
Commit e802a51ede91 ("x86/idt: Consolidate IDT invalidation") cleaned up and unified the IDT invalidation that existed in a couple of places. It changed no actual real code. Despite not changing any actual real code, it _did_ change code generation: by implementing the common idt_invalidate() function in archx86/kernel/idt.c, it made the use of the function in arch/x86/kernel/machine_kexec_32.c be a real function call rather than an (accidental) inlining of the function. That, in turn, exposed two issues: - in load_segments(), we had incorrectly reset all the segment registers, which then made the stack canary load (which gcc does using offset of %gs) cause a trap. Instead of %gs pointing to the stack canary, it will be the normal zero-based kernel segment, and the stack canary load will take a page fault at address 0x14. - to make this even harder to debug, we had invalidated the GDT just before calling idt_invalidate(), which meant that the fault happened with an invalid GDT, which in turn causes a triple fault and immediate reboot. Fix this by (a) not reloading the special segments in load_segments(). We currently don't do any percpu accesses (which would require %fs on x86-32) in this area, but there's no reason to think that we might not want to do them, and like %gs, it's pointless to break it. (b) doing idt_invalidate() before invalidating the GDT, to keep things at least _slightly_ more debuggable for a bit longer. Without a IDT, traps will not work. Without a GDT, traps also will not work, but neither will any segment loads etc. So in a very real sense, the GDT is even more core than the IDT. Fixes: e802a51ede91 ("x86/idt: Consolidate IDT invalidation") Reported-and-tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.LFD.2.21.1712271143180.8572@i7.lan
2017-12-27led: core: Fix brightness setting when setting delay_off=0Matthieu CASTET1-1/+1
With the current code, the following sequence won't work : echo timer > trigger echo 0 > delay_off * at this point we call ** led_delay_off_store ** led_blink_set *** stop timer ** led_blink_setup ** led_set_software_blink *** if !delay_on, led off *** if !delay_off, set led_set_brightness_nosleep <--- LED_BLINK_SW is set but timer is stop *** otherwise start timer/set LED_BLINK_SW flag echo xxx > brightness * led_set_brightness ** if LED_BLINK_SW *** if brightness=0, led off *** else apply brightness if next timer <--- timer is stop, and will never apply new setting ** otherwise set led_set_brightness_nosleep To fix that, when we delete the timer, we should clear LED_BLINK_SW. Cc: linux-leds@vger.kernel.org Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
2017-12-27x86: Remove unused parameter of prepare_switch_torodrigosiqueira1-3/+2
Commit e37e43a497d5 ("x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y)") added prepare_switch_to with one extra parameter which is not used by the function, remove it. Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20171215131533.hp6kqebw45o7uvsb@smtp.gmail.com
2017-12-27perf/x86/intel: Plug memory leak in intel_pmu_init()Thomas Gleixner1-1/+4
A recent commit introduced an extra merge_attr() call in the skylake branch, which causes a memory leak. Store the pointer to the extra allocated memory and free it at the end of the function. Fixes: a5df70c354c2 ("perf/x86: Only show format attributes when supported") Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com>
2017-12-27tracing: Fix possible double free on failure of allocating trace bufferSteven Rostedt (VMware)1-0/+1
Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia@spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-27tracing: Fix crash when it fails to alloc ring bufferJing Xia1-0/+2
Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() |-allocate_trace_buffers() |-allocate_trace_buffer(tr, &tr->trace_buffer...) |-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null |-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr |-free_trace_buffers() |-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again |-ring_buffer_free(buf->buffer) |-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia@spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-27ring-buffer: Do no reuse reader page if still in useSteven Rostedt (VMware)1-0/+6
To free the reader page that is allocated with ring_buffer_alloc_read_page(), ring_buffer_free_read_page() must be called. For faster performance, this page can be reused by the ring buffer to avoid having to free and allocate new pages. The issue arises when the page is used with a splice pipe into the networking code. The networking code may up the page counter for the page, and keep it active while sending it is queued to go to the network. The incrementing of the page ref does not prevent it from being reused in the ring buffer, and this can cause the page that is being sent out to the network to be modified before it is sent by reading new data. Add a check to the page ref counter, and only reuse the page if it is not being used anywhere else. Cc: stable@vger.kernel.org Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-27tracing: Remove extra zeroing out of the ring buffer pageSteven Rostedt (VMware)1-9/+1
The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Cc: stable@vger.kernel.org Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-28Merge tag 'drm-intel-fixes-2017-12-22-1' of ↵Dave Airlie2-3/+4
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes GLK pipe C related fix, and a gvt fix. * tag 'drm-intel-fixes-2017-12-22-1' of git://anongit.freedesktop.org/drm/drm-intel: i915: Reject CCS modifiers for pipe C on Geminilake drm/i915/gvt: Fix pipe A enable as default for vgpu
2017-12-27ring-buffer: Mask out the info bits when returning buffer page lengthSteven Rostedt (VMware)1-1/+5
Two info bits were added to the "commit" part of the ring buffer data page when returned to be consumed. This was to inform the user space readers that events have been missed, and that the count may be stored at the end of the page. What wasn't handled, was the splice code that actually called a function to return the length of the data in order to zero out the rest of the page before sending it up to user space. These data bits were returned with the length making the value negative, and that negative value was not checked. It was compared to PAGE_SIZE, and only used if the size was less than PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an unsigned compare, meaning the negative size value did not end up causing a large portion of memory to be randomly zeroed out. Cc: stable@vger.kernel.org Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-27sctp: Replace use of sockets_allocated with specified macro.Tonghao Zhang1-2/+2
The patch(180d8cd942ce) replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to accessor macros. But the sockets_allocated field of sctp sock is not replaced at all. Then replace it now for unifying the code. Fixes: 180d8cd942ce ("foundations of per-cgroup memory pressure controlling.") Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27cpu/hotplug: Move inline keyword at the beginning of declarationMathieu Malaterre1-4/+4
Fix non-fatal warnings such as: kernel/cpu.c:95:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] static void inline cpuhp_lock_release(bool bringup) { } ^~~~~~ Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20171226140855.16583-1-malat@debian.org
2017-12-27bnx2x: Improve reliability in case of nested PCI errorsGuilherme G. Piccoli2-3/+15
While in recovery process of PCI error (called EEH on PowerPC arch), another PCI transaction could be corrupted causing a situation of nested PCI errors. Also, this scenario could be reproduced with error injection mechanisms (for debug purposes). We observe that in case of nested PCI errors, bnx2x might attempt to initialize its shmem and cause a kernel crash due to bad addresses read from MCP. Multiple different stack traces were observed depending on the point the second PCI error happens. This patch avoids the crashes by: * failing PCI recovery in case of nested errors (since multiple PCI errors in a row are not expected to lead to a functional adapter anyway), and by, * preventing access to adapter FW when MCP is failed (we mark it as failed when shmem cannot get initialized properly). Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Shahed Shaikh <Shahed.Shaikh@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27Merge branch 'tg3-fixes'David S. Miller2-4/+22
Siva Reddy Kallam says: ==================== tg3: update on copyright and couple of fixes First patch: Update copyright Second patch: Add workaround to restrict 5762 MRRS Third patch: Add PHY reset in change MTU path for 5720 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27tg3: Enable PHY reset in MTU change path for 5720Siva Reddy Kallam1-1/+2
A customer noticed RX path hang when MTU is changed on the fly while running heavy traffic with NCSI enabled for 5717 and 5719. Since 5720 belongs to same ASIC family, we observed same issue and same fix could solve this problem for 5720. Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27tg3: Add workaround to restrict 5762 MRRS to 2048Siva Reddy Kallam2-0/+14
One of AMD based server with 5762 hangs with jumbo frame traffic. This AMD platform has southbridge limitation which is restricting MRRS to 4000. As a work around, driver to restricts the MRRS to 2048 for this particular 5762 NX1 card. Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27tg3: Update copyrightSiva Reddy Kallam2-3/+6
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27Merge branch 'master' of ↵David S. Miller7-5/+125
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-12-22 1) Check for valid id proto in validate_tmpl(), otherwise we may trigger a warning in xfrm_state_fini(). From Cong Wang. 2) Fix a typo on XFRMA_OUTPUT_MARK policy attribute. From Michal Kubecek. 3) Verify the state is valid when encap_type < 0, otherwise we may crash on IPsec GRO . From Aviv Heller. 4) Fix stack-out-of-bounds read on socket policy lookup. We access the flowi of the wrong address family in the IPv4 mapped IPv6 case, fix this by catching address family missmatches before we do the lookup. 5) fix xfrm_do_migrate() with AEAD to copy the geniv field too. Otherwise the state is not fully initialized and migration fails. From Antony Antony. 6) Fix stack-out-of-bounds with misconfigured transport mode policies. Our policy template validation is not strict enough. It is possible to configure policies with transport mode template where the address family of the template does not match the selectors address family. Fix this by refusing such a configuration, address family can not change on transport mode. 7) Fix a policy reference leak when reusing pcpu xdst entry. From Florian Westphal. 8) Reinject transport-mode packets through tasklet, otherwise it is possible to reate a recursion loop. From Herbert Xu. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27net: fec: unmap the xmit buffer that are not transferred by DMAFugang Duan1-0/+6
The enet IP only support 32 bit, it will use swiotlb buffer to do dma mapping when xmit buffer DMA memory address is bigger than 4G in i.MX platform. After stress suspend/resume test, it will print out: log: [12826.352864] fec 5b040000.ethernet: swiotlb buffer is full (sz: 191 bytes) [12826.359676] DMA: Out of SW-IOMMU space for 191 bytes at device 5b040000.ethernet [12826.367110] fec 5b040000.ethernet eth0: Tx DMA memory map failed The issue is that the ready xmit buffers that are dma mapped but DMA still don't copy them into fifo, once MAC restart, these DMA buffers are not unmapped. So it should check the dma mapping buffer and unmap them. Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error pathTommi Rantala1-1/+5
Calling tipc_mon_delete() before the monitor has been created will oops. This can happen in tipc_enable_bearer() error path if tipc_disc_create() fails. [ 48.589074] BUG: unable to handle kernel paging request at 0000000000001008 [ 48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc] [ 48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0 [ 48.592230] Oops: 0000 [#1] SMP KASAN [ 48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G B 4.15.0-rc4-pc64-dirty #5 [ 48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 [ 48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc] [ 48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282 [ 48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000 [ 48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297 [ 48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533 [ 48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050 [ 48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020 [ 48.608228] FS: 00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000 [ 48.610189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0 [ 48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 48.615038] Call Trace: [ 48.615635] tipc_enable_bearer+0x415/0x5e0 [tipc] [ 48.620623] tipc_nl_bearer_enable+0x1ab/0x200 [tipc] [ 48.625118] genl_family_rcv_msg+0x36b/0x570 [ 48.631233] genl_rcv_msg+0x5a/0xa0 [ 48.631867] netlink_rcv_skb+0x1cc/0x220 [ 48.636373] genl_rcv+0x24/0x40 [ 48.637306] netlink_unicast+0x29c/0x350 [ 48.639664] netlink_sendmsg+0x439/0x590 [ 48.642014] SYSC_sendto+0x199/0x250 [ 48.649912] do_syscall_64+0xfd/0x2c0 [ 48.650651] entry_SYSCALL64_slow_path+0x25/0x25 [ 48.651843] RIP: 0033:0x7fc6859848e3 [ 48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3 [ 48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003 [ 48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c [ 48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010 [ 48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000 [ 48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668 [ 48.664073] CR2: 0000000000001008 [ 48.664576] ---[ end trace e811818d54d5ce88 ]--- Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27tipc: error path leak fixes in tipc_enable_bearer()Tommi Rantala1-1/+4
Fix memory leak in tipc_enable_bearer() if enable_media() fails, and cleanup with bearer_disable() if tipc_mon_create() fails. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27RDS: Check cmsg_len before dereferencing CMSG_DATAAvinash Repaka1-0/+3
RDS currently doesn't check if the length of the control message is large enough to hold the required data, before dereferencing the control message data. This results in following crash: BUG: KASAN: stack-out-of-bounds in rds_rdma_bytes net/rds/send.c:1013 [inline] BUG: KASAN: stack-out-of-bounds in rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066 Read of size 8 at addr ffff8801c928fb70 by task syzkaller455006/3157 CPU: 0 PID: 3157 Comm: syzkaller455006 Not tainted 4.15.0-rc3+ #161 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430 rds_rdma_bytes net/rds/send.c:1013 [inline] rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066 sock_sendmsg_nosec net/socket.c:628 [inline] sock_sendmsg+0xca/0x110 net/socket.c:638 ___sys_sendmsg+0x320/0x8b0 net/socket.c:2018 __sys_sendmmsg+0x1ee/0x620 net/socket.c:2108 SYSC_sendmmsg net/socket.c:2139 [inline] SyS_sendmmsg+0x35/0x60 net/socket.c:2134 entry_SYSCALL_64_fastpath+0x1f/0x96 RIP: 0033:0x43fe49 RSP: 002b:00007fffbe244ad8 EFLAGS: 00000217 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fe49 RDX: 0000000000000001 RSI: 000000002020c000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004017b0 R13: 0000000000401840 R14: 0000000000000000 R15: 0000000000000000 To fix this, we verify that the cmsg_len is large enough to hold the data to be read, before proceeding further. Reported-by: syzbot <syzkaller-bugs@googlegroups.com> Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-27usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201Daniel Thompson1-0/+3
When plugging in a USB webcam I see the following message: xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk? handle_tx_event: 913 callbacks suppressed All is quiet again with this patch (and I've done a fair but of soak testing with the camera since). Cc: <stable@vger.kernel.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-27xhci: Fix use-after-free in xhci debugfsAlexander Kappner1-8/+8
Trying to read from debugfs after the system has resumed from hibernate causes a use-after-free and thus a protection fault. Steps to reproduce: Hibernate system, resume from hibernate, then run $ cat /sys/kernel/debug/usb/xhci/*/command-ring/enqueue [ 3902.765086] general protection fault: 0000 [#1] PREEMPT SMP ... [ 3902.765136] RIP: 0010:xhci_trb_virt_to_dma.part.50+0x5/0x30 ... [ 3902.765178] Call Trace: [ 3902.765188] xhci_ring_enqueue_show+0x1e/0x40 [ 3902.765197] seq_read+0xdb/0x3a0 [ 3902.765204] ? __handle_mm_fault+0x5fb/0x1210 [ 3902.765211] full_proxy_read+0x4a/0x70 [ 3902.765219] __vfs_read+0x23/0x120 [ 3902.765228] vfs_read+0x8e/0x130 [ 3902.765235] SyS_read+0x42/0x90 [ 3902.765242] do_syscall_64+0x6b/0x290 [ 3902.765251] entry_SYSCALL64_slow_path+0x25/0x25 The issue is caused by the xhci ring structures being reallocated when the system is resumed, but pointers to the old structures being retained in the debugfs files "private" field: The proposed patch fixes this issue by storing a pointer to the xhci_ring field in the xhci device structure in debugfs rather than directly storing a pointer to the xhci_ring. Fixes: 02b6fdc2a153 ("usb: xhci: Add debugfs interface for xHCI driver") Signed-off-by: Alexander Kappner <agk@godking.net> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-27xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernateMathias Nyman1-3/+3
Free the virt_device and its debugfs_private member together. When resuming from hibernate the .free_dev callback unconditionally freed the debugfs_private member, but could leave virt_device intact. This triggered a NULL pointer dereference after resume when usbmuxd sent a USBDEVFS_SETCONFIGURATION ioctl to a device, trying to add a endpoint debugfs entry to a already freed debugfs_private pointer. Fixes: 02b6fdc2a153 ("usb: xhci: Add debugfs interface for xHCI driver") Reported-by: Alexander Kappner <agk@godking.net> Tested-by: Alexander Kappner <agk@godking.net> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-27Merge tag 'usb-serial-4.15-rc6' of ↵Greg Kroah-Hartman4-0/+27
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes USB-serial fixes for v4.15-rc6 Here are some new device ids for ftdi_sio, option and qcserial. Note that the qcserial patch enables the SetControlLineState request (used to raise DTR/RTS) for the GPS interface of all devices using the Sierra Wireless layout. This was required for the Sierra Wireless EM7565 and has been tested using several other modems as well. All but the final commit have been in linux-next without any reported issues. Signed-off-by: Johan Hovold <johan@kernel.org>
2017-12-27USB: serial: ftdi_sio: add id for Airbus DS P8GRMax Schulze2-0/+7
Add AIRBUS_DS_P8GR device IDs to ftdi_sio driver. Signed-off-by: Max Schulze <max.schulze@posteo.de> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2017-12-27ALSA: hda - Fix missing COEF init for ALC225/295/299Takashi Iwai1-3/+5
There was a long-standing problem on HP Spectre X360 with Kabylake where it lacks of the front speaker output in some situations. Also there are other products showing the similar behavior. The culprit seems to be the missing COEF setup on ALC codecs, ALC225/295/299, which are all compatible. This patch adds the proper COEF setup (to initialize idx 0x67 / bits 0x3000) for addressing the issue. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195457 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-12-26Merge tag 'hwmon-for-linus-v4.15-rc6' of ↵Linus Torvalds1-4/+17
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "Handle errors from thermal subsystem" * tag 'hwmon-for-linus-v4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: Deal with errors from the thermal subsystem
2017-12-26Merge tag 'gpio-v4.15-3' of ↵Linus Torvalds5-15/+14
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Two fixes. They are both kind of important, so why not send a pull request on christmas eve. - Fix a build problem in the gpio single register created by refactorings. - Fix assignment of GPIO line names, something that was mangled by another patch" * tag 'gpio-v4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: fix "gpio-line-names" property retrieval gpio: gpio-reg: fix build
2017-12-26clk: use atomic runtime pm api in clk_core_is_enabledDong Aisheng1-1/+2
Current clk_pm_runtime_put is using pm_runtime_put_sync which is not safe to be called in clk_core_is_enabled as it should be able to run in atomic context. Thus use pm_runtime_put instead which is atomic safe. Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 9a34b45397e5 ("clk: Add support for runtime PM") Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2017-12-27Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixesDave Airlie1-1/+1
one nouveau regression fix * 'linux-4.15' of git://github.com/skeggsb/linux: drm/nouveau: fix race when adding delayed work items
2017-12-26tcp: Avoid preprocessor directives in tracepoint macro argsMat Martineau1-60/+37
Using a preprocessor directive to check for CONFIG_IPV6 in the middle of a DECLARE_EVENT_CLASS macro's arg list causes sparse to report a series of errors: ./include/trace/events/tcp.h:68:1: error: directive in argument list ./include/trace/events/tcp.h:75:1: error: directive in argument list ./include/trace/events/tcp.h:144:1: error: directive in argument list ./include/trace/events/tcp.h:151:1: error: directive in argument list ./include/trace/events/tcp.h:216:1: error: directive in argument list ./include/trace/events/tcp.h:223:1: error: directive in argument list ./include/trace/events/tcp.h:274:1: error: directive in argument list ./include/trace/events/tcp.h:281:1: error: directive in argument list Once sparse finds an error, it stops printing warnings for the file it is checking. This masks any sparse warnings that would normally be reported for the core TCP code. Instead, handle the preprocessor conditionals in a couple of auxiliary macros. This also has the benefit of reducing duplicate code. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26hwmon: Deal with errors from the thermal subsystemLinus Walleij1-4/+17
If the thermal subsystem returne -EPROBE_DEFER or any other error when hwmon calls devm_thermal_zone_of_sensor_register(), this is silently ignored. I ran into this with an incorrectly defined thermal zone, making it non-existing and thus this call failed with -EPROBE_DEFER assuming it would appear later. The sensor was still added which is incorrect: sensors must strictly be added after the thermal zones, so deferred probe must be respected. Fixes: d560168b5d0f ("hwmon: (core) New hwmon registration API") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-12-26tipc: fix memory leak of group member when peer node is lostJon Maloy1-9/+18
When a group member receives a member WITHDRAW event, this might have two reasons: either the peer member is leaving the group, or the link to the member's node has been lost. In the latter case we need to issue a DOWN event to the user right away, and let function tipc_group_filter_msg() perform delete of the member item. However, in this case we miss to change the state of the member item to MBR_LEAVING, so the member item is not deleted, and we have a memory leak. We now separate better between the four sub-cases of a WITHRAW event and make sure that each case is handled correctly. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26net: sched: fix possible null pointer deref in tcf_block_putJiri Pirko1-0/+2
We need to check block for being null in both tcf_block_put and tcf_block_put_ext. Fixes: 343723dd51ef ("net: sched: fix clsact init error path") Reported-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26tipc: base group replicast ack counter on number of actual receiversJon Maloy1-1/+3
In commit 2f487712b893 ("tipc: guarantee that group broadcast doesn't bypass group unicast") we introduced a mechanism that requires the first (replicated) broadcast sent after a unicast to be acknowledged by all receivers before permitting sending of the next (true) broadcast. The counter for keeping track of the number of acknowledges to expect is based on the tipc_group::member_cnt variable. But this misses that some of the known members may not be ready for reception, and will never acknowledge the message, either because they haven't fully joined the group or because they are leaving the group. Such members are identified by not fulfilling the condition tested for in the function tipc_group_is_enabled(). We now set the counter for the actual number of acks to receive at the moment the message is sent, by just counting the number of recipients satisfying the tipc_group_is_enabled() test. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()Cong Wang1-1/+3
The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for flying RCU callback installed by a previous mini_qdisc_pair_swap(), however we miss it on the tp_head==NULL path, which leads to that the RCU callback still uses miniq_old->rcu after it is freed together with qdisc in qdisc_graft(). So just add it on that path too. Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath ") Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaroundGrygorii Strashko1-0/+1
Under some circumstances driver will perform PHY reset in ksz9031_read_status() to fix autoneg failure case (idle error count = 0xFF). When this happens ksz9031 will not detect link status change any more when connecting to Netgear 1G switch (link can be recovered sometimes by restarting netdevice "ifconfig down up"). Reproduced with TI am572x board equipped with ksz9031 PHY while connecting to Netgear 1G switch. Fix the issue by reconfiguring autonegotiation after PHY reset in ksz9031_read_status(). Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-26ip6_gre: fix device features for ioctl setupAlexey Kodanev1-25/+32
When ip6gre is created using ioctl, its features, such as scatter-gather, GSO and tx-checksumming will be turned off: # ip -f inet6 tunnel add gre6 mode ip6gre remote fd00::1 # ethtool -k gre6 (truncated output) tx-checksumming: off scatter-gather: off tcp-segmentation-offload: off generic-segmentation-offload: off [requested on] But when netlink is used, they will be enabled: # ip link add gre6 type ip6gre remote fd00::1 # ethtool -k gre6 (truncated output) tx-checksumming: on scatter-gather: on tcp-segmentation-offload: on generic-segmentation-offload: on This results in a loss of performance when gre6 is created via ioctl. The issue was found with LTP/gre tests. Fix it by moving the setup of device features to a separate function and invoke it with ndo_init callback because both netlink and ioctl will eventually call it via register_netdevice(): register_netdevice() - ndo_init() callback -> ip6gre_tunnel_init() or ip6gre_tap_init() - ip6gre_tunnel_init_common() - ip6gre_tnl_init_features() The moved code also contains two minor style fixes: * removed needless tab from GRE6_FEATURES on NETIF_F_HIGHDMA line. * fixed the issue reported by checkpatch: "Unnecessary parentheses around 'nt->encap.type == TUNNEL_ENCAP_NONE'" Fixes: ac4eb009e477 ("ip6gre: Add support for basic offloads offloads excluding GSO") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-25phylink: ensure AN is enabledRussell King1-0/+1
Ensure that we mark AN as enabled at boot time, rather than leaving it disabled. This is noticable if your SFP module is fiber, and it supports faster speeds than 1G with 2.5G support in place. Fixes: 9525ae83959b ("phylink: add phylink infrastructure") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-25phylink: ensure the PHY interface mode is appropriately setRussell King1-0/+1
When setting the ethtool settings, ensure that the validated PHY interface mode is propagated to the current link settings, so that 2500BaseX can be selected. Fixes: 9525ae83959b ("phylink: add phylink infrastructure") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-23Linux 4.15-rc5v4.15-rc5Linus Torvalds1-1/+1
2017-12-23pid: Handle failure to allocate the first pid in a pid namespaceEric W. Biederman1-3/+5
With the replacement of the pid bitmap and hashtable with an idr in alloc_pid started occassionally failing when allocating the first pid in a pid namespace. Things were not completely reset resulting in the first allocated pid getting the number 2 (not 1). Which further resulted in ns->proc_mnt not getting set and eventually causing an oops in proc_flush_task. Oops: 0000 [#1] SMP CPU: 2 PID: 6743 Comm: trinity-c117 Not tainted 4.15.0-rc4-think+ #2 RIP: 0010:proc_flush_task+0x8e/0x1b0 RSP: 0018:ffffc9000bbffc40 EFLAGS: 00010286 RAX: 0000000000000001 RBX: 0000000000000001 RCX: 00000000fffffffb RDX: 0000000000000000 RSI: ffffc9000bbffc50 RDI: 0000000000000000 RBP: ffffc9000bbffc63 R08: 0000000000000000 R09: 0000000000000002 R10: ffffc9000bbffb70 R11: ffffc9000bbffc64 R12: 0000000000000003 R13: 0000000000000000 R14: 0000000000000003 R15: ffff8804c10d7840 FS: 00007f7cb8965700(0000) GS:ffff88050a200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000003e21ae003 CR4: 00000000001606e0 DR0: 00007fb1d6c22000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: ? release_task+0xaf/0x680 release_task+0xd2/0x680 ? wait_consider_task+0xb82/0xce0 wait_consider_task+0xbe9/0xce0 ? do_wait+0xe1/0x330 do_wait+0x151/0x330 kernel_wait4+0x8d/0x150 ? task_stopped_code+0x50/0x50 SYSC_wait4+0x95/0xa0 ? rcu_read_lock_sched_held+0x6c/0x80 ? syscall_trace_enter+0x2d7/0x340 ? do_syscall_64+0x60/0x210 do_syscall_64+0x60/0x210 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f7cb82603aa RSP: 002b:00007ffd60770bc8 EFLAGS: 00000246 ORIG_RAX: 000000000000003d RAX: ffffffffffffffda RBX: 00007f7cb6cd4000 RCX: 00007f7cb82603aa RDX: 000000000000000b RSI: 00007ffd60770bd0 RDI: 0000000000007cca RBP: 0000000000007cca R08: 00007f7cb8965700 R09: 00007ffd607c7080 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd60770bd0 R14: 00007f7cb6cd4058 R15: 00000000cccccccd Code: c1 e2 04 44 8b 60 30 48 8b 40 38 44 8b 34 11 48 c7 c2 60 3a f5 81 44 89 e1 4c 8b 68 58 e8 4b b4 77 00 89 44 24 14 48 8d 74 24 10 <49> 8b 7d 00 e8 b9 6a f9 ff 48 85 c0 74 1a 48 89 c7 48 89 44 24 RIP: proc_flush_task+0x8e/0x1b0 RSP: ffffc9000bbffc40 CR2: 0000000000000000 ---[ end trace 53d67a6481059862 ]--- Improve the quality of the implementation by resetting the place to start allocating pids on failure to allocate the first pid. As improving the quality of the implementation is the goal remove the now unnecesarry disable_pid_allocations call when we fail to mount proc. Fixes: 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API") Fixes: 8ef047aaaeb8 ("pid namespaces: make alloc_pid(), free_pid() and put_pid() work with struct upid") Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-12-23Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds4-41/+236
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "These fixes are all tagged for -stable and have received a build success notification from the kbuild robot. - NVDIMM namespaces, configured to enforce 1GB alignment, fail to initialize on platforms that mis-align the start or end of the physical address range. - The Linux implementation of the BTT (Block Translation Table) is incompatible with the UEFI 2.7 definition of the BTT format. The BTT layers a software atomic sector semantic on top of an NVDIMM namespace. Linux needs to be compatible with the UEFI definition to enable boot support or any pre-OS access of data on a BTT enabled namespace. - A fix for ACPI SMART notification events, this allows a userspace monitor to register for health events rather than poll. This has been broken since it was initially merged as the unit test inadvertently worked around the problem. The urgency for fixing this during the -rc series is driven by how expensive it is to poll for this data (System Management Mode entry)" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, btt: Fix an incompatibility in the log layout libnvdimm, btt: add a couple of missing kernel-doc lines libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment libnvdimm, pfn: fix start_pad handling for aligned namespaces acpi, nfit: fix health event notification
2017-12-23x86/ldt: Make the LDT mapping ROThomas Gleixner4-12/+11
Now that the LDT mapping is in a known area when PAGE_TABLE_ISOLATION is enabled its a primary target for attacks, if a user space interface fails to validate a write address correctly. That can never happen, right? The SDM states: If the segment descriptors in the GDT or an LDT are placed in ROM, the processor can enter an indefinite loop if software or the processor attempts to update (write to) the ROM-based segment descriptors. To prevent this problem, set the accessed bits for all segment descriptors placed in a ROM. Also, remove operating-system or executive code that attempts to modify segment descriptors located in ROM. So its a valid approach to set the ACCESS bit when setting up the LDT entry and to map the table RO. Fixup the selftest so it can handle that new mode. Remove the manual ACCESS bit setter in set_tls_desc() as this is now pointless. Folded the patch from Peter Ziljstra. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/dump_pagetables: Allow dumping current pagetablesThomas Gleixner3-6/+73
Add two debugfs files which allow to dump the pagetable of the current task. current_kernel dumps the regular page table. This is the page table which is normally shared between kernel and user space. If kernel page table isolation is enabled this is the kernel space mapping. If kernel page table isolation is enabled the second file, current_user, dumps the user space page table. These files allow to verify the resulting page tables for page table isolation, but even in the normal case its useful to be able to inspect user space page tables of current for debugging purposes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/dump_pagetables: Check user space page table for WX pagesThomas Gleixner3-6/+27
ptdump_walk_pgd_level_checkwx() checks the kernel page table for WX pages, but does not check the PAGE_TABLE_ISOLATION user space page table. Restructure the code so that dmesg output is selected by an explicit argument and not implicit via checking the pgd argument for !NULL. Add the check for the user space page table. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchyBorislav Petkov1-5/+10
The upcoming support for dumping the kernel and the user space page tables of the current process would create more random files in the top level debugfs directory. Add a page table directory and move the existing file to it. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/pti: Add KconfigDave Hansen1-0/+10
Finally allow CONFIG_PAGE_TABLE_ISOLATION to be enabled. PARAVIRT generally requires that the kernel not manage its own page tables. It also means that the hypervisor and kernel must agree wholeheartedly about what format the page tables are in and what they contain. PAGE_TABLE_ISOLATION, unfortunately, changes the rules and they can not be used together. I've seen conflicting feedback from maintainers lately about whether they want the Kconfig magic to go first or last in a patch series. It's going last here because the partially-applied series leads to kernels that can not boot in a bunch of cases. I did a run through the entire series with CONFIG_PAGE_TABLE_ISOLATION=y to look for build errors, though. [ tglx: Removed SMP and !PARAVIRT dependencies as they not longer exist ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/dumpstack: Indicate in Oops whether PTI is configured and enabledVlastimil Babka1-2/+4
CONFIG_PAGE_TABLE_ISOLATION is relatively new and intrusive feature that may still have some corner cases which could take some time to manifest and be fixed. It would be useful to have Oops messages indicate whether it was enabled for building the kernel, and whether it was disabled during boot. Example of fully enabled: Oops: 0001 [#1] SMP PTI Example of enabled during build, but disabled during boot: Oops: 0001 [#1] SMP NOPTI We can decide to remove this after the feature has been tested in the field long enough. [ tglx: Made it use boot_cpu_has() as requested by Borislav ] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Eduardo Valentin <eduval@amazon.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: bpetkov@suse.de Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: jkosina@suse.cz Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Clarify the whole ASID/kernel PCID/user PCID namingPeter Zijlstra1-12/+43
Ideally we'd also use sparse to enforce this separation so it becomes much more difficult to mess up. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Use INVPCID for __native_flush_tlb_single()Dave Hansen3-28/+60
This uses INVPCID to shoot down individual lines of the user mapping instead of marking the entire user map as invalid. This could/might/possibly be faster. This for sure needs tlb_single_page_flush_ceiling to be redetermined; esp. since INVPCID is _slow_. A detailed performance analysis is available here: https://lkml.kernel.org/r/3062e486-3539-8a1f-5724-16199420be71@intel.com [ Peterz: Split out from big combo patch ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Optimize RESTORE_CR3Peter Zijlstra2-4/+30
Most NMI/paranoid exceptions will not in fact change pagetables and would thus not require TLB flushing, however RESTORE_CR3 uses flushing CR3 writes. Restores to kernel PCIDs can be NOFLUSH, because we explicitly flush the kernel mappings and now that we track which user PCIDs need flushing we can avoid those too when possible. This does mean RESTORE_CR3 needs an additional scratch_reg, luckily both sites have plenty available. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Use/Fix PCID to optimize user/kernel switchesPeter Zijlstra9-33/+162
We can use PCID to retain the TLBs across CR3 switches; including those now part of the user/kernel switch. This increases performance of kernel entry/exit at the cost of more expensive/complicated TLB flushing. Now that we have two address spaces, one for kernel and one for user space, we need two PCIDs per mm. We use the top PCID bit to indicate a user PCID (just like we use the PFN LSB for the PGD). Since we do TLB invalidation from kernel space, the existing code will only invalidate the kernel PCID, we augment that by marking the corresponding user PCID invalid, and upon switching back to userspace, use a flushing CR3 write for the switch. In order to access the user_pcid_flush_mask we use PER_CPU storage, which means the previously established SWAPGS vs CR3 ordering is now mandatory and required. Having to do this memory access does require additional registers, most sites have a functioning stack and we can spill one (RAX), sites without functional stack need to otherwise provide the second scratch register. Note: PCID is generally available on Intel Sandybridge and later CPUs. Note: Up until this point TLB flushing was broken in this series. Based-on-code-from: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Abstract switching CR3Dave Hansen1-2/+20
In preparation to adding additional PCID flushing, abstract the loading of a new ASID into CR3. [ PeterZ: Split out from big combo patch ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm: Allow flushing for future ASID switchesDave Hansen2-8/+64
If changing the page tables in such a way that an invalidation of all contexts (aka. PCIDs / ASIDs) is required, they can be actively invalidated by: 1. INVPCID for each PCID (works for single pages too). 2. Load CR3 with each PCID without the NOFLUSH bit set 3. Load CR3 with the NOFLUSH bit set for each and do INVLPG for each address. But, none of these are really feasible since there are ~6 ASIDs (12 with PAGE_TABLE_ISOLATION) at the time that invalidation is required. Instead of actively invalidating them, invalidate the *current* context and also mark the cpu_tlbstate _quickly_ to indicate future invalidation to be required. At the next context-switch, look for this indicator ('invalidate_other' being set) invalidate all of the cpu_tlbstate.ctxs[] entries. This ensures that any future context switches will do a full flush of the TLB, picking up the previous changes. [ tglx: Folded more fixups from Peter ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/pti: Map the vsyscall page if neededAndy Lutomirski3-3/+69
Make VSYSCALLs work fully in PTI mode by mapping them properly to the user space visible page tables. [ tglx: Hide unused functions (Patch by Arnd Bergmann) ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/pti: Put the LDT in its own PGD if PTI is onAndy Lutomirski6-17/+220
With PTI enabled, the LDT must be mapped in the usermode tables somewhere. The LDT is per process, i.e. per mm. An earlier approach mapped the LDT on context switch into a fixmap area, but that's a big overhead and exhausted the fixmap space when NR_CPUS got big. Take advantage of the fact that there is an address space hole which provides a completely unused pgd. Use this pgd to manage per-mm LDT mappings. This has a down side: the LDT isn't (currently) randomized, and an attack that can write the LDT is instant root due to call gates (thanks, AMD, for leaving call gates in AMD64 but designing them wrong so they're only useful for exploits). This can be mitigated by making the LDT read-only or randomizing the mapping, either of which is strightforward on top of this patch. This will significantly slow down LDT users, but that shouldn't matter for important workloads -- the LDT is only used by DOSEMU(2), Wine, and very old libc implementations. [ tglx: Cleaned it up. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/64: Make a full PGD-entry size hole in the memory mapAndy Lutomirski2-4/+4
Shrink vmalloc space from 16384TiB to 12800TiB to enlarge the hole starting at 0xff90000000000000 to be a full PGD entry. A subsequent patch will use this hole for the pagetable isolation LDT alias. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/events/intel/ds: Map debug buffers in cpu_entry_areaHugh Dickins2-45/+82
The BTS and PEBS buffers both have their virtual addresses programmed into the hardware. This means that any access to them is performed via the page tables. The times that the hardware accesses these are entirely dependent on how the performance monitoring hardware events are set up. In other words, there is no way for the kernel to tell when the hardware might access these buffers. To avoid perf crashes, place 'debug_store' allocate pages and map them into the cpu_entry_area. The PEBS fixup buffer does not need this treatment. [ tglx: Got rid of the kaiser_add_mapping() complication ] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/cpu_entry_area: Add debugstore entries to cpu_entry_areaThomas Gleixner5-21/+81
The Intel PEBS/BTS debug store is a design trainwreck as it expects virtual addresses which must be visible in any execution context. So it is required to make these mappings visible to user space when kernel page table isolation is active. Provide enough room for the buffer mappings in the cpu_entry_area so the buffers are available in the user space visible page tables. At the point where the kernel side entry area is populated there is no buffer available yet, but the kernel PMD must be populated. To achieve this set the entries for these buffers to non present. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/pti: Map ESPFIX into user spaceAndy Lutomirski1-0/+11
Map the ESPFIX pages into user space when PTI is enabled. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/pti: Share entry text PMDThomas Gleixner1-0/+10
Share the entry text PMD of the kernel mapping with the user space mapping. If large pages are enabled this is a single PMD entry and at the point where it is copied into the user page table the RW bit has not been cleared yet. Clear it right away so the user space visible map becomes RX. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/entry: Align entry text section to PMD boundaryThomas Gleixner1-0/+8
The (irq)entry text must be visible in the user space page tables. To allow simple PMD based sharing, make the entry text PMD aligned. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-23x86/mm/pti: Share cpu_entry_area with user space page tablesAndy Lutomirski1-0/+25
Share the cpu entry area so the user space and kernel space page tables have the same P4D page. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>