commit dad5c1402c570cd07a80113784bc20a7f930c8ae
Author: Greg Kroah-Hartman
Date:   Mon Dec 25 14:26:48 2017 +0100

    Linux 4.14.9

commit a9772285a7246c2a816da7f50c8c1e94b264770d
Author: Will Deacon
Date:   Tue Oct 24 11:22:46 2017 +0100

    linux/compiler.h: Split into compiler.h and compiler_types.h

    commit d15155824c5014803d91b829736d249c500bdda6 upstream.

    linux/compiler.h is included indirectly by linux/types.h via
    uapi/linux/types.h -> uapi/linux/posix_types.h -> linux/stddef.h
    -> uapi/linux/stddef.h and is needed to provide a proper definition of
    offsetof.

    Unfortunately, compiler.h requires a definition of
    smp_read_barrier_depends() for defining lockless_dereference() and soon
    for defining READ_ONCE(), which means that all users of READ_ONCE()
    will need to include asm/barrier.h to avoid splats such as:

       In file included from include/uapi/linux/stddef.h:1:0,
                        from include/linux/stddef.h:4,
                        from arch/h8300/kernel/asm-offsets.c:11:
       include/linux/list.h: In function 'list_empty':
    >> include/linux/compiler.h:343:2: error: implicit declaration of function 'smp_read_barrier_depends' [-Werror=implicit-function-declaration]
         smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
         ^

    A better alternative is to include asm/barrier.h in linux/compiler.h,
    but this requires a type definition for "bool" on some architectures
    (e.g. x86), which is defined later by linux/types.h. Type "bool" is also
    used directly in linux/compiler.h, so the whole thing is pretty fragile.

    This patch splits compiler.h in two: compiler_types.h contains type
    annotations, definitions and the compiler-specific parts, whereas
    compiler.h #includes compiler_types.h and additionally defines macros
    such as {READ,WRITE,ACCESS}_ONCE().

    uapi/linux/stddef.h and linux/linkage.h are then moved over to include
    linux/compiler_types.h, which fixes the build for h8 and blackfin.

    Signed-off-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1508840570-22169-2-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

commit d605778b613a535d258c1e6a931bc2153aa63185
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:12 2017 +0100

    selftests/bpf: add tests for recent bugfixes

    From: Jann Horn

    [ Upstream commit 2255f8d520b0a318fc6d387d0940854b2f522a7f ]

    These tests should cover the following cases:

     - MOV with both zero-extended and sign-extended immediates
     - implicit truncation of register contents via ALU32/MOV32
     - implicit 32-bit truncation of ALU32 output
     - oversized register source operand for ALU32 shift
     - right-shift of a number that could be positive or negative
     - map access where adding the operation size to the offset causes
       signed 32-bit overflow
     - direct stack access at a ~4GiB offset

    Also remove the F_LOAD_WITH_STRICT_ALIGNMENT flag from a bunch of tests
    that should fail independent of what flags userspace passes.

    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit de31796c052e47c99b1bb342bc70aa826733e862
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:11 2017 +0100

    bpf: fix integer overflows

    From: Alexei Starovoitov

    [ Upstream commit bb7f0f989ca7de1153bd128a40a71709e339fa03 ]

    There were various issues related to the limited size of integers used
    in the verifier:

     - `off + size` overflow in __check_map_access()
     - `off + reg->off` overflow in check_mem_access()
     - `off + reg->var_off.value` overflow or 32-bit truncation of
       `reg->var_off.value` in check_mem_access()
     - 32-bit truncation in check_stack_boundary()

    Make sure that any integer math cannot overflow by not allowing
    pointer math with large values.

    Also reduce the scope of "scalar op scalar" tracking.
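As an illustration of the `off + size` item above, here is a minimal Python sketch (not kernel code) of how a bounds check can be defeated when the sum wraps in signed 32-bit arithmetic. The helper names `wrap_s32`, `naive_access_ok` and `wide_access_ok` are hypothetical, and the exact shape of the kernel check is an assumption. The wide-sum variant is shown only for contrast; the fix described in this commit instead disallows pointer math with large values.

```python
INT_MAX = 2**31 - 1


def wrap_s32(x):
    """Reduce an integer to a signed 32-bit value (two's-complement wrap)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def naive_access_ok(off, size, value_size):
    # Assumed check shape: the sum is formed in 32-bit signed arithmetic,
    # so a huge offset can wrap to a negative number and pass the test.
    return not (off < 0 or size <= 0 or wrap_s32(off + size) > value_size)


def wide_access_ok(off, size, value_size):
    # Same test with the sum formed without wraparound, for contrast only.
    return not (off < 0 or size <= 0 or off + size > value_size)


if __name__ == "__main__":
    # An 8-byte access at offset INT_MAX into a 64-byte map value.
    print(naive_access_ok(INT_MAX, 8, 64))  # wrapped sum slips through
    print(wide_access_ok(INT_MAX, 8, 64))   # rejected
```

Both variants agree on small, well-formed accesses; they differ only once the sum no longer fits in 32 bits.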

    Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
    Reported-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit cb56cc1b292b8b3f787fad89f1208f8e98d12c7d
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:10 2017 +0100

    bpf: don't prune branches when a scalar is replaced with a pointer

    From: Jann Horn

    [ Upstream commit 179d1c5602997fef5a940c6ddcf31212cbfebd14 ]

    This could be made safe by passing through a reference to env and
    checking for env->allow_ptr_leaks, but it would only work one way and is
    probably not worth the hassle - not doing it will not directly lead to
    program rejection.

    Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit c90268f7cbee0781331b96d1423d0f28a6183889
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:09 2017 +0100

    bpf: force strict alignment checks for stack pointers

    From: Jann Horn

    [ Upstream commit a5ec6ae161d72f01411169a938fa5f8baea16e8f ]

    Force strict alignment checks for stack pointers because the tracking
    of stack spills relies on it; unaligned stack accesses can lead to
    corruption of spilled registers, which is exploitable.

    Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 2120fca0ecfb4552d27608d409ebd3403ce02ce4
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:08 2017 +0100

    bpf: fix missing error return in check_stack_boundary()

    From: Jann Horn

    Prevent indirect stack accesses at non-constant addresses, which would
    permit reading and corrupting spilled pointers.

    Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 6c8e098d0324412d4ae9e06c7e611a96b87faf80
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:07 2017 +0100

    bpf: fix 32-bit ALU op verification

    From: Jann Horn

    [ Upstream commit 468f6eafa6c44cb2c5d8aad35e12f06c240a812a ]

    32-bit ALU ops operate on 32-bit values and have 32-bit outputs.
    Adjust the verifier accordingly.

    Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit bf5ee24e87e39548bf30d4e18e479e61a5a98336
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:06 2017 +0100

    bpf: fix incorrect tracking of register size truncation

    From: Jann Horn

    [ Upstream commit 0c17d1d2c61936401f4702e1846e2c19b200f958 ]

    Properly handle register truncation to a smaller size.

    The old code first mirrors the clearing of the high 32 bits in the
    bitwise tristate representation, which is correct. But then, it computes
    the new arithmetic bounds as the intersection between the old arithmetic
    bounds and the bounds resulting from the bitwise tristate representation.
    Therefore, when coerce_reg_to_32() is called on a number with bounds
    [0xffff'fff8, 0x1'0000'0007], the verifier computes
    [0xffff'fff8, 0xffff'ffff] as bounds of the truncated number.
    This is incorrect: The truncated number could also be in the range
    [0, 7], and no meaningful arithmetic bounds can be computed in that case
    apart from the obvious [0, 0xffff'ffff].

    Starting with v4.14, this is exploitable by unprivileged users as long
    as the unprivileged_bpf_disabled sysctl isn't set.

    Debian assigned CVE-2017-16996 for this issue.
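The bounds example in this message can be checked by brute force. The Python sketch below (not kernel code) enumerates every value in [0xffff'fff8, 0x1'0000'0007], truncates each to 32 bits, and shows which results fall outside the old bounds. The function names are hypothetical, and `conservative_bounds_32` is only one sound way to bound the result, assumed here for illustration rather than taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def truncated_values(umin, umax):
    """Every possible 32-bit result for a 64-bit value in [umin, umax]."""
    return sorted({v & MASK32 for v in range(umin, umax + 1)})


def conservative_bounds_32(umin, umax):
    # If both bounds share the same upper 32 bits, the low halves keep
    # their order; otherwise the range crosses a 2**32 boundary and only
    # the trivial bounds [0, 0xffffffff] are safe.
    if (umin >> 32) == (umax >> 32):
        return umin & MASK32, umax & MASK32
    return 0, MASK32


if __name__ == "__main__":
    vals = truncated_values(0xFFFFFFF8, 0x100000007)
    old_lo, old_hi = 0xFFFFFFF8, 0xFFFFFFFF  # bounds the old code derived
    missed = [v for v in vals if not old_lo <= v <= old_hi]
    print("values outside the old bounds:", missed)
    print("conservative bounds:", conservative_bounds_32(0xFFFFFFF8, 0x100000007))
```

The values outside the old bounds are exactly the range [0, 7] that the message describes.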

    v2:
     - flip the mask during arithmetic bounds calculation (Ben Hutchings)
    v3:
     - add CVE number (Ben Hutchings)

    Fixes: b03c9f9fdc37 ("bpf/verifier: track signed and unsigned min/max values")
    Signed-off-by: Jann Horn
    Acked-by: Edward Cree
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 6e12ea4fb45ca86cdd7425276b6993455fee947a
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:05 2017 +0100

    bpf: fix incorrect sign extension in check_alu_op()

    From: Jann Horn

    [ Upstream commit 95a762e2c8c942780948091f8f2a4f32fce1ac6f ]

    Distinguish between
    BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)
    and BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit);
    only perform sign extension in the first case.

    Starting with v4.14, this is exploitable by unprivileged users as long
    as the unprivileged_bpf_disabled sysctl isn't set.

    Debian assigned CVE-2017-16995 for this issue.

    v3:
     - add CVE number (Ben Hutchings)

    Fixes: 484611357c19 ("bpf: allow access into map value arrays")
    Signed-off-by: Jann Horn
    Acked-by: Edward Cree
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 4d54f7df5131d67f653f674003ec5f52c9818b53
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:04 2017 +0100

    bpf/verifier: fix bounds calculation on BPF_RSH

    From: Edward Cree

    [ Upstream commit 4374f256ce8182019353c0c639bb8d0695b4c941 ]

    Incorrect signed bounds were being computed.
    If the old upper signed bound was positive and the old lower signed
    bound was negative, this could cause the new upper signed bound to be
    too low, leading to security issues.
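A small Python sketch (not kernel code) of why a signed range that straddles zero is awkward for a logical right shift: the shift treats the register as unsigned, so a negative input becomes a very large positive result. The helpers `to_s64` and `bpf_rsh` are hypothetical names, and the sample range [-1, 1] is an assumption chosen for illustration.

```python
MASK64 = (1 << 64) - 1


def to_s64(x):
    """Reinterpret the low 64 bits of x as a signed 64-bit integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def bpf_rsh(value, shift):
    # Logical (unsigned) right shift of a 64-bit register that currently
    # holds the signed number `value`; the result is read back as signed.
    return to_s64((value & MASK64) >> shift)


if __name__ == "__main__":
    # Register known to lie in the signed range [-1, 1], shifted right by 1.
    results = [bpf_rsh(v, 1) for v in (-1, 0, 1)]
    print("true signed max after shift:", max(results))
    print("old upper bound shifted on its own:", bpf_rsh(1, 1))
```

Shifting only the old upper bound gives 0, while the input -1 produces the largest positive signed value, so an upper bound derived that way is too low.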

    Fixes: b03c9f9fdc37 ("bpf/verifier: track signed and unsigned min/max values")
    Reported-by: Jann Horn
    Signed-off-by: Edward Cree
    Acked-by: Alexei Starovoitov
    [jannh@google.com: changed description to reflect bug impact]
    Signed-off-by: Jann Horn
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 82a9d62f603f0cb5549c4ca554f06e70510b7296
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:03 2017 +0100

    bpf, sparc: fix usage of wrong reg for load_skb_regs after call

    [ Upstream commit 07aee94394547721ac168cbf4e1c09c14a5fe671 ]

    When LD_ABS/IND is used in the program, and we have a BPF helper
    call that changes packet data (bpf_helper_changes_pkt_data() returns
    true), then in case of sparc JIT, we try to reload cached skb data
    from bpf2sparc[BPF_REG_6]. However, there is no such guarantee or
    assumption that skb sits in R6 at this point, all helpers changing
    skb data only have a guarantee that skb sits in R1. Therefore, store
    BPF R1 in L7 temporarily and after procedure call use L7 to reload
    cached skb data. skb sitting in R6 is only true at the time when
    LD_ABS/IND is executed.

    Fixes: 7a12b5031c6b ("sparc64: Add eBPF JIT.")
    Signed-off-by: Daniel Borkmann
    Acked-by: David S. Miller
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Greg Kroah-Hartman

commit 8a681dfd8fb253cd3cac03a59e2867f9baad7934
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:02 2017 +0100

    bpf, ppc64: do not reload skb pointers in non-skb context

    [ Upstream commit 87338c8e2cbb317b5f757e6172f94e2e3799cd20 ]

    The assumption of unconditionally reloading skb pointers on
    BPF helper calls where bpf_helper_changes_pkt_data() holds
    true is wrong. There can be different contexts where the helper
    would enforce a reload such as in case of XDP. Here, we do
    have a struct xdp_buff instead of struct sk_buff as context,
    thus this will access garbage.

    JITs only ever need to deal with cached skb pointer reload
    when ld_abs/ind was seen, therefore guard the reload behind
    SEEN_SKB.

    Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Naveen N. Rao
    Acked-by: Alexei Starovoitov
    Tested-by: Sandipan Das
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Greg Kroah-Hartman

commit 83ab155d144922cb7421fb975e500901185e7644
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:01 2017 +0100

    bpf, s390x: do not reload skb pointers in non-skb context

    [ Upstream commit 6d59b7dbf72ed20d0138e2f9b75ca3d4a9d4faca ]

    The assumption of unconditionally reloading skb pointers on
    BPF helper calls where bpf_helper_changes_pkt_data() holds
    true is wrong. There can be different contexts where the
    BPF helper would enforce a reload such as in case of XDP.
    Here, we do have a struct xdp_buff instead of struct sk_buff
    as context, thus this will access garbage.

    JITs only ever need to deal with cached skb pointer reload
    when ld_abs/ind was seen, therefore guard the reload behind
    SEEN_SKB only. Tested on s390x.

    Fixes: 9db7f2b81880 ("s390/bpf: recache skb->data/hlen for skb_vlan_push/pop")
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Cc: Michael Holzheu
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Greg Kroah-Hartman

commit a23244e8845f510ce3ba8b77b32cdd3d3d8ae853
Author: Daniel Borkmann
Date:   Fri Dec 22 16:23:00 2017 +0100

    bpf: fix corruption on concurrent perf_event_output calls

    [ Upstream commit 283ca526a9bd75aed7350220d7b1f8027d99c3fd ]

    When tracing and networking programs are both attached in the
    system and both use event-output helpers that eventually call
    into perf_event_output(), then we could end up in a situation
    where the tracing attached program runs in user context while
    a cls_bpf program is triggered on that same CPU out of softirq
    context.

    Since both rely on the same per-cpu perf_sample_data, we could
    potentially corrupt it.
    This can only ever happen in a combination
    of the two types; all tracing programs use a bpf_prog_active
    counter to bail out in case a program is already running on
    that CPU out of a different context. XDP and cls_bpf programs
    by themselves don't have this issue as they run in the same
    context only. Therefore, split both perf_sample_data so they
    cannot be accessed from each other.

    Fixes: 20b9d7ac4852 ("bpf: avoid excessive stack usage for perf_sample_data")
    Reported-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Tested-by: Song Liu
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Greg Kroah-Hartman

commit 2b3ea8ceb2bb71e9e58527661261dba127137d9b
Author: Daniel Borkmann
Date:   Fri Dec 22 16:22:59 2017 +0100

    bpf: fix branch pruning logic

    From: Alexei Starovoitov

    [ Upstream commit c131187db2d3fa2f8bf32fdf4e9a4ef805168467 ]

    When the verifier detects that a register contains a runtime constant
    and it's compared with another constant, it will prune exploration
    of the branch that is guaranteed not to be taken at runtime.
    This is all correct, but a malicious program may be constructed
    in such a way that it always has a constant comparison and
    the other branch is never taken under any conditions.
    In this case such a path through the program will not be explored
    by the verifier. It won't be taken at run-time either, but since
    all instructions are JITed the malicious program may cause JITs
    to complain about using reserved fields, etc.
    To fix the issue we have to track the instructions explored by
    the verifier and sanitize instructions that are dead at run time
    with NOPs. We cannot reject such dead code, since llvm generates
    it for valid C code, since it doesn't do as much data flow
    analysis as the verifier does.

    Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit 7d7545295e714751fa236c6284167342b066d825
Author: Kirill A. Shutemov
Date:   Tue Nov 7 11:33:37 2017 +0300

    mm/sparsemem: Fix ARM64 boot crash when CONFIG_SPARSEMEM_EXTREME=y

    commit 629a359bdb0e0652a8227b4ff3125431995fec6e upstream.

    Since commit:

      83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")

    we allocate the mem_section array dynamically in sparse_memory_present_with_active_regions(),
    but some architectures, like arm64, don't call the routine to initialize sparsemem.

    Let's move the initialization into memory_present(); it should cover all
    architectures.

    Reported-and-tested-by: Sudeep Holla
    Tested-by: Bjorn Andersson
    Signed-off-by: Kirill A. Shutemov
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
    Link: http://lkml.kernel.org/r/20171107083337.89952-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar
    Cc: Dan Rue
    Cc: Naresh Kamboju
    Signed-off-by: Greg Kroah-Hartman

commit 0237a0a456563d461814111db15ae26666ce730e
Author: Peter Hutterer
Date:   Mon Dec 4 10:26:17 2017 +1000

    platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes

    commit bff5bf9db1c9453ffd0a78abed3e2d040c092fd9 upstream.

    Sending the switch state change twice within the same frame is invalid
    evdev protocol and only works if the client handles keys immediately as
    well. Processing events immediately is incorrect, it forces a fake
    order of events that does not exist on the device.

    Recent versions of libinput changed to only process the device state at
    SYN_REPORT time, so now the key event is lost.

    https://bugs.freedesktop.org/show_bug.cgi?id=104041

    Signed-off-by: Peter Hutterer
    Signed-off-by: Darren Hart (VMware)
    Signed-off-by: Greg Kroah-Hartman

commit 5431aef93678a1f91e8b3ab41076f2cd1be34353
Author: Daniel Lezcano
Date:   Thu Oct 19 19:05:47 2017 +0200

    thermal/drivers/hisi: Fix multiple alarm interrupts firing

    commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

    The DT specifies a threshold of 65000, we set up the register with a
    value in the temperature resolution for the controller, 64656.

    When we reach 64656, the interrupt fires, the interrupt is disabled.
    Then the irq thread runs and calls thermal_zone_device_update() which
    will call in turn hisi_thermal_get_temp().

    The function will look if the temperature decreased, assuming it was
    more than 65000, but that is not the case because the current
    temperature is 64656 (because of the rounding when setting the
    threshold). This condition being true, we re-enable the interrupt which
    fires immediately after exiting the irq thread. That happens again and
    again until the temperature goes to more than 65000.

    Potentially, this is an interrupt storm if the temperature stabilizes
    at this temperature. A very unlikely case but possible.

    In any case, it does not make sense to handle dozens of alarm
    interrupts for nothing.

    Fix this by rounding the threshold value to the controller resolution
    so the check against the threshold is consistent with the one set in
    the controller.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

commit 02c17c0f825c2d22de05c18e97982796b634eff3
Author: Daniel Lezcano
Date:   Thu Oct 19 19:05:46 2017 +0200

    thermal/drivers/hisi: Simplify the temperature/step computation

    commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

    The step and the base temperature are fixed values, we can simplify the
    computation by converting the base temperature to milli celsius and use
    a pre-computed step value. That saves us a lot of mult + div for nothing
    at runtime.

    Take also the opportunity to change the function names to be consistent
    with the rest of the code.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

commit cf826c57785384341c3f0cf2cfbfe86c02eb6b9b
Author: Daniel Lezcano
Date:   Thu Oct 19 19:05:45 2017 +0200

    thermal/drivers/hisi: Fix kernel panic on alarm interrupt

    commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

    The threaded interrupt for the alarm interrupt is requested before the
    temperature controller is set up. This one can fire an interrupt
    immediately leading to a kernel panic as the sensor data is not
    initialized.

    In order to prevent that, move the threaded irq after the Tsensor is
    set up.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

commit 7254834c43bd32842e96fd856c9c9793a29e6fe2
Author: Daniel Lezcano
Date:   Thu Oct 19 19:05:43 2017 +0200

    thermal/drivers/hisi: Fix missing interrupt enablement

    commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

    The interrupt for the temperature threshold is not enabled at the end of
    the probe function, enable it after the setup is complete.

    On the other side, the irq_enabled is not correctly set as we are
    checking if the interrupt is masked where 'yes' means irq_enabled=false.

        irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
                              &data->irq_enabled);

    As we are always enabling the interrupt, it is pointless to check if
    the interrupt is masked or not, just set irq_enabled to 'true'.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

commit 0da9db57c0423f281db8ae2fb1a67e88d3fecd34
Author: Niranjana Vishwanathapura
Date:   Tue Sep 26 06:44:07 2017 -0700

    IB/opa_vnic: Properly return the total MACs in UC MAC list

    [ Upstream commit b77eb45e0d9c324245d165656ab3b38b6f386436 ]

    Do not include EM specified MAC address in total MACs of the
    UC MAC list.

    Reviewed-by: Sudeep Dutt
    Signed-off-by: Niranjana Vishwanathapura
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit ecffae11228f96c045eb38fa60bdf17dc677b94b
Author: Scott Franco
Date:   Tue Sep 26 06:44:13 2017 -0700

    IB/opa_vnic: Properly clear Mac Table Digest

    [ Upstream commit 4bbdfe25600c1909c26747d0b5c39fd0e409bb87 ]

    Clear the MAC table digest when the MAC table is freed.

    Reviewed-by: Niranjana Vishwanathapura
    Signed-off-by: Scott Franco
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 0d74c05ca7ef083cbc676aacfd147ff60de54c95
Author: Eric Anholt
Date:   Tue Aug 15 16:47:19 2017 -0700

    drm/vc4: Avoid using vrefresh==0 mode in DSI htotal math.

    [ Upstream commit af2eca53206c59ce9308a4f5f46c4a104a179b6b ]

    The incoming mode might have a missing vrefresh field if it came from
    drmModeSetCrtc(), which the kernel is supposed to calculate using
    drm_mode_vrefresh(). We could either use that or the adjusted_mode's
    original vrefresh value.

    However, we can maintain a more exact vrefresh value (not just the
    integer approximation), by scaling by the ratio of our clocks.

    v2: Use math suggested by Andrzej Hajda instead.
    v3: Simplify math now that adjusted_mode->clock isn't padded.
    v4: Drop some parens.

    Signed-off-by: Eric Anholt
    Link: https://patchwork.freedesktop.org/patch/msgid/20170815234722.20700-2-eric@anholt.net
    Reviewed-by: Andrzej Hajda
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 80879ecb4624d58e10cfe03550056c34483187a5
Author: Nicholas Piggin
Date:   Fri Sep 1 14:29:56 2017 +1000

    cpuidle: fix broadcast control when broadcast can not be entered

    [ Upstream commit f187851b9b4a76952b1158b86434563dd2031103 ]

    When failing to enter broadcast timer mode for an idle state that
    requires it, a new state is selected that does not require broadcast,
    but the broadcast variable remains set. This causes
    tick_broadcast_exit to be called despite not having entered broadcast
    mode.

    This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
    cases. It does not appear to cause problems for code today, but seems
    to violate the interface so should be fixed.

    Signed-off-by: Nicholas Piggin
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 051c3df7d6b8d5f1948503637c69fc939db69035
Author: Alexandre Belloni
Date:   Thu Sep 28 13:53:27 2017 +0200

    rtc: set the alarm to the next expiring timer

    [ Upstream commit 74717b28cb32e1ad3c1042cafd76b264c8c0f68d ]

    If there is any non expired timer in the queue, the RTC alarm is never set.
    This is an issue when adding a timer that expires before the next non
    expired timer.

    Ensure the RTC alarm is set in that case.

    Fixes: 2b2f5ff00f63 ("rtc: interface: ignore expired timers when enqueuing new timers")
    Signed-off-by: Alexandre Belloni
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 0aaff15c10132eb36ab16b4785cd207075e5943b
Author: Hoang Tran
Date:   Wed Sep 27 18:30:58 2017 +0200

    tcp: fix under-evaluated ssthresh in TCP Vegas

    [ Upstream commit cf5d74b85ef40c202c76d90959db4d850f301b95 ]

    With the commit 76174004a0f19785 (tcp: do not slow start when cwnd equals
    ssthresh), the comparison to the reduced cwnd in tcp_vegas_ssthresh() would
    under-evaluate the ssthresh.

    Signed-off-by: Hoang Tran
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 71e51e4d488d56345d72dc3c89e67af015afea22
Author: Chen-Yu Tsai
Date:   Fri Sep 29 16:22:54 2017 +0800

    clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision

    [ Upstream commit 7f3ed79188f2f094d0ee366fa858857fb7f511ba ]

    The HDMI DDC clock found in the CCU is the parent of the actual DDC
    clock within the HDMI controller. That clock is also named "hdmi-ddc".

    Rename the one in the CCU to "ddc". This makes more sense than renaming
    the one in the HDMI controller to something else.

    Fixes: c6e6c96d8fa6 ("clk: sunxi-ng: Add A31/A31s clocks")
    Signed-off-by: Chen-Yu Tsai
    Signed-off-by: Maxime Ripard
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit ae35e16e0a57da2a7aa8631c8929dd79bac74bde
Author: Arvind Yadav
Date:   Sat Sep 23 13:25:30 2017 +0530

    staging: greybus: light: Release memory obtained by kasprintf

    [ Upstream commit 04820da21050b35eed68aa046115d810163ead0c ]

    Free memory region, if gb_lights_channel_config is not successful.

    Signed-off-by: Arvind Yadav
    Reviewed-by: Rui Miguel Silva
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 0fbdd907e4b3836b025b8690823c7271f939a938
Author: Wei Hu(Xavier)
Date:   Fri Sep 29 23:10:12 2017 +0800

    RDMA/hns: Avoid NULL pointer exception

    [ Upstream commit 5e437b1d7e8d31ff9a4b8e898eb3a6cee309edd9 ]

    After the loop in hns_roce_v1_mr_free_work_fn function, it is possible that
    all qps will have been freed (in which case ne will be 0). If that
    happens, then later in the function when we dereference hr_qp we will
    get an exception. Check ne is not 0 to make sure we actually have an
    hr_qp left to work on.

    This patch fixes the smatch error as below:
    drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1009 hns_roce_v1_mr_free_work_fn()
    error: we previously assumed 'hr_qp' could be null

    Signed-off-by: Wei Hu (Xavier)
    Signed-off-by: Lijun Ou
    Signed-off-by: Shaobo Xu
    Signed-off-by: Doug Ledford
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 0006d8c76b0cbfc1dfefe41ac93881f1f8c15432
Author: Mike Manning
Date:   Mon Sep 25 22:01:36 2017 +0100

    net: ipv6: send NS for DAD when link operationally up

    [ Upstream commit 1f372c7bfb23286d2bf4ce0423ab488e86b74bb2 ]

    The NS for DAD are sent on admin up as long as a valid qdisc is found.
    A race condition exists by which these packets will not egress the
    interface if the operational state of the lower device is not yet up.
    The solution is to delay DAD until the link is operationally up
    according to RFC2863. Rather than only doing this, follow the existing
    code checks by deferring IPv6 device initialization altogether. The fix
    allows DAD on devices like tunnels that are controlled by userspace
    control plane. The fix has no impact on regular deployments, but means
    that there is no IPv6 connectivity until the port has been opened in
    the case of port-based network access control, which should be
    desirable.

    Signed-off-by: Mike Manning
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit a58a3af86a4e8a2bd1d2d041341efa28818c6e89
Author: Mick Tarsel
Date:   Thu Sep 28 13:53:18 2017 -0700

    ibmvnic: Set state UP

    [ Upstream commit e876a8a7e9dd89dc88c12ca2e81beb478dbe9897 ]

    State is initially reported as UNKNOWN. Before registering, call
    netif_carrier_off(). Once the device is opened, call netif_carrier_on()
    in order to set the state to UP.

    Signed-off-by: Mick Tarsel
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit fb383223d00ffa445b266c5fd54ed80fd8c57d17
Author: Jacob Keller
Date:   Mon Oct 2 07:17:50 2017 -0700

    fm10k: ensure we process SM mbx when processing VF mbx

    [ Upstream commit 17a91809942ca32c70026d2d5ba3348a2c4fdf8f ]

    When we process VF mailboxes, the driver is likely going to also queue
    up messages to the switch manager. This process merely queues up the
    FIFO, but doesn't actually begin the transmission process. Because we
    hold the mailbox lock during this VF processing, the PF<->SM mailbox is
    not getting processed at this time. Ensure that we actually process the
    PF<->SM mailbox in between each PF<->VF mailbox.

    This should ensure prompt transmission of the messages queued up after
    each VF message is received and handled.

    Signed-off-by: Jacob Keller
    Tested-by: Krishneil Singh
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 2b401d9f1d4543e4e1ad23a46726376adb958831
Author: Marek Szyprowski
Date:   Mon Oct 2 08:39:35 2017 +0200

    ARM: exynos_defconfig: Enable UAS support for Odroid HC1 board

    [ Upstream commit a99897f550de96841aecb811455a67ad7a4e39a7 ]

    Odroid HC1 board has built-in JMicron USB to SATA bridge, which supports
    UAS protocol. Compile-in support for it (instead of enabling it as module)
    to make sure that all built-in storage devices are available for rootfs.
    The bridge itself also supports fallback to standard USB Mass Storage
    protocol, but USB Mass Storage class doesn't bind to it when UAS is
    compiled as module and modules are not (yet) available.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Krzysztof Kozlowski
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit b27bbf1f5b9e62b7b0b8ce9a4f5e474c3119a9a9
Author: Alex Williamson
Date:   Mon Oct 2 12:39:09 2017 -0600

    vfio/pci: Virtualize Maximum Payload Size

    [ Upstream commit 523184972b282cd9ca17a76f6ca4742394856818 ]

    With virtual PCI-Express chipsets, we now see userspace/guest drivers
    trying to match the physical MPS setting to a virtual downstream port.
    Of course a lone physical device surrounded by virtual interconnects
    cannot make a correct decision for a proper MPS setting. Instead,
    let's virtualize the MPS control register so that writes through to
    hardware are disallowed. Userspace drivers like QEMU assume they can
    write anything to the device and we'll filter out anything dangerous.
    Since mismatched MPS can lead to AER and other faults, let's add it
    to the kernel side rather than relying on userspace virtualization to
    handle it.

    Signed-off-by: Alex Williamson
    Reviewed-by: Eric Auger
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 4297cf42691eb626ab5f72f416cf7d9ea126a410
Author: Alan Brady
Date:   Tue Aug 22 06:57:53 2017 -0400

    i40e: fix client notify of VF reset

    [ Upstream commit c53d11f669c0e7d0daf46a717b6712ad0b09de99 ]

    Currently there is a bug in which the PF driver fails to inform clients
    of a VF reset which then causes clients to leak resources. The bug
    exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE
    bit.

    When a VF is first initialized we go through a reset to initialize
    variables and allocate resources but we don't want to inform clients of
    this first reset since the client isn't fully enabled yet so we set a
    state bit signifying we're in a "pre-enabled" client state.
    During the first reset we should be clearing the bit, allowing all
    following resets to notify the client of the reset when the bit is not
    set. This patch fixes the issue by negating the 'test_and_clear_bit'
    check to accurately reflect the behavior we want.

    Signed-off-by: Alan Brady
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 8da9104839c86b9f80fb45df60ec0beae9f103b1
Author: Dick Kennedy
Date:   Fri Sep 29 17:34:31 2017 -0700

    scsi: lpfc: Fix warning messages when NVME_TARGET_FC not defined

    [ Upstream commit 2299e4323d2bf6e0728fdc6b9e8e9704978d2dd7 ]

    Warning messages when NVME_TARGET_FC not defined on ppc builds

    The lpfc_nvmet_replenish_context() function is only meaningful when NVME
    target mode is enabled. Surround the function body with ifdefs for
    target mode enablement.

    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Reported-by: Stephen Rothwell
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 6fe8e4f3e4e937ab762fe9114818c64ae215c882
Author: Dick Kennedy
Date:   Fri Sep 29 17:34:32 2017 -0700

    scsi: lpfc: PLOGI failures during NPIV testing

    [ Upstream commit e8bcf0ae4c0346fdc78ebefe0eefcaa6a6622d38 ]

    Local Reject/Invalid RPI errors seen during discovery.

    Temporary RPI cleanup was occurring regardless of SLI rev. It's only
    necessary on SLI-4. Adjust the test for whether cleanup is necessary.

    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 096232d99989e34c80bd3a15e343fc3bbdfa42d8
Author: Dick Kennedy
Date:   Fri Sep 29 17:34:42 2017 -0700

    scsi: lpfc: Fix secure firmware updates

    [ Upstream commit 184fc2b9a8bcbda9c14d0a1e7fbecfc028c7702e ]

    Firmware update fails with: status x17 add_status x56 on the final write

    If multiple DMA buffers are used for the download, some firmware revs
    have difficulty with signatures and crcs split across the dma buffer
    boundaries. Resolve by making all writes be a single 4k page in length.

    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

commit 16ddeff35b7b84eefccddae23133f3eced68a8c0
Author: Jacob Keller
Date:   Fri Aug 11 11:14:58 2017 -0700

    fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw

    [ Upstream commit 3e256ac5b1ec307e5dd5a4c99fbdbc651446c738 ]

    We've had support for setting both a minimum and maximum bandwidth via
    .ndo_set_vf_bw since commit 883a9ccbae56 ("fm10k: Add support for SR-IOV
    to driver", 2014-09-20).

    Likely because we do not support minimum rates, the declaration
    mis-ordered the "unused" parameter, which causes warnings when analyzed
    with cppcheck.

    Fix this warning by properly declaring the min_rate and max_rate
    variables in the declaration and definition (rather than using
    "unused"). Also rename "rate" to max_rate so as to clarify that we only
    support setting the maximum rate.
Signed-off-by: Jacob Keller Tested-by: Krishneil Singh Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c01b06d9ac357e1d6d95d7b62be60ecd69071c2b Author: Nicolas Dechesne Date: Tue Oct 3 11:49:51 2017 +0200 ASoC: codecs: msm8916-wcd-analog: fix module autoload [ Upstream commit 46d69e141d479585c105a4d5b2337cd2ce6967e5 ] If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Before this patch: $ modinfo snd_soc_msm8916_analog | grep alias $ After this patch: $ modinfo snd_soc_msm8916_analog | grep alias alias: of:N*T*Cqcom,pm8916-wcd-analog-codecC* alias: of:N*T*Cqcom,pm8916-wcd-analog-codec Signed-off-by: Nicolas Dechesne Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5d1b6695edb773d27f100eafc8cc9a34c58cb07f Author: Marcelo Ricardo Leitner Date: Tue Oct 3 19:20:08 2017 -0300 sctp: silence warns on sctp_stream_init allocations [ Upstream commit 1ae2eaaa229bc350b6f38fbf4ab9c873532aecfb ] As SCTP supports up to 65535 streams, that can lead to very large allocations in sctp_stream_init(). As Xin Long noticed, systems with small amounts of memory are more prone to not have enough memory and dump warnings on dmesg initiated by user actions. Thus, silence them. Also, if the reallocation of stream->out is not necessary, skip it and keep the memory we already have. Reported-by: Xin Long Tested-by: Xin Long Signed-off-by: Marcelo Ricardo Leitner Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit fa21a13d76a7eed4b1df1a5223752051dbfde948 Author: Nicholas Piggin Date: Fri Sep 29 13:29:39 2017 +1000 powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog [ Upstream commit 80e4d70b06863e0104e5a0dc78aa3710297fbd4b ] In xmon, touch_nmi_watchdog() is not expected to be checking that other CPUs have not touched the watchdog, so the code will just call touch_nmi_watchdog() once before re-enabling hard interrupts. Just update our CPU's state, and ignore apparently stuck SMP threads. Arguably touch_nmi_watchdog should check for SMP lockups, and callers should be fixed, but that's not trivial for the input code of xmon. Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 97f41b41c432e5a80c91445d92c2f4b729984d36 Author: Nicholas Piggin Date: Fri Sep 29 13:29:40 2017 +1000 powerpc/xmon: Avoid tripping SMP hardlockup watchdog [ Upstream commit 064996d62a33ffe10264b5af5dca92d54f60f806 ] The SMP hardlockup watchdog cross-checks other CPUs for lockups, which causes xmon headaches because it's assuming interrupts hard disabled means no watchdog troubles. Try to improve that by calling touch_nmi_watchdog() in obvious places where secondaries are spinning. Also annotate these spin loops with spin_begin/end calls. Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f5fec0590cd8732c810c0b6d9ae9e3a42021f2cb Author: Ed Blake Date: Mon Oct 2 11:00:33 2017 +0100 ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback [ Upstream commit c70458890ff15d858bd347fa9f563818bcd6e457 ] Add pm_runtime_get_sync and pm_runtime_put calls to set_fmt callback function. 
This fixes a bus error during boot when CONFIG_SUSPEND is defined, which occurs when this function gets called while the device is runtime disabled and device registers are accessed while the clock is disabled. Signed-off-by: Ed Blake Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f9e51fb046db69a2e83d7d7422e33f3402d1800b Author: Jean-François Têtu Date: Fri Sep 29 16:19:44 2017 -0400 ASoC: codecs: msm8916-wcd-analog: fix micbias level [ Upstream commit 664611e7e02f76fbc5470ef545b2657ed25c292b ] The macro used to set the microphone bias level causes the snd_soc_write() call to overwrite other fields in the CDC_A_MICB_1_VAL register. The macro also does not return the proper level value to use. Fix this by preserving all bits from the register that are not the level while setting the level. Signed-off-by: Jean-François Têtu Acked-by: Srinivas Kandagatla Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 16e1626e54f835cb009de675d1f6b5a0ff9183d9 Author: Tom Zanussi Date: Fri Sep 22 14:58:17 2017 -0500 tracing: Exclude 'generic fields' from histograms [ Upstream commit a15f7fc20389a8827d5859907568b201234d4b79 ] There are a small number of 'generic fields' (comm/COMM/cpu/CPU) that are found by trace_find_event_field() but are only meant for filtering. Specifically, unlike normal fields, they have a size of 0 and thus wreak havoc when used as a histogram key. Exclude these (return -EINVAL) when used as histogram keys.
Link: http://lkml.kernel.org/r/956154cbc3e8a4f0633d619b886c97f0f0edf7b4.1506105045.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 45c911bb1814980da78a34423305310d5d327b39 Author: Gabriele Paoloni Date: Thu Sep 28 15:33:05 2017 +0100 PCI/AER: Report non-fatal errors only to the affected endpoint [ Upstream commit 86acc790717fb60fb51ea3095084e331d8711c74 ] Previously, if an non-fatal error was reported by an endpoint, we called report_error_detected() for the endpoint, every sibling on the bus, and their descendents. If any of them did not implement the .error_detected() method, do_recovery() failed, leaving all these devices unrecovered. For example, the system described in the bugzilla below has two devices: 0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected() 0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected() When a device such as 74:02.0 reported a non-fatal error, do_recovery() failed because 74:03.0 lacked an .error_detected() method. But per PCIe r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and does not affect 74:03.0: Non-fatal errors are uncorrectable errors which cause a particular transaction to be unreliable but the Link is otherwise fully functional. Isolating Non-fatal from Fatal errors provides Requester/Receiver logic in a device or system management software the opportunity to recover from the error without resetting the components on the Link and disturbing other transactions in progress. Devices not associated with the transaction in error are not impacted by the error. Report non-fatal errors only to the endpoint that reported them. We really want to check for AER_NONFATAL here, but the current code structure doesn't allow that. Looking for pci_channel_io_normal is the best we can do now. 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055 Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver") Signed-off-by: Gabriele Paoloni Signed-off-by: Dongdong Liu [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit cbd6b3694a4adec779cdb9266883a21254d5897c Author: Jacob Keller Date: Tue Aug 29 05:32:31 2017 -0400 i40e/i40evf: spread CPU affinity hints across online CPUs only [ Upstream commit be664cbefc50977aaefc868ba6a1109ec9b7449d ] Currently, when setting up the IRQ for a q_vector, we set an affinity hint based on the v_idx of that q_vector. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in simultaneous multithreading (SMT) scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In general case, when SMT is off the cpu_online_mask has only C bits set: 0, 1*N, 2*N, ..., C*(N-1) where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. Instead, we should only assign hints for CPUs which are online. Even better, the kernel already provides a function, cpumask_local_spread() which takes an index and returns a CPU, spreading the interrupts across local NUMA nodes first, and then remote ones if necessary. 
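To make the sparse-mask problem above concrete, here is a minimal userspace sketch, not the driver code. It assumes at most 64 CPUs packed into a uint64_t, with the SMT-off online CPUs at 0, N, 2*N, ..., (C-1)*N. All names prefixed with sketch_ are hypothetical, and sketch_spread_hint() only approximates a node-agnostic spread by picking the i-th online CPU with wrap-around.

```c
#include <stdint.h>

/* Bit i set means CPU i is online. With SMT off, only the first
 * logical CPU of each core stays online. Assumes cores * threads <= 64. */
static uint64_t sketch_online_mask_smt_off(unsigned int cores,
					   unsigned int threads_per_core)
{
	uint64_t mask = 0;
	unsigned int c;

	for (c = 0; c < cores; c++)
		mask |= UINT64_C(1) << (c * threads_per_core);
	return mask;
}

static int sketch_cpu_is_online(uint64_t online, unsigned int cpu)
{
	return cpu < 64 && ((online >> cpu) & 1);
}

/* Old behaviour: the hint for vector v_idx is simply CPU v_idx,
 * whether or not that CPU is online. */
static unsigned int sketch_naive_hint(unsigned int v_idx)
{
	return v_idx;
}

/* Simplified spread: return the v_idx-th online CPU, wrapping around
 * when there are more vectors than online CPUs. Returns -1 if no CPU
 * is online. NUMA locality is ignored on purpose. */
static int sketch_spread_hint(uint64_t online, unsigned int v_idx)
{
	unsigned int weight = 0, cpu;

	for (cpu = 0; cpu < 64; cpu++)
		weight += (online >> cpu) & 1;
	if (weight == 0)
		return -1;

	v_idx %= weight;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!((online >> cpu) & 1))
			continue;
		if (v_idx-- == 0)
			return (int)cpu;
	}
	return -1;
}
```

With 8 cores and 8 threads per core, sketch_naive_hint(1) names CPU 1, which is offline, while sketch_spread_hint() for the same vector index lands on CPU 8.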
Since we generally have a 1:1 mapping between vectors and CPUs, there is no real advantage to spreading vectors to local CPUs first. In order to avoid mismatch of the default XPS hints, we'll pass -1 so that it spreads across all CPUs without regard to the node locality. Note that we don't need to change the q_vector->affinity_mask as this is initialized to cpu_possible_mask, until an actual affinity is set and then notified back to us. Signed-off-by: Jacob Keller Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit da548d5a6f9eb188b4b0b07dd70f60d0b7ca05d3 Author: Hans de Goede Date: Wed Oct 4 20:43:36 2017 +0200 Bluetooth: hci_bcm: Fix setting of irq trigger type [ Upstream commit 227630cccdbb8f8a1b24ac26517b75079c9a69c9 ] This commit fixes 2 issues with host-wake irq trigger type handling in hci_bcm: 1) bcm_setup_sleep sets sleep_params.host_wake_active based on bcm_device.irq_polarity, but bcm_request_irq was always requesting IRQF_TRIGGER_RISING as trigger type independent of irq_polarity. This was a problem when the irq is described as a GpioInt rather then an Interrupt in the DSDT as for GpioInt-s the value passed to request_irq is honored. This commit fixes this by requesting the correct trigger type depending on bcm_device.irq_polarity. 2) bcm_device.irq_polarity was used to directly store an ACPI polarity value (ACPI_ACTIVE_*). This is undesirable because hci_bcm is also used with device-tree and checking for something like ACPI_ACTIVE_LOW in a non ACPI specific function like bcm_request_irq feels wrong. This commit fixes this by renaming irq_polarity to irq_active_low and changing its type to a bool. 
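As a rough illustration of the two hci_bcm points above, the following userspace sketch derives the trigger type from a bool instead of an ACPI polarity value. The SKETCH_TRIGGER_* constants and the struct are stand-ins invented for this example, and the mapping of active-low to a falling-edge trigger is an assumption, not a quote from the driver.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's IRQF_TRIGGER_* flags so the sketch builds
 * outside the kernel tree; the values are illustrative. */
#define SKETCH_TRIGGER_RISING	0x1UL
#define SKETCH_TRIGGER_FALLING	0x2UL

/* Hypothetical, reduced device struct: a plain bool replaces the
 * ACPI-specific polarity value, so non-ACPI code can use it too. */
struct sketch_bcm_device {
	bool irq_active_low;
};

/* Pick the edge that corresponds to the line becoming active:
 * falling for an active-low host-wake line, rising otherwise. */
static unsigned long sketch_irq_trigger(const struct sketch_bcm_device *bdev)
{
	return bdev->irq_active_low ? SKETCH_TRIGGER_FALLING
				    : SKETCH_TRIGGER_RISING;
}
```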
Signed-off-by: Hans de Goede Signed-off-by: Marcel Holtmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 56ea88ec49042b34f6ba10280c1fddf0703b5e57 Author: Hans de Goede Date: Wed Oct 4 20:43:35 2017 +0200 Bluetooth: hci_uart_set_flow_control: Fix NULL deref when using serdev [ Upstream commit 7841d554809b518a22349e7e39b6b63f8a48d0fb ] Fix a NULL pointer deref (hu->tty) when calling hci_uart_set_flow_control on hci_uart-s using serdev. Signed-off-by: Hans de Goede Signed-off-by: Marcel Holtmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 44ee83c6d6e0b0913f32f3580270daa9c784cc0d Author: Andrew Jeffery Date: Fri Sep 1 15:08:58 2017 +0930 leds: pca955x: Don't invert requested value in pca955x_gpio_set_value() [ Upstream commit 52ca7d0f7bdad832b291ed979146443533ee79c0 ] The PCA9552 lines can be used either for driving LEDs or as GPIOs. The manual states that for LEDs, the operation is open-drain: The LSn LED select registers determine the source of the LED data. 00 = output is set LOW (LED on) 01 = output is set high-impedance (LED off; default) 10 = output blinks at PWM0 rate 11 = output blinks at PWM1 rate For GPIOs it suggests a pull-up so that the open-case drives the line high: For use as output, connect external pull-up resistor to the pin and size it according to the DC recommended operating characteristics. LED output pin is HIGH when the output is programmed as high-impedance, and LOW when the output is programmed LOW through the ‘LED selector’ register. The output can be pulse-width controlled when PWM0 or PWM1 are used. Now, I have a hardware design that uses the LED controller to control LEDs. However, for $reasons, we're using the leds-gpio driver to drive the them. The reasons are here are a tangent but lead to the discovery of the inversion, which manifested as the LEDs being set to full brightness at boot when we expected them to be off. 
As we're driving the LEDs through leds-gpio, this means wending our way through the gpiochip abstractions. So with that in mind we need to describe an active-low GPIO configuration to drive the LEDs as though they were GPIOs. The set() gpiochip callback in leds-pca955x does the following: ... if (val) pca955x_led_set(&led->led_cdev, LED_FULL); else pca955x_led_set(&led->led_cdev, LED_OFF); ... Where LED_FULL = 255. pca955x_led_set() in turn does: ... switch (value) { case LED_FULL: ls = pca955x_ledsel(ls, ls_led, PCA955X_LS_LED_ON); break; ... Where PCA955X_LS_LED_ON is defined as: #define PCA955X_LS_LED_ON 0x0 /* Output LOW */ So here we have some type confusion: We've crossed domains from GPIO behaviour to LED behaviour without accounting for possible inversions in the process. Stepping back to leds-gpio for a moment, during probe() we call create_gpio_led(), which eventually executes: if (template->default_state == LEDS_GPIO_DEFSTATE_KEEP) { state = gpiod_get_value_cansleep(led_dat->gpiod); if (state < 0) return state; } else { state = (template->default_state == LEDS_GPIO_DEFSTATE_ON); } ... 
ret = gpiod_direction_output(led_dat->gpiod, state); In the devicetree the GPIO is annotated as active-low, and gpiod_get_value_cansleep() handles this for us: int gpiod_get_value_cansleep(const struct gpio_desc *desc) { int value; might_sleep_if(extra_checks); VALIDATE_DESC(desc); value = _gpiod_get_raw_value(desc); if (value < 0) return value; if (test_bit(FLAG_ACTIVE_LOW, &desc->flags)) value = !value; return value; } _gpiod_get_raw_value() in turn calls through the get() callback for the gpiochip implementation, so returning to our get() implementation in leds-pca955x we find we extract the raw value from hardware: static int pca955x_gpio_get_value(struct gpio_chip *gc, unsigned int offset) { struct pca955x *pca955x = gpiochip_get_data(gc); struct pca955x_led *led = &pca955x->leds[offset]; u8 reg = pca955x_read_input(pca955x->client, led->led_num / 8); return !!(reg & (1 << (led->led_num % 8))); } This behaviour is not symmetric with that of set(), where the val is inverted by the driver. Closing the loop on the GPIO_ACTIVE_LOW inversions, gpiod_direction_output(), like gpiod_get_value_cansleep(), handles it for us: int gpiod_direction_output(struct gpio_desc *desc, int value) { VALIDATE_DESC(desc); if (test_bit(FLAG_ACTIVE_LOW, &desc->flags)) value = !value; else value = !!value; return _gpiod_direction_output_raw(desc, value); } All-in-all, with a value of 'keep' for default-state property in a leds-gpio child node, the current state of the hardware will in-fact be inverted; precisely the opposite of what was intended. Rework leds-pca955x so that we avoid the incorrect inversion and clarify the semantics with respect to GPIO. 
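The round trip described above can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not the driver: raw_level stands for the electrical level of the pin (1 = high-impedance with pull-up, 0 = driven LOW), and every sketch_ name is hypothetical. The invert_on_set flag selects between the old set() path, which went through LED_FULL/LED_OFF, and a set() that is symmetric with get().

```c
#include <stdbool.h>

struct sketch_pin {
	int raw_level;		/* 1 = high-impedance (HIGH), 0 = driven LOW */
	bool active_low;	/* models FLAG_ACTIVE_LOW from the devicetree */
	bool invert_on_set;	/* true = old set() behaviour */
};

/* gpiochip get(): reports the raw level, as the driver's get() does. */
static int sketch_chip_get(const struct sketch_pin *p)
{
	return p->raw_level;
}

/* gpiochip set(): the old path turned val != 0 into "LED on", which
 * drives the output LOW, so the raw level ended up inverted. */
static void sketch_chip_set(struct sketch_pin *p, int val)
{
	val = !!val;
	p->raw_level = p->invert_on_set ? !val : val;
}

/* gpiolib side: apply the active-low inversion on read... */
static int sketch_gpiod_get_value(const struct sketch_pin *p)
{
	int value = sketch_chip_get(p);

	return p->active_low ? !value : value;
}

/* ...and again on write, before handing the raw value to the chip. */
static void sketch_gpiod_direction_output(struct sketch_pin *p, int value)
{
	value = p->active_low ? !value : !!value;
	sketch_chip_set(p, value);
}

/* default-state = "keep": read the logical state and write it back. */
static void sketch_keep(struct sketch_pin *p)
{
	sketch_gpiod_direction_output(p, sketch_gpiod_get_value(p));
}
```

Starting from raw_level = 1 (LED off) on an active-low pin, sketch_keep() with invert_on_set = true leaves raw_level at 0 (LED on); with invert_on_set = false the level is preserved.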
Signed-off-by: Andrew Jeffery Reviewed-by: Cédric Le Goater Tested-by: Joel Stanley Tested-by: Matt Spinler Signed-off-by: Jacek Anaszewski Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9704f8147e88213f2fa580f713b42b08a4f1a7d2 Author: Wei Wang Date: Fri Oct 6 12:06:04 2017 -0700 ipv6: grab rt->rt6i_ref before allocating pcpu rt [ Upstream commit a94b9367e044ba672c9f4105eb1516ff6ff4948a ] After rwlock is replaced with rcu and spinlock, ip6_pol_route() will be called with only rcu held. That means rt6 route deletion could happen simultaneously with rt6_make_pcpu_rt(). This could potentially cause memory leak if rt6_release() is called right before rt6_make_pcpu_rt() on the same route. This patch grabs rt->rt6i_ref safely before calling rt6_make_pcpu_rt() to make sure rt6_release() will not get triggered while rt6_make_pcpu_rt() is in progress. And rt6_release() is called after rt6_make_pcpu_rt() is finished. Note: As we are incrementing rt->rt6i_ref in ip6_pol_route(), there is a very slim chance that fib6_purge_rt() will be triggered unnecessarily when deleting a route if ip6_pol_route() running on another thread picks this route as well and tries to make pcpu cache for it. Signed-off-by: Wei Wang Signed-off-by: Martin KaFai Lau Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2f48fc1742a2da6196482695ec3bcd2438d24a13 Author: William Tu Date: Thu Oct 5 12:07:12 2017 -0700 ip_gre: check packet length and mtu correctly in erspan tx [ Upstream commit f192970de860d3ab90aa9e2a22853201a57bde78 ] Similarly to early patch for erspan_xmit(), the ARPHDR_ETHER device is the length of the whole ether packet. So skb->len should subtract the dev->hard_header_len. 
Fixes: 1a66a836da63 ("gre: add collect_md mode to ERSPAN tunnel") Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: William Tu Cc: Xin Long Cc: David Laight Reviewed-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6d7bdad132d59c429dc340f8d10a4977e58813b0 Author: Guoqing Jiang Date: Mon Oct 9 10:32:48 2017 +0800 md: always set THREAD_WAKEUP and wake up wqueue if thread existed [ Upstream commit d1d90147c9680aaec4a5757932c2103c42c9c23b ] Since commit 4ad23a976413 ("MD: use per-cpu counter for writes_pending"), the wait_queue is only got invoked if THREAD_WAKEUP is not set previously. With above change, I can see process_metadata_update could always hang on the wait queue, because mddev->thread could stay on 'D' status and the THREAD_WAKEUP flag is not cleared since there are lots of place to wake up mddev->thread. Then deadlock happened as follows: linux175:~ # ps aux|grep md|grep D root 20117 0.0 0.0 0 0 ? D 03:45 0:00 [md0_raid1] root 20125 0.0 0.0 0 0 ? D 03:45 0:00 [md0_cluster_rec] linux175:~ # cat /proc/20117/stack [] dlm_lock_sync+0x94/0xd0 [md_cluster] [] lock_token+0x34/0xd0 [md_cluster] [] metadata_update_start+0x64/0x110 [md_cluster] [] md_update_sb.part.58+0x9b/0x860 [md_mod] [] md_update_sb+0x15/0x30 [md_mod] [] md_check_recovery+0x266/0x490 [md_mod] [] raid1d+0x42/0x810 [raid1] [] md_thread+0x122/0x150 [md_mod] [] kthread+0x101/0x140 linux175:~ # cat /proc/20125/stack [] recv_daemon+0x3f9/0x5c0 [md_cluster] [] md_thread+0x122/0x150 [md_mod] [] kthread+0x101/0x140 So let's revert the part of code in the commit to resovle the problem since we can't get lots of benefits of previous change. 
Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending") Signed-off-by: Guoqing Jiang Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7535afccf97c9dd1559e0b2d7586a522020e5522 Author: Luca Miccio Date: Mon Oct 9 16:27:21 2017 +0200 block,bfq: Disable writeback throttling [ Upstream commit b5dc5d4d1f4ff9032eb6c21a3c571a1317dc9289 ] Similarly to CFQ, BFQ has its write-throttling heuristics, and it is better not to combine them with further write-throttling heuristics of a different nature. So this commit disables write-back throttling for a device if BFQ is used as I/O scheduler for that device. Signed-off-by: Luca Miccio Signed-off-by: Paolo Valente Tested-by: Oleksandr Natalenko Tested-by: Lee Tibbert Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit afdbec5d3c652e8d731861c460eee96ef9498315 Author: Colin Ian King Date: Fri Sep 8 15:37:45 2017 +0100 IB/rxe: check for allocation failure on elem [ Upstream commit 4831ca9e4a8e48cb27e0a792f73250390827a228 ] The allocation for elem may fail (especially because we're using GFP_ATOMIC) so best to check for a null return. This fixes a potential null pointer dereference when assigning elem->pool. Detected by CoverityScan CID#1357507 ("Dereference null return value") Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Colin Ian King Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit afccf6f360df511a432d4c2b99f1eb8d9ac369e6 Author: Emil Tantilov Date: Mon Sep 11 14:21:31 2017 -0700 ixgbe: fix use of uninitialized padding [ Upstream commit dcfd6b839c998bc9838e2a47f44f37afbdf3099c ] This patch is resolving Coverity hits where padding in a structure could be used uninitialized. 
- Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum() - Initialize buffer.pad2/3 before ixgbe_hic_unlocked() Signed-off-by: Emil Tantilov Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d1f13dcad56bdd655a29f33b5ac84e0fae48a4eb Author: Lorenzo Bianconi Date: Wed Aug 30 13:50:39 2017 +0200 iio: st_sensors: add register mask for status register [ Upstream commit e72a060151e5bb673af24993665e270fc4f674a7 ] Introduce register mask for data-ready status register since pressure sensors (e.g. LPS22HB) export just two channels (BIT(0) and BIT(1)) and BIT(2) is marked reserved while in st_sensors_new_samples_available() value read from status register is masked using 0x7. Moreover do not mask status register using active_scan_mask since now status value is properly masked and if the result is not zero the interrupt has to be consumed by the driver. This fix an issue on LPS25H and LPS331AP where channel definition is swapped respect to status register. Furthermore that change allows to properly support new devices (e.g LIS2DW12) that report just ZYXDA (data-ready) field in status register to figure out if the interrupt has been generated by the device. Fixes: 97865fe41322 (iio: st_sensors: verify interrupt event to status) Signed-off-by: Lorenzo Bianconi Reviewed-by: Linus Walleij Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c817cb56b8d62cf2677988cf5fa70cc515a526e1 Author: Lihong Yang Date: Thu Sep 7 08:05:46 2017 -0400 i40e: use the safe hash table iterator when deleting mac filters [ Upstream commit 784548c40d6f43eff2297220ad7800dc04be03c6 ] This patch replaces hash_for_each function with hash_for_each_safe when calling __i40e_del_filter. The hash_for_each_safe function is the right one to use when iterating over a hash table to safely remove a hash entry. Otherwise, incorrect values may be read from freed memory. 
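The reason a "safe" iterator is needed when deleting can be sketched with an ordinary singly linked list in userspace. This is not the kernel hashtable API; the sketch_ names are hypothetical. The point it illustrates is the same one hash_for_each_safe() addresses with its extra temporary cursor: the successor is saved before the current entry is freed.

```c
#include <stdlib.h>

struct sketch_node {
	int key;
	struct sketch_node *next;
};

/* Push a new node at the head; returns the new head (or the old head
 * unchanged if allocation fails). */
static struct sketch_node *sketch_push(struct sketch_node *head, int key)
{
	struct sketch_node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->key = key;
	n->next = head;
	return n;
}

/* Remove and free every node with a matching key. The next pointer is
 * read into tmp before free(), so the loop never touches freed memory. */
static struct sketch_node *sketch_del_matching(struct sketch_node *head,
					       int key)
{
	struct sketch_node **link = &head;
	struct sketch_node *cur, *tmp;

	for (cur = head; cur; cur = tmp) {
		tmp = cur->next;	/* saved first: cur may be freed below */
		if (cur->key == key) {
			*link = tmp;
			free(cur);
		} else {
			link = &cur->next;
		}
	}
	return head;
}

static int sketch_count(const struct sketch_node *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

An unsafe loop that advances with cur = cur->next after free(cur) would read the freed node, which is the class of bug the commit describes.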
Detected by CoverityScan, CID 1402048 Read from pointer after free Signed-off-by: Lihong Yang Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 66efe26b0b074ef11575b883fc9a93af51aab597 Author: Christophe JAILLET Date: Sun Aug 27 08:39:51 2017 +0200 igb: check memory allocation failure [ Upstream commit 18eb86362a52f0af933cc0fd5e37027317eb2d1c ] Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET Tested-by: Aaron Brown Acked-by: PJ Waskiewicz Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 349384cd7028affdb23634b56d10da015b080cba Author: Fabio Estevam Date: Fri Sep 29 14:39:49 2017 -0300 PM / OPP: Move error message to debug level [ Upstream commit 035ed07208dc501d023873447113f3f178592156 ] On some i.MX6 platforms which do not have speed grading check, opp table will not be created in platform code, so cpufreq driver prints the following error message: cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19) However, this is not really an error in this case because the imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count() and if it fails, it means that platform code does not provide OPP and then dev_pm_opp_of_add_table() will be called. In order to avoid such confusing error message, move it to debug level. It is up to the caller of dev_pm_opp_get_opp_count() to check its return value and decide if it will print an error or not. Signed-off-by: Fabio Estevam Signed-off-by: Rafael J. 
Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7af9f9cd68c7b6f009c4a0c0d8ea7703aa46a26b Author: Stuart Hayes Date: Wed Oct 4 10:57:52 2017 -0500 PCI: Create SR-IOV virtfn/physfn links before attaching driver [ Upstream commit 27d6162944b9b34c32cd5841acd21786637ee743 ] When creating virtual functions, create the "virtfn%u" and "physfn" links in sysfs *before* attaching the driver instead of after. When we attach the driver to the new virtual network interface first, there is a race: attaching the driver sends out an "add" udev event, and the network interface naming software (biosdevname or systemd, for example) may try to look at these links before they exist. Signed-off-by: Stuart Hayes Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6d95d05bafbafc88a13d81e7ad63fcb899de499d Author: Sreekanth Reddy Date: Tue Oct 10 18:41:18 2017 +0530 scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive [ Upstream commit 2ce9a3645299ba1752873d333d73f67620f4550b ] Whenever an I/O for a RAID volume fails with IOCStatus MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIStatus equal to (MPI2_SCSI_STATE_TERMINATED | MPI2_SCSI_STATE_NO_SCSI_STATUS), return the I/O to the SCSI midlayer with "DID_RESET" (i.e. retry the I/O indefinitely) set in the host byte. Previously, the driver was completing the I/O with "DID_SOFT_ERROR", which causes the I/O to be quickly retried. However, the firmware needed more time and hence I/Os were failing. Signed-off-by: Sreekanth Reddy Reviewed-by: Tomas Henzl Signed-off-by: Martin K.

Petersen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 3aaaf02c110f36dc716780412add4799b2a89e5b Author: Varun Prakash Date: Wed Oct 11 19:33:07 2017 +0530 scsi: cxgb4i: fix Tx skb leak [ Upstream commit 9b3a081fb62158b50bcc90522ca2423017544367 ] In case of connection reset Tx skb queue can have some skbs which are not transmitted so purge Tx skb queue in release_offload_resources() to avoid skb leak. Signed-off-by: Varun Prakash Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit bfd66a406fe7e590055c1d6714adc697f18664c8 Author: David Daney Date: Fri Sep 8 10:10:31 2017 +0200 PCI: Avoid bus reset if bridge itself is broken [ Upstream commit 357027786f3523d26f42391aa4c075b8495e5d28 ] When checking to see if a PCI bus can safely be reset, we previously checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET flag set. Children marked with that flag are known not to behave well after a bus reset. Some PCIe root port bridges also do not behave well after a bus reset, sometimes causing the devices behind the bridge to become unusable. Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device to allow these bridges to be flagged, and prevent their secondary buses from being reset. Signed-off-by: David Daney [jglauber@cavium.com: fixed typo] Signed-off-by: Jan Glauber Signed-off-by: Bjorn Helgaas Reviewed-by: Alex Williamson Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a5171fe705fb31bf3a0d93e8d69285d2b5b691c9 Author: Dan Murphy Date: Tue Oct 10 12:42:56 2017 -0500 net: phy: at803x: Change error to EINVAL for invalid MAC [ Upstream commit fc7556877d1748ac00958822a0a3bba1d4bd9e0d ] Change the return error code to EINVAL if the MAC address is not valid in the set_wol function. Signed-off-by: Dan Murphy Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f3a68b4b82f3f19b80b84cf11753082e7fb6ec8f Author: Shakeel Butt Date: Thu Oct 5 18:07:24 2017 -0700 kvm, mm: account kvm related kmem slabs to kmemcg [ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ] The kvm slabs can consume a significant amount of system memory and indeed in our production environment we have observed that a lot of machines are spending significant amount of memory that can not be left as system memory overhead. Also the allocations from these slabs can be triggered directly by user space applications which has access to kvm and thus a buggy application can leak such memory. So, these caches should be accounted to kmemcg. Signed-off-by: Shakeel Butt Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit af826fdfb14c51fc4416993d0fcef8f4aa43b54c Author: Russell King Date: Fri Sep 29 11:22:15 2017 +0100 rtc: pl031: make interrupt optional [ Upstream commit 5b64a2965dfdfca8039e93303c64e2b15c19ff0c ] On some platforms, the interrupt for the PL031 is optional. Avoid trying to claim the interrupt if it's not specified. Reviewed-by: Linus Walleij Signed-off-by: Russell King Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit bd5139895727bcf97a411bf17fc137fb0d57f55c Author: Christophe Jaillet Date: Sun Oct 8 11:39:49 2017 +0200 crypto: lrw - Fix an error handling path in 'create()' [ Upstream commit 616129cc6e75fb4da6681c16c981fa82dfe5e4c7 ] All error handling paths 'goto err_drop_spawn' except this one. In order to avoid some resources leak, we should do it as well here. 
Fixes: 700cb3f5fe75 ("crypto: lrw - Convert to skcipher") Signed-off-by: Christophe JAILLET Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 714abd2d6996bb36b334dd15c6c52d3ad00766a0 Author: Christian Lamparter Date: Wed Oct 4 01:00:08 2017 +0200 crypto: crypto4xx - increase context and scatter ring buffer elements [ Upstream commit 778f81d6cdb7d25360f082ac0384d5103f04eca5 ] If crypto4xx is used in conjunction with dm-crypt, the available ring buffer elements are not enough to handle the load properly. On an aes-cbc-essiv:sha256 encrypted swap partition the read performance is abyssal: (tested with hdparm -t) /dev/mapper/swap_crypt: Timing buffered disk reads: 14 MB in 3.68 seconds = 3.81 MB/sec The patch increases both PPC4XX_NUM_SD and PPC4XX_NUM_PD to 256. This improves the performance considerably: /dev/mapper/swap_crypt: Timing buffered disk reads: 104 MB in 3.03 seconds = 34.31 MB/sec Furthermore, PPC4XX_LAST_SD, PPC4XX_LAST_GD and PPC4XX_LAST_PD can be easily calculated from their respective PPC4XX_NUM_* constant. Signed-off-by: Christian Lamparter Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9fe2989cdf3d433ab07d9e5979c5a811e26dd925 Author: Chen-Yu Tsai Date: Thu Oct 12 16:36:57 2017 +0800 clk: sunxi-ng: sun5i: Fix bit offset of audio PLL post-divider [ Upstream commit d51fe3ba9773c8b6fc79f82bbe75d64baf604292 ] The post-divider for the audio PLL is in bits [29:26], as specified in the user manual, not [19:16] as currently programmed in the code. The post-divider has a default register value of 2, i.e. a divider of 3. This means the clock rate fed to the audio codec would be off. This was discovered when porting sigma-delta modulation for the PLL to sun5i, which needs the post-divider to be 1. Fix the bit offset, so we do actually force the post-divider to a certain value. 
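The bit-offset issue in the audio PLL fix above can be illustrated with plain field arithmetic. This is a userspace sketch with hypothetical sketch_/SKETCH_ names, not the sunxi-ng macros; it only assumes what the message states, namely a 4-bit field at bits [29:26] where a register value of 2 means a divider of 3.

```c
#include <stdint.h>

#define SKETCH_POSTDIV_SHIFT	26
#define SKETCH_POSTDIV_WIDTH	4
#define SKETCH_POSTDIV_MASK \
	(((UINT32_C(1) << SKETCH_POSTDIV_WIDTH) - 1) << SKETCH_POSTDIV_SHIFT)

/* Extract the raw post-divider field from bits [29:26]. */
static uint32_t sketch_postdiv_get(uint32_t reg)
{
	return (reg & SKETCH_POSTDIV_MASK) >> SKETCH_POSTDIV_SHIFT;
}

/* Write a raw field value, leaving every other bit of the register
 * (including bits [19:16]) untouched. */
static uint32_t sketch_postdiv_set(uint32_t reg, uint32_t field)
{
	return (reg & ~SKETCH_POSTDIV_MASK) |
	       ((field << SKETCH_POSTDIV_SHIFT) & SKETCH_POSTDIV_MASK);
}

/* A raw field value of n divides by n + 1 (value 2 -> divider 3). */
static uint32_t sketch_postdiv_to_divider(uint32_t field)
{
	return field + 1;
}
```

Writing the field at shift 16 instead of 26 would leave bits [29:26] at their default, so the hardware would keep dividing by 3 regardless of what the driver requested.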
Fixes: 5e73761786d6 ("clk: sunxi-ng: Add sun5i CCU driver") Signed-off-by: Chen-Yu Tsai Signed-off-by: Maxime Ripard Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a7455b113feff0b99cad533c63bc6b877c1a1960 Author: Chen-Yu Tsai Date: Thu Oct 12 16:36:58 2017 +0800 clk: sunxi-ng: nm: Check if requested rate is supported by fractional clock [ Upstream commit 4cdbc40d64d4b8303a97e29a52862e4d99502beb ] The round_rate callback for N-M-factor style clocks does not check if the requested clock rate is supported by the fractional clock mode. While this doesn't affect usage in practice, since the clock rates are also supported through N-M factors, it does not match the set_rate code. Add a check to the round_rate callback so it matches the set_rate callback. Fixes: 6174a1e24b0d ("clk: sunxi-ng: Add N-M-factor clock support") Signed-off-by: Chen-Yu Tsai Signed-off-by: Maxime Ripard Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5d583a7e2d92a723e9833604f760517a64f630cc Author: Shashank Sharma Date: Thu Oct 12 22:10:08 2017 +0530 drm: Add retries for lspcon mode detection [ Upstream commit f687e25a7a245952349f1f9f9cc238ac5a3be258 ] From the CI builds, it's been observed that during a driver reload/insert, the dp dual mode read function sometimes fails to read from the LSPCON device over the i2c-over-aux channel. This patch: - adds some delay and a few retries, giving these devices time to settle down and respond. - changes one error log's level from ERROR->DEBUG as we want to call it an error only after all the retries are exhausted.
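
The retry-with-delay idea in the two points above can be sketched generically. Everything named here (read_with_retries, flaky_dev, flaky_read, no_delay) is hypothetical and stands in for the real DRM helpers, which are not reproduced; the sketch only shows "retry a bounded number of times, pause between attempts, and report failure only once the retries are exhausted".

```c
#include <stddef.h>

/* Hypothetical device that fails a fixed number of reads before answering. */
struct flaky_dev {
	int failures_left;
	int attempts;
};

/* One read attempt: 0 on success, negative on failure. */
static int flaky_read(void *ctx)
{
	struct flaky_dev *dev = ctx;

	dev->attempts++;
	if (dev->failures_left > 0) {
		dev->failures_left--;
		return -1;
	}
	return 0;
}

/* Placeholder for a real settle delay (a sleep in actual driver code). */
static void no_delay(void)
{
}

/*
 * Try read_once() up to max_tries times, calling settle() before every
 * attempt except the first. Returns 0 as soon as one attempt succeeds,
 * otherwise the result of the last attempt. The caller decides whether
 * the final failure deserves an error-level log.
 */
int read_with_retries(int (*read_once)(void *), void *ctx, int max_tries,
		      void (*settle)(void))
{
	int ret = -1;
	int i;

	for (i = 0; i < max_tries; i++) {
		if (i)
			settle();
		ret = read_once(ctx);
		if (ret == 0)
			break;
	}
	return ret;
}
```

Keeping the per-attempt failure quiet and leaving the loud message to the caller matches the stated intent of calling it an error only after all retries are used up.
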
V2: Addressed review comments from Jani (for loop for retry) V3: Addressed review comments from Imre (break on partial read too) V3: Addressed review comments from Ville/Imre (Add the retries exclusively for LSPCON, not for all dp_dual_mode devices) V4: Added r-b from Imre, sending it to dri-devel (Jani) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102294 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102295 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102359 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103186 Cc: Ville Syrjala Cc: Imre Deak Cc: Jani Nikula Reviewed-by: Imre Deak Acked-by: Dave Airlie Signed-off-by: Shashank Sharma Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/1507826408-19322-1-git-send-email-shashank.sharma@intel.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b04c22da18b53c7be730478282f7e6f62cef9425 Author: Derek Basehore Date: Tue Aug 29 13:34:34 2017 -0700 backlight: pwm_bl: Fix overflow condition [ Upstream commit 5d0c49acebc9488e37db95f1d4a55644e545ffe7 ] This fixes an overflow condition that can happen with high max brightness and period values in compute_duty_cycle. This fixes it by using a 64 bit variable for computing the duty cycle. Signed-off-by: Derek Basehore Acked-by: Thierry Reding Reviewed-by: Brian Norris Signed-off-by: Lee Jones Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 23b22186b27f6f16cddc69fd12b607c71de6451f Author: Jens Wiklander Date: Mon Oct 9 11:11:49 2017 +0200 optee: fix invalid of_node_put() in optee_driver_init() commit f044113113dd95ba73916bde10e804d3cdfa2662 upstream. The first node supplied to of_find_matching_node() has its reference counter decreased as part of call to that function. In optee_driver_init() after calling of_find_matching_node() it's invalid to call of_node_put() on the supplied node again. So remove the invalid call to of_node_put(). 
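
The reference-count imbalance described above can be shown with a toy model. The toy_* names are hypothetical and this is not the device tree API; the sketch only assumes the contract stated in the message, namely that the lookup drops the reference on the node it is given and returns the next node with a reference held.

```c
#include <stddef.h>

/* Toy node with an explicit reference count. */
struct toy_node {
	int refcount;
	struct toy_node *next;
};

static struct toy_node *toy_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_put(struct toy_node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Models the iterator contract: take a reference on the node that is
 * returned and drop the caller's reference on 'from'.
 */
struct toy_node *toy_find_next(struct toy_node *from)
{
	struct toy_node *np = toy_get(from ? from->next : NULL);

	toy_put(from);
	return np;
}
```

If the caller held one reference on the starting node, that reference is already gone after the lookup, so one more put on the same node drives the count below zero in this model.
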
Reported-by: Alex Shi Signed-off-by: Jens Wiklander Cc: Signed-off-by: Greg Kroah-Hartman commit 8388d287e361a2fd0a39bece30a736d692d5c3d8 Author: Thomas Gleixner Date: Mon Dec 4 15:07:32 2017 +0100 x86/cpufeatures: Make CPU bugs sticky commit 6cbd2171e89b13377261d15e64384df60ecb530e upstream. There is currently no way to force CPU bug bits like CPU feature bits. That makes it impossible to set a bug bit once at boot and have it stick for all upcoming CPUs. Extend the force set/clear arrays to handle bug bits as well. Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 1eb2e614fd17ae8de373aaf9919b803b0a6cfc27 Author: Thomas Gleixner Date: Mon Dec 4 15:07:31 2017 +0100 x86/paravirt: Provide a way to check for hypervisors commit 79cc74155218316b9a5d28577c7077b2adba8e58 upstream. There is no generic way to test whether a kernel is running on a specific hypervisor. But that's required to prevent the upcoming user address space separation feature in certain guest modes. Make the hypervisor type enum unconditionally available and provide a helper function which allows to test for a specific type. Signed-off-by: Thomas Gleixner Reviewed-by: Juergen Gross Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.912938129@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 96e63420e281205e98cca0b6dbb5a79af98a5efd Author: Thomas Gleixner Date: Mon Dec 4 15:07:30 2017 +0100 x86/paravirt: Dont patch flush_tlb_single commit a035795499ca1c2bd1928808d1a156eda1420383 upstream. native_flush_tlb_single() will be changed with the upcoming PAGE_TABLE_ISOLATION feature. This requires to have more code in there than INVLPG. Remove the paravirt patching for it. Signed-off-by: Thomas Gleixner Reviewed-by: Josh Poimboeuf Reviewed-by: Juergen Gross Acked-by: Peter Zijlstra Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Cc: michael.schwarz@iaik.tugraz.at Cc: moritz.lipp@iaik.tugraz.at Cc: richard.fellner@student.tugraz.at Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 25e2999e630c744c27c9179d1279c7f9d503385e Author: Andy Lutomirski Date: Mon Dec 4 15:07:29 2017 +0100 x86/entry/64: Make cpu_entry_area.tss read-only commit c482feefe1aeb150156248ba0fd3e029bc886605 upstream. The TSS is a fairly juicy target for exploits, and, now that the TSS is in the cpu_entry_area, it's no longer protected by kASLR. Make it read-only on x86_64. On x86_32, it can't be RO because it's written by the CPU during task switches, and we use a task gate for double faults. 
I'd also be nervous about errata if we tried to make it RO even on configurations without double fault handling. [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO. So it's probably safe to assume that it's a non issue, though Intel might have been creative in that area. Still waiting for confirmation. ] Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Kees Cook Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e313437c85da22a26fe1ba01d94abec7698b1987 Author: Andy Lutomirski Date: Mon Dec 4 15:07:28 2017 +0100 x86/entry: Clean up the SYSENTER_stack code commit 0f9a48100fba3f189724ae88a450c2261bf91c80 upstream. The existing code was a mess, mainly because C arrays are nasty. Turn SYSENTER_stack into a struct, add a helper to find it, and do all the obvious cleanups this enables. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit b25ca49efac58fe9744b9a3f725c4e4d69b276f8 Author: Andy Lutomirski Date: Mon Dec 4 15:07:27 2017 +0100 x86/entry/64: Remove the SYSENTER stack canary commit 7fbbd5cbebf118a9e09f5453f686656a167c3d1c upstream. Now that the SYSENTER stack has a guard page, there's no need for a canary to detect overflow after the fact. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.572577316@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit bb568391775d4a840992e2d2493f39d6e86401e3 Author: Andy Lutomirski Date: Mon Dec 4 15:07:26 2017 +0100 x86/entry/64: Move the IST stacks into struct cpu_entry_area commit 40e7f949e0d9a33968ebde5d67f7e3a47c97742a upstream. The IST stacks are needed when an IST exception occurs and are accessed before any kernel code at all runs. Move them into struct cpu_entry_area. The IST stacks are unlike the rest of cpu_entry_area: they're used even for entries from kernel mode. This means that they should be set up before we load the final IDT. 
Move cpu_entry_area setup to trap_init() for the boot CPU and set it up for all possible CPUs at once in native_smp_prepare_cpus(). Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.480598743@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c631a16e5b84521ae80ab10d47cd6ccd2afc6817 Author: Andy Lutomirski Date: Mon Dec 4 15:07:25 2017 +0100 x86/entry/64: Create a per-CPU SYSCALL entry trampoline commit 3386bc8aed825e9f1f65ce38df4b109b2019b71a upstream. Handling SYSCALL is tricky: the SYSCALL handler is entered with every single register (except FLAGS), including RSP, live. It somehow needs to set RSP to point to a valid stack, which means it needs to save the user RSP somewhere and find its own stack pointer. The canonical way to do this is with SWAPGS, which lets us access percpu data using the %gs prefix. With PAGE_TABLE_ISOLATION-like pagetable switching, this is problematic. Without a scratch register, switching CR3 is impossible, so %gs-based percpu memory would need to be mapped in the user pagetables. Doing that without information leaks is difficult or impossible. Instead, use a different sneaky trick. Map a copy of the first part of the SYSCALL asm at a different address for each CPU. Now RIP varies depending on the CPU, so we can use RIP-relative memory access to access percpu memory. 
By putting the relevant information (one scratch slot and the stack address) at a constant offset relative to RIP, we can make SYSCALL work without relying on %gs. A nice thing about this approach is that we can easily switch it on and off if we want pagetable switching to be configurable. The compat variant of SYSCALL doesn't have this problem in the first place -- there are plenty of scratch registers, since we don't care about preserving r8-r15. This patch therefore doesn't touch SYSCALL32 at all. This patch actually seems to be a small speedup. With this patch, SYSCALL touches an extra cache line and an extra virtual page, but the pipeline no longer stalls waiting for SWAPGS. It seems that, at least in a tight loop, the latter outweighs the former. Thanks to David Laight for an optimization tip. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.403607157@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 564cea11777e9d73aa04ce5f6fbe9169d7c116a2 Author: Andy Lutomirski Date: Mon Dec 4 15:07:24 2017 +0100 x86/entry/64: Return to userspace from the trampoline stack commit 3e3b9293d392c577b62e24e4bc9982320438e749 upstream. By itself, this is useless. It gives us the ability to run some final code before exit that cannot run on the kernel stack. This could include a CR3 switch a la PAGE_TABLE_ISOLATION or some kernel stack erasing, for example.
(Or even weird things like *changing* which kernel stack gets used as an ASLR-strengthening mechanism.) The SYSRET32 path is not covered yet. It could be in the future or we could just ignore it and force the slow path if needed. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.306546484@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a Author: Andy Lutomirski Date: Mon Dec 4 15:07:23 2017 +0100 x86/entry/64: Use a per-CPU trampoline stack for IDT entries commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream. Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack. The asm isn't exactly beautiful, but I think that fully refactoring it can wait. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e Author: Andy Lutomirski Date: Mon Dec 4 15:07:22 2017 +0100 x86/espfix/64: Stop assuming that pt_regs is on the entry stack commit 6d9256f0a89eaff97fca6006100bcaea8d1d8bdb upstream. When we start using an entry trampoline, a #GP from userspace will be delivered on the entry stack, not on the task stack. Fix the espfix64 #DF fixup to set up #GP according to TSS.SP0, rather than assuming that pt_regs + 1 == SP0. This won't change anything without an entry stack, but it will make the code continue to work when an entry stack is added. While we're at it, improve the comments to explain what's actually going on. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.130778051@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d120cd749ef9770ee98b708a83b49547dcf1c0e1 Author: Andy Lutomirski Date: Mon Dec 4 15:07:21 2017 +0100 x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 commit 9aaefe7b59ae00605256a7d6bd1c1456432495fc upstream. On 64-bit kernels, we used to assume that TSS.sp0 was the current top of stack. 
With the addition of an entry trampoline, this will no longer be the case. Store the current top of stack in TSS.sp1, which is otherwise unused but shares the same cacheline. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.050864668@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 5bb40c6d4c2a453b21e8e7c2b9bae87e77c4d878 Author: Andy Lutomirski Date: Mon Dec 4 15:07:20 2017 +0100 x86/entry: Remap the TSS into the CPU entry area commit 72f5e08dbba2d01aa90b592cf76c378ea233b00b upstream. This has a secondary purpose: it puts the entry stack into a region with a well-controlled layout. A subsequent patch will take advantage of this to streamline the SYSCALL entry code to be able to find it more easily. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 969f5706f6f51a4a01a0dfedfa405b33e25fb708 Author: Andy Lutomirski Date: Mon Dec 4 15:07:19 2017 +0100 x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct commit 1a935bc3d4ea61556461a9e92a68ca3556232efd upstream. SYSENTER_stack should have reliable overflow detection, which means that it needs to be at the bottom of a page, not the top. Move it to the beginning of struct tss_struct and page-align it. Also add an assertion to make sure that the fixed hardware TSS doesn't cross a page boundary. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.881827433@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 41964ef17cce3f2497c1ec428d22b42fb615da8d Author: Andy Lutomirski Date: Mon Dec 4 15:07:18 2017 +0100 x86/dumpstack: Handle stack overflow on all stacks commit 6e60e583426c2f8751c22c2dfe5c207083b4483a upstream. We currently special-case stack overflow on the task stack. We're going to start putting special stacks in the fixmap with a custom layout, so they'll have guard pages, too. Teach the unwinder to be able to unwind an overflow of any of the stacks. 
Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.802057305@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 5be136953f62c551c1f40ddbb83c2f4e8fc9c41f Author: Andy Lutomirski Date: Mon Dec 4 15:07:17 2017 +0100 x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss commit 7fb983b4dd569e08564134a850dfd4eb1c63d9b8 upstream. A future patch will move SYSENTER_stack to the beginning of cpu_tss to help detect overflow. Before this can happen, fix several code paths that hardcode assumptions about the old layout. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Dave Hansen Reviewed-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.722425540@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 487f3ddcd9863d38490ca4f6f2b9e67d4f8a197c Author: Andy Lutomirski Date: Mon Dec 4 15:07:16 2017 +0100 x86/kasan/64: Teach KASAN about the cpu_entry_area commit 21506525fb8ddb0342f2a2370812d47f6a1f3833 upstream. The cpu_entry_area will contain stacks. 
Make sure that KASAN has appropriate shadow mappings for them. Signed-off-by: Andy Lutomirski Signed-off-by: Andrey Ryabinin Signed-off-by: Thomas Gleixner Cc: Alexander Potapenko Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Dmitry Vyukov Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: kasan-dev@googlegroups.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.642806442@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ece614dcfd964ab04fb37186ea2c4b020dbc3ee9 Author: Andy Lutomirski Date: Mon Dec 4 15:07:15 2017 +0100 x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area commit ef8813ab280507972bb57e4b1b502811ad4411e9 upstream. Currently, the GDT is an ad-hoc array of pages, one per CPU, in the fixmap. Generalize it to be an array of a new 'struct cpu_entry_area' so that we can cleanly add new things to it. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.563271721@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 5684dd300f675a9f5ad6a43f96f128839c98a7fd Author: Andy Lutomirski Date: Mon Dec 4 15:07:14 2017 +0100 x86/entry/gdt: Put per-CPU GDT remaps in ascending order commit aaeed3aeb39c1ba69f0a49baec8cb728121d0a91 upstream. We currently have CPU 0's GDT at the top of the GDT range and higher-numbered CPUs at lower addresses. This happens because the fixmap is upside down (index 0 is the top of the fixmap). Flip it so that GDTs are in ascending order by virtual address. This will simplify a future patch that will generalize the GDT remap to contain multiple pages. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.471561421@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2329da3fc03d19a8fc3bb212972b97e9f57abb32 Author: Andy Lutomirski Date: Mon Dec 4 15:07:13 2017 +0100 x86/dumpstack: Add get_stack_info() support for the SYSENTER stack commit 33a2f1a6c4d7c0a02d1c006fb0379cc5ca3b96bb upstream. get_stack_info() doesn't currently know about the SYSENTER stack, so unwinding will fail if we entered the kernel on the SYSENTER stack and haven't fully switched off. 
Teach get_stack_info() about the SYSENTER stack. With future patches applied that run part of the entry code on the SYSENTER stack and introduce an intentional BUG(), I would get: PANIC: double fault, error_code: 0x0 ... RIP: 0010:do_error_trap+0x33/0x1c0 ... Call Trace: Code: ... With this patch, I get: PANIC: double fault, error_code: 0x0 ... Call Trace: ? async_page_fault+0x36/0x60 ? invalid_op+0x22/0x40 ? async_page_fault+0x36/0x60 ? sync_regs+0x3c/0x40 ? sync_regs+0x2e/0x40 ? error_entry+0x6c/0xd0 ? async_page_fault+0x36/0x60 Code: ... which is a lot more informative. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 9b654aba0360c10baa7cc8d15f976e7ca1af44ac Author: Andy Lutomirski Date: Mon Dec 4 15:07:12 2017 +0100 x86/entry/64: Allocate and enable the SYSENTER stack commit 1a79797b58cddfa948420a7553241c79c013e3ca upstream. This will simplify future changes that want scratch variables early in the SYSENTER handler -- they'll be able to spill registers to the stack. It also lets us get rid of a SWAPGS_UNSAFE_STACK user. This does not depend on CONFIG_IA32_EMULATION=y because we'll want the stack space even without IA32 emulation. As far as I can tell, the reason that this wasn't done from day 1 is that we use IST for #DB and #BP, which is IMO rather nasty and causes a lot more problems than it solves. 
But, since #DB uses IST, we don't actually need a real stack for SYSENTER (because SYSENTER with TF set will invoke #DB on the IST stack rather than the SYSENTER stack). I want to remove IST usage from these vectors some day, and this patch is a prerequisite for that as well. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e9b7b111e5be68e41e1984bcaa03535cb5e76141 Author: Andy Lutomirski Date: Mon Dec 4 15:07:11 2017 +0100 x86/irq/64: Print the offending IP in the stack overflow warning commit 4f3789e792296e21405f708cf3cb409d7c7d5683 upstream. In case something goes wrong with unwind (not unlikely in case of overflow), print the offending IP where we detected the overflow. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. 
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.231677119@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 996d087af015da99632fde450092923d574d4bbc Author: Andy Lutomirski Date: Mon Dec 4 15:07:10 2017 +0100 x86/irq: Remove an old outdated comment about context tracking races commit 6669a692605547892a026445e460bf233958bd7f upstream. That race has been fixed and code cleaned up for a while now. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.150551639@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 5209e8ac937261925c12db63a28cfaa033fa30ed Author: Josh Poimboeuf Date: Mon Dec 4 15:07:09 2017 +0100 x86/unwinder: Handle stack overflows more gracefully commit b02fcf9ba1211097754b286043cd87a8b4907e75 upstream. There are at least two unwinder bugs hindering the debugging of stack-overflow crashes: - It doesn't deal gracefully with the case where the stack overflows and the stack pointer itself isn't on a valid stack but the to-be-dereferenced data *is*. - The ORC oops dump code doesn't know how to print partial pt_regs, for the case where we get an interrupt/exception in *early* entry code before the full pt_regs have been saved. Fix both issues.
http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@treble Signed-off-by: Josh Poimboeuf Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 40ddc692b5a11cf7c3f9677744676af617d54a24 Author: Andy Lutomirski Date: Mon Dec 4 15:07:08 2017 +0100 x86/unwinder/orc: Dont bail on stack overflow commit d3a09104018cf2ad5973dfa8a9c138ef9f5015a3 upstream. If the stack overflows into a guard page, the ORC unwinder should work well: by construction, there can't be any meaningful data in the guard page because no writes to the guard page will have succeeded. But there is a bug that prevents unwinding from working correctly: if the starting register state has RSP pointing into a stack guard page, the ORC unwinder bails out immediately. Instead of bailing out immediately, check whether the next page up is a valid stack page and if so analyze that. As a result the ORC unwinder will start the unwind. Tested by intentionally overflowing the task stack. The result is an accurate call trace instead of a trace consisting purely of '?' entries. There are a few other bugs that are triggered if the unwinder encounters a stack overflow after the first step, but they are outside the scope of this fix. Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H.
Peter Anvin Cc: Josh Poimboeuf Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150604.991389777@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 21ddc15fa82b7a2a7b6209437c4d137672370c86 Author: Boris Ostrovsky Date: Mon Dec 4 15:07:07 2017 +0100 x86/entry/64/paravirt: Use paravirt-safe macro to access eflags commit e17f8234538d1ff708673f287a42457c4dee720d upstream. Commit 1d3e53e8624a ("x86/entry/64: Refactor IRQ stacks and make them NMI-safe") added the DEBUG_ENTRY_ASSERT_IRQS_OFF macro, which accesses eflags using the 'pushfq' instruction when testing for the IF bit. On PV Xen guests, looking at the IF flag directly will always see it set, resulting in 'ud2'. Introduce a SAVE_FLAGS() macro that will use the appropriate save_fl pv op when running paravirt. Signed-off-by: Boris Ostrovsky Signed-off-by: Thomas Gleixner Reviewed-by: Juergen Gross Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Dave Hansen Cc: David Laight Cc: Denys Vlasenko Cc: Eduardo Valentin Cc: Greg KH Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Will Deacon Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d455b71e7393975c5c50c26bd2199f19c6eed80f Author: Andrey Ryabinin Date: Wed Nov 15 17:36:35 2017 -0800 x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow commit 2aeb07365bcd489620f71390a7d2031cd4dfb83e upstream. [ Note, this is a Git cherry-pick of the following commit: d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow") ...
for easier x86 PTI code testing and back-porting. ] The KASAN shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for KASAN, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com Signed-off-by: Andrey Ryabinin Signed-off-by: Pavel Tatashin Cc: Andy Lutomirski Cc: Steven Sistare Cc: Daniel Jordan Cc: Bob Picco Cc: Michal Hocko Cc: Alexander Potapenko Cc: Ard Biesheuvel Cc: Catalin Marinas Cc: Christian Borntraeger Cc: David S. Miller Cc: Dmitry Vyukov Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Mark Rutland Cc: Matthew Wilcox Cc: Mel Gorman Cc: Michal Hocko Cc: Sam Ravnborg Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 5383f45db38c0459f8d3d3f90f7f0a132ad104f3 Author: Will Deacon Date: Tue Oct 24 11:22:48 2017 +0100 locking/barriers: Convert users of lockless_dereference() to READ_ONCE() commit 3382290ed2d5e275429cef510ab21889d3ccd164 upstream. [ Note, this is a Git cherry-pick of the following commit: 506458efaf15 ("locking/barriers: Convert users of lockless_dereference() to READ_ONCE()") ... for easier x86 PTI code testing and back-porting. ] READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it can be used instead of lockless_dereference() without any change in semantics. Signed-off-by: Will Deacon Cc: Linus Torvalds Cc: Paul E. 
McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 1aedecaf12a67de999eaf233cdb64839170ca035 Author: Will Deacon Date: Tue Oct 24 11:22:47 2017 +0100 locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE() commit c2bc66082e1048c7573d72e62f597bdc5ce13fea upstream. [ Note, this is a Git cherry-pick of the following commit: 76ebbe78f739 ("locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()") ... for easier x86 PTI code testing and back-porting. ] In preparation for the removal of lockless_dereference(), which is the same as READ_ONCE() on all architectures other than Alpha, add an implicit smp_read_barrier_depends() to READ_ONCE() so that it can be used to head dependency chains on all architectures. Signed-off-by: Will Deacon Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1508840570-22169-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 065060cdd3de1f7a1f08284999081c4c38601ef7 Author: Daniel Borkmann Date: Tue Dec 12 02:25:31 2017 +0100 bpf: fix build issues on um due to mising bpf_perf_event.h commit ab95477e7cb35557ecfc837687007b646bab9a9f upstream. [ Note, this is a Git cherry-pick of the following commit: a23f06f06dbe ("bpf: fix build issues on um due to mising bpf_perf_event.h") ... for easier x86 PTI code testing and back-porting. ] Since c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build on i386 or x86_64: [...] 
CC init/main.o In file included from ../include/linux/perf_event.h:18:0, from ../include/linux/trace_events.h:10, from ../include/trace/syscall.h:7, from ../include/linux/syscalls.h:82, from ../init/main.c:20: ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory #include <asm/bpf_perf_event.h> [...] Let's add the missing bpf_perf_event.h to the um arch as well. This seems to be the only one still missing. Fixes: c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") Reported-by: Randy Dunlap Suggested-by: Richard Weinberger Signed-off-by: Daniel Borkmann Tested-by: Randy Dunlap Cc: Hendrik Brueckner Cc: Richard Weinberger Acked-by: Alexei Starovoitov Acked-by: Richard Weinberger Signed-off-by: Alexei Starovoitov Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2d8c24ed9310b89921bd0459d8df1258134015d4 Author: Andi Kleen Date: Thu Aug 31 14:46:30 2017 -0700 perf/x86: Enable free running PEBS for REGS_USER/INTR commit 2fe1bc1f501d55e5925b4035bcd85781adc76c63 upstream. [ Note, this is a Git cherry-pick of the following commit: a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR") ... for easier x86 PTI code testing and back-porting. ] Currently free running PEBS is disabled when user or interrupt registers are requested. Most of the registers are actually available in the PEBS record and can be supported. So we just need to check for the supported registers and then allow it: it is all except for the segment register. For user registers this only works when the counter is limited to ring 3 only, so this also needs to be checked.
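The check described above amounts to a subset test on register masks plus a privilege-level condition. A minimal sketch of that idea follows. The bit assignments and all names (SKETCH_REG_*, SKETCH_PEBS_REGS, free_running_allowed) are hypothetical and chosen for illustration only; the real perf register numbering and the driver's actual checks are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register bits; not the real perf register numbering. */
#define SKETCH_REG_AX (1ULL << 0)
#define SKETCH_REG_IP (1ULL << 1)
#define SKETCH_REG_SP (1ULL << 2)
#define SKETCH_REG_DS (1ULL << 3) /* a segment register: assumed absent from the record */

/* Registers assumed to be available from the PEBS record in this sketch. */
#define SKETCH_PEBS_REGS (SKETCH_REG_AX | SKETCH_REG_IP | SKETCH_REG_SP)

static bool free_running_allowed(uint64_t intr_regs, uint64_t user_regs,
				 bool exclude_kernel)
{
	/* Every requested interrupt register must be one the record provides. */
	if (intr_regs & ~SKETCH_PEBS_REGS)
		return false;
	/* The same holds for requested user registers ... */
	if (user_regs & ~SKETCH_PEBS_REGS)
		return false;
	/* ... and user registers are only valid if the counter is ring 3 only. */
	if (user_regs && !exclude_kernel)
		return false;
	return true;
}
```

In this sketch a request for SKETCH_REG_DS is rejected outright, while a user-register request is accepted only together with exclude_kernel.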
Signed-off-by: Andi Kleen Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e918424231eef4f6652c2221ba2f1d6b44e04233 Author: Rudolf Marek Date: Tue Nov 28 22:01:06 2017 +0100 x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD commit f2dbad36c55e5d3a91dccbde6e8cae345fe5632f upstream. [ Note, this is a Git cherry-pick of the following commit: 2b67799bdf25 ("x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD") ... for easier x86 PTI code testing and back-porting. ] The latest AMD AMD64 Architecture Programmer's Manual adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]). If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers, thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs. Signed-Off-By: Rudolf Marek Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Tested-by: Borislav Petkov Cc: Andy Lutomirski Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c6e38628af6d6e06d6b6038c00ed140323e3ba97 Author: Ricardo Neri Date: Sun Nov 5 18:27:51 2017 -0800 x86/cpufeature: Add User-Mode Instruction Prevention definitions commit a8b4db562e7283a1520f9e9730297ecaab7622ea upstream. [ Note, this is a Git cherry-pick of the following commit: (limited to the cpufeatures.h file) 3522c2a6a4f3 ("x86/cpufeature: Add User-Mode Instruction Prevention definitions") ... for easier x86 PTI code testing and back-porting. ] User-Mode Instruction Prevention is a security feature present in new Intel processors that, when set, prevents the execution of a subset of instructions if such instructions are executed in user mode (CPL > 0). 
Attempting to execute such instructions causes a general protection exception. The subset of instructions comprises: * SGDT - Store Global Descriptor Table * SIDT - Store Interrupt Descriptor Table * SLDT - Store Local Descriptor Table * SMSW - Store Machine Status Word * STR - Store Task Register This feature is also added to the list of disabled-features to allow a cleaner handling of build-time configuration. Signed-off-by: Ricardo Neri Reviewed-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Chen Yucong Cc: Chris Metcalf Cc: Dave Hansen Cc: Denys Vlasenko Cc: Fenghua Yu Cc: H. Peter Anvin Cc: Huang Rui Cc: Jiri Slaby Cc: Jonathan Corbet Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Masami Hiramatsu Cc: Michael S. Tsirkin Cc: Paolo Bonzini Cc: Paul Gortmaker Cc: Peter Zijlstra Cc: Ravi V. Shankar Cc: Shuah Khan Cc: Tony Luck Cc: Vlastimil Babka Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-7-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 330a4f53bbb4add9c998344af15c27afc9b1e825 Author: Ingo Molnar Date: Tue Dec 5 14:14:47 2017 +0100 drivers/misc/intel/pti: Rename the header file to free up the namespace commit 1784f9144b143a1e8b19fe94083b040aa559182b upstream. We'd like to use the 'PTI' acronym for 'Page Table Isolation' - free up the namespace by renaming the driver header to . (Also standardize the header guard name while at it.) Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: J Freyensee Cc: Greg Kroah-Hartman Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 399bbc9bb611e10dc5fd232b96b0bea18fca3482 Author: Juergen Gross Date: Thu Nov 9 14:27:36 2017 +0100 x86/virt: Add enum for hypervisors to replace x86_hyper commit 03b2a320b19f1424e9ac9c21696be9c60b6d0d93 upstream. 
The x86_hyper pointer is only used for checking whether a virtual device is supporting the hypervisor the system is running on. Use an enum for that purpose instead and drop the x86_hyper pointer. Signed-off-by: Juergen Gross Acked-by: Thomas Gleixner Acked-by: Xavier Deguillard Cc: Linus Torvalds Cc: Peter Zijlstra Cc: akataria@vmware.com Cc: arnd@arndb.de Cc: boris.ostrovsky@oracle.com Cc: devel@linuxdriverproject.org Cc: dmitry.torokhov@gmail.com Cc: gregkh@linuxfoundation.org Cc: haiyangz@microsoft.com Cc: kvm@vger.kernel.org Cc: kys@microsoft.com Cc: linux-graphics-maintainer@vmware.com Cc: linux-input@vger.kernel.org Cc: moltmann@vmware.com Cc: pbonzini@redhat.com Cc: pv-drivers@vmware.com Cc: rkrcmar@redhat.com Cc: sthemmin@microsoft.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 04d26709b13e72b02d5e435c6a3bebb63167ac71 Author: Juergen Gross Date: Thu Nov 9 14:27:35 2017 +0100 x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init' commit f72e38e8ec8869ac0ba5a75d7d2f897d98a1454e upstream. Instead of x86_hyper being either NULL on bare metal or a pointer to a struct hypervisor_x86 in case of the kernel running as a guest merge the struct into x86_platform and x86_init. This will remove the need for wrappers making it hard to find out what is being called. With dummy functions added for all callbacks testing for a NULL function pointer can be removed, too. 
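The dummy-callback approach described above can be sketched in plain C. This is a user-space illustration under stated assumptions, not the kernel's code: the struct layout and the names hyper_callbacks, platform_callbacks, noop_*, guest_*, install_guest_callbacks and platform_setup are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical callback table; the real x86_platform/x86_init layouts differ. */
struct hyper_callbacks {
	void (*init_platform)(void);
	bool (*x2apic_available)(void);
};

/* Dummy implementations used on bare metal, so no pointer is ever NULL. */
static void noop_init_platform(void) { }
static bool noop_x2apic_available(void) { return false; }

/* Defaults are installed statically; a detected hypervisor overrides them. */
static struct hyper_callbacks platform_callbacks = {
	.init_platform    = noop_init_platform,
	.x2apic_available = noop_x2apic_available,
};

static int guest_init_calls;
static void guest_init_platform(void) { guest_init_calls++; }
static bool guest_x2apic_available(void) { return true; }

/* Called once a (hypothetical) hypervisor has been detected. */
static void install_guest_callbacks(void)
{
	platform_callbacks.init_platform = guest_init_platform;
	platform_callbacks.x2apic_available = guest_x2apic_available;
}

/* Callers invoke the callbacks directly: no NULL test and no wrapper. */
static bool platform_setup(void)
{
	platform_callbacks.init_platform();
	return platform_callbacks.x2apic_available();
}
```

The design trade-off is a few trivial functions in exchange for call sites that name the callback they invoke, which makes it easier to find out what is actually being called.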
Suggested-by: Ingo Molnar Signed-off-by: Juergen Gross Acked-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Cc: akataria@vmware.com Cc: boris.ostrovsky@oracle.com Cc: devel@linuxdriverproject.org Cc: haiyangz@microsoft.com Cc: kvm@vger.kernel.org Cc: kys@microsoft.com Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: rusty@rustcorp.com.au Cc: sthemmin@microsoft.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20171109132739.23465-2-jgross@suse.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 99aee22dca890fb835570fa3a2a72ea43cca93ae Author: James Morse Date: Mon Nov 6 18:44:24 2017 +0000 ACPI / APEI: Replace ioremap_page_range() with fixmap commit 4f89fa286f6729312e227e7c2d764e8e7b9d340e upstream. Replace ghes_io{re,un}map_pfn_{nmi,irq}()'s use of ioremap_page_range() with __set_fixmap() as ioremap_page_range() may sleep to allocate a new level of page-table, even if it's passed an existing final-address to use in the mapping. The GHES driver can only be enabled for architectures that select HAVE_ACPI_APEI: Add fixmap entries to both x86 and arm64. clear_fixmap() does the TLB invalidation in __set_fixmap() for arm64 and __set_pte_vaddr() for x86. In each case it's the same as the respective arch_apei_flush_tlb_one(). Reported-by: Fengguang Wu Suggested-by: Linus Torvalds Signed-off-by: James Morse Reviewed-by: Borislav Petkov Tested-by: Tyler Baicar Tested-by: Toshi Kani [ For the arm64 bits: ] Acked-by: Will Deacon [ For the x86 bits: ] Acked-by: Ingo Molnar Signed-off-by: Rafael J. Wysocki Cc: All applicable Signed-off-by: Greg Kroah-Hartman commit 4a464a66db6d574dd622b5218ca617da41461149 Author: Andy Lutomirski Date: Sat Nov 4 04:19:51 2017 -0700 selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well commit adedf2893c192dd09b1cc2f2dcfdd7cad99ec49d upstream.
Now that the main test infrastructure supports the GDT, run tests that will pass the kernel's GDT permission tests against the GDT. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/686a1eda63414da38fcecc2412db8dba1ae40581.1509794321.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 46e6a15b40c99402aedde940f157614e0c927c7e Author: Andy Lutomirski Date: Sat Nov 4 04:19:50 2017 -0700 selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area() commit d744dcad39094c9187075e274d1cdef79c57c8b5 upstream. Much of the test design could apply to set_thread_area() (i.e. GDT), not just modify_ldt(). Add set_thread_area() to the install_valid_mode() helper. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/02c23f8fba5547007f741dc24c3926e5284ede02.1509794321.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d9eb267780ff856580885df063c5f4a6630e345f Author: Ingo Molnar Date: Tue Oct 31 13:17:23 2017 +0100 x86/cpufeatures: Fix various details in the feature definitions commit f3a624e901c633593156f7b00ca743a6204a29bc upstream. Kept this commit separate from the re-tabulation changes, to make the changes easier to review: - add better explanation for entries with no explanation - fix/enhance the text of some of the entries - fix the vertical alignment of some of the feature number definitions - fix inconsistent capitalization - ... and lots of other small details i.e. make it all more of a coherent unit, instead of a patchwork of years of additions. 
Cc: Andrew Morton Cc: Andy Lutomirski Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171031121723.28524-4-mingo@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 47af9e68f3f2e85b043e2c67e90d5aa90220a1d4 Author: Ingo Molnar Date: Tue Oct 31 13:17:22 2017 +0100 x86/cpufeatures: Re-tabulate the X86_FEATURE definitions commit acbc845ffefd9fb70466182cd8555a26189462b2 upstream. Over the years asm/cpufeatures.h has become somewhat of a mess: the original tabulation style was too narrow, while x86 feature names also kept growing in length, creating frequent field width overflows. Re-tabulate it to make it wider and easier to read/modify. Also harmonize the tabulation of the other defines in this file to match it. Cc: Andrew Morton Cc: Andy Lutomirski Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171031121723.28524-3-mingo@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 64766453be2ee380ba7f3620bfeadcee2787145f Author: Borislav Petkov Date: Fri Nov 3 11:20:28 2017 +0100 x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE commit c7da092a1f243bfd1bfb4124f538e69e941882da upstream. ... so that the difference is obvious. No functionality change. Signed-off-by: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171103102028.20284-1-bp@alien8.de Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c1ffb6aefbc5aa6472e172d9c24c0e7a414a1e1f Author: Thomas Gleixner Date: Thu Nov 2 13:30:03 2017 +0100 bitops: Revert cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h") commit 1943dc07b45e347c52c1bfdd4a37e04a86e399aa upstream. 
These ops are not endian safe and may break on architectures which have alignment requirements. Reverts: cbe96375025e ("bitops: Add clear/set_bit32() to linux/bitops.h") Reported-by: Peter Zijlstra Signed-off-by: Thomas Gleixner Cc: Andi Kleen Signed-off-by: Greg Kroah-Hartman commit 3243ae92926c4ad3a7f9596952542c6803e69050 Author: Thomas Gleixner Date: Thu Nov 2 13:22:35 2017 +0100 x86/cpuid: Replace set/clear_bit32() commit 06dd688ddda5819025e014b79aea9af6ab475fa2 upstream. Peter pointed out that the set/clear_bit32() variants are broken in various aspects. Replace them with open coded set/clear_bit() and type cast cpu_info::x86_capability as it's done in all other places throughout x86. Fixes: 0b00de857a64 ("x86/cpuid: Add generic table for CPUID dependencies") Reported-by: Peter Ziljstra Signed-off-by: Thomas Gleixner Cc: Andi Kleen Signed-off-by: Greg Kroah-Hartman commit b36c2c3ab339f50fb60f095ee508fc3376bdb1d6 Author: Borislav Petkov Date: Thu Nov 2 13:09:26 2017 +0100 x86/entry/64: Shorten TEST instructions commit 1e4c4f610f774df6088d7c065b2dd4d22adba698 upstream. Convert TESTL to TESTB and save 3 bytes per callsite. No functionality change. Signed-off-by: Borislav Petkov Cc: Andy Lutomirski Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171102120926.4srwerqrr7g72e2k@pd.tnic Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 35c1d57e63914254d463a2361b6348a2b823c70d Author: Andy Lutomirski Date: Thu Nov 2 00:59:17 2017 -0700 x86/traps: Use a new on_thread_stack() helper to clean up an assertion commit 3383642c2f9d4f5b4fa37436db4a109a1a10018c upstream. Let's keep the stack-related logic together rather than open-coding a comparison in an assertion in the traps code.
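A helper of this kind can be sketched as follows. This is a user-space illustration under stated assumptions: SKETCH_THREAD_SIZE, the explicit top_of_stack and sp parameters, and the name on_stack_sketch are hypothetical stand-ins, whereas a kernel helper would read the current task's values itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stack size for the sketch; the kernel's THREAD_SIZE is config dependent. */
#define SKETCH_THREAD_SIZE ((uintptr_t)16384)

/*
 * Return true if sp lies less than SKETCH_THREAD_SIZE bytes below top_of_stack.
 * A single unsigned comparison covers both bounds: if sp is above the top,
 * the subtraction wraps around to a huge value and the test fails.
 */
static bool on_stack_sketch(uintptr_t top_of_stack, uintptr_t sp)
{
	return (top_of_stack - sp) < SKETCH_THREAD_SIZE;
}
```

Keeping the comparison in one named helper means an assertion can simply state its intent instead of repeating the pointer arithmetic at each call site.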
Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/856b15bee1f55017b8f79d3758b0d51c48a08cf8.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c6f563cd1393521c3bd74730cb972f4ec373694a Author: Andy Lutomirski Date: Thu Nov 2 00:59:16 2017 -0700 x86/entry/64: Remove thread_struct::sp0 commit d375cf1530595e33961a8844192cddab913650e3 upstream. On x86_64, we can easily calculate sp0 when needed instead of storing it in thread_struct. On x86_32, a similar cleanup would be possible, but it would require cleaning up the vm86 code first, and that can wait for a later cleanup series. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/719cd9c66c548c4350d98a90f050aee8b17f8919.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 266a0b19177e9ad7767aed90e14033dd46a8f000 Author: Andy Lutomirski Date: Thu Nov 2 00:59:15 2017 -0700 x86/entry/32: Fix cpu_current_top_of_stack initialization at boot commit cd493a6deb8b78eca280d05f7fa73fd69403ae29 upstream. cpu_current_top_of_stack's initialization forgot about TOP_OF_KERNEL_STACK_PADDING. This bug didn't matter because the idle threads never enter user mode. 
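The relationship between a stack's base address, its size and the padding at the top can be sketched as below. The constants and function names are hypothetical and the values are illustrative; in the kernel, THREAD_SIZE and TOP_OF_KERNEL_STACK_PADDING depend on architecture and configuration.

```c
#include <stdint.h>

/* Illustrative values only, not taken from any particular kernel config. */
#define SKETCH_THREAD_SIZE ((uintptr_t)8192)
#define SKETCH_TOP_PADDING ((uintptr_t)8)

/* Top of the usable stack: base + size, minus the reserved padding. */
static uintptr_t sketch_top_of_stack(uintptr_t stack_base)
{
	return stack_base + SKETCH_THREAD_SIZE - SKETCH_TOP_PADDING;
}

/* The kind of initialization described above: the padding is forgotten. */
static uintptr_t sketch_top_of_stack_unpadded(uintptr_t stack_base)
{
	return stack_base + SKETCH_THREAD_SIZE;
}
```

The two results differ by exactly the padding, which only becomes visible on paths that rely on the padded value.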
Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/e5e370a7e6e4fddd1c4e4cf619765d96bb874b21.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c30eb760e3ecc7aebf37e557baecefb244733caa Author: Andy Lutomirski Date: Thu Nov 2 00:59:14 2017 -0700 x86/entry/64: Remove all remaining direct thread_struct::sp0 reads commit 46f5a10a721ce8dce8cc8fe55279b49e1c6b3288 upstream. The only remaining readers are in context switch code or vm86(), and they all just want to update TSS.sp0 to match the current task. Replace them all with a new helper update_sp0(). Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/2d231687f4ff288c9d9e98d7861b7df374246ac3.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 71d7244efb0c9af0443f7376c844f3785a33497d Author: Andy Lutomirski Date: Thu Nov 2 00:59:13 2017 -0700 x86/entry/64: Stop initializing TSS.sp0 at boot commit 20bb83443ea79087b5e5f8dab4e9d80bb9bf7acb upstream. In my quest to get rid of thread_struct::sp0, I want to clean up or remove all of its readers. Two of them are in cpu_init() (32-bit and 64-bit), and they aren't needed. This is because we never enter userspace at all on the threads that CPUs are initialized in. Poison the initial TSS.sp0 and stop initializing it on CPU init. The comment text mostly comes from Dave Hansen. Thanks!
Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/ee4a00540ad28c6cff475fbcc7769a4460acc861.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 0917dd6e7a73c39d5bb1036e2e1fc73d5d56cb8c Author: Andy Lutomirski Date: Thu Nov 2 00:59:12 2017 -0700 x86/xen/64, x86/entry/64: Clean up SP code in cpu_initialize_context() commit f16b3da1dc936c0f8121741d0a1731bf242f2f56 upstream. I'm removing thread_struct::sp0, and Xen's usage of it is slightly dubious and unnecessary. Use appropriate helpers instead. While we're at it, reorder the code slightly to make it more obvious what's going on. Signed-off-by: Andy Lutomirski Reviewed-by: Juergen Gross Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/d5b9a3da2b47c68325bd2bbe8f82d9554dee0d0f.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit f576136bc88175014ef3f57378d0d982a5c4e786 Author: Andy Lutomirski Date: Thu Nov 2 00:59:11 2017 -0700 x86/entry: Add task_top_of_stack() to find the top of a task's stack commit 3500130b84a3cdc5b6796eba1daf178944935efe upstream. This will let us get rid of a few places that hardcode accesses to thread.sp0. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/b49b3f95a8ff858c40c9b0f5b32be0355324327d.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e37558449abadb91ac13b85737d963f230118bdf Author: Andy Lutomirski Date: Thu Nov 2 00:59:10 2017 -0700 x86/entry/64: Pass SP0 directly to load_sp0() commit da51da189a24bb9b7e2d5a123be096e51a4695a5 upstream.
load_sp0() had an odd signature: void load_sp0(struct tss_struct *tss, struct thread_struct *thread); Simplify it to: void load_sp0(unsigned long sp0); Also simplify a few get_cpu()/put_cpu() sequences to preempt_disable()/preempt_enable(). Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/2655d8b42ed940aa384fe18ee1129bbbcf730a08.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ebef3548d5777e842acb0ab1bba599643525536c Author: Andy Lutomirski Date: Thu Nov 2 00:59:09 2017 -0700 x86/entry/32: Pull the MSR_IA32_SYSENTER_CS update code out of native_load_sp0() commit bd7dc5a6afac719d8ce4092391eef2c7e83c2a75 upstream. This causes the MSR_IA32_SYSENTER_CS write to move out of the paravirt callback. This shouldn't affect Xen PV: Xen already ignores MSR_IA32_SYSENTER_ESP writes. In any event, Xen doesn't support vm86() in a useful way. Note to any potential backporters: This patch won't break lguest, as lguest didn't have any SYSENTER support at all. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/75cf09fe03ae778532d0ca6c65aa58e66bc2f90c.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 6ff096cf2bf86824579bdcce3743143f993fbe5c Author: Andy Lutomirski Date: Thu Nov 2 00:59:08 2017 -0700 x86/entry/64: De-Xen-ify our NMI code commit 929bacec21478a72c78e4f29f98fb799bd00105a upstream. Xen PV is fundamentally incompatible with our fancy NMI code: it doesn't use IST at all, and Xen entries clobber two stack slots below the hardware frame. Drop Xen PV support from our NMI code entirely. 
Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Acked-by: Juergen Gross Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/bfbe711b5ae03f672f8848999a8eb2711efc7f98.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit f53f7a3f0156a39eccea5bfd8a746121627c9c27 Author: Juergen Gross Date: Thu Nov 2 00:59:07 2017 -0700 xen, x86/entry/64: Add xen NMI trap entry commit 43e4111086a70c78bedb6ad990bee97f17b27a6e upstream. Instead of trying to execute any NMI via the bare metal's NMI trap handler use a Xen specific one for PV domains, like we do for e.g. debug traps. As in a PV domain the NMI is handled via the normal kernel stack this is the correct thing to do. This will enable us to get rid of the very fragile and questionable dependencies between the bare metal NMI handler and Xen assumptions believed to be broken anyway. Signed-off-by: Juergen Gross Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/5baf5c0528d58402441550c5770b98e7961e7680.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 8d50dee92fb29c953a7101af63b0aea667c13296 Author: Andy Lutomirski Date: Thu Nov 2 00:59:06 2017 -0700 x86/entry/64: Remove the RESTORE_..._REGS infrastructure commit c39858de696f0cc160a544455e8403d663d577e9 upstream. All users of RESTORE_EXTRA_REGS, RESTORE_C_REGS and such, and REMOVE_PT_GPREGS_FROM_STACK are gone. Delete the macros. 
Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/c32672f6e47c561893316d48e06c7656b1039a36.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2550871499e4521792db6f282964d994d25aa459 Author: Andy Lutomirski Date: Thu Nov 2 00:59:05 2017 -0700 x86/entry/64: Use POP instead of MOV to restore regs on NMI return commit 471ee4832209e986029b9fabdaad57b1eecb856b upstream. This gets rid of the last user of the old RESTORE_..._REGS infrastructure. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e7273aae139703223ef2128c88188a6a3182e855 Author: Andy Lutomirski Date: Thu Nov 2 00:59:04 2017 -0700 x86/entry/64: Merge the fast and slow SYSRET paths commit a512210643da8082cb44181dba8b18e752bd68f0 upstream. They did almost the same thing. Remove a bunch of pointless instructions (mostly hidden in macros) and reduce cognitive load by merging them. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 673b1522c6585548798b281419f4ef67b8d17045 Author: Andy Lutomirski Date: Thu Nov 2 00:59:03 2017 -0700 x86/entry/64: Use pop instead of movq in syscall_return_via_sysret commit 4fbb39108f972437c44e5ffa781b56635d496826 upstream. Saves 64 bytes. 
Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit b774233bcdd4753267c7238a4b7754b9cc2b45ed Author: Andy Lutomirski Date: Thu Nov 2 00:59:02 2017 -0700 x86/entry/64: Shrink paranoid_exit_restore and make labels local commit e53178328c9b96fbdbc719e78c93b5687ee007c3 upstream. paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel. Merge them and make the paranoid_exit internal labels local. Keeping .Lparanoid_exit makes the code a bit shorter because it allows a 2-byte jnz instead of a 5-byte jnz. Saves 96 bytes of text. ( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS kernel, but fixing that would make the code rather messy. ) Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 37bae8ecdb518a676c5e12520f393d0069e57dfe Author: Andy Lutomirski Date: Thu Nov 2 00:59:01 2017 -0700 x86/entry/64: Simplify reg restore code in the standard IRET paths commit e872045bfd9c465a8555bab4b8567d56a4d2d3bb upstream. The old code restored all the registers with movq instead of pop. In theory, this was done because some CPUs have higher movq throughput, but any gain there would be tiny and is almost certainly outweighed by the higher text size. This saves 96 bytes of text. 
Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 7d4bb32bc6fded10ff2874ae1d6d655b5511829c Author: Andy Lutomirski Date: Thu Nov 2 00:59:00 2017 -0700 x86/entry/64: Move SWAPGS into the common IRET-to-usermode path commit 8a055d7f411d41755ce30db5bb65b154777c4b78 upstream. All of the code paths that ended up doing IRET to usermode did SWAPGS immediately beforehand. Move the SWAPGS into the common code. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 65236abc42b068506720f4a5168e5e63636138dd Author: Andy Lutomirski Date: Thu Nov 2 00:58:59 2017 -0700 x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths commit 26c4ef9c49d8a0341f6d97ce2cfdd55d1236ed29 upstream. These code paths will diverge soon. Signed-off-by: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit b7ee7fcca8a525c9bc636fab23779f18d3de9e80 Author: Andy Lutomirski Date: Thu Nov 2 00:58:58 2017 -0700 x86/entry/64: Remove the restore_c_regs_and_iret label commit 9da78ba6b47b46428cfdfc0851511ab29c869798 upstream. The only user was the 64-bit opportunistic SYSRET failure path, and that path didn't really need it. This change makes the opportunistic SYSRET code a bit more straightforward and gets rid of the label. 
Signed-off-by: Andy Lutomirski Reviewed-by: Borislav Petkov Cc: Borislav Petkov Cc: Brian Gerst Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ab3e5dfff36f4af4df09d86384376ebd188d7ae9 Author: Ricardo Neri Date: Fri Oct 27 13:25:30 2017 -0700 ptrace,x86: Make user_64bit_mode() available to 32-bit builds commit e27c310af5c05cf876d9cad006928076c27f54d4 upstream. In its current form, user_64bit_mode() can only be used when CONFIG_X86_64 is selected. This implies that code built with CONFIG_X86_64=n cannot use it. If a piece of code needs to be built for both CONFIG_X86_64=y and CONFIG_X86_64=n and wants to use this function, it needs to wrap it in an #ifdef/#endif; potentially, in multiple places. This can be easily avoided with a single #ifdef/#endif pair within user_64bit_mode() itself. Suggested-by: Borislav Petkov Signed-off-by: Ricardo Neri Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Cc: "Michael S. Tsirkin" Cc: Peter Zijlstra Cc: Dave Hansen Cc: ricardo.neri@intel.com Cc: Adrian Hunter Cc: Paul Gortmaker Cc: Huang Rui Cc: Qiaowei Ren Cc: Shuah Khan Cc: Kees Cook Cc: Jonathan Corbet Cc: Jiri Slaby Cc: Dmitry Vyukov Cc: "Ravi V. Shankar" Cc: Chris Metcalf Cc: Brian Gerst Cc: Arnaldo Carvalho de Melo Cc: Andy Lutomirski Cc: Colin Ian King Cc: Chen Yucong Cc: Adam Buchbinder Cc: Vlastimil Babka Cc: Lorenzo Stoakes Cc: Masami Hiramatsu Cc: Paolo Bonzini Cc: Andrew Morton Cc: Thomas Garnier Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 985cba48423594b8e8792770a65543d7ee689eba Author: Ricardo Neri Date: Fri Oct 27 13:25:29 2017 -0700 x86/boot: Relocate definition of the initial state of CR0 commit b0ce5b8c95c83a7b98c679b117e3d6ae6f97154b upstream. 
Both head_32.S and head_64.S utilize the same value to initialize the control register CR0. Also, other parts of the kernel might want to access this initial definition (e.g., emulation code for User-Mode Instruction Prevention uses this state to provide a sane dummy value for CR0 when emulating the smsw instruction). Thus, relocate this definition to a header file from which it can be conveniently accessed. Suggested-by: Borislav Petkov Signed-off-by: Ricardo Neri Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Andy Lutomirski Cc: "Michael S. Tsirkin" Cc: Peter Zijlstra Cc: Dave Hansen Cc: ricardo.neri@intel.com Cc: linux-mm@kvack.org Cc: Paul Gortmaker Cc: Huang Rui Cc: Shuah Khan Cc: linux-arch@vger.kernel.org Cc: Jonathan Corbet Cc: Jiri Slaby Cc: "Ravi V. Shankar" Cc: Denys Vlasenko Cc: Chris Metcalf Cc: Brian Gerst Cc: Josh Poimboeuf Cc: Chen Yucong Cc: Vlastimil Babka Cc: Dave Hansen Cc: Andy Lutomirski Cc: Masami Hiramatsu Cc: Paolo Bonzini Cc: Andrew Morton Cc: Linus Torvalds Link: https://lkml.kernel.org/r/1509135945-13762-3-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 4ed772a7dee902953066869264934c36c7689097 Author: Ricardo Neri Date: Fri Oct 27 13:25:28 2017 -0700 x86/mm: Relocate page fault error codes to traps.h commit 1067f030994c69ca1fba8c607437c8895dcf8509 upstream. Up to this point, only fault.c used the definitions of the page fault error codes. Thus, it made sense to keep them within such file. Other portions of code might be interested in those definitions too. For instance, the User- Mode Instruction Prevention emulation code will use such definitions to emulate a page fault when it is unable to successfully copy the results of the emulated instructions to user space. While relocating the error code enumeration, the prefix X86_ is used to make it consistent with the rest of the definitions in traps.h. Of course, code using the enumeration had to be updated as well. 
No functional changes were performed. Signed-off-by: Ricardo Neri Signed-off-by: Thomas Gleixner Reviewed-by: Borislav Petkov Reviewed-by: Andy Lutomirski Cc: "Michael S. Tsirkin" Cc: Peter Zijlstra Cc: Dave Hansen Cc: ricardo.neri@intel.com Cc: Paul Gortmaker Cc: Huang Rui Cc: Shuah Khan Cc: Jonathan Corbet Cc: Jiri Slaby Cc: "Ravi V. Shankar" Cc: Chris Metcalf Cc: Brian Gerst Cc: Josh Poimboeuf Cc: Chen Yucong Cc: Vlastimil Babka Cc: Masami Hiramatsu Cc: Paolo Bonzini Cc: Andrew Morton Cc: "Kirill A. Shutemov" Link: https://lkml.kernel.org/r/1509135945-13762-2-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 12dc3fa30178a4635e435ed2b4e543f76c10dc93 Author: Gayatri Kammela Date: Mon Oct 30 18:20:29 2017 -0700 x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features commit c128dbfa0f879f8ce7b79054037889b0b2240728 upstream. Add a few new SSE/AVX/AVX512 instruction groups/features for enumeration in /proc/cpuinfo: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. CPUID.(EAX=7,ECX=0):ECX[bit 6] AVX512_VBMI2 CPUID.(EAX=7,ECX=0):ECX[bit 8] GFNI CPUID.(EAX=7,ECX=0):ECX[bit 9] VAES CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG Detailed information of CPUID bits for these features can be found in the Intel Architecture Instruction Set Extensions and Future Features Programming Interface document (refer to Table 1-1. and Table 1-2.). 
A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=197239 Signed-off-by: Gayatri Kammela Acked-by: Thomas Gleixner Cc: Andi Kleen Cc: Fenghua Yu Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Ravi Shankar Cc: Ricardo Neri Cc: Yang Zhong Cc: bp@alien8.de Link: http://lkml.kernel.org/r/1509412829-23380-1-git-send-email-gayatri.kammela@intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c60238f5712ba5a7d8391914f8290ae1bf5c08c0 Author: Baoquan He Date: Sat Oct 28 09:30:38 2017 +0800 x86/mm/64: Rename the register_page_bootmem_memmap() 'size' parameter to 'nr_pages' commit 15670bfe19905b1dcbb63137f40d718b59d84479 upstream. register_page_bootmem_memmap()'s 3rd 'size' parameter is named in a somewhat misleading fashion - rename it to 'nr_pages' which makes the units of it much clearer. Meanwhile rename the existing local variable 'nr_pages' to 'nr_pmd_pages', a more expressive name, to avoid conflict with new function parameter 'nr_pages'. (Also clean up the unnecessary parentheses in which get_order() is called.) Signed-off-by: Baoquan He Acked-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Cc: akpm@linux-foundation.org Link: http://lkml.kernel.org/r/1509154238-23250-1-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 3df257ddc5f48214e229b26f3e3d5f42e0f45380 Author: Masahiro Yamada Date: Fri Oct 27 13:11:10 2017 +0900 x86/build: Beautify build log of syscall headers commit af8e947079a7dab0480b5d6db6b093fd04b86fc9 upstream. This makes the build log look nicer. 
Before: SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_32_ia32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_64_x32.h SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_64.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_64.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_x32.h After: SYSTBL arch/x86/include/generated/asm/syscalls_32.h SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h SYSTBL arch/x86/include/generated/asm/syscalls_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h Signed-off-by: Masahiro Yamada Acked-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Cc: "H. Peter Anvin" Cc: linux-kbuild@vger.kernel.org Link: http://lkml.kernel.org/r/1509077470-2735-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 20e9bfd7b8a3463c5ecc1d22b60be30a32607dc4 Author: Josh Poimboeuf Date: Fri Oct 20 11:21:35 2017 -0500 x86/asm: Don't use the confusing '.ifeq' directive commit 82c62fa0c49aa305104013cee4468772799bb391 upstream. I find the '.ifeq <expression>' directive to be confusing. Reading it quickly seems to suggest its opposite meaning, or that it's missing an argument. Improve readability by replacing all of its x86 uses with '.if <expression> == 0'.
Signed-off-by: Josh Poimboeuf Cc: Andrei Vagin Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 09807080a97cedc160bc42efd3e8c3a3b72a29c7 Author: Dongjiu Geng Date: Tue Oct 17 16:02:20 2017 +0800 ACPI / APEI: remove the unused dead-code for SEA/NMI notification type commit c49870e89f4d2c21c76ebe90568246bb0f3572b7 upstream. For the SEA notification, the two functions ghes_sea_add() and ghes_sea_remove() are only called when CONFIG_ACPI_APEI_SEA is defined. If not, it will return errors in the ghes_probe() and not continue. If the probe is failed, the ghes_sea_remove() also has no chance to be called. Hence, remove the unnecessary handling when CONFIG_ACPI_APEI_SEA is not defined. For the NMI notification, it has the same issue as SEA notification, so also remove the unused dead-code for it. Signed-off-by: Dongjiu Geng Tested-by: Tyler Baicar Reviewed-by: Borislav Petkov Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit af02cd973d86cf3f6bd6eea928ccc9e70a4132d0 Author: Kirill A. Shutemov Date: Fri Sep 29 17:08:20 2017 +0300 x86/xen: Drop 5-level paging support code from the XEN_PV code commit 773dd2fca581b0a80e5a33332cc8ee67e5a79cba upstream. It was decided 5-level paging is not going to be supported in XEN_PV. Let's drop the dead code from the XEN_PV code. Tested-by: Juergen Gross Signed-off-by: Kirill A. Shutemov Reviewed-by: Juergen Gross Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Cyrill Gorcunov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170929140821.37654-6-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 13bda9cfea5116f15a69725c15bc2614a6c803dc Author: Kirill A. 
Shutemov Date: Fri Sep 29 17:08:19 2017 +0300 x86/xen: Provide pre-built page tables only for CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y commit 4375c29985f155d7eb2346615d84e62d1b673682 upstream. Looks like we only need pre-built page tables in the CONFIG_XEN_PV=y and CONFIG_XEN_PVH=y cases. Let's not provide them for other configurations. Signed-off-by: Kirill A. Shutemov Reviewed-by: Juergen Gross Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Cyrill Gorcunov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170929140821.37654-5-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 873f59b8bd88c897fc1545efdf5742fc63f7a252 Author: Andrey Ryabinin Date: Fri Sep 29 17:08:18 2017 +0300 x86/kasan: Use the same shadow offset for 4- and 5-level paging commit 12a8cc7fcf54a8575f094be1e99032ec38aa045c upstream. We are going to support boot-time switching between 4- and 5-level paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET for different paging modes: the constant is passed to gcc to generate code and cannot be changed at runtime. This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset for both 4- and 5-level paging. For 5-level paging it means that the shadow memory region is not aligned to PGD boundary anymore and we have to handle unaligned parts of the region properly. In addition, we have to exclude paravirt code from KASAN instrumentation as we now use set_pgd() before KASAN is fully ready. [kirill.shutemov@linux.intel.com: cleanup, changelog message] Signed-off-by: Andrey Ryabinin Signed-off-by: Kirill A.
Shutemov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Cyrill Gorcunov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170929140821.37654-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 4afaf6ea65acb07e151470580d89e6a2c0268610 Author: Kirill A. Shutemov Date: Fri Sep 29 17:08:16 2017 +0300 mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4 upstream. Size of the mem_section[] array depends on the size of the physical address space. In preparation for boot-time switching between paging modes on x86-64 we need to make the allocation of mem_section[] dynamic, because otherwise we waste a lot of RAM: with CONFIG_NODE_SHIFT=10, mem_section[] size is 32kB for 4-level paging and 2MB for 5-level paging mode. The patch allocates the array on the first call to sparse_memory_present_with_active_regions(). Signed-off-by: Kirill A. Shutemov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Cyrill Gorcunov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170929140821.37654-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 36295155d8d4c3f9fb425faac9120559c051b861 Author: Thomas Gleixner Date: Wed Oct 18 19:39:35 2017 +0200 x86/cpuid: Prevent out of bound access in do_clear_cpu_cap() commit 57b8b1a1856adaa849d02d547411a553a531022b upstream. do_clear_cpu_cap() allocates a bitmap to keep track of disabled feature dependencies. That bitmap is sized NCAPINTS * BITS_PER_INIT. The possible 'features' which can be handed in are larger than this, because after the capabilities the bug 'feature' bits occupy another 32bit. Not really obvious... 
So clearing any of the misfeature bits, as 32bit does for the F00F bug, accesses that bitmap out of bounds thereby corrupting the stack. Size the bitmap properly and add a sanity check to catch accidental out of bound access. Fixes: 0b00de857a64 ("x86/cpuid: Add generic table for CPUID dependencies") Reported-by: kernel test robot Signed-off-by: Thomas Gleixner Cc: Andi Kleen Cc: Borislav Petkov Link: https://lkml.kernel.org/r/20171018022023.GA12058@yexl-desktop Signed-off-by: Greg Kroah-Hartman commit 61d1ad3631150386f3c142daff97913c1770e0fa Author: Kamalesh Babulal Date: Sat Oct 14 20:17:54 2017 +0530 objtool: Print top level commands on incorrect usage commit 6a93bb7e4a7d6670677d5b0eb980936eb9cc5d2e upstream. Print top-level objtool commands, along with the error on incorrect command line usage. The objtool command line parser exits with code 129 on incorrect usage. Convert the cmd_usage() exit code also, to maintain consistency across objtool. After the patch: $ ./objtool -j Unknown option: -j usage: objtool COMMAND [ARGS] Commands: check Perform stack metadata validation on an object file orc Generate in-place ORC unwind tables for an object file $ echo $? 129 Signed-off-by: Kamalesh Babulal Acked-by: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1507992474-16142-1-git-send-email-kamalesh@linux.vnet.ibm.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 3b57d66c8e53390b0035c578063db22c73ebce1a Author: Kees Cook Date: Mon Oct 16 16:22:31 2017 -0700 x86/platform/UV: Convert timers to use timer_setup() commit 376f3bcebdc999cc737d9052109cc33b573b3a8b upstream. In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly.
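The conversion pattern described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct timer_list, timer_setup() and from_timer() below are simplified stand-ins written for this sketch, not the kernel's definitions, and struct demo_dev, demo_timeout() and fire_timer() are hypothetical names. The point it shows is that the callback receives the timer pointer itself and recovers the enclosing object from it, instead of being handed an opaque data argument.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct timer_list (assumption). */
struct timer_list {
	void (*function)(struct timer_list *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified analogue of from_timer(): map the timer back to its owner. */
#define from_timer(var, timer_ptr, field) \
	container_of(timer_ptr, __typeof__(*(var)), field)

/* Simplified analogue of timer_setup(): only records the callback. */
static void timer_setup(struct timer_list *t,
			void (*fn)(struct timer_list *), unsigned int flags)
{
	(void)flags;	/* flags are ignored in this sketch */
	t->function = fn;
}

/* Hypothetical driver state that embeds its timer. */
struct demo_dev {
	int ticks;
	struct timer_list timer;
};

/* Callback in the new style: it is handed the timer, not a data cookie. */
static void demo_timeout(struct timer_list *t)
{
	struct demo_dev *dev = from_timer(dev, t, timer);

	dev->ticks++;
}

/* Hypothetical stand-in for timer expiry: invoke the stored callback. */
static void fire_timer(struct timer_list *t)
{
	assert(t->function != NULL);
	t->function(t);
}
```

A caller would embed the timer in its own structure, call timer_setup(&dev.timer, demo_timeout, 0) once, and let expiry invoke the callback with &dev.timer.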
Signed-off-by: Kees Cook Signed-off-by: Thomas Gleixner Cc: Dimitri Sivanich Cc: Russ Anderson Cc: Mike Travis Link: https://lkml.kernel.org/r/20171016232231.GA100493@beast Signed-off-by: Greg Kroah-Hartman commit 110ad51cc874a943792e16258e52ff3360cb4f96 Author: Andi Kleen Date: Fri Oct 13 14:56:45 2017 -0700 x86/fpu: Remove the explicit clearing of XSAVE dependent features commit 73e3a7d2a7c3be29a5a22b85026f6cfa5664267f upstream. Clearing a CPU feature with setup_clear_cpu_cap() clears all features which depend on it. Expressing feature dependencies in one place is easier to maintain than keeping functions like fpu__xstate_clear_all_cpu_caps() up to date. The features which depend on XSAVE have their dependency expressed in the dependency table, so it's sufficient to clear X86_FEATURE_XSAVE. Remove the explicit clearing of XSAVE dependent features. Signed-off-by: Andi Kleen Reviewed-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20171013215645.23166-6-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 0e7127aa76e05b51f767dbec74b236a1fa5eab06 Author: Andi Kleen Date: Fri Oct 13 14:56:44 2017 -0700 x86/fpu: Make XSAVE check the base CPUID features before enabling commit ccb18db2ab9d923df07e7495123fe5fb02329713 upstream. Before enabling XSAVE, not only check the XSAVE specific CPUID bits, but also the base CPUID features of the respective XSAVE feature. This allows disabling individual XSAVE states using the existing clearcpuid= option, which can be useful for performance testing and debugging, and also in general avoids inconsistencies.
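The base-feature check can be illustrated with a small userspace sketch. It assumes a table that maps each XSAVE state-component bit to the CPUID feature it relies on; the enum values, the table name xstate_base_feature and the helper filter_xfeatures() are hypothetical and chosen for this example only, they are not the kernel's identifiers. Only the component bit positions (0 to 7) follow the architectural XSAVE layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base CPUID feature identifiers (not the kernel's values). */
enum base_feature {
	FEAT_FPU,
	FEAT_SSE,
	FEAT_AVX,
	FEAT_MPX,
	FEAT_AVX512F,
	FEAT_COUNT
};

/* Base feature required by each XSAVE state component, indexed by bit. */
static const enum base_feature xstate_base_feature[] = {
	FEAT_FPU,	/* bit 0: x87 state */
	FEAT_SSE,	/* bit 1: SSE state */
	FEAT_AVX,	/* bit 2: AVX state */
	FEAT_MPX,	/* bit 3: MPX bounds registers */
	FEAT_MPX,	/* bit 4: MPX bounds config/status */
	FEAT_AVX512F,	/* bit 5: AVX-512 opmask */
	FEAT_AVX512F,	/* bit 6: AVX-512 ZMM_Hi256 */
	FEAT_AVX512F,	/* bit 7: AVX-512 Hi16_ZMM */
};

/*
 * Drop every state component whose base feature is not present.
 * has_feature[] is indexed by enum base_feature.
 */
static uint64_t filter_xfeatures(uint64_t xfeatures_mask,
				 const bool *has_feature)
{
	size_t n = sizeof(xstate_base_feature) / sizeof(xstate_base_feature[0]);

	assert(has_feature != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!has_feature[xstate_base_feature[i]])
			xfeatures_mask &= ~(UINT64_C(1) << i);
	}
	return xfeatures_mask;
}
```

With such a filter in place, clearing a base feature (for instance via clearcpuid=) would also remove the matching state components from the enabled mask, which is the consistency the commit message is after.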
Signed-off-by: Andi Kleen Reviewed-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20171013215645.23166-5-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit f01d7efac7929e3630e25a84e101228ea907bbea Author: Andi Kleen Date: Fri Oct 13 14:56:43 2017 -0700 x86/fpu: Parse clearcpuid= as early XSAVE argument commit 0c2a3913d6f50503f7c59d83a6219e39508cc898 upstream. With a follow-on patch we want to make clearcpuid affect the XSAVE configuration. But xsave is currently initialized before arguments are parsed. Move the clearcpuid= parsing into the special early xsave argument parsing code. Since clearcpuid= contains a = we need to keep the old __setup around as a dummy, otherwise it would end up as an environment variable in init's environment. Signed-off-by: Andi Kleen Reviewed-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20171013215645.23166-4-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d602a3465c77190ef75be052c76ccbdb6c72ed6e Author: Andi Kleen Date: Fri Oct 13 14:56:42 2017 -0700 x86/cpuid: Add generic table for CPUID dependencies commit 0b00de857a648dafe7020878c7a27cf776f5edf4 upstream. Some CPUID features depend on other features. Currently it's possible to clear dependent features, but not clear the base features, which can cause various interesting problems. This patch implements a generic table to describe dependencies between CPUID features, to be used by all code that clears CPUID. Some subsystems (like XSAVE) had their own implementation of this, but it's better to do it all in a single place for everyone. Then clear_cpu_cap and setup_clear_cpu_cap always look up this table and clear all dependencies too. This is intended to be a practical table: only for features that make sense to clear.
If someone for example clears FPU, or other features that are essentially part of the required base feature set, not much is going to work. Handling that is right now out of scope. We're only handling features which can be usefully cleared. Signed-off-by: Andi Kleen Reviewed-by: Thomas Gleixner Cc: Jonathan McDowell Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20171013215645.23166-3-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit eb3addb22727034dd733fcc07a3b4248fe523b72 Author: Andi Kleen Date: Fri Oct 13 14:56:41 2017 -0700 bitops: Add clear/set_bit32() to linux/bitops.h commit cbe96375025e14fc76f9ed42ee5225120d7210f8 upstream. Add two simple wrappers around set_bit/clear_bit() that accept the common case of an u32 array. This avoids writing casts in all callers. Signed-off-by: Andi Kleen Reviewed-by: Thomas Gleixner Cc: Linus Torvalds Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20171013215645.23166-2-andi@firstfloor.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit b40a923903d0a535676b8d7b22bfe17260c3d35a Author: Josh Poimboeuf Date: Fri Oct 13 15:02:01 2017 -0500 x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit commit fc72ae40e30327aa24eb88a24b9c7058f938bd36 upstream. The ORC unwinder has been stable in testing so far. Give it much wider testing by making it the default in kconfig for x86_64. It's not yet supported for 32-bit, so leave frame pointers as the default there. 
Suggested-by: Ingo Molnar Signed-off-by: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/9b1237bbe7244ed9cdf8db2dcb1253e37e1c341e.1507924831.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 8af220c9e240a47660686161e6d04043ad09c563 Author: Josh Poimboeuf Date: Fri Oct 13 15:02:00 2017 -0500 x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*' commit 11af847446ed0d131cf24d16a7ef3d5ea7a49554 upstream. Rename the unwinder config options from: CONFIG_ORC_UNWINDER CONFIG_FRAME_POINTER_UNWINDER CONFIG_GUESS_UNWINDER to: CONFIG_UNWINDER_ORC CONFIG_UNWINDER_FRAME_POINTER CONFIG_UNWINDER_GUESS ... in order to give them a more logical config namespace. Suggested-by: Ingo Molnar Signed-off-by: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit a11adc6a2e0b429adab534c44f68c057d12ecd4d Author: Steven Rostedt (VMware) Date: Thu Oct 12 18:06:19 2017 -0400 x86/fpu/debug: Remove unused 'x86_fpu_state' and 'x86_fpu_deactivate_state' tracepoints commit 127a1bea40f7f2a36bc7207ea4d51bb6b4e936fa upstream. Commit: d1898b733619 ("x86/fpu: Add tracepoints to dump FPU state at key points") ... added the 'x86_fpu_state' and 'x86_fpu_deactivate_state' trace points, but never used them. Today they are still not used. As they take up and waste memory, remove them. 
Signed-off-by: Steven Rostedt (VMware) Cc: Dave Hansen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171012180619.670b68b6@gandalf.local.home Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit ab7fc55ef231511ae40adf4c51cec89035dad4db Author: Ingo Molnar Date: Thu Oct 12 09:24:30 2017 +0200 x86/unwinder: Make CONFIG_UNWINDER_ORC=y the default in the 64-bit defconfig commit 1e4078f0bba46ad61b69548abe6a6faf63b89380 upstream. Increase testing coverage by turning on the primary x86 unwinder for the 64-bit defconfig. Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2cb7165b4dcf80b03d7cc86fedd1a65a455823b3 Author: Jan Beulich Date: Mon Sep 25 02:06:19 2017 -0600 ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq() commit 095f613c6b386a1704b73a549e9ba66c1d5381ae upstream. Match up with what 7edda0886b ("acpi: apei: handle SEA notification type for ARMv8") did for ghes_ioremap_pfn_nmi(). Signed-off-by: Jan Beulich Reviewed-by: Borislav Petkov Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 5d5e60c80fd8613b8a76876b3ffd357f7a0bf7e7 Author: Josh Poimboeuf Date: Mon Sep 18 21:43:37 2017 -0500 x86/head: Add unwind hint annotations commit 2704fbb672d0d9a19414907fda7949283dcef6a1 upstream. Jiri Slaby reported an ORC issue when unwinding from an idle task. The stack was: ffffffff811083c2 do_idle+0x142/0x1e0 ffffffff8110861d cpu_startup_entry+0x5d/0x60 ffffffff82715f58 start_kernel+0x3ff/0x407 ffffffff827153e8 x86_64_start_kernel+0x14e/0x15d ffffffff810001bf secondary_startup_64+0x9f/0xa0 The ORC unwinder errored out at secondary_startup_64 because the head code isn't annotated yet so there wasn't a corresponding ORC entry. Fix that and any other head-related unwinding issues by adding unwind hints to the head code. 
Reported-by: Jiri Slaby Tested-by: Jiri Slaby Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/78ef000a2f68f545d6eef44ee912edceaad82ccf.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d074a1075f6a123e3f9c5fdbac7eea4c1006e03b Author: Josh Poimboeuf Date: Mon Sep 18 21:43:36 2017 -0500 x86/xen: Add unwind hint annotations commit abbe1cac6214d81d2f4e149aba64a8760703144e upstream. Add unwind hint annotations to the xen head code so the ORC unwinder can read head_64.o. hypercall_page needs empty annotations at 32-byte intervals to match the 'xen_hypercall_*' ELF functions at those locations. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/70ed2eb516fe9266be766d953f93c2571bca88cc.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 2c9863c1687b2e1398d943351ede4ad45ab77cbe Author: Josh Poimboeuf Date: Mon Sep 18 21:43:35 2017 -0500 x86/xen: Fix xen head ELF annotations commit 2582d3df95c76d3b686453baf90b64d57e87d1e8 upstream. Mark the ends of the startup_xen and hypercall_page code sections. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/3a80a394d30af43d9cefa1a29628c45ed8420c97.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit aad9d83f9dcbc896a034f2676d3d2b661b0ca5cf Author: Josh Poimboeuf Date: Mon Sep 18 21:43:34 2017 -0500 x86/boot: Annotate verify_cpu() as a callable function commit e93db75a0054b23a874a12c63376753544f3fe9e upstream. verify_cpu() is a callable function. Annotate it as such. 
Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/293024b8a080832075312f38c07ccc970fc70292.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 8233afff9b45feca9020b2fc52f7967da5a13907 Author: Josh Poimboeuf Date: Mon Sep 18 21:43:33 2017 -0500 x86/head: Fix head ELF function annotations commit 015a2ea5478680fc5216d56b7ff306f2a74efaf9 upstream. These functions aren't callable C-type functions, so don't annotate them as such. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/36eb182738c28514f8bf95e403d89b6413a88883.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 98ce8eee6021e32ba8f606ed7de2a9c0f3807dcf Author: Josh Poimboeuf Date: Mon Sep 18 21:43:32 2017 -0500 x86/head: Remove unused 'bad_address' code commit a8b88e84d124bc92c4808e72b8b8c0e0bb538630 upstream. It's no longer possible for this code to be executed, so remove it. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/32a46fe92d2083700599b36872b26e7dfd7b7965.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 9cf5a88b165e203d8c7d5284640ea6aa61baf78a Author: Josh Poimboeuf Date: Mon Sep 18 21:43:31 2017 -0500 x86/head: Remove confusing comment commit 17270717e80de33a884ad328fea5f407d87f6d6a upstream. This comment is actively wrong and confusing. It refers to the registers' stack offsets after the pt_regs has been constructed on the stack, but this code is *before* that. 
At this point the stack just has the standard iret frame, for which no comment should be needed. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/a3c267b770fc56c9b86df9c11c552848248aace2.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 42314edefac8a80efb3d00a9f83bf1b1871bd473 Author: Josh Poimboeuf Date: Mon Sep 18 21:43:30 2017 -0500 objtool: Don't report end of section error after an empty unwind hint commit 00d96180dc38ef872ac471c2d3e14b067cbd895d upstream. If asm code specifies an UNWIND_HINT_EMPTY hint, don't warn if the section ends unexpectedly. This can happen with the xen-head.S code because the hypercall_page is "text" but it's all zeros. Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Jiri Slaby Cc: Juergen Gross Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/ddafe199dd8797e40e3c2777373347eba1d65572.1505764066.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit c09061aec2e5245f547ea6eecf321e172e42b7fb Author: Uros Bizjak Date: Wed Sep 6 17:18:08 2017 +0200 x86/asm: Remove unnecessary \n\t in front of CC_SET() from asm templates commit 3c52b5c64326d9dcfee4e10611c53ec1b1b20675 upstream. There is no need for \n\t in front of CC_SET(), as the macro already includes these two. Signed-off-by: Uros Bizjak Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20170906151808.5634-1-ubizjak@gmail.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman