commit 2134d97aa3a7ce38bb51f933f2e20cafde371085
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Feb 25 12:01:36 2016 -0800

    Linux 4.4.3

commit e2f712dc927e3b9a981ecd86a64d944d0b140322
Author: Luis R. Rodriguez <mcgrof@suse.com>
Date:   Wed Feb 3 16:55:26 2016 +1030

    modules: fix modparam async_probe request
    
    commit 4355efbd80482a961cae849281a8ef866e53d55c upstream.
    
    Commit f2411da746985 ("driver-core: add driver module
    asynchronous probe support") added async probe support,
    in two forms:
    
      * in-kernel driver specification annotation
      * generic async_probe module parameter (modprobe foo async_probe)
    
    To support the generic kernel parameter, parse_args() was
    extended via commit ecc8617053e0 ("module: add extra
    argument for parse_params() callback"); however, commit
    f2411da746985 failed to add the required argument.
    
    This causes a crash whenever the generic async_probe
    module parameter is used. This was overlooked when the
    form of the in-kernel async probe support was reworked
    a bit... Fix this as originally intended.
    
    Cc: Hannes Reinecke <hare@suse.de>
    Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> [minimized]
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a24d9a2fee9857a162ab18c9ad72aa6571ff4715
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Wed Feb 3 16:55:26 2016 +1030

    module: wrapper for symbol name.
    
    commit 2e7bac536106236104e9e339531ff0fcdb7b8147 upstream.
    
    This trivial wrapper adds clarity and makes the following patch
    smaller.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 82e730baa9f7ba55cebe9365efde3e9f008d4376
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Jan 14 16:54:48 2016 +0000

    itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper
    
    commit 51cbb5242a41700a3f250ecfb48dcfb7e4375ea4 upstream.
    
    As Helge reported for timerfd, we have the same issue in itimers. We return
    a remaining time larger than the programmed relative time to user space in
    case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust for the
    extra time added in hrtimer_start_range_ns().
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Helge Deller <deller@gmx.de>
    Cc: John Stultz <john.stultz@linaro.org>
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Link: http://lkml.kernel.org/r/20160114164159.528222587@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1c94da3e7480e3370945aad2b784343da9eb664f
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Jan 14 16:54:47 2016 +0000

    posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper
    
    commit 572c39172684c3711e4a03c9a7380067e2b0661c upstream.
    
    As Helge reported for timerfd, we have the same issue in posix timers. We
    return a remaining time larger than the programmed relative time to user
    space in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust
    for the extra time added in hrtimer_start_range_ns().
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Helge Deller <deller@gmx.de>
    Cc: John Stultz <john.stultz@linaro.org>
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Link: http://lkml.kernel.org/r/20160114164159.450510905@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 565f222968d300657ad40d2fad91e4ccf3148d89
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Jan 14 16:54:46 2016 +0000

    timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper
    
    commit b62526ed11a1fe3861ab98d40b7fdab8981d788a upstream.
    
    Helge reported that a relative timer can return a remaining time larger than
    the programmed relative time on parisc and other architectures which have
    CONFIG_TIME_LOW_RES set. This happens because we add a jiffy to the resulting
    expiry time to prevent short timeouts.
    
    Use the new function hrtimer_expires_remaining_adjusted() to calculate the
    remaining time. It takes that extra added time into account for relative
    timers.
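    
    The arithmetic can be sketched in userspace. This is only an
    illustration under stated assumptions: the 10 ms tick, TICK_NS and the
    three helper names are hypothetical and are not the kernel's hrtimer API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick length in nanoseconds (HZ=100 assumed). */
#define TICK_NS INT64_C(10000000)

/*
 * Arm a relative timer the way a low-resolution configuration does:
 * one extra tick is added so that the timer never fires early.
 */
int64_t arm_relative(int64_t now_ns, int64_t rel_ns)
{
	return now_ns + rel_ns + TICK_NS;
}

/* Unadjusted remaining time: still contains the extra tick. */
int64_t remaining_raw(int64_t expires_ns, int64_t now_ns)
{
	return expires_ns - now_ns;
}

/* Adjusted remaining time: the extra tick is removed for relative timers. */
int64_t remaining_adjusted(int64_t expires_ns, int64_t now_ns, int is_rel)
{
	int64_t rem = expires_ns - now_ns;

	return is_rel ? rem - TICK_NS : rem;
}
```

    Queried immediately after arming a 50 ms relative timer, remaining_raw()
    reports 60 ms in this sketch, which is more than was programmed, while
    remaining_adjusted() reports the 50 ms that user space expects.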
    
    Reported-and-tested-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: John Stultz <john.stultz@linaro.org>
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: dhowells@redhat.com
    Link: http://lkml.kernel.org/r/20160114164159.354500742@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e5e99792b6474b38ef2dea33cbcd1dde5fd70330
Author: Mateusz Guzik <mguzik@redhat.com>
Date:   Wed Jan 20 15:01:02 2016 -0800

    prctl: take mmap sem for writing to protect against others
    
    commit ddf1d398e517e660207e2c807f76a90df543a217 upstream.
    
    An unprivileged user can trigger an oops on a kernel with
    CONFIG_CHECKPOINT_RESTORE.
    
    proc_pid_cmdline_read takes mmap_sem for reading and obtains args + env
    start/end values. These get sanity checked as follows:
            BUG_ON(arg_start > arg_end);
            BUG_ON(env_start > env_end);
    
    These can be changed by prctl_set_mm. It turns out that prctl_set_mm also
    takes the semaphore for reading, effectively rendering it useless. This
    results in:
    
      kernel BUG at fs/proc/base.c:240!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: virtio_net
      CPU: 0 PID: 925 Comm: a.out Not tainted 4.4.0-rc8-next-20160105dupa+ #71
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff880077a68000 ti: ffff8800784d0000 task.ti: ffff8800784d0000
      RIP: proc_pid_cmdline_read+0x520/0x530
      RSP: 0018:ffff8800784d3db8  EFLAGS: 00010206
      RAX: ffff880077c5b6b0 RBX: ffff8800784d3f18 RCX: 0000000000000000
      RDX: 0000000000000002 RSI: 00007f78e8857000 RDI: 0000000000000246
      RBP: ffff8800784d3e40 R08: 0000000000000008 R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000050
      R13: 00007f78e8857800 R14: ffff88006fcef000 R15: ffff880077c5b600
      FS:  00007f78e884a740(0000) GS:ffff88007b200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f78e8361770 CR3: 00000000790a5000 CR4: 00000000000006f0
      Call Trace:
        __vfs_read+0x37/0x100
        vfs_read+0x82/0x130
        SyS_read+0x58/0xd0
        entry_SYSCALL_64_fastpath+0x12/0x76
      Code: 4c 8b 7d a8 eb e9 48 8b 9d 78 ff ff ff 4c 8b 7d 90 48 8b 03 48 39 45 a8 0f 87 f0 fe ff ff e9 d1 fe ff ff 4c 8b 7d 90 eb c6 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
      RIP   proc_pid_cmdline_read+0x520/0x530
      ---[ end trace 97882617ae9c6818 ]---
    
    It turns out there are instances where the code just reads the
    aforementioned values without any locking whatsoever - namely
    environ_read and get_cmdline.
    
    Interestingly these functions look quite resilient against bogus values,
    but I don't believe this should be relied upon.
    
    The first patch gets rid of the oops bug by grabbing mmap_sem for
    writing.
    
    The second patch is optional and puts locking around the aforementioned
    consumers for safety.  Consumers of other fields don't seem to benefit
    from similar treatment and are left untouched.
    
    This patch (of 2):
    
    The code was taking the semaphore for reading, which protects neither
    against other readers nor against concurrent modifications.
    
    The problem could cause a sanity check to fail in procfs's cmdline
    reader, resulting in an oops.
    
    Note that some functions perform an unlocked read of various mm fields,
    but they seem to be fine despite possible modification.
    
    Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
    Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Jarod Wilson <jarod@redhat.com>
    Cc: Jan Stancek <jstancek@redhat.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Anshuman Khandual <anshuman.linux@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f86701c4f3cd59bac9363846e62fd26ecb2437c2
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jan 19 08:28:10 2016 +1100

    xfs: log mount failures don't wait for buffers to be released
    
    commit 85bec5460ad8e05e0a8d70fb0f6750eb719ad092 upstream.
    
    Recently I've been seeing xfs/051 fail on 1k block size filesystems.
    Trying to trace the events during the test led to the problem going
    away, indicating that it was a race condition that led to this
    ASSERT failure:
    
    XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 156
    .....
    [<ffffffff814e1257>] xfs_free_perag+0x87/0xb0
    [<ffffffff814e21b9>] xfs_mountfs+0x4d9/0x900
    [<ffffffff814e5dff>] xfs_fs_fill_super+0x3bf/0x4d0
    [<ffffffff811d8800>] mount_bdev+0x180/0x1b0
    [<ffffffff814e3ff5>] xfs_fs_mount+0x15/0x20
    [<ffffffff811d90a8>] mount_fs+0x38/0x170
    [<ffffffff811f4347>] vfs_kern_mount+0x67/0x120
    [<ffffffff811f7018>] do_mount+0x218/0xd60
    [<ffffffff811f7e5b>] SyS_mount+0x8b/0xd0
    
    When I finally caught it with tracing enabled, I saw that AG 2 had
    an elevated reference count and a buffer was responsible for it. I
    tracked down the specific buffer, and found that it was missing the
    final reference count release that would put it back on the LRU and
    hence be found by xfs_wait_buftarg() calls in the log mount failure
    handling.
    
    The last four traces for the buffer before the assert were (trimmed
    for relevance)
    
    kworker/0:1-5259   xfs_buf_iodone:        hold 2  lock 0 flags ASYNC
    kworker/0:1-5259   xfs_buf_ioerror:       hold 2  lock 0 error -5
    mount-7163	   xfs_buf_lock_done:     hold 2  lock 0 flags ASYNC
    mount-7163	   xfs_buf_unlock:        hold 2  lock 1 flags ASYNC
    
    This is an async write that is completing, so there's nobody waiting
    for it directly.  Hence we call xfs_buf_relse() once all the
    processing is complete. That does:
    
    static inline void xfs_buf_relse(xfs_buf_t *bp)
    {
    	xfs_buf_unlock(bp);
    	xfs_buf_rele(bp);
    }
    
    Now, it's clear that mount is waiting on the buffer lock, and that
    it has been released by xfs_buf_relse() and gained by mount. This is
    expected, because at this point the mount process is in
    xfs_buf_delwri_submit() waiting for all the IO it submitted to
    complete.
    
    The mount process, however, is waiting on the lock for the buffer
    because it is in xfs_buf_delwri_submit(). This waits for IO
    completion, but it doesn't wait for the buffer reference owned by
    the IO to go away. The mount process collects all the completions,
    fails the log recovery, and the higher level code then calls
    xfs_wait_buftarg() to free all the remaining buffers in the
    filesystem.
    
    The issue is that on unlocking the buffer, the scheduler has decided
    that the mount process has higher priority than the kworker
    thread that is running the IO completion, and so immediately
    switched contexts to the mount process from the semaphore unlock
    code, hence preventing the kworker thread from finishing the IO
    completion and releasing the IO reference to the buffer.
    
    Hence by the time that xfs_wait_buftarg() is run, the buffer still
    has an active reference and so isn't on the LRU list that the
    function walks to free the remaining buffers. Hence we miss that
    buffer and continue onwards to tear down the mount structures,
    at which time we find a stray reference count on the perag
    structure. On a non-debug kernel, this will be ignored and the
    structure torn down and freed. Hence when the kworker thread is then
    rescheduled and the buffer released and freed, it will access a
    freed perag structure.
    
    The problem here is that when the log mount fails, we still need to
    quiesce the log to ensure that the IO workqueues have returned to
    idle before we run xfs_wait_buftarg(). By synchronising the
    workqueues, we ensure that all IO completions are fully processed,
    not just to the point where buffers have been unlocked. This ensures
    we don't end up in the situation above.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 16f14a28f660d0a3bdb169ebd17e0181c03d4778
Author: Dave Chinner <david@fromorbit.com>
Date:   Tue Jan 19 08:21:46 2016 +1100

    Revert "xfs: clear PF_NOFREEZE for xfsaild kthread"
    
    commit 3e85286e75224fa3f08bdad20e78c8327742634e upstream.
    
    This reverts commit 24ba16bb3d499c49974669cd8429c3e4138ab102 as it
    prevents machines from suspending. This regression occurs when the
    xfsaild is idle on entry to suspend, and so there's no activity to
    wake it from its idle sleep and hence see that it is supposed to
    freeze. Hence the freezer times out waiting for it and suspend is
    cancelled.
    
    There is no obvious fix for this short of freezing the filesystem
    properly, so revert this change for now.
    
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Acked-by: Jiri Kosina <jkosina@suse.cz>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7530e6fdd9f207a6ebcf669490656def4f7cf73e
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jan 12 07:03:44 2016 +1100

    xfs: inode recovery readahead can race with inode buffer creation
    
    commit b79f4a1c68bb99152d0785ee4ea3ab4396cdacc6 upstream.
    
    When we do inode readahead in log recovery, we can do the
    readahead before we've replayed the icreate transaction that stamps
    the buffer with inode cores. The inode readahead verifier catches
    this and marks the buffer as !done to indicate that it doesn't yet
    contain valid inodes.
    
    In adding buffer error notification (i.e. setting b_error = -EIO at
    the same time as we clear the done flag) to such a readahead
    verifier failure, we can then get subsequent inode recovery failing
    with this error:
    
    XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32
    
    This occurs when readahead completion races with icreate item replay
    such as:
    
    	inode readahead
    		find buffer
    		lock buffer
    		submit RA io
    	....
    	icreate recovery
    	    xfs_trans_get_buffer
    		find buffer
    		lock buffer
    		<blocks on RA completion>
    	.....
    	<ra completion>
    		fails verifier
    		clear XBF_DONE
    		set bp->b_error = -EIO
    		release and unlock buffer
    	<icreate gains lock>
    	icreate initialises buffer
    	marks buffer as done
    	adds buffer to delayed write queue
    	releases buffer
    
    At this point, we have an initialised inode buffer that is up to
    date but has an -EIO state registered against it. When we finally
    get to recovering an inode in that buffer:
    
    	inode item recovery
    	    xfs_trans_read_buffer
    		find buffer
    		lock buffer
    		sees XBF_DONE is set, returns buffer
    	    sees bp->b_error is set
    		fail log recovery!
    
    Essentially, we need xfs_trans_get_buf_map() to clear the error status of
    the buffer when doing a lookup. This function returns uninitialised
    buffers, so the buffer returned can not be in an error state and
    none of the code that uses this function expects b_error to be set
    on return. Indeed, there is an ASSERT(!bp->b_error); in the
    transaction case in xfs_trans_get_buf_map() that would have caught
    this if log recovery used transactions....
    
    This patch firstly changes the inode readahead failure to set -EIO
    on the buffer, and secondly changes xfs_buf_get_map() to never
    return a buffer with an error state set so this first change doesn't
    cause unexpected log recovery failures.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 888959f2fd5042609f1beefe663ced4dea4dc109
Author: Darrick J. Wong <darrick.wong@oracle.com>
Date:   Mon Jan 4 16:13:21 2016 +1100

    libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct
    
    commit 96f859d52bcb1c6ea6f3388d39862bf7143e2f30 upstream.
    
    Because struct xfs_agfl is 36 bytes long and has a 64-bit integer
    inside it, gcc will quietly round the structure size up to the nearest
    64 bits -- in this case, 40 bytes.  This results in the XFS_AGFL_SIZE
    macro returning incorrect results for v5 filesystems on 64-bit
    machines (118 items instead of 119).  As a result, a 32-bit xfs_repair
    will see garbage in AGFL item 119 and complain.
    
    Therefore, tell gcc not to pad the structure so that the AGFL size
    calculation is correct.
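    
    The effect of the padding can be reproduced in userspace. This is a
    sketch only: the two struct names, the field names and agfl_slots()
    are illustrative, the layout is modelled on the 36-byte description
    above, and a 512-byte sector is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 36 bytes of payload with a 64-bit member; the compiler may pad this. */
struct agfl_padded {
	uint32_t magicnum;
	uint32_t seqno;
	uint8_t  uuid[16];
	uint64_t lsn;
	uint32_t crc;
};

/* Same fields, but the compiler is told not to add any padding. */
struct agfl_packed {
	uint32_t magicnum;
	uint32_t seqno;
	uint8_t  uuid[16];
	uint64_t lsn;
	uint32_t crc;
} __attribute__((packed));

/* Number of 32-bit free-list slots that fit behind the header in a sector. */
size_t agfl_slots(size_t sector_size, size_t header_size)
{
	return (sector_size - header_size) / sizeof(uint32_t);
}
```

    With the packed layout the header is 36 bytes and 119 slots fit in the
    assumed 512-byte sector. Where the compiler rounds the unpacked
    structure up to 40 bytes, the same calculation yields only 118.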
    
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8373f6590f6b371bff2c5f2c0581548eb0192014
Author: Miklos Szeredi <miklos@szeredi.hu>
Date:   Fri Dec 11 16:30:49 2015 +0100

    ovl: setattr: check permissions before copy-up
    
    commit cf9a6784f7c1b5ee2b9159a1246e327c331c5697 upstream.
    
    Without this, copy-up of a file can be forced, even without actually being
    allowed to do anything on the file.
    
    [Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by
    MAX_LFS_FILESIZE definition).
    
    Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7193e802960f884e0ae6c40711c4e3c70fa3b070
Author: Miklos Szeredi <miklos@szeredi.hu>
Date:   Wed Dec 9 16:11:59 2015 +0100

    ovl: root: copy attr
    
    commit ed06e069775ad9236087594a1c1667367e983fb5 upstream.
    
    We copy i_uid and i_gid of underlying inode into overlayfs inode.  Except
    for the root inode.
    
    Fix this omission.
    
    Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 367e439dbc238907963ddf2758200b55087cab04
Author: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date:   Mon Nov 16 18:44:11 2015 +0300

    ovl: check dentry positiveness in ovl_cleanup_whiteouts()
    
    commit 84889d49335627bc770b32787c1ef9ebad1da232 upstream.
    
    This patch fixes a kernel crash when removing a directory which contains
    whiteouts from lower layers.
    
    Cache of directory content passed as "list" contains entries from all
    layers, including whiteouts from lower layers. So, lookup in upper dir
    (moved into work at this stage) will return negative entry. Plus this
    cache is filled long before and we can race with external removal.
    
    Example:
     mkdir -p lower0/dir lower1/dir upper work overlay
     touch lower0/dir/a lower0/dir/b
     mknod lower1/dir/a c 0 0
     mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work
     rm -fr overlay/dir
    
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fa932190a5f332e1936ab0223c7c5d79b5596a9c
Author: Vito Caputo <vito.caputo@coreos.com>
Date:   Sat Oct 24 07:19:46 2015 -0500

    ovl: use a minimal buffer in ovl_copy_xattr
    
    commit e4ad29fa0d224d05e08b2858e65f112fd8edd4fe upstream.
    
    Rather than always allocating the high-order XATTR_SIZE_MAX buffer
    which is costly and prone to failure, only allocate what is needed and
    realloc if necessary.
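    
    The allocate-what-is-needed pattern can be sketched in plain C. This is
    an illustration rather than the overlayfs code; ensure_capacity() is a
    hypothetical helper and userspace realloc() stands in for the kernel
    allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Make sure *buf can hold at least 'need' bytes. The buffer grows only
 * when the current capacity is too small and is left untouched otherwise.
 * Returns 0 on success, -1 if the allocation fails (old buffer stays valid).
 */
int ensure_capacity(char **buf, size_t *cap, size_t need)
{
	char *tmp;

	if (need <= *cap)
		return 0;

	tmp = realloc(*buf, need);
	if (tmp == NULL)
		return -1;

	*buf = tmp;
	*cap = need;
	return 0;
}
```

    A caller would first ask for the size of each value, call the helper
    with that size, and then read the value into the buffer, so a large
    allocation only happens when a large value is actually present.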
    
    Fixes https://github.com/coreos/bugs/issues/489
    
    Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 85a7ed329aca5efd0eb70789db145a1e989c7a53
Author: Miklos Szeredi <miklos@szeredi.hu>
Date:   Tue Nov 10 17:08:41 2015 +0100

    ovl: allow zero size xattr
    
    commit 97daf8b97ad6f913a34c82515be64dc9ac08d63e upstream.
    
    When ovl_copy_xattr() encountered a zero size xattr no more xattrs were
    copied and the function returned success.  This is clearly not the desired
    behavior.
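    
    The difference between the two loop shapes can be sketched as follows.
    The function copy_all() and its stop_on_zero switch are hypothetical
    and exist only to contrast the behaviours; the sizes array stands in
    for the per-attribute sizes returned by the filesystem.

```c
#include <assert.h>

/*
 * Walk a list of attribute sizes and count how many would be copied.
 * A negative size is an error and is returned as-is.
 * stop_on_zero != 0 models the old loop, which treated a zero size like
 * an end marker; stop_on_zero == 0 models a loop that accepts empty values.
 */
int copy_all(const int *sizes, int n, int stop_on_zero)
{
	int copied = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (sizes[i] < 0)
			return sizes[i];
		if (stop_on_zero && sizes[i] == 0)
			return copied;	/* silently skips the rest */
		copied++;
	}
	return copied;
}
```

    For the sizes {5, 0, 7} the old shape copies one attribute and reports
    success, while the corrected shape copies all three.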
    
    Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit acaf84251f8d49d0e7ee41d654f987dbd441f991
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sat Dec 19 20:07:38 2015 +0000

    futex: Drop refcount if requeue_pi() acquired the rtmutex
    
    commit fb75a4282d0d9a3c7c44d940582c2d226cf3acfb upstream.
    
    If the proxy lock in the requeue loop acquires the rtmutex for a
    waiter then it also acquires a refcount on the pi_state related to the
    futex, but the waiter side does not drop the reference count.
    
    Add the missing free_pi_state() call.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Darren Hart <darren@dvhart.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Bhuvanesh_Surachari@mentor.com
    Cc: Andy Lowe <Andy_Lowe@mentor.com>
    Link: http://lkml.kernel.org/r/20151219200607.178132067@linutronix.de
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 30066dcdf98a1474d71b3ce8a05f81e14a61a3bd
Author: Toshi Kani <toshi.kani@hpe.com>
Date:   Wed Feb 17 13:11:29 2016 -0800

    devm_memremap_release(): fix memremap'd addr handling
    
    commit 9273a8bbf58a15051e53a777389a502420ddc60e upstream.
    
    The pmem driver calls devm_memremap() to map a persistent memory range.
    When the pmem driver is unloaded, this memremap'd range is not released
    so the kernel will leak a vma.
    
    Fix devm_memremap_release() to handle a given memremap'd address
    properly.
    
    Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
    Acked-by: Dan Williams <dan.j.williams@intel.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Matthew Wilcox <willy@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 15db15e2f10ae12d021c9a2e9edd8a03b9238551
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Wed Feb 17 13:11:35 2016 -0800

    ipc/shm: handle removed segments gracefully in shm_mmap()
    
    commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream.
    
    remap_file_pages(2) emulation can reach a file which represents a removed
    IPC ID as long as a memory segment is mapped.  This breaks the
    expectations of the IPC subsystem.
    
    Test case (rewritten to be more human readable, originally autogenerated
    by syzkaller[1]):
    
    	#define _GNU_SOURCE
    	#include <stdlib.h>
    	#include <sys/ipc.h>
    	#include <sys/mman.h>
    	#include <sys/shm.h>
    
    	#define PAGE_SIZE 4096
    
    	int main()
    	{
    		int id;
    		void *p;
    
    		id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
    		p = shmat(id, NULL, 0);
    		shmctl(id, IPC_RMID, NULL);
    		remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);
    
    		return 0;
    	}
    
    The patch changes shm_mmap() and code around shm_lock() to propagate
    locking error back to caller of shm_mmap().
    
    [1] http://github.com/google/syzkaller
    
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fe90acff279808bf69425a1118639273a656c981
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Jan 26 12:24:25 2016 +0300

    intel_scu_ipcutil: underflow in scu_reg_access()
    
    commit b1d353ad3d5835b16724653b33c05124e1b5acf1 upstream.
    
    "count" is controlled by the user and it can be negative.  Let's prevent
    that by making it unsigned.  You have to have CAP_SYS_RAWIO to call this
    function so the bug is not as serious as it could be.
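    
    How a negative count slips past an upper-bound check can be sketched
    in a few lines. MAX_REGS and both function names are hypothetical and
    do not reproduce the driver's actual validation.

```c
#include <assert.h>

/* Hypothetical upper bound on the number of registers per request. */
#define MAX_REGS 4

/* Signed count: a negative value is "not greater than" the limit. */
int check_signed(int count)
{
	return count > MAX_REGS ? -1 : 0;
}

/* Unsigned count: the same bit pattern is a huge value and is rejected. */
int check_unsigned(unsigned int count)
{
	return count > MAX_REGS ? -1 : 0;
}
```

    Here check_signed(-1) accepts the request, whereas check_unsigned()
    sees the same input as a very large number and refuses it.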
    
    Fixes: 5369c02d951a ('intel_scu_ipc: Utility driver for intel scu ipc')
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Darren Hart <dvhart@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit edfde263bd8a70a0b93b11996fd2a77526ff3b80
Author: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Date:   Thu Feb 11 16:13:09 2016 -0800

    mm,thp: khugepaged: call pte flush at the time of collapse
    
    commit 6a6ac72fd6ea32594b316513e1826c3f6db4cc93 upstream.
    
    This showed up on ARC when running LMBench bw_mem tests as Overlapping
    TLB Machine Check Exception triggered due to STLB entry (2M pages)
    overlapping some NTLB entry (regular 8K page).
    
    bw_mem 2m touches a large chunk of vaddr creating NTLB entries.  In the
    interim khugepaged kicks in, collapsing the contiguous ptes into a
    single pmd.  pmdp_collapse_flush()->flush_pmd_tlb_range() is called to
    flush out NTLB entries for the ptes.  This for ARC (by design) can only
    shootdown STLB entries (for pmd).  The stray NTLB entries cause the
    overlap with the subsequent STLB entry for collapsed page.  So make
    pmdp_collapse_flush() call pte flush interface not pmd flush.
    
    Note that originally all thp flush call sites in generic code called
    flush_tlb_range() leaving it to architecture to implement the flush for
    pte and/or pmd.  Commit 12ebc1581ad11454 changed this by calling a new
    opt-in API flush_pmd_tlb_range() which made the semantics more explicit
    but failed to distinguish the pte vs pmd flush in generic code, which is
    what this patch fixes.
    
    Note that ARC can be fixed w/o touching the generic pmdp_collapse_flush()
    by defining an ARC version, but that defeats the purpose of the generic
    version, plus semantically this is the right thing to do.
    
    Fixes STAR 9000961194: LMBench on AXS103 triggering duplicate TLB
    exceptions with super pages
    
    Fixes: 12ebc1581ad11454 ("mm,thp: introduce flush_pmd_tlb_range")
    Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e31e4672559674b0950885465a42968a86865d42
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Feb 5 15:36:16 2016 -0800

    dump_stack: avoid potential deadlocks
    
    commit d7ce36924344ace0dbdc855b1206cacc46b36d45 upstream.
    
    Some servers experienced fatal deadlocks because of a combination of
    bugs, leading to multiple cpus calling dump_stack().
    
    The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
    skb->csum when CE mark is propagated").
    
    The second problem is faulty locking in dump_stack():
    
    CPU1 runs in process context and calls dump_stack(), grabs dump_lock.
    
       CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
       calls dump_stack() from netdev_rx_csum_fault().
    
       dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
       dump_lock is owned by CPU1
    
    While dumping its stack, CPU1 is interrupted by a softirq, and happens
    to process a packet for the TCP socket locked by CPU2.
    
    CPU1 spins forever in spin_lock() : deadlock
    
    Stack trace on CPU1 looked like :
    
        NMI backtrace for cpu 1
        RIP: _raw_spin_lock+0x25/0x30
        ...
        Call Trace:
          <IRQ>
          tcp_v6_rcv+0x243/0x620
          ip6_input_finish+0x11f/0x330
          ip6_input+0x38/0x40
          ip6_rcv_finish+0x3c/0x90
          ipv6_rcv+0x2a9/0x500
          process_backlog+0x461/0xaa0
          net_rx_action+0x147/0x430
          __do_softirq+0x167/0x2d0
          call_softirq+0x1c/0x30
          do_softirq+0x3f/0x80
          irq_exit+0x6e/0xc0
          smp_call_function_single_interrupt+0x35/0x40
          call_function_single_interrupt+0x6a/0x70
          <EOI>
          printk+0x4d/0x4f
          printk_address+0x31/0x33
          print_trace_address+0x33/0x3c
          print_context_stack+0x7f/0x119
          dump_trace+0x26b/0x28e
          show_trace_log_lvl+0x4f/0x5c
          show_stack_log_lvl+0x104/0x113
          show_stack+0x42/0x44
          dump_stack+0x46/0x58
          netdev_rx_csum_fault+0x38/0x3c
          __skb_checksum_complete_head+0x6e/0x80
          __skb_checksum_complete+0x11/0x20
          tcp_rcv_established+0x2bd5/0x2fd0
          tcp_v6_do_rcv+0x13c/0x620
          sk_backlog_rcv+0x15/0x30
          release_sock+0xd2/0x150
          tcp_recvmsg+0x1c1/0xfc0
          inet_recvmsg+0x7d/0x90
          sock_recvmsg+0xaf/0xe0
          ___sys_recvmsg+0x111/0x3b0
          SyS_recvmsg+0x5c/0xb0
          system_call_fastpath+0x16/0x1b
    
    Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Alex Thorlton <athorlton@sgi.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 55e0d9869f1d3a6bbd5d1e864c0e866fe1247f97
Author: Konstantin Khlebnikov <koct9i@gmail.com>
Date:   Fri Feb 5 15:37:01 2016 -0800

    radix-tree: fix oops after radix_tree_iter_retry
    
    commit 732042821cfa106b3c20b9780e4c60fee9d68900 upstream.
    
    Helper radix_tree_iter_retry() resets next_index to the current index.
    In the following radix_tree_next_slot() the current chunk size becomes
    zero.  This isn't checked, and it tries to dereference a null pointer in
    slot.
    
    Tagged iterator is fine because retry happens only at slot 0 where tag
    bitmask in iter->tags is filled with single bit.
    
    Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
    Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Matthew Wilcox <willy@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ohad Ben-Cohen <ohad@wizery.com>
    Cc: Jeremiah Mahler <jmmahler@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 077b6173a8c89d666a06203f42267ff8b6d02d73
Author: Matthew Wilcox <willy@linux.intel.com>
Date:   Tue Feb 2 16:57:55 2016 -0800

    drivers/hwspinlock: fix race between radix tree insertion and lookup
    
    commit c6400ba7e13a41539342f1b6e1f9e78419cb0148 upstream.
    
    of_hwspin_lock_get_id() is protected by the RCU lock, which means that
    insertions can occur simultaneously with the lookup.  If the radix tree
    transitions from a height of 0, we can see a slot with the indirect_ptr
    bit set, which will cause us to at least read random memory, and could
    cause other havoc.
    
    Fix this by using the newly introduced radix_tree_iter_retry().
    
    Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ohad Ben-Cohen <ohad@wizery.com>
    Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f4595e0081495b677a98c780e9ec1ab68ce89488
Author: Matthew Wilcox <willy@linux.intel.com>
Date:   Tue Feb 2 16:57:52 2016 -0800

    radix-tree: fix race in gang lookup
    
    commit 46437f9a554fbe3e110580ca08ab703b59f2f95a upstream.
    
    If the indirect_ptr bit is set on a slot, that indicates we need to redo
    the lookup.  Introduce a new function radix_tree_iter_retry() which
    forces the loop to retry the lookup by setting 'slot' to NULL and
    turning the iterator back to point at the problematic entry.
    
    This is a pretty rare problem to hit at the moment; the lookup has to
    race with a grow of the radix tree from a height of 0.  The consequences
    of hitting this race are that gang lookup could return a pointer to a
    radix_tree_node instead of a pointer to whatever the user had inserted
    in the tree.
    
    Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
    Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ohad Ben-Cohen <ohad@wizery.com>
    Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 262139f0244b3ade149b785dc820118c4265cdb1
Author: Rich Felker <dalias@libc.org>
Date:   Fri Jan 22 15:11:05 2016 -0800

    MAINTAINERS: return arch/sh to maintained state, with new maintainers
    
    commit 114bf37e04d839b555b3dc460b5e6ce156f49cf0 upstream.
    
    Add Yoshinori Sato and Rich Felker as maintainers for arch/sh
    (SUPERH).
    
    Signed-off-by: Rich Felker <dalias@libc.org>
    Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
    Acked-by: D. Jeff Dionne <jeff@uClinux.org>
    Acked-by: Rob Landley <rob@landley.net>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Simon Horman <horms+renesas@verge.net.au>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ececa3ebe27f61db28161bc8c6ef2e1612dbafbe
Author: Martijn Coenen <maco@google.com>
Date:   Fri Jan 15 16:57:49 2016 -0800

    memcg: only free spare array when readers are done
    
    commit 6611d8d76132f86faa501de9451a89bf23fb2371 upstream.
    
    A spare array holding mem cgroup threshold events is kept around to make
    sure we can always safely deregister an event and have an array to store
    the new set of events in.
    
    In the scenario where we're going from 1 to 0 registered events, the
    pointer to the primary array containing 1 event is copied to the spare
    slot, and then the spare slot is freed because no events are left.
    However, it is freed before calling synchronize_rcu(), which means
    readers may still be accessing threshold->primary after it is freed.
    
    Fixed by only freeing after synchronize_rcu().
    
    Signed-off-by: Martijn Coenen <maco@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4b20545910cbc96cede009adb89e19f7d2aa8f52
Author: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Date:   Tue Feb 2 16:57:26 2016 -0800

    numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390
    
    commit 5c2ff95e41c9290d16556cd02e35b25d81be8fe0 upstream.
    
    When working with hugetlbfs ptes (which are actually pmds) it is not
    valid to directly use pte functions like pte_present() because the
    hardware bit layout of pmds and ptes can be different.  This is the case
    on s390.  Therefore we have to convert the hugetlbfs ptes first into a
    valid pte encoding with huge_ptep_get().
    
    Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
    huge_ptep_get().  On s390 this leads to the following two problems:
    
    1) The pte_present() function returns false (instead of true) for
       PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
       completely in the "numa_maps" output.
    
    2) The pte_dirty() function always returns false for all hugetlb ptes.
       Therefore these pages are reported as "mapped=xxx" instead of
       "dirty=xxx".
    
    Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.
    
    Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
    Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit db33368ca32dd307cdcc191361de34f3937f513a
Author: Mike Kravetz <mike.kravetz@oracle.com>
Date:   Fri Jan 15 16:57:37 2016 -0800

    fs/hugetlbfs/inode.c: fix bugs in hugetlb_vmtruncate_list()
    
    commit 9aacdd354d197ad64685941b36d28ea20ab88757 upstream.
    
    Hillf Danton noticed bugs in the hugetlb_vmtruncate_list routine.  The
    argument end is of type pgoff_t.  It was being converted to a vaddr
    offset and passed to unmap_hugepage_range.  However, end was also being
    used as an argument to the vma_interval_tree_foreach controlling loop.
    In addition, the conversion of end to vaddr offset was incorrect.
    
    hugetlb_vmtruncate_list is called as part of a file truncate or
    fallocate hole punch operation.
    
    When truncating a hugetlbfs file, this bug could prevent some pages from
    being unmapped.  This is possible if there are multiple vmas mapping the
    file, and there is a sufficiently sized hole between the mappings.  The
    size of the hole between two vmas (A,B) must be such that the starting
    virtual address of B is greater than (ending virtual address of A <<
    PAGE_SHIFT).  In this case, the pages in B would not be unmapped.  If
    pages are not properly unmapped during truncate, the following BUG is
    hit:
    
    	kernel BUG at fs/hugetlbfs/inode.c:428!
    
    In the fallocate hole punch case, this bug could prevent pages from
    being unmapped as in the truncate case.  However, for hole punch the
    result is that unmapped pages will not be removed during the operation.
    For hole punch, it is also possible that more pages than desired will be
    unmapped.  This unnecessary unmapping will cause page faults to
    reestablish the mappings on subsequent page access.
    
    Fixes: 1bfad99ab ("hugetlbfs: hugetlb_vmtruncate_list() needs to take a range")
    Reported-by: Hillf Danton <hillf.zj@alibaba-inc.com>
    Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b105aa33af0d487fc349212edecac04ef35a2622
Author: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Date:   Thu Jan 14 15:16:53 2016 -0800

    scripts/bloat-o-meter: fix python3 syntax error
    
    commit 72214a24a7677d4c7501eecc9517ed681b5f2db2 upstream.
    
    In Python3+ print is a function so the old syntax is not correct
    anymore:
    
      $ ./scripts/bloat-o-meter vmlinux.o vmlinux.o.old
        File "./scripts/bloat-o-meter", line 61
          print "add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
                                                                         ^
      SyntaxError: invalid syntax
    
    Fix by calling print as a function.
    
    Tested on python 2.7.11, 3.5.1
    
    Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dad5038f3fe2ac41078eed8abba9b7f5ce2fa05c
Author: Laura Abbott <labbott@fedoraproject.org>
Date:   Thu Jan 14 15:16:50 2016 -0800

    dma-debug: switch check from _text to _stext
    
    commit ea535e418c01837d07b6c94e817540f50bfdadb0 upstream.
    
    In include/asm-generic/sections.h:
    
      /*
       * Usage guidelines:
       * _text, _data: architecture specific, don't use them in
       * arch-independent code
       * [_stext, _etext]: contains .text.* sections, may also contain
       * .rodata.*
       *                   and/or .init.* sections
    
    _text is not guaranteed across architectures.  Architectures such as ARM
    may reuse parts which are not actually text and erroneously trigger a bug.
    Switch to using _stext which is guaranteed to contain text sections.
    
    Came out of https://lkml.kernel.org/g/<567B1176.4000106@redhat.com>
    
    Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 275adaf191c617409452bc569ad8f423e74678c7
Author: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Date:   Thu Jan 14 15:16:47 2016 -0800

    m32r: fix m32104ut_defconfig build fail
    
    commit 601f1db653217f205ffa5fb33514b4e1711e56d1 upstream.
    
    The build of m32104ut_defconfig for the m32r arch has been failing for
    a long time with the errors:
    
      ERROR: "memory_start" [fs/udf/udf.ko] undefined!
      ERROR: "memory_end" [fs/udf/udf.ko] undefined!
      ERROR: "memory_end" [drivers/scsi/sg.ko] undefined!
      ERROR: "memory_start" [drivers/scsi/sg.ko] undefined!
      ERROR: "memory_end" [drivers/i2c/i2c-dev.ko] undefined!
      ERROR: "memory_start" [drivers/i2c/i2c-dev.ko] undefined!
    
    As done in other architectures, export the symbols to fix the errors.
    
    Reported-by: Fengguang Wu <fengguang.wu@intel.com>
    Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 71e5a4a747b0eadbff4835cf41493187bcbbd886
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date:   Tue Jan 26 17:50:12 2016 +0200

    xhci: Fix list corruption in urb dequeue at host removal
    
    commit 5c82171167adb8e4ac77b91a42cd49fb211a81a0 upstream.
    
    The xhci driver frees data for all devices, both usb2 and usb3, the
    first time usb_remove_hcd() is called, including td_list and xhci_ring
    structures.
    
    When usb_remove_hcd() is called a second time for the second xhci bus it
    will try to dequeue all pending urbs, and touches td_list which is already
    freed for that endpoint.
    
    Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
    Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
    Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d15298509b86f06d63135770ac8433295a18375f
Author: Mathias Nyman <mathias.nyman@linux.intel.com>
Date:   Tue Jan 26 17:50:04 2016 +0200

    Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
    
    commit a6835090716a85f2297668ba593bd00e1051e662 upstream.
    
    This reverts commit e210c422b6fd ("xhci: don't finish a TD if we get a
    short transfer event mid TD")
    
    Turns out that most host controllers do not follow the xHCI specs and never
    send the second event for the last TRB in the TD if there was a short event
    mid-TD.
    
    Returning the URB directly after the first short-transfer event is far
    better than never returning the URB (class drivers usually time out
    after 30 sec). For the hosts that do send the second event we will go
    back to treating it as a misplaced event and print an error message for it.
    
    The original patch was sent to stable kernels and needs to be reverted
    from there as well.
    
    Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2231e5748746cd57df389521397e1c7f91882077
Author: David Woodhouse <David.Woodhouse@intel.com>
Date:   Mon Feb 15 12:42:38 2016 +0000

    iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
    
    commit 46924008273ed03bd11dbb32136e3da4cfe056e1 upstream.
    
    According to the VT-d specification we need to clear the PPR bit in
    the Page Request Status register when handling page requests, or the
    hardware won't generate any more interrupts.
    
    This wasn't actually necessary on SKL/KBL (which may well be the
    subject of a hardware erratum, although it's harmless enough). But
    other implementations do appear to get it right, and we only ever get
    one interrupt unless we clear the PPR bit.
    
    Reported-by: CQ Tang <cq.tang@intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit db3ac35cbd310b3ce2e668f82eb6d0c77cc10b15
Author: CQ Tang <cq.tang@intel.com>
Date:   Wed Jan 13 21:15:03 2016 +0000

    iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
    
    commit fda3bec12d0979aae3f02ee645913d66fbc8a26e upstream.
    
    This is a 32-bit register. Apparently harmless on real hardware, but
    causing justified warnings in simulation.
    
    Signed-off-by: CQ Tang <cq.tang@intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7c6471cb94adcb2d7027d6981e87e87ccd05b06e
Author: David Woodhouse <David.Woodhouse@intel.com>
Date:   Tue Jan 12 19:18:06 2016 +0000

    iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
    
    commit e57e58bd390a6843db58560bf7b8341665d2e058 upstream.
    
    Holding mm_users works OK for graphics, which was the first user of SVM
    with VT-d. However, it works less well for other devices, where we actually
    do a mmap() from the file descriptor to which the SVM PASID state is tied.
    
    In this case on process exit we end up with a recursive reference count:
     - The MM remains alive until the file is closed and the driver's release()
       call ends up unbinding the PASID.
     - The VMA corresponding to the mmap() remains intact until the MM is
       destroyed.
     - Thus the file isn't closed, even when exit_files() runs, because the
       VMA is still holding a reference to it. And the MM remains alive…
    
    To address this issue, we *stop* holding mm_users while the PASID is bound.
    We already hold mm_count by virtue of the MMU notifier, and that can be
    made to be sufficient.
    
    It means that for a period during process exit, the fun part of mmput()
    has happened and exit_mmap() has been called so the MM is basically
    defunct. But the PGD still exists and the PASID is still bound to it.
    
    During this period, we have to be very careful — exit_mmap() doesn't use
    mm->mmap_sem because it doesn't expect anyone else to be touching the MM
    (quite reasonably, since mm_users is zero). So we also need to fix the
    fault handler to just report failure if mm_users is already zero, and to
    temporarily bump mm_users while handling any faults.
    
    Additionally, exit_mmap() calls mmu_notifier_release() *before* it tears
    down the page tables, which is too early for us to flush the IOTLB for
    this PASID. And __mmu_notifier_release() removes every notifier from the
    list, so when exit_mmap() finally *does* tear down the mappings and
    clear the page tables, we don't get notified. So we work around this by
    clearing the PASID table entry in our MMU notifier release() callback.
    That way, the hardware *can't* get any pages back from the page tables
    before they get cleared.
    
    Hardware designers have confirmed that the resulting 'PASID not present'
    faults should be handled just as gracefully as 'page not present' faults,
    the important criterion being that they don't perturb the operation for
    any *other* PASID in the system.
    
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d63a009a9bd9f0503a0d817b6ae97f5b06de22e3
Author: Baoquan He <bhe@redhat.com>
Date:   Wed Jan 20 22:01:19 2016 +0800

    iommu/amd: Correct the wrong setting of alias DTE in do_attach
    
    commit 9b1a12d29109234d2b9718d04d4d404b7da4e794 upstream.
    
    The commit below sets the alias DTE when the DTE of its peripheral
    is set. However, a bug in that code sets the alias DTE
    incorrectly. Correct it in this patch.
    
    commit e25bfb56ea7f046b71414e02f80f620deb5c6362
    Author: Joerg Roedel <jroedel@suse.de>
    Date:   Tue Oct 20 17:33:38 2015 +0200
    
        iommu/amd: Set alias DTE in do_attach/do_detach
    
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Tested-by: Mark Hounschell <markh@compro.net>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c65a7b684133887e9211cd901bb689c48a6c18d8
Author: Jeremy McNicoll <jmcnicol@redhat.com>
Date:   Thu Jan 14 21:33:06 2016 -0800

    iommu/vt-d: Don't skip PCI devices when disabling IOTLB
    
    commit da972fb13bc5a1baad450c11f9182e4cd0a091f6 upstream.
    
    Fix a simple typo when disabling IOTLB on PCI(e) devices.
    
    Fixes: b16d0cb9e2fc ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
    Signed-off-by: Jeremy McNicoll <jmcnicol@redhat.com>
    Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b864f4e50c56a5bd0ab1603b86164cab66337906
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Sat Jan 16 10:04:49 2016 -0800

    Input: vmmouse - fix absolute device registration
    
    commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862 upstream.
    
    We should set device's capabilities first, and then register it,
    otherwise various handlers already present in the kernel will not be
    able to connect to the device.
    
    Reported-by: Lauri Kasanen <cand@gmx.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 726ecfc321994ec6ab044c1e3e5886408de991ac
Author: James Bottomley <JBottomley@Odin.com>
Date:   Wed Jan 20 14:58:29 2016 -0800

    string_helpers: fix precision loss for some inputs
    
    commit 564b026fbd0d28e9f70fb3831293d2922bb7855b upstream.
    
    It was noticed that we lose precision in the final calculation for some
    inputs.  The most egregious example is size=3000 blk_size=1900, which in
    units of 10 should yield 5.70 MB but in fact yields 3.00 MB (oops).
    
    This is because the current algorithm doesn't correctly account for
    all the remainders in the logarithms.  Fix this by doing a correct
    calculation in the remainders based on Napier's algorithm.
    
    Additionally, now we have the correct result, we have to account for
    arithmetic rounding because we're printing 3 digits of precision.  This
    means that if the fourth digit is five or greater, we have to round up,
    so add a section to ensure correct rounding.  Finally account for all
    possible inputs correctly, including zero for block size.
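    
    For reference, the expected value in the example above can be checked
    with plain 64-bit arithmetic.  This is only a sketch: mb_hundredths()
    is a hypothetical helper, it always reports decimal megabytes, and it
    assumes the product fits in 64 bits.  It is not the algorithm used by
    the string helper itself:
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    
    /*
     * Hypothetical helper: return size * blk_size expressed in hundredths
     * of a decimal megabyte (1 MB = 1000000 bytes), rounding half up.
     * Assumes the product does not overflow 64 bits.
     */
    static uint64_t mb_hundredths(uint64_t size, uint64_t blk_size)
    {
        uint64_t bytes = size * blk_size;
    
        /* One hundredth of a MB is 10000 bytes; add half of that to round. */
        return (bytes + 5000) / 10000;
    }
    
    int main(void)
    {
        uint64_t h = mb_hundredths(3000, 1900);
    
        printf("%llu.%02llu MB\n",
               (unsigned long long)(h / 100),
               (unsigned long long)(h % 100));
        return 0;
    }
    ```
    
    With size=3000 and blk_size=1900 the product is 5700000 bytes, so this
    prints 5.70 MB, matching the value the message says should be reported.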
    
    Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
    Signed-off-by: James Bottomley <JBottomley@Odin.com>
    Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5c73252f746d4973294b7ae7eead944f4a5210f8
Author: Aurélien Francillon <aurelien@francillon.net>
Date:   Sat Jan 2 20:39:54 2016 -0800

    Input: i8042 - add Fujitsu Lifebook U745 to the nomux list
    
    commit dd0d0d4de582a6a61c032332c91f4f4cb2bab569 upstream.
    
    Without i8042.nomux=1 the Elantech touch pad is not working at all on
    a Fujitsu Lifebook U745. This patch does not seem necessary for all
    U745 (maybe because of different BIOS versions?). However, it was
    verified that the patch does not break those (see opensuse bug 883192:
    https://bugzilla.opensuse.org/show_bug.cgi?id=883192).
    
    Signed-off-by: Aurélien Francillon <aurelien@francillon.net>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1d70d30a5fa2139325e413aaf4ce44675d3133ce
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Mon Jan 11 17:35:38 2016 -0800

    Input: elantech - mark protocols v2 and v3 as semi-mt
    
    commit 6544a1df11c48c8413071aac3316792e4678fbfb upstream.
    
    When using protocol v2 or v3 hardware, elantech uses the function
    elantech_report_semi_mt_data() to report data. These devices are rather
    creepy because if num_finger is 3, (x2,y2) is (0,0). Yes, only one valid
    touch is reported.
    
    Anyway, userspace (libinput) is now confused by these (0,0) touches,
    detects them as palms, and rejects them.
    
    Commit 3c0213d17a09 ("Input: elantech - fix semi-mt protocol for v3 HW")
    was sufficient for xf86-input-synaptics and for libinput before it had
    palm rejection. Now we need to actually tell libinput that this device is
    a semi-mt one and it should not rely on the actual values of the 2 touches.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d1f8217a9a6e9b84df4cf6a0d3c99c7bbe1430a7
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Wed Feb 17 13:11:15 2016 -0800

    mm: fix regression in remap_file_pages() emulation
    
    commit 48f7df329474b49d83d0dffec1b6186647f11976 upstream.
    
    Grazvydas Ignotas has reported a regression in remap_file_pages()
    emulation.
    
    Testcase:
    	#define _GNU_SOURCE
    	#include <assert.h>
    	#include <stdlib.h>
    	#include <stdio.h>
    	#include <sys/mman.h>
    
    	#define SIZE    (4096 * 3)
    
    	int main(int argc, char **argv)
    	{
    		unsigned long *p;
    		long i;
    
    		p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
    				MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    		if (p == MAP_FAILED) {
    			perror("mmap");
    			return -1;
    		}
    
    		for (i = 0; i < SIZE / 4096; i++)
    			p[i * 4096 / sizeof(*p)] = i;
    
    		if (remap_file_pages(p, 4096, 0, 1, 0)) {
    			perror("remap_file_pages");
    			return -1;
    		}
    
    		if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) {
    			perror("remap_file_pages");
    			return -1;
    		}
    
    		assert(p[0] == 1);
    
    		munmap(p, SIZE);
    
    		return 0;
    	}
    
    The second remap_file_pages() fails with -EINVAL.
    
    The reason is that remap_file_pages() emulation assumes that the target
    vma covers the whole area we want to over map.  That assumption is broken
    by the first remap_file_pages() call: it splits the area into two vmas.
    
    The solution is to check the next adjacent vmas, to see whether they map
    the same file with the same flags.
    
    Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Grazvydas Ignotas <notasas@gmail.com>
    Tested-by: Grazvydas Ignotas <notasas@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 413aab16bc7b9b58fb1a52010eb3af3d227d7007
Author: Konstantin Khlebnikov <koct9i@gmail.com>
Date:   Fri Feb 5 15:36:50 2016 -0800

    mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
    
    commit 12352d3cae2cebe18805a91fab34b534d7444231 upstream.
    
    The sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
    anon_vma appeared between lock and unlock.  We have to check anon_vma
    first or call anon_vma_prepare() to be sure that it's here.  There are
    only a few users of these legacy helpers.  Let's get rid of them.
    
    This patch fixes anon_vma lock imbalance in validate_mm().  Write lock
    isn't required here, read lock is enough.
    
    It also reorders expand_downwards/expand_upwards: security_mmap_addr()
    and the wrap-around check don't have to be under the anon vma lock.
    
    Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
    Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 918a2c388ed7828fa4b6ccb5cacd9e422f30938c
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Thu Jan 21 16:40:27 2016 -0800

    mm: fix mlock accouting
    
    commit 7162a1e87b3e380133dadc7909081bb70d0a7041 upstream.
    
    Tetsuo Handa reported underflow of NR_MLOCK on munlock.
    
    Testcase:
    
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/mman.h>
    
        #define BASE ((void *)0x400000000000)
        #define SIZE (1UL << 21)
    
        int main(int argc, char *argv[])
        {
            void *addr;
    
            system("grep Mlocked /proc/meminfo");
            addr = mmap(BASE, SIZE, PROT_READ | PROT_WRITE,
                    MAP_ANONYMOUS | MAP_PRIVATE | MAP_LOCKED | MAP_FIXED,
                    -1, 0);
            if (addr == MAP_FAILED)
                printf("mmap() failed\n"), exit(1);
            munmap(addr, SIZE);
            system("grep Mlocked /proc/meminfo");
            return 0;
        }
    
    It happens in munlock_vma_page() due to an unfortunate choice of
    nr_pages data type:
    
        __mod_zone_page_state(zone, NR_MLOCK, -nr_pages);
    
    For unsigned int nr_pages, implicitly converted to long in
    __mod_zone_page_state(), it becomes something around UINT_MAX.
    
    munlock_vma_page() is usually called for THP, as small pages go through
    pagevec.
    
    Let's make nr_pages signed int.
    
    Similar fixes in 6cdb18ad98a4 ("mm/vmstat: fix overflow in
    mod_zone_page_state()") used `long' type, but `int' here is OK for a
    count of the number of sub-pages in a huge page.
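    
    The conversion problem can be reproduced in userspace.  The sketch
    below assumes a 64-bit long; delta_from_unsigned() and
    delta_from_signed() are hypothetical stand-ins for passing -nr_pages
    to a function that takes a long, not kernel code:
    
    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdio.h>
    
    /*
     * Negating an unsigned int wraps modulo UINT_MAX + 1 first; the
     * result is then widened to long as a large positive number.
     */
    static long delta_from_unsigned(unsigned int nr_pages)
    {
        return -nr_pages;
    }
    
    /* Negating a signed int gives the intended negative delta. */
    static long delta_from_signed(int nr_pages)
    {
        return -nr_pages;
    }
    
    int main(void)
    {
        /* 512 is the number of 4K sub-pages in a 2M huge page. */
        printf("unsigned nr_pages: %ld\n", delta_from_unsigned(512));
        printf("signed nr_pages:   %ld\n", delta_from_signed(512));
        return 0;
    }
    ```
    
    With a 64-bit long the unsigned variant yields 4294966784, which is
    UINT_MAX - 511, while the signed variant yields -512 as intended.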
    
    Fixes: ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages")
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Michel Lespinasse <walken@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e8ea2f2258caf0d2a876b36c15a93788a7cde93
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Jan 5 18:37:23 2016 -0800

    libnvdimm: fix namespace object confusion in is_uuid_busy()
    
    commit e07ecd76d4db7bda1e9495395b2110a3fe28845a upstream.
    
    When btt devices were re-worked to be child devices of regions this
    routine was overlooked.  It mistakenly attempts to_nd_namespace_pmem()
    or to_nd_namespace_blk() conversions on btt and pfn devices.  By luck to
    date we have happened to be hitting valid memory leading to a uuid
    miscompare, but a recent change to struct nd_namespace_common causes:
    
     BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
     IP: [<ffffffff814610dc>] memcmp+0xc/0x40
     [..]
     Call Trace:
      [<ffffffffa0028631>] is_uuid_busy+0xc1/0x2a0 [libnvdimm]
      [<ffffffffa0028570>] ? to_nd_blk_region+0x50/0x50 [libnvdimm]
      [<ffffffff8158c9c0>] device_for_each_child+0x50/0x90
    
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bd55913cf20804d2c3d4d83e79d9867963af2ff1
Author: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date:   Fri Jan 15 16:54:03 2016 -0800

    mm: soft-offline: check return value in second __get_any_page() call
    
    commit d96b339f453997f2f08c52da3f41423be48c978f upstream.
    
    I saw the following BUG_ON triggered in a testcase where a process calls
    madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
    calls migratepages command repeatedly (doing ping-pong among different
    NUMA nodes) for the first process:
    
       Soft offlining page 0x60000 at 0x700000600000
       __get_any_page: 0x60000 free buddy page
       page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
       flags: 0x1fffc0000000000()
       page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
       ------------[ cut here ]------------
       kernel BUG at /src/linux-dev/include/linux/mm.h:342!
       invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
       Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
       CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
       RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
       RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
       Call Trace:
         put_hwpoison_page+0x4e/0x80
         soft_offline_page+0x501/0x520
         SyS_madvise+0x6bc/0x6f0
         entry_SYSCALL_64_fastpath+0x12/0x6a
       Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
       RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
        RSP <ffff88007c213e00>
    
    The root cause resides in get_any_page(), which retries getting a
    refcount on the page to be soft-offlined.  This function calls
    put_hwpoison_page(), expecting that the target page is put back on the
    LRU list.  But the page can also be freed to buddy, so the second check
    needs to handle that case.
    
    Fixes: af8fae7c0886 ("mm/memory-failure.c: clean up soft_offline_page()")
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Sasha Levin <sasha.levin@oracle.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Jerome Marchand <jmarchan@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Steve Capper <steve.capper@linaro.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@suse.cz>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a6a3f3ddf6a663ab3d141f0e13f457b084a8c409
Author: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Date:   Mon Dec 7 12:25:02 2015 +0530

    perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data
    
    commit 3caeaa562733c4836e61086ec07666635006a787 upstream.
    
    While recording guest samples on the host using perf kvm record, an
    'unprocessable sample' error is reported even though the samples are
    recorded properly. While generating a report using perf kvm report, no
    samples are processed and the same error is reported. We have seen this
    behaviour with upstream perf (4.4-rc3) on x86 and ppc64 hardware.
    
    The reason behind this failure is that fetching the machine from the
    rb_tree of machines fails. While tracing the bug, we figured
    out that this code was incorrectly refactored in commit 54245fdc3576
    ("perf session: Remove wrappers to machines__find").
    
    This patch changes the behaviour so that if the machine can't be fetched
    on the first attempt, a machine node is created and added to the
    rb_tree. The next time the same machine is fetched from the rb_tree,
    the lookup won't fail. This was the behaviour before the refactoring in
    the aforementioned commit.
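    
    The find-or-create behaviour described above can be sketched as follows.
    This is a minimal illustration, not the perf code: it uses a singly
    linked list instead of an rb_tree, and the names sketch_machine,
    sketch_find() and sketch_findnew() are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a per-guest machine entry. */
struct sketch_machine {
	int pid;
	struct sketch_machine *next;
};

/* Plain lookup: returns NULL when no entry for pid exists yet. */
struct sketch_machine *sketch_find(struct sketch_machine *head, int pid)
{
	for (; head; head = head->next)
		if (head->pid == pid)
			return head;
	return NULL;
}

/* Lookup that creates and links a new entry when the first lookup misses. */
struct sketch_machine *sketch_findnew(struct sketch_machine **head, int pid)
{
	struct sketch_machine *m = sketch_find(*head, pid);

	if (m)
		return m;

	m = malloc(sizeof(*m));
	if (!m)
		return NULL;

	m->pid = pid;
	m->next = *head;
	*head = m;
	return m;
}
```

    With the plain lookup alone, the first sample for a guest has nowhere
    to go; with the find-or-create variant, later lookups for the same
    guest return the entry created on the first miss.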
    
    This patch is generated from acme perf/core branch.
    
    Below is an example that demonstrates the behaviour before and
    after applying the patch.
    
    Before applying patch:
    [Note: a guest must be running before recording data on the host]
    
      ravi@ravi-bangoria:~$ ./perf kvm record -a
      Warning:
      5903 unprocessable samples recorded.
      Do you have a KVM guest running and not using 'perf kvm'?
      [ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]
    
      ravi@ravi-bangoria:~$ ./perf kvm report --stdio
      Warning:
      5903 unprocessable samples recorded.
      Do you have a KVM guest running and not using 'perf kvm'?
      # To display the perf.data header info, please use --header/--header-only options.
      #
      # Total Lost Samples: 0
      #
      # Samples: 285  of event 'cycles'
      # Event count (approx.): 88715406
      #
      # Overhead  Command  Shared Object  Symbol
      # ........  .......  .............  ......
      #
    
      # (For a higher level overview, try: perf report --sort comm,dso)
      #
    
    After applying patch:
    
      ravi@ravi-bangoria:~$ ./perf kvm record -a
      [ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]
    
      ravi@ravi-bangoria:~$ ./perf kvm report --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      # Total Lost Samples: 0
      #
      # Samples: 17  of event 'cycles'
      # Event count (approx.): 700746
      #
      # Overhead  Command  Shared Object     Symbol
      # ........  .......  ................  ......................
      #
          34.19%  :5758    [unknown]         [g] 0xffffffff818682ab
          22.79%  :5758    [unknown]         [g] 0xffffffff812dc7f8
          22.79%  :5758    [unknown]         [g] 0xffffffff818650d0
          14.83%  :5758    [unknown]         [g] 0xffffffff8161a1b6
           2.49%  :5758    [unknown]         [g] 0xffffffff818692bf
           0.48%  :5758    [unknown]         [g] 0xffffffff81869253
           0.05%  :5758    [unknown]         [g] 0xffffffff81869250
    
    Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Fixes: 54245fdc3576 ("perf session: Remove wrappers to machines__find")
    Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b58731d6263a323e602191257628d93c2274cefa
Author: Greg Kurz <gkurz@linux.vnet.ibm.com>
Date:   Wed Jan 13 18:28:17 2016 +0100

    KVM: PPC: Fix ONE_REG AltiVec support
    
    commit b4d7f161feb3015d6306e1d35b565c888ff70c9d upstream.
    
    The get and set operations got exchanged by mistake when moving the
    code from book3s.c to powerpc.c.
    
    Fixes: 3840edc8033ad5b86deee309c1c321ca54257452
    Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 921fa9b77380b0c3df67c424c2a68db5214ab461
Author: Thomas Huth <thuth@redhat.com>
Date:   Fri Nov 20 09:11:45 2015 +0100

    KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
    
    commit 760a7364f27d974d100118d88190e574626e18a6 upstream.
    
    In the old DABR register, the BT (Breakpoint Translation) bit
    is bit number 61. In the new DAWRX register, the WT (Watchpoint
    Translation) bit is bit number 59. So to move the DABR-BT bit
    into the position of the DAWRX-WT bit, it has to be shifted by
    two, not only by one. This fixes hardware watchpoints in gdb of
    older guests that only use the H_SET_DABR/X interface instead
    of the new H_SET_MODE interface.
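    
    The bit arithmetic can be checked with a small sketch. It assumes the
    usual IBM numbering, where bit 0 is the most significant bit of a
    64-bit register; the macro and function names are hypothetical and
    are not taken from the kernel sources.

```c
#include <stdint.h>

/* Assumption: IBM bit numbering, bit 0 is the most significant of 64. */
#define SKETCH_PPC_BIT(n)	(UINT64_C(1) << (63 - (n)))

#define SKETCH_DABR_BT		SKETCH_PPC_BIT(61)	/* mask 0x04 */
#define SKETCH_DAWRX_WT		SKETCH_PPC_BIT(59)	/* mask 0x10 */

/* Move the BT bit of a DABR value into the WT position of DAWRX. */
uint64_t sketch_dabr_bt_to_dawrx_wt(uint64_t dabr)
{
	/* 61 - 59 = 2 bit positions, so the shift is by two, not one. */
	return (dabr & SKETCH_DABR_BT) << 2;
}
```

    Under that numbering, shifting by one would land on bit 60 and leave
    the WT bit clear.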
    
    Signed-off-by: Thomas Huth <thuth@redhat.com>
    Reviewed-by: Laurent Vivier <lvivier@redhat.com>
    Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b3e336de65ebdd5ed85ba1f03216091a28583e7e
Author: Andre Przywara <andre.przywara@arm.com>
Date:   Wed Feb 3 16:56:51 2016 +0000

    KVM: arm/arm64: Fix reference to uninitialised VGIC
    
    commit b3aff6ccbb1d25e506b60ccd9c559013903f3464 upstream.
    
    Commit 4b4b4512da2a ("arm/arm64: KVM: Rework the arch timer to use
    level-triggered semantics") brought the virtual architected timer
    closer to the VGIC. There is one occasion where we don't properly
    check for the VGIC actually having been initialized before, but
    instead go on to check the active state of some IRQ number.
    If userland hasn't instantiated a virtual GIC, we end up with a
    kernel NULL pointer dereference:
    =========
    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    pgd = ffffffc9745c5000
    [00000000] *pgd=00000009f631e003, *pud=00000009f631e003, *pmd=0000000000000000
    Internal error: Oops: 96000006 [#2] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 2144 Comm: kvm_simplest-ar Tainted: G      D 4.5.0-rc2+ #1300
    Hardware name: ARM Juno development board (r1) (DT)
    task: ffffffc976da8000 ti: ffffffc976e28000 task.ti: ffffffc976e28000
    PC is at vgic_bitmap_get_irq_val+0x78/0x90
    LR is at kvm_vgic_map_is_active+0xac/0xc8
    pc : [<ffffffc0000b7e28>] lr : [<ffffffc0000b972c>] pstate: 20000145
    ....
    =========
    
    Fix this by bailing out early of kvm_timer_flush_hwstate() if we don't
    have a VGIC at all.
    
    Reported-by: Cosmin Gorgovan <cosmin@linux-geek.org>
    Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    Signed-off-by: Andre Przywara <andre.przywara@arm.com>
    Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 593337c55ac30807f38ca924e716edfdc4e6a916
Author: Marek Szyprowski <m.szyprowski@samsung.com>
Date:   Tue Feb 16 15:14:44 2016 +0100

    arm64: dma-mapping: fix handling of devices registered before arch_initcall
    
    commit 722ec35f7faefcc34d12616eca7976a848870f9d upstream.
    
    This patch ensures that devices registered before arch_initcall
    are handled correctly by the IOMMU-based DMA-mapping code.
    
    Fixes: 13b8629f6511 ("arm64: Add IOMMU dma_ops")
    Acked-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a6e01f0c81d54678cdd50892462d12b672320e5e
Author: Tony Lindgren <tony@atomide.com>
Date:   Thu Jan 14 12:20:48 2016 -0800

    ARM: OMAP2+: Fix ppa_zero_params and ppa_por_params for rodata
    
    commit 4da597d16602d14405b71a18d45e1c59f28f0fd2 upstream.
    
    We don't want to write to .text so let's move ppa_zero_params and
    ppa_por_params to .data and access them via pointers.
    
    Note that I have not been able to test this as I don't have an HS
    omap4 to test with. The code has been changed in a similar way as
    for omap3 though.
    
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Laura Abbott <labbott@redhat.com>
    Cc: Nishanth Menon <nm@ti.com>
    Cc: Richard Woodruff <r-woodruff2@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Tero Kristo <t-kristo@ti.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
    non-executable")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 82de5956e9f4102e7b29d6718abaf0d156cdbc2a
Author: Tony Lindgren <tony@atomide.com>
Date:   Thu Jan 14 12:20:47 2016 -0800

    ARM: OMAP2+: Fix save_secure_ram_context for rodata
    
    commit a5311d4d13df80bd71a9e47f9ecaf327f478fab1 upstream.
    
    We don't want to write to .text and we can move save_secure_ram_context
    into .data as it all gets copied into SRAM anyways.
    
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Laura Abbott <labbott@redhat.com>
    Cc: Nishanth Menon <nm@ti.com>
    Cc: Richard Woodruff <r-woodruff2@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
    Cc: Tero Kristo <t-kristo@ti.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
    non-executable")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 31a50ee1ad3e0cc51a75fb052115fcf5a804ab35
Author: Tony Lindgren <tony@atomide.com>
Date:   Thu Jan 14 12:20:47 2016 -0800

    ARM: OMAP2+: Fix l2dis_3630 for rodata
    
    commit eeaf9646aca89d097861caa24d9818434e48810e upstream.
    
    We don't want to write to .text section. Let's move l2dis_3630
    to .data and access it via a pointer.
    
    For calculating the offset, let's optimize out the add and do it
    in ldr/str as suggested by Nicolas Pitre <nicolas.pitre@linaro.org>.
    
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Laura Abbott <labbott@redhat.com>
    Cc: Nishanth Menon <nm@ti.com>
    Cc: Richard Woodruff <r-woodruff2@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Tero Kristo <t-kristo@ti.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
    non-executable")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 98b3f17a7235cb49755c6bd17fc428381261105c
Author: Tony Lindgren <tony@atomide.com>
Date:   Thu Jan 14 12:20:47 2016 -0800

    ARM: OMAP2+: Fix l2_inv_api_params for rodata
    
    commit 0a0b13275558c32bbf6241464a7244b1ffd5afb3 upstream.
    
    We don't want to write to .text, so let's move l2_inv_api_params
    to .data and access it via a pointer.
    
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Laura Abbott <labbott@redhat.com>
    Cc: Nishanth Menon <nm@ti.com>
    Cc: Richard Woodruff <r-woodruff2@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Tero Kristo <t-kristo@ti.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
    non-executable")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ec776d670e2d1f6034fac2f1a28ff972776bba52
Author: Tony Lindgren <tony@atomide.com>
Date:   Thu Jan 14 12:20:47 2016 -0800

    ARM: OMAP2+: Fix wait_dll_lock_timed for rodata
    
    commit d9db59103305eb5ec2a86369f32063e9921b6ac5 upstream.
    
    We don't want to be writing to .text so it can be set rodata.
    Fix error "Unable to handle kernel paging request at virtual address
    c012396c" in wait_dll_lock_timed if CONFIG_DEBUG_RODATA is selected.
    
    As these counters are for debugging only and unused, we can just
    remove them.
    
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Laura Abbott <labbott@redhat.com>
    Cc: Nishanth Menon <nm@ti.com>
    Cc: Richard Woodruff <r-woodruff2@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Tero Kristo <t-kristo@ti.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: 1e6b48116a95 ("ARM: mm: allow non-text sections to be
    non-executable")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6ec8b7c5bbdd539878fb23f6040a2c041c43247c
Author: Wenyou Yang <wenyou.yang@atmel.com>
Date:   Wed Jan 27 13:16:24 2016 +0800

    ARM: dts: at91: sama5d4ek: add phy address and IRQ for macb0
    
    commit aae6b18f5c95b9dc78de66d1e27e8afeee2763b7 upstream.
    
    On SAMA5D4EK board, the Ethernet doesn't work after resuming from the suspend
    state.
    
    Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
    [nicolas.ferre@atmel.com: adapt to newer kernel]
    Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
    Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3b18631fbcea811634fadacb5a7358170d9e3414
Author: Nicolas Ferre <nicolas.ferre@atmel.com>
Date:   Wed Jan 27 11:03:02 2016 +0100

    ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
    
    commit e873cc022ce5e2c04bbc53b5874494b657e29d3f upstream.
    
    For phy0 KSZ8081, the type of GPIO IRQ should be "level low" instead of
    "edge falling".
    
    Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
    Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 080fc28fe4759d0d8a64b8980e4e6220a590221e
Author: Mohamed Jamsheeth Hajanajubudeen <mohamedjamsheeth.hajanajubudeen@atmel.com>
Date:   Fri Dec 11 17:06:26 2015 +0530

    ARM: dts: at91: sama5d4: fix instance id of DBGU
    
    commit 929e883f2bfdf68d4bd3aec43912e956417005c7 upstream.
    
    Change instance id of DBGU to 45.
    
    Signed-off-by: Mohamed Jamsheeth Hajanajubudeen <mohamedjamsheeth.hajanajubudeen@atmel.com>
    Fixes: 7c661394c56c ("ARM: at91: dt: add device tree file for SAMA5D4 SoC")
    Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5542d00c46536d09c594c2f604f3d684c62c3f29
Author: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Date:   Fri Jan 15 09:30:18 2016 +0100

    ARM: dts: at91: sama5d4 xplained: properly mux phy interrupt
    
    commit f505dba762ae826bb68978a85ee5c8ced7dea8d7 upstream.
    
    No interrupts were received from the phy because PIOE 1 may not be properly
    muxed. It prevented proper link detection, especially since commit
    321beec5047a ("net: phy: Use interrupts when available in NOLINK state")
    disables polling.
    
    Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
    Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a482d944816941a8ad987975585ee064dc023325
Author: H. Nikolaus Schaller <hns@goldelico.com>
Date:   Tue Jan 5 13:01:37 2016 +0100

    ARM: dts: omap5-board-common: enable rtc and charging of backup battery
    
    commit c08659d431b40ad5beb97d7dde49ad9796cb812c upstream.
    
    Tested on OMAP5432 EVM.
    
    Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 41a94b382396f3b33d6426ca8d0ddab237ad8b8a
Author: Tony Lindgren <tony@atomide.com>
Date:   Mon Jan 11 14:35:24 2016 -0800

    ARM: dts: Fix omap5 PMIC control lines for RTC writes
    
    commit af756bbccff85504ce05c63a50f80b9d7823c500 upstream.
    
    The palmas PMIC has two control lines that need to be muxed properly
    for things to work. The sys_nirq pin is used for interrupts, and msecure
    pin is used for enabling writes to some PMIC registers.
    
    Without these pins configured properly things can fail in mysterious
    ways. For example, we can't update the RTC registers on palmas PMIC
    unless the msecure pin is configured. And this is probably the reason
    why we had RTC missing from the omap5 dts file.
    
    According to "OMAP5430 ES2.0 Data Manual [Public] Version A (Rev. F)"
    swps052f.pdf, mux mode 1 is for sys_drm_msecure so in theory there
    should be no need to configure it as a GPIO pin.
    
    However, it seems there are some reliability issues using the msecure
    mux mode. And the TI trees configure the msecure pin as GPIO out high
    instead.
    
    As the PMIC only cares that the msecure line is high to allow access
    to the RTC registers, let's use a GPIO hog as suggested by Nishanth
    Menon <nm@ti.com>. Also the use of the internal pull was considered
    but supposedly that may not be capable of keeping the line high in
    a noisy environment.
    
    If we ever see high security omap5 products in the mainline tree,
    those need to skip the msecure pin muxing and ignore setting the GPIO
    hog. Chances are the related pin mux registers are locked in that case
    and the msecure pin is managed by whatever software may be running in
    the ARM TrustZone.
    
    Who knows what the original intention of the msecure pin was. Maybe
    it was supposed to prevent the system time from being set back, so that
    some game demo modes would time out? Anyways, it seems that later PMICs like
    tps659037 have recycled this pin for "powerhold" and devices like
    beagle-x15 do not need changes to the msecure pin configuration.
    
    To avoid further confusion with TWL variant PMICs, beagle-x15 does
    not have a back-up battery for the palmas RTC. Instead the mcp79410 RTC
    is used with the rtc-ds1307 driver. There are "powerhold" jumper (j5)
    holes near the palmas PMIC, and shorting them seems to power up
    beagle-x15 automatically. It is unknown if this also has other side
    effects on the beagle-x15 power up sequence.
    
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 671a5bc6f54d080b9453dc70a638c41ea23838d8
Author: Adam Ford <aford173@gmail.com>
Date:   Thu Jan 21 11:03:20 2016 -0600

    ARM: dts: Fix wl12xx missing clocks that cause hangs
    
    commit 0ea24daae053a9ba65d2f3eb20523002c1a8af38 upstream.
    
    The tcxo-clock-frequency binding is listed as optional,
    but without it the wl12xx used on the torpedo + wireless
    may hang.  Scanning also appears broken without this patch.
    
    Signed-off-by: Adam Ford <aford173@gmail.com>
    Fixes: 687c27676151 ("ARM: dts: Add minimal support for LogicPD
    Torpedo DM3730 devkit")
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 323f7cd28b7fd6b1c0cd9f003aa8482c10ff9d7d
Author: Linus Walleij <linus.walleij@linaro.org>
Date:   Mon Feb 1 14:18:57 2016 +0100

    ARM: nomadik: fix up SD/MMC DT settings
    
    commit 418d5516568b3fdbc4e7b53677dd78aed8514565 upstream.
    
    The DTSI file for the Nomadik does not properly specify how the
    PL180 levelshifter is connected: the Nomadik actually needs all
    the five st,sig-dir-* flags set to properly control all lines out.
    
    Further this board supports full power cycling of the card, and
    since this variant has no hardware clock gating, it needs a
    ridiculously low frequency setting to keep up with the ever
    overflowing FIFO.
    
    The pin configuration set-up is a bit of a mystery, because of
    course these pins are a mix of inputs and outputs. However the
    reference implementation sets all pins to "output" with
    unspecified initial value, so let's do that here as well.
    
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Olof Johansson <olof@lixom.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 53d991bbbc513ac4ac67ad7c422b13841920353f
Author: Linus Walleij <linus.walleij@linaro.org>
Date:   Mon Feb 8 09:14:37 2016 +0100

    ARM: 8517/1: ICST: avoid arithmetic overflow in icst_hz()
    
    commit 5070fb14a0154f075c8b418e5bc58a620ae85a45 upstream.
    
    When trying to set the ICST 307 clock to 25174000 Hz I ran into
    this arithmetic error: icst_hz_to_vco() correctly figures out
    DIVIDE=2, RDW=100 and VDW=99 yielding a frequency of
    25174000 Hz out of the VCO. (I replicated the icst_hz() function
    in a spreadsheet to verify this.)
    
    However, when I called icst_hz() on these VCO settings it would
    instead return 4122709 Hz. This causes an error in the common
    clock driver for ICST as the common clock framework will call
    .round_rate() on the clock which will utilize icst_hz_to_vco()
    followed by icst_hz() suggesting the erroneous frequency, and
    then the clock gets set to this.
    
    The error did not manifest in the old clock framework since
    this high frequency was only used by the CLCD, which calls
    clk_set_rate() without first calling clk_round_rate() and since
    the old clock framework would not call clk_round_rate() before
    setting the frequency, the correct values propagated into
    the VCO.
    
    After some experimenting I figured out that it was due to a simple
    arithmetic overflow: with a 24 MHz reference frequency the dividend
    becomes 24000000*2*(99+8)=0x132212400 and the "1" in bit 32
    overflows and is lost.
    
    By introducing an explicit 64-by-32 bit do_div() and casting
    the dividend into (u64) we get the right frequency back, and the
    right frequency gets set.
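    
    The overflow can be reproduced outside the kernel with a small sketch.
    The product 24000000*2*(99+8) comes from the text above; the
    denominator (RDW + 2) * DIVIDE is an assumption about the ICST
    formula, and the function names are hypothetical. Plain C division
    stands in for do_div().

```c
#include <stdint.h>

/* Pre-fix style arithmetic: the whole computation stays in 32 bits. */
uint32_t sketch_icst_hz_32(uint32_t ref, uint32_t vdw, uint32_t rdw,
			   uint32_t divide)
{
	/* Wraps modulo 2^32 once ref * 2 * (vdw + 8) exceeds 0xffffffff. */
	uint32_t dividend = ref * 2u * (vdw + 8u);
	uint32_t divisor = (rdw + 2u) * divide;

	return dividend / divisor;
}

/* Post-fix style arithmetic: widen to 64 bits before multiplying. */
uint32_t sketch_icst_hz_64(uint32_t ref, uint32_t vdw, uint32_t rdw,
			   uint32_t divide)
{
	uint64_t dividend = (uint64_t)ref * 2u * (vdw + 8u);
	uint32_t divisor = (rdw + 2u) * divide;

	return (uint32_t)(dividend / divisor);
}
```

    Under that assumed denominator, the 32-bit variant called with
    ref=24000000, VDW=99, RDW=100 and DIVIDE=2 returns 4122709, the
    erroneous value reported above, while the 64-bit variant returns
    25176470.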
    
    Tested on the ARM Versatile.
    
    Cc: linux-clk@vger.kernel.org
    Cc: Pawel Moll <pawel.moll@arm.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9fe0b68c4949517720546aa3717d05b491e4ee09
Author: Linus Walleij <linus.walleij@linaro.org>
Date:   Wed Feb 10 09:25:17 2016 +0100

    ARM: 8519/1: ICST: try other dividends than 1
    
    commit e972c37459c813190461dabfeaac228e00aae259 upstream.
    
    Since the dawn of time the ICST code has only supported divide
    by one or hang in an eternal loop. Luckily we were always dividing
    by one because the reference frequency for the systems using
    the ICSTs is 24MHz and the [min,max] values for the PLL input
    are [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop
    will always terminate immediately without assigning any divisor
    for the reference frequency.
    
    But for the code to make sense, let's insert the missing i++.
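    
    A reduced sketch of such a search loop shows why the increment
    matters. This is not the ICST code: the function name, the divider
    table argument and the (min, max] range check are hypothetical
    simplifications.

```c
/*
 * Return the index of the first divider for which freq * divider lies
 * in (min, max], or -1 if none fits. Expects n >= 1.
 */
int sketch_pick_divider(unsigned long freq, const unsigned int *dividers,
			int n, unsigned long min, unsigned long max)
{
	int i = 0;

	do {
		unsigned long f = freq * dividers[i];

		if (f > min && f <= max)
			return i;
		i++;	/* without this, a miss at index 0 loops forever */
	} while (i < n);

	return -1;
}
```

    As long as the first entry always matches, the missing increment is
    never noticed, which is consistent with the behaviour described
    above.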
    
    Reported-by: David Binderman <dcb314@hotmail.com>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a68f555363f5e43d0da4a6d025e23db74e2b4f1f
Author: Mika Penttilä <mika.penttila@nextfour.com>
Date:   Tue Jan 26 15:47:25 2016 +0000

    arm64: mm: avoid calling apply_to_page_range on empty range
    
    commit 57adec866c0440976c96a4b8f5b59fb411b1cacb upstream.
    
    Calling apply_to_page_range with an empty range results in a BUG_ON
    from the core code. This can be triggered by trying to load the st_drv
    module with CONFIG_DEBUG_SET_MODULE_RONX enabled:
    
      kernel BUG at mm/memory.c:1874!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 3 PID: 1764 Comm: insmod Not tainted 4.5.0-rc1+ #2
      Hardware name: ARM Juno development board (r0) (DT)
      task: ffffffc9763b8000 ti: ffffffc975af8000 task.ti: ffffffc975af8000
      PC is at apply_to_page_range+0x2cc/0x2d0
      LR is at change_memory_common+0x80/0x108
    
    This patch fixes the issue by making change_memory_common (called by the
    set_memory_* functions) a NOP when numpages == 0, therefore avoiding the
    erroneous call to apply_to_page_range and bringing us into line with x86
    and s390.
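    
    The guard amounts to an early return, sketched below. The helper
    apply_range() is a hypothetical stand-in for apply_to_page_range()
    that merely counts calls, and the page size is an assumed constant
    for illustration.

```c
#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/* Counts calls so the effect of the guard is observable. */
int apply_range_calls;

/* Hypothetical stand-in for apply_to_page_range(); rejects empty ranges. */
static int apply_range(unsigned long start, unsigned long size)
{
	(void)start;
	apply_range_calls++;
	return size ? 0 : -1;
}

int change_memory_sketch(unsigned long addr, int numpages)
{
	/* Nothing to change: succeed without touching the page tables. */
	if (numpages == 0)
		return 0;

	return apply_range(addr, (unsigned long)numpages * SKETCH_PAGE_SIZE);
}
```

    Returning success for an empty range keeps callers simple: they do
    not need to special-case sections of zero length.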
    
    Reviewed-by: Laura Abbott <labbott@redhat.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Mika Penttilä <mika.penttila@nextfour.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 242813b9a1b6d44845fbc73fe0f6db13f2401828
Author: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Date:   Fri Dec 4 14:29:02 2015 +0100

    ARM: mvebu: remove duplicated regulator definition in Armada 388 GP
    
    commit 079ae0c121fd23287f4ad2be9e9f8a13f63cae73 upstream.
    
    The Armada 388 GP Device Tree file describes a regulator named
    'reg_usb2_1_vbus' twice, with the exact same description. This has
    been wrong since Armada 388 GP support was introduced.
    
    Fixes: 928413bd859c0 ("ARM: mvebu: Add Armada 388 General Purpose Development Board support")
    Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
    Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 602acfedc981eb887289f11f1a7565e6f48c710d
Author: Alexey Kardashevskiy <aik@ozlabs.ru>
Date:   Wed Feb 17 18:26:31 2016 +1100

    powerpc/ioda: Set "read" permission when "write" is set
    
    commit 6ecad912a0073c768db1491c27ca55ad2d0ee68f upstream.
    
    Quite often drivers set only "write" permission assuming that this
    includes "read" permission as well and this works on plenty of
    platforms. However IODA2 is strict about this and produces an EEH when
    "read" permission is not set and reading happens.
    
    This adds a workaround in the IODA code to always add the "read" bit
    when the "write" bit is set.
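    
    The workaround boils down to one conditional OR, sketched here with
    hypothetical flag values and a hypothetical function name; the real
    TCE permission encoding is not reproduced.

```c
/* Hypothetical permission bits; the real encoding may differ. */
#define SKETCH_PERM_READ	0x1u
#define SKETCH_PERM_WRITE	0x2u

/* Make "write" imply "read", leaving all other combinations untouched. */
unsigned int sketch_fixup_perm(unsigned int perm)
{
	if (perm & SKETCH_PERM_WRITE)
		perm |= SKETCH_PERM_READ;

	return perm;
}
```

    A request with no permission bits set stays empty, so mappings that
    ask for no access are not widened by the fixup.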
    
    Fixes: 10b35b2b7485 ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE")
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b5311270caba5392a83ed918e11a16e71b3b1e44
Author: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date:   Tue Feb 9 15:50:22 2016 +1100

    powerpc/powernv: Fix stale PE primary bus
    
    commit 1bc74f1ccd457832dc515fc1febe6655985fdcd2 upstream.
    
    When PCI bus is unplugged during full hotplug for EEH recovery,
    the platform PE instance (struct pnv_ioda_pe) isn't released and
    it dereferences the stale PCI bus that has been released. It leads
    to kernel crash when referring to the stale PCI bus.
    
    This fixes the issue by correcting the PE's primary bus when it's
    online at plugging time, in pnv_pci_dma_bus_setup() which is to
    be called by pcibios_fixup_bus().
    
    Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
    Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
    Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5ecdf58c1945c40b6dcaf9764c119ecc607e8a06
Author: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date:   Tue Feb 9 15:50:21 2016 +1100

    powerpc/eeh: Fix stale cached primary bus
    
    commit 05ba75f848647135f063199dc0e9f40fee769724 upstream.
    
    When a PE is created, its primary bus is cached in pe->bus. At a later
    point, the cached primary bus is returned from eeh_pe_bus_get().
    However, we could get a stale cached primary bus and run into a kernel
    crash in one case: full hotplug as part of fenced PHB error recovery
    releases all PCI busses under the PHB at unplugging time and recreates
    them at plugging time. pe->bus is still referencing the PCI bus
    that was released.
    
    This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
    of pe->bus. pe->bus is updated when its first child EEH device is
    online and the flag is set. Before unplugging in full hotplug for
    error recovery, the flag is cleared.
    
    Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
    Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
    Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
    Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 64f10cf83a6cb819d8192e679a9916519d3edf4a
Author: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date:   Wed Dec 2 16:25:32 2015 +1100

    powerpc/eeh: Fix PE location code
    
    commit 7e56f627768da4e6480986b5145dc3422bc448a5 upstream.
    
    In eeh_pe_loc_get(), the PE location code is retrieved from the
    "ibm,loc-code" property of the device node for the bridge of the
    PE's primary bus. It's not correct because the property indicates
    the parent PE's location code.
    
    This reads the correct PE location code from "ibm,io-base-loc-code"
    or "ibm,slot-location-code" property of PE parent bus's device node.
    
    Fixes: 357b2f3dd9b7 ("powerpc/eeh: Dump PE location code")
    Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Tested-by: Russell Currey <ruscur@russell.cc>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 782126b225221e5a2b30b1824b2326abba53eab9
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Wed Jan 6 08:57:06 2016 -0500

    SUNRPC: Fixup socket wait for memory
    
    commit 13331a551ab4df87f7a027d2cab392da96aba1de upstream.
    
    We're seeing hangs in the NFS client code, with loops of the form:
    
     RPC: 30317 xmit incomplete (267368 left of 524448)
     RPC: 30317 call_status (status -11)
     RPC: 30317 call_transmit (status 0)
     RPC: 30317 xprt_prepare_transmit
     RPC: 30317 xprt_transmit(524448)
     RPC:       xs_tcp_send_request(267368) = -11
     RPC: 30317 xmit incomplete (267368 left of 524448)
     RPC: 30317 call_status (status -11)
     RPC: 30317 call_transmit (status 0)
     RPC: 30317 xprt_prepare_transmit
     RPC: 30317 xprt_transmit(524448)
    
    Turns out commit ceb5d58b2170 ("net: fix sock_wake_async() rcu protection")
    moved SOCKWQ_ASYNC_NOSPACE out of sock->flags and into sk->sk_wq->flags,
    however it never tried to fix up the code in net/sunrpc.
    
    The new idiom is to use the flags in the RCU protected struct socket_wq.
    While we're at it, clear out the now redundant places where we set/clear
    SOCKWQ_ASYNC_NOSPACE and SOCK_NOSPACE. In principle, sk_stream_wait_memory()
    is supposed to set these for us, so we only need to clear them in the
    particular case of our ->write_space() callback.
    
    Fixes: ceb5d58b2170 ("net: fix sock_wake_async() rcu protection")
    Cc: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d0452554b9a114d40f7f86248c2935d0ba7ddd22
Author: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Date:   Thu Dec 24 10:25:33 2015 -0600

    udf: Check output buffer length when converting name to CS0
    
    commit bb00c898ad1ce40c4bb422a8207ae562e9aea7ae upstream.
    
    If a name contains at least some characters with Unicode values
    exceeding single byte, the CS0 output should have 2 bytes per character.
    And if other input characters have single byte Unicode values, then
    the single input byte is converted to 2 output bytes, and the length
    of output becomes larger than the length of input. And if the input
    name is long enough, the output length may exceed the allocated buffer
    length.
    
    All this means that conversion from UTF8 or NLS to CS0 requires
    checking of output length in order to stop when it exceeds the given
    output buffer size.
    
    [JK: Make code return -ENAMETOOLONG instead of silently truncating the
    name]
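    
    The length check described above can be sketched in plain C. This is a
    minimal illustration under stated assumptions, not the UDF code:
    widen_to_16bit() is a hypothetical helper that writes each 8-bit input
    character as two output bytes and returns -ENAMETOOLONG as soon as the
    next character would not fit.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    #include <stdint.h>
    
    /*
     * Hypothetical helper, not the kernel's UDF code: write every 8-bit
     * input character as a 16-bit big-endian output character, checking
     * the remaining output space before each write.
     */
    static int widen_to_16bit(const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_size)
    {
        size_t used = 0;
    
        for (size_t i = 0; i < in_len; i++) {
            /* Two output bytes per character; used never exceeds out_size. */
            if (out_size - used < 2)
                return -ENAMETOOLONG;
            out[used++] = 0x00;   /* high byte */
            out[used++] = in[i];  /* low byte */
        }
        return (int)used;
    }
    
    /* Usage: 4 characters fit into 8 bytes, 5 characters do not. */
    static void widen_demo(void)
    {
        uint8_t out[8];
    
        assert(widen_to_16bit((const uint8_t *)"abcd", 4, out, sizeof(out)) == 8);
        assert(widen_to_16bit((const uint8_t *)"abcde", 5, out, sizeof(out))
               == -ENAMETOOLONG);
    }
    ```
    
    A caller of such a helper would treat a negative return value as an
    error instead of using a truncated name.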
    
    Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit eec1445767ccbb320255ad8bfc2e64c929ee21cc
Author: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Date:   Thu Dec 24 10:25:32 2015 -0600

    udf: Prevent buffer overrun with multi-byte characters
    
    commit ad402b265ecf6fa22d04043b41444cdfcdf4f52d upstream.
    
    The udf_CS0toUTF8 function stops the conversion when the output buffer
    length reaches UDF_NAME_LEN-2, which is the correct maximum name length,
    but, when checking, it leaves space for a single byte only,
    while multi-byte output characters can take more space, causing
    a buffer overflow.
    
    A similar error exists in the udf_CS0toNLS function, which restricts
    the output length to UDF_NAME_LEN, while the actual maximum allowed
    length is UDF_NAME_LEN-2.
    
    In these cases the output can overwrite not only the current buffer
    length field, causing corruption of the name buffer itself, but also
    the following allocation structures, causing a kernel crash.
    
    Adjust the output length checks in both functions to prevent buffer
    overruns in case of multi-byte UTF8 or NLS characters.
    
    Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit aef22a3d69452aa516f6738774f7b19372034fdc
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Fri Dec 11 15:54:16 2015 +0100

    udf: limit the maximum number of indirect extents in a row
    
    commit b0918d9f476a8434b055e362b83fa4fd1d462c3f upstream.
    
    udf_next_aext() just follows extent pointers while extents are marked as
    indirect. This can loop forever on a corrupted filesystem. Limit the
    number of indirect extents we are willing to follow in a row.
    
    [JK: Updated changelog, limit, style]
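    
    The idea of bounding the walk can be sketched as follows. This is only
    an illustration: struct extent, MAX_INDIRECT_IN_A_ROW, its value and the
    -EIO return code are assumptions made for the example, not the names or
    limit used by the UDF code.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    
    /* Hypothetical limit and extent layout, chosen only for illustration. */
    #define MAX_INDIRECT_IN_A_ROW 16
    
    struct extent {
        int indirect;               /* non-zero: only points at another extent */
        const struct extent *next;  /* target of an indirect extent */
    };
    
    /*
     * Follow indirect extents until a direct one is found, giving up once
     * more than MAX_INDIRECT_IN_A_ROW have been followed in a row.
     */
    static int resolve_extent(const struct extent *e, const struct extent **out)
    {
        unsigned int followed = 0;
    
        while (e && e->indirect) {
            if (++followed > MAX_INDIRECT_IN_A_ROW)
                return -EIO;        /* hypothetical error code */
            e = e->next;
        }
        if (!e)
            return -EIO;
        *out = e;
        return 0;
    }
    
    /* Usage: a normal chain resolves, a self-referencing extent is rejected. */
    static void resolve_demo(void)
    {
        struct extent direct = { 0, NULL };
        struct extent ind = { 1, &direct };
        struct extent loop;
        const struct extent *res = NULL;
    
        loop.indirect = 1;
        loop.next = &loop;          /* corrupted: points at itself */
    
        assert(resolve_extent(&ind, &res) == 0 && res == &direct);
        assert(resolve_extent(&loop, &res) == -EIO);
    }
    ```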
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Cc: Jan Kara <jack@suse.com>
    Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 66b8812e87f3fcf9d52b107a8d3102cd3f77ae1e
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Thu Jan 21 15:39:40 2016 -0500

    pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
    
    commit 082fa37d1351a41afc491d44a1d095cb8d919aa2 upstream.
    
    We must not skip encoding the statistics, or the server will see an
    XDR encoding error.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d65eb5b3dfb1159ea69c6ce170f83a0a90548ac6
Author: Andrew Elble <aweits@rit.edu>
Date:   Wed Dec 2 09:20:57 2015 -0500

    nfs: Fix race in __update_open_stateid()
    
    commit 361cad3c89070aeb37560860ea8bfc092d545adc upstream.
    
    We've seen this in a packet capture - I've intermixed what I
    think was going on. The fix here is to grab the so_lock sooner.
    
    1964379 -> #1 open (for write) reply seqid=1
    1964393 -> #2 open (for read) reply seqid=2
    
      __nfs4_close(), state->n_wronly--
      nfs4_state_set_mode_locked(), changes state->state = [R]
      state->flags is [RW]
      state->state is [R], state->n_wronly == 0, state->n_rdonly == 1
    
    1964398 -> #3 open (for write) call -> because close is already running
    1964399 -> downgrade (to read) call seqid=2 (close of #1)
    1964402 -> #3 open (for write) reply seqid=3
    
     __update_open_stateid()
       nfs_set_open_stateid_locked(), changes state->flags
       state->flags is [RW]
       state->state is [R], state->n_wronly == 0, state->n_rdonly == 1
       new sequence number is exposed now via nfs4_stateid_copy()
    
       next step would be update_open_stateflags(), pending so_lock
    
    1964403 -> downgrade reply seqid=2, fails with OLD_STATEID (close of #1)
    
       nfs4_close_prepare() gets so_lock and recalcs flags -> send close
    
    1964405 -> downgrade (to read) call seqid=3 (close of #1 retry)
    
       __update_open_stateid() gets so_lock
     * update_open_stateflags() updates state->n_wronly.
       nfs4_state_set_mode_locked() updates state->state
    
       state->flags is [RW]
       state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1
    
     * should have suppressed the preceding nfs4_close_prepare() from
       sending open_downgrade
    
    1964406 -> write call
    1964408 -> downgrade (to read) reply seqid=4 (close of #1 retry)
    
       nfs_clear_open_stateid_locked()
       state->flags is [R]
       state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1
    
    1964409 -> write reply (fails, openmode)
    
    Signed-off-by: Andrew Elble <aweits@rit.edu>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c8841e15d6ded83aac4da7caf70a20d3c58bf6fe
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Wed Dec 30 10:57:01 2015 -0500

    pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()
    
    commit 86fb449b07b8215443a30782dca5755d5b8b0577 upstream.
    
    Jeff reports seeing an Oops in ff_layout_alloc_lseg. Turns out
    copy+paste has played cruel tricks on a nested loop.
    
    Reported-by: Jeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1873e6f486061eb93dc82a85fa8030857833d485
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Tue Dec 29 18:55:19 2015 -0500

    NFS: Fix attribute cache revalidation
    
    commit ade14a7df796d4e86bd9d181193c883a57b13db0 upstream.
    
    If a NFSv4 client uses the cache_consistency_bitmask in order to
    request only information about the change attribute, timestamps and
    size, then it has not revalidated all attributes, and hence the
    attribute timeout timestamp should not be updated.
    
    Reported-by: Donald Buczek <buczek@molgen.mpg.de>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dadfe92207503548d60a2d4c6c467702ead17148
Author: Anton Protopopov <a.s.protopopov@gmail.com>
Date:   Wed Feb 10 12:50:21 2016 -0500

    cifs: fix erroneous return value
    
    commit 4b550af519854421dfec9f7732cdddeb057134b2 upstream.
    
    The setup_ntlmv2_rsp() function may return the positive value ENOMEM
    instead of -ENOMEM in case of a kmalloc failure.
    
    Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7e30995b26ccc952e30bbc2563fc5728c50c9e12
Author: Vasily Averin <vvs@virtuozzo.com>
Date:   Thu Jan 14 13:41:14 2016 +0300

    cifs_dbg() outputs an uninitialized buffer in cifs_readdir()
    
    commit 01b9b0b28626db4a47d7f48744d70abca9914ef1 upstream.
    
    In some cases tmp_buf may not be filled in by cifs_filldir and stays
    uninitialized, so printing it with the "%s" modifier can leak the contents
    of kernel memory. If the old contents of this buffer do not contain '\0',
    the access beyond the end of the allocated object can crash the host.
    
    Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
    Signed-off-by: Steve French <sfrench@localhost.localdomain>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5d80673404e691093d66fc5b02c8bc2ac1692d77
Author: Rabin Vincent <rabin.vincent@axis.com>
Date:   Wed Dec 23 07:32:41 2015 +0100

    cifs: fix race between call_async() and reconnect()
    
    commit 820962dc700598ffe8cd21b967e30e7520c34748 upstream.
    
    cifs_call_async() queues the MID to the pending list and calls
    smb_send_rqst().  If smb_send_rqst() performs a partial send, it sets
    the tcpStatus to CifsNeedReconnect and returns an error code to
    cifs_call_async().  In this case, cifs_call_async() removes the MID
    from the list and returns to the caller.
    
    However, cifs_call_async() releases the server mutex _before_ removing
    the MID.  This means that a cifs_reconnect() can race with this function
    and manage to remove the MID from the list and delete the entry before
    cifs_call_async() calls cifs_delete_mid().  This leads to various
    crashes due to the use after free in cifs_delete_mid().
    
    Task1				Task2
    
    cifs_call_async():
     - rc = -EAGAIN
     - mutex_unlock(srv_mutex)
    
    				cifs_reconnect():
    				 - mutex_lock(srv_mutex)
    				 - mutex_unlock(srv_mutex)
    				 - list_delete(mid)
    				 - mid->callback()
    				 	cifs_writev_callback():
    				 		- mutex_lock(srv_mutex)
    						- delete(mid)
    				 		- mutex_unlock(srv_mutex)
    
     - cifs_delete_mid(mid) <---- use after free
    
    Fix this by removing the MID in cifs_call_async() before releasing the
    srv_mutex.  Also hold the srv_mutex in cifs_reconnect() until the MIDs
    are moved out of the pending list.
    
    Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
    Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
    Signed-off-by: Steve French <sfrench@localhost.localdomain>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 88413fceab844abec9e8007aaeec35c4bfc3f3fc
Author: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Date:   Sat Nov 7 22:13:49 2015 +1000

    cifs: Ratelimit kernel log messages
    
    commit ec7147a99e33a9e4abad6fc6e1b40d15df045d53 upstream.
    
    Under some conditions, CIFS can repeatedly call the cifs_dbg() logging
    wrapper. If done rapidly enough, the console framebuffer can soft lockup
    or trigger an "rcu_sched self-detected stall". Apply the built-in log ratelimiters
    to prevent such hangs.
    
    Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 224f259d9393ca342dd565321db48ec4a79a695f
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Jan 26 12:25:21 2016 +0300

    iio: inkern: fix a NULL dereference on error
    
    commit d81dac3c1c5295c61b15293074ac2bd3254e1875 upstream.
    
    In twl4030_bci_probe() there are some failure paths where we call
    iio_channel_release() with a NULL pointer.  (Apparently, that driver can
    operate without a valid channel pointer).  Let's fix it by adding a
    NULL check in iio_channel_release().
    
    Fixes: 2202e1fc5a29 ('drivers: power: twl4030_charger: fix link problems when building as module')
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e16eb4bb193c3b0faf1ca78cf05a6623de5cb633
Author: Akinobu Mita <akinobu.mita@gmail.com>
Date:   Thu Jan 21 01:07:31 2016 +0900

    iio: pressure: mpl115: fix temperature offset sign
    
    commit 431386e783a3a6c8b7707bee32d18c353b8688b2 upstream.
    
    According to the datasheet, the resolution of the temperature sensor is
    -5.35 counts/C. Temperature ADC is 472 counts at 25C.
    (https://www.sparkfun.com/datasheets/Sensors/Pressure/MPL115A1.pdf
    NOTE: This is an older revision, but this information has somehow been
    removed from the latest datasheet from NXP)
    
    Temp [C] = (Tadc - 472) / -5.35 + 25
             = (Tadc - 605.750000) * -0.186915888
    
    So the correct offset is -605.750000.
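    
    The arithmetic above can be checked with a short C sketch. The two
    functions are hypothetical helpers written for this example, not the
    driver's code; they only use the constants quoted from the datasheet
    and confirm that the direct form and the offset/scale form agree.
    
    ```c
    #include <assert.h>
    
    /* Constants taken from the changelog above. */
    #define TEMP_COUNTS_PER_C  (-5.35)
    #define TEMP_ADC_AT_25C    472.0
    
    /* Direct form: Temp [C] = (Tadc - 472) / -5.35 + 25 */
    static double temp_direct(double tadc)
    {
        return (tadc - TEMP_ADC_AT_25C) / TEMP_COUNTS_PER_C + 25.0;
    }
    
    /* Offset/scale form: Temp [C] = (Tadc + offset) * scale */
    static double temp_offset_scale(double tadc)
    {
        const double offset = -605.75;                 /* -(472 + 25 * 5.35) */
        const double scale = 1.0 / TEMP_COUNTS_PER_C;  /* about -0.186915888 */
    
        return (tadc + offset) * scale;
    }
    
    /* Both forms should give the same temperature for any ADC reading. */
    static void check_forms_agree(double tadc)
    {
        double diff = temp_direct(tadc) - temp_offset_scale(tadc);
    
        assert(diff > -1e-9 && diff < 1e-9);
    }
    ```
    
    With a positive offset of 605.75 the second form would no longer match
    the first, which is the sign error this patch addresses.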
    
    Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
    Acked-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 909e9c55196df318c7f3096826c8dc307cc2fd4b
Author: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Date:   Tue Jan 12 16:21:39 2016 +0100

    iio: light: acpi-als: Report data as processed
    
    commit fa34e6dd44d7c02c8a8468ce4a52a7506f907bef upstream.
    
    As per the ACPI specification (Revision 5.0) [1], the data coming
    from the sensor represent the ambient light illuminance reading
    expressed in lux. So use IIO_CHAN_INFO_PROCESSED to signify that
    the data are pre-processed.
    
    However, to keep backward ABI compatibility, the IIO_CHAN_INFO_RAW
    bit is not removed.
    
    [1] http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf
    
    This issue has also been responsible for at least one userspace bug
    report hence marking what is a small semantic fix really for stable.
    [2] https://github.com/hadess/iio-sensor-proxy/issues/46
    
    Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 377d1f59388fb7711b977458be7cb432f8dbbf9b
Author: Yong Li <sdliyong@gmail.com>
Date:   Wed Jan 6 09:09:43 2016 +0800

    iio: dac: mcp4725: set iio name property in sysfs
    
    commit 97a249e98a72d6b79fb7350a8dd56b147e9d5bdb upstream.
    
    Without this change, the name entity for mcp4725 is missing in
    /sys/bus/iio/devices/iio\:device*/name
    
    With this change, name is reported correctly
    
    Signed-off-by: Yong Li <sdliyong@gmail.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1c1d4f2d76293cdcef7f77f7b92863c4bf3479d9
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Sat Jan 2 14:04:39 2016 +0100

    iio: add IIO_TRIGGER dependency to STK8BA50
    
    commit 01cc5235604d61018712c11a14d74230f6a38bf4 upstream.
    
    Ran into this on UML:
    
    drivers/iio/accel/stk8ba50.c: In function ‘stk8ba50_data_rdy_trigger_set_state’:
    drivers/iio/accel/stk8ba50.c:163:9: error: implicit declaration of function ‘iio_trigger_get_drvdata’ [-Werror=implicit-function-declaration]
    
    iio_trigger_get_drvdata() is defined only when IIO_TRIGGER is selected.
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dfa6e741d472a464ee85dc2e0a44176a1496f832
Author: Vegard Nossum <vegard.nossum@oracle.com>
Date:   Sat Jan 2 14:04:40 2016 +0100

    iio: add HAS_IOMEM dependency to VF610_ADC
    
    commit 005ce0713006a76d2b0c924ce0e2629e5d8510c3 upstream.
    
    Ran into this on UML:
    
    drivers/built-in.o: In function `vf610_adc_probe':
    drivers/iio/adc/vf610_adc.c:744: undefined reference to `devm_ioremap_resource'
    
    devm_ioremap_resource() is defined only when HAS_IOMEM is selected.
    
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f865d8c326dd50649d3ee3129b93e6abb6929fae
Author: Markus Elfring <elfring@users.sourceforge.net>
Date:   Sat Dec 19 14:14:54 2015 +0100

    iio-light: Use a signed return type for ltr501_match_samp_freq()
    
    commit c08ae18560aaed50fed306a2e11f36ce70130f65 upstream.
    
    The return type "unsigned int" was used by the ltr501_match_samp_freq()
    function despite the fact that it may return a negative error code.
    Fix this by dropping the "unsigned" type modifier.
    
    This issue was detected by using the Coccinelle software.
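    
    Why the return type matters can be shown with a small standalone sketch.
    The table and both lookup functions are hypothetical and are not the
    driver's code; they only contrast an error code returned through an
    unsigned type with one returned through a signed type.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    
    /* Hypothetical table of supported values, not the driver's. */
    static const int supported[] = { 20, 10, 5 };
    #define N_SUPPORTED (sizeof(supported) / sizeof(supported[0]))
    
    /* Old shape: the error code is returned through an unsigned type. */
    static unsigned int match_unsigned(int val)
    {
        for (size_t i = 0; i < N_SUPPORTED; i++)
            if (supported[i] == val)
                return (unsigned int)i;
        return (unsigned int)-EINVAL;   /* wraps to a large positive value */
    }
    
    /* Fixed shape: a signed return type keeps the error code negative. */
    static int match_signed(int val)
    {
        for (size_t i = 0; i < N_SUPPORTED; i++)
            if (supported[i] == val)
                return (int)i;
        return -EINVAL;
    }
    
    static void match_demo(void)
    {
        assert(match_signed(10) == 1);
        assert(match_signed(7) < 0);              /* error is detectable */
        assert(match_unsigned(7) > N_SUPPORTED);  /* error looks like an index */
    }
    ```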
    
    Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
    Acked-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e9b0f0e411d00b900a72891435c8997fb8eeb32b
Author: Jonathan Cameron <jic23@kernel.org>
Date:   Fri Jan 1 18:05:34 2016 +0000

    iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
    
    commit 9d0be85d4e2cfa2519ae16efe7ff4a7150c43c0b upstream.
    
    Whilst this part has a hardware buffer, the identification that IIO cares
    about is the userspace facing end.  In this case we push individual elements
    from the hardware fifo into the software interface (specifically a kfifo)
    rather than providing direct reads through to a hardware buffer
    (as we still do in the sca3000 for example).
    
    Technically the original specification as a hardware buffer could be
    considered wrong, but it didn't matter until the patch listed below.
    
    The result is that any attempt to enable the buffer will return -EINVAL.
    
    Fixes: 225d59adf1c8 ("iio: Specify supported modes for buffers")
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dc275a6eb9d0ebe7711d8ced07509dc5168163b3
Author: Lars-Peter Clausen <lars@metafoo.de>
Date:   Fri Nov 27 14:55:56 2015 +0100

    iio: adis_buffer: Fix out-of-bounds memory access
    
    commit d590faf9e8f8509a0a0aa79c38e87fcc6b913248 upstream.
    
    The SPI tx and rx buffers are both supposed to be scan_bytes amount of
    bytes large and a common allocation is used to allocate both buffers. This
    puts the beginning of the tx buffer scan_bytes bytes after the rx buffer.
    The initialization of the tx buffer pointer is done adding scan_bytes to
    the beginning of the rx buffer, but since the rx buffer is of type __be16
    this will actually add two times as much and the tx buffer ends up pointing
    after the allocated buffer.
    
    Fix this by using scan_count, which is scan_bytes / 2, instead of
    scan_bytes when initializing the tx buffer pointer.
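    
    The pointer arithmetic involved can be reproduced in userspace. In this
    sketch uint16_t stands in for __be16, and byte_offset(), SCAN_BYTES and
    SCAN_COUNT are names made up for the example, not the driver's.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    
    /* Byte distance between two pointers into the same allocation. */
    static size_t byte_offset(const uint16_t *base, const uint16_t *p)
    {
        return (size_t)((const uint8_t *)p - (const uint8_t *)base);
    }
    
    static void tx_pointer_demo(void)
    {
        enum { SCAN_BYTES = 8, SCAN_COUNT = SCAN_BYTES / 2 };
        /* One allocation holding rx then tx, SCAN_BYTES bytes each. */
        uint16_t buf[2 * SCAN_COUNT];
        uint16_t *rx = buf;
    
        /* Old arithmetic: adds SCAN_BYTES elements, i.e. 2 * SCAN_BYTES bytes,
         * which is already the end of the whole allocation. */
        assert(byte_offset(rx, rx + SCAN_BYTES) == sizeof(buf));
    
        /* Fixed arithmetic: adds SCAN_COUNT elements, i.e. SCAN_BYTES bytes. */
        assert(byte_offset(rx, rx + SCAN_COUNT) == (size_t)SCAN_BYTES);
    }
    ```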
    
    Fixes: aacff892cbd5 ("staging:iio:adis: Preallocate transfer message")
    Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a258a959fcf31230ae660f5b0a537b8d7fc0b675
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Wed Feb 10 08:03:26 2016 -0800

    scsi: fix soft lockup in scsi_remove_target() on module removal
    
    commit 90a88d6ef88edcfc4f644dddc7eef4ea41bccf8b upstream.
    
    This softlockup is currently happening:
    
    [  444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29]
    [  444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc edd snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata thermal
    [  444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G           O    4.4.0-rc5-2.g1e923a3-default #1
    [  444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E           /D2164-A1, BIOS 5.00 R1.10.2164.A1               05/08/2006
    [  444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc]
    [  444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000
    [  444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1
    [  444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20
    [  444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286
    [  444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8
    [  444.088002]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    [  444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0
    [  444.088002] Stack:
    [  444.088002]  f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90
    [  444.088002]  f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040
    [  444.088002]  f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004
    [  444.088002] Call Trace:
    [  444.088002]  [<c066b0f7>] scsi_remove_target+0x167/0x1c0
    [  444.088002]  [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc]
    [  444.088002]  [<c026cb25>] process_one_work+0x155/0x3e0
    [  444.088002]  [<c026cde7>] worker_thread+0x37/0x490
    [  444.088002]  [<c027214b>] kthread+0x9b/0xb0
    [  444.088002]  [<c07e72c1>] ret_from_kernel_thread+0x21/0x40
    
    What appears to be happening is that something has pinned the target
    so it can't go into STARGET_DEL via final release and the loop in
    scsi_remove_target spins endlessly until that happens.
    
    The fix for this soft lockup is to not keep looping over a device that
    we've called remove on but which hasn't gone into DEL state.  This
    patch will retain a simplistic memory of the last target and not keep
    looping over it.
    
    Reported-by: Sebastian Herbszt <herbszt@gmx.de>
    Tested-by: Sebastian Herbszt <herbszt@gmx.de>
    Fixes: 40998193560dab6c3ce8d25f4fa58a23e252ef38
    Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 900ae746c1e954c5dd2b926d9e30863c9852396c
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Wed Jan 27 16:19:13 2016 +0200

    SCSI: Add Marvell Console to VPD blacklist
    
    commit 82c43310508eb19eb41fe7862e89afeb74030b84 upstream.
    
    I have a Marvell 88SE9230 SATA Controller that has some sort of
    integrated console SCSI device attached to one of the ports.
    
      ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      ata14.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
      ata14.00: configured for UDMA/66
      scsi 13:0:0:0: Processor         Marvell  Console 1.01 PQ: 0 ANSI: 5
    
    Sending it a VPD INQUIRY command seems to always fail with the following error:
    
      ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
      ata14.00: irq_stat 0x40000001
      ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 2 dma 16640 in
                Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
      ata14: hard resetting link
    
    This has been a minor annoyance (only an error printed in dmesg) until commit
    09e2b0b14690 ("scsi: rescan VPD attributes") added a call to scsi_attach_vpd()
    in scsi_rescan_device(). The commit causes the system to splat out the
    following errors continuously without ever reaching the UI:
    
      ata14.00: configured for UDMA/66
      ata14: EH complete
      ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
      ata14.00: irq_stat 0x40000001
      ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 6 dma 16640 in
                Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
      ata14: hard resetting link
      ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      ata14.00: configured for UDMA/66
      ata14: EH complete
      ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
      ata14.00: irq_stat 0x40000001
      ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 7 dma 16640 in
                Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
    
    Without an in-depth understanding of the SCSI layer and the Marvell
    controller, I suspect this happens because when the link goes down (because
    of an error) we schedule scsi_rescan_device() which again fails to read VPD
    data... ad infinitum.
    
    Since VPD data cannot be read from the device anyway we prevent the SCSI
    layer from even trying by blacklisting the device. This gets rid of the
    error and the system starts up normally.
    
    [mkp: Widened the match to all revisions of this device]
    
    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 32c55052aa3364bd4a0d0c2b5f53ed9b95ae8d05
Author: Hannes Reinecke <hare@suse.de>
Date:   Fri Jan 22 15:42:41 2016 +0100

    scsi_dh_rdac: always retry MODE SELECT on command lock violation
    
    commit d2d06d4fe0f2cc2df9b17fefec96e6e1a1271d91 upstream.
    
    If MODE SELECT returns with sense '05/91/36' (command lock violation)
    it should always be retried without counting the number of retries.
    During an HBA upgrade or similar circumstances one might see a flood
    of MODE SELECT commands from various HBAs, which will easily trigger
    the sense code and exceed the retry count.
    
    Signed-off-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4c654fc9357b70803c1878ec729ef61a8be0af52
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Tue Feb 2 16:57:35 2016 -0800

    drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
    
    commit 461c7fa126794157484dca48e88effa4963e3af3 upstream.
    
    Reduced testcase:
    
        #include <fcntl.h>
        #include <unistd.h>
        #include <sys/mman.h>
        #include <numaif.h>
    
        #define SIZE 0x2000
    
        int main()
        {
            int fd;
            void *p;
    
            fd = open("/dev/sg0", O_RDWR);
            p = mmap(NULL, SIZE, PROT_EXEC, MAP_PRIVATE | MAP_LOCKED, fd, 0);
            mbind(p, SIZE, 0, NULL, 0, MPOL_MF_MOVE);
            return 0;
        }
    
    We shouldn't try to migrate pages in sg VMA as we don't have a way to
    update Sg_scatter_hold::pages accordingly from mm core.
    
    Let's mark the VMA as VM_IO to indicate to mm core that the VMA is not
    migratable.
    
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Doug Gilbert <dgilbert@interlog.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Shiraz Hashim <shashim@codeaurora.org>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Sasha Levin <sasha.levin@oracle.com>
    Cc: syzkaller <syzkaller@googlegroups.com>
    Cc: Kostya Serebryany <kcc@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d763177d00d72af0efc0f13745b8c26c832845bf
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Wed Jan 20 11:26:01 2016 -0500

    SCSI: fix crashes in sd and sr runtime PM
    
    commit 13b4389143413a1f18127c07f72c74cad5b563e8 upstream.
    
    Runtime suspend during driver probe and removal can cause problems.
    The driver's runtime_suspend or runtime_resume callbacks may be invoked
    before the driver has finished binding to the device or after the
    driver has unbound from the device.
    
    This problem shows up with the sd and sr drivers, and can cause disk
    or CD/DVD drives to become unusable as a result.  The fix is simple.
    The drivers store a pointer to the scsi_disk or scsi_cd structure as
    their private device data when probing is finished, so we simply have
    to be sure to clear the private data during removal and test it during
    runtime suspend/resume.
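    
    The pattern described above can be sketched in userspace C. All names
    here (struct device, struct disk, probe_done(), remove_dev(),
    runtime_suspend()) are hypothetical stand-ins for illustration and are
    not the sd or sr driver code.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Hypothetical stand-ins for a device and its driver-private data. */
    struct disk { int suspended; };
    struct device { void *drvdata; };
    
    /* Probe finishes by publishing the private data. */
    static void probe_done(struct device *dev, struct disk *d)
    {
        dev->drvdata = d;
    }
    
    /* Removal clears it so later callbacks can tell the driver is unbound. */
    static void remove_dev(struct device *dev)
    {
        dev->drvdata = NULL;
    }
    
    /* Runtime suspend tests the pointer before touching anything. */
    static int runtime_suspend(struct device *dev)
    {
        struct disk *d = dev->drvdata;
    
        if (!d)
            return 0;   /* not bound yet, or already unbound: nothing to do */
        d->suspended = 1;
        return 0;
    }
    
    static void runtime_pm_demo(void)
    {
        struct device dev = { NULL };
        struct disk d = { 0 };
    
        assert(runtime_suspend(&dev) == 0);   /* before probe finished */
        probe_done(&dev, &d);
        assert(runtime_suspend(&dev) == 0 && d.suspended == 1);
        d.suspended = 0;
        remove_dev(&dev);
        assert(runtime_suspend(&dev) == 0 && d.suspended == 0);
    }
    ```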
    
    This fixes <https://bugs.debian.org/801925>.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Reported-by: Paul Menzel <paul.menzel@giantmonkey.de>
    Reported-by: Erich Schubert <erich@debian.org>
    Reported-by: Alexandre Rossi <alexandre.rossi@gmail.com>
    Tested-by: Paul Menzel <paul.menzel@giantmonkey.de>
    Tested-by: Erich Schubert <erich@debian.org>
    Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dcec7af70910c48cba1e7b42383b9696356afc33
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date:   Tue Jan 19 16:15:27 2016 -0800

    iscsi-target: Fix potential dead-lock during node acl delete
    
    commit 26a99c19f810b2593410899a5b304b21b47428a6 upstream.
    
    This patch is an iscsi-target specific bug-fix for a dead-lock
    that can occur during explicit struct se_node_acl->acl_group
    se_session deletion via configfs rmdir(2), when iscsi-target
    time2retain timer is still active.
    
    It changes iscsi-target to obtain se_portal_group->session_lock
    internally using spin_is_locked() to check for the specific
    se_node_acl configfs shutdown rmdir(2) case.
    
    Note this patch is intended for stable, and the subsequent
    v4.5-rc patch converts target_core_tpg.c to use proper
    se_sess->sess_kref reference counting for both se_node_acl
    deletion + se_node_acl->queue_depth se_session restart.
    
    Reported-by: Sagi Grimberg <sagig@mellanox.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Hannes Reinecke <hare@suse.de>
    Cc: Andy Grover <agrover@redhat.com>
    Cc: Mike Christie <michaelc@cs.wisc.edu>
    Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 954bb20f70ed1ab93406c6e01e1cd54d866c977e
Author: Mike Christie <mchristi@redhat.com>
Date:   Thu Jan 7 16:34:05 2016 -0600

    scsi: add Synology to 1024 sector blacklist
    
    commit 9055082fb100cc66e20c048251d05159f5f2cfba upstream.
    
    Another iscsi target that cannot handle large IOs, but does not tell us
    a limit.
    
    The Synology iSCSI targets report:
    
    Block limits VPD page (SBC):
      Write same no zero (WSNZ): 0
      Maximum compare and write length: 0 blocks
      Optimal transfer length granularity: 0 blocks
      Maximum transfer length: 0 blocks
      Optimal transfer length: 0 blocks
      Maximum prefetch length: 0 blocks
      Maximum unmap LBA count: 0
      Maximum unmap block descriptor count: 0
      Optimal unmap granularity: 0
      Unmap granularity alignment valid: 0
      Unmap granularity alignment: 0
      Maximum write same length: 0x0 blocks
    
    and the size of the command it can handle seems to depend on how much
    memory it can allocate at the time. This results in IO errors when
    handling large IOs. This patch just has us use the old 1024 default
    sectors for this target by adding it to the scsi blacklist. We do not
    have good contacts with this vendor, so I have not been able to try and
    fix on their side.
    
    I have posted this a long while back, but it was not merged. This
    version just fixes it up for merge/patch failures in the original
    version.
    
    Reported-by: Ancoron Luciferis <ancoron.luciferis@googlemail.com>
    Reported-by: Michael Meyers <steltek@tcnnet.com>
    Signed-off-by: Mike Christie <mchristi@redhat.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5b27adfac012bd383de3e950e2c8c431da9ebfd5
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Wed Jan 13 08:10:31 2016 -0800

    klist: fix starting point removed bug in klist iterators
    
    commit 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17 upstream.
    
    The starting node for a klist iteration is often passed in from
    somewhere way above the klist infrastructure, meaning there's no
    guarantee the node is still on the list.  We've seen this in SCSI where
    we use bus_find_device() to iterate through a list of devices.  In the
    face of heavy hotplug activity, the last device returned by
    bus_find_device() can be removed before the next call.  This leads to
    
    Dec  3 13:22:02 localhost kernel: WARNING: CPU: 2 PID: 28073 at include/linux/kref.h:47 klist_iter_init_node+0x3d/0x50()
    Dec  3 13:22:02 localhost kernel: Modules linked in: scsi_debug x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel joydev iTCO_wdt dcdbas ipmi_devintf acpi_power_meter iTCO_vendor_support ipmi_si imsghandler pcspkr wmi acpi_cpufreq tpm_tis tpm shpchp lpc_ich mfd_core nfsd nfs_acl lockd grace sunrpc tg3 ptp pps_core
    Dec  3 13:22:02 localhost kernel: CPU: 2 PID: 28073 Comm: cat Not tainted 4.4.0-rc1+ #2
    Dec  3 13:22:02 localhost kernel: Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013
    Dec  3 13:22:02 localhost kernel: ffffffff81a20e77 ffff880613acfd18 ffffffff81321eef 0000000000000000
    Dec  3 13:22:02 localhost kernel: ffff880613acfd50 ffffffff8107ca52 ffff88061176b198 0000000000000000
    Dec  3 13:22:02 localhost kernel: ffffffff814542b0 ffff880610cfb100 ffff88061176b198 ffff880613acfd60
    Dec  3 13:22:02 localhost kernel: Call Trace:
    Dec  3 13:22:02 localhost kernel: [<ffffffff81321eef>] dump_stack+0x44/0x55
    Dec  3 13:22:02 localhost kernel: [<ffffffff8107ca52>] warn_slowpath_common+0x82/0xc0
    Dec  3 13:22:02 localhost kernel: [<ffffffff814542b0>] ? proc_scsi_show+0x20/0x20
    Dec  3 13:22:02 localhost kernel: [<ffffffff8107cb4a>] warn_slowpath_null+0x1a/0x20
    Dec  3 13:22:02 localhost kernel: [<ffffffff8167225d>] klist_iter_init_node+0x3d/0x50
    Dec  3 13:22:02 localhost kernel: [<ffffffff81421d41>] bus_find_device+0x51/0xb0
    Dec  3 13:22:02 localhost kernel: [<ffffffff814545ad>] scsi_seq_next+0x2d/0x40
    [...]
    
    And an eventual crash. It can actually occur in any hotplug system
    which has a device finder and a starting device.
    
    We can fix this globally by making sure the starting node for
    klist_iter_init_node() is actually a member of the list before using it
    (and by starting from the beginning if it isn't).
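    
    A minimal user-space sketch of that idea follows. It assumes, purely
    for illustration, that a node whose reference count has dropped to
    zero is no longer on the list; the model_* names are hypothetical and
    are not the klist API.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Hypothetical list node; refs == 0 models "already removed". */
    struct model_node {
        int refs;
        struct model_node *next;
    };
    
    struct model_iter {
        struct model_node *head;
        struct model_node *cur;   /* NULL: iteration starts at head */
    };
    
    /* Take a reference only if the node is still live. */
    int model_get_unless_zero(struct model_node *n)
    {
        if (n->refs == 0)
            return 0;
        n->refs++;
        return 1;
    }
    
    /* Use the caller's starting node only if it is still on the list. */
    void model_iter_init_node(struct model_iter *it, struct model_node *head,
                              struct model_node *start)
    {
        assert(it != NULL);
        it->head = head;
        it->cur = NULL;
        if (start && model_get_unless_zero(start))
            it->cur = start;
    }
    
    /* First node the iteration hands back. */
    struct model_node *model_iter_first(const struct model_iter *it)
    {
        return it->cur ? it->cur->next : it->head;
    }
    ```
    
    With a stale starting node the iterator in this model falls back to
    the head of the list instead of following a pointer into a node that
    has already gone away.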
    
    Reported-by: Ewan D. Milne <emilne@redhat.com>
    Tested-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 152fb02241b60ffb8d406b87c68d1908478a205f
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date:   Mon Feb 15 12:36:14 2016 -0500

    tracepoints: Do not trace when cpu is offline
    
    commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 upstream.
    
    The tracepoint infrastructure uses RCU sched protection to enable and
    disable tracepoints safely. There are some instances where tracepoints are
    used in infrastructure code (like kfree()) that get called after a CPU is
    going offline, and perhaps when it is coming back online but hasn't been
    registered yet.
    
    This can produce the following warning:
    
     [ INFO: suspicious RCU usage. ]
     4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
     -------------------------------
     include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
    
     other info that might help us debug this:
    
     RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
     no locks held by swapper/8/0.
    
     stack backtrace:
      CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty #34
      Call Trace:
      [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
      [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
      [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
      [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
      [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
      [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
      [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
      [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
      [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
      [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
      [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
      [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
    
    This warning is not a false positive either. RCU is not protecting code that
    is being executed while the CPU is offline.
    
    Instead of playing "whack-a-mole(TM)" and adding conditional statements to
    the tracepoints we find that are used in this instance, simply add a
    cpu_online() test to the tracepoint code where the tracepoint will be
    ignored if the CPU is offline.
    
    Use of raw_smp_processor_id() is fine, as there should never be a case where
    the tracepoint code goes from running on a CPU that is online and suddenly
    gets migrated to a CPU that is offline.
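    
    As a rough user-space illustration of the guard (not the real
    tracepoint macros), the sketch below skips the probe whenever the
    current CPU is not marked online. The online mask, the hit counter
    and every model_* name are assumptions made for this example.
    
    ```c
    #include <assert.h>
    
    #define MODEL_NR_CPUS 64u
    
    /* Hypothetical online mask, one bit per CPU. */
    static unsigned long long model_online_mask;
    
    void model_set_cpu_online(unsigned int cpu, int online)
    {
        assert(cpu < MODEL_NR_CPUS);
        if (online)
            model_online_mask |= 1ULL << cpu;
        else
            model_online_mask &= ~(1ULL << cpu);
    }
    
    int model_cpu_online(unsigned int cpu)
    {
        return cpu < MODEL_NR_CPUS && ((model_online_mask >> cpu) & 1ULL);
    }
    
    /* Returns 1 if the probe ran, 0 if it was skipped. */
    int model_tracepoint(unsigned int cpu, int *hits)
    {
        if (!model_cpu_online(cpu))
            return 0;       /* offline CPU: ignore the tracepoint */
        (*hits)++;          /* stands in for calling the probes */
        return 1;
    }
    ```
    
    The check sits in one place, so individual call sites such as the
    kfree() path need no conditionals of their own.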
    
    Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
    
    Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
    Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2fa82bbbc73a7d8716e3f7aba6b2b5c84147a2fc
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Fri Feb 12 22:26:42 2016 +0100

    tracing: Fix freak link error caused by branch tracer
    
    commit b33c8ff4431a343561e2319f17c14286f2aa52e2 upstream.
    
    In my randconfig tests, I came across a bug that involves several
    components:
    
    * gcc-4.9 through at least 5.3
    * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files
    * CONFIG_PROFILE_ALL_BRANCHES overriding every if()
    * The optimized implementation of do_div() that tries to
      replace a library call with a division by multiplication
    * code in drivers/media/dvb-frontends/zl10353.c doing
    
            u32 adc_clock = 450560; /* 45.056 MHz */
            if (state->config.adc_clock)
                    adc_clock = state->config.adc_clock;
            do_div(value, adc_clock);
    
    In this case, gcc fails to determine whether the divisor
    in do_div() is __builtin_constant_p(). In particular, it
    concludes that __builtin_constant_p(adc_clock) is false, while
    __builtin_constant_p(!!adc_clock) is true.
    
    That in turn throws off the logic in do_div(), which also uses
    __builtin_constant_p() to pick the constant-optimized division, and
    the code in ilog2(), which uses __builtin_constant_p() to figure out
    whether it knows the answer at compile time. The result is a link
    error from failing to find
    multiple symbols that should never have been called based on
    the __builtin_constant_p():
    
    dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN'
    dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod'
    ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined!
    ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined!
    
    This patch avoids the problem by changing __trace_if() to check
    whether the condition is known at compile-time to be nonzero, rather
    than checking whether it is actually a constant.
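    
    The shape of such a check can be sketched as below. This is a
    simplified illustration that needs GCC or Clang for
    __builtin_constant_p(); the model_trace_if() macro and the counter
    are hypothetical and much smaller than the real __trace_if().
    
    ```c
    #include <assert.h>
    
    /* Hypothetical counter standing in for the branch profiler records. */
    unsigned long model_branch_records;
    
    /*
     * Ask whether the truth value !!(cond) is known at compile time,
     * not whether cond itself is a constant. Only conditions that are
     * not known at compile time get recorded.
     */
    #define model_trace_if(cond) \
        (__builtin_constant_p(!!(cond)) ? !!(cond) : \
         (model_branch_records++, !!(cond)))
    
    /* Mirrors the adc_clock selection quoted above. */
    int model_pick_clock(int configured)
    {
        int clock = 450560;
    
        if (model_trace_if(configured))
            clock = configured;
        return clock;
    }
    ```
    
    Whether a given condition counts as constant depends on the compiler
    and the optimization level, so only the returned value should be
    relied on in this sketch, never the counter.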
    
    I see this one link error in roughly one out of 1600 randconfig builds
    on ARM, and the patch fixes all known instances.
    
    Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.de
    
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Fixes: ab3c9c686e22 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6fa74f50e357b0d87fad34a4b7ccbc2beb42b859
Author: Adrian Hunter <adrian.hunter@intel.com>
Date:   Tue Jan 26 14:05:20 2016 +0200

    perf tools: tracepoint_error() can receive e=NULL, robustify it
    
    commit ec183d22cc284a7a1e17f0341219d8ec8ca070cc upstream.
    
    Fixes segmentation fault using, for instance:
    
      (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
      Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
      Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib64/libthread_db.so.1".
    
      Program received signal SIGSEGV, Segmentation fault.
      0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
      (gdb) bt
      #0  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
      #1  0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
          at util/parse-events.c:433
      #2  0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
          at util/parse-events.c:498
      #3  0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
          at util/parse-events.c:936
      #4  0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
      #5  0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
      #6  0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
      #7  0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
      #8  0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
      #9  0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
      #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
      #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
      #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
      #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
      #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
      #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
    (gdb)
    
    Intel PT attempts to find the sched:sched_switch tracepoint but that seg
    faults if tracefs is not readable, because the error reporting structure
    is null, as errors are not reported when automatically adding
    tracepoints.  Fix by checking before using.
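    
    A stripped-down sketch of "checking before using" is shown below. The
    struct layout, the function name and the messages are illustrative
    assumptions and are not copied from util/parse-events.c.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    
    /* Hypothetical error-reporting structure. */
    struct model_parse_error {
        const char *str;
    };
    
    void model_tracepoint_error(struct model_parse_error *e, int err)
    {
        /* Callers that do not want a report pass e == NULL. */
        if (!e)
            return;
    
        /* Accept either sign convention for the error code. */
        if (err < 0)
            err = -err;
    
        switch (err) {
        case EACCES:
            e->str = "can't access trace events";
            break;
        case ENOENT:
            e->str = "unknown tracepoint";
            break;
        default:
            e->str = "failed to add tracepoint";
            break;
        }
    }
    ```
    
    With the early return, a call such as
    model_tracepoint_error(NULL, EACCES) is harmless instead of
    dereferencing a null pointer.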
    
    Committer note:
    
    This doesn't take place in a kernel that supports
    perf_event_attr.context_switch, that is the default way that will be
    used for tracking context switches, only in older kernels, like 4.2, in
    a machine with Intel PT (e.g. Broadwell) for non-privileged users.
    
    Further info from a similar patch by Wang:
    
    The error is in tracepoint_error: it assumes the 'e' parameter is valid.
    
    However, there are many situations where parse_events() can be called
    without parse_events_error. See result of
    
      $ grep 'parse_events(.*NULL)' ./tools/perf/ -r
    
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Tong Zhang <ztong@vt.edu>
    Cc: Wang Nan <wangnan0@huawei.com>
    Fixes: 196581717d85 ("perf tools: Enhance parsing events tracepoint error output")
    Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e50ddaf0991285f6b6abb4d575f0cc0e8ed7834
Author: Steven Rostedt <rostedt@goodmis.org>
Date:   Mon Nov 16 17:25:16 2015 -0500

    tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines
    
    commit 32abc2ede536aae52978d6c0a8944eb1df14f460 upstream.
    
    When a long value is read on 32 bit machines for 64 bit output, the
    parsing needs to change "%lu" into "%llu", as the value is read
    natively.
    
    Unfortunately, if "%llu" is already there, the code will add another "l"
    to it and fail to parse it properly.
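    
    The intended behaviour can be illustrated with a small helper that
    widens a single conversion: it inserts an extra "l" only when exactly
    one "l" is present. The function name and its buffer interface are
    hypothetical and are not the event-parse.c code.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>
    
    /*
     * Copy one conversion such as "%lu" into out, turning a single 'l'
     * length modifier into "ll". Formats without 'l', or already using
     * "ll", are copied unchanged.
     * Returns 0 on success, -1 if out is too small.
     */
    int widen_long_format(const char *fmt, char *out, size_t outsz)
    {
        const char *first_l;
        size_t len, prefix;
    
        assert(fmt && out);
        first_l = strchr(fmt, 'l');
        len = strlen(fmt);
    
        if (!first_l || first_l[1] == 'l') {
            if (len + 1 > outsz)
                return -1;
            memcpy(out, fmt, len + 1);
            return 0;
        }
    
        if (len + 2 > outsz)
            return -1;
        prefix = (size_t)(first_l - fmt) + 1;   /* through the 'l' */
        memcpy(out, fmt, prefix);
        out[prefix] = 'l';
        memcpy(out + prefix + 1, fmt + prefix, len - prefix + 1);
        return 0;
    }
    ```
    
    Here "%lu" becomes "%llu", while "%llu" and "%u" pass through
    untouched.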
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Link: http://lkml.kernel.org/r/20151116172516.4b79b109@gandalf.local.home
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 969624b7c1c8c9784651eb97431e6f2bbb7a024c
Author: Jann Horn <jann@thejh.net>
Date:   Wed Jan 20 15:00:04 2016 -0800

    ptrace: use fsuid, fsgid, effective creds for fs access checks
    
    commit caaee6234d05a58c5b4d05e7bf766131b810a657 upstream.
    
    By checking the effective credentials instead of the real UID / permitted
    capabilities, ensure that the calling process actually intended to use its
    credentials.
    
    To ensure that all ptrace checks use the correct caller credentials (e.g.
    in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
    flag), use two new flags and require one of them to be set.
    
    The problem was that when a privileged task had temporarily dropped its
    privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
    perform following syscalls with the credentials of a user, it still passed
    ptrace access checks that the user would not be able to pass.
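    
    To make the scenario concrete, here is a deliberately simplified
    user-space model. It compares UIDs only and ignores capabilities,
    GIDs and LSM hooks; the MODEL_* flag values, the struct and the
    function name are assumptions for illustration, not the kernel's
    definitions.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    
    /* Hypothetical flags: which caller credentials the check uses. */
    #define MODEL_MODE_FSCREDS   0x08u
    #define MODEL_MODE_REALCREDS 0x10u
    
    struct model_cred {
        unsigned int uid;     /* real UID */
        unsigned int fsuid;   /* filesystem UID, follows the effective UID */
    };
    
    /* Returns 0 if access is allowed, -EPERM otherwise. */
    int model_access_check(const struct model_cred *caller,
                           unsigned int target_uid, unsigned int mode)
    {
        int use_fs = !!(mode & MODEL_MODE_FSCREDS);
        int use_real = !!(mode & MODEL_MODE_REALCREDS);
        unsigned int caller_uid;
    
        assert(caller);
    
        /* Exactly one of the two flags must be set. */
        if (use_fs == use_real)
            return -EPERM;
    
        caller_uid = use_fs ? caller->fsuid : caller->uid;
        return caller_uid == target_uid ? 0 : -EPERM;
    }
    ```
    
    For a task with uid 0 and fsuid 1000, as after setreuid(0, user_uid),
    the real-credential check against a root-owned target still
    succeeds, whereas the fs-credential check is refused.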
    
    While an attacker should not be able to convince the privileged task to
    perform a ptrace() syscall, this is a problem because the ptrace access
    check is reused for things in procfs.
    
    In particular, the following somewhat interesting procfs entries only rely
    on ptrace access checks:
    
     /proc/$pid/stat - uses the check for determining whether pointers
         should be visible, useful for bypassing ASLR
     /proc/$pid/maps - also useful for bypassing ASLR
     /proc/$pid/cwd - useful for gaining access to restricted
         directories that contain files with lax permissions, e.g. in
         this scenario:
         lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
         drwx------ root root /root
         drwxr-xr-x root root /root/foobar
         -rw-r--r-- root root /root/foobar/secret
    
    Therefore, on a system where a root-owned mode 6755 binary changes its
    effective credentials as described and then dumps a user-specified file,
    this could be used by an attacker to reveal the memory layout of root's
    processes or reveal the contents of files he is not allowed to access
    (through /proc/$pid/cwd).
    
    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Jann Horn <jann@thejh.net>
    Acked-by: Kees Cook <keescook@chromium.org>
    Cc: Casey Schaufler <casey@schaufler-ca.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: James Morris <james.l.morris@oracle.com>
    Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
    Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Willy Tarreau <w@1wt.eu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ba6d92801ba4e4c0262b70ea00922a71092999bb
Author: Filipe Manana <fdmanana@suse.com>
Date:   Mon Feb 15 16:20:26 2016 +0000

    Btrfs: fix direct IO requests not reporting IO error to user space
    
    commit 1636d1d77ef4e01e57f706a4cae3371463896136 upstream.
    
    If a bio for a direct IO request fails, we were not setting the error in
    the parent bio (the main DIO bio), making us not return the error to
    user space in btrfs_direct_IO(), that is, it made __blockdev_direct_IO()
    return the number of bytes issued for IO and not the error a bio created
    and submitted by btrfs_submit_direct() got from the block layer.
    This essentially happens because when we call:
    
       dio_end_io(dio_bio, bio->bi_error);
    
    It does not set dio_bio->bi_error to the value of the second argument.
    So just add this missing assignment in endio callbacks, just as we do in
    the error path at btrfs_submit_direct() when we fail to clone the dio bio
    or allocate its private object. This follows the convention of what is
    done with other similar APIs such as bio_endio() where the caller is
    responsible for setting the bi_error field in the bio it passes as an
    argument to bio_endio().
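    
    A toy model of that convention is sketched here. The struct and both
    functions are hypothetical; they only mimic the described behaviour,
    where the result handed back to user space is taken from the parent
    bio's bi_error field.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    
    /* Hypothetical, heavily reduced stand-in for struct bio. */
    struct model_bio {
        int bi_error;   /* 0 or a negative errno */
        long bytes;     /* bytes issued for IO */
    };
    
    /* Result reported for the request: the error if set, else the bytes. */
    long model_dio_result(const struct model_bio *dio_bio)
    {
        return dio_bio->bi_error ? dio_bio->bi_error : dio_bio->bytes;
    }
    
    /* End-io callback: the caller copies the error into the parent bio. */
    void model_endio(struct model_bio *dio_bio, const struct model_bio *bio)
    {
        assert(dio_bio && bio);
        dio_bio->bi_error = bio->bi_error;
    }
    ```
    
    Without the assignment in model_endio() the parent would keep
    bi_error == 0 and the byte count would be reported even though the
    child bio failed with -EIO.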
    
    This was detected by the new generic test cases in xfstests: 271, 272,
    276 and 278, which essentially set up a dm error target, then load the
    error table, do a direct IO write and unload the error table. They
    expect the write to fail with -EIO, which was not getting reported
    when testing against btrfs.
    
    Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio")
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e8eced78e0252040c4e6bb633b40afb11a176416
Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Feb 3 19:17:27 2016 +0000

    Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl
    
    commit 0c0fe3b0fa45082cd752553fdb3a4b42503a118e upstream.
    
    While doing some tests I ran into a hang on an extent buffer's rwlock
    that produced the following trace:
    
    [39389.800012] NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [fdm-stress:32166]
    [39389.800016] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [fdm-stress:32165]
    [39389.800016] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs]
    [39389.800016] irq event stamp: 0
    [39389.800016] hardirqs last  enabled at (0): [<          (null)>]           (null)
    [39389.800016] hardirqs last disabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35
    [39389.800016] softirqs last  enabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35
    [39389.800016] softirqs last disabled at (0): [<          (null)>]           (null)
    [39389.800016] CPU: 14 PID: 32165 Comm: fdm-stress Not tainted 4.4.0-rc6-btrfs-next-18+ #1
    [39389.800016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
    [39389.800016] task: ffff880175b1ca40 ti: ffff8800a185c000 task.ti: ffff8800a185c000
    [39389.800016] RIP: 0010:[<ffffffff810902af>]  [<ffffffff810902af>] queued_spin_lock_slowpath+0x57/0x158
    [39389.800016] RSP: 0018:ffff8800a185fb80  EFLAGS: 00000202
    [39389.800016] RAX: 0000000000000101 RBX: ffff8801710c4e9c RCX: 0000000000000101
    [39389.800016] RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000001
    [39389.800016] RBP: ffff8800a185fb98 R08: 0000000000000001 R09: 0000000000000000
    [39389.800016] R10: ffff8800a185fb68 R11: 6db6db6db6db6db7 R12: ffff8801710c4e98
    [39389.800016] R13: ffff880175b1ca40 R14: ffff8800a185fc10 R15: ffff880175b1ca40
    [39389.800016] FS:  00007f6d37fff700(0000) GS:ffff8802be9c0000(0000) knlGS:0000000000000000
    [39389.800016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [39389.800016] CR2: 00007f6d300019b8 CR3: 0000000037c93000 CR4: 00000000001406e0
    [39389.800016] Stack:
    [39389.800016]  ffff8801710c4e98 ffff8801710c4e98 ffff880175b1ca40 ffff8800a185fbb0
    [39389.800016]  ffffffff81091e11 ffff8801710c4e98 ffff8800a185fbc8 ffffffff81091895
    [39389.800016]  ffff8801710c4e98 ffff8800a185fbe8 ffffffff81486c5c ffffffffa067288c
    [39389.800016] Call Trace:
    [39389.800016]  [<ffffffff81091e11>] queued_read_lock_slowpath+0x46/0x60
    [39389.800016]  [<ffffffff81091895>] do_raw_read_lock+0x3e/0x41
    [39389.800016]  [<ffffffff81486c5c>] _raw_read_lock+0x3d/0x44
    [39389.800016]  [<ffffffffa067288c>] ? btrfs_tree_read_lock+0x54/0x125 [btrfs]
    [39389.800016]  [<ffffffffa067288c>] btrfs_tree_read_lock+0x54/0x125 [btrfs]
    [39389.800016]  [<ffffffffa0622ced>] ? btrfs_find_item+0xa7/0xd2 [btrfs]
    [39389.800016]  [<ffffffffa069363f>] btrfs_ref_to_path+0xd6/0x174 [btrfs]
    [39389.800016]  [<ffffffffa0693730>] inode_to_path+0x53/0xa2 [btrfs]
    [39389.800016]  [<ffffffffa0693e2e>] paths_from_inode+0x117/0x2ec [btrfs]
    [39389.800016]  [<ffffffffa0670cff>] btrfs_ioctl+0xd5b/0x2793 [btrfs]
    [39389.800016]  [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc
    [39389.800016]  [<ffffffff81276727>] ? __this_cpu_preempt_check+0x13/0x15
    [39389.800016]  [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc
    [39389.800016]  [<ffffffff8118b3d4>] ? rcu_read_unlock+0x3e/0x5d
    [39389.800016]  [<ffffffff811822f8>] do_vfs_ioctl+0x42b/0x4ea
    [39389.800016]  [<ffffffff8118b4f3>] ? __fget_light+0x62/0x71
    [39389.800016]  [<ffffffff8118240e>] SyS_ioctl+0x57/0x79
    [39389.800016]  [<ffffffff814872d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
    [39389.800016] Code: b9 01 01 00 00 f7 c6 00 ff ff ff 75 32 83 fe 01 89 ca 89 f0 0f 45 d7 f0 0f b1 13 39 f0 74 04 89 c6 eb e2 ff ca 0f 84 fa 00 00 00 <8b> 03 84 c0 74 04 f3 90 eb f6 66 c7 03 01 00 e9 e6 00 00 00 e8
    [39389.800012] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs]
    [39389.800012] irq event stamp: 0
    [39389.800012] hardirqs last  enabled at (0): [<          (null)>]           (null)
    [39389.800012] hardirqs last disabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35
    [39389.800012] softirqs last  enabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35
    [39389.800012] softirqs last disabled at (0): [<          (null)>]           (null)
    [39389.800012] CPU: 15 PID: 32166 Comm: fdm-stress Tainted: G             L  4.4.0-rc6-btrfs-next-18+ #1
    [39389.800012] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
    [39389.800012] task: ffff880179294380 ti: ffff880034a60000 task.ti: ffff880034a60000
    [39389.800012] RIP: 0010:[<ffffffff81091e8d>]  [<ffffffff81091e8d>] queued_write_lock_slowpath+0x62/0x72
    [39389.800012] RSP: 0018:ffff880034a639f0  EFLAGS: 00000206
    [39389.800012] RAX: 0000000000000101 RBX: ffff8801710c4e98 RCX: 0000000000000000
    [39389.800012] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffff8801710c4e9c
    [39389.800012] RBP: ffff880034a639f8 R08: 0000000000000001 R09: 0000000000000000
    [39389.800012] R10: ffff880034a639b0 R11: 0000000000001000 R12: ffff8801710c4e98
    [39389.800012] R13: 0000000000000001 R14: ffff880172cbc000 R15: ffff8801710c4e00
    [39389.800012] FS:  00007f6d377fe700(0000) GS:ffff8802be9e0000(0000) knlGS:0000000000000000
    [39389.800012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [39389.800012] CR2: 00007f6d3d3c1000 CR3: 0000000037c93000 CR4: 00000000001406e0
    [39389.800012] Stack:
    [39389.800012]  ffff8801710c4e98 ffff880034a63a10 ffffffff81091963 ffff8801710c4e98
    [39389.800012]  ffff880034a63a30 ffffffff81486f1b ffffffffa0672cb3 ffff8801710c4e00
    [39389.800012]  ffff880034a63a78 ffffffffa0672cb3 ffff8801710c4e00 ffff880034a63a58
    [39389.800012] Call Trace:
    [39389.800012]  [<ffffffff81091963>] do_raw_write_lock+0x72/0x8c
    [39389.800012]  [<ffffffff81486f1b>] _raw_write_lock+0x3a/0x41
    [39389.800012]  [<ffffffffa0672cb3>] ? btrfs_tree_lock+0x119/0x251 [btrfs]
    [39389.800012]  [<ffffffffa0672cb3>] btrfs_tree_lock+0x119/0x251 [btrfs]
    [39389.800012]  [<ffffffffa061aeba>] ? rcu_read_unlock+0x5b/0x5d [btrfs]
    [39389.800012]  [<ffffffffa061ce13>] ? btrfs_root_node+0xda/0xe6 [btrfs]
    [39389.800012]  [<ffffffffa061ce83>] btrfs_lock_root_node+0x22/0x42 [btrfs]
    [39389.800012]  [<ffffffffa062046b>] btrfs_search_slot+0x1b8/0x758 [btrfs]
    [39389.800012]  [<ffffffff810fc6b0>] ? time_hardirqs_on+0x15/0x28
    [39389.800012]  [<ffffffffa06365db>] btrfs_lookup_inode+0x31/0x95 [btrfs]
    [39389.800012]  [<ffffffff8108d62f>] ? trace_hardirqs_on+0xd/0xf
    [39389.800012]  [<ffffffff8148482b>] ? mutex_lock_nested+0x397/0x3bc
    [39389.800012]  [<ffffffffa068821b>] __btrfs_update_delayed_inode+0x59/0x1c0 [btrfs]
    [39389.800012]  [<ffffffffa068858e>] __btrfs_commit_inode_delayed_items+0x194/0x5aa [btrfs]
    [39389.800012]  [<ffffffff81486ab7>] ? _raw_spin_unlock+0x31/0x44
    [39389.800012]  [<ffffffffa0688a48>] __btrfs_run_delayed_items+0xa4/0x15c [btrfs]
    [39389.800012]  [<ffffffffa0688d62>] btrfs_run_delayed_items+0x11/0x13 [btrfs]
    [39389.800012]  [<ffffffffa064048e>] btrfs_commit_transaction+0x234/0x96e [btrfs]
    [39389.800012]  [<ffffffffa0618d10>] btrfs_sync_fs+0x145/0x1ad [btrfs]
    [39389.800012]  [<ffffffffa0671176>] btrfs_ioctl+0x11d2/0x2793 [btrfs]
    [39389.800012]  [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc
    [39389.800012]  [<ffffffff81140261>] ? __might_fault+0x4c/0xa7
    [39389.800012]  [<ffffffff81140261>] ? __might_fault+0x4c/0xa7
    [39389.800012]  [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc
    [39389.800012]  [<ffffffff8118b3d4>] ? rcu_read_unlock+0x3e/0x5d
    [39389.800012]  [<ffffffff811822f8>] do_vfs_ioctl+0x42b/0x4ea
    [39389.800012]  [<ffffffff8118b4f3>] ? __fget_light+0x62/0x71
    [39389.800012]  [<ffffffff8118240e>] SyS_ioctl+0x57/0x79
    [39389.800012]  [<ffffffff814872d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
    [39389.800012] Code: f0 0f b1 13 85 c0 75 ef eb 2a f3 90 8a 03 84 c0 75 f8 f0 0f b0 13 84 c0 75 f0 ba ff 00 00 00 eb 0a f0 0f b1 13 ff c8 74 0b f3 90 <8b> 03 83 f8 01 75 f7 eb ed c6 43 04 00 5b 5d c3 0f 1f 44 00 00
    
    This happens because in the code path executed by the inode_paths ioctl we
    end up nesting two calls to read lock a leaf's rwlock when after the first
    call to read_lock() and before the second call to read_lock(), another
    task (running the delayed items as part of a transaction commit) has
    already called write_lock() against the leaf's rwlock. This situation is
    illustrated by the following diagram:
    
             Task A                       Task B
    
      btrfs_ref_to_path()               btrfs_commit_transaction()
        read_lock(&eb->lock);
    
                                          btrfs_run_delayed_items()
                                            __btrfs_commit_inode_delayed_items()
                                              __btrfs_update_delayed_inode()
                                                btrfs_lookup_inode()
    
                                                  write_lock(&eb->lock);
                                                    --> task waits for lock
    
        read_lock(&eb->lock);
        --> makes this task hang
            forever (and task B too
            of course)
    
    So fix this by avoiding doing the nested read lock, which is easily
    avoidable. This issue does not happen if task B calls write_lock() after
    task A does the second call to read_lock(), however there does not seem
    to exist anything in the documentation that mentions what is the expected
    behaviour for recursive locking of rwlocks (leaving the idea that doing
    so is not a good usage of rwlocks).
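    
    The interleaving can be replayed with a tiny single-threaded model
    of a fair rwlock. It assumes that a waiting writer blocks new
    readers, which is what the queued rwlock traces above suggest; the
    model_* names and trylock-style interface are made up for this
    sketch, where returning 0 means "the real call would block".
    
    ```c
    #include <assert.h>
    
    /* Hypothetical fair rwlock state. */
    struct model_rwlock {
        int readers;          /* read holders */
        int writer;           /* 1 while write-held */
        int writer_waiting;   /* 1 while a writer is queued */
    };
    
    /* New readers are refused while a writer holds or waits. */
    int model_read_trylock(struct model_rwlock *l)
    {
        if (l->writer || l->writer_waiting)
            return 0;
        l->readers++;
        return 1;
    }
    
    void model_read_unlock(struct model_rwlock *l)
    {
        assert(l->readers > 0);
        l->readers--;
    }
    
    /* A writer that cannot get the lock stays queued. */
    int model_write_trylock(struct model_rwlock *l)
    {
        if (l->readers > 0 || l->writer) {
            l->writer_waiting = 1;
            return 0;
        }
        l->writer_waiting = 0;
        l->writer = 1;
        return 1;
    }
    ```
    
    Replaying the diagram: task A's first read lock succeeds, task B's
    write lock queues, and task A's nested read lock is then refused, so
    A never reaches the unlock that B is waiting for.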
    
    Also, as a side effect necessary for this fix, make sure we do not
    needlessly read lock extent buffers when the input path has skip_locking
    set (used when called from send).
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit be1232bcea113204523b7d530cc5d6d534916276
Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Jan 27 18:37:47 2016 +0000

    Btrfs: fix page reading in extent_same ioctl leading to csum errors
    
    commit 313140023026ae542ad76e7e268c56a1eaa2c28e upstream.
    
    In the extent_same ioctl, we were grabbing the pages (locked) and
    attempting to read them without bothering about any concurrent IO
    against them. That is, we were not checking for any ongoing ordered
    extents nor waiting for them to complete, which leads to a race where
    the extent_same() code gets a checksum verification error when it
    reads the pages, producing a message like the following in dmesg
    and making the operation fail to user space with -ENOMEM:
    
    [18990.161265] BTRFS warning (device sdc): csum failed ino 259 off 495616 csum 685204116 expected csum 1515870868
    
    Fix this by reading the pages with btrfs_readpage(), which waits for any
    concurrent ordered extents to complete and locks the io range, instead
    of extent_read_full_page_nolock(). Also do better error handling and
    don't treat all failures as -ENOMEM, as that's clearly misleading,
    making the checks and operation identical to those of
    prepare_uptodate_page().
    
    The use of extent_read_full_page_nolock() was required before
    commit f441460202cb ("btrfs: fix deadlock with extent-same and readpage"),
    as we had the range locked in an inode's io tree before attempting to
    read the pages.
    
    Fixes: f441460202cb ("btrfs: fix deadlock with extent-same and readpage")
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit df567e6dcd22097d2f31b6a8bf481959f1352b6a
Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Jan 27 10:20:58 2016 +0000

    Btrfs: fix invalid page accesses in extent_same (dedup) ioctl
    
    commit e0bd70c67bf996b360f706b6c643000f2e384681 upstream.
    
    In the extent_same ioctl we are getting the pages for the source and
    target ranges and unlocking them immediately after, which is incorrect
    because later we attempt to map them (with kmap_atomic) and access their
    contents at btrfs_cmp_data(). When we do such access the pages might have
    been relocated or removed from memory, which leads to an invalid memory
    access. This issue is detected on a kernel with CONFIG_DEBUG_PAGEALLOC=y
    which produces a trace like the following:
    
    [186736.677437] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    [186736.680382] Modules linked in: btrfs dm_flakey dm_mod ppdev xor raid6_pq sha256_generic hmac drbg ansi_cprng acpi_cpufreq evdev sg aesni_intel aes_x86_64
    parport_pc ablk_helper tpm_tis psmouse parport i2c_piix4 tpm cryptd i2c_core lrw processor button serio_raw pcspkr gf128mul glue_helper loop autofs4 ext4
    crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last
    unloaded: btrfs]
    [186736.681319] CPU: 13 PID: 10222 Comm: duperemove Tainted: G        W       4.4.0-rc6-btrfs-next-18+ #1
    [186736.681319] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
    [186736.681319] task: ffff880132600400 ti: ffff880362284000 task.ti: ffff880362284000
    [186736.681319] RIP: 0010:[<ffffffff81264d00>]  [<ffffffff81264d00>] memcmp+0xb/0x22
    [186736.681319] RSP: 0018:ffff880362287d70  EFLAGS: 00010287
    [186736.681319] RAX: 000002c002468acf RBX: 0000000012345678 RCX: 0000000000000000
    [186736.681319] RDX: 0000000000001000 RSI: 0005d129c5cf9000 RDI: 0005d129c5cf9000
    [186736.681319] RBP: ffff880362287d70 R08: 0000000000000000 R09: 0000000000001000
    [186736.681319] R10: ffff880000000000 R11: 0000000000000476 R12: 0000000000001000
    [186736.681319] R13: ffff8802f91d4c88 R14: ffff8801f2a77830 R15: ffff880352e83e40
    [186736.681319] FS:  00007f27b37fe700(0000) GS:ffff88043dda0000(0000) knlGS:0000000000000000
    [186736.681319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [186736.681319] CR2: 00007f27a406a000 CR3: 0000000217421000 CR4: 00000000001406e0
    [186736.681319] Stack:
    [186736.681319]  ffff880362287ea0 ffffffffa048d0bd 000000000009f000 0000000000001000
    [186736.681319]  0100000000000000 ffff8801f2a77850 ffff8802f91d49b0 ffff880132600400
    [186736.681319]  00000000000004f8 ffff8801c1efbe41 0000000000000000 0000000000000038
    [186736.681319] Call Trace:
    [186736.681319]  [<ffffffffa048d0bd>] btrfs_ioctl+0x24cb/0x2731 [btrfs]
    [186736.681319]  [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc
    [186736.681319]  [<ffffffff8118b3d4>] ? rcu_read_unlock+0x3e/0x5d
    [186736.681319]  [<ffffffff811822f8>] do_vfs_ioctl+0x42b/0x4ea
    [186736.681319]  [<ffffffff8118b4f3>] ? __fget_light+0x62/0x71
    [186736.681319]  [<ffffffff8118240e>] SyS_ioctl+0x57/0x79
    [186736.681319]  [<ffffffff814872d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
    [186736.681319] Code: 0a 3c 6e 74 0d 3c 79 74 04 3c 59 75 0c c6 06 01 eb 03 c6 06 00 31 c0 eb 05 b8 ea ff ff ff 5d c3 55 31 c9 48 89 e5 48 39 d1 74 13 <0f> b6
    04 0f 44 0f b6 04 0e 48 ff c1 44 29 c0 74 ea eb 02 31 c0
    
    (gdb) list *(btrfs_ioctl+0x24cb)
    0x5e0e1 is in btrfs_ioctl (fs/btrfs/ioctl.c:2972).
    2967                    dst_addr = kmap_atomic(dst_page);
    2968
    2969                    flush_dcache_page(src_page);
    2970                    flush_dcache_page(dst_page);
    2971
    2972                    if (memcmp(addr, dst_addr, cmp_len))
    2973                            ret = BTRFS_SAME_DATA_DIFFERS;
    2974
    2975                    kunmap_atomic(addr);
    2976                    kunmap_atomic(dst_addr);
    
    So fix this by making sure we keep the pages locked and respect the same
    locking order as everywhere else: get and lock the pages first and then
    lock the range in the inode's io tree (like for example at
    __btrfs_buffered_write() and extent_readpages()). If an ordered extent
    is found after locking the range in the io tree, unlock the range,
    unlock the pages, wait for the ordered extent to complete and repeat the
    entire locking process until no overlapping ordered extents are found.
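    
    The locking order and retry described above can be sketched as follows.
    This is an illustration under assumptions, not the patch itself: `ops`
    and its six callbacks are hypothetical stand-ins for the page, io tree
    and ordered extent helpers.

```python
def lock_pages_and_range(ops):
    """Lock pages first, then the io range; retry while an ordered extent overlaps.

    `ops` is a hypothetical object providing the six callbacks used below.
    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        ops.lock_pages()                  # 1. get and lock the pages
        ops.lock_range()                  # 2. lock the range in the io tree
        ordered = ops.find_ordered_extent()
        if ordered is None:
            return attempts               # nothing in flight: keep both locks
        ops.unlock_range()                # back off in reverse order
        ops.unlock_pages()
        ops.wait_ordered_extent(ordered)  # wait, then repeat the whole process
```

    On return both the pages and the range are still locked, so the caller
    can map and compare the page contents safely.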
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b58081d430b4eaf6bf7d407cc5d4bfa39488b6c1
Author: David Sterba <dsterba@suse.com>
Date:   Fri Nov 13 13:44:28 2015 +0100

    btrfs: properly set the termination value of ctx->pos in readdir
    
    commit bc4ef7592f657ae81b017207a1098817126ad4cb upstream.
    
    The value of ctx->pos in the last readdir call is supposed to be set to
    INT_MAX due to 32bit compatibility, unless 'pos' is intentionally set to a
    larger value, in which case it's LLONG_MAX.
    
    There's a report from PaX SIZE_OVERFLOW plugin that "ctx->pos++"
    overflows (https://forums.grsecurity.net/viewtopic.php?f=1&t=4284), on a
    64bit arch, where the value is 0x7fffffffffffffff ie. LLONG_MAX before
    the increment.
    
    We can get to that situation like this:
    
    * emit all regular readdir entries
    * still in the same call to readdir, bump the last pos to INT_MAX
    * next call to readdir will not emit any entries, but will reach the
      bump code again, find pos to be INT_MAX and set it to LLONG_MAX
    
    Normally this is not a problem, but if we call readdir again, we'll find
    'pos' set to LLONG_MAX and the unconditional increment will overflow.
    
    The report from Victor at
    (http://thread.gmane.org/gmane.comp.file-systems.btrfs/49500) with debugging
    print shows that pattern:
    
     Overflow: e
     Overflow: 7fffffff
     Overflow: 7fffffffffffffff
     PAX: size overflow detected in function btrfs_real_readdir
       fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0;
       context: dir_context;
     CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1
     Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015
      ffffffff81901608 0000000000000000 ffffffff819015e6 ffffc90004973d48
      ffffffff81742f0f 0000000000000007 ffffffff81901608 ffffc90004973d78
      ffffffff811cb706 0000000000000000 ffff8800d47359e0 ffffc90004973ed8
     Call Trace:
      [<ffffffff81742f0f>] dump_stack+0x4c/0x7f
      [<ffffffff811cb706>] report_size_overflow+0x36/0x40
      [<ffffffff812ef0bc>] btrfs_real_readdir+0x69c/0x6d0
      [<ffffffff811dafc8>] iterate_dir+0xa8/0x150
      [<ffffffff811e6d8d>] ? __fget_light+0x2d/0x70
      [<ffffffff811dba3a>] SyS_getdents+0xba/0x1c0
     Overflow: 1a
      [<ffffffff811db070>] ? iterate_dir+0x150/0x150
      [<ffffffff81749b69>] entry_SYSCALL_64_fastpath+0x12/0x83
    
    The jump from 7fffffff to 7fffffffffffffff happens when new dir entries
    are not yet synced and are processed from the delayed list. Then the code
    could go to the bump section again even though it might not emit any new
    dir entries from the delayed list.
    
    The fix avoids entering the "bump" section again once we've finished
    emitting the entries, both for synced and delayed entries.
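    
    A simplified model of the sequence above, for illustration only. It
    assumes that each emitted entry advances pos by one and that the fix can
    be represented by a flag that skips the bump when nothing was emitted;
    the function and parameter names are hypothetical:

```python
INT_MAX = 2**31 - 1
LLONG_MAX = 2**63 - 1


def readdir_call(pos, n_entries, skip_bump_when_idle):
    """Model one readdir call and return the new value of pos."""
    emitted = n_entries > 0
    pos += n_entries  # simplification: one position per emitted entry
    if skip_bump_when_idle and not emitted:
        return pos    # fixed behaviour: leave the termination value alone
    if pos == LLONG_MAX:
        # This is where the unconditional increment overflows.
        raise OverflowError("pos++ would overflow a signed 64-bit value")
    pos += 1
    # The "bump" section: pick the termination value.
    return LLONG_MAX if pos >= INT_MAX else INT_MAX
```

    Without the flag, three calls take pos from a small value to INT_MAX,
    then to LLONG_MAX, and then hit the overflow. With the flag set, calls
    that emit nothing leave pos at INT_MAX.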
    
    References: https://forums.grsecurity.net/viewtopic.php?f=1&t=4284
    Reported-by: Victor <services@swwu.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
    Signed-off-by: Chris Mason <clm@fb.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dfd2961ab6eda6c39aa9d7ddc5e6fb754b16a750
Author: David Sterba <dsterba@suse.com>
Date:   Mon Jan 25 11:02:06 2016 +0100

    Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"
    
    commit 80ad623edd2d0ccb47d85357ee31c97e6c684e82 upstream.
    
    This reverts commit 696249132158014d594896df3a81390616069c5c. The
    cleaner thread can block freezing when there's a snapshot cleaning in
    progress and the other threads get suspended first. From the logs
    provided by Martin we're waiting for reading extent pages:
    
    kernel: PM: Syncing filesystems ... done.
    kernel: Freezing user space processes ... (elapsed 0.015 seconds) done.
    kernel: Freezing remaining freezable tasks ...
    kernel: Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
    kernel: btrfs-cleaner   D ffff88033dd13bc0     0   152      2 0x00000000
    kernel: ffff88032ebc2e00 ffff88032e750000 ffff88032e74fa50 7fffffffffffffff
    kernel: ffffffff814a58df 0000000000000002 ffffea000934d580 ffffffff814a5451
    kernel: 7fffffffffffffff ffffffff814a6e8f 0000000000000000 0000000000000020
    kernel: Call Trace:
    kernel: [<ffffffff814a58df>] ? bit_wait+0x2c/0x2c
    kernel: [<ffffffff814a5451>] ? schedule+0x6f/0x7c
    kernel: [<ffffffff814a6e8f>] ? schedule_timeout+0x2f/0xd8
    kernel: [<ffffffff81076f94>] ? timekeeping_get_ns+0xa/0x2e
    kernel: [<ffffffff81077603>] ? ktime_get+0x36/0x44
    kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
    kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
    kernel: [<ffffffff814a590b>] ? bit_wait_io+0x2c/0x30
    kernel: [<ffffffff814a5694>] ? __wait_on_bit+0x41/0x73
    kernel: [<ffffffff8109eba8>] ? wait_on_page_bit+0x6d/0x72
    kernel: [<ffffffff8105d718>] ? autoremove_wake_function+0x2a/0x2a
    kernel: [<ffffffff811a02d7>] ? read_extent_buffer_pages+0x1bd/0x203
    kernel: [<ffffffff8117d9e9>] ? free_root_pointers+0x4c/0x4c
    kernel: [<ffffffff8117e831>] ? btree_read_extent_buffer_pages.constprop.57+0x5a/0xe9
    kernel: [<ffffffff8117f4f3>] ? read_tree_block+0x2d/0x45
    kernel: [<ffffffff8116782a>] ? read_block_for_search.isra.34+0x22a/0x26b
    kernel: [<ffffffff811656c3>] ? btrfs_set_path_blocking+0x1e/0x4a
    kernel: [<ffffffff8116919b>] ? btrfs_search_slot+0x648/0x736
    kernel: [<ffffffff81170559>] ? btrfs_lookup_extent_info+0xb7/0x2c7
    kernel: [<ffffffff81170ee5>] ? walk_down_proc+0x9c/0x1ae
    kernel: [<ffffffff81171c9d>] ? walk_down_tree+0x40/0xa4
    kernel: [<ffffffff8117375f>] ? btrfs_drop_snapshot+0x2da/0x664
    kernel: [<ffffffff8104ff21>] ? finish_task_switch+0x126/0x167
    kernel: [<ffffffff811850f8>] ? btrfs_clean_one_deleted_snapshot+0xa6/0xb0
    kernel: [<ffffffff8117eaba>] ? cleaner_kthread+0x13e/0x17b
    kernel: [<ffffffff8117e97c>] ? btrfs_item_end+0x33/0x33
    kernel: [<ffffffff8104d256>] ? kthread+0x95/0x9d
    kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16
    kernel: [<ffffffff814a7b5f>] ? ret_from_fork+0x3f/0x70
    kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16
    
    As this affects a released kernel (4.4) we need a minimal fix for
    stable kernels.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108361
    Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
    CC: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4e6943903a8eb700bf13a7be572249b34d438574
Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Jan 6 22:42:35 2016 +0000

    Btrfs: fix fitrim discarding device area reserved for boot loader's use
    
    commit 8cdc7c5b00d945a3c823fc4277af304abb9cb43d upstream.
    
    As of the 4.3 kernel release, the fitrim ioctl can now discard any region
    of a disk that is not allocated to any chunk/block group, including the
    first megabyte which is used for our primary superblock and by the boot
    loader (grub for example).
    
    Fix this by not allowing to trim/discard any region in the device starting
    with an offset not greater than min(alloc_start_mount_option, 1Mb), just
    as it was not possible before 4.3.
    
    A reproducer test case for xfstests follows.
    
      seq=`basename $0`
      seqres=$RESULT_DIR/$seq
      echo "QA output created by $seq"
      tmp=/tmp/$$
      status=1	# failure is the default!
      trap "_cleanup; exit \$status" 0 1 2 3 15
    
      _cleanup()
      {
          cd /
          rm -f $tmp.*
      }
    
      # get standard environment, filters and checks
      . ./common/rc
      . ./common/filter
    
      # real QA test starts here
      _need_to_be_root
      _supported_fs btrfs
      _supported_os Linux
      _require_scratch
    
      rm -f $seqres.full
    
      _scratch_mkfs >>$seqres.full 2>&1
    
      # Write to the [0, 64Kb[ and [68Kb, 1Mb[ ranges of the device. These ranges are
      # reserved for a boot loader to use (GRUB for example) and btrfs should never
      # use them - neither for allocating metadata/data nor should trim/discard them.
      # The range [64Kb, 68Kb[ is used for the primary superblock of the filesystem.
      $XFS_IO_PROG -c "pwrite -S 0xfd 0 64K" $SCRATCH_DEV | _filter_xfs_io
      $XFS_IO_PROG -c "pwrite -S 0xfd 68K 956K" $SCRATCH_DEV | _filter_xfs_io
    
      # Now mount the filesystem and perform a fitrim against it.
      _scratch_mount
      _require_batched_discard $SCRATCH_MNT
      $FSTRIM_PROG $SCRATCH_MNT
    
      # Now unmount the filesystem and verify the content of the ranges was not
      # modified (no trim/discard happened on them).
      _scratch_unmount
      echo "Content of the ranges [0, 64Kb] and [68Kb, 1Mb[ after fitrim:"
      od -t x1 -N $((64 * 1024)) $SCRATCH_DEV
      od -t x1 -j $((68 * 1024)) -N $((956 * 1024)) $SCRATCH_DEV
    
      status=0
      exit
    
    Reported-by: Vincent Petry  <PVince81@yahoo.fr>
    Reported-by: Andrei Borzenkov <arvidjaar@gmail.com>
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109341
    Fixes: 499f377f49f0 (btrfs: iterate over unused chunk space in FITRIM)
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c57e49b50bc5ab5089bc1295bab9106925033c40
Author: David Sterba <dsterba@suse.com>
Date:   Mon Nov 30 17:27:06 2015 +0100

    btrfs: handle invalid num_stripes in sys_array
    
    commit f5cdedd73fa71b74dcc42f2a11a5735d89ce7c4f upstream.
    
    We can handle the special case of num_stripes == 0 directly inside
    btrfs_read_sys_array. The BUG_ON in btrfs_chunk_item_size is there to
    catch other unhandled cases where we fail to validate external data.
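    
    The idea can be sketched as follows (illustration only). The item sizes
    are assumed values, the error message is made up, and the assertion
    stands in for the BUG_ON; the point is that the reader rejects
    num_stripes == 0 before the size helper is ever reached:

```python
CHUNK_SIZE = 80   # assumed size of a chunk item that embeds one stripe
STRIPE_SIZE = 32  # assumed size of each additional stripe


def chunk_item_size(num_stripes):
    """Size helper; the assertion plays the role of the BUG_ON."""
    assert num_stripes > 0
    return CHUNK_SIZE + STRIPE_SIZE * (num_stripes - 1)


def read_sys_array_entry(num_stripes):
    """Validate external data first, then compute the item size."""
    if num_stripes == 0:
        # Handle the invalid value gracefully instead of crashing.
        raise ValueError("invalid number of stripes 0 in sys_array")
    return chunk_item_size(num_stripes)
```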
    
    A crafted or corrupted image crashes at mount time:
    
    BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid 7 /dev/loop0
    BTRFS info (device loop0): disk space caching is enabled
    BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!
    Kernel panic - not syncing: BUG!
    CPU: 0 PID: 313 Comm: mount Not tainted 4.2.5-00657-ge047887-dirty #25
    Stack:
     637af890 60062489 602aeb2e 604192ba
     60387961 00000011 637af8a0 6038a835
     637af9c0 6038776b 634ef32b 00000000
    Call Trace:
     [<6001c86d>] show_stack+0xfe/0x15b
     [<6038a835>] dump_stack+0x2a/0x2c
     [<6038776b>] panic+0x13e/0x2b3
     [<6020f099>] btrfs_read_sys_array+0x25d/0x2ff
     [<601cfbbe>] open_ctree+0x192d/0x27af
     [<6019c2c1>] btrfs_mount+0x8f5/0xb9a
     [<600bc9a7>] mount_fs+0x11/0xf3
     [<600d5167>] vfs_kern_mount+0x75/0x11a
     [<6019bcb0>] btrfs_mount+0x2e4/0xb9a
     [<600bc9a7>] mount_fs+0x11/0xf3
     [<600d5167>] vfs_kern_mount+0x75/0x11a
     [<600d710b>] do_mount+0xa35/0xbc9
     [<600d7557>] SyS_mount+0x95/0xc8
     [<6001e884>] handle_syscall+0x6b/0x8e
    
    Reported-by: Jiri Slaby <jslaby@suse.com>
    Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bbfe21c87bd0f529d19f077051a52d779c785c6c
Author: Eryu Guan <guaneryu@gmail.com>
Date:   Fri Feb 12 01:20:43 2016 -0500

    ext4: don't read blocks from disk after extents being swapped
    
    commit bcff24887d00bce102e0857d7b0a8c44a40f53d1 upstream.
    
    I noticed that ext4/307 occasionally fails on a ppc64 host, reporting an
    md5 checksum mismatch after moving data from original file to donor file.
    
    The reason is that move_extent_per_page() calls __block_write_begin()
    and block_commit_write() to write saved data from original inode blocks
    to donor inode blocks, but __block_write_begin() not only maps buffer
    heads but also reads block content from disk if the size is not block
    size aligned.  At this time the physical block number in the mapped buffer
    head points to the donor file, not the original file, which results
    in reading the wrong data into the page; that data then gets written to
    disk in the following block_commit_write() call.
    
    This also can be reproduced by the following script on 1k block size ext4
    on x86_64 host:
    
        mnt=/mnt/ext4
        donorfile=$mnt/donor
        testfile=$mnt/testfile
        e4compact=~/xfstests/src/e4compact
    
        rm -f $donorfile $testfile
    
        # reserve space for donor file, written by 0xaa and sync to disk to
        # avoid EBUSY on EXT4_IOC_MOVE_EXT
        xfs_io -fc "pwrite -S 0xaa 0 1m" -c "fsync" $donorfile
    
        # create test file written by 0xbb
        xfs_io -fc "pwrite -S 0xbb 0 1023" -c "fsync" $testfile
    
        # compute initial md5sum
        md5sum $testfile | tee md5sum.txt
        # drop cache, force e4compact to read data from disk
        echo 3 > /proc/sys/vm/drop_caches
    
        # test defrag
        echo "$testfile" | $e4compact -i -v -f $donorfile
        # check md5sum
        md5sum -c md5sum.txt
    
    Fix it by creating & mapping buffer heads only but not reading blocks
    from disk, because all the data in page is guaranteed to be up-to-date
    in mext_page_mkuptodate().
    
    Signed-off-by: Eryu Guan <guaneryu@gmail.com>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 600d41f4ecb53edb540fa00a34a78ea6e5c9f9f7
Author: Insu Yun <wuninsu@gmail.com>
Date:   Fri Feb 12 01:15:59 2016 -0500

    ext4: fix potential integer overflow
    
    commit 46901760b46064964b41015d00c140c83aa05bcf upstream.
    
    Since sizeof(ext_new_group_data) > sizeof(ext_new_flex_group_data),
    an integer overflow could happen.
    Therefore, fix the integer overflow sanitization.
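    
    To illustrate why the guard has to divide by the size that is actually
    multiplied, here is a small model with 32-bit wraparound. The element
    sizes 16 and 40 used in the note below are hypothetical, as is the
    function name:

```python
UINT_MAX = 2**32 - 1


def alloc_size_checked(count, elem_size, guard_size):
    """Return count * elem_size (mod 2**32), or None if the guard rejects count.

    guard_size is the element size used in the overflow check; the check is
    only sound when guard_size >= elem_size.
    """
    if count >= UINT_MAX // guard_size:
        return None
    return (count * elem_size) & UINT_MAX  # model 32-bit wraparound
```

    With count = UINT_MAX // 40 + 1, a guard based on the smaller size 16
    lets the request through and the product wraps to 24 bytes, while a
    guard based on 40 rejects it.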
    
    Signed-off-by: Insu Yun <wuninsu@gmail.com>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 33f48f8ab0b94fdd1249f561b15c872c36e560c8
Author: Jan Kara <jack@suse.cz>
Date:   Thu Feb 11 23:15:12 2016 -0500

    ext4: fix scheduling in atomic on group checksum failure
    
    commit 05145bd799e498ce4e3b5145894174ee881f02b0 upstream.
    
    When block group checksum is wrong, we call ext4_error() while holding
    group spinlock from ext4_init_block_bitmap() or
    ext4_init_inode_bitmap() which results in scheduling while in atomic.
    Fix the issue by calling ext4_error() later after dropping the spinlock.
    
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5859b9077763ea1c3f4db47522bdcf300aedb080
Author: Peter Hurley <peter@hurleysoftware.com>
Date:   Tue Jan 12 15:14:46 2016 -0800

    serial: omap: Prevent DoS using unprivileged ioctl(TIOCSRS485)
    
    commit 308bbc9ab838d0ace0298268c7970ba9513e2c65 upstream.
    
    The omap-serial driver emulates RS485 delays using software timers,
    but neglects to clamp the input values from the unprivileged
    ioctl(TIOCSRS485). Because the software implementation busy-waits,
    malicious userspace could stall the cpu for ~49 days.
    
    Clamp the input values to < 100ms.
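    
    For illustration, the arithmetic behind the "~49 days" figure and a
    clamp can be sketched as below. The cap of 100 and the assumption that
    the delay is a 32-bit millisecond count are mine; the names are
    hypothetical:

```python
MAX_DELAY_MS = 100   # assumed cap, taken from the "100ms" figure above
U32_MAX = 2**32 - 1  # largest value a 32-bit delay field can hold


def clamp_delay_ms(requested_ms):
    """Limit a user-supplied delay so a busy-wait stays short."""
    return min(requested_ms, MAX_DELAY_MS)


def worst_case_days(delay_ms=U32_MAX):
    """Convert a delay in milliseconds to days."""
    return delay_ms / 1000 / 60 / 60 / 24
```

    worst_case_days() evaluates to roughly 49.7, which is the stall an
    unclamped 32-bit millisecond value allows.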
    
    Fixes: 4a0ac0f55b18 ("OMAP: add RS485 support")
    Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 76e88140aa9111605768e7725ec1c5b709d51865
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Fri Jan 29 16:49:47 2016 +0200

    serial: 8250_pci: Add Intel Broadwell ports
    
    commit 6c55d9b98335f7f6bd5f061866ff1633401f3a44 upstream.
    
    Some recent (early 2015) MacBooks have Intel Broadwell where LPSS UARTs are
    PCI enumerated instead of ACPI. The LPSS UART block is pretty much the same
    as the one used on Intel Baytrail, so we can reuse the existing Baytrail
    setup code.
    
    Add both Broadwell LPSS UART ports to the list of supported devices.
    
    Signed-off-by: Leif Liddy <leif.liddy@gmail.com>
    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 124efa9fd5679ed68fd69e5b899e55b0452c68a4
Author: Jeremy McNicoll <jmcnicol@redhat.com>
Date:   Tue Feb 2 13:00:45 2016 -0800

    tty: Add support for PCIe WCH382 2S multi-IO card
    
    commit 7dde55787b43a8f2b4021916db38d90c03a2ec64 upstream.
    
    WCH382 2S board is a PCIe card with 2 DB9 COM ports detected as
    Serial controller: Device 1c00:3253 (rev 10) (prog-if 05 [16850])
    
    Signed-off-by: Jeremy McNicoll <jmcnicol@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1bdf16025dfc5ed335f3d7d8bbe78461583105fc
Author: Herton R. Krzesinski <herton@redhat.com>
Date:   Thu Jan 14 17:56:58 2016 -0200

    pty: make sure super_block is still valid in final /dev/tty close
    
    commit 1f55c718c290616889c04946864a13ef30f64929 upstream.
    
    Considering current pty code and multiple devpts instances, it's possible
    to umount a devpts file system while a program still has /dev/tty opened
    pointing to a previously closed pty pair in that instance. In the case all
    ptmx and pts/N files are closed, umount can be done. If the program closes
    /dev/tty after umount is done, devpts_kill_index will now use an invalid
    super_block, which was already destroyed in the umount operation after
    running ->kill_sb. This is another "use after free" type of issue, but now
    related to the allocated super_block instance.
    
    To avoid the problem (warning at ida_remove and potential crashes) for
    this specific case, I added two functions in devpts which grab additional
    references to the super_block, which pty code now uses to make sure
    the super block structure is still valid until pty shutdown is done.
    I also moved the additional inode references to the same functions, which
    also cover the similar case of the inode being freed before the final
    /dev/tty close/shutdown.
    
    Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
    Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3ceeb564198cf81863c16d76572306e57e833963
Author: Herton R. Krzesinski <herton@redhat.com>
Date:   Mon Jan 11 12:07:43 2016 -0200

    pty: fix possible use after free of tty->driver_data
    
    commit 2831c89f42dcde440cfdccb9fee9f42d54bbc1ef upstream.
    
    This change fixes a bug for a corner case where we have the last
    release from a pty master/slave coming from a previously opened /dev/tty
    file. When this happens, the tty->driver_data can be stale, due to all
    ptmx or pts/N files having already been closed before (and thus the inode
    related to these files, which tty->driver_data points to, being already
    freed/destroyed).
    
    The fix here is to keep a reference on the opened master ptmx inode.
    We maintain the inode referenced until the final pty_unix98_shutdown,
    and only pass this inode to devpts_kill_index.
    
    Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
    Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a45f23edb00e017289c5263fef25ad920009edb0
Author: Peter Hurley <peter@hurleysoftware.com>
Date:   Sun Jan 10 22:40:58 2016 -0800

    staging/speakup: Use tty_ldisc_ref() for paste kworker
    
    commit f4f9edcf9b5289ed96113e79fa65a7bf27ecb096 upstream.
    
    As the function documentation for tty_ldisc_ref_wait() notes, it is
    only callable from a tty file_operations routine; otherwise there
    is no guarantee the ref won't be NULL.
    
    The key difference with the VT's paste_selection() is that it is an
    ioctl, whereas __speakup_paste_selection() is a completely async
    kworker, kicked off from interrupt context.
    
    Fixes: 28a821c30688 ("Staging: speakup: Update __speakup_paste_selection()
           tty (ab)usage to match vt")
    Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3375ee8b9964979f6f1f745053e21ded1afdb1b6
Author: Tony Lindgren <tony@atomide.com>
Date:   Mon Nov 30 21:39:54 2015 -0800

    phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload
    
    commit 58a66dba1beac2121d931cda4682ae4d40816af5 upstream.
    
    If we reload phy-twl4030-usb, we get a warning about unbalanced
    pm_runtime_enable. Let's fix the issue and also fix idling of the
    device on unload before we attempt to shut it down.
    
    If we don't properly idle the PHY before shutting it down on removal,
    the twl4030 ends up consuming about 62mW of extra power compared to
    running idle with the module loaded.
    
    Cc: Bin Liu <b-liu@ti.com>
    Cc: Felipe Balbi <balbi@ti.com>
    Cc: Kishon Vijay Abraham I <kishon@ti.com>
    Cc: NeilBrown <neil@brown.name>
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a90e66cb949a59298d74638e5db4e888711b4fa0
Author: Tony Lindgren <tony@atomide.com>
Date:   Mon Nov 30 21:39:53 2015 -0800

    phy: twl4030-usb: Release usb phy on unload
    
    commit b241d31ef2f6a289d33dcaa004714b26e06f476f upstream.
    
    Otherwise rmmod omap2430; rmmod phy-twl4030-usb; modprobe omap2430
    will try to use a non-existing phy and oops:
    
    Unable to handle kernel paging request at virtual address b6f7c1f0
    ...
    [<c048a284>] (devm_usb_get_phy_by_node) from [<bf0758ac>]
    (omap2430_musb_init+0x44/0x2b4 [omap2430])
    [<bf0758ac>] (omap2430_musb_init [omap2430]) from [<bf055ec0>]
    (musb_init_controller+0x194/0x878 [musb_hdrc])
    
    Cc: Bin Liu <b-liu@ti.com>
    Cc: Felipe Balbi <balbi@ti.com>
    Cc: Kishon Vijay Abraham I <kishon@ti.com>
    Cc: NeilBrown <neil@brown.name>
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a40efb855068a20cf769425a799642aa95c57635
Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Feb 16 14:15:59 2016 +0100

    ALSA: seq: Fix double port list deletion
    
    commit 13d5e5d4725c64ec06040d636832e78453f477b7 upstream.
    
    The commit [7f0973e973cd: ALSA: seq: Fix lockdep warnings due to
    double mutex locks] split the management of two linked lists (source
    and destination) into two individual calls for avoiding the AB/BA
    deadlock.  However, this leaves open a possible double deletion from
    one of the two lists when the counterpart is being deleted concurrently.
    It ends up with a list corruption, as revealed by the syzkaller fuzzer.
    
    This patch fixes it by checking the list emptiness and skipping the
    deletion and the following process.
    
    BugLink: http://lkml.kernel.org/r/CACT4Y+bay9qsrz6dQu31EcGaH9XwfW7o3oBzSQUG9fMszoh=Sg@mail.gmail.com
    Fixes: 7f0973e973cd ('ALSA: seq: Fix lockdep warnings due to double mutex locks')
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Tested-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6bb345ac7b30680018be6d7e6b4738fab7283b4f
Author: Takashi Iwai <tiwai@suse.de>
Date:   Mon Feb 15 16:20:24 2016 +0100

    ALSA: seq: Fix leak of pool buffer at concurrent writes
    
    commit d99a36f4728fcbcc501b78447f625bdcce15b842 upstream.
    
    When multiple concurrent writes happen on the ALSA sequencer device
    right after the open, it may try to allocate vmalloc buffer for each
    write and leak some of them.  It's because the presence check and the
    assignment of the buffer is done outside the spinlock for the pool.
    
    The fix is to move the check and the assignment into the spinlock.
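    
    The pattern can be sketched as follows (illustration only, with
    hypothetical names; a bytearray stands in for the vmalloc buffer and a
    threading.Lock for the pool spinlock). The allocation happens outside
    the lock, while the presence check and the assignment happen inside it:

```python
import threading


class Pool:
    """Toy pool with a lazily assigned buffer."""

    def __init__(self):
        self.lock = threading.Lock()
        self.buf = None


def init_pool(pool, size):
    """Return True if this call installed the buffer, False if one already existed."""
    new_buf = bytearray(size)  # allocate first; may be thrown away below
    with pool.lock:
        if pool.buf is not None:
            # Another writer already installed a buffer; drop ours
            # instead of overwriting (and leaking) the existing one.
            return False
        pool.buf = new_buf
        return True
```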
    
    (The current implementation is suboptimal, as there can be multiple
     unnecessary vmallocs because the allocation is done before the check
     in the spinlock.  But the pool size is already checked beforehand, so
     this isn't a big problem; that is, the only possible path is the
     multiple writes before any pool assignment, and practically seen, the
     current coverage should be "good enough".)
    
    The issue was triggered by syzkaller fuzzer.
    
    BugLink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=aurkg@mail.gmail.com
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Tested-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ef0ca96169a23551c8f8961c9a081a1c071703a4
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Feb 17 14:30:26 2016 +0100

    ALSA: pcm: Fix rwsem deadlock for non-atomic PCM stream
    
    commit 67ec1072b053c15564e6090ab30127895dc77a89 upstream.
    
    A non-atomic PCM stream may take snd_pcm_link_rwsem rw semaphore twice
    in the same code path, e.g. one in snd_pcm_action_nonatomic() and
    another in snd_pcm_stream_lock().  Usually this is OK, but when a
    write lock is requested between these two read locks, a problem
    arises: the write lock is blocked by the first read lock, and
    the second read lock is in turn blocked by the write lock.  This
    eventually deadlocks.
    
    The reason is the way rwsem manages waiters: they are queued in FIFO
    order, so even if the writer itself hasn't taken the lock yet, it
    blocks all the waiters (including readers) queued after it.
    
    As a workaround, in this patch, we replace the standard down_write()
    with a spinning loop.  This is far from optimal, but it's good
    enough, as the spinning time is supposed to be relatively short for
    normal PCM operations, and the code paths requiring the write lock
    aren't called so often.
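    
    A toy userspace model of the workaround, under the assumption that
    the spinning loop simply retries a non-blocking write attempt: since
    the writer never joins a wait queue, it cannot block readers that
    arrive after it.  struct toy_rwsem and all toy_*() helpers are
    hypothetical; the relax callback stands in for whatever the loop
    does between attempts (yielding the CPU, for instance).
    
    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>
    
    /* count > 0: number of readers, 0: free, -1: held by a writer */
    struct toy_rwsem {
        atomic_int count;
    };
    
    bool toy_read_trylock(struct toy_rwsem *s)
    {
        int c = atomic_load(&s->count);
    
        while (c >= 0) {
            /* On failure, c is reloaded with the current value. */
            if (atomic_compare_exchange_weak(&s->count, &c, c + 1))
                return true;
        }
        return false;   /* a writer holds the lock */
    }
    
    void toy_read_unlock(struct toy_rwsem *s)
    {
        atomic_fetch_sub(&s->count, 1);
    }
    
    bool toy_write_trylock(struct toy_rwsem *s)
    {
        int expected = 0;
    
        /* Succeeds only when there is no reader and no writer. */
        return atomic_compare_exchange_strong(&s->count, &expected, -1);
    }
    
    void toy_write_unlock(struct toy_rwsem *s)
    {
        atomic_store(&s->count, 0);
    }
    
    /* Example relax callback: pretends a reader finished meanwhile. */
    void toy_drop_reader(void *arg)
    {
        toy_read_unlock(arg);
    }
    
    /* Spin until the write lock is taken; returns the failed attempts. */
    unsigned long toy_down_write_spin(struct toy_rwsem *s,
                                      void (*relax)(void *), void *arg)
    {
        unsigned long spins = 0;
    
        assert(s != NULL);
        while (!toy_write_trylock(s)) {
            spins++;
            if (relax)
                relax(arg);
        }
        return spins;
    }
    ```
    
    While the writer spins, toy_read_trylock() keeps succeeding for
    other readers, which is the property the FIFO-queued down_write()
    lacks.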
    
    Reported-by: Vinod Koul <vinod.koul@intel.com>
    Tested-by: Ramesh Babu <ramesh.babu@intel.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 434e26d6f6a000b8585c0eb64764a55daff65d20
Author: Takashi Iwai <tiwai@suse.de>
Date:   Mon Feb 15 16:37:24 2016 +0100

    ALSA: hda - Cancel probe work instead of flush at remove
    
    commit 0b8c82190c12e530eb6003720dac103bf63e146e upstream.
    
    The commit [991f86d7ae4e: ALSA: hda - Flush the pending probe work at
    remove] introduced the sync of async probe work at remove for fixing
    the race.  However, this may lead to another hang when the module
    is removed quickly, before the probe work starts, because remove
    issues flush_work() and it blocks forever.
    
    The workaround is to use cancel_work_sync() instead of flush_work()
    there.
    
    Fixes: 991f86d7ae4e ('ALSA: hda - Flush the pending probe work at remove')
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6deb0ec93da69d60f28c54b48d24c35e92b2155f
Author: Toshi Kani <toshi.kani@hpe.com>
Date:   Wed Feb 17 18:16:54 2016 -0700

    x86/mm: Fix vmalloc_fault() to handle large pages properly
    
    commit f4eafd8bcd5229e998aa252627703b8462c3b90f upstream.
    
    A kernel page fault oops with the callstack below was observed
    when a read syscall was made to a pmem device after a huge amount
    (>512GB) of vmalloc ranges was allocated by ioremap() on an x86_64
    system:
    
         BUG: unable to handle kernel paging request at ffff880840000ff8
         IP: vmalloc_fault+0x1be/0x300
         PGD c7f03a067 PUD 0
         Oops: 0000 [#1] SM
         Call Trace:
            __do_page_fault+0x285/0x3e0
            do_page_fault+0x2f/0x80
            ? put_prev_entity+0x35/0x7a0
            page_fault+0x28/0x30
            ? memcpy_erms+0x6/0x10
            ? schedule+0x35/0x80
            ? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
            ? schedule_timeout+0x183/0x240
            btt_log_read+0x63/0x140 [nd_btt]
             :
            ? __symbol_put+0x60/0x60
            ? kernel_read+0x50/0x80
            SyS_finit_module+0xb9/0xf0
            entry_SYSCALL_64_fastpath+0x1a/0xa4
    
    Since v4.1, ioremap() supports large page (pud/pmd) mappings in
    x86_64 and PAE.  vmalloc_fault() however assumes that the vmalloc
    range is limited to pte mappings.
    
    vmalloc faults do not normally happen in ioremap'd ranges since
    ioremap() sets up the kernel page tables, which are shared by
    user processes.  pgd_ctor() copies the kernel's PGD entries into
    the user process's PGD during fork().  When allocation of the
    vmalloc ranges crosses a 512GB boundary, ioremap() allocates a new
    pud table and updates the kernel PGD entry to point to it.  If a
    user process's PGD entry does not have this update yet, a
    read/write syscall to the range will cause a vmalloc fault, which
    hits the Oops above because vmalloc_fault() does not handle large
    pages properly.
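    
    The 512GB figure follows from x86_64 4-level paging, where one PGD
    entry covers 2^39 bytes.  A small sketch of the index arithmetic;
    toy_pgd_index() and the TOY_* constants are illustrative names, and
    the addresses in the usage note are example values, not taken from
    the report.
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    #define TOY_PGDIR_SHIFT   39    /* one PGD entry maps 2^39 bytes = 512GB */
    #define TOY_PTRS_PER_PGD  512   /* 9 index bits per level */
    
    /* Index of the PGD entry that covers a given virtual address. */
    unsigned int toy_pgd_index(uint64_t addr)
    {
        unsigned int idx = (addr >> TOY_PGDIR_SHIFT) & (TOY_PTRS_PER_PGD - 1);
    
        assert(idx < TOY_PTRS_PER_PGD);
        return idx;
    }
    ```
    
    For example, 0xffffc90000000000 and the address 512GB above it fall
    under different PGD entries, so mapping past that boundary needs a
    PGD entry that an already-forked process may not have copied.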
    
    The following changes are made to vmalloc_fault():
    
    64-bit:
    
     - No change for the PGD sync operation as it handles large
       pages already.
     - Add pud_huge() and pmd_huge() to the validation code to
       handle large pages.
     - Change pud_page_vaddr() to pud_pfn() since an ioremap range
       is not directly mapped (while the if-statement still works
       with a bogus addr).
     - Change pmd_page() to pmd_pfn() since an ioremap range is not
       backed by struct page (while the if-statement still works
       with a bogus addr).
    
    32-bit:
    
     - No change for the sync operation since the index3 PGD entry
       covers the entire vmalloc range, which is always valid.
       (A separate change to sync PGD entry is necessary if this
        memory layout is changed regardless of the page size.)
     - Add pmd_huge() to the validation code to handle large pages.
       This is for completeness since vmalloc_fault() won't happen
       in ioremap'd ranges as its PGD entry is always valid.
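    
    A toy model of the validation order in the list above, not the
    kernel code: at each level the entry must be present, and a huge
    entry ends the walk because there is no lower-level table to
    descend into.  struct toy_walk, toy_validate() and the TOY_* flags
    are hypothetical; the flag values are chosen to mirror the x86
    present and page-size bits.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    
    #define TOY_PRESENT 0x01u   /* entry is valid */
    #define TOY_HUGE    0x80u   /* entry maps a large page directly */
    
    enum toy_result {
        TOY_NOT_MAPPED,
        TOY_AT_PUD,     /* mapped by a huge pud entry */
        TOY_AT_PMD,     /* mapped by a huge pmd entry */
        TOY_AT_PTE,     /* mapped by a regular pte */
    };
    
    /* The entries found for one address at each level below the PGD. */
    struct toy_walk {
        uint64_t pud;
        uint64_t pmd;
        uint64_t pte;
    };
    
    enum toy_result toy_validate(const struct toy_walk *w)
    {
        assert(w != NULL);
    
        if (!(w->pud & TOY_PRESENT))
            return TOY_NOT_MAPPED;
        if (w->pud & TOY_HUGE)
            return TOY_AT_PUD;      /* stop: no pmd table below */
    
        if (!(w->pmd & TOY_PRESENT))
            return TOY_NOT_MAPPED;
        if (w->pmd & TOY_HUGE)
            return TOY_AT_PMD;      /* stop: no pte table below */
    
        if (!(w->pte & TOY_PRESENT))
            return TOY_NOT_MAPPED;
        return TOY_AT_PTE;
    }
    ```
    
    Without the two huge checks, the walk would treat a large-page
    entry as a pointer to a lower table and read a bogus address.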
    
    Reported-by: Henning Schild <henning.schild@siemens.com>
    Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
    Acked-by: Borislav Petkov <bp@alien8.de>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Luis R. Rodriguez <mcgrof@suse.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Toshi Kani <toshi.kani@hp.com>
    Cc: linux-mm@kvack.org
    Cc: linux-nvdimm@lists.01.org
    Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e0c89043e71ae8af6a8cd4a9dac99f46b7ff1d83
Author: Toshi Kani <toshi.kani@hpe.com>
Date:   Thu Feb 11 14:24:17 2016 -0700

    x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
    
    commit a82eee7424525e34e98d821dd059ce14560a1e35 upstream.
    
    Data corruption issues were observed in tests which initiated
    a system crash/reset while accessing BTT devices.  This problem
    is reproducible.
    
    The BTT driver calls pmem_rw_bytes() to update data in pmem
    devices.  This interface calls __copy_user_nocache(), which
    uses non-temporal stores so that the stores to pmem are
    persistent.
    
    __copy_user_nocache() uses non-temporal stores when a request
    size is 8 bytes or larger (and is 8-byte aligned).  The BTT
    driver updates the BTT map table, whose entries are 4 bytes
    each.  Therefore, updates to the map table entries remain
    cached, and may not have reached pmem when a crash occurs.
    
    Change __copy_user_nocache() to use a non-temporal store when
    a request size is 4 bytes.  The change extends the current
    byte-copy path for requests smaller than 8 bytes, and does not
    add any overhead to the regular path.
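    
    A simplified sketch of the path selection described above, assuming
    a request is treated as a single unit (head and tail handling of
    larger copies is ignored) and assuming the 4-byte case also needs
    4-byte alignment.  toy_pick_path() and the enum are hypothetical
    names; the with_fix flag only contrasts the old and new behaviour.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    
    enum toy_copy_path {
        TOY_NT_8BYTE,   /* non-temporal 8-byte stores */
        TOY_NT_4BYTE,   /* non-temporal 4-byte store (new case) */
        TOY_CACHED,     /* ordinary cached byte copy */
    };
    
    enum toy_copy_path toy_pick_path(uintptr_t dst, size_t len, int with_fix)
    {
        assert(len > 0);
    
        /* Regular path: unchanged by the fix. */
        if (len >= 8 && dst % 8 == 0)
            return TOY_NT_8BYTE;
    
        /* Small-request path: the fix adds the 4-byte case here. */
        if (with_fix && len == 4 && dst % 4 == 0)
            return TOY_NT_4BYTE;
    
        return TOY_CACHED;
    }
    ```
    
    A 4-byte map entry update takes TOY_CACHED without the fix and
    TOY_NT_4BYTE with it, while larger aligned requests never reach the
    new check.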
    
    Reported-and-tested-by: Micah Parrish <micah.parrish@hpe.com>
    Reported-and-tested-by: Brian Boylston <brian.boylston@hpe.com>
    Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Luis R. Rodriguez <mcgrof@suse.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Toshi Kani <toshi.kani@hp.com>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Cc: linux-nvdimm@lists.01.org
    Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe.com
    [ Small readability edits. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1e2e0ad1cc1643fdfea0610805c253f53ec1aaa1
Author: Toshi Kani <toshi.kani@hpe.com>
Date:   Thu Feb 11 14:24:16 2016 -0700

    x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
    
    commit ee9737c924706aaa72c2ead93e3ad5644681dc1c upstream.
    
    Add comments to __copy_user_nocache() to clarify its procedures
    and alignment requirements.
    
    Also change numeric branch target labels to named local labels.
    
    No code changed:
    
     arch/x86/lib/copy_user_64.o:
    
        text    data     bss     dec     hex filename
        1239       0       0    1239     4d7 copy_user_64.o.before
        1239       0       0    1239     4d7 copy_user_64.o.after
    
     md5:
        58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.before.asm
        58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.after.asm
    
    Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Luis R. Rodriguez <mcgrof@suse.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Toshi Kani <toshi.kani@hp.com>
    Cc: brian.boylston@hpe.com
    Cc: dan.j.williams@intel.com
    Cc: linux-nvdimm@lists.01.org
    Cc: micah.parrish@hpe.com
    Cc: ross.zwisler@linux.intel.com
    Cc: vishal.l.verma@intel.com
    Link: http://lkml.kernel.org/r/1455225857-12039-2-git-send-email-toshi.kani@hpe.com
    [ Small readability edits and added object file comparison. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4f298c10c35dbaa3c328bc5360b9c46cc52d9f93
Author: Matt Fleming <matt@codeblueprint.co.uk>
Date:   Fri Jan 29 11:36:10 2016 +0000

    x86/mm/pat: Avoid truncation when converting cpa->numpages to address
    
    commit 742563777e8da62197d6cb4b99f4027f59454735 upstream.
    
    There are a couple of nasty truncation bugs lurking in the pageattr
    code that can be triggered when mapping EFI regions, e.g. when we pass
    a cpa->pgd pointer. Because cpa->numpages is a 32-bit value, shifting
    left by PAGE_SHIFT will truncate the resultant address to 32 bits.
    
    Viorel-Cătălin managed to trigger this bug on his Dell machine that
    provides a ~5GB EFI region which requires 1236992 pages to be mapped.
    When calling populate_pud() the end of the region gets calculated
    incorrectly in the following buggy expression,
    
      end = start + (cpa->numpages << PAGE_SHIFT);
    
    And only 188416 pages are mapped. Next, populate_pud() gets invoked
    for a second time because of the loop in __change_page_attr_set_clr(),
    only this time no pages get mapped because shifting the remaining
    number of pages (1048576) by PAGE_SHIFT is zero. At that point the
    loop in __change_page_attr_set_clr() spins forever because we fail to
    make progress.
    
    Hitting this bug depends very much on the virtual address we pick to
    map the large region at and how many pages we map on the initial run
    through the loop. This explains why this issue was only recently hit
    with the introduction of commit
    
      a5caa209ba9c ("x86/efi: Fix boot crash by mapping EFI memmap
       entries bottom-up at runtime, instead of top-down")
    
    It's interesting to note that safe uses of cpa->numpages do exist in
    the pageattr code. If instead of shifting ->numpages we multiply by
    PAGE_SIZE, no truncation occurs because PAGE_SIZE is a UL value, and
    so the result is unsigned long.
    
    To avoid surprises when users try to convert very large cpa->numpages
    values to addresses, change the data type from 'int' to 'unsigned
    long', thereby making it suitable for shifting by PAGE_SHIFT without
    any type casting.
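    
    The numbers above can be reproduced with a short sketch.  It uses
    uint32_t rather than int for the narrow case, since overflowing a
    signed shift is undefined in C, and uint64_t to model a 64-bit
    unsigned long; the toy_*() names are hypothetical and the page
    shift of 12 assumes 4KB pages.
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    #define TOY_PAGE_SHIFT 12   /* 4KB pages */
    
    /* Shift performed in 32 bits, then widened: the high bits are lost. */
    uint64_t toy_bytes_narrow(uint32_t numpages)
    {
        uint32_t bytes = numpages << TOY_PAGE_SHIFT;
    
        return bytes;
    }
    
    /* Shift performed in 64 bits: nothing is lost for these sizes. */
    uint64_t toy_bytes_wide(uint64_t numpages)
    {
        assert(numpages <= (UINT64_MAX >> TOY_PAGE_SHIFT));
        return numpages << TOY_PAGE_SHIFT;
    }
    ```
    
    With 1236992 pages, the narrow version yields the size of only
    188416 pages, and with the remaining 1048576 pages it yields zero,
    matching the two passes through populate_pud() described above.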
    
    The alternative would be to make liberal use of casting, but that is
    far more likely to cause problems in the future when someone adds more
    code and fails to cast properly; this bug was difficult enough to
    track down in the first place.
    
    Reported-and-tested-by: Viorel-Cătălin Răpițeanu <rapiteanu.catalin@gmail.com>
    Acked-by: Borislav Petkov <bp@alien8.de>
    Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
    Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=110131
    Link: http://lkml.kernel.org/r/1454067370-10374-1-git-send-email-matt@codeblueprint.co.uk
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 75a101ba31faf1047da55e26338d800fba9a86bc
Author: Jan Beulich <JBeulich@suse.com>
Date:   Tue Jan 26 04:15:18 2016 -0700

    x86/mm: Fix types used in pgprot cacheability flags translations
    
    commit 3625c2c234ef66acf21a72d47a5ffa94f6c5ebf2 upstream.
    
    For PAE kernels "unsigned long" is not suitable to hold page protection
    flags, since _PAGE_NX doesn't fit there. This is the reason for quite a
    few W+X pages getting reported as insecure during boot (observed
    notably for the entire initrd range).
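    
    A brief sketch of why the type matters, assuming the NX flag sits
    in bit 63 of a 64-bit page table entry and using uint32_t to model
    the 32-bit "unsigned long" of a PAE kernel.  The TOY_* constants
    and toy_*() helpers are hypothetical names.
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    #define TOY_PAGE_RW (1ULL << 1)
    #define TOY_PAGE_NX (1ULL << 63)    /* does not fit in 32 bits */
    
    /* Flags pass through a 32-bit temporary: the upper half is dropped. */
    uint64_t toy_translate_narrow(uint64_t prot)
    {
        uint32_t val = (uint32_t)prot;
    
        return val;
    }
    
    /* Flags stay in a 64-bit type throughout: NX survives. */
    uint64_t toy_translate_wide(uint64_t prot)
    {
        uint64_t val = prot;
    
        assert((val & TOY_PAGE_NX) == (prot & TOY_PAGE_NX));
        return val;
    }
    ```
    
    A writable, non-executable mapping that goes through the narrow
    version comes back writable and executable, which is what a W+X
    check would then report.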
    
    Fixes: 281d4078be ("x86: Make page cache mode a real type")
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Juergen Gross <JGross@suse.com>
    Link: http://lkml.kernel.org/r/56A7635602000078000CAFF1@prv-mh.provo.novell.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>