aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-06-04Merge branch 'siginfo-linus' of ↵HEADmasterLinus Torvalds83-1066/+493
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "This set of changes close the known issues with setting si_code to an invalid value, and with not fully initializing struct siginfo. There remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64 and x86 to get the code that generates siginfo into a simpler and more maintainable state. Most of that work involves refactoring the signal handling code and thus careful code review. Also not included is the work to shrink the in kernel version of struct siginfo. That depends on getting the number of places that directly manipulate struct siginfo under control, as it requires the introduction of struct kernel_siginfo for the in kernel things. Overall this set of changes looks like it is making good progress, and with a little luck I will be wrapping up the siginfo work next development cycle" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) signal/sh: Stop gcc warning about an impossible case in do_divide_error signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions signal/um: More carefully relay signals in relay_signal. signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code signal/signalfd: Add support for SIGSYS signal/signalfd: Remove __put_user from signalfd_copyinfo signal/xtensa: Use force_sig_fault where appropriate signal/xtensa: Consistenly use SIGBUS in do_unaligned_user signal/um: Use force_sig_fault where appropriate signal/sparc: Use force_sig_fault where appropriate signal/sparc: Use send_sig_fault where appropriate signal/sh: Use force_sig_fault where appropriate signal/s390: Use force_sig_fault where appropriate signal/riscv: Replace do_trap_siginfo with force_sig_fault signal/riscv: Use force_sig_fault where appropriate signal/parisc: Use force_sig_fault where appropriate signal/parisc: Use force_sig_mceerr where appropriate signal/openrisc: Use force_sig_fault where appropriate signal/nios2: Use force_sig_fault where appropriate ...
2018-06-04Merge branch 'userns-linus' of ↵Linus Torvalds6-19/+52
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns updates from Eric Biederman: "This is the last couple of vfs bits to enable root in a user namespace to mount and manipulate a filesystem with backing store (AKA not a virtual filesystem like proc, but a filesystem where the unprivileged user controls the content). The target filesystem for this work is fuse, and Miklos should be sending you the pull request for the fuse bits this merge window. The two key patches are "evm: Don't update hmacs in user ns mounts" and "vfs: Don't allow changing the link count of an inode with an invalid uid or gid". Those close small gaps in the vfs that would be a problem if an unprivileged fuse filesystem is mounted. The rest of the changes are things that are now safe to allow a root user in a user namespace to do with a filesystem they have mounted. The most interesting development is that remount is now safe" * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems capabilities: Allow privileged user in s_user_ns to set security.* xattrs fs: Allow superblock owner to access do_remount_sb() fs: Allow superblock owner to replace invalid owners of inodes vfs: Allow userns root to call mknod on owned filesystems. vfs: Don't allow changing the link count of an inode with an invalid uid or gid evm: Don't update hmacs in user ns mounts
2018-06-04Merge tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds28-434/+1346
Pull cifs updates from Steve French: - smb3 fixes for stable - addition of ftrace hooks for cifs.ko - improvements in compounding and smbdirect (rdma) * tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (38 commits) CIFS: Add support for direct pages in wdata CIFS: Use offset when reading pages CIFS: Add support for direct pages in rdata cifs: update multiplex loop to handle compounded responses cifs: remove header_preamble_size where it is always 0 cifs: remove struct smb2_hdr CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session expiry, status STATUS_NETWORK_SESSION_EXPIRED, however the server can also respond with STATUS_USER_SESSION_DELETED in cases where the session has been idle for some time and the server reaps the session to recover resources. cifs: change smb2_get_data_area_len to take a smb2_sync_hdr as argument cifs: update smb2_calc_size to use smb2_sync_hdr instead of smb2_hdr cifs: remove struct smb2_oplock_break_rsp cifs: remove rfc1002 header from all SMB2 response structures smb3: on reconnect set PreviousSessionId field smb3: Add posix create context for smb3.11 posix mounts smb3: add tracepoints for smb2/smb3 open cifs: add debug output to show nocase mount option smb3: add define for id for posix create context and corresponding struct cifs: update smb2_check_message to handle PDUs without a 4 byte length header smb3: allow "posix" mount option to enable new SMB311 protocol extensions smb3: add support for posix negotiate context cifs: allow disabling less secure legacy dialects ...
2018-06-04Merge tag 'gfs2-4.18.fixes' of ↵Linus Torvalds12-246/+332
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "We've got nine more patches for this merge window. - remove sd_jheightsize to greatly simplify some code (Andreas Gruenbacher) - fix some comments (Andreas) - fix a glock recursion bug when allocation errors occur (Andreas) - improve the hole_size function so it returns the entire hole rather than figuring it out piecemeal (Andreas) - clean up gfs2_stuffed_write_end to remove a lot of redundancy (Andreas) - clarify code with regard to the way ordered writes are processed (Andreas) - a bunch of improvements and cleanups of the iomap code to pave the way for iomap writes, which is a future patch set (Andreas) - fix a bug where block reservations can run off the end of a bitmap (Bob Peterson) - add Andreas to the MAINTAINERS file (Bob Peterson)" * tag 'gfs2-4.18.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2 gfs2: Iomap cleanups and improvements gfs2: Remove ordered write mode handling from gfs2_trans_add_data gfs2: gfs2_stuffed_write_end cleanup gfs2: hole_size improvement GFS2: gfs2_free_extlen can return an extent that is too long GFS2: Fix allocation error bug with recursive rgrp glocking gfs2: Update find_metapath comment gfs2: Remove sdp->sd_jheightsize
2018-06-04Merge tag 'dlm-4.18' of ↵Linus Torvalds1-2/+14
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "These three commits fix and clean up the flags dlm was using on its SCTP sockets. This improves performance and fixes some bad connection delays" * tag 'dlm-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: remove O_NONBLOCK flag in sctp_connect_to_sock dlm: make sctp_connect_to_sock() return in specified time dlm: fix a clerical error when set SCTP_NODELAY
2018-06-04Merge tag 'for-4.18-tag' of ↵Linus Torvalds49-2885/+3529
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible features: - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor of GET/SETFLAGS; now supports only existing flags: append, immutable, noatime, nodump, sync - 3 new unprivileged ioctls to allow users to enumerate subvolumes - dedupe syscall implementation does not restrict the range to 16MiB, though it still splits the whole range to 16MiB chunks - on user demand, rmdir() is able to delete an empty subvolume, export the capability in sysfs - fix inode number types in tracepoints, other cleanups - send: improved speed when dealing with a large removed directory, measurements show decrease from 2000 minutes to 2 minutes on a directory with 2 million entries - pre-commit check of superblock to detect a mysterious in-memory corruption - log message updates Other changes: - orphan inode cleanup improved, does no keep long-standing reservations that could lead up to early ENOSPC in some cases - slight improvement of handling snapshotted NOCOW files by avoiding some unnecessary tree searches - avoid OOM when dealing with many unmergeable small extents at flush time - speedup conversion of free space tree representations from/to bitmap/tree - code refactoring, deletion, cleanups: + delayed refs + delayed iput + redundant argument removals + memory barrier cleanups + remove a redundant mutex supposedly excluding several ioctls to run in parallel - new tracepoints for blockgroup manipulation - more sanity checks of compressed headers" * tag 'for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (183 commits) btrfs: Add unprivileged version of ino_lookup ioctl btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF btrfs: Add unprivileged ioctl which returns subvolume information Btrfs: clean up error handling in btrfs_truncate() btrfs: Factor out write portion of btrfs_get_blocks_direct btrfs: Factor out read portion of btrfs_get_blocks_direct btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist btrfs: raid56: Remove VLA usage btrfs: return error value if create_io_em failed in cow_file_range btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot btrfs: drop unused parameter qgroup_reserved btrfs: balance dirty metadata pages in btrfs_finish_ordered_io btrfs: lift some btrfs_cross_ref_exist checks in nocow path btrfs: Remove fs_info argument from btrfs_uuid_tree_rem btrfs: Remove fs_info argument from btrfs_uuid_tree_add Btrfs: remove unused check of skip_locking Btrfs: remove always true check in unlock_up Btrfs: grab write lock directly if write_lock_level is the max level Btrfs: move get root out of btrfs_search_slot to a helper Btrfs: use more straightforward extent_buffer_uptodate check ...
2018-06-04Merge tag 'affs-for-4.18-tag' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull affs fix from David Sterba: "A potential memory leak fix for AFFS" * tag 'affs-for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: fix potential memory leak when parsing option 'prefix'
2018-06-04Merge branch 'work.aio-1' of ↵Linus Torvalds99-595/+849
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull aio updates from Al Viro: "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly. The only thing I'm holding back for a day or so is Adam's aio ioprio - his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case), but let it sit in -next for decency sake..." * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) aio: sanitize the limit checking in io_submit(2) aio: fold do_io_submit() into callers aio: shift copyin of iocb into io_submit_one() aio_read_events_ring(): make a bit more readable aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way aio: take list removal to (some) callers of aio_complete() aio: add missing break for the IOCB_CMD_FDSYNC case random: convert to ->poll_mask timerfd: convert to ->poll_mask eventfd: switch to ->poll_mask pipe: convert to ->poll_mask crypto: af_alg: convert to ->poll_mask net/rxrpc: convert to ->poll_mask net/iucv: convert to ->poll_mask net/phonet: convert to ->poll_mask net/nfc: convert to ->poll_mask net/caif: convert to ->poll_mask net/bluetooth: convert to ->poll_mask net/sctp: convert to ->poll_mask net/tipc: convert to ->poll_mask ...
2018-06-04Merge branch 'work.lookup' of ↵Linus Torvalds26-451/+293
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache lookup cleanups from Al Viro: "Cleaning ->lookup() instances up - mostly d_splice_alias() conversions" * 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (29 commits) switch the rest of procfs lookups to d_splice_alias() procfs: switch instantiate_t to d_splice_alias() don't bother with tid_fd_revalidate() in lookups proc_lookupfd_common(): don't bother with instantiate unless the file is open procfs: get rid of ancient BS in pid_revalidate() uses cifs_lookup(): switch to d_splice_alias() cifs_lookup(): cifs_get_inode_...() never returns 0 with *inode left NULL 9p: unify paths in v9fs_vfs_lookup() ncp_lookup(): use d_splice_alias() hfsplus: switch to d_splice_alias() hfs: don't allow mounting over .../rsrc hfs: use d_splice_alias() omfs_lookup(): report IO errors, use d_splice_alias() orangefs_lookup: simplify openpromfs: switch to d_splice_alias() xfs_vn_lookup: simplify a bit adfs_lookup: do not fail with ENOENT on negatives, use d_splice_alias() adfs_lookup_byname: .. *is* taken care of in fs/namei.c romfs_lookup: switch to d_splice_alias() qnx6_lookup: switch to d_splice_alias() ...
2018-06-04Merge tag 'locks-v4.18-1' of ↵Linus Torvalds2-9/+8
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull fasync fix from Jeff Layton: "Just a single fix for a deadlock in the fasync handling code that Kirill observed while testing. The fix is to change the fa_lock to be rwlock_t, and use a read lock in kill_fasync_rcu" * tag 'locks-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fasync: Fix deadlock between task-context and interrupt-context kill_fasync()
2018-06-04Merge tag 'docs-4.18' of git://git.lwn.net/linuxLinus Torvalds166-2920/+5535
Pull documentation updates from Jonathan Corbet: "There's been a fair amount of work in the docs tree this time around, including: - Extensive RST conversions and organizational work in the memory-management docs thanks to Mike Rapoport. - An update of Documentation/features from Andrea Parri and a script to keep it updated. - Various LICENSES updates from Thomas, along with a script to check SPDX tags. - Work to fix dangling references to documentation files; this involved a fair number of one-liner comment changes outside of Documentation/ ... and the usual list of documentation improvements, typo fixes, etc" * tag 'docs-4.18' of git://git.lwn.net/linux: (103 commits) Documentation: document hung_task_panic kernel parameter docs/admin-guide/mm: add high level concepts overview docs/vm: move ksm and transhuge from "user" to "internals" section. docs: Use the kerneldoc comments for memalloc_no*() doc: document scope NOFS, NOIO APIs docs: update kernel versions and dates in tables docs/vm: transhuge: split userspace bits to admin-guide/mm/transhuge docs/vm: transhuge: minor updates docs/vm: transhuge: change sections order Documentation: arm: clean up Marvell Berlin family info Documentation: gpio: driver: Fix a typo and some odd grammar docs: ranoops.rst: fix location of ramoops.txt scripts/documentation-file-ref-check: rewrite it in perl with auto-fix mode docs: uio-howto.rst: use a code block to solve a warning mm, THP, doc: Add document for thp_swpout/thp_swpout_fallback w1: w1_io.c: fix a kernel-doc warning Documentation/process/posting: wrap text at 80 cols docs: admin-guide: add cgroup-v2 documentation Revert "Documentation/features/vm: Remove arch support status file for 'pte_special'" Documentation: refcount-vs-atomic: Update reference to LKMM doc. ...
2018-06-04swait: strengthen language to discourage useLinus Torvalds1-1/+13
We already earlier discouraged people from using this interface in commit 88796e7e5c45 ("sched/swait: Document it clearly that the swait facilities are special and shouldn't be used"), but I just got a pull request with a new broken user. So make the comment *really* clear. The swait interfaces are bad, and should not be used unless you have some *very* strong reasons that include tons of hard performance numbers on just why you want to use them, and you show that you actually understand that they aren't at all like the normal wait/wakeup interfaces. So far, every single user has been suspect. The main user is KVM, which is completely pointless (there is only ever one waiter, which avoids the interface subtleties, but also means that having a queue instead of a pointer is counter-productive and certainly not an "optimization"). So make the comments much stronger. Not that anybody likely reads them anyway, but there's always some slight hope that it will cause somebody to think twice. I'd like to remove this interface entirely, but there is the theoretical possibility that it's actually the right thing to use in some situation, most likely some deep RT use. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-04Merge tag 'regmap-v4.18' of ↵Linus Torvalds3-3/+21
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "This is another quiet release for regmap, there's one minor feature improvement for the recently added slimbus support and a few minor fixes and cleanups" * tag 'regmap-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: slimbus: allow register offsets up to 16 bits regmap: add missing prototype for devm_init_slimbus regmap: Skip clk_put for attached clocks when freeing context regmap: include <linux/ktime.h> from include/linux/regmap.h
2018-06-04Merge tag 'spi-v4.18' of ↵Linus Torvalds28-1309/+1423
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "Quite a busy release for SPI, mainly as a result of Boris Brezillon's work on improving the integration with MTD for accelerated SPI flash controllers. He's added a new spi_mem interface which works a lot better with general hardware and converted the users over to it, as a result of this work we've got some MTD changes in here as well. Other highlights include: - Lots of spring cleaning for the s3c64xx driver. - Removal of the bcm53xx, the hardware is also supported by the mspi driver but SoC naming had caused people to miss the duplication. - Conversion of the pxa2xx driver to use the standard message processing loop rather than open coding. - A bunch of improvements to the runtime PM of the OMAP McSPI driver" * tag 'spi-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (47 commits) spi: Fix typo on SPI_MEM help text spi: sh-msiof: Fix setting SIRMDR1.SYNCAC to match SITMDR1.SYNCAC mtd: devices: m25p80: Use spi_mem_set_drvdata() instead of spi_set_drvdata() spi: omap2-mcspi: Remove unnecessary pm_runtime_force_suspend() spi: Add missing pm_runtime_put_noidle() after failed get spi: ti-qspi: Make sure res_mmap != NULL before dereferencing it spi: spi-s3c64xx: Fix system resume support spi: bcm-qspi: Fix build failure caused by spi_flash_read() API removal spi: Get rid of the spi_flash_read() API mtd: spi-nor: Use the spi_mem_xx() API spi: ti-qspi: Implement the spi_mem interface spi: bcm-qspi: Implement the spi_mem interface spi: Make support for regular transfers optional when ->mem_ops != NULL spi: Extend the core to ease integration of SPI memory controllers spi: remove forgotten CONFIG_SPI_BCM53XX spi: remove the older/duplicated bcm53xx driver spi: pxa2xx: check clk_prepare_enable() return value spi: lpspi: Switch to SPDX identifier spi: mxs: Switch to SPDX identifier spi: imx: Switch to SPDX identifier ...
2018-06-04Merge tag 'chrome-platform-for-linus-4.18' of ↵Linus Torvalds9-57/+428
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: - further changes from Dmitry related to the removal of platform data from atmel_mxt_ts and chromeos_laptop. This time, we have some changes that teach chromeos_laptop how to supply acpi properties for some input devices so that the peripheral driver doesn't have to do dmi matching on some Chromebook platforms. - new Chromebook Tablet switch driver, which is useful for x86 convertible Chromebooks. - other misc cleanup * tag 'chrome-platform-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome: Use to_cros_ec_dev more broadly platform/chrome: chromeos_laptop: fix touchpad button mapping on Celes platform: chrome: Add input dependency for tablet switch driver platform/chrome: chromeos_laptop - supply properties for ACPI devices platform/chrome: chromeos_tbmc - add SPDX identifier platform: chrome: Add Tablet Switch ACPI driver platform/chrome: cros_ec_lpc: do not try DMI match when ACPI device found
2018-06-04Merge tag 'hwmon-for-linus-v4.18' of ↵Linus Torvalds14-151/+393
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon updates from Guenter Roeck: - asus_atk0110 driver modified to use new API - k10temp supports new CPUs and reports both Tctl and Tdie - minor fixes in gpio-fan, ltc2990, fschmd, and mc13783 drivers * tag 'hwmon-for-linus-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (asus_atk0110) Make use of device managed memory hwmon: (asus_atk0110) Replace deprecated device register call hwmon: (k10temp) Make function get_raw_temp static hwmon: (gpio-fan) Fix "#cooling-cells" property name in bindings MAINTAINERS: hwmon: Add Documentation/devicetree/bindings/hwmon hwmon: (ltc2990) support all measurement modes hwmon: (ltc2990) add devicetree binding hwmon: (ltc2990) Fix incorrect conversion of negative temperatures hwmon: (core) check parent dev != NULL when chip != NULL hwmon: (fschmd) fix typo 'can by' to 'can be' hwmon: (k10temp) Display both Tctl and Tdie hwmon: (k10temp) Add support for Stoney Ridge and Bristol Ridge CPUs hwmon: MC13783: Add uid and die temperature sensor inputs
2018-06-04Merge tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds141-1392/+624
Pull dma-mapping updates from Christoph Hellwig: - replace the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits) dma-direct: don't crash on device without dma_mask nds32: use generic dma_noncoherent_ops nds32: implement the unmap_sg DMA operation nds32: consolidate DMA cache maintainance routines x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag x86/pci-dma: remove the explicit nodac and allowdac option x86/pci-dma: remove the experimental forcesac boot option Documentation/x86: remove a stray reference to pci-nommu.c core, dma-direct: add a flag 32-bit dma limits dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs dma-debug: check scatterlist segments c6x: use generic dma_noncoherent_ops arc: use generic dma_noncoherent_ops arc: fix arc_dma_{map,unmap}_page arc: fix arc_dma_sync_sg_for_{cpu,device} arc: simplify arc_dma_sync_single_for_{cpu,device} dma-mapping: provide a generic dma-noncoherent implementation dma-mapping: simplify Kconfig dependencies riscv: add swiotlb support riscv: only enable ZONE_DMA32 for 64-bit ...
2018-06-04Merge branch 'work.misc' of ↵Linus Torvalds13-118/+69
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Misc bits and pieces not fitting into anything more specific" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: delete unnecessary assignment in vfs_listxattr Documentation: filesystems: update filesystem locking documentation vfs: namei: use path_equal() in follow_dotdot() fs.h: fix outdated comment about file flags __inode_security_revalidate() never gets NULL opt_dentry make xattr_getsecurity() static vfat: simplify checks in vfat_lookup() get rid of dead code in d_find_alias() it's SB_BORN, not MS_BORN... msdos_rmdir(): kill BS comment remove rpc_rmdir() fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range()
2018-06-04Merge branch 'hch.procfs' of ↵Linus Torvalds266-6032/+1229
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull procfs updates from Al Viro: "Christoph's proc_create_... cleanups series" * 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits) xfs, proc: hide unused xfs procfs helpers isdn/gigaset: add back gigaset_procinfo assignment proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields tty: replace ->proc_fops with ->proc_show ide: replace ->proc_fops with ->proc_show ide: remove ide_driver_proc_write isdn: replace ->proc_fops with ->proc_show atm: switch to proc_create_seq_private atm: simplify procfs code bluetooth: switch to proc_create_seq_data netfilter/x_tables: switch to proc_create_seq_private netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data neigh: switch to proc_create_seq_data hostap: switch to proc_create_{seq,single}_data bonding: switch to proc_create_seq_data rtc/proc: switch to proc_create_single_data drbd: switch to proc_create_single resource: switch to proc_create_seq_data staging/rtl8192u: simplify procfs code jfs: simplify procfs code ...
2018-06-04Merge branch 'work.rmdir' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull rmdir update from Al Viro: "More shrink_dcache_parent()-related stuff - killing the main source of potentially contended calls of that on large subtrees" * 'work.rmdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rmdir(),rename(): do shrink_dcache_parent() only on success
2018-06-04Merge branch 'work.dcache' of ↵Linus Torvalds1-79/+43
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache updates from Al Viro: "This is the first part of dealing with livelocks etc around shrink_dcache_parent()." * 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: restore cond_resched() in shrink_dcache_parent() dput(): turn into explicit while() loop dcache: move cond_resched() into the end of __dentry_kill() d_walk(): kill 'finish' callback d_invalidate(): unhash immediately
2018-06-04Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-blockLinus Torvalds212-3984/+4020
Pull block updates from Jens Axboe: - clean up how we pass around gfp_t and blk_mq_req_flags_t (Christoph) - prepare us to defer scheduler attach (Christoph) - clean up drivers handling of bounce buffers (Christoph) - fix timeout handling corner cases (Christoph/Bart/Keith) - bcache fixes (Coly) - prep work for bcachefs and some block layer optimizations (Kent). - convert users of bio_sets to using embedded structs (Kent). - fixes for the BFQ io scheduler (Paolo/Davide/Filippo) - lightnvm fixes and improvements (Matias, with contributions from Hans and Javier) - adding discard throttling to blk-wbt (me) - sbitmap blk-mq-tag handling (me/Omar/Ming). - remove the sparc jsflash block driver, acked by DaveM. - Kyber scheduler improvement from Jianchao, making it more friendly wrt merging. - conversion of symbolic proc permissions to octal, from Joe Perches. Previously the block parts were a mix of both. - nbd fixes (Josef and Kevin Vigor) - unify how we handle the various kinds of timestamps that the block core and utility code uses (Omar) - three NVMe pull requests from Keith and Christoph, bringing AEN to feature completeness, file backed namespaces, cq/sq lock split, and various fixes - various little fixes and improvements all over the map * tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits) blk-mq: update nr_requests when switching to 'none' scheduler block: don't use blocking queue entered for recursive bio submits dm-crypt: fix warning in shutdown path lightnvm: pblk: take bitmap alloc. out of critical section lightnvm: pblk: kick writer on new flush points lightnvm: pblk: only try to recover lines with written smeta lightnvm: pblk: remove unnecessary bio_get/put lightnvm: pblk: add possibility to set write buffer size manually lightnvm: fix partial read error path lightnvm: proper error handling for pblk_bio_add_pages lightnvm: pblk: fix smeta write error path lightnvm: pblk: garbage collect lines with failed writes lightnvm: pblk: rework write error recovery path lightnvm: pblk: remove dead function lightnvm: pass flag on graceful teardown to targets lightnvm: pblk: check for chunk size before allocating it lightnvm: pblk: remove unnecessary argument lightnvm: pblk: remove unnecessary indirection lightnvm: pblk: return NVM_ error on failed submission lightnvm: pblk: warn in case of corrupted write buffer ...
2018-06-04MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2Bob Peterson1-1/+1
Add Andreas Gruenbacher as a maintainer for the gfs2 file system and remove Steve Whitehouse. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Iomap cleanups and improvementsAndreas Gruenbacher4-95/+126
Clean up gfs2_iomap_alloc and gfs2_iomap_get. Document how gfs2_iomap_alloc works: it now needs to be called separately after gfs2_iomap_get where necessary; this will be used later by iomap write. Move gfs2_iomap_ops into bmap.c. Introduce a new gfs2_iomap_get_alloc helper and use it in fallocate_chunk: gfs2_iomap_begin will become unsuitable for fallocate with proper iomap write support. In gfs2_block_map and fallocate_chunk, zero-initialize struct iomap. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Remove ordered write mode handling from gfs2_trans_add_dataAndreas Gruenbacher5-28/+30
In journaled data mode, we need to add each buffer head to the current transaction. In ordered write mode, we only need to add the inode to the ordered inode list. So far, both cases are handled in gfs2_trans_add_data. This makes the code look misleading and is inefficient for small block sizes as well. Handle both cases separately instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: gfs2_stuffed_write_end cleanupAndreas Gruenbacher1-31/+18
First, change the sanity check in gfs2_stuffed_write_end to check for the actual write size instead of the requested write size. Second, use the existing teardown code in gfs2_write_end instead of duplicating it in gfs2_stuffed_write_end. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: hole_size improvementAndreas Gruenbacher1-57/+153
Reimplement function hole_size based on a generic function for walking the metadata tree and rename hole_size to gfs2_hole_size. While previously, multiple invocations of hole_size were sometimes needed to walk across the entire hole, the new implementation always returns the entire hole at once (provided that the caller is interested in the total size). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04GFS2: gfs2_free_extlen can return an extent that is too longBob Peterson2-1/+2
Function gfs2_free_extlen calculates the length of an extent of free blocks that may be reserved. The end pointer was calculated as end = start + bh->b_size but b_size is incorrect because the bitmap usually stops prior to the end of the buffer data on the last bitmap. What this means is that when you do a write, you can reserve a chunk of blocks that runs off the end of the last bitmap. For example, I've got a file system where there is only one bitmap for each rgrp, so ri_length==1. I saw cases in which iozone tried to do a big write, grabbed a large block reservation, chose rgrp 5464152, which has ri_data0 5464153 and ri_data 8188. So 5464153 + 8188 = 5472341 which is the end of the rgrp. When it grabbed a reservation it got back: 5470936, length 7229. But 5470936 + 7229 = 5478165. So the reservation starts inside the rgrp but runs 5824 blocks past the end of the bitmap. This patch fixes the calculation so it won't exceed the last bitmap. It also adds a BUG_ON to guard against overflows in the future. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04GFS2: Fix allocation error bug with recursive rgrp glockingAndreas Gruenbacher1-5/+8
Before this patch function gfs2_write_begin, upon discovering an error, called gfs2_trim_blocks while the rgrp glock was still held. That's because gfs2_inplace_release is not called until later. This patch reorganizes the logic a bit so gfs2_inplace_release is called to release the lock prior to the call to gfs2_trim_blocks, thus preventing the glock recursion. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Update find_metapath commentAndreas Gruenbacher1-3/+2
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04Merge branch 'regmap-4.17' into regmap-4.18 for the merge windowMark Brown1-1/+2
2018-06-04Merge branch 'spi-4.17' into spi-4.18 for the merge windowMark Brown4-12/+35
2018-06-03Linux 4.17Linus Torvalds1-1/+1
2018-06-03Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds4-21/+34
Pull vfs fixes from Al Viro. - fix io_destroy()/aio_complete() race - the vfs_open() change to get rid of open_check_o_direct() boilerplate was nice, but buggy. Al has a patch avoiding a revert, but that's definitely not a last-day fodder, so for now revert it is... * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Revert "fs: fold open_check_o_direct into do_dentry_open" fix io_destroy()/aio_complete() race
2018-06-03Revert "fs: fold open_check_o_direct into do_dentry_open"Al Viro3-19/+33
This reverts commit cab64df194667dc5d9d786f0a895f647f5501c0d. Having vfs_open() in some cases drop the reference to struct file combined with error = vfs_open(path, f, cred); if (error) { put_filp(f); return ERR_PTR(error); } return f; is flat-out wrong. It used to be error = vfs_open(path, f, cred); if (!error) { /* from now on we need fput() to dispose of f */ error = open_check_o_direct(f); if (error) { fput(f); f = ERR_PTR(error); } } else { put_filp(f); f = ERR_PTR(error); } and sure, having that open_check_o_direct() boilerplate gotten rid of is nice, but not that way... Worse, another call chain (via finish_open()) is FUBAR now wrt FILE_OPENED handling - in that case we get error returned, with file already hit by fput() *AND* FILE_OPENED not set. Guess what happens in path_openat(), when it hits if (!(opened & FILE_OPENED)) { BUG_ON(!error); put_filp(file); } The root cause of all that crap is that the callers of do_dentry_open() have no way to tell which way did it fail; while that could be fixed up (by passing something like int *opened to do_dentry_open() and have it marked if we'd called ->open()), it's probably much too late in the cycle to do so right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-03Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds3-18/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: - two patches addressing the problem that the scheduler allows under certain conditions user space tasks to be scheduled on CPUs which are not yet fully booted which causes a few subtle and hard to debug issue - add a missing runqueue clock update in the deadline scheduler which triggers a warning under certain circumstances - fix a silly typo in the scheduler header file * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/headers: Fix typo sched/deadline: Fix missing clock update sched/core: Require cpu_active() in select_task_rq(), for user tasks sched/core: Fix rules for running on online && !active CPUs
2018-06-03Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds9-21/+185
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Thomas Gleixner: - fix 'perf test Session topology' segfault on s390 (Thomas Richter) - fix NULL return handling in bpf__prepare_load() (YueHaibing) - fix indexing on Coresight ETM packet queue decoder (Mathieu Poirier) - fix perf.data format description of NRCPUS header (Arnaldo Carvalho de Melo) - update perf.data documentation section on cpu topology - handle uncore event aliases in small groups properly (Kan Liang) - add missing perf_sample.addr into python sample dictionary (Leo Yan) * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Fix perf.data format description of NRCPUS header perf script python: Add addr into perf sample dict perf data: Update documentation section on cpu topology perf cs-etm: Fix indexing for decoder packet queue perf bpf: Fix NULL return handling in bpf__prepare_load() perf test: "Session topology" dumps core on s390 perf parse-events: Handle uncore event aliases in small groups properly
2018-06-02blk-mq: update nr_requests when switching to 'none' schedulerMing Lei1-0/+1
Now we setup q->nr_requests when switching to one new scheduler, but not do it for 'none', then q->nr_requests may not be correct for 'none'. This patch fixes this issue by always updating 'nr_requests' when switching to 'none'. Cc: Marco Patalano <mpatalan@redhat.com> Cc: "Ewan D. Milne" <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-02block: don't use blocking queue entered for recursive bio submitsJens Axboe3-1/+15
If we end up splitting a bio and the queue goes away between the initial submission and the later split submission, then we can block forever in blk_queue_enter() waiting for the reference to drop to zero. This will never happen, since we already hold a reference. Mark a split bio as already having entered the queue, so we can just use the live non-blocking queue enter variant. Thanks to Tetsuo Handa for the analysis. Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-02dm-crypt: fix warning in shutdown pathKent Overstreet1-4/+3
The counter for the number of allocated pages includes pages in the mempool's reserve, so checking that the number of allocated pages is 0 needs to happen after we exit the mempool. Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()") Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Fixed to always just use percpu_counter_sum() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds38-115/+192
Pull networking fixes from David Miller: 1) Infinite loop in _decode_session6(), from Eric Dumazet. 2) Pass correct argument to nla_strlcpy() in netfilter, also from Eric Dumazet. 3) Out of bounds memory access in ipv6 srh code, from Mathieu Xhonneux. 4) NULL deref in XDP_REDIRECT handling of tun driver, from Toshiaki Makita. 5) Incorrect idr release in cls_flower, from Paul Blakey. 6) Probe error handling fix in davinci_emac, from Dan Carpenter. 7) Memory leak in XPS configuration, from Alexander Duyck. 8) Use after free with cloned sockets in kcm, from Kirill Tkhai. 9) MTU handling fixes fo ip_tunnel and ip6_tunnel, from Nicolas Dichtel. 10) Fix UAPI hole in bpf data structure for 32-bit compat applications, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits) bpf: fix uapi hole for 32 bit compat applications net: usb: cdc_mbim: add flag FLAG_SEND_ZLP ip6_tunnel: remove magic mtu value 0xFFF8 ip_tunnel: restore binding to ifaces with a large mtu net: dsa: b53: Add BCM5389 support kcm: Fix use-after-free caused by clonned sockets net-sysfs: Fix memory leak in XPS configuration ixgbe: fix parsing of TC actions for HW offload net: ethernet: davinci_emac: fix error handling in probe() net/ncsi: Fix array size in dumpit handler cls_flower: Fix incorrect idr release when failing to modify rule net/sonic: Use dma_mapping_error() xfrm Fix potential error pointer dereference in xfrm_bundle_create. vhost_net: flush batched heads before trying to busy polling tun: Fix NULL pointer dereference in XDP redirect be2net: Fix error detection logic for BE3 net: qmi_wwan: Add Netgear Aircard 779S mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG atm: zatm: fix memcmp casting iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs ...
2018-06-02CIFS: Add support for direct pages in wdataLong Li3-4/+17
Add a function to allocate wdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer to write to. wdata is reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02CIFS: Use offset when reading pagesLong Li5-19/+45
With offset defined in rdata, transport functions need to look at this offset when reading data into the correct places in pages. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02CIFS: Add support for direct pages in rdataLong Li2-4/+21
Add a function to allocate rdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer. rdata is still reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02cifs: update multiplex loop to handle compounded responsesRonnie Sahlberg4-5/+39
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02Merge tag 'scsi-fixes' of ↵Linus Torvalds1-2/+20
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Eve of merge window fix: The original code was so bogus as to be casting the wrong generic device to an rport and proceeding to take actions based on the bogus values it found. Fortunately it seems the location that is dereferenced always exists, so the code hasn't oopsed yet, but it certainly annoys the memory checkers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_transport_srp: Fix shost to rport translation
2018-06-02Merge tag 'drm-fixes-for-v4.17-rc8' of ↵Linus Torvalds8-44/+91
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A few final fixes: i915: - fix for potential Spectre vector in the new query uAPI - fix NULL pointer deref (FDO #106559) - DMI fix to hide LVDS for Radiant P845 (FDO #105468) amdgpu: - suspend/resume DC regression fix - underscan flicker fix on fiji - gamma setting fix after dpms omap: - fix oops regression core: - fix PSR timing dw-hdmi: - fix oops regression" * tag 'drm-fixes-for-v4.17-rc8' of git://people.freedesktop.org/~airlied/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense drm/amd/display: Fix BUG_ON during CRTC atomic check update drm/i915/query: nospec expects no more than an unsigned long drm/i915/query: Protect tainted function pointer lookup drm/i915/lvds: Move acpi lid notification registration to registration phase drm/i915: Disable LVDS on Radiant P845 drm/omap: fix NULL deref crash with SDI displays drm/psr: Fix missed entry in PSR setup time table.
2018-06-03Merge branch 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie1-9/+22
into drm-fixes Two last minute DC fixes for 4.17. A fix for underscan on fiji and a fix for gamma settings getting after dpms. * 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amd/display: Update color props when modeset is required drm/amd/display: Make atomic-check validate underscan changes
2018-06-02Merge tag 'mips_fixes_4.17_3' of ↵Linus Torvalds4-38/+6
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from James Hogan: "A final few MIPS fixes for 4.17: - drop Lantiq gphy reboot/remove reset (4.14) - prctl(PR_SET_FP_MODE): Disallow PRE without FR (4.0) - ptrace(PTRACE_PEEKUSR): Fix 64-bit FGRs (3.15)" * tag 'mips_fixes_4.17_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests MIPS: lantiq: gphy: Drop reboot/remove reset asserts
2018-06-02Merge tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfioLinus Torvalds1-15/+10
Pull VFIO fix from Alex Williamson: "Revert a pfn page mapping optimization identified as introducing a bad page state regression (Alex Williamson)" * tag 'vfio-v4.17' of git://github.com/awilliam/linux-vfio: Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"
2018-06-02Merge tag 'char-misc-4.17-rc8' of ↵Linus Torvalds3-7/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are four small bugfixes for some char/misc drivers. Well, really three fixes and one fix for one of those fixes due to problems found by 0-day. This resolves some reported issues with the hwtracing drivers, and a reported regression for the thunderbolt subsystem. All of these have been in linux-next for a while now with no reported problems" * tag 'char-misc-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: hwtracing: stm: fix build error on some arches intel_th: Use correct device when freeing buffers stm class: Use vmalloc for the master map thunderbolt: Handle NULL boot ACL entries properly
2018-06-02Merge tag 'staging-4.17-rc8' of ↵Linus Torvalds8-68/+93
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull IIO driver fixes from Greg KH: "Here are some old IIO driver fixes that were sitting in my tree for a few weeks. Sorry about not getting them to you sooner. They fix a number of small IIO driver issues that have been reported. All of these have been in linux-next for a while with no reported problems" * tag 'staging-4.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: adc: select buffer for at91-sama5d2_adc iio: hid-sensor-trigger: Fix sometimes not powering up the sensor after resume iio: adc: at91-sama5d2_adc: fix channel configuration for differential channels iio:kfifo_buf: check for uint overflow iio:buffer: make length types match kfifo types iio: adc: stm32-dfsdm: fix sample rate for div2 spi clock iio: adc: stm32-dfsdm: fix successive oversampling settings iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ
2018-06-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds14-62/+172
Pull rdma fixes from Jason Gunthorpe: "Just three small last minute regressions that were found in the last week. The Broadcom fix is a bit big for rc7, but since it is fixing driver crash regressions that were merged via netdev into rc1, I am sending it. - bnxt netdev changes merged this cycle caused the bnxt RDMA driver to crash under certain situations - Arnd found (several, unfortunately) kconfig problems with the patches adding INFINIBAND_ADDR_TRANS. Reverting this last part, will fix it more fully outside -rc. - Subtle change in error code for a uapi function caused breakage in userspace. This was bug was subtly introduced cycle" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/core: Fix error code for invalid GID entry IB: Revert "remove redundant INFINIBAND kconfig dependencies" RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes
2018-06-02Merge branch 'i2c/for-current' of ↵Linus Torvalds3-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A documentation bugfix and a MAINTAINERS addition" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: ocores: update HDL sources URL i2c: xlp9xx: Add MAINTAINERS entry
2018-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-2/+2
Merge two fixes from Andrew Morton. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: fix the NULL mapping case in __isolate_lru_page() mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
2018-06-02mm: fix the NULL mapping case in __isolate_lru_page()Hugh Dickins1-1/+1
George Boole would have noticed a slight error in 4.16 commit 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page"). Fix it, to match both the comment above it, and the original behaviour. Although anonymous pages are not marked PageDirty at first, we have an old habit of calling SetPageDirty when a page is removed from swap cache: so there's a category of ex-swap pages that are easily migratable, but were inadvertently excluded from compaction's async migration in 4.16. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Ivan Kalvachev <ikalvachev@gmail.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()Hugh Dickins1-1/+1
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very eager, but I'm not sure that is relevant) soon hung uninterruptibly, waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most often when "cp -a" was trying to write to a smallish file. Debug showed that the page in question was not locked, and page->mapping NULL by now, but page->index consistent with having been in a huge page before. Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added in; but took hours to reproduce on a 4.17 kernel (no idea why). The culprit proved to be the __ClearPageDirty() on tails beyond i_size in __split_huge_page(): the non-atomic __bitoperation may have been safe when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()") introduced it, but liable to erase PageWaiters after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit"). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"Alex Williamson1-15/+10
Bisection by Amadeusz Sławiński implicates this commit leading to bad page state issues after VM shutdown, likely due to unbalanced page references. The original commit was intended only as a performance improvement, therefore revert for offline rework. Link: https://lkml.org/lkml/2018/6/2/97 Fixes: 356e88ebe447 ("vfio/type1: Improve memory pinning process for raw PFN mapping") Cc: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Reported-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-06-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-0/+4
Daniel Borkmann says: ==================== pull-request: bpf 2018-06-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) BPF uapi fix in struct bpf_prog_info and struct bpf_map_info in order to fix offsets on 32 bit archs. This will have a minor merge conflict with net-next which has the __u32 gpl_compatible:1 bitfield in struct bpf_prog_info at this location. Resolution is to use the gpl_compatible member. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01bpf: fix uapi hole for 32 bit compat applicationsDaniel Borkmann2-0/+4
In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the case of struct bpf_map_info but also struct bpf_prog_info. In net-next commit b85fab0e67b ("bpf: Add gpl_compatible flag to struct bpf_prog_info") added a bitfield into it to expose some flags related to programs. Thus, add an unnamed __u32 bitfield for both so that alignment keeps the same in both 32 and 64 bit cases, and can be naturally extended from there as in b85fab0e67b. Before: # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ __u64 netns_dev; /* 44 8 */ __u64 netns_ino; /* 52 8 */ /* size: 64, cachelines: 1, members: 10 */ /* padding: 4 */ }; After (same as on 64 bit): # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ /* XXX 4 bytes hole, try to pack */ __u64 netns_dev; /* 48 8 */ __u64 netns_ino; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ /* size: 64, cachelines: 1, members: 10 */ /* sum members: 60, holes: 1, sum holes: 4 */ }; Reported-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Eugene Syromiatnikov <esyr@redhat.com> Fixes: 52775b33bb507 ("bpf: offload: report device information about offloaded maps") Fixes: 675fc275a3a2d ("bpf: offload: report device information for offloaded programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-01net: usb: cdc_mbim: add flag FLAG_SEND_ZLPDaniele Palmas1-1/+1
Testing Telit LM940 with ICMP packets > 14552 bytes revealed that the modem needs FLAG_SEND_ZLP to properly work, otherwise the cdc mbim data interface won't be anymore responsive. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01Merge branch 'tunnel-mtus'David S. Miller3-9/+15
Nicolas Dichtel says: ==================== ip[6] tunnels: fix mtu calculations The first patch restores the possibility to bind an ip4 tunnel to an interface whith a large mtu. The second patch was spotted after the first fix. I also target it to net because it fixes the max mtu value that can be used for ipv6 tunnels. v2: remove the 0xfff8 in ip_tunnel_newlink() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01ip6_tunnel: remove magic mtu value 0xFFF8Nicolas Dichtel2-5/+11
I don't know where this value comes from (probably a copy and paste and paste and paste ...). Let's use standard values which are a bit greater. Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01ip_tunnel: restore binding to ifaces with a large mtuNicolas Dichtel1-4/+4
After commit f6cc9c054e77, the following conf is broken (note that the default loopback mtu is 65536, ie IP_MAX_MTU + 1): $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev lo add tunnel "gre0" failed: Invalid argument $ ip l a type dummy $ ip l s dummy1 up $ ip l s dummy1 mtu 65535 $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev dummy1 add tunnel "gre0" failed: Invalid argument dev_set_mtu() doesn't allow to set a mtu which is too large. First, let's cap the mtu returned by ip_tunnel_bind_dev(). Second, remove the magic value 0xFFF8 and use IP_MAX_MTU instead. 0xFFF8 seems to be there for ages, I don't know why this value was used. With a recent kernel, it's also possible to set a mtu > IP_MAX_MTU: $ ip l s dummy1 mtu 66000 After that patch, it's also possible to bind an ip tunnel on that kind of interface. CC: Petr Machata <petrm@mellanox.com> CC: Ido Schimmel <idosch@mellanox.com> Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Fixes: f6cc9c054e77 ("ip_tunnel: Emit events for post-register MTU changes") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01Merge branch 'master' of ↵David S. Miller2-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-05-31 1) Avoid possible overflow of the offset variable in _decode_session6(), this fixes an infinite lookp there. From Eric Dumazet. 2) We may use an error pointer in the error path of xfrm_bundle_create(). Fix this by returning this pointer directly to the caller. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01hwmon: (asus_atk0110) Make use of device managed memoryBastian Germann1-36/+8
Use devm_* variants of kstrdup and kzalloc. Get rid of kfree cleanups. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01hwmon: (asus_atk0110) Replace deprecated device register callBastian Germann1-50/+25
Make the asus_atk0110 driver use hwmon_device_register_with_groups instead of the deprecated hwmon_device_register. Construct the expected attribute_group array from the sensor list which contains all needed attributes. Remove the manual sysfs file creation and deletion that are now taken care of by the (un)register calls via the attribute_group array. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01hwmon: (k10temp) Make function get_raw_temp staticColin Ian King1-1/+1
The function get_raw_temp is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/hwmon/k10temp.c:149:14: warning: symbol 'get_raw_temp' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01net: dsa: b53: Add BCM5389 supportDamien Thébault4-1/+19
This patch adds support for the BCM5389 switch connected through MDIO. Signed-off-by: Damien Thébault <damien.thebault@vitec.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01lightnvm: pblk: take bitmap alloc. out of critical sectionJavier González1-41/+56
pblk allocates line bitmaps within the line lock unnecessarily. In order to take pressure out of the fast patch, allocate line bitmaps outside of this lock and refactor accordingly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: kick writer on new flush pointsHans Holmberg4-7/+10
Unless we kick the writer directly when setting a new flush point, the user risks having to wait for up to one second (the default timeout for the write thread to be kicked) for the IO to complete. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: only try to recover lines with written smetaHans Holmberg1-9/+21
When switching between different lun configurations, there is no guarantee that all lines that contain closed/open chunks have some valid data to recover. Check that the smeta chunk has been written to instead. Also skip bad lines (that does not have enough good chunks). Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary bio_get/putJavier González1-37/+28
In the read path, pblk gets a reference to the incoming bio and puts it after ending the bio. Though this behavior is correct, it is unnecessary since pblk is the one putting the bio, therefore, it cannot disappear underneath it. Removing this reference, allows to clean up rqd->bio and avoids pointer bouncing for the different read paths. Now, the incoming bio always resides in the read context and pblk's internal bios (if any) reside in rqd->bio. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: add possibility to set write buffer size manuallyMarcin Dziegielewski1-2/+12
In some cases, users can want set write buffer size manually, e.g. to adjust it to specific workload. This patch provides the possibility to set write buffer size via module parameter feature. Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: fix partial read error pathIgor Konopko1-4/+4
When error occurs during bio_add_page on partial read path, pblk tries to free pages twice. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: proper error handling for pblk_bio_add_pagesIgor Konopko1-2/+4
Currently in case of error caused by bio_pc_add_page in pblk_bio_add_pages two issues occur when calling from pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we are trying to free pages not allocated from our mempool. Second one is the warn from dma_pool_free, that we are trying to free NULL pointer dma. This commit fix both issues. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: fix smeta write error pathHans Holmberg1-3/+4
Smeta write errors were previously ignored. Skip these lines instead and throw them back on the free list, so the chunks will go through a reset cycle before we attempt to use the line again. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: garbage collect lines with failed writesHans Holmberg7-64/+200
Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot gurantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: rework write error recovery pathHans Holmberg5-229/+181
The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01kcm: Fix use-after-free caused by clonned socketsKirill Tkhai1-1/+1
(resend for properly queueing in patchwork) kcm_clone() creates kernel socket, which does not take net counter. Thus, the net may die before the socket is completely destructed, i.e. kcm_exit_net() is executed before kcm_done(). Reported-by: syzbot+5f1a04e374a635efc426@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01cifs: remove header_preamble_size where it is always 0Ronnie Sahlberg3-68/+48
Since header_preamble_size is 0 for SMB2+ we can remove it in those code paths that are only invoked from SMB2. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-01cifs: remove struct smb2_hdrRonnie Sahlberg7-55/+41
struct smb2_hdr is now just a wrapper for smb2_sync_hdr. We can thus get rid of smb2_hdr completely and access the sync header directly. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-01CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session ↵Mark Syms1-2/+3
expiry, status STATUS_NETWORK_SESSION_EXPIRED, however the server can also respond with STATUS_USER_SESSION_DELETED in cases where the session has been idle for some time and the server reaps the session to recover resources. Handle this additional status in the same way as SESSION_EXPIRED. Signed-off-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2018-06-01lightnvm: pblk: remove dead functionJavier González2-8/+0
Remove dead function for manual sync. I/O Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pass flag on graceful teardown to targetsJavier González6-18/+35
If the namespace is unregistered before the LightNVM target is removed (e.g., on hot unplug) it is too late for the target to store any metadata on the device - any attempt to write to the device will fail. In this case, pass on a "gracefull teardown" flag to the target to let it know when this happens. In the case of pblk, we pad the open line (close all open chunks) to improve data retention. In the event of an ungraceful shutdown, avoid this part and just clean up. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: check for chunk size before allocating itJavier González1-3/+3
Do the check for the chunk state after making sure that the chunk type is supported. Fixes: 32ef9412c114 ("lightnvm: pblk: implement get log report chunk") Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary argumentJavier González3-5/+5
Remove unnecessary argument on pblk_line_free() Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary indirectionJavier González1-12/+2
Call nvm_submit_io directly and remove an unnecessary indirection on the read path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: return NVM_ error on failed submissionJavier González1-14/+8
Return a meaningful error when the sanity vector I/O check fails. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: warn in case of corrupted write bufferJavier González1-3/+2
When cleaning up buffer entries as we wrap up, their state should be "completed". If any of the entries is in "submitted" state, it means that something bad has happened. Trigger a warning immediately instead of waiting for the state flag to eventually be updated, thus hiding the issue. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: improve error msg on corrupted LBAsJavier González1-10/+32
In the event of a mismatch between the read LBA and the metadata pointer reported by the device, improve the error message to be able to detect the offending physical address (PPA) mapped to the corrupted LBA. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: check read lba on gc pathJavier González1-6/+33
Check that the lba stored in the LBA metadata is correct in the GC path too. This requires a new helper function to check random reads in the vector read. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: recheck for bad lines at runtimeJavier González2-14/+35
Bad blocks can grow at runtime. Check that the number of valid blocks in a line are within the sanity threshold before allocating the line for new writes. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: fail gracefully on line alloc. failureJavier González2-9/+29
In the event of a line failing to allocate, fail gracefully and stop the pipeline to avoid more write failing in the same place. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-4.18/blockJens Axboe11-193/+351
Pull NVMe changes from Christoph: "Below is another set of NVMe updates for 4.18. Besides the usual bug fixes this includes more feature completness in terms of AEN and log page handling on the target." * 'nvme-4.18' of git://git.infradead.org/nvme: nvme: use the changed namespaces list log to clear ns data changed AENs nvme: mark nvme_queue_scan static nvme: submit AEN event configuration on startup nvmet: mask pending AENs nvmet: add AEN configuration support nvmet: implement the changed namespaces log nvmet: split log page implementation nvmet: add a new nvmet_zero_sgl helper nvme.h: add AEN configuration symbols nvme.h: add the changed namespace list log nvme.h: untangle AEN notice definitions nvmet: fix error return code in nvmet_file_ns_enable() nvmet: fix a typo in nvmet_file_ns_enable() nvme-fabrics: allow internal passthrough command on deleting controllers nvme-loop: add support for multiple ports nvme-pci: simplify __nvme_submit_cmd nvme-pci: Rate limit the nvme timeout warnings nvme: allow duplicate controller if prior controller being deleted
2018-06-01block: split the blk-mq case from elevator_initChristoph Hellwig3-32/+48
There is almost no shared logic, which leads to a very confusing code flow. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01block: move sysfs_lock into elevator_initChristoph Hellwig5-28/+8
Both callers take just around so function call, so move it in. Also remove the now pointless blk_mq_sched_init wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01block: remove the always unused name argument to elevator_initChristoph Hellwig4-11/+5
Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01block: unexport elevator_init/exitChristoph Hellwig3-4/+2
These are only used by the block core. Also move the declarations to block/blk.h. Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01block: move initialization of elevator-related fields to blk_alloc_queue_nodeChristoph Hellwig2-5/+5
No point in doing this in elevator_init. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01nvme: use the changed namespaces list log to clear ns data changed AENsChristoph Hellwig2-4/+49
Per section 5.2 we need to issue the corresponding log page to clear an AEN, so for a namespace data changed AEN we need to read the changed namespace list log. And once we read that log anyway we might as well use it to optimize the rescan. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-06-01nvme: mark nvme_queue_scan staticChristoph Hellwig2-11/+9
And move it toward the top of the file to avoid a forward declaration. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-06-01nvme: submit AEN event configuration on startupHannes Reinecke2-0/+18
We should register for AEN events; some law-abiding targets might not be sending us AENs otherwise. Signed-off-by: Hannes Reinecke <hare@suse.com> [hch: slight cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-06-01nvmet: mask pending AENsChristoph Hellwig3-1/+10
Per section 5.2 of the NVMe 1.3 spec: "When the controller posts a completion queue entry for an outstanding Asynchronous Event Request command and thus reports an asynchronous event, subsequent events of that event type are automatically masked by the controller until the host clears that event. An event is cleared by reading the log page associated with that event using the Get Log Page command (see section 5.14)." Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-06-01nvmet: add AEN configuration supportChristoph Hellwig3-2/+32
AEN configuration via the 'Get Features' and 'Set Features' admin command is mandatory, so we should be implemeting handling for it. Signed-off-by: Hannes Reinecke <hare@suse.com> [hch: use WRITE_ONCE, check for invalid values] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-06-01nvmet: implement the changed namespaces logChristoph Hellwig3-9/+76
Just keep a per-controller buffer of changed namespaces and copy it out in the get log page implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-06-01nvmet: split log page implementationChristoph Hellwig1-63/+36
Remove the common code to allocate a buffer and copy it into the SGL. Instead the two no-op implementations just zero the SGL directly, and the smart log allocates a buffer on its own. This prepares for the more elaborate ANA log page. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-06-01nvmet: add a new nvmet_zero_sgl helperChristoph Hellwig2-0/+8
Zeroes the SGL in the payload. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-06-01nvme.h: add AEN configuration symbolsHannes Reinecke1-0/+5
Signed-off-by: Hannes Reinecke <hare@suse.com> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-05-31net-sysfs: Fix memory leak in XPS configurationAlexander Duyck1-3/+3
This patch reorders the error cases in showing the XPS configuration so that we hold off on memory allocation until after we have verified that we can support XPS on a given ring. Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31ixgbe: fix parsing of TC actions for HW offloadOndřej Hlavatý1-5/+4
The previous code was optimistic, accepting the offload of whole action chain when there was a single known action (drop/redirect). This results in offloading a rule which should not be offloaded, because its behavior cannot be reproduced in the hardware. For example: $ tc filter add dev eno1 parent ffff: protocol ip \ u32 ht 800: order 1 match tcp src 42 FFFF \ action mirred egress mirror dev enp1s16 pipe \ drop The controller is unable to mirror the packet to a VF, but still offloads the rule by dropping the packet. Change the approach of the function to a pessimistic one, rejecting the chain when an unknown action is found. This is better suited for future extensions. Note that both recognized actions always return TC_ACT_SHOT, therefore it is safe to ignore actions behind them. Signed-off-by: Ondřej Hlavatý <ohlavaty@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31cifs: change smb2_get_data_area_len to take a smb2_sync_hdr as argumentRonnie Sahlberg3-20/+22
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-31cifs: update smb2_calc_size to use smb2_sync_hdr instead of smb2_hdrRonnie Sahlberg1-6/+4
smb2_hdr is just a wrapper around smb2_sync_hdr at this stage and smb2_hdr is going away. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-31cifs: remove struct smb2_oplock_break_rspRonnie Sahlberg3-16/+4
The two structures smb2_oplock_breaq_req/rsp are now basically identical. Replace this with a single definition of a smb2_oplock_break structure. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-31cifs: remove rfc1002 header from all SMB2 response structuresRonnie Sahlberg6-78/+87
Separate out all the 4 byte rfc1002 headers so that they are no longer part of the SMB2 header structures to prepare for future work to add compounding support. Update the smb3 transform header processing that we no longer have a rfc1002 header at the start of this structure. Update smb2_readv_callback to accommodate that the first iovector in the response is no the smb2 header and no longer a rfc1002 header. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-31smb3: on reconnect set PreviousSessionId fieldSteve French1-0/+1
The server detects reconnect by the (non-zero) value in PreviousSessionId of SMB2/SMB3 SessionSetup request, but this behavior regressed due to commit 166cea4dc3a4f66f020cfb9286225ecd228ab61d ("SMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup") CC: Stable <stable@vger.kernel.org> CC: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-05-31smb3: Add posix create context for smb3.11 posix mountsSteve French5-3/+101
Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-31Merge tag 'xfs-4.17-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-0/+1
Pull xfs fix from Darrick Wong: "Clear out i_mapping error state when we're reinitializing inodes. This last minute fix prevents writeback error state from persisting past the end of the in-core inode lifecycle and causing EIO errors to be reported to userspace when no error has occurred. This fix for the behavioral regression has been soaking in for-next for a while, but various fs developers persuaded me to try to get it upstream for 4.17 because the patch that broke things was introduced in 4.17-rc4" * tag 'xfs-4.17-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fs: clear writeback errors in inode_init_always
2018-05-31net: ethernet: davinci_emac: fix error handling in probe()Dan Carpenter1-10/+12
The current error handling code has an issue where it does: if (priv->txchan) cpdma_chan_destroy(priv->txchan); The problem is that ->txchan is either valid or an error pointer (which would lead to an Oops). I've changed it to use multiple error labels so that the test can be removed. Also there were some missing calls to netif_napi_del(). Fixes: 3ef0fdb2342c ("net: davinci_emac: switch to new cpdma layer") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31net/ncsi: Fix array size in dumpit handlerSamuel Mendoza-Jonas1-1/+1
With CONFIG_CC_STACKPROTECTOR enabled the kernel panics as below when parsing a NCSI_CMD_PKG_INFO command: [ 150.149711] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 805cff08 [ 150.149711] [ 150.159919] CPU: 0 PID: 1301 Comm: ncsi-netlink Not tainted 4.13.16-468cbec6d2c91239332cb91b1f0a73aafcb6f0c6 #1 [ 150.170004] Hardware name: Generic DT based system [ 150.174852] [<80109930>] (unwind_backtrace) from [<80106bc4>] (show_stack+0x20/0x24) [ 150.182641] [<80106bc4>] (show_stack) from [<805d36e4>] (dump_stack+0x20/0x28) [ 150.189888] [<805d36e4>] (dump_stack) from [<801163ac>] (panic+0xdc/0x278) [ 150.196780] [<801163ac>] (panic) from [<801162cc>] (__stack_chk_fail+0x20/0x24) [ 150.204111] [<801162cc>] (__stack_chk_fail) from [<805cff08>] (ncsi_pkg_info_all_nl+0x244/0x258) [ 150.212912] [<805cff08>] (ncsi_pkg_info_all_nl) from [<804f939c>] (genl_lock_dumpit+0x3c/0x54) [ 150.221535] [<804f939c>] (genl_lock_dumpit) from [<804f873c>] (netlink_dump+0xf8/0x284) [ 150.229550] [<804f873c>] (netlink_dump) from [<804f8d44>] (__netlink_dump_start+0x124/0x17c) [ 150.237992] [<804f8d44>] (__netlink_dump_start) from [<804f9880>] (genl_rcv_msg+0x1c8/0x3d4) [ 150.246440] [<804f9880>] (genl_rcv_msg) from [<804f9174>] (netlink_rcv_skb+0xd8/0x134) [ 150.254361] [<804f9174>] (netlink_rcv_skb) from [<804f96a4>] (genl_rcv+0x30/0x44) [ 150.261850] [<804f96a4>] (genl_rcv) from [<804f7790>] (netlink_unicast+0x198/0x234) [ 150.269511] [<804f7790>] (netlink_unicast) from [<804f7ffc>] (netlink_sendmsg+0x368/0x3b0) [ 150.277783] [<804f7ffc>] (netlink_sendmsg) from [<804abea4>] (sock_sendmsg+0x24/0x34) [ 150.285625] [<804abea4>] (sock_sendmsg) from [<804ac1dc>] (___sys_sendmsg+0x244/0x260) [ 150.293556] [<804ac1dc>] (___sys_sendmsg) from [<804ad98c>] (__sys_sendmsg+0x5c/0x9c) [ 150.301400] [<804ad98c>] (__sys_sendmsg) from [<804ad9e4>] (SyS_sendmsg+0x18/0x1c) [ 150.308984] [<804ad9e4>] (SyS_sendmsg) from [<80102640>] (ret_fast_syscall+0x0/0x3c) [ 150.316743] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 805cff08 This turns out to be because the attrs array in ncsi_pkg_info_all_nl() is initialised to a length of NCSI_ATTR_MAX which is the maximum attribute number, not the number of attributes. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31Merge tag 'wireless-drivers-for-davem-2018-05-30' of ↵David S. Miller2-9/+8
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.17 Two last minute fixes, hopefully they make it to 4.17 still. rt2x00 * revert a fix which caused even more problems iwlwifi * fix a crash when there are 16 or more logical CPUs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31cls_flower: Fix incorrect idr release when failing to modify rulePaul Blakey1-1/+1
When we fail to modify a rule, we incorrectly release the idr handle of the unmodified old rule. Fix that by checking if we need to release it. Fixes: fe2502e49b58 ("net_sched: remove cls_flower idr on failure") Reported-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31net/sonic: Use dma_mapping_error()Finn Thain1-1/+1
With CONFIG_DMA_API_DEBUG=y, calling sonic_open() produces the message, "DMA-API: device driver failed to check map error". Add the missing dma_mapping_error() call. Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31drm/amd/display: Update color props when modeset is requiredLeo (Sunpeng) Li1-2/+6
This fixes issues where color management properties don't persist over DPMS on/off, or when the CRTC is moved across connectors. Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-31drm/amd/display: Make atomic-check validate underscan changesDavid Francis1-7/+16
When the underscan state was changed, atomic-check was triggering a validation but passing the old underscan values. This change adds a somewhat hacky check in dm_update_crtcs_state that will update the stream if old and newunderscan values are different. This was causing 4k on Fiji to allow underscan when it wasn't permitted. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-31nvme.h: add the changed namespace list logChristoph Hellwig1-0/+3
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-05-31nvme.h: untangle AEN notice definitionsChristoph Hellwig2-14/+24
Stop including the event type in the definitions for the notice type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-05-31nvmet: fix error return code in nvmet_file_ns_enable()Wei Yongjun1-2/+6
Fix to return error code -ENOMEM from the memory alloc fail error handling case instead of 0, as done elsewhere in this function. Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-31nvmet: fix a typo in nvmet_file_ns_enable()Wei Yongjun1-1/+1
Fix a typo in nvmet_file_ns_enable(). Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-31nvme-fabrics: allow internal passthrough command on deleting controllersChristoph Hellwig1-48/+31
Without this we can't cleanly shut down. Based on analysis an an earlier patch from Hannes Reinecke. Fixes: bb06ec31452f ("nvme: expand nvmf_check_if_ready checks") Reported-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: James Smart <james.smart@broadcom.com>
2018-05-31dma-direct: don't crash on device without dma_maskChristoph Hellwig1-0/+7
Print a useful warning instead. Reported-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-31block, bfq: prevent soft_rt_next_start from being stuck at infinityDavide Sapienza1-27/+2
BFQ can deem a bfq_queue as soft real-time only if the queue - periodically becomes completely idle, i.e., empty and with no still-outstanding I/O request; - after becoming idle, gets new I/O only after a special reference time soft_rt_next_start. In this respect, after commit "block, bfq: consider also past I/O in soft real-time detection", the value of soft_rt_next_start can never decrease. This causes a problem with the following special updating case for soft_rt_next_start: to prevent queues that are not completely idle to be wrongly detected as soft real-time (when they become non-empty again), soft_rt_next_start is temporarily set to infinity for empty queues with still outstanding I/O requests. But, if such an update is actually performed, then, because of the above commit, soft_rt_next_start will be stuck at infinity forever, and the queue will have no more chance to be considered soft real-time. On slow systems, this problem does cause actual soft real-time applications to be occasionally not detected as such. This commit addresses this issue by eliminating the pushing of soft_rt_next_start to infinity, and by changing the way non-empty queues are prevented from being wrongly detected as soft real-time. Simply, a queue that becomes non-empty again can now be detected as soft real-time only if it has no outstanding I/O request. Signed-off-by: Davide Sapienza <sapienza.dav@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: increase weight-raising duration for interactive appsDavide Sapienza1-11/+15
The maximum possible duration of the weight-raising period for interactive applications is limited to 13 seconds, as this is the time needed to load the largest application that we considered when tuning weight raising. Unfortunately, in such an evaluation, we did not consider the case of very slow virtual machines. For example, on a QEMU/KVM virtual machine - running in a slow PC; - with a virtual disk stacked on a slow low-end 5400rpm HDD; - serving a heavy I/O workload, such as the sequential reading of several files; mplayer takes 23 seconds to start, if constantly weight-raised. To address this issue, this commit conservatively sets the upper limit for weight-raising duration to 25 seconds. Signed-off-by: Davide Sapienza <sapienza.dav@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: remove slow-system classPaolo Valente2-105/+46
BFQ computes the duration of weight raising for interactive applications automatically, using some reference parameters. In particular, BFQ uses the best durations (see comments in the code for how these durations have been assessed) for two classes of systems: slow and fast ones. Examples of slow systems are old phones or systems using micro HDDs. Fast systems are all the remaining ones. Using these parameters, BFQ computes the actual duration of the weight raising, for the system at hand, as a function of the relative speed of the system w.r.t. the speed of a reference system, belonging to the same class of systems as the system at hand. This slow vs fast differentiation proved to be useful in the past, but happens to have little meaning with current hardware. Even worse, it does cause problems in virtual systems, where the speed of the system can vary frequently, and so widely to just confuse the class-detection mechanism, and, as we have verified experimentally, to cause BFQ to compute non-sensical weight-raising durations. This commit addresses this issue by removing the slow class and the class-detection mechanism. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: add description of weight-raising heuristicsPaolo Valente1-24/+56
A description of how weight raising works is missing in BFQ sources. In addition, the code for handling weight raising is scattered across a few functions. This makes it rather hard to understand the mechanism and its rationale. This commits adds such a description at the beginning of the main source file. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: remove the removal of 'next' rq in bfq_requests_mergedFilippo Muzzini1-7/+0
Since bfq_finish_request() is always called on the request 'next', after bfq_requests_merged() is finished, and bfq_finish_request() removes 'next' from its bfq_queue if needed, it isn't necessary to do such a removal in advance in bfq_merged_requests(). This commit removes such a useless 'next' removal. Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: remove wrong check in bfq_requests_mergedPaolo Valente1-6/+20
The request rq passed to the function bfq_requests_merged is always in a bfq_queue, so the check !RB_EMPTY_NODE(&rq->rb_node) at the beginning of bfq_requests_merged always succeeds, and the control flow systematically skips to the end of the function. This implies that the body of the function is never executed, i.e., the repositioning of rq is never performed. On the opposite end, a control is missing in the body of the function: 'next' must be removed only if it is inside a bfq_queue. This commit removes the wrong check on rq, and adds the missing check on 'next'. In addition, this commit adds comments on bfq_requests_merged. Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31block, bfq: remove wrong lock in bfq_requests_mergedFilippo Muzzini1-2/+0
In bfq_requests_merged(), there is a deadlock because the lock on bfqq->bfqd->lock is held by the calling function, but the code of this function tries to grab the lock again. This deadlock is currently hidden by another bug (fixed by next commit for this source file), which causes the body of bfq_requests_merged() to be never executed. This commit removes the deadlock by removing the lock/unlock pair. Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-31Merge tag 'platform-drivers-x86-v4.17-4' of ↵Linus Torvalds1-10/+13
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fix from Andy Shevchenko: "Fix NULL pointer dereference in asus-wmi on rfkill cleanup. The effective change is just one new condition - two lines of code. But it required moving one static helper function, which is why the diff looks a bit bigger" * tag 'platform-drivers-x86-v4.17-4' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: asus-wmi: Fix NULL pointer dereference
2018-05-31platform/x86: asus-wmi: Fix NULL pointer dereferenceJoão Paulo Rechi Vita1-10/+13
Do not perform the rfkill cleanup routine when (asus->driver->wlan_ctrl_by_user && ashs_present()) is true, since nothing is registered with the rfkill subsystem in that case. Doing so leads to the following kernel NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 PGD 1a3aa8067 PUD 1a3b3d067 PMD 0 Oops: 0002 [#1] PREEMPT SMP Modules linked in: bnep ccm binfmt_misc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_a4tech videodev x86_pkg_temp_thermal intel_powerclamp coretemp ath3k btusb btrtl btintel bluetooth kvm_intel snd_hda_codec_hdmi kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass crc32c_intel arc4 i915 snd_hda_intel snd_hda_codec ath9k ath9k_common ath9k_hw ath i2c_algo_bit snd_hwdep mac80211 ghash_clmulni_intel snd_hda_core snd_pcm snd_timer cfg80211 ehci_pci xhci_pci drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm xhci_hcd ehci_hcd asus_nb_wmi(-) asus_wmi sparse_keymap r8169 rfkill mxm_wmi serio_raw snd mii mei_me lpc_ich i2c_i801 video soundcore mei i2c_smbus wmi i2c_core mfd_core CPU: 3 PID: 3275 Comm: modprobe Not tainted 4.9.34-gentoo #34 Hardware name: ASUSTeK COMPUTER INC. K56CM/K56CM, BIOS K56CM.206 08/21/2012 task: ffff8801a639ba00 task.stack: ffffc900014cc000 RIP: 0010:[<ffffffff816c7348>] [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 RSP: 0018:ffffc900014cfce0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8801a54315b0 RCX: 00000000c0000100 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801a54315b4 RBP: ffffc900014cfd30 R08: 0000000000000000 R09: 0000000000000002 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801a54315b4 R13: ffff8801a639ba00 R14: 00000000ffffffff R15: ffff8801a54315b8 FS: 00007faa254fb700(0000) GS:ffff8801aef80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001a3b1b000 CR4: 00000000001406e0 Stack: ffff8801a54315b8 0000000000000000 ffffffff814733ae ffffc900014cfd28 ffffffff8146a28c ffff8801a54315b0 0000000000000000 ffff8801a54315b0 ffff8801a66f3820 0000000000000000 ffffc900014cfd48 ffffffff816c73e7 Call Trace: [<ffffffff814733ae>] ? acpi_ut_release_mutex+0x5d/0x61 [<ffffffff8146a28c>] ? acpi_ns_get_node+0x49/0x52 [<ffffffff816c73e7>] mutex_lock+0x17/0x30 [<ffffffffa00a3bb4>] asus_rfkill_hotplug+0x24/0x1a0 [asus_wmi] [<ffffffffa00a4421>] asus_wmi_rfkill_exit+0x61/0x150 [asus_wmi] [<ffffffffa00a49f1>] asus_wmi_remove+0x61/0xb0 [asus_wmi] [<ffffffff814a5128>] platform_drv_remove+0x28/0x40 [<ffffffff814a2901>] __device_release_driver+0xa1/0x160 [<ffffffff814a29e3>] device_release_driver+0x23/0x30 [<ffffffff814a1ffd>] bus_remove_device+0xfd/0x170 [<ffffffff8149e5a9>] device_del+0x139/0x270 [<ffffffff814a5028>] platform_device_del+0x28/0x90 [<ffffffff814a50a2>] platform_device_unregister+0x12/0x30 [<ffffffffa00a4209>] asus_wmi_unregister_driver+0x19/0x30 [asus_wmi] [<ffffffffa00da0ea>] asus_nb_wmi_exit+0x10/0xf26 [asus_nb_wmi] [<ffffffff8110c692>] SyS_delete_module+0x192/0x270 [<ffffffff810022b2>] ? exit_to_usermode_loop+0x92/0xa0 [<ffffffff816ca560>] entry_SYSCALL_64_fastpath+0x13/0x94 Code: e8 5e 30 00 00 8b 03 83 f8 01 0f 84 93 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 41 be ff ff ff ff 4c 89 3c 24 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10 eb 1d 4c 89 e7 49 c7 45 08 02 00 00 00 RIP [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 RSP <ffffc900014cfce0> CR2: 0000000000000000 ---[ end trace 8d484233fa7cb512 ]--- note: modprobe[3275] exited with preempt_count 2 https://bugzilla.kernel.org/show_bug.cgi?id=196467 Reported-by: red.f0xyz@gmail.com Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-05-31Merge tag 'perf-urgent-for-mingo-4.17-20180531' of ↵Ingo Molnar9-21/+185
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix 'perf test Session topology' segfault on s390 (Thomas Richter) - Fix NULL return handling in bpf__prepare_load() (YueHaibing) - Fix indexing on Coresight ETM packet queue decoder (Mathieu Poirier) - Fix perf.data format description of NRCPUS header (Arnaldo Carvalho de Melo) - Update perf.data documentation section on cpu topology - Handle uncore event aliases in small groups properly (Kan Liang) - Add missing perf_sample.addr into python sample dictionary (Leo Yan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-31sched/headers: Fix typoDavidlohr Bueso1-1/+1
I cannot spell 'throttling'. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180530224940.17839-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-31sched/deadline: Fix missing clock updateJuri Lelli1-3/+3
A missing clock update is causing the following warning: rq->clock_update_flags < RQCF_ACT_SKIP WARNING: CPU: 10 PID: 0 at kernel/sched/sched.h:963 inactive_task_timer+0x5d6/0x720 Call Trace: <IRQ> __hrtimer_run_queues+0x10f/0x530 hrtimer_interrupt+0xe5/0x240 smp_apic_timer_interrupt+0x79/0x2b0 apic_timer_interrupt+0xf/0x20 </IRQ> do_idle+0x203/0x280 cpu_startup_entry+0x6f/0x80 start_secondary+0x1b0/0x200 secondary_startup_64+0xa5/0xb0 hardirqs last enabled at (793919): [<ffffffffa27c5f6e>] cpuidle_enter_state+0x9e/0x360 hardirqs last disabled at (793920): [<ffffffffa2a0096e>] interrupt_entry+0xce/0xe0 softirqs last enabled at (793922): [<ffffffffa20bef78>] irq_enter+0x68/0x70 softirqs last disabled at (793921): [<ffffffffa20bef5d>] irq_enter+0x4d/0x70 This happens because inactive_task_timer() calls sub_running_bw() (if TASK_DEAD and non_contending) that might trigger a schedutil update, which might access the clock. Clock is however currently updated only later in inactive_task_timer() function. Fix the problem by updating the clock right after task_rq_lock(). Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Claudio Scordino <claudio@evidence.eu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180530160809.9074-1-juri.lelli@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-31sched/core: Require cpu_active() in select_task_rq(), for user tasksPaul Burton1-2/+1
select_task_rq() is used in a few paths to select the CPU upon which a thread should be run - for example it is used by try_to_wake_up() & by fork or exec balancing. As-is it allows use of any online CPU that is present in the task's cpus_allowed mask. This presents a problem because there is a period whilst CPUs are brought online where a CPU is marked online, but is not yet fully initialized - ie. the period where CPUHP_AP_ONLINE_IDLE <= state < CPUHP_ONLINE. Usually we don't run any user tasks during this window, but there are corner cases where this can happen. An example observed is: - Some user task A, running on CPU X, forks to create task B. - sched_fork() calls __set_task_cpu() with cpu=X, setting task B's task_struct::cpu field to X. - CPU X is offlined. - Task A, currently somewhere between the __set_task_cpu() in copy_process() and the call to wake_up_new_task(), is migrated to CPU Y by migrate_tasks() when CPU X is offlined. - CPU X is onlined, but still in the CPUHP_AP_ONLINE_IDLE state. The scheduler is now active on CPU X, but there are no user tasks on the runqueue. - Task A runs on CPU Y & reaches wake_up_new_task(). This calls select_task_rq() with cpu=X, taken from task B's task_struct, and select_task_rq() allows CPU X to be returned. - Task A enqueues task B on CPU X's runqueue, via activate_task() & enqueue_task(). - CPU X now has a user task on its runqueue before it has reached the CPUHP_ONLINE state. In most cases, the user tasks that schedule on the newly onlined CPU have no idea that anything went wrong, but one case observed to be problematic is if the task goes on to invoke the sched_setaffinity syscall. The newly onlined CPU reaches the CPUHP_AP_ONLINE_IDLE state before the CPU that brought it online calls stop_machine_unpark(). This means that for a portion of the window of time between CPUHP_AP_ONLINE_IDLE & CPUHP_ONLINE the newly onlined CPU's struct cpu_stopper has its enabled field set to false. If a user thread is executed on the CPU during this window and it invokes sched_setaffinity with a CPU mask that does not include the CPU it's running on, then when __set_cpus_allowed_ptr() calls stop_one_cpu() intending to invoke migration_cpu_stop() and perform the actual migration away from the CPU it will simply return -ENOENT rather than calling migration_cpu_stop(). We then return from the sched_setaffinity syscall back to the user task that is now running on a CPU which it just asked not to run on, and which is not present in its cpus_allowed mask. This patch resolves the problem by having select_task_rq() enforce that user tasks run on CPUs that are active - the same requirement that select_fallback_rq() already enforces. This should ensure that newly onlined CPUs reach the CPUHP_AP_ACTIVE state before being able to schedule user tasks, and also implies that bringup_wait_for_ap() will have called stop_machine_unpark() which resolves the sched_setaffinity issue above. I haven't yet investigated them, but it may be of interest to review whether any of the actions performed by hotplug states between CPUHP_AP_ONLINE_IDLE & CPUHP_AP_ACTIVE could have similar unintended effects on user tasks that might schedule before they are reached, which might widen the scope of the problem from just affecting the behaviour of sched_setaffinity. Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180526154648.11635-2-paul.burton@mips.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-31sched/core: Fix rules for running on online && !active CPUsPeter Zijlstra1-12/+30
As already enforced by the WARN() in __set_cpus_allowed_ptr(), the rules for running on an online && !active CPU are stricter than just being a kthread, you need to be a per-cpu kthread. If you're not strictly per-CPU, you have better CPUs to run on and don't need the partially booted one to get your work done. The exception is to allow smpboot threads to bootstrap the CPU itself and get kernel 'services' initialized before we allow userspace on it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 955dbdf4ce87 ("sched: Allow migrating kthreads into online but inactive CPUs") Link: http://lkml.kernel.org/r/20170725165821.cejhb7v2s3kecems@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-31spi: Fix typo on SPI_MEM help textFabio Estevam1-1/+1
The correct form is "a high-level", so fix it accordingly. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-05-31btrfs: Add unprivileged version of ino_lookup ioctlTomohiro Misono2-0/+221
Add unprivileged version of ino_lookup ioctl BTRFS_IOC_INO_LOOKUP_USER to allow normal users to call "btrfs subvolume list/show" etc. in combination with BTRFS_IOC_GET_SUBVOL_INFO/BTRFS_IOC_GET_SUBVOL_ROOTREF. This can be used like BTRFS_IOC_INO_LOOKUP but the argument is different. This is because it always searches the fs/file tree correspoinding to the fd with which this ioctl is called and also returns the name of bottom subvolume. The main differences from original ino_lookup ioctl are: 1. Read + Exec permission will be checked using inode_permission() during path construction. -EACCES will be returned in case of failure. 2. Path construction will be stopped at the inode number which corresponds to the fd with which this ioctl is called. If constructed path does not exist under fd's inode, -EACCES will be returned. 3. The name of bottom subvolume is also searched and filled. Note that the maximum length of path is shorter 256 (BTRFS_VOL_NAME_MAX+1) bytes than ino_lookup ioctl because of space of subvolume's name. Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ style fixes ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-31btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REFTomohiro Misono2-0/+117
Add unprivileged ioctl BTRFS_IOC_GET_SUBVOL_ROOTREF which returns ROOT_REF information of the subvolume containing this inode except the subvolume name (this is because to prevent potential name leak). The subvolume name will be gained by user version of ino_lookup ioctl (BTRFS_IOC_INO_LOOKUP_USER) which also performs permission check. The min id of root ref's subvolume to be searched is specified by @min_id in struct btrfs_ioctl_get_subvol_rootref_args. After the search ends, @min_id is set to the last searched root ref's subvolid + 1. Also, if there are more root refs than BTRFS_MAX_ROOTREF_BUFFER_NUM, -EOVERFLOW is returned. Therefore the caller can just call this ioctl again without changing the argument to continue search. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com> Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ style fixes and struct item renames ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-31btrfs: Add unprivileged ioctl which returns subvolume informationTomohiro Misono2-0/+183
Add new unprivileged ioctl BTRFS_IOC_GET_SUBVOL_INFO which returns the information of subvolume containing this inode. (i.e. returns the information in ROOT_ITEM and ROOT_BACKREF.) Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com> Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ minor style fixes, update struct comments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-31xfrm Fix potential error pointer dereference in xfrm_bundle_create.Steffen Klassert1-3/+2
We may derference an invalid pointer in the error path of xfrm_bundle_create(). Fix this by returning this error pointer directly instead of assigning it to xdst0. Fixes: 45b018beddb6 ("ipsec: Create and use new helpers for dst child access.") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-05-30fs: clear writeback errors in inode_init_alwaysDarrick J. Wong1-0/+1
In inode_init_always(), we clear the inode mapping flags, which clears any retained error (AS_EIO, AS_ENOSPC) bits. Unfortunately, we do not also clear wb_err, which means that old mapping errors can leak through to new inodes. This is crucial for the XFS inode allocation path because we recycle old in-core inodes and we do not want error state from an old file to leak into the new file. This bug was discovered by running generic/036 and generic/047 in a loop and noticing that the EIOs generated by the collision of direct and buffered writes in generic/036 would survive the remount between 036 and 047, and get reported to the fsyncs (on different files!) in generic/047. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-30smb3: add tracepoints for smb2/smb3 openSteve French2-1/+92
add two tracepoints for open completion. One for error one for completion (open_done). Sample output below TASK-PID CPU# |||| TIMESTAMP FUNCTION | | | |||| | | bash-15348 [007] .... 42441.027492: smb3_enter: cifs_lookup: xid=45 bash-15348 [007] .... 42441.028214: smb3_cmd_err: sid=0x6173e4ce tid=0xa05150e6 cmd=5 mid=105 status=0xc0000034 rc=-2 bash-15348 [007] .... 42441.028219: smb3_open_err: xid=45 sid=0x6173e4ce tid=0xa05150e6 cr_opts=0x0 des_access=0x80 rc=-2 bash-15348 [007] .... 42441.028225: smb3_exit_done: cifs_lookup: xid=45 fop777-24560 [002] .... 42442.627617: smb3_enter: cifs_revalidate_dentry_attr: xid=46 fop777-24560 [003] .... 42442.628301: smb3_cmd_err: sid=0x6173e4ce tid=0xa05150e6 cmd=5 mid=106 status=0xc0000034 rc=-2 fop777-24560 [003] .... 42442.628319: smb3_open_err: xid=46 sid=0x6173e4ce tid=0xa05150e6 cr_opts=0x0 des_access=0x80 rc=-2 fop777-24560 [003] .... 42442.628335: smb3_enter: cifs_atomic_open: xid=47 fop777-24560 [003] .... 42442.629587: smb3_cmd_done: sid=0x6173e4ce tid=0xa05150e6 cmd=5 mid=107 fop777-24560 [003] .... 42442.629592: smb3_open_done: xid=47 sid=0x6173e4ce tid=0xa05150e6 fid=0xb8a0984d cr_opts=0x40 des_access=0x40000080 Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30block: fixup bioset_integrity_create() callJens Axboe1-1/+1
Missed converting the bioset_integrity_create() bounce bio set call. Fixes: 338aa96d5661 ("block: convert bounce, q->bio_split to bioset_init()/mempool_init()") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30cifs: add debug output to show nocase mount optionSteve French1-0/+2
For smb1 nocase can be specified on mount. Allow displaying it in debug data. Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30smb3: add define for id for posix create context and corresponding structSteve French1-1/+10
Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-31Merge tag 'drm-misc-fixes-2018-05-30' of ↵Dave Airlie3-13/+6
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes dw-hdmi: Fix Oops regression from rc1 (Neil) Cc: Neil Armstrong <narmstrong@baylibre.com> * tag 'drm-misc-fixes-2018-05-30' of git://anongit.freedesktop.org/drm/drm-misc: drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense
2018-05-30cifs: update smb2_check_message to handle PDUs without a 4 byte length headerRonnie Sahlberg1-30/+20
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-30Merge tag 'for-linus-20180530' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull block fix from Jens Axboe: "Just a single fix that should make it into this release, fixing a regression with T10-DIF on NVMe" * tag 'for-linus-20180530' of git://git.kernel.dk/linux-block: nvme: fix extended data LBA supported setting
2018-05-30Merge tag 'selinux-pr-20180530' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux fix from Paul Moore: "One more small fix for SELinux: a small string length fix found by KASAN. I dislike sending patches this late in the release cycle, but this patch fixes a legitimate problem, is very small, limited in scope, and well understood. There are two threads with more information on the problem, the latest is linked below: https://marc.info/?t=152723737400001&r=1&w=2 Stephen points out in the thread linked above: 'Such a setxattr() call can only be performed by a process with CAP_MAC_ADMIN that is also allowed mac_admin permission in SELinux policy. Consequently, this is never possible on Android (no process is allowed mac_admin permission, always enforcing) and is only possible in Fedora/RHEL for a few domains (if enforcing)'" * tag 'selinux-pr-20180530' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: KASAN: slab-out-of-bounds in xattr_getsecurity
2018-05-30block: Drop bioset_create()Kent Overstreet2-52/+15
All users have been converted to bioset_init(), kill off the old API. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30xfs: convert to bioset_init()/mempool_init()Kent Overstreet3-8/+7
Convert XFS to embedded bio sets. Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30btrfs: convert to bioset_init()/mempool_init()Kent Overstreet1-14/+11
Convert btrfs to embedded bio sets. Acked-by: Chris Mason <clm@fb.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30fs: convert block_dev.c to bioset_init()Kent Overstreet1-6/+3
Convert block DIO code to embedded bio sets. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30target: convert to bioset_init()/mempool_init()Kent Overstreet2-9/+7
Convert the target code to embedded bio sets. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30dm: convert to bioset_init()/mempool_init()Kent Overstreet17-206/+197
Convert dm to embedded bio sets. Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30md: convert to bioset_init()/mempool_init()Kent Overstreet15-181/+159
Convert md to embedded bio sets. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30bcache: convert to bioset_init()/mempool_init()Kent Overstreet7-52/+37
Convert bcache to embedded bio sets. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30lightnvm: convert to bioset_init()/mempool_init()Kent Overstreet6-65/+65
Convert lightnvm to embedded bio sets. Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30pktcdvd: convert to bioset_init()/mempool_init()Kent Overstreet2-26/+26
Convert pktcdvd to embedded bio sets. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30drbd: convert to bioset_init()/mempool_init()Kent Overstreet6-59/+38
Convert drbd to embedded bio sets and mempools. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30block: convert bounce, q->bio_split to bioset_init()/mempool_init()Kent Overstreet7-33/+41
Convert the core block functionality to embedded bio sets. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30Merge branch 'linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a potential kernel panic in the inside-secure driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - do not use memset on MMIO
2018-05-30smb3: allow "posix" mount option to enable new SMB311 protocol extensionsSteve French3-1/+36
If "posix" (or synonym "unix" for backward compatibility) specified on mount, and server advertises support for SMB3.11 POSIX negotiate context, then enable the new posix extensions on the tcon. This can be viewed by looking for "posix" in the mount options displayed by /proc/mounts for that mount (ie if posix extensions allowed by server and the experimental POSIX extensions also requested on the mount by specifying "posix" at mount time). Also add check to warn user if conflicting unix/nounix or posix/noposix specified on mount. Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30smb3: add support for posix negotiate contextSteve French4-6/+36
Unlike CIFS where UNIX/POSIX extensions had been negotiatable, SMB3 did not have POSIX extensions yet. Add the new SMB3.11 POSIX negotiate context to ask the server whether it can support POSIX (and thus whether we can send the new POSIX open context). Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30cifs: allow disabling less secure legacy dialectsSteve French3-9/+29
To improve security it may be helpful to have additional ways to restrict the ability to override the default dialects (SMB2.1, SMB3 and SMB3.02) on mount with old dialects (CIFS/SMB1 and SMB2) since vers=1.0 (CIFS/SMB1) and vers=2.0 are weaker and less secure. Add a module parameter "disable_legacy_dialects" (/sys/module/cifs/parameters/disable_legacy_dialects) which can be set to 1 (or equivalently Y) to forbid use of vers=1.0 or vers=2.0 on mount. Also cleans up a few build warnings about globals for various module parms. Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30cifs: make minor clarifications to module params for cifs.koSteve French2-3/+5
Note which ones of the module params are cifs dialect only (N/A for default dialect now that has moved to SMB2.1 or later) Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-05-30cifs: show the "w" bit for writeable /proc/fs/cifs/* filesRonnie Sahlberg1-14/+14
RHBZ: 1539612 Lets show the "w" bit for those files have a .write interface to set/enable/... the feature. Reported-by: Xiaoli Feng <xifeng@redhat.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-30smb3: add module alias for smb3 to cifs.koSteve French1-1/+19
We really don't want to be encouraging people to use the old (less secure) cifs dialect (SMB1) and it can be confusing for them with SMB3 (or later) being recommended but the module name is cifs. Add a module alias for "smb3" to cifs.ko to make this less confusing. Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30cifs: return error on invalid value written to cifsFYIRonnie Sahlberg1-0/+2
RHBZ: 1539617 Check that, if it is not a boolean, the value the user tries to write to /proc/fs/cifs/cifsFYI is valid and return an error if not. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Xiaoli Feng <xifeng@redhat.com>
2018-05-30cifs: invalidate cache when we truncate a fileRonnie Sahlberg1-4/+9
RHBZ: 1566345 When truncating a file we always do this synchronously to the server. Thus we need to make sure that the cached inode metadata is marked as stale so that on next getattr we will refresh the metadata. In this particular bug we want to ensure that both ctime and mtime are updated and become visible to the application after a truncate. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Xiaoli Feng <xifeng@redhat.com>
2018-05-30smb3: print tree id in debugdata in proc to be able to help loggingSteve French1-0/+2
When loooking at the logs for the new trace-cmd tracepoints for cifs, it would help to know which tid is for which share (UNC name) so update /proc/fs/cifs/DebugData to display the tid. Also display Maximal Access which was missing as well. Now the entry for typical entry for a tcon (in proc/fs/cifs/) looks like: 1) \\localhost\test Mounts: 1 DevInfo: 0x20 Attributes: 0x1006f PathComponentMax: 255 Status: 1 type: DISK Share Capabilities: None Aligned, Partition Aligned, Share Flags: 0x0 tid: 0xe0632a55 Optimal sector size: 0x200 Maximal Access: 0x1f01ff Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30smb3: add additional ftrace entry points for entry/exit to cifs.koSteve French3-35/+85
Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30smb3: fix various xid leaksSteve French1-19/+44
Fix a few cases where we were not freeing the xid which led to active requests being non-zero at unmount time. Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-05-30CIFS: Introduce offset for the 1st page in data transfer structuresLong Li2-0/+4
When direct I/O is used, the data buffer may not always align to page boundaries. Introduce a page offset in transport data structures to describe the location of the buffer within the page. Also change the function to pass the page offset when sending data to transport. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-05-30Btrfs: clean up error handling in btrfs_truncate()Omar Sandoval1-19/+14
btrfs_truncate() uses two variables for error handling, ret and err (if this sounds familiar, it's because btrfs_truncate_inode_items() did something similar). This is error prone, as was made evident by "Btrfs: fix error handling in btrfs_truncate()". We only have err because we don't want to mask an error if we call btrfs_update_inode() and btrfs_end_transaction(), so let's make that its own scoped return variable and use ret everywhere else. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30platform/chrome: Use to_cros_ec_dev more broadlyGwendal Grignou4-22/+12
Move to_cros_ec_dev macro to cros_ec.h and use it when the private ec object is needed from device object. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Benson Leung <bleung@chromium.org>
2018-05-30blk-throttle: return proper bool type to caller instead of 0/1Chengguang Xu1-5/+5
Change to return true/false only for bool type return code. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30perf tools: Fix perf.data format description of NRCPUS headerArnaldo Carvalho de Melo1-1/+1
In the perf.data HEADER_CPUDESC feadure header we store first the number of available CPUs in the system, then the number of CPUs at the time of writing the header, not the other way around. Reported-by: Thomas-Mich Richter <tmricht@linux.ibm.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Lakshman Annadorai <lakshmana@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j7o92acm2vnxjv70y4o3swoc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf script python: Add addr into perf sample dictLeo Yan1-0/+2
ARM CoreSight auxtrace uses 'sample->addr' to record the target address for branch instructions, so the data of 'sample->addr' is required for tracing data analysis. This commit collects data of 'sample->addr' into perf sample dict, finally can be used for python script for parsing event. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Tor Jeremiassen <tor@ti.com> Cc: coresight@lists.linaro.org Cc: kim.phillips@arm.co Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/1527497103-3593-3-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf data: Update documentation section on cpu topologyThomas Richter1-0/+8
Add an explanation of each cpu's core and socket identifier to the perf.data file format documentation. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180528074433.16652-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf cs-etm: Fix indexing for decoder packet queueMathieu Poirier1-2/+10
The tail of a queue is supposed to be pointing to the next available slot in a queue. In this implementation the tail is incremented before it is used and as such points to the last used element, something that has the immense advantage of centralizing tail management at a single location and eliminating a lot of redundant code. But this needs to be taken into consideration on the dequeueing side where the head also needs to be incremented before it is used, or the first available element of the queue will be skipped. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1527289854-10755-1-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf bpf: Fix NULL return handling in bpf__prepare_load()YueHaibing1-3/+3
bpf_object__open()/bpf_object__open_buffer can return error pointer or NULL, check the return values with IS_ERR_OR_NULL() in bpf__prepare_load and bpf__prepare_load_buffer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33v9h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_senseNeil Armstrong3-13/+6
The dw_hdmi_setup_rx_sense exported function should not use struct device to recover the dw-hdmi context using drvdata, but take struct dw_hdmi directly like other exported functions. This caused a regression using Meson DRM on S905X since v4.17-rc1 : Internal error: Oops: 96000007 [#1] PREEMPT SMP [...] CPU: 0 PID: 124 Comm: irq/32-dw_hdmi_ Not tainted 4.17.0-rc7 #2 Hardware name: Libre Technology CC (DT) [...] pc : osq_lock+0x54/0x188 lr : __mutex_lock.isra.0+0x74/0x530 [...] Process irq/32-dw_hdmi_ (pid: 124, stack limit = 0x00000000adf418cb) Call trace: osq_lock+0x54/0x188 __mutex_lock_slowpath+0x10/0x18 mutex_lock+0x30/0x38 __dw_hdmi_setup_rx_sense+0x28/0x98 dw_hdmi_setup_rx_sense+0x10/0x18 dw_hdmi_top_thread_irq+0x2c/0x50 irq_thread_fn+0x28/0x68 irq_thread+0x10c/0x1a0 kthread+0x128/0x130 ret_from_fork+0x10/0x18 Code: 34000964 d00050a2 51000484 9135c042 (f864d844) ---[ end trace 945641e1fbbc07da ]--- note: irq/32-dw_hdmi_[124] exited with preempt_count 1 genirq: exiting task "irq/32-dw_hdmi_" (124) is an active IRQ thread (irq 32) Fixes: eea034af90c6 ("drm/bridge/synopsys: dw-hdmi: don't clobber drvdata") Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Tested-by: Koen Kooi <koen@dominion.thruhere.net> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/1527673438-20643-1-git-send-email-narmstrong@baylibre.com
2018-05-30blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iterChristoph Hellwig5-21/+4
We already check for started commands in all callbacks, but we should also protect against already completed commands. Do this by taking the checks to common code. Acked-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30nbd: clear DISCONNECT_REQUESTED flag once disconnection occurs.Kevin Vigor1-6/+15
When a userspace client requests a NBD device be disconnected, the DISCONNECT_REQUESTED flag is set. While this flag is set, the driver will not inform userspace when a connection is closed. Unfortunately the flag was never cleared, so once a disconnect was requested the driver would thereafter never tell userspace about a closed connection. Thus when connections failed due to timeout, no attempt to reconnect was made and eventually the device would fail. Fix by clearing the DISCONNECT_REQUESTED flag (and setting the DISCONNECTED flag) once all connections are closed. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Kevin Vigor <kvigor@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30vhost_net: flush batched heads before trying to busy pollingJason Wang1-13/+24
After commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"), we tend to batch updating used heads. But it doesn't flush batched heads before trying to do busy polling, this will cause vhost to wait for guest TX which waits for the used RX. Fixing by flush batched heads before busy loop. 1 byte TCP_RR performance recovers from 13107.83 to 50402.65. Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-30btrfs: Factor out write portion of btrfs_get_blocks_directNikolay Borisov1-99/+108
Now that the read side is extracted into its own function, do the same to the write side. This leaves btrfs_get_blocks_direct_write with the sole purpose of handling common locking required. Also flip the condition in btrfs_get_blocks_direct_write so that the write case comes first and we check for if (Create) rather than if (!create). This is purely subjective but I believe makes reading a bit more "linear". No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: Factor out read portion of btrfs_get_blocks_directNikolay Borisov1-13/+43
Currently this function handles both the READ and WRITE dio cases. This is facilitated by a bunch of 'if' statements, a goto short-circuit statement and a very perverse aliasing of "!created"(READ) case by setting lockstart = lockend and checking for lockstart < lockend for detecting the write. Let's simplify this mess by extracting the READ-only code into a separate __btrfs_get_block_direct_read function. This is only the first step, the next one will be to factor out the write side as well. The end goal will be to have the common locking/ unlocking code in btrfs_get_blocks_direct and then it will call either the read|write subvariants. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30blk-throttle: fix potential NULL pointer dereference in throtl_select_dispatchLiu Bo1-1/+2
tg in throtl_select_dispatch is used first and then do check. Since tg may be NULL, it has potential NULL pointer dereference risk. So fix it. Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30block: kyber: make kyber more friendly with mergingJianchao Wang1-32/+158
Currently, kyber is very unfriendly with merging. kyber depends on ctx rq_list to do merging, however, most of time, it will not leave any requests in ctx rq_list. This is because even if tokens of one domain is used up, kyber will try to dispatch requests from other domain and flush the rq_list there. To improve this, we setup kyber_ctx_queue (kcq) which is similar with ctx, but it has rq_lists for different domain and build same mapping between kcq and khd as the ctx & hctx. Then we could merge, insert and dispatch for different domains separately. At the same time, only flush the rq_list of kcq when get domain token successfully. Then if one domain token is used up, the requests could be left in the rq_list of that domain and maybe merged with following io. Following is my test result on machine with 8 cores and NVMe card INTEL SSDPEKKR128G7 fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8 seq/random +------+---------------------------------------------------------------+ |patch?| bw(MB/s) | iops | slat(usec) | clat(usec) | merge | +----------------------------------------------------------------------+ | w/o | 606/612 | 151k/153k | 6.89/7.03 | 3349.21/3305.40 | 0/0 | +----------------------------------------------------------------------+ | w/ | 1083/616 | 277k/154k | 4.93/6.95 | 1830.62/3279.95 | 223k/3k | +----------------------------------------------------------------------+ When set numjobs to 16, the bw and iops could reach 1662MB/s and 425k on my platform. Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>