aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2014-01-23Merge branch 'akpm' (incoming from Andrew)HEADmasterLinus Torvalds231-2187/+3339
Merge second patch-bomb from Andrew Morton: - various misc bits - the rest of MM - add generic fixmap.h, use it - backlight updates - dynamic_debug updates - printk() updates - checkpatch updates - binfmt_elf - ramfs - init/ - autofs4 - drivers/rtc - nilfs - hfsplus - Documentation/ - coredump - procfs - fork - exec - kexec - kdump - partitions - rapidio - rbtree - userns - memstick - w1 - decompressors * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (197 commits) lib/decompress_unlz4.c: always set an error return code on failures romfs: fix returm err while getting inode in fill_super drivers/w1/masters/w1-gpio.c: add strong pullup emulation drivers/memstick/host/rtsx_pci_ms.c: fix ms card data transfer bug userns: relax the posix_acl_valid() checks arch/sh/kernel/dwarf.c: use rbtree postorder iteration helper instead of solution using repeated rb_erase() fs-ext3-use-rbtree-postorder-iteration-helper-instead-of-opencoding-fix fs/ext3: use rbtree postorder iteration helper instead of opencoding fs/jffs2: use rbtree postorder iteration helper instead of opencoding fs/ext4: use rbtree postorder iteration helper instead of opencoding fs/ubifs: use rbtree postorder iteration helper instead of opencoding net/netfilter/ipset/ip_set_hash_netiface.c: use rbtree postorder iteration instead of opencoding rbtree/test: test rbtree_postorder_for_each_entry_safe() rbtree/test: move rb_node to the middle of the test struct rapidio: add modular rapidio core build into powerpc and mips branches partitions/efi: complete documentation of gpt kernel param purpose kdump: add /sys/kernel/vmcoreinfo ABI documentation kdump: fix exported size of vmcoreinfo note kexec: add sysctl to disable kexec_load fs/exec.c: call arch_pick_mmap_layout() only once ...
2014-01-23Merge tag 'clk-for-linus-3.14-part1' of ↵Linus Torvalds111-1577/+22451
git://git.linaro.org/people/mike.turquette/linux Pull clk framework changes from Mike Turquette: "The first half of the clk framework pull request is made up almost entirely of new platform/driver support. There are some conversions of existing drivers to the common-clock Device Tree binding, and a few non-critical fixes to the framework. Due to an entirely unnecessary cyclical dependency with the arm-soc tree this pull request is broken into two pieces. The second piece will be sent out after arm-soc sends you the pull request that merged in core support for the HiSilicon 3620 platform. That same pull request from arm-soc depends on this pull request to merge in those HiSilicon bits without causing build failures" [ Just did the ARM SoC merges, so getting ready for the second clk tree pull request - Linus ] * tag 'clk-for-linus-3.14-part1' of git://git.linaro.org/people/mike.turquette/linux: (97 commits) devicetree: bindings: Document qcom,mmcc devicetree: bindings: Document qcom,gcc clk: qcom: Add support for MSM8660's global clock controller (GCC) clk: qcom: Add support for MSM8974's multimedia clock controller (MMCC) clk: qcom: Add support for MSM8974's global clock controller (GCC) clk: qcom: Add support for MSM8960's multimedia clock controller (MMCC) clk: qcom: Add support for MSM8960's global clock controller (GCC) clk: qcom: Add reset controller support clk: qcom: Add support for branches/gate clocks clk: qcom: Add support for root clock generators (RCGs) clk: qcom: Add support for phase locked loops (PLLs) clk: qcom: Add a regmap type clock struct clk: Add set_rate_and_parent() op reset: Silence warning in reset-controller.h clk: sirf: re-arch to make the codes support both prima2 and atlas6 clk: composite: pass mux_hw into determine_rate clk: shmobile: Fix MSTP clock array initialization clk: shmobile: Fix MSTP clock index ARM: dts: Add clock provider specific properties to max77686 node clk: max77686: Register OF clock provider ...
2014-01-23Merge tag 'drivers-for-linus' of ↵Linus Torvalds87-753/+2914
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM driver updates from Olof Johansson: "Updates of SoC-near drivers and other driver updates that makes more sense to take through our tree. The largest part of this is a conversion of device registration for some renesas shmobile/sh devices over to use resources. This has required coordination with the corresponding arch/sh changes, and we've agreed to merge the arch/sh changes through our tree. Added in this branch is support for Trusted Foundations secure firmware, which is what is used on many of the commercial Nvidia Tegra products that are in the market, including the Nvidia Shield. The code is local to arch/arm at this time since it's uncertain whether it will be shared with arm64 longer-term, if needed we will refactor later. A couple of new RTC drivers used on ARM boards, merged through our tree on request by the RTC maintainer. ... plus a bunch of smaller updates across the board, gpio conversions for davinci, etc" * tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (45 commits) watchdog: davinci: rename platform driver to davinci-wdt tty: serial: Limit msm_serial_hs driver to platforms that use it mmc: msm_sdcc: Limit driver to platforms that use it usb: phy: msm: Move mach dependent code to platform data clk: versatile: fixup IM-PD1 clock implementation clk: versatile: pass a name to ICST clock provider ARM: integrator: pass parent IRQ to the SIC irqchip: versatile FPGA: support cascaded interrupts from DT gpio: davinci: don't create irq_domain in case of unbanked irqs gpio: davinci: use chained_irq_enter/chained_irq_exit API gpio: davinci: add OF support gpio: davinci: remove unused variable intc_irq_num gpio: davinci: convert to use irqdomain support. gpio: introduce GPIO_DAVINCI kconfig option gpio: davinci: get rid of DAVINCI_N_GPIO gpio: davinci: use {readl|writel}_relaxed() instead of __raw_* serial: sh-sci: Add OF support serial: sh-sci: Add device tree bindings documentation serial: sh-sci: Remove platform data mapbase and irqs fields serial: sh-sci: Remove platform data scbrr_algo_id field ...
2014-01-23Merge tag 'boards-for-linus' of ↵Linus Torvalds75-3868/+1899
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC board updates from Olof Johansson: "This branch is reducing in size for every release since most board-related changes have started happening in devicetrees now. Still, we have some things going on here. * Renesas platforms are still adding a bit more legacy device support, something that should trail off shortly as they move to full DT * We group most defconfig updates into this branch out of old habits * Removal of legacy OMAP2 platforms over to DT continues, and a handful of old code is being removed here" * tag 'boards-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (94 commits) ARM: dts: OMAP2: fix interrupt number for rng ARM: dts: Split omap3 pinmux core device ARM: dts: Add omap specific pinctrl defines to use padconf addresses ARM: bcm2835: bcm2835_defconfig updates ARM: msm_defconfig: Enable restart driver defconfig: msm_defconfig: Enable CONFIG_ARCH_MSM8974 ARM: msm: Add support for APQ8074 Dragonboard ARM: exynos_defconfig: Enable S2MPS11 voltage regulator ARM: tegra: Enable DRM panel support ARM: shmobile: mackerel: Fix USBHS pinconf entry ARM: shmobile: Let Koelsch multiplatform boot with Koelsch DTB ARM: shmobile: Let Lager multiplatform boot with Lager DTB ARM: shmobile: Remove non-multiplatform Koelsch reference support ARM: shmobile: Remove non-multiplatform Lager reference support ARM: shmobile: koelsch-reference: Instantiate clkdevs for SCIF and CMT ARM: shmobile: lager-reference: Instantiate clkdevs for SCIF and CMT ARM: shmobile: koelsch-reference: Remove duplicate CCF initialization ARM: shmobile: lager-reference: Enable multiplaform kernel support ARM: shmobile: armadillo: Set backlight enable GPIO ARM: shmobile: Koelsch: add Ether support ... Conflicts: arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
2014-01-23Merge tag 'dt-for-linus' of ↵Linus Torvalds238-4211/+12679
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC DT updates from Olof Johansson: "DT and DT-conversion-related changes for various ARM platforms. Most of these are to enable various devices on various boards, etc, and not necessarily worth enumerating. New boards and systems continue to come in as new devicetree files that don't require corresponding C changes any more, which is indicating that the system is starting to work fairly well. A few things worth pointing out: * ST Ericsson ux500 platforms have made the major push to move over to fully support the platform with DT * Renesas platforms continue their conversion over from legacy platform devices to DT-based for hardware description" * tag 'dt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (327 commits) ARM: dts: SiRF: add pin group for USP0 with only RX or TX frame sync ARM: dts: SiRF: add lost usp1_uart_nostreamctrl pin group for atlas6 ARM: dts: sirf: add lost minigpsrtc device node ARM: dts: sirf: add clock, frequence-voltage table for CPU0 ARM: dts: sirf: add lost bus_width, clock and status for sdhci ARM: dts: sirf: add lost clocks for cphifbg ARM: dts: socfpga: add pl330 clock ARM: dts: socfpga: update L2 tag and data latency arm: sun7i: cubietruck: Enable the i2c controllers ARM: dts: add support for EXYNOS4412 based TINY4412 board ARM: dts: Add initial support for Arndale Octa board ARM: bcm2835: add USB controller to device tree ARM: dts: MSM8974: Add MMIO architected timer node ARM: dts: MSM8974: Add restart node ARM: dts: sun7i: external clock outputs ARM: dts: sun7i: Change 32768 Hz oscillator node name to clk@N style ARM: dts: sun7i: Add pin muxing options for clock outputs ARM: dts: sun7i: Add rtp controller node ARM: dts: sun5i: Add rtp controller node ARM: dts: sun4i: Add rtp controller node ...
2014-01-23Merge tag 'soc-for-linus' of ↵Linus Torvalds169-1887/+6147
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform changes from Olof Johansson: "New core SoC-specific changes. New platforms: * Introduction of a vendor, Hisilicon, and one of their SoCs with some random numerical product name. * Introduction of EFM32, embedded platform from Silicon Labs (ARMv7m, i.e. !MMU). * Marvell Berlin series of SoCs, which include the one in Chromecast. * MOXA platform support, ARM9-based platform used mostly in industrial products * Support for Freescale's i.MX50 SoC. Other work: * Renesas work for new platforms and drivers, and conversion over to more multiplatform-friendly device registration schemes. * SMP support for Allwinner sunxi platforms. * ... plus a bunch of other stuff across various platforms" * tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (201 commits) ARM: tegra: fix tegra_powergate_sequence_power_up() inline ARM: msm_defconfig: Update for multi-platform ARM: msm: Move MSM's DT based hardware to multi-platform support ARM: msm: Only build timer.c if required ARM: msm: Only build clock.c on proc_comm based platforms ARM: ux500: Enable system suspend with WFI support ARM: ux500: turn on PRINTK_TIME in u8500_defconfig ARM: shmobile: r8a7790: Fix I2C controller names ARM: msm: Simplify ARCH_MSM_DT config ARM: msm: Add support for MSM8974 SoC ARM: sunxi: select ARM_PSCI MAINTAINERS: Update Allwinner sunXi maintainer files ARM: sunxi: Select RESET_CONTROLLER ARM: imx: improve the comment of CCM lpm SW workaround ARM: imx: improve status check of clock gate ARM: imx: add necessary interface for pfd ARM: imx_v6_v7_defconfig: Select CONFIG_REGULATOR_PFUZE100 ARM: imx_v6_v7_defconfig: Select MX35 and MX50 device tree support ARM: imx: Add cpu frequency scaling support ARM i.MX35: Add devicetree support. ...
2014-01-23Merge tag 'cleanup-for-linus' of ↵Linus Torvalds233-5145/+11389
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Olof Johansson: "This is the branch where we usually queue up cleanup efforts, moving drivers out of the architecture directory, header file restructuring, etc. Sometimes they tangle with new development so it's hard to keep it strictly to cleanups. Some of the things included in this branch are: * Atmel SAMA5 conversion to common clock * Reset framework conversion for tegra platforms - Some of this depends on tegra clock driver reworks that are shared with Mike Turquette's clk tree. * Tegra DMA refactoring, which are shared branches with the DMA tree. * Removal of some header files on exynos to prepare for multiplatform" * tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (169 commits) ARM: mvebu: move Armada 370/XP specific definitions to armada-370-xp.h ARM: mvebu: remove prototypes of non-existing functions from common.h ARM: mvebu: move ARMADA_XP_MAX_CPUS to armada-370-xp.h serial: sh-sci: Rework baud rate calculation serial: sh-sci: Compute overrun_bit without using baud rate algo serial: sh-sci: Remove unused GPIO request code serial: sh-sci: Move overrun_bit and error_mask fields out of pdata serial: sh-sci: Support resources passed through platform resources serial: sh-sci: Don't check IRQ in verify port operation serial: sh-sci: Set the UPF_FIXED_PORT flag serial: sh-sci: Remove duplicate interrupt check in verify port op serial: sh-sci: Simplify baud rate calculation algorithms serial: sh-sci: Remove baud rate calculation algorithm 5 serial: sh-sci: Sort headers alphabetically ARM: EXYNOS: Kill exynos_pm_late_initcall() ARM: EXYNOS: Consolidate selection of PM_GENERIC_DOMAINS for Exynos4 ARM: at91: switch Calao QIL-A9260 board to DT clk: at91: fix pmc_clk_ids data type attriubte PM / devfreq: use inclusion <mach/map.h> instead of <plat/map-s5p.h> ARM: EXYNOS: remove <mach/regs-clock.h> for exynos ...
2014-01-23Merge tag 'fixes-nc-for-linus' of ↵Linus Torvalds32-79/+381
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC non-critical fixes from Olof Johansson: "As usual, we have a batch of fixes that weren't considered significant enough to warrant going into the later -rcs for previous release, so they are queued up on this branch. A handful of these are for various DT fixups for Samsung platforms, and a handful of other minor things. There are also a couple of stable-marked patches for mvebu -- they came in quite late and we decided to keep them deferred until the first -stable release to get more coverage instead of squeezing them into 3.13" * tag 'fixes-nc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits) ARM: at91: smc: bug fix in sam9_smc_cs_read() i2c: mv64xxx: Document the newly introduced Armada XP A0 compatible i2c: mv64xxx: Fix bus hang on A0 version of the Armada XP SoCs ARM: mvebu: Add quirk for i2c for the OpenBlocks AX3-4 board ARM: mvebu: Add support to get the ID and the revision of a SoC ARM: dts: msm: Fix gpio interrupt and reg length irqchip: sirf: set IRQ_LEVEL status_flags ARM: OMAP2+: gpmc: Move legacy GPMC width setting ARM: OMAP2+: gpmc: Introduce gpmc_set_legacy() ARM: OMAP2+: gpmc: Move initialization outside the gpmc_t condition ARM: OMAP2+: board-generic: update SoC compatibility strings Documentation: dt: OMAP: explicitly state SoC compatible strings ARM: OMAP2+: enable AM33xx SOC EVM audio ARM: OMAP2+: Select USB PHY for AM335x SoC ARM: bcm2835: Fix grammar in help message ARM: msm: trout: fix uninit var warning ARM: dts: Use MSHC controller for eMMC memory for exynos4412-trats2 ARM: dts: Fix definition of MSHC device tree nodes for exynos4x12 ARM: dts: add clock provider for mshc node for Exynos4412 SOC clk: samsung: exynos4: Fix definition of div_mmc_pre4 divider ...
2014-01-23Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds55-522/+1177
Pull ARM updates from Russell King: "In this set, we have: - Refactoring of some of the old StrongARM-1100 GPIO code to make things simpler by Dmitry Eremin-Solenikov - Read-only and non-executable support for modules on ARM from Laura Abbot - Removal of unnecessary set_drvdata() calls in AMBA code - Some non-executable support for kernel lowmem mappings at the 1MB section granularity, and dumping of kernel page tables via debugfs - Some improvements for the timer/clock code on Footbridge platforms, and cleanup some of the LED code there - Fix fls/ffs() signatures to match x86 to prevent build warnings, particularly where these are used with min/max() macros - Avoid using the bootmem allocator on ARM (patches from Santosh Shilimkar) - Various asid/unaligned access updates from Will Deacon" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (51 commits) ARM: SMP implementations are not supposed to return from smp_ops.cpu_die() ARM: ignore memory below PHYS_OFFSET Fix select-induced Kconfig warning for ZBOOT_ROM ARM: fix ffs/fls implementations to match x86 ARM: 7935/1: sa1100: collie: add gpio-keys configuration ARM: 7932/1: bcm: Add DEBUG_LL console support ARM: 7929/1: Remove duplicate SCHED_HRTICK config option ARM: 7928/1: kconfig: select HAVE_EFFICIENT_UNALIGNED_ACCESS for CPUv6+ && MMU ARM: 7927/1: dcache: select DCACHE_WORD_ACCESS for big-endian CPUs ARM: 7926/1: mm: flesh out and fix the comments in the ASID allocator ARM: 7925/1: mm: keep track of last ASID allocation to improve bitmap searching ARM: 7924/1: mm: don't bother with reserved ttbr0 when running with LPAE ARM: PCI: add legacy IDE IRQ implementation ARM: footbridge: cleanup LEDs code ARM: pgd allocation: retry on failure ARM: footbridge: add one-shot mode for DC21285 timer ARM: footbridge: add sched_clock implementation ARM: 7922/1: l2x0: add Marvell Tauros3 support ARM: 7877/1: use built-in byte swap function ARM: 7921/1: mcpm: remove redundant dsb instructions prior to sev ...
2014-01-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds40-1009/+10527
Pull crypto update from Herbert Xu: "Here is the crypto update for 3.14: - Improved crypto_memneq helper - Use cyprto_memneq in arch-specific crypto code - Replaced orphaned DCP driver with Freescale MXS DCP driver - Added AVX/AVX2 version of AESNI-GCM encode and decode - Added AMD Cryptographic Coprocessor (CCP) driver - Misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (41 commits) crypto: aesni - fix build on x86 (32bit) crypto: mxs - Fix sparse non static symbol warning crypto: ccp - CCP device enabled/disabled changes crypto: ccp - Cleanup hash invocation calls crypto: ccp - Change data length declarations to u64 crypto: ccp - Check for caller result area before using it crypto: ccp - Cleanup scatterlist usage crypto: ccp - Apply appropriate gfp_t type to memory allocations crypto: drivers - Sort drivers/crypto/Makefile ARM: mxs: dts: Enable DCP for MXS crypto: mxs - Add Freescale MXS DCP driver crypto: mxs - Remove the old DCP driver crypto: ahash - Fully restore ahash request before completing crypto: aesni - fix build on x86 (32bit) crypto: talitos - Remove redundant dev_set_drvdata crypto: ccp - Remove redundant dev_set_drvdata crypto: crypto4xx - Remove redundant dev_set_drvdata crypto: caam - simplify and harden key parsing crypto: omap-sham - Fix Polling mode for larger blocks crypto: tcrypt - Added speed tests for AEAD crypto alogrithms in tcrypt test suite ...
2014-01-23Merge git://git.infradead.org/users/eparis/auditLinus Torvalds20-243/+404
Pull audit update from Eric Paris: "Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready. I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc rgb@redhat.com. He'll make fixing things his top priority" * git://git.infradead.org/users/eparis/audit: (50 commits) audit: whitespace fix in kernel-parameters.txt audit: fix location of __net_initdata for audit_net_ops audit: remove pr_info for every network namespace audit: Modify a set of system calls in audit class definitions audit: Convert int limit uses to u32 audit: Use more current logging style audit: Use hex_byte_pack_upper audit: correct a type mismatch in audit_syscall_exit() audit: reorder AUDIT_TTY_SET arguments audit: rework AUDIT_TTY_SET to only grab spin_lock once audit: remove needless switch in AUDIT_SET audit: use define's for audit version audit: documentation of audit= kernel parameter audit: wait_for_auditd rework for readability audit: update MAINTAINERS audit: log task info on feature change audit: fix incorrect set of audit_sock audit: print error message when fail to create audit socket audit: fix dangling keywords in audit_log_set_loginuid() output audit: log on errors from filter user rules ...
2014-01-23lib/decompress_unlz4.c: always set an error return code on failuresJan Beulich1-0/+1
"ret", being set to -1 early on, gets cleared by the first invocation of lz4_decompress()/lz4_decompress_unknownoutputsize(), and hence subsequent failures wouldn't be noticed by the caller without setting it back to -1 right after those calls. Reported-by: Matthew Daley <mattjd@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23romfs: fix returm err while getting inode in fill_superRui Xiang1-4/+2
Getting an inode by romfs_iget may lead to an err in fill_super, and the err value should be return. And it should return -ENOMEM instead while d_make_root fails, fix it too. Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/w1/masters/w1-gpio.c: add strong pullup emulationEvgeny Boger3-12/+23
Strong pullup is emulated by driving pin logic high after write command when using tri-state push-pull GPIO. Signed-off-by: Evgeny Boger <boger@contactless.ru> Cc: Greg KH <greg@kroah.com> Acked-by: David Fries <david@fries.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/memstick/host/rtsx_pci_ms.c: fix ms card data transfer bugMicky Ching1-10/+20
This patch is used to add support for ms card. The main difference between ms card and mspro card is long data transfer mode. mspro card can use auto mode DMA for long data transfer, but ms can not use this mode, it should use normal mode DMA. The memstick core added support for ms card, but the original driver will make ms card fail at initialization, because it uses auto mode DMA. This patch makes the ms card work properly. Signed-off-by: Micky Ching <micky_ching@realsil.com.cn> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23userns: relax the posix_acl_valid() checksAndreas Gruenbacher1-10/+0
So far, POSIX ACLs are using a canonical representation that keeps all ACL entries in a strict order; the ACL_USER and ACL_GROUP entries for specific users and groups are ordered by user and group identifier, respectively. The user-space code provides ACL entries in this order; the kernel verifies that the ACL entry order is correct in posix_acl_valid(). User namespaces allow to arbitrary map user and group identifiers which can cause the ACL_USER and ACL_GROUP entry order to differ between user space and the kernel; posix_acl_valid() would then fail. Work around this by allowing ACL_USER and ACL_GROUP entries to be in any order in the kernel. The effect is only minor: file permission checks will pick the first matching ACL_USER entry, and check all matching ACL_GROUP entries. (The libacl user-space library and getfacl / setfacl tools will not create ACLs with duplicate user or group idenfifiers; they will handle ACLs with entries in an arbitrary order correctly.) Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23arch/sh/kernel/dwarf.c: use rbtree postorder iteration helper instead of ↵Cody P Schafer1-14/+4
solution using repeated rb_erase() Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of using repeated rb_erase() calls Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs-ext3-use-rbtree-postorder-iteration-helper-instead-of-opencoding-fixAndrew Morton1-4/+4
use do{}while - more efficient and it squishes a coccinelle warning Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/ext3: use rbtree postorder iteration helper instead of opencodingCody P Schafer1-31/+5
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/jffs2: use rbtree postorder iteration helper instead of opencodingCody P Schafer2-49/+5
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/ext4: use rbtree postorder iteration helper instead of opencodingCody P Schafer2-59/+9
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Michel Lespinasse <walken@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/ubifs: use rbtree postorder iteration helper instead of opencodingCody P Schafer6-114/+17
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23net/netfilter/ipset/ip_set_hash_netiface.c: use rbtree postorder iteration ↵Cody P Schafer1-23/+4
instead of opencoding Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rbtree/test: test rbtree_postorder_for_each_entry_safe()Cody P Schafer1-0/+11
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rbtree/test: move rb_node to the middle of the test structCody P Schafer1-1/+1
Avoid making the rb_node the first entry to catch some bugs around NULL checking the rb_node. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rapidio: add modular rapidio core build into powerpc and mips branchesAlexandre Bounine2-3/+3
Allow modular build option for RapidIO subsystem core in MIPS and PowerPC architectural branches. At this moment modular RapidIO subsystem build is enabled only for platforms that use PCI/PCIe based RapidIO controllers (e.g. Tsi721). Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Jean Delvare <jdelvare@suse.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23partitions/efi: complete documentation of gpt kernel param purposeDavidlohr Bueso1-1/+3
The usage of the 'gpt' kernel parameter is twofold: (i) skip any mbr integrity checks and (ii) enable the backup GPT header to be used in situations where the primary one is corrupted. This last "feature" is not obvious and needs to be properly documented in the kernel-parameters document. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=63591 Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: "Chandramouleeswaran,Aswin" <aswin@hp.com> Cc: Chris Murphy <bugzilla@colorremedies.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kdump: add /sys/kernel/vmcoreinfo ABI documentationVivek Goyal1-0/+14
/sys/kernel/vmcoreinfo was introduced long back but there is no ABI documentation. This patch adds the documentation. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: Dan Aloni <da-x@monatomic.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kdump: fix exported size of vmcoreinfo noteVivek Goyal1-1/+1
Right now we seem to be exporting the max data size contained inside vmcoreinfo note. But this does not include the size of meta data around vmcore info data. Like name of the note and starting and ending elf_note. I think user space expects total size and that size is put in PT_NOTE elf header. Things seem to be fine so far because we are not using vmcoreinfo note to the maximum capacity. But as it starts filling up, to capacity, at some point of time, problem will be visible. I don't think user space will be broken with this change. So there is no need to introduce vmcoreinfo2. This change is safe and backward compatible. More explanation on why this change is safe is below. vmcoreinfo contains information about kernel which user space needs to know to do things like filtering. For example, various kernel config options or information about size or offset of some data structures etc. All this information is commmunicated to user space with an ELF note present in ELF /proc/vmcore file. Currently vmcoreinfo data size is 4096. With some elf note meta data around it, actual size is 4132 bytes. But we are using barely 25% of that size. Rest is empty. So even if we tell user space that size of ELf note is 4096 and not 4132, nothing will be broken becase after around 1000 bytes, everything is zero anyway. But once we start filling up the note to the capacity, and not report the full size of note, bad things will start happening. Either some data will be lost or tools will be confused that they did not fine the zero note at the end. So I think this change is safe and should not break existing tools. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: Dan Aloni <da-x@monatomic.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kexec: add sysctl to disable kexec_loadKees Cook4-2/+30
For general-purpose (i.e. distro) kernel builds it makes sense to build with CONFIG_KEXEC to allow end users to choose what kind of things they want to do with kexec. However, in the face of trying to lock down a system with such a kernel, there needs to be a way to disable kexec_load (much like module loading can be disabled). Without this, it is too easy for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM and modules_disabled are set. With this change, it is still possible to load an image for use later, then disable kexec_load so the image (or lack of image) can't be altered. The intention is for using this in environments where "perfect" enforcement is hard. Without a verified boot, along with verified modules, and along with verified kexec, this is trying to give a system a better chance to defend itself (or at least grow the window of discoverability) against attack in the face of a privilege escalation. In my mind, I consider several boot scenarios: 1) Verified boot of read-only verified root fs loading fd-based verification of kexec images. 2) Secure boot of writable root fs loading signed kexec images. 3) Regular boot loading kexec (e.g. kcrash) image early and locking it. 4) Regular boot with no control of kexec image at all. 1 and 2 don't exist yet, but will soon once the verified kexec series has landed. 4 is the state of things now. The gap between 2 and 4 is too large, so this change creates scenario 3, a middle-ground above 4 when 2 and 1 are not possible for a system. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/exec.c: call arch_pick_mmap_layout() only onceRichard Weinberger1-1/+0
Currently both setup_new_exec() and flush_old_exec() issue a call to arch_pick_mmap_layout(). As setup_new_exec() and flush_old_exec() are always called pairwise arch_pick_mmap_layout() is called twice. This patch removes one call from setup_new_exec() to have it only called once. Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Pat Erley <pat-lkml@erley.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23exec: avoid propagating PF_NO_SETAFFINITY into userspace childZhang Yi1-2/+2
Userspace process doesn't want the PF_NO_SETAFFINITY, but its parent may be a kernel worker thread which has PF_NO_SETAFFINITY set, and this worker thread can do kernel_thread() to create the child. Clearing this flag in usersapce child to enable its migrating capability. Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel/signal.c: change do_signal_stop/do_sigaction to use while_each_thread()Oleg Nesterov1-4/+3
Change do_signal_stop() and do_sigaction() to avoid next_thread() and use while_each_thread() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Sameer Nanda <snanda@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel/sys.c: k_getrusage() can use while_each_thread()Oleg Nesterov1-2/+1
Change k_getrusage() to use while_each_thread(), no changes in the compiled code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Sameer Nanda <snanda@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/proc/array.c: change do_task_stat() to use while_each_thread()Oleg Nesterov1-2/+1
Change the remaining next_thread (ab)users to use while_each_thread(). The last user which should be changed is next_tid(), but we can't do this now. __exit_signal() and complete_signal() are fine, they actually need next_thread() logic. This patch (of 3): do_task_stat() can use while_each_thread(), no changes in the compiled code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Sameer Nanda <snanda@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23exec: kill task_struct->did_execOleg Nesterov4-6/+2
We can kill either task->did_exec or PF_FORKNOEXEC, they are mutually exclusive. The patch kills ->did_exec because it has a single user. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23exec: move the final allow_write_access/fput into free_bprm()Oleg Nesterov1-15/+5
Both success/failure paths cleanup bprm->file, we can move this code into free_bprm() to simlify and cleanup this logic. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23exec:check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logicOleg Nesterov1-21/+8
fs_struct->in_exec == T means that this ->fs is used by a single process (thread group), and one of the treads does do_execve(). To avoid the mt-exec races this code has the following complications: 1. check_unsafe_exec() returns -EBUSY if ->in_exec was already set by another thread. 2. do_execve_common() records "clear_in_exec" to ensure that the error path can only clear ->in_exec if it was set by current. However, after 9b1bf12d5d51 "signals: move cred_guard_mutex from task_struct to signal_struct" we do not need these complications: 1. We can't race with our sub-thread, this is called under per-process ->cred_guard_mutex. And we can't race with another CLONE_FS task, we already checked that this fs is not shared. We can remove the dead -EAGAIN logic. 2. "out_unmark:" in do_execve_common() is either called under ->cred_guard_mutex, or after de_thread() which kills other threads, so we can't race with sub-thread which could set ->in_exec. And if ->fs is shared with another process ->in_exec should be false anyway. We can clear in_exec unconditionally. This also means that check_unsafe_exec() can be void. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23exec:check_unsafe_exec: use while_each_thread() rather than next_thread()Oleg Nesterov1-1/+2
next_thread() should be avoided, change check_unsafe_exec() to use while_each_thread(). Nobody except signal->curr_target actually needs next_thread-like code, and we need to change (fix) this interface. This particular code is fine, p == current. But in general the code like this can loop forever if p exits and next_thread(t) can't reach the unhashed thread. This also saves 32 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel/fork.c: remove redundant NULL check in dup_mm()Daeseok Youn1-3/+0
current->mm doesn't need a NULL check in dup_mm(). Becasue dup_mm() is used only in copy_mm() and current->mm is checked whether it is NULL or not in copy_mm() before calling dup_mm(). Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel/fork.c: fix coding style issuesDaeseok Youn1-2/+2
Fix errors reported by checkpatch.pl. One error is parentheses, the other is a whitespace issue. Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel/fork.c: make dup_mm() staticDaeSeok Youn2-3/+1
dup_mm() is used only in kernel/fork.c Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/proc: don't use module_init for non-modular core codePaul Gortmaker16-16/+16
PROC_FS is a bool, so this code is either present or absent. It will never be modular, so using module_init as an alias for __initcall is rather misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be ugly at best. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of fs_initcall (which makes sense for fs code) will thus change these registrations from level 6-device to level 5-fs (i.e. slightly earlier). However no observable impact of that small difference has been observed during testing, or is expected. Also note that this change uncovers a missing semicolon bug in the registration of vmcore_init as an initcall. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/proc_namespace.c: simplify testing nsp and nsp->mnt_nsAxel Lin1-6/+1
Trivial cleanup to eliminate a goto. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/proc/proc_devtree.c: remove empty /proc/device-tree when no openfirmware ↵Dave Jones1-0/+1
exists. Distribution kernels might want to build in support for /proc/device-tree for kernels that might end up running on hardware that doesn't support openfirmware. This results in an empty /proc/device-tree existing. Remove it if the OFW root node doesn't exist. This situation actually confuses grub2, resulting in install failures. grub2 sees the /proc/device-tree and picks the wrong install target cf. http://bzr.savannah.gnu.org/lh/grub/trunk/grub/annotate/4300/util/grub-install.in#L311 grub should be more robust, but still, leaving an empty proc dir seems pointless. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=818378. Signed-off-by: Dave Jones <davej@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: set attributes of pde using accessor functionsRui Xiang2-4/+3
Use existing accessors proc_set_user() and proc_set_size() to set attributes. Just a cleanup. Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: fix ->f_pos overflows in first_tid()Oleg Nesterov1-5/+9
1. proc_task_readdir()->first_tid() path truncates f_pos to int, this is wrong even on 64bit. We could check that f_pos < PID_MAX or even INT_MAX in proc_task_readdir(), but this patch simply checks the potential overflow in first_tid(), this check is nop on 64bit. We do not care if it was negative and the new unsigned value is huge, all we need to ensure is that we never wrongly return !NULL. 2. Remove the 2nd "nr != 0" check before get_nr_threads(), nr_threads == 0 is not distinguishable from !pid_task() above. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: don't (ab)use ->group_leader in proc_task_readdir() pathsOleg Nesterov1-28/+24
proc_task_readdir() does not really need "leader", first_tid() has to revalidate it anyway. Just pass proc_pid(inode) to first_tid() instead, it can do pid_task(PIDTYPE_PID) itself and read ->group_leader only if necessary. The patch also extracts the "inode is dead" code from pid_delete_dentry(dentry) into the new trivial helper, proc_inode_is_dead(inode), proc_task_readdir() uses it to return -ENOENT if this dir was removed. This is a bit racy, but the race is very inlikely and the getdents() after openndir() can see the empty "." + ".." dir only once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: change first_tid() to use while_each_thread() rather than next_thread()Oleg Nesterov1-10/+10
Rerwrite the main loop to use while_each_thread() instead of next_thread(). We are going to fix or replace while_each_thread(), next_thread() should be avoided whenever possible. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: fix the potential use-after-free in first_tid()Oleg Nesterov1-0/+3
proc_task_readdir() verifies that the result of get_proc_task() is pid_alive() and thus its ->group_leader is fine too. However this is not necessarily true after rcu_read_unlock(), we need to recheck this again after first_tid() does rcu_read_lock(). Otherwise leader->thread_group.next (used by next_thread()) can be invalid if the rcu grace period expires in between. The race is subtle and unlikely, but still it is possible afaics. To simplify lets ignore the "likely" case when tid != 0, f_version can be cleared by proc_task_operations->llseek(). Suppose we have a main thread M and its subthread T. Suppose that f_pos == 3, iow first_tid() should return T. Now suppose that the following happens between rcu_read_unlock() and rcu_read_lock(): 1. T execs and becomes the new leader. This removes M from ->thread_group but next_thread(M) is still T. 2. T creates another thread X which does exec as well, T goes away. 3. X creates another subthread, this increments nr_threads. 4. first_tid() does next_thread(M) and returns the already dead T. Note also that we need 2. and 3. only because of get_nr_threads() check, and this check was supposed to be optimization only. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23proc: cleanup/simplify get_task_state/task_state_arrayOleg Nesterov2-13/+4
get_task_state() and task_state_array[] look confusing and suboptimal, it is not clear what it can actually report to user-space and task_state_array[] blows .data for no reason. 1. state = (tsk->state & TASK_REPORT) | tsk->exit_state is not clear. TASK_REPORT is self-documenting but it is not clear what ->exit_state can add. Move the potential exit_state's (EXIT_ZOMBIE and EXIT_DEAD) into TASK_REPORT and use it to calculate the final result. 2. With the change above it is obvious that task_state_array[] has the unused entries just to make BUILD_BUG_ON() happy. Change this BUILD_BUG_ON() to use TASK_REPORT rather than TASK_STATE_MAX and shrink task_state_array[]. 3. Turn the "while (state)" loop into fls(state). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23coredump: make __get_dumpable/get_dumpable inline, kill fs/coredump.hOleg Nesterov4-29/+17
1. Remove fs/coredump.h. It is not clear why do we need it, it only declares __get_dumpable(), signal.c includes it for no reason. 2. Now that get_dumpable() and __get_dumpable() are really trivial make them inline in linux/sched.h. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alex Kelly <alex.page.kelly@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23coredump: kill MMF_DUMPABLE and MMF_DUMP_SECURELYOleg Nesterov2-18/+7
Nobody actually needs MMF_DUMPABLE/MMF_DUMP_SECURELY, they are only used to enforce the encoding of SUID_DUMP_* enum in mm->flags & MMF_DUMPABLE_MASK. Now that set_dumpable() updates both bits atomically we can kill them and simply store the value "as is" in 2 lower bits. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alex Kelly <alex.page.kelly@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23coredump: set_dumpable: fix the theoretical race with itselfOleg Nesterov1-34/+15
set_dumpable() updates MMF_DUMPABLE_MASK in a non-trivial way to ensure that get_dumpable() can't observe the intermediate state, but this all can't help if multiple threads call set_dumpable() at the same time. And in theory commit_creds()->set_dumpable(SUID_DUMP_ROOT) racing with sys_prctl()->set_dumpable(SUID_DUMP_DISABLE) can result in SUID_DUMP_USER. Change this code to update both bits atomically via cmpxchg(). Note: this assumes that it is safe to mix bitops and cmpxchg. IOW, if, say, an architecture implements cmpxchg() using the locking (like arch/parisc/lib/bitops.c does), then it should use the same locks for set_bit/etc. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alex Kelly <alex.page.kelly@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Documentation/cpu-hotplug.txt: fix a typo in example codeSangjung Woo1-1/+1
As the notifier_block name (i.e. foobar_cpu_notifer) is different from the parameter (i.e.foobar_cpu_notifier) of register function, that is definitely error and it also makes readers confused. Signed-off-by: Sangjung Woo <sangjung.woo@samsung.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Kconfig: update flightly outdated CONFIG_SMP documentationRobert Graffham11-40/+40
Remove an outdated reference to "most personal computers" having only one CPU, and change the use of "singleprocessor" and "single processor" in CONFIG_SMP's documentation to "uniprocessor" across all arches where that documentation is present. Signed-off-by: Robert Graffham <psquid@psquid.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Documentation/filesystems/00-INDEX: updatesFabian Frederick1-13/+45
Add the following documentation-files with description : -autofs4-mount-control.txt -btrfs.txt -debugfs.txt -devpts.txt -fiemap.txt -gfs2-glocks.txt -gfs2-uevents.txt -omfs.txt -path-lookup.txt -qnx6.txt -quota.txt -squashfs.txt -sysfs-tagging.txt -ubifs.txt -xfs-delayed-logging-design.txt -xfs-self-describing-metadata.txt Add the following documentation directories with description : -caching -cifs (replacing cifs.txt) -pohmelfs Remove the following documentation-files reference: -dentry-locking.txt -reiser4.txt Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Documentation/blockdev/ramdisk.txt: updatesFabian Frederick1-6/+15
- ramdisk_blocksize doesn't exist anymore - Module parameters added to documentation Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Documentation/filesystems/sysfs.txt: fix device_attribute declarationAndre Richter1-3/+3
Fix a wrong device_attribute declaration example. Signed-off-by: Andre Richter <andre.o.richter@gmail.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23hfsplus: remove hfsplus_file_lookup()Sougata Santra1-59/+0
HFS+ resource fork lookup breaks opendir() library function. Since opendir first calls open() with O_DIRECTORY flag set. O_DIRECTORY means "refuse to open if not a directory". The open system call in the kernel does a check for inode->i_op->lookup and returns -ENOTDIR. So if hfsplus_file_lookup is set it allows opendir() for plain files. Also resource fork lookup in HFS+ does not work. Since it is never invoked after VFS permission checking. It will always return with -EACCES. When we call opendir() on a file, it does not return NULL. opendir() library call is based on open with O_DIRECTORY flag passed and then layered on top of getdents() system call. O_DIRECTORY means "refuse to open if not a directory". The open() system call in the kernel does a check for: do_sys_open() -->..--> can_lookup() i.e it only checks inode->i_op->lookup and returns ENOTDIR if this function pointer is not set. In OSX, we can open "file/rsrc" to get the resource fork of "file". This behavior is emulated inside hfsplus on Linux, which means that to some degree every file acts like a directory. That is the reason lookup() inode operations is supported for files, and it is possible to do a lookup on this specific name. As a result of this open succeeds without returning ENOTDIR for HFS+ Please see the LKML discussion thread on this issue: http://marc.info/?l=linux-fsdevel&m=122823343730412&w=2 I tried to test file/rsrc lookup in HFS+ driver and the feature does not work. From OSX: $ touch test $ echo "1234" > test/..namedfork/rsrc $ ls -l test..namedfork/rsrc --rw-r--r-- 1 tuxera staff 5 10 dec 12:59 test/..namedfork/rsrc [sougata@ultrabook tmp]$ id uid=1000(sougata) gid=1000(sougata) groups=1000(sougata),5(tty),18(dialout),1001(vboxusers) [sougata@ultrabook tmp]$ mount /dev/sdb1 on /mnt/tmp type hfsplus (rw,relatime,umask=0,uid=1000,gid=1000,nls=utf8) [sougata@ultrabook tmp]$ ls -l test/rsrc ls: cannot access test/rsrc: Permission denied According to this LKML thread it is expected behavior. http://marc.info/?t=121139033800008&r=1&w=4 I guess now that permission checking happens in vfs generic_permission() ? So it turns out that even though the lookup() inode_operation exists for HFS+ files. It cannot really get invoked ?. So if we can disable this feature to make opendir() work for HFS+. Signed-off-by: Sougata Santra <sougata@tuxera.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23nilfs2: add comments for ioctlsVyacheslav Dubeyko2-1/+418
Add comments for ioctls in fs/nilfs2/ioctl.c file and describe NILFS2 specific ioctls in Documentation/filesystems/nilfs2.txt. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Wenliang Fan <fanwlexca@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/nilfs2: fix integer overflow in nilfs_ioctl_wrap_copy()Wenliang Fan1-0/+8
The local variable 'pos' in nilfs_ioctl_wrap_copy function can overflow if a large number was passed to argv->v_index from userspace and the sum of argv->v_index and argv->v_nmembs exceeds the maximum value of __u64 type integer (= ~(__u64)0 = 18446744073709551615). Here, argv->v_index is a 64-bit width argument to specify the start position of target data items (such as segment number, checkpoint number, or virtual block address of nilfs), and argv->v_nmembs gives the total number of the items that userland programs (such as lssu, lscp, or cleanerd) want to get information about, which also gives the maximum element count of argv->v_base[] array. nilfs_ioctl_wrap_copy() calls dofunc() repeatedly and increments the position variable 'pos' at the end of each iteration if dofunc() itself didn't update 'pos': if (pos == ppos) pos += n; This patch prevents the overflow here by rejecting pairs of a start position (argv->v_index) and a total count (argv->v_nmembs) which leads to the overflow. [konishi.ryusuke@lab.ntt.co.jp: fix signedness issue] Signed-off-by: Wenliang Fan <fanwlexca@gmail.com> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/pipe.c: skip file_update_time on frozen fsDmitry Monakhov1-1/+2
Pipe has no data associated with fs so it is not good idea to block pipe_write() if FS is frozen, but we can not update file's time on such filesystem. Let's use same idea as we use in touch_time(). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=65701 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-rx8581.c: add SMBus-only adapters supportAndreas Werner1-13/+68
Add support for SMBus-only adapters (e.g. i2c-piix4). The driver has implemented only support for I2C adapters which implement the I2C_FUNC_SMBUS_I2C_BLOCK functionality before. With this patch it is possible to load and use the RTC driver with I2C and SMBUS adapters like the rtc-ds1307 does. Tested on AMD G Series Platform (i2c-piix4 adapter driver). Signed-off-by: Andreas Werner <andreas.werner@men.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-s5m.c: s5m_rtc_{suspend,resume}() should depend on ↵Geert Uytterhoeven1-0/+2
CONFIG_PM_SLEEP If CONFIG_PM_SLEEP=n: drivers/rtc/rtc-s5m.c:643: warning: `s5m_rtc_resume' defined but not used drivers/rtc/rtc-s5m.c:654: warning: `s5m_rtc_suspend' defined but not used Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rtc: max8907: weekday encoding fixesStephen Warren1-6/+5
The current MAX8907 driver has two issues related to weekday value handling: 1) The HW WEEKDAY register has range 0..6 rather than 1..7 as documented. Note that I validated the actual HW range by observing the HW register roll from 6->0 rather than 6->7->1 as would otherwise be expected. This matches Linux's tm_wday range of 0..6. When the CMOS RAM content is lost, the date returned from the device is 2007-01-01 00:00:00, which is a Monday. The WEEKDAY register reads 1 in this case. This matches the numbering in Linux's tm_wday field. Hence we should write Linux's tm_wday value to the register without modifying it. Hence, remove the +1/-1 calculations for WEEKDAY/tm_wday. 2) There's no need to make alarms match on the WEEKDAY register, since the other fields together uniquely define the alarm date/time. Ignoring the WEEKDAY value in the match isolates the driver from any incorrect value in the current time copy of the WEEKDAY register. Each change individually, or both together, solves an issue that I observed; "hwclock -r" would time out waiting for its alarm to fire if the CMOS RAM content had been lost, and hence the WEEKDAY register value mismatched what the driver expected it to be. "hwclock -w" would solve this by over-writing the HW default WEEKDAY register value with what the driver expected. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-hym8563.c: staticize local symbolSachin Kamat1-1/+1
'hym8563_clkout_ops' is used only in this file. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-hym8563.c: remove redundant of_match_ptr() helperSachin Kamat1-1/+1
'hym8563_dt_idtable' is always compiled in. Hence the helper macro is not needed. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-ds1742.c: remove redundant of_match_ptr() helperSachin Kamat1-1/+1
'ds1742_rtc_of_match' is always compiled in. Hence the helper macro is not needed. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-cmos.c: propagate hpet_register_irq_handler() failureAndrew Morton1-4/+2
If hpet_register_irq_handler() fails, cmos_do_probe() will incorrectly return 0. Reported-by: Julia Lawall <julia.lawall@lip6.fr> Cc: John Stultz <john.stultz@linaro.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rtc: honor device tree /alias entries when assigning IDsStephen Warren1-4/+20
Assign RTC device IDs based on device tree /aliases entries if present, falling back to the existing numbering scheme if there is no /aliases entry (which includes when the system isn't booted using DT), or there is a numbering conflict. This is useful in systems with multiple RTC devices, to ensure that the best RTC device is selected as /dev/rtc0, which provides the overall system time. For example, Tegra has an on-SoC RTC that is not battery backed, typically coupled with an off-SoC RTC that is battery backed. Only the latter is useful for populating the system time, yet the former is useful e.g. for wakeup timing, since the time is not lost when the system is sleeps. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-pcf2127.c: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERODuan Jiong1-4/+1
Fix a coccinelle error regarding usage of IS_ERR and PTR_ERR instead of PTR_ERR_OR_ZERO. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/Kconfig: disable RTC_DRV_CMOS on AtariGeert Uytterhoeven1-1/+1
On ARAnyM (emulating an Atari Falcon, which doesn't have an RTC IRQ, as the Second Multi Function Peripheral MFP 68901 is available on Atari TT only), rtc-cmos doesn't work well: - The date is of by 32 years (2045 instead of 2013): rtc_cmos rtc_cmos: setting system clock to 2045-12-02 10:56:17 UTC (2395824977) - The hwclock utility doesn't work: hwclock: ioctl() to /dev/rtc to turn on update interrupts failed unexpectedly, errno=5: Input/output error. As rtc-generic works fine for the RTC part, and nvram works for the NVRAM part, we'll continue on using that. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-cmos.c: remove superfluous name castGeert Uytterhoeven1-1/+1
device_driver.name is "const char *" Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23rtc: add hym8563 rtc-driverHeiko Stuebner3-0/+618
The Haoyu Microelectronics HYM8563 provides rtc and alarm functions as well as a clock output of up to 32kHz. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Grant Likely <grant.likely@linaro.org> Cc: Mike Turquette <mturquette@linaro.org> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23dt-bindings: add hym8563 bindingHeiko Stuebner2-0/+28
Add binding documentation for the hym8563 rtc chip. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Grant Likely <grant.likely@linaro.org> Cc: Mike Turquette <mturquette@linaro.org> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-vr41xx.c: use devm_*() functionsJingoo Han1-38/+12
Use devm_*() functions to make cleanup paths simpler, and remove unnecessary remove(). Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Yoichi Yuasa <yuasa@linux-mips.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-twl.c: use devm_*() functionsJingoo Han1-25/+13
Use devm_*() functions to make cleanup paths simpler, and remove unnecessary remove(). Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Yoichi Yuasa <yuasa@linux-mips.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-ds1742.c: add devicetree supportAlexander Shiyan2-1/+21
This patch allows the driver to be enabled with devicetree. Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-mxc.c: check the return value from clk_prepare_enable()Fabio Estevam1-1/+4
clk_prepare_enable() may fail, so let's check its return value and propagate it in the case of error. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-mxc.c: remove unneeded labelFabio Estevam1-4/+1
There is no need to jump to the 'exit_free_pdata' label when devm_clk_get() fails, as we can directly return the error and simplify the code a bit. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-ds1305.c: remove unnecessary spi_set_drvdata()Jingoo Han1-1/+0
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/rtc/rtc-as3722: use devm for rtc and irq registrationLaxman Dewangan1-16/+3
Use devm_* calls for rtc and irq registration and get rid of remove callback for platform driver. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Cc: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23autofs: fix symlinks aren't checked for expiryIan Kent2-0/+18
The autofs4 module doesn't consider symlinks for expire as it did in the older autofs v3 module (so it's actually a long standing regression). The user space daemon has focused on the use of bind mounts instead of symlinks for a long time now and that's why this has not been noticed. But with the future addition of amd map parsing to automount(8), not to mention amd itself (of am-utils), symlink expiry will be needed. The direct and offset mount types can't be symlinks and the tree mounts of version 4 were always real mounts so only indirect mounts need expire symlinks. Since the current users of the autofs4 module haven't reported this as a problem to date this patch probably isn't a candidate for backport to stable. Signed-off-by: Ian Kent <ikent@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23autofs: use IS_ROOT to replace root dentry checksRui Xiang1-3/+3
Use the helper macro !IS_ROOT to replace parent != dentry->d_parent. Just clean up. Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23autofs: fix the return value of autofs4_fill_superRui Xiang1-5/+8
While kzallocing sbi/ino fails, it should return -ENOMEM. And it should return the err value from autofs_prepare_pipe. Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23autofs4: translate pids to the right namespace for the daemonMiklos Szeredi1-2/+14
The PID and the TGID of the process triggering the mount are sent to the daemon. Currently the global pid values are sent (ones valid in the initial pid namespace) but this is wrong if the autofs daemon itself is not running in the initial pid namespace. So send the pid values that are valid in the namespace of the autofs daemon. The namespace to use is taken from the oz_pgrp pid pointer, which was set at mount time to the mounting process' pid namespace. If the pid translation fails (the triggering process is in an unrelated pid namespace) then the automount fails with ENOENT. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Eric Biederman <ebiederm@xmission.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23autofs4: allow autofs to work outside the initial PID namespaceSukadev Bhattiprolu3-13/+43
Enable autofs4 to work in a "container". oz_pgrp is converted from pid_t to struct pid and this is stored at mount time based on the "pgrp=" option or if the option is missing then the current pgrp. The "pgrp=" option is interpreted in the PID namespace of the current process. This option is flawed in that it doesn't carry the namespace information, so it should be deprecated. AFAICS the autofs daemon always sends the current pgrp, which is the default anyway. The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl. This ioctl sets oz_pgrp to the current pgrp. It is not allowed to change the pid namespace. oz_pgrp is used mainly to determine whether the process traversing the autofs mount tree is the autofs daemon itself or not. This function now compares the pid pointers instead of the pid_t values. One other use of oz_pgrp is in autofs4_show_options. There is shows the virtual pid number (i.e. the one that is valid inside the PID namespace of the calling process) For debugging printk convert oz_pgrp to the value in the initial pid namespace. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Eric Biederman <ebiederm@xmission.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23init: fix possible format string bugTetsuo Handa2-4/+5
Use constant format string in case message changes. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23init/main.c: remove unused declaration of tc_init()Geert Uytterhoeven1-4/+0
Its user was removed in v2.5.2.4. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/ramfs: move ramfs_aops to inode.cAxel Lin4-15/+7
ramfs_aops is identical in file-mmu.c and file-nommu.c. Thus move it to fs/ramfs/inode.c and make it static. Signed-off-by: Axel Lin <axel.lin@ingics.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs/ramfs/file-nommu.c: make ramfs_nommu_get_unmapped_area() and ↵Axel Lin2-9/+8
ramfs_nommu_mmap() static Since commit 853ac43ab194 ("shmem: unify regular and tiny shmem"), ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() are not directly referenced outside of file-nommu.c. Thus make them static. Signed-off-by: Axel Lin <axel.lin@ingics.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23fs: binfmt_elf: remove unused defines INTERPRETER_NONE and INTERPRETER_ELFTodor Minchev1-3/+0
These two defines are unused since the removal of the a.out interpreter support in the ELF loader in kernel 2.6.25 Signed-off-by: Todor Minchev <todor@minchev.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: prefer ether_addr_copy to memcpy(foo, bar, ETH_ALEN)Joe Perches1-0/+10
ether_addr_copy was added for kernel version 3.14. It's slightly smaller/faster for some arches. Encourage its use. Signed-off-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: add DT compatible string documentation checksRob Herring1-0/+27
This adds a simple check that any compatible strings in DeviceTree dts files are present in Documentation/devicetree/bindings. Vendor prefixes are also checked for existing in vendor-prefixes.txt These should be temporary checks until we have more sophisticated binding schema checking. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Joe Perches <joe@perches.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: only flag FSF address, not gnu.org URLAlexander Duyck1-3/+1
This change restricts the check for the for the FSF address in the GPL copyright statement so that it only flags the address, not the references to the gnu.org/licenses URL which appears to be used in numerous drivers. The idea is to still allow some reference to an external copy of the GPL in the event that files are copied out of the kernel tree without the COPYING file. So for example this statement will still return an error: You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. However, this statement will not return an error after this patch: You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: add tests for function pointer style misusesJoe Perches1-0/+59
Kernel style uses function pointers in this form: "type (*funcptr)(args...)" Emit warnings when this function pointer form isn't used. Signed-off-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Derek Perrin <d.roc16@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: update the FSF/GPL address checkJoe Perches1-5/+6
The FSF address check is a bit too verbose looking for the GPL text. Quiet it a bit by requiring --strict for the GPL bit. Also make the address tests match a few uses of abbreviations for street names and make it case insensitive. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: check for if's with unnecessary parenthesesJoe Perches1-0/+14
If statements don't need multiple parentheses around tested comparisons like "if ((foo == bar))". An == comparison maybe a sign of an intended assignment, so emit a slightly different message if so. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: improve space before tab --fix optionJoe Perches1-2/+4
This test should remove all the spaces before a tab not just one space. Substitute a tab for each 8 space block before a tab and remove less than 8 spaces before a tab. This SPACE_BEFORE_TAB test is done after CODE_INDENT. If there are spaces used at the beginning of a line that should be converted to tabs, please make sure that the CODE_INDENT test and conversion is done before this SPACE_BEFORE_TAB test and conversion. Reported-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Joe Perches <joe@perches.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: add a --fix-inplace optionJoe Perches1-1/+9
Add the ability to fix and overwrite existing files/patches instead of creating a new file "<filename>.EXPERIMENTAL-checkpatch-fixes". Suggested-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: attempt to find missing switch/case break;Joe Perches1-0/+25
switch case statements missing a break statement are an unfortunately common error. e.g.: commit 4a2c94c9b6c0 ("HID: kye: Add report fixup for Genius Manticore Keyboard") case blocks should end in a break/return/goto/continue. If a fall-through is used, it should have a comment showing that it is intentional. Ideally that comment should be something like: "/* fall-through */" Add a test to look for missing break statements. This looks only at the context lines before an inserted case so it's possible to have false positives when the context contains a close brace and the break is before the brace and not part of the patch context. Looking at recent patches, this is a pretty rare occurrence. The normal kernel style uses a break as the last line of the previous block. Signed-off-by: Joe Perches <joe@perche.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: add warning of future __GFP_NOFAIL useDavid Rientjes1-0/+6
gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and no new users should be added. Add a warning to checkpatch to catch this. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: warn only on "space before semicolon" at end of lineJoe Perches1-1/+1
The "space before a non-naked semicolon" test has unwanted output when used in "for ( ;; )" loops. Make the test work only on end-of-line statement termination semicolons. Signed-off-by: Joe Perches <joe@perches.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23checkpatch: more comprehensive split strings warningJoe Perches1-8/+4
The current checkpatch test for split strings does not find several cases that should be found. For instance: /* Else poor success; go back to mode in "active" table */ } else { IWL_DEBUG_RATE(mvm, - "LQ: GOING BACK TO THE OLD TABLE suc=%d cur-tpt=%d old-tpt=%d\n", + "GOING BACK TO THE OLD TABLE: SR %d " + "cur-tpt %d old-tpt %d\n", window->success_ratio, window->average_tpt, lq_sta->last_tpt); does not currently emit a warning. Improve the test to find these cases. Add more exceptions to reduce false positives for assembly and octal/hex string constants. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23firmware/dmi_scan: generalize for use by other archsArd Biesheuvel6-15/+24
This patch makes a couple of changes to the SMBIOS/DMI scanning code so it can be used on other archs (such as ARM and arm64): (a) wrap the calls to ioremap()/iounmap(), this allows the use of a flavor of ioremap() more suitable for random unaligned access; (b) allow the non-EFI fallback probe into hardcoded physical address 0xF0000 to be disabled. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Grant Likely <grant.likely@linaro.org> Cc: Ingo Molnar <mingo@elte.hu> Cc "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23test: check copy_to/from_user boundary validationKees Cook5-0/+138
To help avoid an architecture failing to correctly check kernel/user boundaries when handling copy_to_user, copy_from_user, put_user, or get_user, perform some simple tests and fail to load if any of them behave unexpectedly. Specifically, this is to make sure there is a way to notice if things like what was fixed in commit 8404663f81d2 ("ARM: 7527/1: uaccess: explicitly check __user pointer when !CPU_USE_DOMAINS") ever regresses again, for any architecture. Additionally, adds new "user" selftest target, which loads this module. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23test: add minimal module for verification testingKees Cook3-0/+48
This is a pair of test modules I'd like to see in the tree. Instead of putting these in lkdtm, where I've been adding various tests that trigger crashes, these don't make sense there since they need to be either distinctly separate, or their pass/fail state don't need to crash the machine. These live in lib/ for now, along with a few other in-kernel test modules, and use the slightly more common "test_" naming convention, instead of "test-". We should likely standardize on the former: $ find . -name 'test_*.c' | grep -v /tools/ | wc -l 4 $ find . -name 'test-*.c' | grep -v /tools/ | wc -l 2 The first is entirely a no-op module, designed to allow simple testing of the module loading and verification interface. It's useful to have a module that has no other uses or dependencies so it can be reliably used for just testing module loading and verification. The second is a module that exercises the user memory access functions, in an effort to make sure that we can quickly catch any regressions in boundary checking (e.g. like what was recently fixed on ARM). This patch (of 2): When doing module loading verification tests (for example, with module signing, or LSM hooks), it is very handy to have a module that can be built on all systems under test, isn't auto-loaded at boot, and has no device or similar dependencies. This creates the "test_module.ko" module for that purpose, which only reports its load and unload to printk. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23lib/cmdline.c: declare exported symbols immediatelyFelipe Contreras1-3/+2
WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable +EXPORT_SYMBOL(memparse); WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable +EXPORT_SYMBOL(get_option); WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable +EXPORT_SYMBOL(get_options); Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Cc: Levente Kurusa <levex@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23lib/cmdline.c: fix style issuesFelipe Contreras1-5/+4
WARNING: space prohibited between function name and open parenthesis '(' +int get_option (char **str, int *pint) WARNING: space prohibited between function name and open parenthesis '(' + *pint = simple_strtol (cur, str, 0); ERROR: trailing whitespace + $ WARNING: please, no spaces at the start of a line + $ WARNING: space prohibited between function name and open parenthesis '(' + res = get_option ((char **)&str, ints + i); Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23lib/kstrtox.c: remove redundant cleanupFelipe Contreras1-1/+0
We can't reach the cleanup code unless the flag KSTRTOX_OVERFLOW is not set, so there's not no point in clearing a bit that we know is not set. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Acked-by: Levente Kurusa <levex@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: lp8788: remove unnecessary parenthesesJingoo Han1-3/+3
Remove unnecessary parentheses in order to fix the following checkpatch error. ERROR: return is not a function, parentheses are not required Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: lp855x: remove unnecessary parenthesesJingoo Han1-1/+1
Remove unnecessary parentheses in order to fix the following checkpatch error. ERROR: return is not a function, parentheses are not required Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: kb3886_bl: fix incorrect placement of __initdata markerJingoo Han1-1/+1
The __initdata marker can be virtually anywhere on the line, EXCEPT right after "struct". The preferred location is before the "=" sign if there is one, or before the trailing ";" otherwise. It also fixes the following chechpatch warning. WARNING: __initdata should be placed after kb3886bl_device_table[] Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: tosa: use devm_lcd_device_register()Jingoo Han1-4/+2
Use devm_lcd_device_register() to make cleanup paths simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: l4f00242t03: use devm_lcd_device_register()Jingoo Han1-4/+2
Use devm_lcd_device_register() to make cleanup paths simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: jornada720: use devm_lcd_device_register()Jingoo Han1-11/+2
Use devm_lcd_device_register() to make cleanup paths simpler, and remove unnecessary remove(). Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: tosa: use devm_backlight_device_register()Jingoo Han1-4/+3
Use devm_backlight_device_register() to make cleanup paths simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: ot200_bl: use devm_backlight_device_register()Jingoo Han1-6/+3
Use devm_backlight_device_register() to make cleanup paths simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: omap1: use devm_backlight_device_register()Jingoo Han1-12/+2
Use devm_backlight_device_register() to make cleanup paths simpler, and remove unnecessary remove(). Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: hp680_bl: use devm_backlight_device_register()Jingoo Han1-4/+2
Use devm_backlight_device_register() to make cleanup paths simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23backlight: jornada720: use devm_backlight_device_register()Jingoo Han1-12/+3
Use devm_backlight_device_register() to make cleanup paths simpler, and remove unnecessary remove(). Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23MAINTAINERS: remove unnecessary EXYNOS DP DRIVER F: patternJingoo Han1-1/+0
Remove unnecessary pattern for Exynos DP header from MAINTAINERS file. After commit f9b1e013f1c6 ("video: exynos_dp: remove non-DT support for Exynos Display Port"), 'exynos_dp.h' has not been used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23MAINTAINERS: describe differences between F: and N: patternsJoe Perches1-0/+5
There is a difference in how scripts/get_maintainer.pl treats F: and N: file pattern matches. Describe those differences in the MAINTAINERS file. Signed-off-by: Joe Perches <joe@perches.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Brown <broonie@linaro.org> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23MAINTAINERS: add an entry for the Macintosh HFSPlus FilesystemGeert Uytterhoeven1-0/+6
To make scripts/get_maintainer.pl output something sensible. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23get_maintainer: add commit author information to --rolestatsJoe Perches1-6/+85
get_maintainer currently uses "Signed-off-by" style lines to find interested parties to send patches to when the MAINTAINERS file does not have a specific section entry with a matching file pattern. Add statistics for commit authors and lines added and deleted to the information provided by --rolestats. These statistics are also emitted whenever --rolestats and --git are selected even when there is a specified maintainer. This can have the effect of expanding the number of people that are shown as possible "maintainers" of a particular file because "authors", "added_lines", and "removed_lines" are also used as criterion for the --max-maintainers option separate from the "commit_signers". The first "--git-max-maintainers" values of each criterion are emitted. Any "ties" are not shown. For example: (forcedeth does not have a named maintainer) Old output: $ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c "David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%) Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%) Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%) Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%) Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%) netdev@vger.kernel.org (open list:NETWORKING DRIVERS) linux-kernel@vger.kernel.org (open list) New output: $ ./scripts/get_maintainer.pl -f drivers/net/ethernet/nvidia/forcedeth.c "David S. Miller" <davem@davemloft.net> (commit_signer:8/10=80%) Jiri Pirko <jiri@resnulli.us> (commit_signer:2/10=20%,authored:2/10=20%,removed_lines:3/33=9%) Patrick McHardy <kaber@trash.net> (commit_signer:2/10=20%,authored:2/10=20%,added_lines:12/95=13%,removed_lines:10/33=30%) Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:1/10=10%,authored:1/10=10%,added_lines:35/95=37%) Peter Zijlstra <peterz@infradead.org> (commit_signer:1/10=10%) "Peter Hüwe" <PeterHuewe@gmx.de> (authored:1/10=10%,removed_lines:15/33=45%) Joe Perches <joe@perches.com> (authored:1/10=10%) Neil Horman <nhorman@tuxdriver.com> (added_lines:40/95=42%) Bill Pemberton <wfp5p@virginia.edu> (removed_lines:3/33=9%) netdev@vger.kernel.org (open list:NETWORKING DRIVERS) linux-kernel@vger.kernel.org (open list) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23printk: flush conflicting continuation lineArun KS1-3/+6
An earlier newline was missing and current print is from different task. In this scenario flush the continuation line and store this line seperatly. This patch fix the below scenario of timestamp interleaving, [ 28.154370 ] read_word_reg : reg[0x 3], reg[0x 4] data [0x 642] [ 28.155428 ] uart disconnect [ 31.947341 ] dvfs[cpufreq.c<275>]:plug-in cpu<1> done [ 28.155445 ] UART detached : send switch state 201 [ 32.014112 ] read_reg : reg[0x 3] data[0x21] [akpm@linux-foundation.org: simplify and condense the code] Signed-off-by: Arun KS <getarunks@gmail.com> Signed-off-by: Arun KS <arun.ks@broadcom.com> Cc: Joe Perches <joe@perches.com> Cc: Tejun Heo <tj@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23vsprintf: add %pad extension for dma_addr_t useJoe Perches2-8/+36
dma_addr_t's can be either u32 or u64 depending on a CONFIG option. There are a few hundred dma_addr_t's printed via either cast to unsigned long long, unsigned long or no cast at all. Add %pad to be able to emit them without the cast. Update Documentation/printk-formats.txt too. Signed-off-by: Joe Perches <joe@perches.com> Cc: "Shevchenko, Andriy" <andriy.shevchenko@intel.com> Cc: Rob Landley <rob@landley.net> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23printk/cache: mark printk_once test variable __read_mostlyJoe Perches3-11/+13
Add #include <linux/cache.h> to define __read_mostly. Convert cache.h to use uapi/linux/kernel.h instead of linux/kernel.h to avoid recursive #includes. Convert the ALIGN macro to __ALIGN_KERNEL. printk_once only sets the bool variable tested once so mark it __read_mostly. Neaten the alignment so it matches the rest of the pr_<level>_once #defines too. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23dynamic-debug-howto.txt: update since new wildcard supportDu, Changbin1-0/+9
Add the usage of using new feature wildcard support. Signed-off-by: Du, Changbin <changbin.du@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23dynamic_debug: add wildcard support to filter files/functions/modulesDu, Changbin1-5/+10
Add wildcard '*'(matches zero or more characters) and '?' (matches one character) support when qurying debug flags. Now we can open debug messages using keywords. eg: 1. open debug logs in all usb drivers echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control 2. open debug logs for usb xhci code echo "file *xhci* +p" > <debugfs>/dynamic_debug/control Signed-off-by: Du, Changbin <changbin.du@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23lib/parser.c: put EXPORT_SYMBOLs in the conventional placeAndrew Morton1-7/+6
Cc: Du, Changbin <changbin.du@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23lib/parser.c: add match_wildcard() functionDu, Changbin2-0/+52
match_wildcard function is a simple implementation of wildcard matching algorithm. It only supports two usual wildcardes: '*' - matches zero or more characters '?' - matches one character This algorithm is safe since it is non-recursive. Signed-off-by: Du, Changbin <changbin.du@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23uapi: convert u64 to __u64 in exported headersMike Frysinger2-2/+2
The u64 type is not defined in any exported kernel headers, so trying to use it will lead to build failures. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23include/uapi/linux/dn.h: pull in ioctl.h headerMike Frysinger1-0/+1
This header uses _IOW/_IOR defines but doesn't include ioctl.h for it. If you try to use this w/out including ioctl.h yourself, it can fail to build, so add the explicit include. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23include/uapi/linux/ppp-ioctl.h: pull in ppp_defs.hMike Frysinger1-0/+1
This header uses enum NPmode but doesn't include ppp_defs.h. If you try to use this header w/out including the defs header first, it leads to a build failure. So add the explicit include to fix it. Don't know of any packages directly impacted, but noticed while building some ppp code by hand. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23headers_check: special case seqbuf_dump()Paul Bolle1-1/+5
"make headers_check" warns about soundcard.h for (at least) five years now: [...]/usr/include/linux/soundcard.h:1054: userspace cannot reference function or variable defined in the kernel We're apparently stuck with providing OSSlib-3.8 compatibility, so let's special case this declaration just to silence it. Notes: 0) Support for OSSlib post 3.8 was already removed in commit 43a990765a ("sound: Remove OSSlib stuff from linux/soundcard.h"). Five years have passed since that commit: do people still care about OSSlib-3.8? If not, quite a bit of code could be remove from soundcard.h (and probably ultrasound.h). 2) By the way, what is actually meant by: It is no longer possible to actually link against OSSlib with this header, but we still provide these macros for programs using them. Doesn't that mean compatibility to OSSlib isn't even useful? 3) Anyhow, a previous discussion soundcard.h, which led to that commit, starts at https://lkml.org/lkml/2009/1/20/349 . 4) And, yes, I sneaked in a whitespace fix. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/vlynq/vlynq.c: fix another resource size off by 1 errorDan Carpenter1-1/+2
We fixed the call to request_mem_region() in commit 3354f73b24c6 ("drivers/vlynq/vlynq.c: fix resource size off by 1 error"). But we need to fix the call the release_mem_region() as well. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Florian Fainelli <florian@openwrt.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/video/aty/aty128fb.c: fix a warning pertaining to the aty128fb ↵David Howells1-0/+4
backlight variable Fix the following warning in the aty128fb driver: drivers/video/aty/aty128fb.c:363:12: warning: 'backlight' defined but not used [-Wunused-variable] static int backlight = 0; ^ as the variable's value is only read if CONFIG_FB_ATY128_BACKLIGHT=y. The variable is also set if MODULE is unset[*]. [*] I wonder if the conditional wrapper around aty128fb_setup() should be using CONFIG_MODULE rather than MODULE. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/mfd/tps65217.c: fix pointer-integer size mismatch warning in ↵David Howells1-1/+1
tps65217_probe() Fix up the following pointer-integer size mismatch warning in tps65217_probe(): drivers/mfd/tps65217.c: In function 'tps65217_probe': drivers/mfd/tps65217.c:173:13: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] chip_id = (unsigned int)match->data; ^ Signed-off-by: David Howells <dhowells@redhat.com> Cc: AnilKumar Ch <anilkumar@ti.com> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/mfd/max8998.c: fix pointer-integer size mismatch warning in ↵David Howells1-1/+1
max8998_i2c_get_driver_data() Fix up the following pointer-integer size mismatch warning in max8998_i2c_get_driver_data(): drivers/mfd/max8998.c: In function 'max8998_i2c_get_driver_data': drivers/mfd/max8998.c:178:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] return (int)match->data; ^ Signed-off-by: David Howells <dhowells@redhat.com> Cc: Tomasz Figa <t.figa@samsung.com> Cc: Mark Brown <broonie@linaro.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/gpu/drm/gma500/backlight.c: fix a defined-but-not-used warning for ↵David Howells1-2/+2
do_gma_backlight_set() Fix the following warning: drivers/gpu/drm/gma500/backlight.c:29:13: warning: 'do_gma_backlight_set' defined but not used [-Wunused-function] by moving the entire function inside the conditional section currently inside of it. All the places that call it are so conditionalised. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23include/linux/of.h: make for_each_child_of_node() reference its args when ↵David Howells1-1/+6
CONFIG_OF=n Make for_each_child_of_node() reference its args when CONFIG_OF=n to avoid warnings like: drivers/leds/leds-pwm.c:88:22: warning: unused variable 'node' [-Wunused-variable] struct device_node *node = pdev->dev.of_node; ^ Signed-off-by: David Howells <dhowells@redhat.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23remove extra definitions of U32_MAXAlex Elder3-26/+0
Now that the definition is centralized in <linux/kernel.h>, the definitions of U32_MAX (and related) elsewhere in the kernel can be removed. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Sage Weil <sage@inktank.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23kernel.h: define u8, s8, u32, etc. limitsAlex Elder1-0/+13
Create constants that define the maximum and minimum values representable by the kernel types u8, s8, u16, s16, and so on. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Sage Weil <sage@inktank.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23conditionally define U32_MAXAlex Elder3-0/+6
The symbol U32_MAX is defined in several spots. Change these definitions to be conditional. This is in preparation for the next patch, which centralizes the definition in <linux/kernel.h>. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Sage Weil <sage@inktank.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23um: use generic fixmap.hMark Salter1-39/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Richard Weinberger <richard@nod.at> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23tile: use generic fixmap.hMark Salter1-32/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23sh: use generic fixmap.hMark Salter1-37/+2
Signed-off-by: Mark Salter <msalter@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23powerpc: use generic fixmap.hMark Salter1-42/+2
Signed-off-by: Mark Salter <msalter@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mips: use generic fixmap.hMark Salter1-32/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23microblaze: use generic fixmap.hMark Salter1-42/+2
Signed-off-by: Mark Salter <msalter@redhat.com> Tested-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23metag: use generic fixmap.hMark Salter1-31/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23hexagon: use generic fixmap.hMark Salter1-39/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23x86: use generic fixmap.hMark Salter1-58/+1
Signed-off-by: Mark Salter <msalter@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23add generic fixmap.hMark Salter1-0/+97
Many architectures provide an asm/fixmap.h which defines support for compile-time 'special' virtual mappings which need to be made before paging_init() has run. This support is also used for early ioremap on x86. Much of this support is identical across the architectures. This patch consolidates all of the common bits into asm-generic/fixmap.h which is intended to be included from arch/*/include/asm/fixmap.h. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonas Bonn <jonas.bonn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23logfs: check for the return value after calling find_or_create_page()Younger Liu1-1/+2
In get_mapping_page(), after calling find_or_create_page(), the return value should be checked. This patch has been provided: http://www.spinics.net/lists/linux-fsdevel/msg66948.html but not been applied now. Signed-off-by: Younger Liu <liuyiyang@hisense.com> Cc: Younger Liu <younger.liucn@gmail.com> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Reviewed-by: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Jörn Engel <joern@logfs.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/block/Kconfig: update RAM block device module nameFabian Frederick1-1/+2
RAM block device support module name changed to brd.ko some years ago with an "rd" alias to match previous module implementation. This patch updates its Kconfig definition. Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23drivers/mailbox/omap: make mbox->irq signed for error handlingDan Carpenter1-1/+1
There is a bug in omap2_mbox_probe() where we try do: mbox->irq = platform_get_irq(pdev, info->irq_id); if (mbox->irq < 0) { The problem is that mbox->irq is unsigned so the error handling doesn't work. I've changed it to a signed integer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Suman Anna <s-anna@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Omar Ramirez Luna <omar.ramirez@copitl.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23asm/types.h: Remove include/asm-generic/int-l64.hGeert Uytterhoeven2-51/+1
Now all 64-bit architectures have been converted to int-ll64.h, we can remove int-l64.h in kernelspace. For backwards compatibility, alpha, ia64, mips64, and powerpc64 still use int-l64.h in userspace. This is the (reworked for UAPI) non-documentation part of more than two year old "asm/types.h: All architectures use int-ll64.h in kernelspace" (https://lkml.org/lkml/2011/8/13/104) Since <asm/types.h> (from include/uapi/asm-generic/types.h) is used for both kernel and user space, include/asm-generic/int-ll64.h cannot just become include/asm-generic/types.h, as Arnd suggested. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: ignore VM_SOFTDIRTY on VMA mergingCyrill Gorcunov1-2/+10
The VM_SOFTDIRTY bit affects vma merge routine: if two VMAs has all bits in vm_flags matched except dirty bit the kernel can't longer merge them and this forces the kernel to generate new VMAs instead. It finally may lead to the situation when userspace application reaches vm.max_map_count limit and get crashed in worse case | (gimp:11768): GLib-ERROR **: gmem.c:110: failed to allocate 4096 bytes | | (file-tiff-load:12038): LibGimpBase-WARNING **: file-tiff-load: gimp_wire_read(): error | xinit: connection to X server lost | | waiting for X server to shut down | /usr/lib64/gimp/2.0/plug-ins/file-tiff-load terminated: Hangup | /usr/lib64/gimp/2.0/plug-ins/script-fu terminated: Hangup | /usr/lib64/gimp/2.0/plug-ins/script-fu terminated: Hangup https://bugzilla.kernel.org/show_bug.cgi?id=67651 https://bugzilla.gnome.org/show_bug.cgi?id=719619#c0 Initial problem came from missed VM_SOFTDIRTY in do_brk() routine but even if we would set up VM_SOFTDIRTY here, there is still a way to prevent VMAs from merging: one can call | echo 4 > /proc/$PID/clear_refs and clear all VM_SOFTDIRTY over all VMAs presented in memory map, then new do_brk() will try to extend old VMA and finds that dirty bit doesn't match thus new VMA will be generated. As discussed with Pavel, the right approach should be to ignore VM_SOFTDIRTY bit when we're trying to merge VMAs and if merge successed we mark extended VMA with dirty bit where needed. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reported-by: Bastian Hougaard <gnome@rvzt.net> Reported-by: Mel Gorman <mgorman@suse.de> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/rmap: fix coccinelle warningsFengguang Wu1-2/+2
mm/rmap.c:851:9-10: WARNING: return of 0/1 in function 'invalid_mkclean_vma' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: coccinelle/misc/boolreturn.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/swapfile.c: do not skip lowest_bit in scan_swap_map() scan loopJamie Liu1-1/+2
In the second half of scan_swap_map()'s scan loop, offset is set to si->lowest_bit and then incremented before entering the loop for the first time, causing si->swap_map[si->lowest_bit] to be skipped. Signed-off-by: Jamie Liu <jamieliu@google.com> Cc: Shaohua Li <shli@fusionio.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: remove unused code from kmem_cache_destroy_work_funcVladimir Davydov1-4/+2
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: improve documentation of page_orderMel Gorman2-4/+9
Developers occasionally try and optimise PFN scanners by using page_order but miss that in general it requires zone->lock. This has happened twice for compaction.c and rejected both times. This patch clarifies the documentation of page_order and adds a note to compaction.c why page_order is not used. [akpm@linux-foundation.org: tweaks] [lauraa@codeaurora.org: Corrected a page_zone(page)->lock reference] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Rafael Aquini <aquini@redhat.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: fix css reference leak and endless loop in mem_cgroup_iterMichal Hocko1-5/+13
Commit 19f39402864e ("memcg: simplify mem_cgroup_iter") has reorganized mem_cgroup_iter code in order to simplify it. A part of that change was dropping an optimization which didn't call css_tryget on the root of the walked tree. The patch however didn't change the css_put part in mem_cgroup_iter which excludes root. This wasn't an issue at the time because __mem_cgroup_iter_next bailed out for root early without taking a reference as cgroup iterators (css_next_descendant_pre) didn't visit root themselves. Nevertheless cgroup iterators have been reworked to visit root by commit bd8815a6d802 ("cgroup: make css_for_each_descendant() and friends include the origin css in the iteration") when the root bypass have been dropped in __mem_cgroup_iter_next. This means that css_put is not called for root and so css along with mem_cgroup and other cgroup internal object tied by css lifetime are never freed. Fix the issue by reintroducing root check in __mem_cgroup_iter_next and do not take css reference for it. This reference counting magic protects us also from another issue, an endless loop reported by Hugh Dickins when reclaim races with root removal and css_tryget called by iterator internally would fail. There would be no other nodes to visit so __mem_cgroup_iter_next would return NULL and mem_cgroup_iter would interpret it as "start looping from root again" and so mem_cgroup_iter would loop forever internally. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Hugh Dickins <hughd@google.com> Tested-by: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: fix endless loop caused by mem_cgroup_iterMichal Hocko1-3/+14
Hugh has reported an endless loop when the hardlimit reclaim sees the same group all the time. This might happen when the reclaim races with the memcg removal. shrink_zone [rmdir root] mem_cgroup_iter(root, NULL, reclaim) // prev = NULL rcu_read_lock() mem_cgroup_iter_load last_visited = iter->last_visited // gets root || NULL css_tryget(last_visited) // failed last_visited = NULL [1] memcg = root = __mem_cgroup_iter_next(root, NULL) mem_cgroup_iter_update iter->last_visited = root; reclaim->generation = iter->generation mem_cgroup_iter(root, root, reclaim) // prev = root rcu_read_lock mem_cgroup_iter_load last_visited = iter->last_visited // gets root css_tryget(last_visited) // failed [1] The issue seemed to be introduced by commit 5f5781619718 ("memcg: relax memcg iter caching") which has replaced unconditional css_get/css_put by css_tryget/css_put for the cached iterator. This patch fixes the issue by skipping css_tryget on the root of the tree walk in mem_cgroup_iter_load and symmetrically doesn't release it in mem_cgroup_iter_update. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Hugh Dickins <hughd@google.com> Tested-by: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm, oom: prefer thread group leaders for display purposesDavid Rientjes2-11/+20
When two threads have the same badness score, it's preferable to kill the thread group leader so that the actual process name is printed to the kernel log rather than the thread group name which may be shared amongst several processes. This was the behavior when select_bad_process() used to do for_each_process(), but it now iterates threads instead and leads to ambiguity. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23doc/kmemcheck: add kmemcheck to kernel-parametersXishi Qiu1-0/+7
Add "kmemcheck=xx" to Documentation/kernel-parameters.txt. Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@kernel.org> Cc: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/memcg: iteration skip memcgs not yet fully initializedHugh Dickins1-4/+2
It is surprising that the mem_cgroup iterator can return memcgs which have not yet been fully initialized. By accident (or trial and error?) this appears not to present an actual problem; but it may be better to prevent such surprises, by skipping memcgs not yet online. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/memcg: fix last_dead_count memory wastageHugh Dickins1-1/+1
Shorten mem_cgroup_reclaim_iter.last_dead_count from unsigned long to int: it's assigned from an int and compared with an int, and adjacent to an unsigned int: so there's no point to it being unsigned long, which wasted 104 bytes in every mem_cgroup_per_zone. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: audit/fix non-modular users of module_init in core codePaul Gortmaker4-7/+6
Code that is obj-y (always built-in) or dependent on a bool Kconfig (built-in or absent) can never be modular. So using module_init as an alias for __initcall can be somewhat misleading. Fix these up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. The audit targets the following module_init users for change: mm/ksm.c bool KSM mm/mmap.c bool MMU mm/huge_memory.c bool TRANSPARENT_HUGEPAGE mm/mmu_notifier.c bool MMU_NOTIFIER Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of subsys_initcall (which makes sense for these files) will thus change this registration from level 6-device to level 4-subsys (i.e. slightly earlier). However no observable impact of that difference has been observed during testing. One might think that core_initcall (l2) or postcore_initcall (l3) would be more appropriate for anything in mm/ but if we look at some actual init functions themselves, we see things like: mm/huge_memory.c --> hugepage_init --> hugepage_init_sysfs mm/mmap.c --> init_user_reserve --> sysctl_user_reserve_kbytes mm/ksm.c --> ksm_init --> sysfs_create_group and hence the choice of subsys_initcall (l4) seems reasonable, and at the same time minimizes the risk of changing the priority too drastically all at once. We can adjust further in the future. Also, several instances of missing ";" at EOL are fixed. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/mm_init.c: make creation of the mm_kobj happen earlier than device_initcallPaul Gortmaker1-2/+1
The use of __initcall is to be eventually replaced by choosing one from the prioritized groupings laid out in init.h header: pure_initcall 0 core_initcall 1 postcore_initcall 2 arch_initcall 3 subsys_initcall 4 fs_initcall 5 device_initcall 6 late_initcall 7 In the interim, all __initcall are mapped onto device_initcall, which as can be seen above, comes quite late in the ordering. Currently the mm_kobj is created with __initcall in mm_sysfs_init(). This means that any other initcalls that want to reference the mm_kobj have to be device_initcall (or later), otherwise we will for example, trip the BUG_ON(!kobj) in sysfs's internal_create_group(). This unfairly restricts those users; for example something that clearly makes sense to be an arch_initcall will not be able to choose that. However, upon examination, it is only this way for historical reasons (i.e. simply not reprioritized yet). We see that sysfs is ready quite earlier in init/main.c via: vfs_caches_init |_ mnt_init |_ sysfs_init well ahead of the processing of the prioritized calls listed above. So we can recategorize mm_sysfs_init to be a pure_initcall, which in turn allows any mm_kobj initcall users a wider range (1 --> 7) of initcall priorities to choose from. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: show message when updating min_free_kbytes in thpHan Pingtian3-2/+9
min_free_kbytes may be raised during THP's initialization. Sometimes, this will change the value which was set by the user. Showing this message will clarify this confusion. Only show this message when changing a value which was set by the user according to Michal Hocko's suggestion. Show the old value of min_free_kbytes according to Dave Hansen's suggestion. This will give user the chance to restore old value of min_free_kbytes. Signed-off-by: Han Pingtian <hanpt@linux.vnet.ibm.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/memory_hotplug.c: move register_memory_resource out of the ↵Nathan Zimmer1-3/+4
lock_memory_hotplug We don't need to do register_memory_resource() under lock_memory_hotplug() since it has its own lock and doesn't make any callbacks. Also register_memory_resource return NULL on failure so we don't have anything to cleanup at this point. The reason for this rfc is I was doing some experiments with hotplugging of memory on some of our larger systems. While it seems to work, it can be quite slow. With some preliminary digging I found that lock_memory_hotplug is clearly ripe for breakup. It could be broken up per nid or something but it also covers the online_page_callback. The online_page_callback shouldn't be very hard to break out. Also there is the issue of various structures(wmarks come to mind) that are only updated under the lock_memory_hotplug that would need to be dealt with. Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Hedi <hedi@sgi.com> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/nobootmem: free_all_bootmem againPhilipp Hachtmann2-26/+16
get_allocated_memblock_reserved_regions_info() should work if it is compiled in. Extended the ifdef around get_allocated_memblock_memory_regions_info() to include get_allocated_memblock_reserved_regions_info() as well. Similar changes in nobootmem.c/free_low_memory_core_early() where the two functions are called. [akpm@linux-foundation.org: cleanup] Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Cc: qiuxishi <qiuxishi@huawei.com> Cc: David Howells <dhowells@redhat.com> Cc: Daeseok Youn <daeseok.youn@gmail.com> Cc: Jiang Liu <liuj97@gmail.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: vmscan: call NUMA-unaware shrinkers irrespective of nodemaskVladimir Davydov1-9/+10
If a shrinker is not NUMA-aware, shrink_slab() should call it exactly once with nid=0, but currently it is not true: if node 0 is not set in the nodemask or if it is not online, we will not call such shrinkers at all. As a result some slabs will be left untouched under some circumstances. Let us fix it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reported-by: Dave Chinner <dchinner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Glauber Costa <glommer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: vmscan: shrink all slab objects if tight on memoryVladimir Davydov1-4/+21
When reclaiming kmem, we currently don't scan slabs that have less than batch_size objects (see shrink_slab_node()): while (total_scan >= batch_size) { shrinkctl->nr_to_scan = batch_size; shrinker->scan_objects(shrinker, shrinkctl); total_scan -= batch_size; } If there are only a few shrinkers available, such a behavior won't cause any problems, because the batch_size is usually small, but if we have a lot of slab shrinkers, which is perfectly possible since FS shrinkers are now per-superblock, we can end up with hundreds of megabytes of practically unreclaimable kmem objects. For instance, mounting a thousand of ext2 FS images with a hundred of files in each and iterating over all the files using du(1) will result in about 200 Mb of FS caches that cannot be dropped even with the aid of the vm.drop_caches sysctl! This problem was initially pointed out by Glauber Costa [*]. Glauber proposed to fix it by making the shrink_slab() always take at least one pass, to put it simply, turning the scan loop above to a do{}while() loop. However, this proposal was rejected, because it could result in more aggressive and frequent slab shrinking even under low memory pressure when total_scan is naturally very small. This patch is a slightly modified version of Glauber's approach. Similarly to Glauber's patch, it makes shrink_slab() scan less than batch_size objects, but only if the total number of objects we want to scan (total_scan) is greater than the total number of objects available (max_pass). Since total_scan is biased as half max_pass if the current delta change is small: if (delta < max_pass / 4) total_scan = min(total_scan, max_pass / 2); this is only possible if we are scanning at high prio. That said, this patch shouldn't change the vmscan behaviour if the memory pressure is low, but if we are tight on memory, we will do our best by trying to reclaim all available objects, which sounds reasonable. [*] http://www.spinics.net/lists/cgroups/msg06913.html Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Glauber Costa <glommer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23sched/numa: fix setting of cpupid on page migration twiceWanpeng Li1-2/+0
Commit 7851a45cd3f6 ("mm: numa: Copy cpupid on page migration") copiess over the cpupid at page migration time. It is unnecessary to set it again in migrate_misplaced_transhuge_page(). Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: do_mincore() cleanupJianguo Wu1-7/+0
Two cleanups: 1. remove redundant codes for hugetlb pages. 2. end = pmd_addr_end(addr, end) restricts [addr, end) within PMD_SIZE, this may increase do_mincore() calls, remove it. Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: qiuxishi <qiuxishi@huawei.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23include/linux/genalloc.h: spinlock_t needs spinlock_types.hShawn Guo1-0/+2
Compiling a C file which includes genalloc.h but without spinlock_types.h being included before, we will see the compile error below. include/linux/genalloc.h:54:2: error: unknown type name `spinlock_t' Include spinlock_types.h from genalloc.h to fix the problem. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23Documentation/trace/postprocess/trace-vmscan-postprocess.pl: fix the ↵Vinayak Menon1-9/+9
traceevent regex When irq, preempt and lockdep fields are printed (field 3 in the example below) in the trace output, the script fails. An example entry: kswapd0-610 [000] ...1 158.112152: mm_vmscan_kswapd_wake: nid=0 order=0 Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: prevent setting of a value less than 0 to min_free_kbytesHan Pingtian1-1/+6
If echo -1 > /proc/vm/sys/min_free_kbytes, the system will hang. Changing proc_dointvec() to proc_dointvec_minmax() in the min_free_kbytes_sysctl_handler() can prevent this to happen. mhocko said: : You can still do echo $BIG_VALUE > /proc/vm/sys/min_free_kbytes and make : your machine unusable but I agree that proc_dointvec_minmax is more : suitable here as we already have: : : .proc_handler = min_free_kbytes_sysctl_handler, : .extra1 = &zero, : : It used to work properly but then 6fce56ec91b5 ("sysctl: Remove references : to ctl_name and strategy from the generic sysctl table") has removed : sysctl_intvec strategy and so extra1 is ignored. Signed-off-by: Han Pingtian <hanpt@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: new_vma_page() cannot see NULL vma for hugetlb pagesMichal Hocko1-4/+2
Commit 11c731e81bb0 ("mm/mempolicy: fix !vma in new_vma_page()") has removed BUG_ON(!vma) from new_vma_page which is partially correct because page_address_in_vma will return EFAULT for non-linear mappings and at least shared shmem might be mapped this way. The patch also tried to prevent NULL ptr for hugetlb pages which is not correct AFAICS because hugetlb pages cannot be mapped as VM_NONLINEAR and other conditions in page_address_in_vma seem to be legit and catch real bugs. This patch restores BUG_ON for PageHuge to catch potential issues when the to-be-migrated page is not setup properly. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Bob Liu <bob.liu@oracle.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/memory-failure.c: shift page lock from head page to tail page after thp splitNaoya Horiguchi1-10/+11
After thp split in hwpoison_user_mappings(), we hold page lock on the raw error page only between try_to_unmap, hence we are in danger of race condition. I found in the RHEL7 MCE-relay testing that we have "bad page" error when a memory error happens on a thp tail page used by qemu-kvm: Triggering MCE exception on CPU 10 mce: [Hardware Error]: Machine check events logged MCE exception done on CPU 10 MCE 0x38c535: Killing qemu-kvm:8418 due to hardware memory corruption MCE 0x38c535: dirty LRU page recovery: Recovered qemu-kvm[8418]: segfault at 20 ip 00007ffb0f0f229a sp 00007fffd6bc5240 error 4 in qemu-kvm[7ffb0ef14000+420000] BUG: Bad page state in process qemu-kvm pfn:38c400 page:ffffea000e310000 count:0 mapcount:0 mapping: (null) index:0x7ffae3c00 page flags: 0x2fffff0008001d(locked|referenced|uptodate|dirty|swapbacked) Modules linked in: hwpoison_inject mce_inject vhost_net macvtap macvlan ... CPU: 0 PID: 8418 Comm: qemu-kvm Tainted: G M -------------- 3.10.0-54.0.1.el7.mce_test_fixed.x86_64 #1 Hardware name: NEC NEC Express5800/R120b-1 [N8100-1719F]/MS-91E7-001, BIOS 4.6.3C19 02/10/2011 Call Trace: dump_stack+0x19/0x1b bad_page.part.59+0xcf/0xe8 free_pages_prepare+0x148/0x160 free_hot_cold_page+0x31/0x140 free_hot_cold_page_list+0x46/0xa0 release_pages+0x1c1/0x200 free_pages_and_swap_cache+0xad/0xd0 tlb_flush_mmu.part.46+0x4c/0x90 tlb_finish_mmu+0x55/0x60 exit_mmap+0xcb/0x170 mmput+0x67/0xf0 vhost_dev_cleanup+0x231/0x260 [vhost_net] vhost_net_release+0x3f/0x90 [vhost_net] __fput+0xe9/0x270 ____fput+0xe/0x10 task_work_run+0xc4/0xe0 do_exit+0x2bb/0xa40 do_group_exit+0x3f/0xa0 get_signal_to_deliver+0x1d0/0x6e0 do_signal+0x48/0x5e0 do_notify_resume+0x71/0xc0 retint_signal+0x48/0x8c The reason of this bug is that a page fault happens before unlocking the head page at the end of memory_failure(). This strange page fault is trying to access to address 0x20 and I'm not sure why qemu-kvm does this, but anyway as a result the SIGSEGV makes qemu-kvm exit and on the way we catch the bad page bug/warning because we try to free a locked page (which was the former head page.) To fix this, this patch suggests to shift page lock from head page to tail page just after thp split. SIGSEGV still happens, but it affects only error affected VMs, not a whole system. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [3.9+] # a3e0f9e47d5ef "mm/memory-failure.c: transfer page count from head page to tail page after split thp" Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23numa: add a sysctl for numa_balancingAndi Kleen4-2/+37
Add a working sysctl to enable/disable automatic numa memory balancing at runtime. This allows us to track down performance problems with this feature and is generally a good idea. This was possible earlier through debugfs, but only with special debugging options set. Also fix the boot message. [akpm@linux-foundation.org: s/sched_numa_balancing/sysctl_numa_balancing/] Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: free memblock.memory in free_all_bootmemPhilipp Hachtmann3-1/+26
When calling free_all_bootmem() the free areas under memblock's control are released to the buddy allocator. Additionally the reserved list is freed if it was reallocated by memblock. The same should apply for the memory list. Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm/nobootmem.c: add return value check in __alloc_memory_core_early()Philipp Hachtmann1-1/+3
When memblock_reserve() fails because memblock.reserved.regions cannot be resized, the caller (e.g. alloc_bootmem()) is not informed of the failed allocation. Therefore alloc_bootmem() silently returns the same pointer again and again. This patch adds a check for the return value of memblock_reserve() in __alloc_memory_core(). Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: rework memcg_update_kmem_limit synchronizationVladimir Davydov1-91/+105
Currently we take both the memcg_create_mutex and the set_limit_mutex when we enable kmem accounting for a memory cgroup, which makes kmem activation events serialize with both memcg creations and other memcg limit updates (memory.limit, memory.memsw.limit). However, there is no point in such strict synchronization rules there. First, the set_limit_mutex was introduced to keep the memory.limit and memory.memsw.limit values in sync. Since memory.kmem.limit can be set independently of them, it is better to introduce a separate mutex to synchronize against concurrent kmem limit updates. Second, we take the memcg_create_mutex in order to make sure all children of this memcg will be kmem-active as well. For achieving that, it is enough to hold this mutex only while checking if memcg_has_children() though. This guarantees that if a child is added after we checked that the memcg has no children, the newly added cgroup will see its parent kmem-active (of course if the latter succeeded), and call kmem activation for itself. This patch simplifies the locking rules of memcg_update_kmem_limit() according to these considerations. [vdavydov@parallels.com: fix unintialized var warning] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: remove KMEM_ACCOUNTED_ACTIVATED flagVladimir Davydov1-27/+2
Currently we have two state bits in mem_cgroup::kmem_account_flags regarding kmem accounting activation, ACTIVATED and ACTIVE. We start kmem accounting only if both flags are set (memcg_can_account_kmem()), plus throughout the code there are several places where we check only the ACTIVE flag, but we never check the ACTIVATED flag alone. These flags are both set from memcg_update_kmem_limit() under the set_limit_mutex, the ACTIVE flag always being set after ACTIVATED, and they never get cleared. That said checking if both flags are set is equivalent to checking only for the ACTIVE flag, and since there is no ACTIVATED flag checks, we can safely remove the ACTIVATED flag, and nothing will change. Let's try to understand what was the reason for introducing these flags. The purpose of the ACTIVE flag is clear - it states that kmem should be accounting to the cgroup. The only requirement for it is that it should be set after we have fully initialized kmem accounting bits for the cgroup and patched all static branches relating to kmem accounting. Since we always check if static branch is enabled before actually considering if we should account (otherwise we wouldn't benefit from static branching), this guarantees us that we won't skip a commit or uncharge after a charge due to an unpatched static branch. Now let's move on to the ACTIVATED bit. As I proved in the beginning of this message, it is absolutely useless, and removing it will change nothing. So what was the reason introducing it? The ACTIVATED flag was introduced by commit a8964b9b84f9 ("memcg: use static branches when code not in use") in order to guarantee that static_key_slow_inc(&memcg_kmem_enabled_key) would be called only once for each memory cgroup when its kmem accounting was activated. The point was that at that time the memcg_update_kmem_limit() function's work-flow looked like this: bool must_inc_static_branch = false; cgroup_lock(); mutex_lock(&set_limit_mutex); if (!memcg->kmem_account_flags && val != RESOURCE_MAX) { /* The kmem limit is set for the first time */ ret = res_counter_set_limit(&memcg->kmem, val); memcg_kmem_set_activated(memcg); must_inc_static_branch = true; } else ret = res_counter_set_limit(&memcg->kmem, val); mutex_unlock(&set_limit_mutex); cgroup_unlock(); if (must_inc_static_branch) { /* We can't do this under cgroup_lock */ static_key_slow_inc(&memcg_kmem_enabled_key); memcg_kmem_set_active(memcg); } So that without the ACTIVATED flag we could race with other threads trying to set the limit and increment the static branching ref-counter more than once. Today we call the whole memcg_update_kmem_limit() function under the set_limit_mutex and this race is impossible. As now we understand why the ACTIVATED bit was introduced and why we don't need it now, and know that removing it will change nothing anyway, let's get rid of it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg, slab: RCU protect memcg_params for root cachesVladimir Davydov3-10/+30
We relocate root cache's memcg_params whenever we need to grow the memcg_caches array to accommodate all kmem-active memory cgroups. Currently on relocation we free the old version immediately, which can lead to use-after-free, because the memcg_caches array is accessed lock-free (see cache_from_memcg_idx()). This patch fixes this by making memcg_params RCU-protected for root caches. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23slab: do not panic if we fail to create memcg cacheVladimir Davydov1-1/+8
There is no point in flooding logs with warnings or especially crashing the system if we fail to create a cache for a memcg. In this case we will be accounting the memcg allocation to the root cgroup until we succeed to create its own cache, but it isn't that critical. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: get rid of kmem_cache_dup()Vladimir Davydov1-31/+8
kmem_cache_dup() is only called from memcg_create_kmem_cache(). The latter, in fact, does nothing besides this, so let's fold kmem_cache_dup() into memcg_create_kmem_cache(). This patch also makes the memcg_cache_mutex private to memcg_create_kmem_cache(), because it is not used anywhere else. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg, slab: fix races in per-memcg cache creation/destructionVladimir Davydov2-10/+27
We obtain a per-memcg cache from a root kmem_cache by dereferencing an entry of the root cache's memcg_params::memcg_caches array. If we find no cache for a memcg there on allocation, we initiate the memcg cache creation (see memcg_kmem_get_cache()). The cache creation proceeds asynchronously in memcg_create_kmem_cache() in order to avoid lock clashes, so there can be several threads trying to create the same kmem_cache concurrently, but only one of them may succeed. However, due to a race in the code, it is not always true. The point is that the memcg_caches array can be relocated when we activate kmem accounting for a memcg (see memcg_update_all_caches(), memcg_update_cache_size()). If memcg_update_cache_size() and memcg_create_kmem_cache() proceed concurrently as described below, we can leak a kmem_cache. Asume two threads schedule creation of the same kmem_cache. One of them successfully creates it. Another one should fail then, but if memcg_create_kmem_cache() interleaves with memcg_update_cache_size() as follows, it won't: memcg_create_kmem_cache() memcg_update_cache_size() (called w/o mutexes held) (called with slab_mutex, set_limit_mutex held) ------------------------- ------------------------- mutex_lock(&memcg_cache_mutex) s->memcg_params=kzalloc(...) new_cachep=cache_from_memcg_idx(cachep,idx) // new_cachep==NULL => proceed to creation s->memcg_params->memcg_caches[i] =cur_params->memcg_caches[i] // kmem_cache_create_memcg takes slab_mutex // so we will hang around until // memcg_update_cache_size finishes, but // nothing will prevent it from succeeding so // memcg_caches[idx] will be overwritten in // memcg_register_cache! new_cachep = kmem_cache_create_memcg(...) mutex_unlock(&memcg_cache_mutex) Let's fix this by moving the check for existence of the memcg cache to kmem_cache_create_memcg() to be called under the slab_mutex and make it return NULL if so. A similar race is possible when destroying a memcg cache (see kmem_cache_destroy()). Since memcg_unregister_cache(), which clears the pointer in the memcg_caches array, is called w/o protection, we can race with memcg_update_cache_size() and omit clearing the pointer. Therefore memcg_unregister_cache() should be moved before we release the slab_mutex. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg: fix possible NULL deref while traversing memcg_slab_caches listVladimir Davydov1-7/+18
All caches of the same memory cgroup are linked in the memcg_slab_caches list via kmem_cache::memcg_params::list. This list is traversed, for example, when we read memory.kmem.slabinfo. Since the list actually consists of memcg_cache_params objects, we have to convert an element of the list to a kmem_cache object using memcg_params_to_cache(), which obtains the pointer to the cache from the memcg_params::memcg_caches array of the corresponding root cache. That said the pointer to a kmem_cache in its parent's memcg_params must be initialized before adding the cache to the list, and cleared only after it has been unlinked. Currently it is vice-versa, which can result in a NULL ptr dereference while traversing the memcg_slab_caches list. This patch restores the correct order. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg, slab: fix barrier usage when accessing memcg_cachesVladimir Davydov2-15/+21
Each root kmem_cache has pointers to per-memcg caches stored in its memcg_params::memcg_caches array. Whenever we want to allocate a slab for a memcg, we access this array to get per-memcg cache to allocate from (see memcg_kmem_get_cache()). The access must be lock-free for performance reasons, so we should use barriers to assert the kmem_cache is up-to-date. First, we should place a write barrier immediately before setting the pointer to it in the memcg_caches array in order to make sure nobody will see a partially initialized object. Second, we should issue a read barrier before dereferencing the pointer to conform to the write barrier. However, currently the barrier usage looks rather strange. We have a write barrier *after* setting the pointer and a read barrier *before* reading the pointer, which is incorrect. This patch fixes this. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg, slab: clean up memcg cache initialization/destructionVladimir Davydov3-41/+37
Currently, we have rather a messy function set relating to per-memcg kmem cache initialization/destruction. Per-memcg caches are created in memcg_create_kmem_cache(). This function calls kmem_cache_create_memcg() to allocate and initialize a kmem cache and then "registers" the new cache in the memcg_params::memcg_caches array of the parent cache. During its work-flow, kmem_cache_create_memcg() executes the following memcg-related functions: - memcg_alloc_cache_params(), to initialize memcg_params of the newly created cache; - memcg_cache_list_add(), to add the new cache to the memcg_slab_caches list. On the other hand, kmem_cache_destroy() called on a cache destruction only calls memcg_release_cache(), which does all the work: it cleans the reference to the cache in its parent's memcg_params::memcg_caches, removes the cache from the memcg_slab_caches list, and frees memcg_params. Such an inconsistency between destruction and initialization paths make the code difficult to read, so let's clean this up a bit. This patch moves all the code relating to registration of per-memcg caches (adding to memcg list, setting the pointer to a cache from its parent) to the newly created memcg_register_cache() and memcg_unregister_cache() functions making the initialization and destruction paths look symmetrical. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23memcg, slab: kmem_cache_create_memcg(): fix memleak on fail pathVladimir Davydov3-9/+19
We do not free the cache's memcg_params if __kmem_cache_create fails. Fix this. Plus, rename memcg_register_cache() to memcg_alloc_cache_params(), because it actually does not register the cache anywhere, but simply initialize kmem_cache::memcg_params. [akpm@linux-foundation.org: fix build] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23slab: clean up kmem_cache_create_memcg() error handlingVladimir Davydov1-34/+31
Currently kmem_cache_create_memcg() backoffs on failure inside conditionals, without using gotos. This results in the rollback code duplication, which makes the function look cumbersome even though on error we should only free the allocated cache. Since in the next patch I am going to add yet another rollback function call on error path there, let's employ labels instead of conditionals for undoing any changes on failure to keep things clean. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGESasha Levin30-170/+181
Most of the VM_BUG_ON assertions are performed on a page. Usually, when one of these assertions fails we'll get a BUG_ON with a call stack and the registers. I've recently noticed based on the requests to add a small piece of code that dumps the page to various VM_BUG_ON sites that the page dump is quite useful to people debugging issues in mm. This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what VM_BUG_ON() does, also dumps the page before executing the actual BUG_ON. [akpm@linux-foundation.org: fix up includes] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>