aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/compat.c
AgeCommit message (Collapse)AuthorFilesLines
2023-03-14sched_getaffinity: don't assume 'cpumask_size()' is fully initializedLinus Torvalds1-1/+1
The getaffinity() system call uses 'cpumask_size()' to decide how big the CPU mask is - so far so good. It is indeed the allocation size of a cpumask. But the code also assumes that the whole allocation is initialized without actually doing so itself. That's wrong, because we might have fixed-size allocations (making copying and clearing more efficient), but not all of it is then necessarily used if 'nr_cpu_ids' is smaller. Having checked other users of 'cpumask_size()', they all seem to be ok, either using it purely for the allocation size, or explicitly zeroing the cpumask before using the size in bytes to copy it. See for example the ublk_ctrl_get_queue_affinity() function that uses the proper 'zalloc_cpumask_var()' to make sure that the whole mask is cleared, whether the storage is on the stack or if it was an external allocation. Fix this by just zeroing the allocation before using it. Do the same for the compat version of sched_getaffinity(), which had the same logic. Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to access the bits. For a cpumask_var_t, it ends up being a pointer to the same data either way, but it's just a good idea to treat it like you would a 'cpumask_t'. The compat case already did that. Reported-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/ Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08arch: remove compat_alloc_user_spaceArnd Bergmann1-21/+0
All users of compat_alloc_user_space() and copy_in_user() have been removed from the kernel, only a few functions in sparc remain that can be changed to calling arch_copy_in_user() instead. Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva1-3/+3
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-05-01uaccess: Selectively open read or write user accessChristophe Leroy1-6/+6
When opening user access to only perform reads, only open read access. When opening user access to only perform writes, only open write access. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2e73bc57125c2c6ab12a587586a4eed3a47105fc.1585898438.git.christophe.leroy@c-s.fr
2020-02-21y2038: remove unused time32 interfacesArnd Bergmann1-64/+0
No users remain, so kill these off before we grow new ones. Link: http://lkml.kernel.org/r/20200110154232.4104492-3-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-15y2038: itimer: compat handling to itimer.cArnd Bergmann1-24/+0
The structure is only used in one place, moving it there simplifies the interface and helps with later changes to this code. Rename it to match the other time32 structures in the process. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-4/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-15kernel/compat.c: mark expected switch fall-throughsStephen Rothwell1-0/+3
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch aims to suppress 3 missing-break-in-switch false positives on some architectures. Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Gustavo A. R. Silva <gustavo@embeddedor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-07time: make adjtime compat handling available for 32 bitArnd Bergmann1-64/+0
We want to reuse the compat_timex handling on 32-bit architectures the same way we are using the compat handling for timespec when moving to 64-bit time_t. Move all definitions related to compat_timex out of the compat code into the normal timekeeping code, along with a rename to old_timex32, corresponding to the timespec/timeval structures, and make it controlled by CONFIG_COMPAT_32BIT_TIME, which 32-bit architectures will then select. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-01-04make 'user_access_begin()' do 'access_ok()'Linus Torvalds1-4/+2
Originally, the rule used to be that you'd have to do access_ok() separately, and then user_access_begin() before actually doing the direct (optimized) user access. But experience has shown that people then decide not to do access_ok() at all, and instead rely on it being implied by other operations or similar. Which makes it very hard to verify that the access has actually been range-checked. If you use the unsafe direct user accesses, hardware features (either SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged Access Never - on ARM) do force you to use user_access_begin(). But nothing really forces the range check. By putting the range check into user_access_begin(), we actually force people to do the right thing (tm), and the range check vill be visible near the actual accesses. We have way too long a history of people trying to avoid them. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03Remove 'type' argument from access_ok() functionLinus Torvalds1-8/+8
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-27y2038: globally rename compat_time to old_time32Arnd Bergmann1-4/+4
Christoph Hellwig suggested a slightly different path for handling backwards compatibility with the 32-bit time_t based system calls: Rather than simply reusing the compat_sys_* entry points on 32-bit architectures unchanged, we get rid of those entry points and the compat_time types by renaming them to something that makes more sense on 32-bit architectures (which don't have a compat mode otherwise), and then share the entry points under the new name with the 64-bit architectures that use them for implementing the compatibility. The following types and interfaces are renamed here, and moved from linux/compat_time.h to linux/time32.h: old new --- --- compat_time_t old_time32_t struct compat_timeval struct old_timeval32 struct compat_timespec struct old_timespec32 struct compat_itimerspec struct old_itimerspec32 ns_to_compat_timeval() ns_to_old_timeval32() get_compat_itimerspec64() get_old_itimerspec32() put_compat_itimerspec64() put_old_itimerspec32() compat_get_timespec64() get_old_timespec32() compat_put_timespec64() put_old_timespec32() As we already have aliases in place, this patch addresses only the instances that are relevant to the system call interface in particular, not those that occur in device drivers and other modules. Those will get handled separately, while providing the 64-bit version of the respective interfaces. I'm not renaming the timex, rusage and itimerval structures, as we are still debating what the new interface will look like, and whether we will need a replacement at all. This also doesn't change the names of the syscall entry points, which can be done more easily when we actually switch over the 32-bit architectures to use them, at that point we need to change COMPAT_SYSCALL_DEFINEx to SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/lkml/20180705222110.GA5698@infradead.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-06-24time: Enable get/put_compat_itimerspec64 alwaysDeepa Dinamani1-29/+0
This will aid in enabling the compat syscalls on 32-bit architectures later on. Also move compat_itimerspec and related defines to compat_time.h. The compat_time.h file will eventually be deleted. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Cc: y2038@lists.linaro.org Link: https://lkml.kernel.org/r/20180617051144.29756-3-deepa.kernel@gmail.com
2018-06-04Merge branch 'timers-core-for-linus' of ↵Linus Torvalds1-44/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: - Core infrastucture work for Y2038 to address the COMPAT interfaces: + Add a new Y2038 safe __kernel_timespec and use it in the core code + Introduce config switches which allow to control the various compat mechanisms + Use the new config switch in the posix timer code to control the 32bit compat syscall implementation. - Prevent bogus selection of CPU local clocksources which causes an endless reselection loop - Remove the extra kthread in the clocksource code which has no value and just adds another level of indirection - The usual bunch of trivial updates, cleanups and fixlets all over the place - More SPDX conversions * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) clocksource/drivers/mxs_timer: Switch to SPDX identifier clocksource/drivers/timer-imx-tpm: Switch to SPDX identifier clocksource/drivers/timer-imx-gpt: Switch to SPDX identifier clocksource/drivers/timer-imx-gpt: Remove outdated file path clocksource/drivers/arc_timer: Add comments about locking while read GFRC clocksource/drivers/mips-gic-timer: Add pr_fmt and reword pr_* messages clocksource/drivers/sprd: Fix Kconfig dependency clocksource: Move inline keyword to the beginning of function declarations timer_list: Remove unused function pointer typedef timers: Adjust a kernel-doc comment tick: Prefer a lower rating device only if it's CPU local device clocksource: Remove kthread time: Change nanosleep to safe __kernel_* types time: Change types to new y2038 safe __kernel_* types time: Fix get_timespec64() for y2038 safe compat interfaces time: Add new y2038 safe __kernel_timespec posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME time: Introduce CONFIG_COMPAT_32BIT_TIME time: Introduce CONFIG_64BIT_TIME in architectures compat: Enable compat_get/put_timespec64 always ...
2018-05-10compat: fix 4-byte infoleak via uninitialized struct fieldJann Horn1-0/+1
Commit 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to native counterparts") removed the memset() in compat_get_timex(). Since then, the compat adjtimex syscall can invoke do_adjtimex() with an uninitialized ->tai. If do_adjtimex() doesn't write to ->tai (e.g. because the arguments are invalid), compat_put_timex() then copies the uninitialized ->tai field to userspace. Fix it by adding the memset() back. Fixes: 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to native counterparts") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-19compat: Enable compat_get/put_timespec64 alwaysDeepa Dinamani1-44/+8
These functions are used in the repurposed compat syscalls to provide backward compatibility for using 32 bit time_t on 32 bit systems. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-02mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.cDominik Brodowski1-22/+0
Move compat_sys_move_pages() to mm/migrate.c and make it call a newly introduced helper -- kernel_move_pages() -- instead of the syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.cDominik Brodowski1-33/+0
Move compat_sys_migrate_pages() to mm/mempolicy.c and make it call a newly introduced helper -- kernel_migrate_pages() -- instead of the syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-03-02signals: Move put_compat_sigset to compat.h to silence hardened usercopyMatt Redfearn1-19/+0
Since commit afcc90f8621e ("usercopy: WARN() on slab cache usercopy region violations"), MIPS systems booting with a compat root filesystem emit a warning when copying compat siginfo to userspace: WARNING: CPU: 0 PID: 953 at mm/usercopy.c:81 usercopy_warn+0x98/0xe8 Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLAB object 'task_struct' (offset 1432, size 16)! Modules linked in: CPU: 0 PID: 953 Comm: S01logging Not tainted 4.16.0-rc2 #10 Stack : ffffffff808c0000 0000000000000000 0000000000000001 65ac85163f3bdc4a 65ac85163f3bdc4a 0000000000000000 90000000ff667ab8 ffffffff808c0000 00000000000003f8 ffffffff808d0000 00000000000000d1 0000000000000000 000000000000003c 0000000000000000 ffffffff808c8ca8 ffffffff808d0000 ffffffff808d0000 ffffffff80810000 fffffc0000000000 ffffffff80785c30 0000000000000009 0000000000000051 90000000ff667eb0 90000000ff667db0 000000007fe0d938 0000000000000018 ffffffff80449958 0000000020052798 ffffffff808c0000 90000000ff664000 90000000ff667ab0 00000000100c0000 ffffffff80698810 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ffffffff8010d02c 65ac85163f3bdc4a ... Call Trace: [<ffffffff8010d02c>] show_stack+0x9c/0x130 [<ffffffff80698810>] dump_stack+0x90/0xd0 [<ffffffff80137b78>] __warn+0x100/0x118 [<ffffffff80137bdc>] warn_slowpath_fmt+0x4c/0x70 [<ffffffff8021e4a8>] usercopy_warn+0x98/0xe8 [<ffffffff8021e68c>] __check_object_size+0xfc/0x250 [<ffffffff801bbfb8>] put_compat_sigset+0x30/0x88 [<ffffffff8011af24>] setup_rt_frame_n32+0xc4/0x160 [<ffffffff8010b8b4>] do_signal+0x19c/0x230 [<ffffffff8010c408>] do_notify_resume+0x60/0x78 [<ffffffff80106f50>] work_notifysig+0x10/0x18 ---[ end trace 88fffbf69147f48a ]--- Commit 5905429ad856 ("fork: Provide usercopy whitelisting for task_struct") noted that: "While the blocked and saved_sigmask fields of task_struct are copied to userspace (via sigmask_to_save() and setup_rt_frame()), it is always copied with a static length (i.e. sizeof(sigset_t))." However, this is not true in the case of compat signals, whose sigset is copied by put_compat_sigset and receives size as an argument. At most call sites, put_compat_sigset is copying a sigset from the current task_struct. This triggers a warning when CONFIG_HARDENED_USERCOPY is active. However, by marking this function as static inline, the warning can be avoided because in all of these cases the size is constant at compile time, which is allowed. The only site where this is not the case is handling the rt_sigpending syscall, but there the copy is being made from a stack local variable so does not trigger the warning. Move put_compat_sigset to compat.h, and mark it static inline. This fixes the WARN on MIPS. Fixes: afcc90f8621e ("usercopy: WARN() on slab cache usercopy region violations") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Dmitry V . Levin" <ldv@altlinux.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: kernel-hardening@lists.openwall.com Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18639/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-02-06cpumask: make cpumask_size() return "unsigned int"Alexey Dobriyan1-1/+1
CPUmasks are never big enough to warrant 64-bit code. Space savings: add/remove: 0/0 grow/shrink: 1/4 up/down: 3/-17 (-14) Function old new delta sched_init_numa 1530 1533 +3 compat_sys_sched_setaffinity 160 159 -1 sys_sched_getaffinity 197 195 -2 sys_sched_setaffinity 183 176 -7 compat_sys_sched_getaffinity 179 172 -7 Link: http://lkml.kernel.org/r/20171204165531.GA8221@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-20sched_rr_get_interval(): move compat to native, get rid of set_fs()Al Viro1-16/+0
switch to using timespec64 internally, while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-19get_compat_sigset()Al Viro1-7/+16
similar to put_compat_sigset() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-19get rid of {get,put}_compat_itimerspec()Al Viro1-18/+0
no users left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-19signal: replace sigset_to_compat() with put_compat_sigset()Dmitry V. Levin1-6/+14
There are 4 callers of sigset_to_compat() in the entire kernel. One is in sparc compat rt_sigaction(2), the rest are in kernel/signal.c itself. All are followed by copy_to_user(), and all but the sparc one are under "if it's big-endian..." ifdefs. Let's transform sigset_to_compat() into put_compat_sigset() that also calls copy_to_user(). Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-15semtimedop(): move compat to nativeAl Viro1-23/+0
... and finally kill the sodding compat_convert_timespec() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-06Merge branch 'misc.compat' of ↵Linus Torvalds1-218/+49
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc compat stuff updates from Al Viro: "This part is basically untangling various compat stuff. Compat syscalls moved to their native counterparts, getting rid of quite a bit of double-copying and/or set_fs() uses. A lot of field-by-field copyin/copyout killed off. - kernel/compat.c is much closer to containing just the copyin/copyout of compat structs. Not all compat syscalls are gone from it yet, but it's getting there. - ipc/compat_mq.c killed off completely. - block/compat_ioctl.c cleaned up; floppy compat ioctls moved to drivers/block/floppy.c where they belong. Yes, there are several drivers that implement some of the same ioctls. Some are m68k and one is 32bit-only pmac. drivers/block/floppy.c is the only one in that bunch that can be built on biarch" * 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mqueue: move compat syscalls to native ones usbdevfs: get rid of field-by-field copyin compat_hdio_ioctl: get rid of set_fs() take floppy compat ioctls to sodding floppy.c ipmi: get rid of field-by-field __get_user() ipmi: get COMPAT_IPMICTL_RECEIVE_MSG in sync with the native one rt_sigtimedwait(): move compat to native select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap() put_compat_rusage(): switch to copy_to_user() sigpending(): move compat to native getrlimit()/setrlimit(): move compat to native times(2): move compat to native compat_{get,put}_bitmap(): use unsafe_{get,put}_user() fb_get_fscreeninfo(): don't bother with do_fb_ioctl() do_sigaltstack(): lift copying to/from userland into callers take compat_sys_old_getrlimit() to native syscall trim __ARCH_WANT_SYS_OLD_GETRLIMIT
2017-07-05Merge branch 'timers-compat' of ↵Linus Torvalds1-0/+65
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull timer-related user access updates from Al Viro: "Continuation of timers-related stuff (there had been more, but my parts of that series are already merged via timers/core). This is more of y2038 work by Deepa Dinamani, partially disrupted by the unification of native and compat timers-related syscalls" * 'timers-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: posix_clocks: Use get_itimerspec64() and put_itimerspec64() timerfd: Use get_itimerspec64() and put_itimerspec64() nanosleep: Use get_timespec64() and put_timespec64() posix-timers: Use get_timespec64() and put_timespec64() posix-stubs: Conditionally include COMPAT_SYS_NI defines time: introduce {get,put}_itimerspec64 time: add get_timespec64 and put_timespec64
2017-07-05Merge branch 'work.sys_wait' of ↵Linus Torvalds1-66/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull wait syscall updates from Al Viro: "Consolidating sys_wait* and compat counterparts. Gets rid of set_fs()/double-copy mess, simplifies the whole thing (lifting the copyouts to the syscalls means less headache in the part that does actual work - fewer failure exits, to start with), gets rid of the overhead of field-by-field __put_user()" * 'work.sys_wait' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: osf_wait4: switch to kernel_wait4() waitid(): switch copyout of siginfo to unsafe_put_user() wait_task_zombie: consolidate info logics kill wait_noreap_copyout() lift getrusage() from wait_noreap_copyout() waitid(2): leave copyout of siginfo to syscall itself kernel_wait4()/kernel_waitid(): delay copying status to userland wait4(2)/waitid(2): separate copying rusage to userland move compat wait4 and waitid next to native variants
2017-06-25time: introduce {get,put}_itimerspec64Deepa Dinamani1-0/+21
As we change the user space type for the timerfd and posix timer functions to newer data types, we need some form of conversion helpers to avoid duplicating that logic. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-25time: add get_timespec64 and put_timespec64Deepa Dinamani1-0/+44
Add helper functions to convert between struct timespec64 and struct timespec at userspace boundaries. This is a preparatory patch to use timespec64 as the basic type internally in the kernel as timespec is not y2038 safe on 32 bit systems. The patch helps the cause by containing all data conversions at the userspace boundaries within these functions. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-14time: Move compat_gettimeofday()/settimeofday() to nativeAl Viro1-38/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-16-viro@ZenIV.linux.org.uk
2017-06-14time: Move compat_time()/stime() to nativeAl Viro1-40/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-15-viro@ZenIV.linux.org.uk
2017-06-14posix-timers: Move compat_timer_create() to native, get rid of set_fs()Al Viro1-18/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-14-viro@ZenIV.linux.org.uk
2017-06-14posix-timers: Move compat versions of clock_gettime/settime/getresAl Viro1-51/+0
Move them to the native implementations and get rid of the set_fs() hackery. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-13-viro@ZenIV.linux.org.uk
2017-06-14itimers: Move compat itimer syscalls to native onesAl Viro1-53/+16
get rid of set_fs(), sanitize compat copyin/copyout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-12-viro@ZenIV.linux.org.uk
2017-06-14posix-timers: Take compat timer_gettime(2) to native oneAl Viro1-17/+0
... and get rid of set_fs() in there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-11-viro@ZenIV.linux.org.uk
2017-06-14posix-timers: Take compat timer_settime(2) to native oneAl Viro1-23/+0
... and get rid of set_fs() in there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-10-viro@ZenIV.linux.org.uk
2017-06-14ntp: Move adjtimex related compat syscalls to native counterpartsAl Viro1-89/+52
Get rid of set_fs() mess and sanitize compat_{get,put}_timex(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-9-viro@ZenIV.linux.org.uk
2017-06-14time/posix-timers: Move the compat copyouts to the nanosleep implementationsAl Viro1-131/+0
Turn restart_block.nanosleep.{rmtp,compat_rmtp} into a tagged union (kind = 1 -> native, kind = 2 -> compat, kind = 0 -> nothing) and make the places doing actual copyout handle compat as well as native (that will become a helper in the next commit). Result: compat wrappers, messing with reassignments, etc. are gone. [ tglx: Folded in a variant of Peter Zijlstras enum patch ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-6-viro@ZenIV.linux.org.uk
2017-06-14hrtimer_nanosleep(): Pass rmtp in restart_blockAl Viro1-3/+3
Store the pointer to the timespec which gets updated with the remaining time in the restart block and remove the function argument. [ tglx: Added changelog ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170607084241.28657-3-viro@ZenIV.linux.org.uk
2017-06-09rt_sigtimedwait(): move compat to nativeAl Viro1-32/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09put_compat_rusage(): switch to copy_to_user()Al Viro1-19/+21
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09sigpending(): move compat to nativeAl Viro1-23/+0
... and kill set_fs() use Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09getrlimit()/setrlimit(): move compat to nativeAl Viro1-38/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09times(2): move compat to nativeAl Viro1-24/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-09compat_{get,put}_bitmap(): use unsafe_{get,put}_user()Al Viro1-53/+28
unroll the inner loops, while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-27take compat_sys_old_getrlimit() to native syscallAl Viro1-29/+0
... and sanitize the ifdefs in there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-21move compat wait4 and waitid next to native variantsAl Viro1-66/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-14time: Change k_clock nsleep() to use timespec64Deepa Dinamani1-2/+4
struct timespec is not y2038 safe on 32 bit machines. Replace uses of struct timespec with struct timespec64 in the kernel. The syscall interfaces themselves will be changed in a separate series. Note that the restart_block parameter for nanosleep has also been left unchanged and will be part of syscall series noted above. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: y2038@lists.linaro.org Cc: john.stultz@linaro.org Cc: arnd@arndb.de Link: http://lkml.kernel.org/r/1490555058-4603-8-git-send-email-deepa.kernel@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14time: Delete do_sys_setimeofday()Deepa Dinamani1-2/+2
struct timespec is not y2038 safe on 32 bit machines and needs to be replaced with struct timespec64. do_sys_timeofday() is just a wrapper function. Replace all calls to this function with direct calls to do_sys_timeofday64() instead and delete do_sys_timeofday(). Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: y2038@lists.linaro.org Cc: john.stultz@linaro.org Cc: arnd@arndb.de Cc: linux-alpha@vger.kernel.org Link: http://lkml.kernel.org/r/1490555058-4603-2-git-send-email-deepa.kernel@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds1-1/+1
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-16posix-timers: Make them configurableNicolas Pitre1-0/+8
Some embedded systems have no use for them. This removes about 25KB from the kernel binary size when configured out. Corresponding syscalls are routed to a stub logging the attempt to use those syscalls which should be enough of a clue if they were disabled without proper consideration. They are: timer_create, timer_gettime: timer_getoverrun, timer_settime, timer_delete, clock_adjtime, setitimer, getitimer, alarm. The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast majority of use cases with very little code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: linux-kbuild@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Michal Marek <mmarek@suse.com> Cc: Edward Cree <ecree@solarflare.com> Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-04compat: cleanup coding in compat_get_bitmap() and compat_put_bitmap()Helge Deller1-2/+4
In the functions compat_get_bitmap() and compat_put_bitmap() the variable nr_compat_longs stores how many compat_ulong_t words should be copied in a loop. The copy loop itself is this: if (nr_compat_longs-- > 0) { if (__get_user(um, umask)) return -EFAULT; } else { um = 0; } Since nr_compat_longs gets unconditionally decremented in each loop and since it's type is unsigned this could theoretically lead to out of bounds accesses to userspace if nr_compat_longs wraps around to (unsigned)(-1). Although the callers currently do not trigger out-of-bounds accesses, we should better implement the loop in a safe way to completely avoid such warp-arounds. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk>
2015-02-12all arches, signal: move restart_block to struct task_structAndy Lutomirski1-3/+2
If an attacker can cause a controlled kernel stack overflow, overwriting the restart block is a very juicy exploit target. This is because the restart_block is held in the same memory allocation as the kernel stack. Moving the restart block to struct task_struct prevents this exploit by making the restart_block harder to locate. Note that there are other fields in thread_info that are also easy targets, at least on some architectures. It's also a decent simplification, since the restart code is more or less identical on all architectures. [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack] Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: David Miller <davem@davemloft.net> Acked-by: Richard Weinberger <richard@nod.at> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Steven Miao <realmz6@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-06compat: nanosleep: Clarify error handlingThomas Gleixner1-3/+21
The error handling in compat_sys_nanosleep() is correct, but completely non obvious. Document it and restrict it to the -ERESTART_RESTARTBLOCK return value for clarity. Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-06-04kernel/compat.c: use sizeof() instead of sizeofFabian Frederick1-4/+4
Fix 4 checkpatch warnings WARNING: sizeof *tv should be sizeof(*tv) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-02Merge branch 'x86-x32-for-linus' of ↵Linus Torvalds1-56/+56
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull compat time conversion changes from Peter Anvin: "Despite the branch name this is really neither an x86 nor an x32-specific patchset, although it the implementation of the discussions that followed the x32 security hole a few months ago. This removes get/put_compat_timespec/val() and replaces them with compat_get/put_timespec/val() which are savvy as to the current status of COMPAT_USE_64BIT_TIME. It removes several unused and/or incorrect/misleading functions (like compat_put_timeval_convert which doesn't in fact do any conversion) and also replaces several open-coded implementations what is now called compat_convert_timespec() with that function" * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compat: Fix sparse address space warnings compat: Get rid of (get|put)_compat_time(val|spec)
2014-03-06mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter typesHeiko Carstens1-5/+5
In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that performs proper zero and sign extension convert all 64 bit parameters to their corresponding 32 bit compat counterparts. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06kernel/compat: convert to COMPAT_SYSCALL_DEFINEHeiko Carstens1-45/+45
Convert all compat system call functions where all parameter types have a size of four or less than four bytes, or are pointer types to COMPAT_SYSCALL_DEFINE. The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper zero and sign extension to 64 bit of all parameters if needed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-02-02compat: Fix sparse address space warningsH. Peter Anvin1-2/+2
In compat_sys_old_getrlimit() we pass a kernel pointer to sys_old_getrlimit() inside a set_fs() bracket. This is okay, so we can safely cast the affected pointer to __user. In compat_clock_nanosleep_restart(), the variable "rmtp" holds a user pointer. Annotate it as such. Both of these warnings are ancient, but were reported by Fengguang Wu's test system due to other changes. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Toyo Abe <toyoa@mvista.com> Link: http://lkml.kernel.org/n/tip-507h7cq5e45eg6ygtykon3bf@git.kernel.org
2014-02-02compat: Get rid of (get|put)_compat_time(val|spec)H. Peter Anvin1-54/+54
We have two APIs for compatiblity timespec/val, with confusingly similar names. compat_(get|put)_time(val|spec) *do* handle the case where COMPAT_USE_64BIT_TIME is set, whereas (get|put)_compat_time(val|spec) do not. This is an accident waiting to happen. Clean it up by favoring the full-service version; the limited version is replaced with double-underscore versions static to kernel/compat.c. A common pattern is to convert a struct timespec to kernel format in an allocation on the user stack. Unfortunately it is open-coded in several places. Since this allocation isn't actually needed if COMPAT_USE_64BIT_TIME is true (since user format == kernel format) encapsulate that whole pattern into the function compat_convert_timespec(). An equivalent function should be written for struct timeval if it is needed in the future. Finally, get rid of compat_(get|put)_timeval_convert(): each was only used once, and the latter was not even doing what the function said (no conversion actually was being done.) Moving the conversion into compat_sys_settimeofday() itself makes the code much more similar to sys_settimeofday() itself. v3: Remove unused compat_convert_timeval(). v2: Drop bogus "const" in the destination argument for compat_convert_time*(). Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds1-19/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull compat cleanup from Al Viro: "Mostly about syscall wrappers this time; there will be another pile with patches in the same general area from various people, but I'd rather push those after both that and vfs.git pile are in." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: syscalls.h: slightly reduce the jungles of macros get rid of union semop in sys_semctl(2) arguments make do_mremap() static sparc: no need to sign-extend in sync_file_range() wrapper ppc compat wrappers for add_key(2) and request_key(2) are pointless x86: trim sys_ia32.h x86: sys32_kill and sys32_mprotect are pointless get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC merge compat sys_ipc instances consolidate compat lookup_dcookie() convert vmsplice to COMPAT_SYSCALL_DEFINE switch getrusage() to COMPAT_SYSCALL_DEFINE switch epoll_pwait to COMPAT_SYSCALL_DEFINE convert sendfile{,64} to COMPAT_SYSCALL_DEFINE switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect make HAVE_SYSCALL_WRAPPERS unconditional consolidate cond_syscall and SYSCALL_ALIAS declarations teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long get rid of duplicate logics in __SC_....[1-6] definitions
2013-04-30kernel/compat.c: make do_sysinfo() staticStephen Rothwell1-65/+0
The only use outside of kernel/timer.c was in kernel/compat.c, so move compat_sys_sysinfo() next to sys_sysinfo() in kernel/timer.c. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-03switch getrusage() to COMPAT_SYSCALL_DEFINEAl Viro1-19/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-23Merge branch 'for-linus' of ↵Linus Torvalds1-46/+26
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "This is the first pile; another one will come a bit later and will contain SYSCALL_DEFINE-related patches. - a bunch of signal-related syscalls (both native and compat) unified. - a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE (fixing several potential problems with missing argument validation, while we are at it) - a lot of now-pointless wrappers killed - a couple of architectures (cris and hexagon) forgot to save altstack settings into sigframe, even though they used the (uninitialized) values in sigreturn; fixed. - microblaze fixes for delivery of multiple signals arriving at once - saner set of helpers for signal delivery introduced, several architectures switched to using those." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits) x86: convert to ksignal sparc: convert to ksignal arm: switch to struct ksignal * passing alpha: pass k_sigaction and siginfo_t using ksignal pointer burying unused conditionals make do_sigaltstack() static arm64: switch to generic old sigaction() (compat-only) arm64: switch to generic compat rt_sigaction() arm64: switch compat to generic old sigsuspend arm64: switch to generic compat rt_sigqueueinfo() arm64: switch to generic compat rt_sigpending() arm64: switch to generic compat rt_sigprocmask() arm64: switch to generic sigaltstack sparc: switch to generic old sigsuspend sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE sparc: kill sign-extending wrappers for native syscalls kill sparc32_open() sparc: switch to use of generic old sigaction sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE mips: switch to generic sys_fork() and sys_clone() ...
2013-02-21compat: return -EFAULT on error in waitid()Dan Carpenter1-1/+1
The copy_to_user() call returns the number of bytes remaining but we want to return -EFAULT on error. Fixes "x32: fix waitid()" Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-03switch compat_sys_sched_rr_get_interval to COMPAT_SYSCALL_DEFINEAl Viro1-4/+3
... and make it unconditional - we want the sucker on all biarch platforms, really. All kinds of wrappers and private implementations can go now; fortunately, they don't cause name conflicts, so we can do that one first without any bisect hazards. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03switch rt_tgsigqueueinfo to COMPAT_SYSCALL_DEFINEAl Viro1-11/+0
C ABI violations on sparc, ppc and mips Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03switch compat_sys_sigprocmask to COMPAT_SYSCALL_DEFINEAl Viro1-3/+3
In principle, C ABI violation on ppc and mips... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03switch compat_sys_rt_sigtimedwait to COMPAT_SYSCALL_DEFINEAl Viro1-5/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03generic compat_sys_rt_sigprocmask()Al Viro1-1/+12
conditional on GENERIC_COMPAT_RT_SIGPROCMASK; by the end of that series it will become the same thing as COMPAT and conditional will die out. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03consolidate rt_sigsuspend()Al Viro1-17/+0
* pull compat version alongside with the native one * make little-endian compat variant just call the native * don't bother with separate conditional for compat (both native and compat are going to become unconditional very soon). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-03switch compat_sys_[gs]etitimer(2) to COMPAT_SYSCALL_DEFINEAl Viro1-5/+5
again, strictly speaking we are in nasal daemon territory on ppc and mips - we need to sign-extend int arguments. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26x32: fix sigtimedwaitAl Viro1-1/+1
It needs 64bit timespec. As it is, we end up truncating the timeout to whole seconds; usually it doesn't matter, but for having all sub-second timeouts truncated to one jiffy is visibly wrong. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26x32: fix waitid()Al Viro1-1/+5
It needs 64bit rusage and 32bit siginfo. glibc never calls it with non-NULL rusage pointer, or we would've seen breakage already... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-26switch compat_sys_wait4() and compat_sys_waitid() to COMPAT_SYSCALL_DEFINEAl Viro1-6/+9
Strictly speaking, ppc64 needs it for C ABI compliance. Realistically I would be very surprised if e.g. passing 0xffffffff as 'options' argument to waitid() from 32bit task would cause problems, but yes, it puts us into undefined behaviour territory. ppc64 expects int argument to be passed in 64bit register with bits 31..63 containing the same value. SYSCALL_DEFINE on ppc provides a wrapper that normalizes the value passed from userland; so does COMPAT_SYSCALL_DEFINE. Plain declaration of compat_sys_something() with an int argument obviously doesn't. Again, for wait4 and waitid I would be extremely surprised if gcc started to produce code depending on that value having been properly sign-extended - the argument(s) in question end up passed blindly to sys_wait4 and sys_waitid resp. and normalization for native syscalls takes care of their use there. Still, better to use COMPAT_SYSCALL_DEFINE here than worry about nasal daemons... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-17compat: generic compat_sys_sched_rr_get_interval() implementationCatalin Marinas1-0/+17
This function is used by sparc, powerpc tile and arm64 for compat support. The patch adds a generic implementation with a wrapper for PowerPC to do the u32->int sign extension. The reason for a single patch covering powerpc, tile, sparc and arm64 is to keep it bisectable, otherwise kernel building may fail with mismatched function declarations. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile] Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-21new helper: sigsuspend()Al Viro1-9/+1
guts of saved_sigmask-based sigsuspend/rt_sigsuspend. Takes kernel sigset_t *. Open-coded instances replaced with calling it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-10compat: Fix RT signal mask corruption via sigprocmaskJan Kiszka1-17/+46
compat_sys_sigprocmask reads a smaller signal mask from userspace than sigprogmask accepts for setting. So the high word of blocked.sig[0] will be cleared, releasing any potentially blocked RT signal. This was discovered via userspace code that relies on get/setcontext. glibc's i386 versions of those functions use sigprogmask instead of rt_sigprogmask to save/restore signal mask and caused RT signal unblocking this way. As suggested by Linus, this replaces the sys_sigprocmask based compat version with one that open-codes the required logic, including the merge of the existing blocked set with the new one provided on SIG_SETMASK. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-20compat: Add helper functions to read/write struct timeval, timespecH. Peter Anvin1-8/+60
Add helper functions to read and write struct timeval and struct timespec from userspace. We already had helper functions for reading and writing struct compat_timespec; add a set of functions to do the same with struct timeval, and add a second suite of functions which can be sensitive to COMPAT_USE_64BIT_TIME and access either 32- or 64-bit time structures. This also exports these helper functions to modules. Rename the existing inlines for converting between struct compat_timeval and native struct timespec so we can have a saner naming convention for the exported functions. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2011-10-31kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructurePaul Gortmaker1-0/+1
These files were getting <linux/module.h> via an implicit non-obvious path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-07-30Merge branch 'v4l_for_linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (430 commits) [media] ir-mce_kbd-decoder: include module.h for its facilities [media] ov5642: include module.h for its facilities [media] em28xx: Fix DVB-C maxsize for em2884 [media] tda18271c2dd: Fix saw filter configuration for DVB-C @6MHz [media] v4l: mt9v032: Fix Bayer pattern [media] V4L: mt9m111: rewrite set_pixfmt [media] V4L: mt9m111: fix missing return value check mt9m111_reg_clear [media] V4L: initial driver for ov5642 CMOS sensor [media] V4L: sh_mobile_ceu_camera: fix Oops when USERPTR mapping fails [media] V4L: soc-camera: remove soc-camera bus and devices on it [media] V4L: soc-camera: un-export the soc-camera bus [media] V4L: sh_mobile_csi2: switch away from using the soc-camera bus notifier [media] V4L: add media bus configuration subdev operations [media] V4L: soc-camera: group struct field initialisations together [media] V4L: soc-camera: remove now unused soc-camera specific PM hooks [media] V4L: pxa-camera: switch to using standard PM hooks [media] NetUP Dual DVB-T/C CI RF: force card hardware revision by module param [media] Don't OOPS if videobuf_dvb_get_frontend return NULL [media] NetUP Dual DVB-T/C CI RF: load firmware according card revision [media] omap3isp: Support configurable HS/VS polarities ... Fix up conflicts: - arch/arm/mach-omap2/board-rx51-peripherals.c: cleanup regulator supply definitions in mach-omap2 vs OMAP3: RX-51: define vdds_csib regulator supply - drivers/staging/tm6000/tm6000-alsa.c (trivial)
2011-07-27[media] v4l2-compat-ioctl32: add VIDIOC_DQEVENT supportHans Verkuil1-0/+1
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-07-27signals: sys_ssetmask/sys_rt_sigsuspend should use set_current_blocked()Oleg Nesterov1-4/+1
sys_ssetmask(), sys_rt_sigsuspend() and compat_sys_rt_sigsuspend() change ->blocked directly. This is not correct, see the changelog in e6fa16ab "signal: sigprocmask() should do retarget_shared_pending()" Change them to use set_current_blocked(). Another change is that now we are doing ->saved_sigmask = ->blocked lockless, it doesn't make any sense to do this under ->siglock. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Matt Fleming <matt.fleming@linux.intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-12KVM: Add compat ioctl for KVM_SET_SIGNAL_MASKAlexander Graf1-0/+1
KVM has an ioctl to define which signal mask should be used while running inside VCPU_RUN. At least for big endian systems, this mask is different on 32-bit and 64-bit systems (though the size is identical). Add a compat wrapper that converts the mask to whatever the kernel accepts, allowing 32-bit kvm user space to set signal masks. This patch fixes qemu with --enable-io-thread on ppc64 hosts when running 32-bit user land. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds1-0/+8
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits) arch/tile: prefer "tilepro" as the name of the 32-bit architecture compat: include aio_abi.h for aio_context_t arch/tile: cleanups for tilegx compat mode arch/tile: allocate PCI IRQs later in boot arch/tile: support signal "exception-trace" hook arch/tile: use better definitions of xchg() and cmpxchg() include/linux/compat.h: coding-style fixes tile: add an RTC driver for the Tilera hypervisor arch/tile: finish enabling support for TILE-Gx 64-bit chip compat: fixes to allow working with tile arch arch/tile: update defconfig file to something more useful tile: do_hardwall_trap: do not play with task->sighand tile: replace mm->cpu_vm_mask with mm_cpumask() tile,mn10300: add device parameter to dma_cache_sync() audit: support the "standard" <asm-generic/unistd.h> arch/tile: clarify flush_buffer()/finv_buffer() function names arch/tile: kernel-related cleanups from removing static page size arch/tile: various header improvements for building drivers arch/tile: disable GX prefetcher during cache flush arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD ...
2011-05-12compat: fixes to allow working with tile archChris Metcalf1-0/+8
The existing <asm-generic/unistd.h> mechanism doesn't really provide enough to create the 64-bit "compat" ABI properly in a generic way, since the compat ABI is a mix of things were you can re-use the 64-bit versions of syscalls and things where you need a compat wrapper. To provide this in the most direct way possible, I added two new macros to go along with the existing __SYSCALL and __SC_3264 macros: __SC_COMP and SC_COMP_3264. These macros take an additional argument, typically a "compat_sys_xxx" function, which is passed to __SYSCALL if you define __SYSCALL_COMPAT when including the header, resulting in a pointer to the compat function being placed in the generated syscall table. The change also adds some missing definitions to <linux/compat.h> so that it actually has declarations for all the compat syscalls, since the "[nr] = ##call" approach requires proper C declarations for all the functions included in the syscall table. Finally, compat.c defines compat_sys_sigpending() and compat_sys_sigprocmask() even if the underlying architecture doesn't request it, which tries to pull in undefined compat_old_sigset_t defines. We need to guard those compat syscall definitions with appropriate __ARCH_WANT_SYS_xxx ifdefs. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-04-28signal: introduce do_sigtimedwait() to factor out compat/native codeOleg Nesterov1-34/+7
Factor out the common code in sys_rt_sigtimedwait/compat_sys_rt_sigtimedwait to the new helper, do_sigtimedwait(). Add the comment to document the extra tick we add to timespec_to_jiffies(ts), thanks to Linus who explained this to me. Perhaps it would be better to move compat_sys_rt_sigtimedwait() into signal.c under CONFIG_COMPAT, then we can make do_sigtimedwait() static. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Matt Fleming <matt.fleming@linux.intel.com>
2011-04-28signal: sys_rt_sigtimedwait: simplify the timeout logicOleg Nesterov1-24/+18
No functional changes, cleanup compat_sys_rt_sigtimedwait() and sys_rt_sigtimedwait(). Calculate the timeout before we take ->siglock, this simplifies and lessens the code. Use timespec_valid() to check the timespec. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Matt Fleming <matt.fleming@linux.intel.com>
2011-02-02posix-timers: Introduce a syscall for clock tuning.Richard Cochran1-0/+23
A new syscall is introduced that allows tuning of a POSIX clock. The new call, clock_adjtime, takes two parameters, the clock ID and a pointer to a struct timex. Any ADJTIMEX(2) operation may be requested via this system call, but various POSIX clocks may or may not support tuning. [ tglx: Adapted to the posix-timer cleanup series. Avoid copy_to_user in the error case ] Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Acked-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20110201134419.869804645@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-02time: Splitout compat timex accessorsRichard Cochran1-48/+65
Split out the compat timex accessors into separate functions. Preparatory patch for a new syscall. [ tglx: Split that patch from Richards "posix-timers: Introduce a syscall for clock tuning.". Keeps the changes strictly separate ] Originally-from: Richard Cochran <richardcochran@gmail.com> Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20110201134419.772343089@linutronix.de>
2010-09-14compat: Make compat_alloc_user_space() incorporate the access_ok()H. Peter Anvin1-0/+21
compat_alloc_user_space() expects the caller to independently call access_ok() to verify the returned area. A missing call could introduce problems on some architectures. This patch incorporates the access_ok() check into compat_alloc_user_space() and also adds a sanity check on the length. The existing compat_alloc_user_space() implementations are renamed arch_compat_alloc_user_space() and are used as part of the implementation of the new global function. This patch assumes NULL will cause __get_user()/__put_user() to either fail or access userspace on all architectures. This should be followed by checking the return value of compat_access_user_space() for NULL in the callers, at which time the access_ok() in the callers can also be removed. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <jejb@parisc-linux.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: <stable@kernel.org>
2010-07-16rlimits: switch more rlimit syscalls to do_prlimitJiri Slaby1-14/+3
After we added more generic do_prlimit, switch sys_getrlimit to that. Also switch compat handling, so we can get rid of ugly __user casts and avoid setting process' address limit to kernel data and back. Signed-off-by: Jiri Slaby <jslaby@suse.cz>
2010-05-19cpumask: fix compat getaffinityKOSAKI Motohiro1-14/+11
Commit a45185d2d "cpumask: convert kernel/compat.c" broke libnuma, which abuses sched_getaffinity to find out NR_CPUS in order to parse /sys/devices/system/node/node*/cpumap. On NUMA systems with less than 32 possibly CPUs, the current compat_sys_sched_getaffinity now returns '4' instead of the actual NR_CPUS/8, which makes libnuma bail out when parsing the cpumap. The libnuma call sched_getaffinity(0, bitmap, 4096) at first. It mean the libnuma expect the return value of sched_getaffinity() is either len argument or NR_CPUS. But it doesn't expect to return nr_cpu_ids. Strictly speaking, userland requirement are 1) Glibc assume the return value mean the lengh of initialized of mask argument. E.g. if sched_getaffinity(1024) return 128, glibc make zero fill rest 896 byte. 2) Libnuma assume the return value can be used to guess NR_CPUS in kernel. It assume len-arg<NR_CPUS makes -EINVAL. But it try len=4096 at first and 4096 is always bigger than NR_CPUS. Then, if we remove strange min_length normalization, we never hit -EINVAL case. sched_getaffinity() already solved this issue. This patch adapts compat_sys_sched_getaffinity() to match the non-compat case. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Ken Werner <ken.werner@web.de> Cc: stable@kernel.org Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2009-04-30signals: implement sys_rt_tgsigqueueinfoThomas Gleixner1-0/+11
sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is missing a thread directed counterpart. Such an interface is important for migrating applications from other OSes which have the per thread delivery implemented. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Ulrich Drepper <drepper@redhat.com>
2009-01-06Allow times and time system calls to return small negative valuesPaul Mackerras1-1/+4
At the moment, the times() system call will appear to fail for a period shortly after boot, while the value it want to return is between -4095 and -1. The same thing will also happen for the time() system call on 32-bit platforms some time in 2106 or so. On some platforms, such as x86, this is unavoidable because of the system call ABI, but other platforms such as powerpc have a separate error indication from the return value, so system calls can in fact return small negative values without indicating an error. On those platforms, force_successful_syscall_return() provides a way to indicate that the system call return value should not be treated as an error even if it is in the range which would normally be taken as a negative error number. This adds a force_successful_syscall_return() call to the time() and times() system calls plus their 32-bit compat versions, so that they don't erroneously indicate an error on those platforms whose system call ABI has a separate error indication. This will not affect anything on other platforms. Joakim Tjernlund added the fix for time() and the compat versions of time() and times(), after I did the fix for times(). Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-01cpumask: convert kernel/compat.cRusty Russell1-19/+30
Impact: Reduce stack usage, use new cpumask API. Straightforward conversion; cpumasks' size is given by cpumask_size() (now a variable rather than fixed) and on-stack cpu masks use cpumask_var_t. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-20Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', ↵Thomas Gleixner1-39/+72
'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus
2008-10-16compat: generic compat get/settimeofdayChristoph Hellwig1-0/+58
Nothing arch specific in get/settimeofday. The details of the timeval conversion varied a little from arch to arch, but all with the same results. Also add an extern declaration for sys_tz to linux/time.h because externs in .c files are fowned upon. I'll kill the externs in various other files in a sparate patch. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ] Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-14timers: fix itimer/many thread hangFrank Mayhar1-39/+14
Overview This patch reworks the handling of POSIX CPU timers, including the ITIMER_PROF, ITIMER_VIRT timers and rlimit handling. It was put together with the help of Roland McGrath, the owner and original writer of this code. The problem we ran into, and the reason for this rework, has to do with using a profiling timer in a process with a large number of threads. It appears that the performance of the old implementation of run_posix_cpu_timers() was at least O(n*3) (where "n" is the number of threads in a process) or worse. Everything is fine with an increasing number of threads until the time taken for that routine to run becomes the same as or greater than the tick time, at which point things degrade rather quickly. This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF." Code Changes This rework corrects the implementation of run_posix_cpu_timers() to make it run in constant time for a particular machine. (Performance may vary between one machine and another depending upon whether the kernel is built as single- or multiprocessor and, in the latter case, depending upon the number of running processors.) To do this, at each tick we now update fields in signal_struct as well as task_struct. The run_posix_cpu_timers() function uses those fields to make its decisions. We define a new structure, "task_cputime," to contain user, system and scheduler times and use these in appropriate places: struct task_cputime { cputime_t utime; cputime_t stime; unsigned long long sum_exec_runtime; }; This is included in the structure "thread_group_cputime," which is a new substructure of signal_struct and which varies for uniprocessor versus multiprocessor kernels. For uniprocessor kernels, it uses "task_cputime" as a simple substructure, while for multiprocessor kernels it is a pointer: struct thread_group_cputime { struct task_cputime totals; }; struct thread_group_cputime { struct task_cputime *totals; }; We also add a new task_cputime substructure directly to signal_struct, to cache the earliest expiration of process-wide timers, and task_cputime also replaces the it_*_expires fields of task_struct (used for earliest expiration of thread timers). The "thread_group_cputime" structure contains process-wide timers that are updated via account_user_time() and friends. In the non-SMP case the structure is a simple aggregator; unfortunately in the SMP case that simplicity was not achievable due to cache-line contention between CPUs (in one measured case performance was actually _worse_ on a 16-cpu system than the same test on a 4-cpu system, due to this contention). For SMP, the thread_group_cputime counters are maintained as a per-cpu structure allocated using alloc_percpu(). The timer functions update only the timer field in the structure corresponding to the running CPU, obtained using per_cpu_ptr(). We define a set of inline functions in sched.h that we use to maintain the thread_group_cputime structure and hide the differences between UP and SMP implementations from the rest of the kernel. The thread_group_cputime_init() function initializes the thread_group_cputime structure for the given task. The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the out-of-line function thread_group_cputime_alloc_smp() to allocate and fill in the per-cpu structures and fields. The thread_group_cputime_free() function, also a no-op for UP, in SMP frees the per-cpu structures. The thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls thread_group_cputime_alloc() if the per-cpu structures haven't yet been allocated. The thread_group_cputime() function fills the task_cputime structure it is passed with the contents of the thread_group_cputime fields; in UP it's that simple but in SMP it must also safely check that tsk->signal is non-NULL (if it is it just uses the appropriate fields of task_struct) and, if so, sums the per-cpu values for each online CPU. Finally, the three functions account_group_user_time(), account_group_system_time() and account_group_exec_runtime() are used by timer functions to update the respective fields of the thread_group_cputime structure. Non-SMP operation is trivial and will not be mentioned further. The per-cpu structure is always allocated when a task creates its first new thread, via a call to thread_group_cputime_clone_thread() from copy_signal(). It is freed at process exit via a call to thread_group_cputime_free() from cleanup_signal(). All functions that formerly summed utime/stime/sum_sched_runtime values from from all threads in the thread group now use thread_group_cputime() to snapshot the values in the thread_group_cputime structure or the values in the task structure itself if the per-cpu structure hasn't been allocated. Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit. The run_posix_cpu_timers() function has been split into a fast path and a slow path; the former safely checks whether there are any expired thread timers and, if not, just returns, while the slow path does the heavy lifting. With the dedicated thread group fields, timers are no longer "rebalanced" and the process_timer_rebalance() function and related code has gone away. All summing loops are gone and all code that used them now uses the thread_group_cputime() inline. When process-wide timers are set, the new task_cputime structure in signal_struct is used to cache the earliest expiration; this is checked in the fast path. Performance The fix appears not to add significant overhead to existing operations. It generally performs the same as the current code except in two cases, one in which it performs slightly worse (Case 5 below) and one in which it performs very significantly better (Case 2 below). Overall it's a wash except in those two cases. I've since done somewhat more involved testing on a dual-core Opteron system. Case 1: With no itimer running, for a test with 100,000 threads, the fixed kernel took 1428.5 seconds, 513 seconds more than the unfixed system, all of which was spent in the system. There were twice as many voluntary context switches with the fix as without it. Case 2: With an itimer running at .01 second ticks and 4000 threads (the most an unmodified kernel can handle), the fixed kernel ran the test in eight percent of the time (5.8 seconds as opposed to 70 seconds) and had better tick accuracy (.012 seconds per tick as opposed to .023 seconds per tick). Case 3: A 4000-thread test with an initial timer tick of .01 second and an interval of 10,000 seconds (i.e. a timer that ticks only once) had very nearly the same performance in both cases: 6.3 seconds elapsed for the fixed kernel versus 5.5 seconds for the unfixed kernel. With fewer threads (eight in these tests), the Case 1 test ran in essentially the same time on both the modified and unmodified kernels (5.2 seconds versus 5.8 seconds). The Case 2 test ran in about the same time as well, 5.9 seconds versus 5.4 seconds but again with much better tick accuracy, .013 seconds per tick versus .025 seconds per tick for the unmodified kernel. Since the fix affected the rlimit code, I also tested soft and hard CPU limits. Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer running), the modified kernel was very slightly favored in that while it killed the process in 19.997 seconds of CPU time (5.002 seconds of wall time), only .003 seconds of that was system time, the rest was user time. The unmodified kernel killed the process in 20.001 seconds of CPU (5.014 seconds of wall time) of which .016 seconds was system time. Really, though, the results were too close to call. The results were essentially the same with no itimer running. Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds (where the hard limit would never be reached) and an itimer running, the modified kernel exhibited worse tick accuracy than the unmodified kernel: .050 seconds/tick versus .028 seconds/tick. Otherwise, performance was almost indistinguishable. With no itimer running this test exhibited virtually identical behavior and times in both cases. In times past I did some limited performance testing. those results are below. On a four-cpu Opteron system without this fix, a sixteen-thread test executed in 3569.991 seconds, of which user was 3568.435s and system was 1.556s. On the same system with the fix, user and elapsed time were about the same, but system time dropped to 0.007 seconds. Performance with eight, four and one thread were comparable. Interestingly, the timer ticks with the fix seemed more accurate: The sixteen-thread test with the fix received 149543 ticks for 0.024 seconds per tick, while the same test without the fix received 58720 for 0.061 seconds per tick. Both cases were configured for an interval of 0.01 seconds. Again, the other tests were comparable. Each thread in this test computed the primes up to 25,000,000. I also did a test with a large number of threads, 100,000 threads, which is impossible without the fix. In this case each thread computed the primes only up to 10,000 (to make the runtime manageable). System time dominated, at 1546.968 seconds out of a total 2176.906 seconds (giving a user time of 629.938s). It received 147651 ticks for 0.015 seconds per tick, still quite accurate. There is obviously no comparable test without the fix. Signed-off-by: Frank Mayhar <fmayhar@google.com> Cc: Roland McGrath <roland@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-01ntp: support for TAIRoman Zippel1-1/+2
This adds support for setting the TAI value (International Atomic Time). The value is reported back to userspace via timex (as we don't have a ntp_gettime() syscall). Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30signals: add set_restore_sigmaskRoland McGrath1-2/+1
This adds the set_restore_sigmask() inline in <linux/thread_info.h> and replaces every set_thread_flag(TIF_RESTORE_SIGMASK) with a call to it. No change, but abstracts the details of the flag protocol from all the calls. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-19generic: reduce stack pressure in sched_affinityMike Travis1-1/+1
* Modify sched_affinity functions to pass cpumask_t variables by reference instead of by value. * Use new set_cpus_allowed_ptr function. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: Paul Jackson <pj@sgi.com> Cc: Cliff Wickman <cpw@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17hrtimer: use nanosleep specific restart_block fieldsThomas Gleixner1-8/+7
Convert all the nanosleep related users of restart_block to the new nanosleep specific restart_block fields. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-10hrtimer: don't modify restart_block->fn in restart functionsOleg Nesterov1-1/+0
hrtimer_nanosleep_restart() clears/restores restart_block->fn. This is pointless and complicates its usage. Note that if sys_restart_syscall() doesn't actually happen, we have a bogus "pending" restart->fn anyway, this is harmless. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Pavel Emelyanov <xemul@sw.ru> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Toyo Abe <toyoa@mvista.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-10hrtimer: fix *rmtp/restarts handling in compat_sys_nanosleep()Oleg Nesterov1-4/+40
Spotted by Pavel Emelyanov and Alexey Dobriyan. compat_sys_nanosleep() implicitly uses hrtimer_nanosleep_restart(), this can't work. Make a suitable compat_nanosleep_restart() helper. Introduced by commit c70878b4e0b6cf8d2f1e46319e48e821ef4a8aba hrtimer: hook compat_sys_nanosleep up to high res timer code Also, set ->addr_limit = KERNEL_DS before doing hrtimer_nanosleep(), this func was changed by the previous patch and now takes the "__user *" parameter. Thanks to Ingo Molnar for fixing the bug in this patch. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Pavel Emelyanov <xemul@sw.ru> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Toyo Abe <toyoa@mvista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-18Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrtLinus Torvalds1-46/+11
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt: hrtimer: hook compat_sys_nanosleep up to high res timer code hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier
2007-10-18whitespace fixes: compat syscallsDaniel Walker1-30/+30
Signed-off-by: Daniel Walker <dwalker@mvista.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18hrtimer: hook compat_sys_nanosleep up to high res timer codeAnton Blanchard1-46/+11
Now we have high res timers on ppc64 I thought Id test them. It turns out compat_sys_nanosleep hasnt been converted to the hrtimer code and so is limited to HZ resolution. The follow patch converts compat_sys_nanosleep to use high res timers. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-05-11signal/timer/event: timerfd compat codeDavide Libenzi1-4/+4
This patch implements the necessary compat code for the timerfd system call. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Common compat_sys_sysinfoKyle McMartin1-0/+66
I noticed that almost all architectures implemented exactly the same sys32_sysinfo... except parisc, where a bug was to be found in handling of the uptime. So let's remove a whole whack of code for fun and profit. Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it would be the best tested. This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but instead extracting out the common code from sys_sysinfo. Cc: Christoph Hellwig <hch@infradead.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-11-03[PATCH] Create compat_sys_migrate_pagesStephen Rothwell1-0/+33
This is needed on bigendian 64bit architectures. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] Constify compat_get_bitmap argumentStephen Rothwell1-1/+1
This means we can call it when the bitmap we want to fetch is declared const. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.hDavid Howells1-2/+0
Revert Andrew Morton's patch to temporarily hack around the lack of a declaration of sigset_t in linux/compat.h to make the block-disablement patches build on IA64. This got accidentally pushed to Linus and should be fixed in a different manner. Also make linux/compat.h #include asm/signal.h to gain a definition of sigset_t so that it can externally declare sigset_from_compat(). This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and ppc64 and run-tested on frv. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] BLOCK: Move extern declarations out of fs/*.c into header files [try #6]David Howells1-0/+2
Create a new header file, fs/internal.h, for common definitions local to the sources in the fs/ directory. Move extern definitions that should be in header files from fs/*.c to fs/internal.h or other main header files where they span directories. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-29[PATCH] posix-timers: Fix clock_nanosleep() doesn't return the remaining ↵Toyo Abe1-0/+33
time in compatibility mode The clock_nanosleep() function does not return the time remaining when the sleep is interrupted by a signal. This patch creates a new call out, compat_clock_nanosleep_restart(), which handles returning the remaining time after a sleep is interrupted. This patch revives clock_nanosleep_restart(). It is now accessed via the new call out. The compat_clock_nanosleep_restart() is used for compatibility access. Since this is implemented in compatibility mode the normal path is virtually unaffected - no real performance impact. Signed-off-by: Toyo Abe <toyoa@mvista.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] N32 sigset and __COMPAT_ENDIAN_SWAP__akpm@osdl.org1-7/+0
I'm testing glibc on MIPS64, little-endian, N32, O32 and N64 multilibs. Among the NPTL test failures seen are some arising from sigsuspend problems for N32: it blocks the wrong signals, so SIGCANCEL (SIGRTMIN) is blocked despite glibc's carefully excluding it from sets of signals to block. Specifically, testing suggests it blocks signal N^32 instead of signal N, so (in the example tested) blocking SIGUSR1 (17) blocks signal 49 instead. glibc's sigset_t uses an array of unsigned long, as does the kernel. In both cases, signal N+1 is represented as (1UL << (N % (8 * sizeof (unsigned long)))) in word number (N / (8 * sizeof (unsigned long))). Thus the N32 glibc uses an array of 32-bit words and the N64 kernel uses an array of 64-bit words. For little-endian, the layout is the same, with signals 1-32 in the first 4 bytes, signals 33-64 in the second, etc.; for big-endian, userspace has that layout while in the kernel each 8 bytes have the two halves swapped from the userspace layout. The N32 sigsuspend syscall uses sigset_from_compat to convert the userspace sigset to kernel format. If __COMPAT_ENDIAN_SWAP__ is *not* set, this uses logic of the form set->sig[0] = compat->sig[0] | (((long)compat->sig[1]) << 32 ) to convert the userspace sigset to a kernel one. This looks correct to me for both big and little endian, given that in userspace compat->sig[1] will represent signals 33-64, and so will the high 32 bits of set->sig[0] in the kernel. If however __COMPAT_ENDIAN_SWAP__ *is* set, as it is for __MIPSEL__, it uses set->sig[0] = compat->sig[1] | (((long)compat->sig[0]) << 32 ); which seems incorrect for both big and little endian, and would explain the observed symptoms. This code is the only use of __COMPAT_ENDIAN_SWAP__, so if incorrect then that macro serves no purpose, in which case something like the following patch would seem appropriate to remove it. Signed-off-by: Joseph Myers <joseph@codesourcery.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] move_pages: fix 32 -> 64 bit compat functionChristoph Lameter1-2/+2
The definition of the third parameter is a pointer to an array of virtual addresses which give us some trouble. The existing code calculated the wrong address in the array since I used void to avoid having to specify a type. I now use the correct type "compat_uptr_t __user *" in the definition of the function in kernel/compat.c. However, I used __u32 in syscalls.h. Would have to include compat.h there in order to provide the same definition which would generate an ugly include situation. On both ia64 and x86_64 compat_uptr_t is u32. So this works although parameter declarations differ. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] sys_move_pages: 32bit support (i386, x86_64)Christoph Lameter1-0/+23
sys_move_pages() support for 32bit (i386 plus x86_64 compat layer) Add support for move_pages() on i386 and also add the compat functions necessary to run 32 bit binaries on x86_64. Add compat_sys_move_pages to the x86_64 32bit binary layer. Note that it is not up to date so I added the missing pieces. Not sure if this is done the right way. [akpm@osdl.org: compile fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27[PATCH] lightweight robust futexes: compatIngo Molnar1-23/+0
32-bit syscall compatibility support. (This patch also moves all futex related compat functionality into kernel/futex_compat.c.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26[PATCH] consolidate sys32/compat_adjtimexStephen Rothwell1-0/+59
Create compat_sys_adjtimex and use it an all appropriate places. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07[PATCH] remove bogus asm/bug.h includes.Al Viro1-1/+0
A bunch of asm/bug.h includes are both not needed (since it will get pulled anyway) and bogus (since they are done too early). Removed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-01-18[PATCH] Generic sys_rt_sigsuspend()David Woodhouse1-0/+28
The TIF_RESTORE_SIGMASK flag allows us to have a generic implementation of sys_rt_sigsuspend() instead of duplicating it for each architecture. This provides such an implementation and makes arch/powerpc use it. It also tidies up the ppc32 sys_sigsuspend() to use TIF_RESTORE_SIGMASK. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] common compat_sys_timer_createChristoph Hellwig1-2/+18
The comment in compat.c is wrong, every architecture provides a get_compat_sigevent() for the IPC compat code already. This basically moves the x86_64 version to common code and removes all the others. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] kernel: fix-up schedule_timeout() usageNishanth Aravamudan1-6/+3
Use schedule_timeout_{,un}interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16[PATCH] Fix get_compat_sigevent()David S. Miller1-1/+1
I have no idea how a bug like this lasted so long. Anyways, obvious memset()'ing of incorrect pointer. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds1-0/+860
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!