kernel/git/tglx/history.git - Linux kernel history

Age	Commit message	Author	Files	Lines
2004-08-22	[PATCH] IS_ERR() unlikeliness cleanup	Andrew Morton	1	-1/+1
	Remove now-unneeded open-coded unlikelies around IS_ERR(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] rcu: abstracted RCU dereferencing	Dipankar Sarma	1	-6/+5
	Use abstracted RCU API to dereference RCU protected data. Hides barrier details. Patch from Paul McKenney. This patch introduced an rcu_dereference() macro that replaces most uses of smp_read_barrier_depends(). The new macro has the advantage of explicitly documenting which pointers are protected by RCU -- in contrast, it is sometimes difficult to figure out which pointer is being protected by a given smp_read_barrier_depends() call. Signed-off-by: Paul McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] cleanup of ipc/msg.c	Manfred Spraul	1	-106/+134
	Attached is a cleanup of the main loops in sys_msgrcv and sys_msgsnd, based on ipc_lock_by_ptr(). Most backward gotos are gone, replaced by normal "for(;;)" loops that run until a suitable message is found. Description: - General cleanup of sys_msgrcv and sys_msgsnd: the functions were too convoluted. - Enable lockless receive, update comments. - Use ipc_getref for sys_msgsnd(), it's better than rechecking that the msqid is still valid. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] ipc: enforce SEMVMX limit for undo	Manfred Spraul	1	-1/+16
	Independent from the other patches: undo operations should not result in out of range semaphore values. The test for newval > SEMVMX is missing. The attached patch adds the test and a comment. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] ipc: remove sem_revalidate	Manfred Spraul	1	-29/+46
	The attached patch removes sem_revalidate and replaces it with ipc_rcu_getref() calls followed by ipc_lock_by_ptr(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] ipc: Add refcount to ipc_rcu_alloc	Manfred Spraul	5	-39/+85
	The lifetime of the ipc objects (sem array, msg queue, shm mapping) is controlled by kern_ipc_perms->lock - a spinlock. There is no simple way to reacquire this spinlock after it was dropped to schedule()/kmalloc/copy_{to,from}_user/whatever. The attached patch adds a reference count as a preparation to get rid of sem_revalidate(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-12	[PATCH] sparse: ipc compat annotations and cleanups	Alexander Viro	2	-211/+126
	ipc compat code switched to compat_alloc_user_space() and annotated.
2004-06-30	[PATCH] sparse: NULL vs 0 - the rest of it	Mika Kukkonen	3	-3/+3

2004-06-23	[PATCH] rcu: avoid passing an argument to the callback function	Andrew Morton	1	-5/+20
	From: Dipankar Sarma <dipankar@in.ibm.com> This patch changes the call_rcu() API and avoids passing an argument to the callback function as suggested by Rusty. Instead, it is assumed that the user has embedded the rcu head into a structure that is useful in the callback and the rcu_head pointer is passed to the callback. The callback can use container_of() to get the pointer to its structure and work with it. Together with the rcu-singly-link patch, it reduces the rcu_head size by 50%. Considering that we use these in things like struct dentry and struct dst_entry, this is good savings in space. An example : struct my_struct { struct rcu_head rcu; int x; int y; }; void my_rcu_callback(struct rcu_head *head) { struct my_struct *p = container_of(head, struct my_struct, rcu); free(p); } void my_delete(struct my_struct *p) { ... call_rcu(&p->rcu, my_rcu_callback); ... } Signed-Off-By: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-17	[PATCH] RLIM: adjust default mqueue sizes	Chris Wright	1	-3/+3
	Lower default sizes for POSIX mqueue allocation now that rlimits are in place. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-17	[PATCH] RLIM: enforce rlimits for POSIX mqueue allocation	Chris Wright	1	-5/+39
	Add a user_struct to the mq_inode_info structure. Charge the maximum number of bytes that could be allocated to a mqueue to the user who creates the mqueue. This is checked against the per user rlimit. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-17	[PATCH] RLIM: add mq_attr_ok() helper	Chris Wright	1	-10/+23
	Add helper function mq_attr_ok() to do mq_attr sanity checking, and do some extra overflow checking. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-28	[PATCH] sparse: ipc __user annotation	Alexander Viro	7	-26/+28

2004-05-22	[PATCH] numa api: Add shared memory support	Andrew Morton	1	-0/+4
	From: Andi Kleen <ak@suse.de> Add support to tmpfs and hugetlbfs to support NUMA API. Shared memory is a bit of a special case for NUMA policy. Normally policy is associated to VMAs or to processes, but for a shared memory segment you really want to share the policy. The core NUMA API has code for that, this patch adds the necessary changes to tmpfs and hugetlbfs. First it changes the custom swapping code in tmpfs to follow the policy set via VMAs. It is also useful to have a "backing store" of policy that saves the policy even when nobody has the shared memory segment mapped. This allows command line tools to pre configure policy, which is then later used by programs. Note that hugetlbfs needs more changes - it is also required to switch it to lazy allocation, otherwise the prefault prevents mbind() from working.
2004-05-21	[PATCH] Sanitise handling of unneeded syscall stubs	Andrew Morton	1	-2/+5
	From: David Mosberger <davidm@napali.hpl.hp.com> Below is a patch that tries to sanitize the dropping of unneeded system-call stubs in generic code. In some instances, it would be possible to move the optional system-call stubs into a library routine which would avoid the need for #ifdefs, but in many cases, doing so would require making several functions global (and possibly exporting additional data-structures in header-files). Furthermore, it would inhibit (automatic) inlining in the cases where the stubs are needed. For these reasons, the patch keeps the #ifdef-approach. This has been tested on ia64 and there were no objections from the arch-maintainers (and one positive response). The patch should be safe but arch-maintainers may want to take a second look to see if some __ARCH_WANT_foo macros should be removed for their architecture (I'm quite sure that's the case, but I wanted to play it safe and only preserved the status-quo in that regard).
2004-05-10	[PATCH] simplify mqueue_inode_info->messages allocation	Andrew Morton	1	-25/+14
	From: Chris Wright <chrisw@osdl.org> Currently, if a user creates an mqueue and passes an mq_attr, the info->messages will be created twice (and the extra one is properly freed). This patch simply delays the allocation so that it only ever happens once. The relevant mq_attr data is passed to lower levels via the dentry->d_fsdata fs private data. This also helps isolate the areas we'd need to touch to do rlimits on mqueues.
2004-05-04	[PATCH] fix queues_count accounting in mqueue_delete_inode()	Chris Wright	1	-3/+5
	During mqueue_get_inode(), it's possible that kmalloc() of the info->messages array will fail. This failure mode will cause the queues_count to be (incorrectly) decremented twice. This patch uses info->messages on mqueue_delete_inode() to determine whether the mqueue was ever truly created, and hence proper accounting is needed on destruction.
2004-05-04	[PATCH] fix memleak in sys_mq_timedsend	Chris Wright	1	-2/+2
	Move error handling to capture all three possible error conditions on sending to a full queue. Without this fix any unprivileged user can leak arbitrary amounts of kernel memory.
2004-04-17	[PATCH] mqueue permission fix	Andrew Morton	1	-5/+1
	From: Manfred Spraul <manfred@colorfullife.com> Any user can delete any entries in a mqueue mounted filesystem. The attached patch prevents that. - remove the writable test from mq_unlink. - set the sticky bit in the root inode. This affects both mq_unlink and sys_unlink: only the owner (and root) should be allowed to remove queues.
2004-04-14	[PATCH] mq_open() and close_on_exec	Andrew Morton	1	-0/+1
	From: Chris Wright <chrisw@osdl.org> SUSv3 doesn't seem to specify one way or the other. I don't have the POSIX specs, and the old docs I have suggest that mq_open() creates an object which is to be closed upon exec. Jakub said: I think it is valid and required: http://www.opengroup.org/onlinepubs/007904975/functions/exec.html All open message queue descriptors in the calling process shall be closed, as described in mq_close() I'll add a new test for this into glibc testsuite.
2004-04-14	[PATCH] Fix mq_notify with SIGEV_NONE notification	Andrew Morton	1	-47/+46
	From: Jakub Jelinek <jakub@redhat.com> mq_notify (q, NULL) and struct sigevent ev = { .sigev_notify = SIGEV_NONE }; mq_notify (q, &ev) are not the same thing in POSIX, yet the kernel treats them the same. Only the former makes the notification available to other processes immediately, see http://www.opengroup.org/onlinepubs/007904975/functions/mq_notify.html Without the patch below, http://sources.redhat.com/ml/libc-hacker/2004-04/msg00028.html glibc test fails. I looked at mq in Solaris and they behave the same in this regard as Linux with this patch. Kernel with this patch passes both Intel POSIX testsuite (with testsuite fixes from Ulrich) and glibc mq testsuite.
2004-04-11	[PATCH] make the pagecache lock irq-safe.	Andrew Morton	1	-2/+0
	Intro to these patches: - Major surgery against the pagecache, radix-tree and writeback code. This work is to address the O_DIRECT-vs-buffered data exposure horrors which we've been struggling with for months. As a side-effect, 32 bytes are saved from struct inode and eight bytes are removed from struct page. At a cost of approximately 2.5 bits per page in the radix tree nodes on 4k pagesize, assuming the pagecache is densely populated. Not all pages are pagecache; other pages gain the full 8 byte saving. This change will break any arch code which is using page->list and will also break any arch code which is using page->lru of memory which was obtained from slab. The basic problem which we (mainly Daniel McNeil) have been struggling with is in getting a really reliable fsync() across the page lists while other processes are performing writeback against the same file. It's like juggling four bars of wet soap with your eyes shut while someone is whacking you with a baseball bat. Daniel pretty much has the problem plugged but I suspect that's just because we don't have testcases to trigger the remaining problems. The complexity and additional locking which those patches add is worrisome. So the approach taken here is to remove the page lists altogether and replace the list-based writeback and wait operations with in-order radix-tree walks. The radix-tree code has been enhanced to support "tagging" of pages, for later searches for pages which have a particular tag set. This means that we can ask the radix tree code "find me the next 16 dirty pages starting at pagecache index N" and it will do that in O(log64(N)) time. This affects I/O scheduling potentially quite significantly. It is no longer the case that the kernel will submit pages for I/O in the order in which the application dirtied them. We instead submit them in file-offset order all the time. 
This is likely to be advantageous when applications are seeking all over a large file randomly writing small amounts of data. I haven't performed much benchmarking, but tiobench random write throughput seems to be increased by 30%. Other tests appear to be unaltered. dbench may have got 10-20% quicker, but it's variable. There is one large file which everyone seeks all over randomly writing small amounts of data: the blockdev mapping which caches filesystem metadata. The kernel's IO submission patterns for this are now ideal. Because writeback and wait-for-writeback use a tree walk instead of a list walk they are no longer livelockable. This probably means that we no longer need to hold i_sem across O_SYNC writes and perhaps fsync() and fdatasync(). This may be beneficial for databases: multiple processes writing and syncing different parts of the same file at the same time can now all submit and wait upon writes to just their own little bit of the file, so we can get a lot more data into the queues. It is trivial to implement a part-file-fdatasync() as well, so applications can say "sync the file from byte N to byte M", and multiple applications can do this concurrently. This is easy for ext2 filesystems, but probably needs lots of work for data-journalled filesystems and XFS and it probably doesn't offer much benefit over an i_semless O_SYNC write. These patches can end up making ext3 (even) slower: for i in 1 2 3 4 do dd if=/dev/zero of=$i bs=1M count=2000 & done runs awfully slow on SMP. This is, yet again, because all the file blocks are jumbled up and the per-file linear writeout causes tons of seeking. The above test runs sweetly on UP because the on UP we don't allocate blocks to different files in parallel. Mingming and Badari are working on getting block reservation working for ext3 (preallocation on steroids). That should fix ext3 up. This patch: - Later, we'll need to access the radix trees from inside disk I/O completion handlers. 
So make mapping->page_lock irq-safe. And rename it to tree_lock to reliably break any missed conversions.
2004-04-11	[PATCH] compat emulation for posix message queues	Andrew Morton	2	-1/+198
	From: Arnd Bergmann <arnd@arndb.de> I have tested the code with the open posix test suite and found the same four failures for both 64-bit and compat mode, most tests pass. The patch is against -mc1, but I guess it also applies to the other trees around. What worries me more than mq_attr compatibility is the conversion of struct sigevent, which might turn out really hard when more fields in there are used. AFAICS, the only other part in the kernel ABI is sys_timer_create(), so maybe it's not too late to deprecate the current structure and create a structure that can be used properly for compat syscalls.
2004-04-11	[PATCH] posix message queues: send notifications via netlink	Andrew Morton	1	-147/+107
	From: Manfred Spraul <manfred@colorfullife.com> SIGEV_THREAD means that a given callback should be called in the context of a new thread. This must be done by the C library. The kernel must deliver a notice of the event to the C library when the callback should be called. This patch switches to a new, simpler interface: User space creates a socket with socket(PF_NETLINK, SOCK_RAW,0) and passes the fd to the mq_notify call together with a cookie. When the mq_notify() condition is satisfied, the kernel "writes" the cookie to the socket. User space then reads the cookie and calls the appropriate callback.
2004-04-11	[PATCH] security bugfix for mqueue	Andrew Morton	1	-6/+7
	From: Manfred Spraul <manfred@colorfullife.com> I found a security bug in the new mqueue code: a process that has only write permissions to a message queue could call mq_notify(SIGEV_THREAD) and use the returned notification file descriptor to read from the message queue.
2004-04-11	[PATCH] posix message queue update	Andrew Morton	1	-2/+4
	From: Manfred Spraul <manfred@colorfullife.com> My discussion with Ulrich had one result: - mq_setattr can accept implementation defined flags. Right now we have none, but we might add some later (e.g. switch to CLOCK_MONOTONIC for mq_timed{send,receive} or something similar). When we add flags, we might need the fields for additional information. And they don't hurt. Therefore add four __reserved fields to mq_attr. - fail mq_setattr if we get unknown flags - otherwise glibc can't detect if it's running on a future kernel that supports new features. - use memset to initialize the mq_attr structure - theoretically we could leak kernel memory. - Only set O_NONBLOCK in mq_attr, explicitly clear O_RDWR & friends. openposix uses getattr, attr |= O_NONBLOCK, setattr - a sane approach. Without clearing O_RDWR, this fails. I've retested all openposix conformance tests with the new patch - the two new FAILED tests check undefined behavior. Note that I won't have net access until Sunday - if the message queue patch breaks something important either ask Krzysztof or drop it. Ulrich had another good idea for SIGEV_THREAD, but I must think about it. It would mean less complexity in glibc, but more code in the kernel. I'm not yet convinced that it's overall better.
2004-04-11	[PATCH] posix message queues: made user mountable	Andrew Morton	1	-12/+83
	From: Manfred Spraul <manfred@colorfullife.com> Make the posix message queue mountable by the user. This replaces ipcs and ipcrm for posix message queue: The admin can check which queues exist with ls and remove stale queues with rm. I'd like a final confirmation from Ulrich that our SIGEV_THREAD approach is the right thing(tm): He's aware of the design and didn't object, but I think he hasn't seen the final API yet.
2004-04-11	[PATCH] posix message queues: linux-specific poll extension	Andrew Morton	1	-3/+24
	From: Manfred Spraul <manfred@colorfullife.com> Linux specific extension: make the message queue identifiers pollable. It's simple and could be useful.
2004-04-11	[PATCH] posix message queues: implementation	Andrew Morton	2	-0/+1167
	From: Manfred Spraul <manfred@colorfullife.com> Actual implementation of the posix message queues, written by Krzysztof Benedyczak and Michal Wronski. The complete implementation is dependent on CONFIG_POSIX_MQUEUE. It passed the openposix test suite with two exceptions: one mq_unlink test was bad and tested undefined behavior. And Linux succeeds mq_close(open(,,,)). The spec mandates EBADF, but we have decided to ignore that: we would have to add a new syscall just for the right error code. The patch intentionally doesn't use all helpers from fs/libfs for kernel-only filesystems: step 5 allows user space mounts of the file system. Signal changes: The patch redefines SI_MESGQ using __SI_CODE: The generic Linux ABI uses a negative value (i.e. from user) for SI_MESGQ, but the kernel internal value must be positive to pass check_kill_value. Additionally, the patch adds support into copy_siginfo_to_user to copy the "new" signal type to user space. Changes in signal code caused by POSIX message queues patch: General & rationale: mqueues generated signals (only upon notification) must have si_code == SI_MESGQ. In fact such a signal is sent from the process which caused the notification (== sent a message to an empty message queue) to another which requested it. Both processes can of course be unrelated in terms of uids/euids. So SI_MESGQ signals must be classified as SI_FROMKERNEL to pass check_kill_permissions (needless to say, these signals ARE from the kernel). Signals generated by message queue notification need the same fields in siginfo struct's union _sifields as POSIX.1b signals and we can reuse its union entry. SI_MESGQ was previously defined to -3 in kernel and also in glibc. So in userspace SI_MESGQ must be still visible as -3. Solution: SI_MESGQ is defined in the same style as SI_TIMER using __SI_CODE macro. Details: Fortunately copy_siginfo_to_user copies si_code as short. So we can use the remaining part of the int value freely. __SI_CODE does the work. 
SI_MESGQ is in kernel: 6<<16 | (-3 & 0xffff) which is > 0 but to userspace is copied (short) SI_MESGQ == -3 Actual changes: Changes in include/asm-generic/siginfo.h __SI_MESGQ added in signal.h to represent inside-kernel prefix of SI_MESGQ. SI_MESGQ is redefined from -3 to __SI_CODE(__SI_MESGQ, -3) Except for the mips architecture, those changes should be arch independent (asm-generic/siginfo.h is included in arch versions). On mips SI_MESGQ is redefined to -4 in order to be compatible with IRIX. But the same schema can be used. Change in copy_siginfo_to_user: We only add one line to get the same copy semantics as for _SI_RT. This change isn't very portable - some archs have their own copy_siginfo_to_user. All those should have a similar change (but possibly not one-line, as the _SI_RT case was sometimes ignored because it wasn't used yet, e.g. see ia64 signal.c). Update: mq: only fail with invalid timespec if mq_timed{send,receive} needs to block From: Jakub Jelinek <jakub@redhat.com> POSIX requires EINVAL to be set if: "The process or thread would have blocked, and the abs_timeout parameter specified a nanoseconds field value less than zero or greater than or equal to 1000 million." but 2.6.5-mm3 returns -EINVAL even if the process or thread would not block (if the queue is not empty for timedreceive or not full for timedsend).
2004-04-11	[PATCH] posix message queues: code move	Andrew Morton	5	-127/+138
	From: Manfred Spraul <manfred@colorfullife.com> cleanup of sysv ipc as a preparation for posix message queues: - replace !CONFIG_SYSVIPC wrappers for copy_semundo and exit_sem with static inline wrappers. Now the whole ipc/util.c file is only used if CONFIG_SYSVIPC is set, use makefile magic instead of #ifdef. - remove the prototypes for copy_semundo and exit_sem from kernel/fork.c - they belong in a header file. - create a new msgutil.c with the helper functions for message queues. - cleanup the helper functions: run Lindent, add __user tags.
2004-03-26	[PATCH] ipc locking fix	Andrew Morton	1	-3/+4
	From: badari <pbadari@us.ibm.com> I ran into an ipc hang while trying to shut down a database. The problem is due to a missing sem_unlock() in find_undo().
2004-03-18	[PATCH] add file_accessed() helper	Alexander Viro	1	-1/+1
	New inlined helper - file_accessed(file) (wrapper for update_atime())
2004-03-16	[PATCH] SHMLBA compat task alignment fix	Andrew Morton	1	-1/+4
	From: Arun Sharma <arun.sharma@intel.com> The current Linux implementation of shmat() insists on SHMLBA alignment even when shmflg & SHM_RND == 0. This is not consistent with the man pages and the single UNIX spec, which require only a page-aligned address. However, some architectures require a SHMLBA alignment for correctness in all cases. Such architectures use __ARCH_FORCE_SHMLBA.
2004-03-15	[PATCH] document unchecked do_munmaps in ipc/shm.c	Andrew Morton	1	-0/+15
	From: Manfred Spraul <manfred@colorfullife.com> There are a few unchecked do_munmap()s in the shm code. Manfred's comment explains why they are OK.
2004-03-15	[PATCH] generic 32 bit emulation for System-V IPC	Andrew Morton	2	-0/+735
	From: Arnd Bergmann <arnd@arndb.de> Adds a generic implementation of 32 bit emulation for IPC system calls. The code is based on the existing implementations for sparc64, ia64, mips, s390, ppc and x86_64, which can subsequently be converted to use this.
2004-03-11	[PATCH] Remove unneeded unlock in ipc/sem.c	Andrew Morton	1	-1/+0
	From: Manfred Spraul <manfred@colorfullife.com> sem_revalidate checks that a semaphore array didn't disappear while the code was running without the semaphore array spinlock. If the array disappeared, then it will return without holding a lock. find_undo calls sem_revalidate and then sem_unlock, even if sem_revalidate failed. The sem_unlock call must be removed. Mingming Cao reported a spinlock deadlock with sysv semaphores. A superfluous unlock doesn't explain the deadlock, but it's obviously a bug.
2004-02-24	[PATCH] add syscalls.h	Andrew Morton	1	-5/+5
	From: "Randy.Dunlap" <rddunlap@osdl.org> Add syscalls.h, which contains prototypes for the kernel's system calls. Replace open-coded declarations all over the place. This patch found a couple of prior bugs. It appears to be more important with -mregparm=3 as we discover more asmlinkage mismatches. Some syscalls have arch-dependent arguments, so their prototypes are in the arch-specific unistd.h. Maybe it should have been asm/syscalls.h, but there were already arch-specific syscall prototypes in asm/unistd.h... Tested on x86, ia64, x86_64, ppc64, s390 and sparc64. May cause trivial-to-fix build breakage on other architectures.
2004-02-22	[PATCH] rename shmat to make it clear it isn't a system call entrypoint	Manfred Spraul	1	-1/+1
	This renames sys_shmat to do_shmat. Additionally, I've replaced the cond_syscall with a conditional inline function. It touches all archs - only i386 is tested.
2004-02-22	[PATCH] cleanup condsyscall for sysv ipc	Andrew Morton	1	-63/+0
	From: Manfred Spraul <manfred@colorfullife.com> Attached is a patch that replaces the #ifndef CONFIG_SYSV syscall stubs with cond_syscall stubs.
2004-02-22	[PATCH] fix shmat	Andrew Morton	1	-1/+1
	From: Nick Piggin <piggin@cyberone.com.au> sys_shmat() needs to be declared asmlinkage. This causes breakage when we actually get the proper prototypes into caller's scope.
2004-02-06	[PATCH] PA-RISC needs IPC64 structs	Matthew Wilcox	2	-2/+2
	PA-RISC also uses the 64-bit version of the IPC structs.
2004-01-25	[PATCH] Fix error checking in IPC_SET	Andi Kleen	1	-0/+1
	The LSM changes broke the error checking for queue lengths in IPC_SET. The LSM check would set err to 0, but the next check expected it to still be -EPERM. Result was that no error was reported, but the new parameters weren't correctly set.
2003-12-29	[PATCH] lockless semop	Andrew Morton	1	-6/+69
	From: Manfred Spraul <manfred@colorfullife.com> Attached is the lockless semop patch. I did another test run with idle=poll on a Pentium III, and it remained unchanged: 99.9% direct fast path, 0.1% race with wakeup against writing the final result code: http://khack.osdl.org/stp/282936/environment/proc/slabinfo That means there is no immediate need to add the two-stage implementation to finish_wait. It reduces the spinlock operations on the semaphore array spinlock by 1/3.
2003-10-21	[PATCH] ipc msg race fix	Andrew Morton	1	-1/+10
	Backport this fix from 2.4
2003-10-06	[PATCH] UID16 fixes	Andi Kleen	1	-4/+4
	This fixes CONFIG_UID16 problems on x86-64 as discussed earlier. CONFIG_UID16 now only selects the inclusion of kernel/uid16.c, all conversions are triggered dynamically based on type sizes. This allows x86-64 to both include uid16.c for emulation purposes, but not truncate uids to 16bit in sys_newstat. - Replace the old macros from linux/highuid.h with new SET_UID/SET_GID macros that do type checking. Based on Linus' proposal. - Fix everybody to use them. - Clean up some cruft in the x86-64 32bit emulation allowed by this (other 32bit emulations could be cleaned too, but I'm too lazy for that right now) - Add one missing EOVERFLOW check in x86-64 32bit sys_newstat while I was at it.
2003-09-21	[PATCH] Fix sem_lock deadlock	Andrew Morton	1	-1/+3
	From: Anton Blanchard <anton@samba.org> I saw a lockup where 2 cpus were stuck in sem_lock(). It seems like we can loop back to retry_undos with the lock held. That path takes the lock so we will deadlock.
2003-08-31	[PATCH] ipc_init() uses vmalloc too early	Andrew Morton	1	-2/+3
	From: Andrea Arcangeli <andrea@suse.de> aka: "vmalloc allocations in ipc needs smp initialized (and vm must be allowed to schedule in 2.6)" In short if you change SEMMNI to 8192 the kernel will crash at boot, because it tries to call vmalloc before the smp is initialized. The reason is that vmalloc calls into the pte alloc code, and the fast pte alloc is tried first, but that reads into the pte_quicklist, that requires the cpu_data to be initialized (and that happens in smp_init()). The patch is obviously safe, since no piece of kernel (especially the code in the check_bugs and smp_init paths ;) calls into the ipc subsystem. The reason this started to trigger wasn't really that we increased SEMMNI, but what happened is that some IPC data structure grew, and for some reason the corruption due to the uninitialized pte_quicklist triggers only for smp boxes with less than 1G (not very common anymore ;). So it wasn't immediately reproducible on all setups. 2.6 doesn't suffer from the same problem, simply because 2.6 isn't using the quicklist anymore, but I think it would be much more correct to make the same change in 2.6 too, since whatever cond_resched() in the vm paths (and they're definitely allowed to call it), will lead to a crash since the init task isn't initialized and the scheduler can't be invoked yet. (and 2.6 already has the bigger data structures that should trigger the vmalloc all the time on all setups)
2003-08-30	[PATCH] More ->pid to ->tgid changes	Ulrich Drepper	3	-9/+9
	One more overlooked area where the proper process ID has to be used: SysV IPC "pid" values should use the thread group ID, not the per-thread one.
2003-08-13	[PATCH] sparse annotations for ipc/sem	Dave Jones	1	-5/+5

2003-07-10	[PATCH] i_size atomic access	Andrew Morton	1	-1/+1
	From: Daniel McNeil <daniel@osdl.org> This adds i_seqcount to the inode structure and then uses i_size_read() and i_size_write() to provide atomic access to i_size. This is a port of Andrea Arcangeli's i_size atomic access patch from 2.4. This only uses the generic reader/writer consistent mechanism. Before: mnm:/usr/src/25> size vmlinux text data bss dec hex filename 2229582 1027683 162436 3419701 342e35 vmlinux After: mnm:/usr/src/25> size vmlinux text data bss dec hex filename 2225642 1027655 162436 3415733 341eb5 vmlinux 3.9k more text, a lot of it fastpath :( It's a very minor bug, and the fix has a fairly non-minor cost. The most compelling reason for fixing this is that writepage() checks i_size. If it sees a transient value it may decide that page is outside i_size and will refuse to write it. Lost user data.
2003-07-04	[PATCH] ipc semaphore optimization	Andrew Morton	1	-57/+38
	From: "Chen, Kenneth W" <kenneth.w.chen@intel.com> This patch proposes a performance fix for the current IPC semaphore implementation. There are two shortcomings in the current implementation: try_atomic_semop() was called two times to wake up a blocked process, once from update_queue() (executed from the process that wakes up the sleeping process) and once in the retry part of the blocked process (executed from the blocked process that gets woken up). A second issue is that when several sleeping processes are eligible for wakeup, they wake up in daisy-chain formation, each one in turn waking up the next process in line. However, every time a process wakes up, it starts scanning the wait queue from the beginning, not from where it was last scanned. This causes a large amount of unnecessary scanning of the wait queue when the wait queue is deep. Blocked processes come and go, but chances are there are still quite a few blocked processes sitting at the beginning of that queue. What we are proposing here is to merge the portion of the code in the bottom part of sys_semtimedop() (code that gets executed when a sleeping process gets woken up) into the update_queue() function. The benefit is twofold: (1) to reduce redundant calls to try_atomic_semop() and (2) to increase the efficiency of finding eligible processes to wake up and allow higher concurrency for multiple wake-ups. We have measured that this patch improves throughput for a large application significantly on an industry standard benchmark. This patch is relative to 2.5.72. Any feedback is very much appreciated. 
Some kernel profile data attached: Kernel profile before optimization: ----------------------------------------------- 0.05 0.14 40805/529060 sys_semop [133] 0.55 1.73 488255/529060 ia64_ret_from_syscall [2] [52] 2.5 0.59 1.88 529060 sys_semtimedop [52] 0.05 0.83 477766/817966 schedule_timeout [62] 0.34 0.46 529064/989340 update_queue [61] 0.14 0.00 1006740/6473086 try_atomic_semop [75] 0.06 0.00 529060/989336 ipcperms [149] ----------------------------------------------- 0.30 0.40 460276/989340 semctl_main [68] 0.34 0.46 529064/989340 sys_semtimedop [52] [61] 1.5 0.64 0.87 989340 update_queue [61] 0.75 0.00 5466346/6473086 try_atomic_semop [75] 0.01 0.11 477676/576698 wake_up_process [146] ----------------------------------------------- 0.14 0.00 1006740/6473086 sys_semtimedop [52] 0.75 0.00 5466346/6473086 update_queue [61] [75] 0.9 0.89 0.00 6473086 try_atomic_semop [75] ----------------------------------------------- Kernel profile with optimization: ----------------------------------------------- 0.03 0.05 26139/503178 sys_semop [155] 0.46 0.92 477039/503178 ia64_ret_from_syscall [2] [61] 1.2 0.48 0.97 503178 sys_semtimedop [61] 0.04 0.79 470724/784394 schedule_timeout [62] 0.05 0.00 503178/3301773 try_atomic_semop [109] 0.05 0.00 503178/930934 ipcperms [149] 0.00 0.03 32454/460210 update_queue [99] ----------------------------------------------- 0.00 0.03 32454/460210 sys_semtimedop [61] 0.06 0.36 427756/460210 semctl_main [75] [99] 0.4 0.06 0.39 460210 update_queue [99] 0.30 0.00 2798595/3301773 try_atomic_semop [109] 0.00 0.09 470630/614097 wake_up_process [146] ----------------------------------------------- 0.05 0.00 503178/3301773 sys_semtimedop [61] 0.30 0.00 2798595/3301773 update_queue [99] [109] 0.3 0.35 0.00 3301773 try_atomic_semop [109] ----------------------------------------------- The number of function calls to both try_atomic_semop() and update_queue() is reduced by 50% as a result of the merge. 
Execution time of sys_semtimedop is reduced because of the reduction in the low level functions.
2003-07-01	[PATCH] Fix IPC ABI for AMD64	Andi Kleen	2	-2/+2
	AMD64 like IA64 needs to force IPC_64 in the IPC functions. This makes 2.5 compatible with 2.4 again.
2003-06-20	[PATCH] sysv semundo fixes	Andrew Morton	2	-183/+123
	From: Manfred Spraul <manfred@colorfullife.com> The CLONE_SYSVSEM implementation is racy: the exit handling tests (atomic_read(->refcnt) == 1) instead of calling atomic_dec_and_test(). The patch fixes that. Additionally, the patch contains the following changes:
	- lock_undo() locks the list of undo structures. The lock is held throughout the semop() syscall, but that's unnecessary - we can drop it immediately after the lookup.
	- undo structures are only allocated when necessary. The need for undo structures is only noticed in the middle of the semop operation, while holding the semaphore array spinlock. The result is a convoluted unlock&revalidate implementation. I've reordered the code, and now the undo allocation can happen before acquiring the semaphore array spinlock. As a bonus, less code runs under the semaphore array spinlock.
	- sysvsem.sleep_list looks like code to handle oopses: if an oops kills a thread that sleeps in sys_semtimedop(), then sem_exit tries to recover. I've removed that - too fragile.
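The refcount race described above can be illustrated in userspace with C11 atomics. This is only a sketch under stated assumptions: struct undo_list, put_racy() and put_dec_and_test() are hypothetical names, and atomic_fetch_sub() stands in for the kernel's atomic_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared undo list; not the kernel's struct. */
struct undo_list {
	atomic_int refcnt;
};

/*
 * Racy pattern: the read and the decrement are separate steps. Two threads
 * dropping the last two references can both observe a count of 2, both
 * decrement, and neither frees the list.
 */
static bool put_racy(struct undo_list *ul)
{
	if (atomic_load(&ul->refcnt) == 1)
		return true;            /* caller would free here */
	atomic_fetch_sub(&ul->refcnt, 1);
	return false;
}

/*
 * Fixed pattern, in the spirit of atomic_dec_and_test(): one atomic
 * read-modify-write. atomic_fetch_sub() returns the previous value, so
 * exactly one caller sees 1 and becomes responsible for freeing.
 */
static bool put_dec_and_test(struct undo_list *ul)
{
	return atomic_fetch_sub(&ul->refcnt, 1) == 1;
}
```

Run from a single thread the two helpers behave the same; the difference only shows up when two threads drop their references concurrently.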
2003-06-02	[PATCH] remove 16-bit pid assumption from ipc/sem.c	Andrew Morton	1	-26/+22
	From: Manfred Spraul <manfred@colorfullife.com> SysV sem operations that involve multiple semaphores can fail in the middle, and then sempid (pid of the last successful operation) must be restored. This happens with "sempid >>= 16" - broken due to the 32-bit pid values. The attached patch fixes that by reordering the updates of the semaphore fields. Additionally, the patch fixes the corruption of the sempid value that occurs if a wait-for-zero operation fails. The patch is more than two years old, and was in -dj and -ak kernels.
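Why "sempid >>= 16" breaks with 32-bit pids can be shown with a small model. The sketch assumes the old pid was kept in the upper 16 bits of the same word; pack_sempid() and restore_sempid() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the old scheme: the previous pid is kept in the
 * upper 16 bits so that a failed operation can restore it with ">>= 16".
 */
static uint32_t pack_sempid(uint32_t old_pid, uint32_t new_pid)
{
	return (old_pid << 16) | new_pid;
}

/* Undo step: recover the previous pid from the upper half. */
static uint32_t restore_sempid(uint32_t packed)
{
	return packed >> 16;
}
```

With an old pid of 1234 the round trip is exact. With an old pid of 100000, which does not fit in 16 bits, the restored value is 34464 (100000 modulo 65536), so the semaphore would report the wrong process.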
2003-05-12	[PATCH] semop race fix	Andrew Morton	2	-14/+22
	From: Mingming Cao <cmm@us.ibm.com> Basically, freeary() is called with the spinlock for that semaphore set held. But after the semaphore set is removed from the ID array by calling sem_rmid(), there is no lock to protect the waiting queue for that semaphore set. So, if a waiter is woken up by a signal (not by the wakeup from freeary()), it will check the q->status and q->prev fields. At that moment, freeary() may not have had a chance to update those fields yet.
	static void freeary (int id)
	{
		.......
		sma = sem_rmid(id);
		......
		/* Wake up all pending processes and let them fail with EIDRM. */
		for (q = sma->sem_pending; q; q = q->next) {
			q->status = -EIDRM;
			q->prev = NULL;
			wake_up_process(q->sleeper); /* doesn't sleep */
		}
		sem_unlock(sma);
		......
	}
	So I propose moving sem_rmid() after the loop that wakes up every waiter. That guarantees that when the waiters are woken up, the updates to q->status and q->prev have already been done. A similar thing applies in the message queue case. The patch is attached below. Comments are very welcome. I have tested this patch on a 2.5.68 kernel with LTP tests; it seems fine to me. Paul, could you test this on the DOTS test again? Thanks!
2003-05-09	Annotate IPC system calls with user pointer annotations	Linus Torvalds	1	-7/+11

2003-05-07	[PATCH] s/UPDATE_ATIME/update_atime/ cleanup	Andrew Morton	1	-1/+1
	From: Stewart Smith <stewartsmith@mac.com> Remove the UPDATE_ATIME() macro, use update_atime() directly.
2003-04-20	[PATCH] shm_get_stat-handle-hugetlb-pages.patch	Andrew Morton	1	-11/+19
	From: William Lee Irwin III <wli@holomorphy.com> shm_get_stat() didn't know about hugetlbpage-backed shm.
2003-04-20	[PATCH] shmdt() speedup	Andrew Morton	1	-7/+52
	From: William Lee Irwin III <wli@holomorphy.com> Micro-optimize sys_shmdt(). There are methods of exploiting knowledge of the vma's being searched to restrict the search space. These are:
	(1) shm mappings always start their lives at file offset 0, so only vma's above shmaddr need be considered. find_vma() can be used to seek to the proper position in mm->mmap in O(lg(n)) time.
	(2) The search is for a vma which could be a fragment of a broken-up shm mapping, which would have been created starting at shmaddr with vm_pgoff 0 and then continued no further into userspace than shmaddr + size. So after having found an initial vma, find the size of the shm segment it maps to calculate an upper bound on the virtual address space that needs to be searched.
	(3) mremap() would have caused the original checks to miss vma's mapping the shm segment if shmaddr were the original address at which the shm segments were attached. This does no better and no worse than the original code in that situation.
	(4) If the chain of references in vma->vm_file->f_dentry->d_inode->i_size is not guaranteed by refcounting and/or the shm code then this is oopsable; AFAICT an inode is always allocated.
2003-01-09	[PATCH] 2.5.52-lsm-{dummy,ipc}.patch	Stephen D. Smalley	3	-8/+106
	This patch adds the remaining System V IPC hooks, including the inline documentation for them in security.h. This includes a restored sem_semop hook, as it does seem to be necessary to support fine-grained access. All of these System V IPC hooks are used by SELinux. The SELinux System V IPC access controls were originally described in the technical report available from http://www.nsa.gov/selinux/slinux-abs.html, and the LSM-based implementation is described in the technical report available from http://www.nsa.gov/selinux/module-abs.html.
2002-12-31	Fix up numerous '`xxxxx' is not at beginning of declaration' style warnings.	Dave Jones	1	-2/+2

2002-12-21	[PATCH] hugetlb: report shared memory attachment counts	Andrew Morton	1	-2/+5
	From Rohit Seth. Attached is a patch that passes the correct information back to userland for the number of attachments to a shared memory segment. I could have made a few more changes to the way nattach is set for regular cases, but just want to limit it at this point.
2002-12-14	[PATCH] Remove Rules.make from Makefiles (3/3)	Brian Gerst	1	-2/+0
	Makefiles no longer need to include Rules.make, which is currently an empty file. This patch removes it from the remaining Makefiles, and removes the empty Rules.make file.
2002-12-14	[PATCH] semtimedop - semop() with a timeout	Andrew Morton	2	-2/+34
	Patch from Mark Fasheh <mark.fasheh@oracle.com> (plus a few cleanups and a speedup from yours truly) Adds the semtimedop() function - semop with a timeout. Solaris has this. It's apparently worth a couple of percent to Oracle throughput and given the simplicity, that is sufficient benefit for inclusion IMO. This patch hooks up semtimedop() only for ia64 and ia32.
2002-12-07	Merge conectiva.com.br:/home/BK/includes-2.5.old	Arnaldo Carvalho de Melo	1	-0/+2
	into conectiva.com.br:/home/BK/includes-2.5
2002-12-02	[PATCH] memory barrier work in ipc/util.c	Andrew Morton	1	-16/+40
	Patch from Mingming Cao <cmm@us.ibm.com>
	- ipc_lock() needs a read_barrier_depends() to prevent indexing an uninitialized new array on the read side. This corresponds to the write memory barrier added in grow_ary() by Dipankar's patch to prevent indexing an uninitialized array.
	- Replaced "wmb()" in IPC code with "smp_wmb()". "wmb()" produces a full write memory barrier in both UP and SMP kernels, while "smp_wmb()" provides a full write memory barrier in an SMP kernel, but only a compiler directive in a UP kernel. The same change is made for "rmb()".
	- Removed rmb() in ipc_get(). We do not need a read memory barrier there since ipc_get() is protected by the ipc_ids.sem semaphore.
	- Added more comments about why write barriers and read barriers are needed (or not needed) here or there.
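The pairing of a write barrier in grow_ary() with a read-side barrier in ipc_lock() is the usual publish-then-dereference pattern. A minimal userspace sketch with C11 atomics follows; the names entries, grow() and lookup() are hypothetical, and release/acquire ordering is used as a portable (and slightly stronger) substitute for smp_wmb() and read_barrier_depends().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ids->entries array; not the kernel's type. */
static _Atomic(int *) entries;

/* Writer: copy into a bigger array, then publish it (cf. grow_ary()). */
static int grow(size_t old_n, size_t new_n)
{
	int *old = atomic_load_explicit(&entries, memory_order_relaxed);
	int *new_ary = calloc(new_n, sizeof(*new_ary));

	if (!new_ary)
		return -1;
	if (old)
		memcpy(new_ary, old, old_n * sizeof(*old));
	/* Release store: the copied contents become visible before the
	 * pointer does, the job smp_wmb() does ahead of a plain store. */
	atomic_store_explicit(&entries, new_ary, memory_order_release);
	/* The old array is deliberately leaked here; the kernel defers the
	 * free until readers are done (RCU), which this sketch omits. */
	return 0;
}

/* Reader: load the pointer, then index through it (cf. ipc_lock()). */
static int lookup(size_t idx)
{
	/* Acquire is stronger than the dependency ordering that
	 * read_barrier_depends() provides, but it is the portable choice. */
	int *ary = atomic_load_explicit(&entries, memory_order_acquire);

	return ary ? ary[idx] : -1;
}
```

Without the ordering on the writer side, a reader on another CPU could see the new pointer before the copied entries and index into uninitialized memory.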
2002-11-26	LSM: change if statements into something more readable for the ipc/, mm/, ↵	Greg Kroah-Hartman	3	-3/+6
	and net/* files.
2002-11-23	Merge	Greg Kroah-Hartman	4	-13/+10

2002-11-23	o uaccess.h: remove include sched.h, it only needs thread_info.h	Arnaldo Carvalho de Melo	1	-0/+2

2002-11-21	[PATCH] Add SMP barrier to ipc's grow_ary()	Andrew Morton	1	-1/+7
	From Dipankar Sarma. Before setting ids->entries to the new array, there must be a wmb() to make sure that the memcpyed contents of the new array are visible before the new array becomes visible.
2002-11-21	[PATCH] shmdt bugfix	Andrew Morton	1	-15/+9
	Patch from Hugh Dickins <hugh@veritas.com> Fixes the Oracle startup problem reported by Alessandro Suardi. Reverts a "simplification" to shmdt() which was wrong if subsequent mprotects broke up the original VMA, or if parts of it were munmapped.
2002-11-17	[PATCH] nanosecond stat timefields	Andi Kleen	3	-15/+15
	stat64 has been changed to return jiffies granularity as nsec in previously unused fields. This allows make to make better decisions on when to recompile a file. Follows loosely the Solaris API. CURRENT_TIME has been redefined to return struct timespec. The users who don't use it in an inode/attr context have been changed to use a new get_seconds() function. CURRENT_TIME is implemented by an out-of-line function. There is a small performance penalty in this patch. The previous filemap code had an optimization to flush atime only once a second. This is currently gone, which will increase flushes a bit. I believe the correct solution, if it should be a problem, is to have per-superblock fields that give an arbitrary atime flush granularity, so that you can set it to be flushed only once an hour if you prefer that. I will work on that later in separate patches if the need should arise. struct inode and the attr struct have been changed to store struct timespec instead of time_t for [cma]time. Not all file systems support this granularity, but some, like XFS, NFSv3, CIFS, and JFS, do. The others will currently truncate the nsec part on flushing to disk. There was some discussion on this rounding on l-k previously. I went for simple truncation because there is not much evidence IMHO that the more complicated roundings have any advantages. In practice applications will be rather unlikely to notice the rounding anyway: they can only see a difference when an inode is flushed from memory and reloaded in less than a second, which is rather unlikely.
2002-10-31	[PATCH] uninlining in ipc/*	Andrew Morton	2	-72/+77
	Uninlines some large functions in the ipc code.
	Before:
	   text    data     bss     dec     hex filename
	  30226     224     192   30642    77b2 ipc/built-in.o
	After:
	   text    data     bss     dec     hex filename
	  20274     224     192   20690    50d2 ipc/built-in.o
2002-10-31	[PATCH] use RCU for IPC locking	Andrew Morton	5	-111/+220
	Patch from Mingming, Rusty, Hugh, Dipankar, me:
	- It greatly reduces lock contention by having one lock per id. The global spinlock is removed and a spinlock is added to the kern_ipc_perm structure.
	- Uses Read-Copy-Update in grow_ary() for lock-free resizing.
	- In the places where ipc_rmid() is called, delay calling ipc_free() to RCU callbacks. This is to prevent ipc_lock() returning an invalid pointer after ipc_rmid(). In addition, use the workqueue to enable RCU freeing of vmalloced entries.
	Also some other changes:
	- Remove redundant ipc_lockall/ipc_unlockall
	- ipc_unlock() now takes the IPC ID pointer directly as its argument, avoiding an extra lookup in the array.
	The changes are based on input from Hugh Dickins, Manfred Spraul and Dipankar Sarma. In addition, Cliff White has run OSDL's dbt1 test on a 2-way against the earlier version of this patch. Results show about 2-6% improvement in the average number of transactions per second. Here is the summary of his tests:
	                         2.5.42-mm2    2.5.42-mm2-ipclock
	-----------------------------
	Average over 5 runs      85.0 BT       89.8 BT
	Std Deviation 5 runs     7.4 BT        1.0 BT
	Average over 4 best      88.15 BT      90.2 BT
	Std Deviation 4 best     2.8 BT        0.5 BT
	Also, another test today from Bill Hartner: I tested Mingming's RCU ipc lock patch using a new microbenchmark - semopbench. semopbench was written to test the performance of Mingming's patch. I also ran a 3-hour stress test and it completed successfully. Explanation of the microbenchmark is below the results. Here is a link to the microbenchmark source. http://www-124.ibm.com/developerworks/opensource/linuxperf/semopbench/semopbench.c
	SUT : 8-way 700 MHz PIII
	I tested 2.5.44-mm2 and 2.5.44-mm2 + RCU ipc patch
	>semopbench -g 64 -s 16 -n 16384 -r > sem.results.out
	>readprofile -m /boot/System.map | sort -n +0 -r > sem.profile.out
	The metric is seconds per repetition. Lower is better.
	kernel               run 1     run 2
	                     seconds   seconds
	==================   =======   =======
	2.5.44-mm2           515.1     515.4
	2.5.44-mm2+rcu-ipc   46.7      46.7
	With Mingming's patch, the test completes 10X faster.
2002-10-30	[PATCH] hugetlbfs backing for SYSV shared memory	Andrew Morton	1	-45/+82
	From Bill Irwin. Optionally back privileged processes' shm with hugetlbfs. One of the more common requests for and/or users of hugetlb interfaces in general are databases using shm. This patch exports functionality mostly equivalent to tmpfs, adds the calling sequence to ipc/shm.c, and hashes out a small support function in fs/hugetlbfs/inode.c so that shm segments may be hugetlbpage-backed if userspace passes a flag to shmget(). Access to this resource requires CAP_IPC_LOCK.
2002-10-17	LSM: convert over the remaining security calls to the new format.	Greg Kroah-Hartman	4	-13/+10

2002-10-08	[PATCH] Base set of LSM hooks for SysV IPC	Stephen D. Smalley	4	-1/+34
	The patch below adds the base set of LSM hooks for System V IPC to the 2.5.41 kernel. These hooks permit a security module to label semaphore sets, message queues, and shared memory segments and to perform security checks on these objects that parallel the existing IPC access checks. Additional LSM hooks for labeling and controlling individual messages sent on a single message queue and for providing fine-grained distinctions among IPC operations will be submitted separately after this base set of LSM IPC hooks has been accepted.
2002-09-18	kbuild: Remove O_TARGET from {kernel,mm,fs,...}/Makefile	Kai Germaschewski	1	-7/+0
	It's gone almost everywhere else already, and will eventually make for a nicer top-level Makefile.
2002-09-09	[PATCH] Designated initializers for shm	Rusty Russell	1	-4/+4
	The old form of designated initializers are obsolete: we need to replace them with the ISO C forms before 2.6. Gcc has always supported both forms anyway.
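For readers unfamiliar with the two syntaxes, here is a small illustration of the ISO C form the patch converts to. struct demo_ops and its members are hypothetical and exist only for this sketch; the obsolete GNU "field: value" form is shown in a comment so the block compiles cleanly as standard C.

```c
#include <assert.h>

/* Hypothetical ops table for illustration; not a kernel structure. */
struct demo_ops {
	int (*open)(void);
	int (*close)(void);
	int flags;
};

static int demo_open(void)  { return 1; }
static int demo_close(void) { return 2; }

/*
 * Obsolete GNU form (shown only in this comment):
 *     open:  demo_open,
 *     close: demo_close,
 *
 * ISO C99 form below; members not named (flags) are zero-initialized.
 */
static const struct demo_ops ops = {
	.open  = demo_open,
	.close = demo_close,
};
```

Both forms initialize members by name, so reordering the struct definition does not silently shift the initializers.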
2002-07-29	[PATCH] remove acct arg from do_munmap	Hugh Dickins	1	-1/+1
	An acct flag was added to do_munmap, true everywhere but in mremap's move_vma: instead of updating the arch and driver sources, revert that change and temporarily mask VM_ACCOUNT around that one do_munmap. Also, noticed that do_mremap fails needlessly if both shrinking _and_ moving a mapping: update old_len to pass vm area boundaries test.
2002-07-29	[PATCH] shmem_file_setup when MAP_NORESERVE	Hugh Dickins	1	-1/+1
	If we support mmap MAP_NORESERVE, we should support it on shared anonymous objects: too bad that needs a few changes. do_mmap_pgoff pass VM_ACCOUNT (or not) down to shmem_file_setup, flag stored into shmem info, for use by shmem_delete_inode later. Also removed a harmless but pointless call to shmem_truncate.
2002-07-28	[PATCH] strict overcommit	Andrew Morton	1	-1/+1
	Alan's overcommit patch, brought to 2.5 by Robert Love. Can't say I've tested its functionality at all, but it doesn't crash, it has been in -ac and RH kernels for some time and I haven't observed any of its functions on profiles. "So what is strict VM overcommit? We introduce new overcommit policies that attempt to never succeed an allocation that can not be fulfilled by the backing store and consequently never OOM. This is achieved through strict accounting of the committed address space and a policy to allow/refuse allocations based on that accounting. In the strictest of modes, it should be impossible to allocate more memory than available and impossible to OOM. All memory failures should be pushed down to the allocation routines -- malloc, mmap, etc. The new modes are available via sysctl (same as before). See Documentation/vm/overcommit-accounting for more information."
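The "strict accounting" idea in the quoted description can be sketched as a simple charge/uncharge counter. This is an illustrative model only: struct commit_acct, acct_charge() and acct_uncharge() are hypothetical names, and the limit is taken as a given number rather than derived the way the kernel derives it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting state, in pages; not the kernel's data structure. */
struct commit_acct {
	size_t limit;      /* what the backing store can hold (assumed given) */
	size_t committed;  /* address space already promised to processes */
};

/* Charge pages at allocation time; refuse instead of overcommitting. */
static bool acct_charge(struct commit_acct *a, size_t pages)
{
	if (pages > a->limit - a->committed)
		return false;   /* fails up front, like mmap() returning ENOMEM */
	a->committed += pages;
	return true;
}

/* Return pages to the pool when a mapping goes away. */
static void acct_uncharge(struct commit_acct *a, size_t pages)
{
	a->committed -= pages;
}
```

Because every request is checked against the remaining budget before it is granted, committed never exceeds limit, which is the property that pushes failures down to the allocation call instead of a later OOM.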
2002-07-23	[PATCH] shm_destroy lock hang	Hugh Dickins	1	-7/+8
	Martin Schwidefsky <schwidefsky@de.ibm.com> reported "Bug with shared memory" to LKML 14 May: hang due to schedule in truncate_list_pages called from .... shm_destroy holding shm_lock spinlock. shm_destroy needs that lock for shm_rmid, but it can be safely unlocked once link from id to shp has been removed.
2002-07-16	Merge kroah.com:/home/greg/linux/BK/bleeding_edge-2.5	Greg Kroah-Hartman	3	-49/+13
	into kroah.com:/home/greg/linux/BK/lsm-2.5
2002-07-14	[PATCH] ipc_ statics	Stephen Rothwell	2	-5/+5
	This patch just makes some stuff in ipc/ static.
2002-07-14	LSM: move struct shmid_kernel out of ipc/shm.c to include/linux/shm.h	Greg Kroah-Hartman	2	-19/+9
	Also move where we set sma->sem_perm.mode and .key to before ipc_addid() gets called.
2002-07-14	LSM: move the struct msg_msg and struct msg_queue definitions out of the ↵	Greg Kroah-Hartman	1	-30/+4
	msg.c file to the msg.h file Also move where the msg->q_perm.mode and .key values get set to before ipc_addid() gets called to make placing a hook there easier.
2002-05-26	[PATCH] semctl SUSv2 compliance	Rusty Russell	1	-0/+1
	Christopher Yeoh <cyeoh@samba.org>: (Made -p1 compliant by rusty) SUSv2 semctl compliance: The semctl call with SETVAL currently does not set sempid (at the moment sempid is only set during a successful semop call). An explanation from Geoff Clare of the Open Group regarding why sempid should be set during the semctl call: "The spec isn't very clear, but there is a statement on the semget() page which I think justifies the assumption made by the test. It says that upon creation, the data structure associated with each semaphore in the set is not initialised, and that the semctl() function with SETVAL or SETALL can be used to initialise each semaphore. Therefore semctl() with SETVAL has to set sempid to something, and since sempid contains the "process ID of the last operation", setting it to anything other than the pid of the calling process would mean that sempid contained misleading information. It could be argued that setting it to zero would not be misleading, but zero cannot be the process ID of a process, and so is not a valid value for sempid anyway." The following patch changes semctl so when called with SETVAL sempid is set to the pid of the calling process:
2002-04-28	[PATCH] 2.5.10 BKL not always released in sem_exit()	Chris Wright	1	-1/+3
	The patch below fixes sem_exit() so that the BKL is always released.
2002-04-23	[PATCH] 2.5.9 SEM_UNDO patch	Dave Olien	2	-36/+214
	As we discussed some time ago, here is a patch for the SEM_UNDO change that can be applied to linux-2.5.9.
2002-04-02	Fix missing include due to do_exit() BKL movement	Linus Torvalds	1	-0/+1

2002-04-02	[PATCH] wrong return codes in ipc shm	Dave Jones	1	-2/+5
	We always returned success even when we had no ->vm_ops
2002-04-02	[PATCH] BKL reduction in do_exit	Dave Hansen	1	-0/+4
	Push BKL down to the (few) routines that actually need it, remove it from the do_exit() path.
2002-03-17	[PATCH] struct super_block cleanup - shmem	Brian Gerst	1	-0/+1
	Separates shmem_sb_info from struct super_block.
2002-02-04	v2.5.1.3 -> v2.5.1.4	Linus Torvalds	1	-0/+1
	- Jens Axboe: more bio updates, fix some request list bogosity under load - Al Viro: export seq_xxx functions - Manfred Spraul: include file cleanups, pc110pad compile fix - David Woodhouse: fix JFFS2 write error handling - Dave Jones: start merging up with 2.4.x patches - Manfred Spraul: coredump fixes, FS event counter cleanups - me: fix SCSI CD-ROM sectorsize BIO breakage
2002-02-04	v2.5.0.7 -> v2.5.0.8	Linus Torvalds	1	-3/+24
	- Greg KH: USB updates - Jens Axboe: more bio updates - Christoph Rohland: fix up proper shmat semantics
2002-02-04	v2.4.12.3 -> v2.4.12.4	Linus Torvalds	1	-3/+5
	- Al Viro: mnt_list init - Jeff Garzik: network driver update (license tags, tulip driver) - David Miller: sparc, net updates - Ben Collins: firewire update - Gerd Knorr: btaudio/bttv update - Tim Hockin: MD cleanups - Greg KH, Petko Manolov: USB updates - Leonard Zubkoff: DAC960 driver update
2002-02-04	v2.4.10.1 -> v2.4.10.2	Linus Torvalds	1	-2/+15
	- me/Al Viro: fix bdget() oops with block device modules that don't clean up after they exit - Alan Cox: continued merging (drivers, license tags) - David Miller: sparc update, network fixes - Christoph Hellwig: work around broken drivers that add a gendisk more than once - Jakub Jelinek: handle more ELF loading special cases - Trond Myklebust: NFS client and lockd reclaimer cleanups/fixes - Greg KH: USB updates - Mikael Pettersson: separate out local APIC / IO-APIC config options
2002-02-04	v2.4.9.9 -> v2.4.9.10	Linus Torvalds	2	-3/+13
	- Alan Cox: continued merging - Mingming Cao: make msgrcv/shmat check the queue/segment ID's properly - Greg KH: USB serial init failure fix, Xircom serial converter driver - Neil Brown: nfsd/raid/md/lockd cleanups - Ingo Molnar: multipath RAID personality, raid xor update - Hugh Dickins/Marcelo Tosatti: swapin read-ahead race fix - Vojtech Pavlik: fix up some of the infrastructure for x86-64 - Robert Love: AMD 761 AGP GART support - Jens Axboe: fix SCSI-generic queue handling race - me: be sane about page reference bits
2002-02-04	v2.4.8.1 -> v2.4.8.2	Linus Torvalds	1	-1/+1
	- me: fix forgotten nfsd usage of filldir off_t -> loff_t change - Alan Cox: more driver merges
2002-02-04	v2.4.4.3 -> v2.4.4.4	Linus Torvalds	1	-7/+14
	- Russell King: ARM updates - Al Viro: more init cleanups - Cort Dougan: more PPC updates - David Miller: cleanups, pci mmap updates - Neil Brown: raid resync by sector - Alan Cox: more merging with -ac - Johannes Erdfelt: USB updates - Kai Germaschewski: ISDN updates - Tobias Ringstrom: dmfe.c network driver update - Trond Myklebust: NFS client updates and cleanups
2002-02-04	v2.4.2.4 -> v2.4.2.5	Linus Torvalds	1	-4/+6
	- Rik van Riel and others: mm rw-semaphore (ps/top ok when swapping) - IDE: 256 sectors at a time is legal, but apparently confuses some drives. Max out at 255 sectors instead. - Petko Manolov: USB pegasus driver update - make the boottime memory map printout at least almost readable. - USB driver updates - pte_alloc()/pmd_alloc() need page_table_lock.
2002-02-04	v2.4.1.4 -> v2.4.2	Linus Torvalds	4	-6/+6
	- sync up more with Alan - Urban Widmark: smbfs and HIGHMEM fix - Chris Mason: reiserfs tail unpacking fix ("null bytes in reiserfs files") - Adam Richter: new cpia usb ID - Hugh Dickins: misc small sysv ipc fixes - Andries Brouwer: remove overly restrictive sector size check for SCSI cd-roms
2002-02-04	v2.4.1.2 -> v2.4.1.3	Linus Torvalds	4	-4/+4
	- Jens: better ordering of requests when unable to merge - Neil Brown: make md work as a module again (we cannot autodetect in modules, not enough background information) - Neil Brown: raid5 SMP locking cleanups - Neil Brown: nfsd: handle Irix NFS clients named pipe behavior and dentry leak fix - maestro3 shutdown fix - fix dcache hash calculation that could cause bad hashes under certain circumstances (Dean Gaudet) - David Miller: networking and sparc updates - Jeff Garzik: include file cleanups - Andy Grover: ACPI update - Coda-fs error return fixes - rth: alpha Jensen update
2002-02-04	v2.4.0.3 -> v2.4.0.4	Linus Torvalds	1	-2/+3
	- ReiserFS merge - fix DRM R128/AGP dependency
2002-02-04	Import changeset	Linus Torvalds	6	-0/+3224