path: root/fs
Age  Commit message  Author  Files  Lines
2005-10-21Merge with /pub/scm/linux/kernel/git/sfrench/cifs-2.6.git/Steve French24-545/+2266
2005-10-20[CIFS] Defer close of file handle slightly if there are pending writes thatSteve French5-25/+73
need to get in ahead of it that depend on that file handle. Fixes occasional bad file handle errors on write in heavy-use, multiple-process cases. Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-17[PATCH] aio: revert lock_kiocb()Zach Brown1-25/+1
lock_kiocb() was introduced to serialize retrying and cancellation. In the process of doing so it tried to sleep waiting for KIF_LOCKED while holding the ctx_lock spinlock. Recent fixes have ensured that multiple concurrent retries won't be attempted for a given iocb. Cancel has other problems and has no significant in-tree users that have been complaining about it. So for the immediate future we'll revert sleeping with the lock held and will address proper cancellation and retry serialization in the future. Signed-off-by: Zach Brown <zach.brown@oracle.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] output of /proc/maps on nommu systems is incompleteDavid McCullough1-0/+1
Currently you do not get all the map entries on nommu systems because the start function doesn't index into the list using the value of "pos". Signed-off-by: David McCullough <davidm@snapgear.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] NFS: Fix Oopsable/unnecessary i_count manipulations in nfs_wait_on_inode()Trond Myklebust1-2/+0
Oopsable since nfs_wait_on_inode() can get called as part of iput_final(). Unnecessary since the caller had better be damned sure that the inode won't disappear from underneath it anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] NFS: Fix cache consistency racesTrond Myklebust3-6/+8
If the data cache has been marked as potentially invalid by nfs_refresh_inode, we should invalidate it rather than assume that changes are due to our own activity. Also ensure that we always start with a valid cache before declaring it to be protected by a delegation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-14[PATCH] nommu build error fixYoshinori Sato1-0/+12
"proc_smaps_operations" is not defined in case of "CONFIG_MMU=n". Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-12CIFS: cifs_writepages should not write beyond end of fileSteve French1-2/+13
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-11[CIFS] Add null malloc response check in notify experimental codeSteve French2-14/+20
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-11[CIFS] CIFS Stats improvementsSteve French9-8/+98
The new cifs_writepages routine was not updating bytes written in cifs stats. Also added ability to clear /proc/fs/cifs/Stats by writing (0 or 1) to it. Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-11[PATCH] binfmt_elf bss padding fixakpm@osdl.org1-1/+1
Nir Tzachar <tzachar@cs.bgu.ac.il> points out that if an ELF file specifies a zero-length bss at a whacky address, we cannot load that binary because padzero() tries to zero out the end of the page at the whacky address, and that may not be writeable. See also http://bugzilla.kernel.org/show_bug.cgi?id=5411 So teach load_elf_binary() to skip the bss setting altogether if the elf file has a zero-length bss segment. Cc: Roland McGrath <roland@redhat.com> Cc: Daniel Jacobowitz <dan@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-11[PATCH] nfsacl: Solaris VxFS compatibility fixAndreas Gruenbacher1-35/+35
Here is a compatibility fix between Linux and Solaris when used with VxFS filesystems: Solaris usually accepts acl entries in any order, but with VxFS it replies with NFSERR_INVAL when it sees a four-entry acl that is not in canonical form. It may also fail with other non-canonical acls -- I can't tell, because that case never triggers: We only send non-canonical acls when we fake up an ACL_MASK entry. Instead of adding fake ACL_MASK entries at the end, inserting them in the correct position makes Solaris+VxFS happy. The Linux client and server sides don't care about entry order. The three-entry-acl special case in which we need a fake ACL_MASK entry was handled in xdr_nfsace_encode. The patch moves this into nfsacl_encode. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
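The reordering described in the nfsacl entry above can be sketched roughly as follows. This is a minimal Python model, not the kernel code: the tag constants mirror the usual POSIX ACL tag values, `with_fake_mask` is a hypothetical helper name, and copying the group-owner permissions into the fake mask is an assumption the log does not state.

```python
# POSIX ACL tag values (assumed here for illustration).
ACL_USER_OBJ, ACL_USER, ACL_GROUP_OBJ, ACL_GROUP, ACL_MASK, ACL_OTHER = (
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20)


def with_fake_mask(entries):
    """Return a 4-entry ACL in canonical order for a minimal 3-entry ACL.

    entries is a list of (tag, perm) tuples; anything other than a
    3-entry ACL is returned unchanged.
    """
    if len(entries) != 3:
        return list(entries)
    user_obj, group_obj, other = entries
    # Assumption: the fake mask carries the group-owner permissions.
    fake_mask = (ACL_MASK, group_obj[1])
    # Canonical order puts ACL_MASK before ACL_OTHER, not at the end.
    return [user_obj, group_obj, fake_mask, other]


acl = [(ACL_USER_OBJ, 0o7), (ACL_GROUP_OBJ, 0o5), (ACL_OTHER, 0o4)]
canonical = with_fake_mask(acl)
```

Appending the mask after `ACL_OTHER` would produce the same set of entries in a non-canonical order, which is the case the entry says Solaris with VxFS rejects.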
2005-10-11[PATCH] v9fs: remove additional buffer allocation from v9fs_file_read and v9fs_file_writeLatchesar Ionkov1-81/+33
v9fs_file_read and v9fs_file_write use kmalloc to allocate buffers as big as the data buffer received as parameter. kmalloc cannot be used to allocate buffers bigger than 128K, so reading/writing data in chunks bigger than 128K fails. This patch reorganizes v9fs_file_read and v9fs_file_write to allocate only buffers as big as the maximum data that can be sent in one 9P message. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
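The idea of bounding each transfer by the per-message maximum, instead of allocating one buffer for the whole request, might look like the sketch below. `chunk_ranges` and the 8192-byte payload limit are hypothetical; the real limit is negotiated by the 9P session and is not given in this log.

```python
def chunk_ranges(offset, count, max_payload):
    """Yield (offset, length) pairs covering count bytes, each <= max_payload."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    while count > 0:
        n = min(count, max_payload)
        yield offset, n
        offset += n
        count -= n


# A 300 KiB request split into messages of at most 8 KiB (assumed limit).
ranges = list(chunk_ranges(0, 300 * 1024, 8192))
```

Only a buffer of `max_payload` bytes is ever needed, so the request size no longer has to fit under the kmalloc limit.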
2005-10-10[CIFS] Fix oops in experimental notify code (when CONFIG_CIFS_EXPERIMENTALSteve French4-1/+12
was turned on). Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-10[CIFS] Still missing a line from previous fixSteve French1-1/+1
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-10[CIFS] Fix minor build problem with previous changesetSteve French1-5/+8
Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-10[CIFS] Do not shrink tcp sndbuf/rcvbuf from their defaultsSteve French1-8/+10
Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-10[CIFS] Correct cifs tcp retry when some data sent before getting EAGAIN.Steve French2-2/+20
Continue implementation of cifs umount begin to allow force unmounts of cifs mounts. Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-10[CIFS] Update cifs version to 1.38Steve French1-1/+1
Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-10[CIFS] Fix byte range locking to Windows when Windows server returnsSteve French4-12/+32
illegal RFC1001 length (which had caused the lock to block forever until killed).
2005-10-10[CIFS] Fix rsize calculation so that large readx flag is checked.Steve French3-8/+28
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-10[CIFS] Reduce CIFS tcp congestion timeout (it was too long) and backoffSteve French3-11/+22
ever longer amounts (up to 15 seconds). This improves performance especially when using large wsize. Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-10[PATCH] relayfs: fix bogus param value in call to vmapTom Zanussi1-1/+1
The third param in this call to vmap shouldn't be GFP_KERNEL, which makes no sense, but rather VM_MAP. Thanks to Al Viro for spotting this. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08[PATCH] gfp flags annotations - part 1Al Viro7-23/+22
- added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-07[CIFS] /proc/fs/cifs debug code cleanup and new stats2Steve French3-5/+41
These changes to debug code and new stats are helpful in debugging potential tcp performance/configuration problems under cifs. Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-06Avoid 'names_cache' memory leak with CONFIG_AUDITSYSCALLLinus Torvalds1-3/+3
The nameidata "last.name" is always allocated with "__getname()", and should always be free'd with "__putname()". Using "putname()" without the underscores will leak memory, because the allocation will have been hidden from the AUDITSYSCALL code. Arguably the real bug is that the AUDITSYSCALL code is really broken, but in the meantime this fixes the problem people see. Reported by Robert Derr, patch by Rick Lindsley. Acked-by: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-05[CIFS] cleanup sparse and compile errors in previous fixSteve French2-4/+3
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-05CIFS: Allow wsize to exceed CIFSMaxBufSizeSteve French3-16/+10
This allows cifs_writepages to send data in larger chunks from the page cache, without requiring larger memory allocations in other cases. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-05CIFS: implement cifs_writepages to perform multi-page I/OSteve French2-11/+190
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-05CIFS: Create routine find_writable_file to reduce redundant codeSteve French3-119/+50
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-04[PATCH] bfs iget() abusesAl Viro1-12/+30
bfs_fill_super() walks the inode table to get the bitmap of free inodes and collect stats. It has no business using iget() for that - it's a lot of extra work, extra icache pollution and more complex code. Switched to walking the damn thing directly. Note: that also allows to kill ->i_dsk_ino in there - separate patch if Tigran can confirm that this field can be zero only for deleted inodes (i.e. something that could only be found during that scan and not by normal lookups). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04[PATCH] bfs endianness annotationsAlexey Dobriyan2-2/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04NTFS: Fix a 64-bitness bug where a left-shift could overflow a 32-bit variableAnton Altaparmakov3-3/+4
which we now cast to 64-bit first (fs/ntfs/mft.c::map_mft_record_page()). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
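The class of bug named in this entry can be modeled in Python by masking to the variable width. The index and shift values below are hypothetical and chosen only so that the 32-bit result visibly loses its high bit.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def shift_as_u32(value, shift):
    """Model a left shift performed in a 32-bit variable: high bits are lost."""
    return (value << shift) & U32_MASK


def shift_as_u64(value, shift):
    """Model casting to 64 bits first, then shifting."""
    return (value << shift) & U64_MASK


index = 0x00400000  # hypothetical record index
shift = 10          # hypothetical shift, e.g. log2 of a 1 KiB record size

truncated = shift_as_u32(index, shift)
widened = shift_as_u64(index, shift)
```

With these values the 32-bit shift wraps to zero while the widened shift yields the intended 4 GiB offset.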
2005-10-04NTFS: Fix a stupid bug in __ntfs_bitmap_set_bits_in_run() which caused theAnton Altaparmakov2-2/+6
count to become negative and hence we had a wild memset() scribbling all over the system's ram. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-03[CIFS] Missing parenthesis from error message in previous fixSteve French1-1/+1
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-03[CIFS] Allow SMBWrite2 to work to older serversSteve French1-7/+22
Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-10-03[CIFS] Add writepages support to shrink memory usage on writes,Steve French5-72/+89
eliminate the double copy, and improve cifs write performance and help the server by upping the typical write size from 4K to 16K (or even larger if wsize set explicitly) for servers which support this. Part 1 of 2 Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-09-30[PATCH] fuse: check O_DIRECTMiklos Szeredi1-0/+4
Check O_DIRECT and return -EINVAL error in open. dentry_open() also checks this but only after the open method is called. This patch optimizes away the unnecessary upcalls in this case. It could be a correctness issue too: if filesystem has open() with side effect, then it should fail before doing the open, not after. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] uml: remove empty hostfs_truncate methodPaolo 'Blaisorblade' Giarrusso1-7/+0
Calling truncate() on hostfs spits a kernel warning "Something isn't implemented here", but it still works fine. Indeed, hostfs i_op->truncate doesn't do anything. But hostfs_setattr() -> set_attr() correctly detects ATTR_SIZE and calls truncate() on the host. So we should be safe (using ftruncate() may be better, in case the file is unlinked on the host, but we aren't sure to have the file open for writing, and reopening it would cause the same races; plus nobody should expect UML to be so careful). So, the warning is wrong, because the current implementation is working. Al, am I correct, and can the warning be therefore dropped? CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: avoid extra aio_{read,write} call when ki_left == 0Zach Brown1-2/+2
Recently aio_p{read,write} changed to perform retries internally rather than returning -EIOCBRETRY. This inadvertently resulted in always calling aio_{read,write} with ki_left at 0 which would in turn immediately return 0. Harmless, but we can avoid this call by checking in the caller. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: remove unlocked task_list test and resulting raceZach Brown1-46/+33
Only one of the run or kick path is supposed to put an iocb on the run list. If both of them do it then one of them can end up referencing a freed iocb. The kick path could delete the task_list item from the wait queue before getting the ctx_lock and putting the iocb on the run list. The run path was testing the task_list item outside the lock so that it could catch ki_retry methods that return -EIOCBRETRY *without* putting the iocb on a wait queue and promising to call kick_iocb. This unlocked check could then race with the kick path to cause both to try and put the iocb on the run list. The patch stops the run path from testing task_list by requiring that any ki_retry that returns -EIOCBRETRY *must* guarantee that kick_iocb() will be called in the future. aio_p{read,write}, the only in-tree -EIOCBRETRY users, are updated. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: lock around kiocbTryKick()Zach Brown1-7/+12
Only one of the run or kick path is supposed to put an iocb on the run list. If both of them do it then one of them can end up referencing a freed iocb. The kick path could set the Kicked bit before acquiring the ctx_lock and putting the iocb on the run list. The run path, while holding the ctx_lock, could see this partial kick and mistake it for a kick that was deferred while it was doing work with the run_list NULLed out. It would then race with the kick thread to add the iocb to the run list. This patch moves the kick setting under the ctx_lock so that only one of the kick or run path queues the iocb on the run list, as intended. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] missing ERR_PTR in 9fsAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29[PATCH] readv/writev syscalls are not checked by lsmKostik Belousov1-0/+3
It seems that the readv(2)/writev(2) syscalls do not call the file_permission callback. This looks like an oversight. I have filed the issue in the Red Hat bugzilla as https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169433 and got the recommendation to post this on the lsm mailing list. The following trivial patch solves the problem. Signed-off-by: Kostik Belousov <kostikbel@gmail.com> Signed-off-by: Chris Wright <chrisw@osdl.org>
2005-09-28[PATCH] epoll: handle timeout overflowDavide Libenzi1-2/+6
Handle the timeout upper boundary for epoll. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
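One way to handle an upper boundary like this is to treat any millisecond value that would overflow the tick conversion as "wait indefinitely". The sketch below is an assumption about the shape of such a check, not the patch itself: `HZ`, `MAX_SCHEDULE_TIMEOUT`, `MAX_MS`, and `ms_to_ticks` are illustrative names and values.

```python
HZ = 250                          # assumed tick rate
MAX_SCHEDULE_TIMEOUT = 2**31 - 1  # assumed: largest signed long on 32 bits

# Largest millisecond value whose conversion below cannot exceed the limit.
MAX_MS = min(1000 * MAX_SCHEDULE_TIMEOUT // HZ,
             (MAX_SCHEDULE_TIMEOUT - 999) // HZ)


def ms_to_ticks(timeout_ms):
    """Convert milliseconds to ticks, rounding up; clamp out-of-range input."""
    if timeout_ms < 0 or timeout_ms >= MAX_MS:
        return MAX_SCHEDULE_TIMEOUT  # negative or too large: wait indefinitely
    return (timeout_ms * HZ + 999) // 1000
```

The bound is checked before the multiplication, so the intermediate `timeout_ms * HZ + 999` stays within the assumed signed range for every accepted input.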
2005-09-28[PATCH] v9fs: fix races in fid allocationLatchesar Ionkov7-196/+200
Fid management cleanup. The patch attempts to fix the races in dentry's fid management. Dentries don't keep the opened fids anymore, they are moved to the file structs. Ideally there should be no more than one fid with fidcreate equal to zero in the dentry's list of fids. v9fs_fid_create initializes the important fields (fid, fidcreated) before v9fs_fid is added to the list. v9fs_fid_lookup returns only fids that are not created by v9fs_create. v9fs_fid_get_created returns the fid created by the same process by v9fs_create (if any) and removes it from dentry's list. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] Fix ext3_new_inode() failure pathsChris Sykes1-15/+14
Fix failure paths in ext3_new_inode() and clean up duplicated code: - DQUOT_DROP() was not being called if ext3_init_security() failed. Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] Fix ext2_new_inode() failure pathsChris Sykes1-12/+13
Fix failure paths in ext2_new_inode() and clean up duplicated code: - DQUOT_DROP() was not being called if ext2_init_security() failed. Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] fuse: check reserved node ID valuesMiklos Szeredi1-0/+6
This patch checks reserved node ID values returned by lookup and creation operations. In case one of the reserved values is sent, return -EIO. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
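A check of this kind could be sketched as below. Which values count as reserved is an assumption here (zero and the root node ID), and `invalid_nodeid`, `check_reply_nodeid`, and `FUSE_ROOT_ID = 1` are illustrative names and values rather than something this log confirms.

```python
import errno

FUSE_ROOT_ID = 1  # assumed: node ID reserved for the filesystem root


def invalid_nodeid(nodeid):
    """Assumption: 0 and the root ID are the reserved values."""
    return nodeid == 0 or nodeid == FUSE_ROOT_ID


def check_reply_nodeid(nodeid):
    """Return 0 for an acceptable node ID, -EIO for a reserved one."""
    return -errno.EIO if invalid_nodeid(nodeid) else 0
```

Rejecting the reply with `-EIO` keeps a misbehaving userspace filesystem from handing the kernel a second object that claims to be the root.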
2005-09-28[PATCH] fuse: add required version infoMiklos Szeredi1-0/+3
Add information about required version of the userspace library/utilities to Documentation/Changes. Also add pointer to this and to FUSE documentation from Kconfig. Thanks to Anton Altaparmakov for the reminder. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-26NTFS: Re-fix sparse warnings in a more correct way, i.e. don't use an enum withAnton Altaparmakov1-8/+5
different types in it but #define the two constants instead. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-26Merge branch 'master' of /home/src/linux-2.6/Anton Altaparmakov15-116/+138
2005-09-26NTFS: More $LogFile handling fixes: when chkdsk has been run, it can leave theAnton Altaparmakov3-15/+35
restart pages in the journal without multi sector transfer protection fixups (i.e. the update sequence array is empty and in fact does not exist). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-26NTFS: Fix the definition of the CHKD ntfs record magic. It had an off byAnton Altaparmakov1-1/+1
two error causing it to be CHKB instead of CHKD. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
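The "off by two" is easy to see when the four-character tag is written out as a 32-bit value. The sketch assumes the magic is stored little-endian on disk; `magic_le` is a hypothetical helper.

```python
def magic_le(tag):
    """Interpret a 4-character ASCII tag as a little-endian 32-bit value."""
    return int.from_bytes(tag.encode("ascii"), "little")


right = magic_le("CHKD")
wrong = magic_le("CHKB")
```

'B' and 'D' are two apart in ASCII, so the two constants differ only in the most significant byte.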
2005-09-23[PATCH] cifs: Add support for suspendSteve French2-0/+4
cifsd had been preventing software suspend from completing. Signed-off-by: pavel@suse.de Signed-off-by: Steve French <sfrench@us.ibm.com> lightly modified Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-23Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/shaggy/jfs-2.6Linus Torvalds4-7/+14
2005-09-23NTFS: Change ntfs_cluster_free() to require a write locked runlist on entryAnton Altaparmakov5-34/+32
since we otherwise get into a lock reversal deadlock if a read locked runlist is passed in. In the process also change it to take an ntfs inode instead of a vfs inode as parameter. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-22[PATCH] NFS: fix client oops when debugging is onNick Wilson1-3/+2
nfs_readpage_release() causes an oops while accessing a file with NFS debugging turned on (echo 32767 > /proc/sys/sunrpc/nfs_debug) and a kernel built with CONFIG_DEBUG_SLAB. This patch moves the debugging statement above nfs_release_request() to avoid accessing freed memory. Signed-off-by: Nick Wilson <njw@osdl.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] ext3: EXT3_DEBUG build fixesGlauber de Oliveira Costa2-6/+6
Fix some warnings and a build error when EXT3_DEBUG is enabled. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] ext3: ext3_show_options fixOGAWA Hirofumi1-6/+5
EXT3_MOUNT_DATA_FLAGS is not a boolean. This fixes it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] v9fs: don't free root dentry & inode if error occurs in v9fs_get_sbLatchesar Ionkov1-17/+7
If an error occurs in v9fs_get_sb after it calls sget, the dentry object of the root and its inode may be freed twice -- once while handling the error in v9fs_get_sb, and a second time when v9fs_get_sb calls deactivate_super (which in turn calls v9fs_kill_super). The patch removes the unnecessary code that frees the root dentry and its inode. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] v9fs: replace strlen on newly allocated by __getname buffers to PATH_MAXLatchesar Ionkov1-2/+2
v9fs_vfs_readlink allocates space for the link using __getname and erroneously uses strlen on the newly allocated buffer to check if the buffer passed by the user is bigger than the one returned by __getname. The patch replaces the strlen usage with PATH_MAX, which is the actual size of the buffers returned by __getname. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] v9fs: make copy of the transport prototype instead of using it directlyLatchesar Ionkov1-1/+7
When a new session is created it uses a template object of the specified transport type to instantiate its own copy. The code for making a copy of the template object was lost, and the object itself is attached to the v9fs session. This leads to many sessions using the same transport instead of having their own copy. The patch puts back the code that makes a copy of the template object. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] v9fs: allocate the Rwalk qid array from the right conv bufferLatchesar Ionkov1-1/+1
When v9fs_deserealize_fcall deserializes a Rwalk message, it incorrectly allocates space for the qid array in the source instead of the destination buffer. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[PATCH] v9fs: make conv functions to check for conv buffer overflowLatchesar Ionkov1-70/+85
The buf_check_size function checks if the conv buffer has enough space for the performed operation, but it doesn't return the result back to the calling function, only logs an error in the log. The report-back-error functionality was lost when buf_check_size was converted from macro to inline function. The return in the macro used to exit from the functions that include it; after the conversion it just exits from the inline function itself. The patch makes buf_check_size return a flag, and all functions that use it check whether they should perform the operation or exit. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
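The pattern the fix restores, a size check that reports its result and callers that act on it, can be illustrated with a toy buffer. `ConvBuffer`, `check_size`, and `put_int8` are hypothetical names for this sketch and do not mirror the v9fs conversion code.

```python
class ConvBuffer:
    """Toy fixed-capacity write buffer (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()
        self.overflowed = False

    def check_size(self, length):
        """Return True when length more bytes fit; record an overflow otherwise."""
        if len(self.data) + length > self.capacity:
            self.overflowed = True
            return False
        return True

    def put_int8(self, value):
        # The caller must act on the flag; ignoring it would write past the limit.
        if self.check_size(1):
            self.data.append(value & 0xFF)


buf = ConvBuffer(2)
for v in (1, 2, 3):
    buf.put_int8(v)
```

If `put_int8` called `check_size` and then appended unconditionally, the third write would still land in the buffer, which is the behavior the entry describes after the macro was turned into an inline function.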
2005-09-22[PATCH] proc_task_root_link c99 fixAndrew Morton1-3/+5
fs/proc/base.c: In function `proc_task_root_link': fs/proc/base.c:364: warning: ISO C90 forbids mixed declarations and code Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-22[CIFS] Various minor bigendian fixes and sparse level 2 warning message fixesSteve French8-33/+60
Most important of these fixes mapchars on bigendian and a few statfs fields Signed-off-by: Shaggy (shaggy@austin.ibm.com) Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-22NTFS: Fix sparse warnings that have crept in over time.Anton Altaparmakov3-4/+9
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-21[CIFS] Add support for legacy servers part nine. statfs (df and du) is nowSteve French7-10/+107
functional, and the length check is fixed so readdir does not throw a warning message when Windows ME messes up the response to FindFirst of an empty dir (with only . and ..). Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-21[PATCH] fat: fix adateStephane Kardas1-3/+8
During a forensic analysis on the fat file system, I found that the result for the last access date on this file system was different between the stat command and the istat command (package tct-utils). The istat command displays a true date (the right windows date) but the stat primitive (so stat, find, ls command) displays a wrong date. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-21[PATCH] Fix invisible threads problemSripathi Kodi1-7/+77
When the main thread of a thread group has done pthread_exit() and died, the other threads are still happily running, but will not be visible under /proc because their leader is no longer accessible. This fixes the access control so that we can see the sub-threads again. Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com> Acked-by: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-20[CIFS] Add support for legacy servers part eight. Write fixes for WindowsSteve French4-31/+66
ME, and do not set ctime unless explicitly requested with atime and/or mtime (it gets thrown away by most servers anyway as there is no way to set this via posix). Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-20JFS: don't dereference tlck->ip from txUpdateMapDave Kleikamp2-1/+9
The inode pointer may no longer be valid Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-09-20NTFS: More runlist handling fixes from Richard Russon and myself.Anton Altaparmakov1-22/+33
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-19Make fsnotify possibly work better for the inode removal caseLinus Torvalds1-1/+2
Checking i_nlink is dubious, but the alternatives look even less appetizing. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-19Merge branch 'master' of /home/src/linux-2.6/Anton Altaparmakov6-56/+43
2005-09-19NTFS: Fix ntfs_{read,write}page() to cope with concurrent truncates better.Anton Altaparmakov3-41/+80
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-19NTFS: Fix handling of compressed directories that I broke in earlier changeset.Anton Altaparmakov1-4/+8
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-19NTFS: Fix various bugs in the runlist merging code. (Based on libntfsAnton Altaparmakov2-64/+70
changes by Richard Russon.) Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-18[CIFS] Add support for legacy servers part seven. Fix open for write,Steve French2-4/+21
begin implementation of Win9x style set file size via open then write of zero bytes. Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-17[PATCH] FAT: miss-sync issues on sync mount (miss-sync on write)OGAWA Hirofumi2-36/+16
This patch fixes the miss-sync issue on the write() system call. This updates inode attrs flags, mtime and ctime on every commit_write call, due to locking. Signed-off-by: Hiroyuki Machida <machida@sm.sony.co.jp> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17[PATCH] files: fix preemption issuesDipankar Sarma2-0/+6
With the new fdtable locking rules, you have to protect fdtable with either ->file_lock or rcu_read_lock/unlock(). There are some places where we aren't doing either. This patch fixes those places. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17[PATCH] Add smp_mb__after_clear_bit() to unlock_kiocb()Zach Brown1-0/+1
Add smp_mb__after_clear_bit() to unlock_kiocb() AIO's use of wait_on_bit_lock()/wake_up_bit() forgot to add a barrier between clearing its lock bit and calling wake_up_bit() so wake_up_bit()'s unlocked waitqueue_active() can race. This puts AIO's use in line with the others and the comment above wake_up_bit(). Signed-off-by: Zach Brown <zach.brown@oracle.com> Acked-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17[PATCH] epoll: fix delayed initialization bugDavide Libenzi1-20/+20
Al found a potential problem in epoll_create(), where the file->private_data member was set after fd_install(). This is obviously wrong since another thread might do a close() on that fd# before we set the file->private_data member. This goes over 2.6.13 and passes a few basic tests I've done here. (akpm: snuck in a kzalloc() cleanup too) Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-15[CIFS] Add support for legacy servers part six. Fix read syntax soSteve French1-0/+7
we do not request more than negotiated buffer size even if buffer size is small (smaller than one page) Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-15[CIFS] Fix readdir caching when unlink removes file in current searchSteve French3-6/+38
buffer, and this is followed by a rewind search to just before the deleted entry. Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-15JFS: Fix sparse warnings, including endian errorDave Kleikamp3-6/+5
The fix in inode.c is a real bug. It could result in undeleted, yet unconnected files on big-endian hardware. The others are trivial. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-09-15[CIFS] Fix compiler warningsSteve French4-5/+6
Fix some compiler warnings noticed on x64 by me and ppc64 by Shaggy Signed-off-by: Steve French (sfrench@us.ibm.com)
2005-09-14[COMPAT]: Fixup compat_do_execve()David S. Miller1-0/+4
Missing acct_update_integrals() and update_mem_hiwater() calls compared to its native counterpart. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14[PATCH] Fix the fdtable freeing in the case of vmalloced fdset/arraysDipankar Sarma1-7/+3
Noted by David Miller: "The bug is that free_fd_array() takes a "num" argument, but when calling it from __free_fdtable() we're instead passing in the size in bytes (ie. "num * sizeof(struct file *)")." Yes it is a bug. I think I messed it up while merging newer changes with an older version where I was using size in bytes to optimize. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
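A minimal userspace sketch of this class of bug, a byte size passed where an element count is expected. The names are hypothetical, and the sketch assumes for illustration that the free routine chooses between a kmalloc-style and a vmalloc-style release based on the computed size; `SKETCH_PAGE_SIZE` is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct file;                        /* opaque stand-in for the kernel type */

#define SKETCH_PAGE_SIZE 4096u      /* assumed page size for the sketch */

enum sketch_free_method { SKETCH_KFREE, SKETCH_VFREE };

/* Takes an element COUNT, like the "num" argument in the commit text. */
enum sketch_free_method sketch_pick_free(size_t num)
{
    size_t size;

    assert(num <= SIZE_MAX / sizeof(struct file *));
    size = num * sizeof(struct file *);
    return size <= SKETCH_PAGE_SIZE ? SKETCH_KFREE : SKETCH_VFREE;
}

/* Buggy caller: hands over the size in bytes, so it is scaled twice. */
enum sketch_free_method sketch_free_buggy(size_t num)
{
    return sketch_pick_free(num * sizeof(struct file *));
}

/* Fixed caller: hands over the count. */
enum sketch_free_method sketch_free_fixed(size_t num)
{
    return sketch_pick_free(num);
}
```

With 512 entries the fixed caller stays within one page on both 32-bit and 64-bit pointers, while the buggy caller computes a size several pages large and picks the other release path.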
2005-09-14[PATCH] error path in setup_arg_pages() misses vm_unacct_memory()Hugh Dickins1-5/+0
Pavel Emelianov and Kirill Korotaev observe that fs and arch users of security_vm_enough_memory tend to forget to vm_unacct_memory when a failure occurs further down (typically in setup_arg_pages variants). These are all users of insert_vm_struct, and that reservation will only be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't. So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into Committed_AS each time they're run. But don't add VM_ACCOUNT to them, it's inappropriate to reserve against the very unlikely case that gdb be used to COW a vDSO page - we ought to do something about that in do_wp_page, but there are yet other inconsistencies to be resolved. The safe and economical way to fix this is to let insert_vm_struct do the security_vm_enough_memory check when it finds VM_ACCOUNT is set. And the MIPS irix_brk has been calling security_vm_enough_memory before calling do_brk which repeats it, doubly accounting and so also leaking. Remove that, and all the fs and arch calls to security_vm_enough_memory: give it a less misleading name later on. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-14[PATCH] Fix fs/exec.c:788 (de_thread()) BUG_ONAlexander Nyberg1-3/+2
It turns out that the BUG_ON() in fs/exec.c: de_thread() is unreliable and can trigger due to the test itself being racy.

de_thread() does
	while (atomic_read(&sig->count) > count) {
	}
	.....
	.....
	BUG_ON(!thread_group_empty(current));

but release_task does
	write_lock_irq(&tasklist_lock)
	__exit_signal
		(this is where atomic_dec(&sig->count) is run)
	__exit_sighand
	__unhash_process
		takes write lock on tasklist_lock
		remove itself out of PIDTYPE_TGID list
	write_unlock_irq(&tasklist_lock)

so there's a clear (although small) window between the atomic_dec(&sig->count) and the actual PIDTYPE_TGID unhashing of the thread. And actually there is no need for all threads to have exited at this point, so we simply kill the BUG_ON.

Big thanks to Marc Lehmann who provided the test-case. Fixes Bug 5170 (http://bugme.osdl.org/show_bug.cgi?id=5170) Signed-off-by: Alexander Nyberg <alexn@telia.com> Cc: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13Merge master.kernel.org:/pub/scm/linux/kernel/git/dwmw2/audit-2.6 Linus Torvalds1-1/+1
2005-09-13[PATCH] nfsd4: fix setclientid unlock of unlocked state lockNeil Brown1-3/+2
We could try to unlock the state lock here without having first locked it. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13[PATCH] nfsd4: fix open seqid incrementing in lockNeil Brown1-13/+15
In the case of a lock which introduces a new lockowner, the openowner's sequence id should be incremented, even when the operation fails, if the error is a sequence-id-mutating error. The current code fails to do that in some cases. Fix this by using the same sequence-id-incrementing mechanism that all other such operations use. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13[PATCH] nfsd4: move replay_ownerNeil Brown2-24/+30
It seems more natural to move the setting of the replay_owner into the relevant procedure instead of doing it in nfsv4_proc_compound. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13[PATCH] nfsd4: printk reductionNeil Brown1-17/+10
Demote some printk's that look like they could be triggered by non-buggy clients to dprintk's. (For example, stale clientid's are normal occurrences on reboot, and on a server with a lot of clients these messages could become annoying.) Also remove some redundant dprintk's (e.g. no need for both STALE_CLIENTID and its callers to do dprintks). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13[PATCH] reiserfs: use mark_inode_dirty instead of reiserfs_update_sdChris Mason2-14/+13
reiserfs should use mark_inode_dirty during reiserfs_file_write and reiserfs_commit_write. This makes sure the inode is properly flagged as dirty, which is used during O_SYNC to decide when to trigger log commits. This patch also removes the O_SYNC check from reiserfs_commit_write, since that gets dealt with properly at higher layers once we start using mark_inode_dirty. Thanks to Hifumi Hisashi <hifumi.hisashi@lab.ntt.co.jp> for catching this. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13[PATCH] open returns ENFILE but creates file anywayPeter Staubach1-42/+56
When open(O_CREAT) is called and the error, ENFILE, is returned, the file may be created anyway. This is counter intuitive, against the SUS V3 specification, and may cause applications to misbehave if they are not coded correctly to handle this semantic. The SUS V3 specification explicitly states "No files shall be created or modified if the function returns -1.". The error, ENFILE, is used to indicate the system wide open file table is full and no more file structs can be allocated. This is due to an ordering problem. The entry in the directory is created before the file struct is allocated. If the allocation for the file struct fails, then the system call must return an error, but the directory entry was already created and can not be safely removed. The solution to this situation is relatively easy. The file struct should be allocated before the directory entry is created. If the allocation fails, then the error can be returned directly. If the creation of the directory entry fails, then the file struct can be easily freed. Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
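The ordering change can be sketched with a toy model. Everything here is hypothetical (`sketch_fs`, the counters, both open functions); it only illustrates why allocating the file struct before creating the directory entry leaves no side effect behind on ENFILE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the two resources involved. */
struct sketch_fs {
    int free_file_structs;  /* slots left in the system-wide file table */
    int dir_entries;        /* directory entries created so far */
};

static bool sketch_alloc_file(struct sketch_fs *fs)
{
    assert(fs->free_file_structs >= 0);
    if (fs->free_file_structs == 0)
        return false;
    fs->free_file_structs--;
    return true;
}

/* Old ordering: the directory entry is created before the allocation. */
int sketch_open_create_old(struct sketch_fs *fs)
{
    fs->dir_entries++;              /* side effect happens first */
    if (!sketch_alloc_file(fs))
        return -ENFILE;             /* error, yet the file now exists */
    return 0;
}

/* New ordering: allocate first, create the entry only afterwards. */
int sketch_open_create_new(struct sketch_fs *fs)
{
    if (!sketch_alloc_file(fs))
        return -ENFILE;             /* nothing has been created */
    fs->dir_entries++;
    return 0;
}
```

With a full file table, the old ordering returns -ENFILE and still leaves one directory entry behind; the new ordering returns -ENFILE with the directory untouched.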
2005-09-12NTFS: Mask out __GFP_HIGHMEM when doing kmalloc() in __ntfs_malloc() as itAnton Altaparmakov2-4/+1
otherwise causes a BUG(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
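The masking itself is a one-line bit operation. A sketch, assuming made-up flag values (`SKETCH_GFP_WAIT` and `SKETCH_GFP_HIGHMEM` are illustrative and not taken from the kernel headers): the highmem bit is stripped before the flags reach a kmalloc-style allocator, which cannot hand out high memory.

```c
#include <assert.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define SKETCH_GFP_WAIT     0x10u
#define SKETCH_GFP_HIGHMEM  0x02u

/* Return the flags that are safe to pass to a kmalloc-style allocator. */
unsigned int sketch_kmalloc_safe_flags(unsigned int gfp_mask)
{
    unsigned int flags = gfp_mask & ~SKETCH_GFP_HIGHMEM;

    assert((flags & SKETCH_GFP_HIGHMEM) == 0u);
    return flags;
}
```

All other bits pass through unchanged, so a caller can keep one mask for both the small-allocation and the large-allocation path.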
2005-09-12NTFS: Change the mount options {u,f,d}mask to always parse the number asAnton Altaparmakov2-4/+16
an octal number to conform to how chmod(1) works, too. Thanks to Giuseppe Bilotta and Horst von Brand for pointing out the errors of my ways. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
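A userspace sketch of "always parse as octal", using the standard strtoul() with base 8 as a stand-in for the in-kernel parser. The helper name and the 07777 upper bound are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a {u,f,d}mask option value as octal, with or without a leading 0,
 * the way chmod(1) reads mode digits. Returns 0 on success, -1 on error.
 */
int sketch_parse_octal_mask(const char *s, unsigned int *mask)
{
    char *end;
    unsigned long v;

    assert(s && mask);
    if (*s == '\0')
        return -1;

    errno = 0;
    v = strtoul(s, &end, 8);        /* base 8 regardless of any prefix */
    if (errno != 0 || *end != '\0' || v > 07777UL)
        return -1;                  /* non-octal digit or out of range */

    *mask = (unsigned int)v;
    return 0;
}
```

Under this rule "22" and "022" both yield the same mask, whereas a parser that auto-detects the base would read "22" as decimal.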
2005-09-11Merge branch 'for-linus' from kernel.org:/.../shaggy/jfs-2.6 manuallyLinus Torvalds6-62/+193
Clash due to new delete_inode behavior (the filesystem now needs to do the truncate_inode_pages() call itself). Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] fs: fix-up schedule_timeout() usageNishanth Aravamudan10-33/+21
Use schedule_timeout_{,un}interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Also use helper functions to convert between human time units and jiffies rather than constant HZ division to avoid rounding errors. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
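The rounding problem with constant HZ division can be shown with plain integer arithmetic. The sketch assumes a tick rate of 250 (`SKETCH_HZ`) and a round-up conversion; the function names are hypothetical and only approximate what a msecs-to-jiffies helper does.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_HZ 250UL   /* assumed tick rate for the example */

/* "HZ / divisor" style timeout, e.g. HZ/100 meaning "about 10 ms". */
unsigned long sketch_ticks_by_hz_division(unsigned long divisor)
{
    assert(divisor != 0UL);
    return SKETCH_HZ / divisor;
}

/* Millisecond conversion that rounds up, so a wait is never too short. */
unsigned long sketch_msecs_to_ticks(unsigned long ms)
{
    assert(ms <= (ULONG_MAX - 999UL) / SKETCH_HZ);   /* overflow guard */
    return (ms * SKETCH_HZ + 999UL) / 1000UL;
}
```

At 250 ticks per second, HZ/1000 truncates to 0 ticks and HZ/100 to 2 ticks (8 ms), while the round-up conversion gives 1 tick for 1 ms and 3 ticks for 10 ms.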
2005-09-10[PATCH] fs/cramfs/uncompress.c should #include <linux/cramfs_fs.h>Adrian Bunk1-0/+1
Every file should #include the header with the prototypes of the global functions it is offering. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] janitor: reiserfs: super.c - vfree() checking cleanupsJames Lamanna1-2/+1
super.c vfree() checking cleanups. Signed-off-by: James Lamanna <jlamanna@gmail.com> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] janitor: fs/dcache.c: list_for_each*Domen Puncer1-12/+4
First one is list_for_each_entry (thanks maks), second 2 list_for_each_safe. Signed-off-by: Maximilian Attems <janitor@sternwelten.at> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] janitor: fs/namespace.c: list_for_each_entryDomen Puncer1-3/+1
Make code more readable with list_for_each_entry. Signed-off-by: Maximilian Attems <janitor@sternwelten.at> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
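For readers unfamiliar with the idiom, here is a simplified userspace imitation of an intrusive list and an entry iterator. It is a sketch, not the kernel macro: the kernel version infers the entry type itself, while this one takes the type as an explicit argument to stay within standard C, and all `sketch_` names are made up.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Recover the enclosing structure from a pointer to its list member. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk every entry on the list; the loop hides the pointer arithmetic. */
#define sketch_list_for_each_entry(pos, head, type, member)              \
    for ((pos) = sketch_container_of((head)->next, type, member);        \
         &(pos)->member != (head);                                       \
         (pos) = sketch_container_of((pos)->member.next, type, member))

struct sketch_mount { int id; struct list_head link; };

void sketch_list_add_tail(struct list_head *node, struct list_head *head)
{
    assert(node && head);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Example user: sum the ids of all mounts on the list. */
int sketch_sum_ids(struct list_head *head)
{
    struct sketch_mount *m;
    int sum = 0;

    sketch_list_for_each_entry(m, head, struct sketch_mount, link)
        sum += m->id;
    return sum;
}
```

The open-coded alternative needs a separate `struct list_head *` cursor plus a manual container lookup in the loop body, which is the boilerplate these janitor patches remove.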
2005-09-10[PATCH] janitor: jffs/intrep: list_for_each_entryDomen Puncer1-13/+9
Use list_for_each_entry to make code more readable. Signed-off-by: Maximilian Attems <janitor@sternwelten.at> Signed-off-by: Domen Puncer <domen@coderock.org> Cc: <jffs-dev@axis.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] sched: TASK_NONINTERACTIVEIngo Molnar1-1/+5
This patch implements a task state bit (TASK_NONINTERACTIVE), which can be used by blocking points to mark the task's wait as "non-interactive". This does not mean the task will be considered a CPU-hog - the wait will simply not have an effect on the waiting task's priority - positive or negative alike. Right now only pipe_wait() will make use of it, because it's a common source of not-so-interactive waits (kernel compilation jobs, etc.). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] spinlock consolidationIngo Molnar1-0/+1
This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things:

 - consolidates and enhances the spinlock/rwlock debugging code
 - simplifies the asm/spinlock.h files
 - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code.
 - cleans up the spinlock code hierarchy to get rid of the spaghetti.

Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds)

Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups.

The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers:

 include/asm-i386/spinlock_types.h | 16
 include/asm-x86_64/spinlock_types.h | 16

I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is:

   SMP                         |  UP
   ----------------------------|-----------------------------------
   asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
   linux/spinlock_types.h      |  linux/spinlock_types.h
   asm/spinlock_smp.h          |  linux/spinlock_up.h
   linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
   linux/spinlock.h            |  linux/spinlock.h

/*
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the __raw_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */

All SMP and UP architectures are converted by this patch.

arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine.

From: Grant Grundler <grundler@parisc-linux.org>

Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary.

I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/*.h and asm/*.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them.

If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word).
From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10[PATCH] ntfs build fixAndrew Morton1-0/+1
*** Warning: "bit_spin_lock" [fs/ntfs/ntfs.ko] undefined! *** Warning: "bit_spin_unlock" [fs/ntfs/ntfs.ko] undefined! Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09Preempt-safe RCU file usageLinus Torvalds1-0/+2
Fix up fs/compat.c fixes.
2005-09-09Fix up lost patch in compat_sys_select() for new RCU files world orderLinus Torvalds1-1/+3
Andrew lost this in patch reject resolution, and never noticed, since the compat code isn't in use on x86. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] Lost sockfd_put() in routing_ioctl()Kirill Korotaev1-2/+5
This patch adds lost sockfd_put() in 32bit compat routing_ioctl() on 64bit platforms. Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-Off-By: Maxim Giryaev <gem@sw.ru> Signed-off-By: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] more SPIN_LOCK_UNLOCKED -> DEFINE_SPINLOCK conversionsIngo Molnar1-1/+1
This converts the final 20 SPIN_LOCK_UNLOCKED holdouts to DEFINE_SPINLOCK. (another 580 places are already using DEFINE_SPINLOCK). Build tested on x86. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE: don't allow restarting of system callsMiklos Szeredi5-111/+45
This patch removes ability to interrupt and restart operations while there hasn't been any side-effect. The reason: applications. There are some apps it seems that generate signals at a fast rate. This means, that if the operation cannot make enough progress between two signals, it will be restarted for ever. This bug actually manifested itself with 'krusader' trying to open a file for writing under sshfs. Thanks to Eduard Czimbalmos for the report. The problem can be solved just by making open() uninterruptible, because in this case it was the truncate operation that slowed down the progress. But it's better to solve this by simply not allowing interrupts at all (except SIGKILL), because applications don't expect file operations to be interruptible anyway. As an added bonus the code is simplified somewhat. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE: add fsync operation for directoriesMiklos Szeredi3-4/+29
This patch adds a new FSYNCDIR request, which is sent when fsync is called on directories. This operation is available in libfuse 2.3-pre1 or greater. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] fuse: don't update file timesMiklos Szeredi3-4/+9
Don't change mtime/ctime/atime to local time on read/write. Rather invalidate file attributes, so next stat() will force a GETATTR call. Bug reported by Ben Grimm. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] fuse: more flexible cachingMiklos Szeredi3-32/+8
Make data caching behavior selectable on a per-open basis instead of per-mount. Compatibility for the old mount options 'kernel_cache' and 'direct_io' is retained in the userspace library (version 2.4.0-pre1 or later). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] fuse: transfer readdir data through deviceMiklos Szeredi4-84/+69
This patch removes a long lasting "hack" in FUSE, which used a separate channel (a file descriptor referring to a disk-file) to transfer directory contents from userspace to the kernel. The patch adds three new operations (OPENDIR, READDIR, RELEASEDIR), which have semantics and implementation exactly matching the respective file operations (OPEN, READ, RELEASE). This simplifies the directory reading code. Also disk space is not necessary, which can be important in embedded systems. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - direct I/OMiklos Szeredi3-2/+146
This patch adds support for the "direct_io" mount option of FUSE. When this mount option is specified, the page cache is bypassed for read and write operations. This is useful for example, if the filesystem doesn't know the size of files before reading them, or when any kind of caching is harmful. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] fuse: stricter mount option checkingMiklos Szeredi1-2/+11
Check for the presence of all mandatory mount options. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE: tighten check for processes allowed accessMiklos Szeredi3-6/+49
This patch tightens the check for allowing processes to access non-privileged mounts. The rationale is that the filesystem implementation can control the behavior or get otherwise unavailable information of the filesystem user. If the filesystem user process has the same uid, gid, and is not suid or sgid application, then access is safe. Otherwise access is not allowed unless the "allow_other" mount option is given (for which policy is controlled by the userspace mount utility). Thanks to everyone on linux-fsdevel, especially Martin Mares who helped uncover problems with the previous approach. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
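One way to read the rule in this commit as code. This is a sketch under stated assumptions, not the FUSE implementation: the structures and `sketch_fuse_allow()` are hypothetical, and "not suid or sgid" is modelled here as requiring the real, effective and saved ids to all match the mount owner's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical credential and mount descriptions for the sketch. */
struct sketch_cred {
    unsigned int uid, euid, suid;
    unsigned int gid, egid, sgid;
};

struct sketch_fuse_mount {
    unsigned int owner_uid;
    unsigned int owner_gid;
    bool allow_other;       /* models the "allow_other" mount option */
};

/* Decide whether a process may access a non-privileged mount. */
bool sketch_fuse_allow(const struct sketch_fuse_mount *m,
                       const struct sketch_cred *c)
{
    assert(m && c);

    if (m->allow_other)
        return true;        /* policy delegated to the mount utility */

    /* Same user and group as the owner, with no id elevated or saved. */
    return c->uid == m->owner_uid && c->euid == m->owner_uid &&
           c->suid == m->owner_uid &&
           c->gid == m->owner_gid && c->egid == m->owner_gid &&
           c->sgid == m->owner_gid;
}
```

A setuid program started by the owner fails this test because its effective uid differs from the owner's, which is the case the commit text wants to exclude.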
2005-09-09[PATCH] FUSE - readpages operationMiklos Szeredi3-0/+85
This patch adds readpages support to FUSE. With the help of the readpages() operation multiple reads are bundled together and sent as a single request to userspace. This can improve reading performance. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - extended attribute operationsMiklos Szeredi2-0/+195
This patch adds the extended attribute operations to FUSE. The following operations are added: o getxattr o setxattr o listxattr o removexattr Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - mount optionsMiklos Szeredi5-67/+170
This patch adds miscellaneous mount options to the FUSE filesystem. The following mount options are added: o default_permissions: check permissions with generic_permission() o allow_other: allow other users to access files o allow_root: allow root to access files o kernel_cache: don't invalidate page cache on open Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - file operationsMiklos Szeredi5-1/+366
This patch adds the file operations of FUSE. The following operations are added: o open o flush o release o fsync o readpage o commit_write Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - read-write operationsMiklos Szeredi3-17/+378
This patch adds the write filesystem operations of FUSE. The following operations are added: o setattr o symlink o mknod o mkdir o create o unlink o rmdir o rename o link Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - read-only operationsMiklos Szeredi5-1/+555
This patch adds the read-only filesystem operations of FUSE. This contains the following files: o dir.c - directory, symlink and file-inode operations The following operations are added: o lookup o getattr o readlink o follow_link o directory open o readdir o directory release o permission o dentry revalidate o statfs Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - device functionsMiklos Szeredi4-6/+1161
This adds the FUSE device handling functions. This contains the following files: o dev.c - fuse device operations (read, write, release, poll) - registers misc device - support for sending requests to userspace Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - coreMiklos Szeredi3-0/+524
This patch adds FUSE core. This contains the following files:

 o inode.c
    - superblock operations (alloc_inode, destroy_inode, read_inode, clear_inode, put_super, show_options)
    - registers FUSE filesystem
 o fuse_i.h
    - private header file

Requirements
============

The most important difference between ordinary filesystems and FUSE is the fact that the filesystem data/metadata is provided by a userspace process run with the privileges of the mount "owner" instead of the kernel, or some remote entity usually running with elevated privileges.

The security implication of this is that a non-privileged user must not be able to use this capability to compromise the system. Obvious requirements arising from this are:

 - mount owner should not be able to get elevated privileges with the help of the mounted filesystem
 - mount owner should not be able to induce undesired behavior in other users' or the super user's processes
 - mount owner should not get illegitimate access to information from other users' and the super user's processes

These are currently ensured with the following constraints:

 1) mount is only allowed to directory or file which the mount owner can modify without limitation (write access + no sticky bit for directories)
 2) nosuid,nodev mount options are forced
 3) any process running with fsuid different from the owner is denied all access to the filesystem

1) and 2) are ensured by the "fusermount" mount utility which is a setuid root application doing the actual mount operation.

3) is ensured by a check in the permission() method in kernel

I started thinking about doing 3) in a different way because Christoph H. made a big deal out of it, saying that FUSE is unacceptable into mainline in this form.

The suggested use of private namespaces would be OK, but in their current form have many limitations that make their use impractical (as discussed in this thread).

Suggested improvements that would address these limitations:

 - implement shared subtrees
 - allow a process to join an existing namespace (make namespaces first-class objects)
 - implement the namespace creation/joining in a PAM module

With all that in place the check of owner against current->fsuid may be removed from the FUSE kernel module, without compromising the security requirements.

Suid programs still raise interesting questions, since they get access even to the private namespace causing some information leak (exact order/timing of filesystem operations performed), giving some ptrace-like capabilities to unprivileged users. BTW this problem is not strictly limited to the namespace approach, since suid programs setting fsuid and accessing users' files will succeed with the current approach too.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - MAINTAINERS, Kconfig and Makefile changesMiklos Szeredi2-0/+14
This patch adds FUSE filesystem to MAINTAINERS, fs/Kconfig and fs/Makefile. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: fix handling of malformed 9P messagesEric Van Hensbergen4-21/+46
This patch attempts to do a better job of cleaning up after detecting errors on the transport. This should also improve error reporting on broken connections to servers. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: readlink extended mode checkEric Van Hensbergen1-5/+30
LANL reported some issues with random crashes during mount of legacy protocol servers (9P2000 versus 9P2000.u) -- crash was always happening in readlink (which should never happen in legacy mode). Added some sanity conditionals to the get_inode code which should prevent the errors LANL was seeing. Code tested benign through regression. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Fix support for special files (devices, named pipes, etc.)Eric Van Hensbergen1-0/+3
Fix v9fs special files (block, char devices) support. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Clean-up vfs_inode and setattr functionsEric Van Hensbergen2-97/+36
Cleanup code in v9fs vfs_inode as suggested by Alexey Dobriyan. Did some major revamping of the v9fs setattr code to remove unnecessary allocations and clean up some dead-code. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Change error magic numbers to defined constantsEric Van Hensbergen1-81/+77
Change magic error numbers to system defined constants in v9fs error.h As suggested by Jan-Benedict Glaw. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: debug and support routinesEric Van Hensbergen5-0/+642
This part of the patch contains debug and other misc routines. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Support to force umountEric Van Hensbergen5-3/+40
Support for force umount Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: transport modulesEric Van Hensbergen6-5/+979
This part of the patch contains transport routines. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: 9P protocol implementationEric Van Hensbergen4-0/+1429
This part of the patch contains the 9P protocol functions. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: VFS superblock operations and glueEric Van Hensbergen4-0/+877
This part of the patch contains VFS superblock and mapping code. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: VFS inode operationsEric Van Hensbergen1-0/+1371
This part of the patch contains the VFS inode interfaces. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: VFS file, dentry, and directory operationsEric Van Hensbergen3-0/+753
This part of the patch contains the VFS file, dentry & directory interfaces. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Documentation, Makefiles, ConfigurationEric Van Hensbergen3-0/+29
OVERVIEW V9FS is a distributed file system for Linux which provides an implementation of the Plan 9 resource sharing protocol 9P. It can be used to share all sorts of resources: static files, synthetic file servers (such as /proc or /sys), devices, and application file servers (such as FUSE). BACKGROUND Plan 9 (http://plan9.bell-labs.com/plan9) is a research operating system and associated applications suite developed by the Computing Science Research Center of AT&T Bell Laboratories (now a part of Lucent Technologies), the same group that developed UNIX , C, and C++. Plan 9 was initially released in 1993 to universities, and then made generally available in 1995. Its core operating systems code laid the foundation for the Inferno Operating System released as a product by Lucent Bell-Labs in 1997. The Inferno venture was the only commercial embodiment of Plan 9 and is currently maintained as a product by Vita Nuova (http://www.vitanuova.com). After updated releases in 2000 and 2002, Plan 9 was open-sourced under the OSI approved Lucent Public License in 2003. The Plan 9 project was started by Ken Thompson and Rob Pike in 1985. Their intent was to explore potential solutions to some of the shortcomings of UNIX in the face of the widespread use of high-speed networks to connect machines. In UNIX, networking was an afterthought and UNIX clusters became little more than a network of stand-alone systems. Plan 9 was designed from first principles as a seamless distributed system with integrated secure network resource sharing. Applications and services were architected in such a way as to allow for implicit distribution across a cluster of systems. Configuring an environment to use remote application components or services in place of their local equivalent could be achieved with a few simple command line instructions. For the most part, application implementations operated independent of the location of their actual resources. 
Commercial operating systems haven't changed much in the 20 years since Plan 9 was conceived. Network and distributed systems support is provided by a patchwork of middle-ware, with an endless number of packages supplying pieces of the puzzle. Matters are complicated by the use of different complicated protocols for individual services, and separate implementations for kernel and application resources.

The V9FS project (http://v9fs.sourceforge.net) is an attempt to bring Plan 9's unified approach to resource sharing to Linux and other operating systems via support for the 9P2000 resource sharing protocol.

V9FS HISTORY

V9FS was originally developed by Ron Minnich and Maya Gokhale at Los Alamos National Labs (LANL) in 1997. In November of 2001, Greg Watson set up a SourceForge project as a public repository for the code which supported the Linux 2.4 kernel.

About a year ago, I picked up the initial attempt Ron Minnich had made to provide 2.6 support and got the code integrated into a 2.6.5 kernel. I then went through a line-for-line rewrite, attempting to clean up the code while more closely following the Linux Kernel style guidelines. I co-authored a paper with Ron Minnich on the V9FS Linux support including performance comparisons to NFSv3 using Bonnie and PostMark - this paper appeared at the USENIX/FREENIX 2005 conference in April 2005: ( http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html ).

CALL FOR PARTICIPATION/REQUEST FOR COMMENTS

Our 2.6 kernel support is stabilizing and we'd like to begin pursuing its integration into the official kernel tree. We would appreciate any review, comments, critiques, and additions from this community and are actively seeking people to join our project and help us produce something that would be acceptable and useful to the Linux community.

STATUS

The code is reasonably stable, although there are no doubt corner cases our regression tests haven't discovered yet.
It is in regular use by several of the developers and has been tested on x86 and PowerPC (32-bit and 64-bit) in both small and large (LANL cluster) deployments. Our current regression tests include fsx, bonnie, and postmark.

It was our intention to keep things as simple as possible for this release -- trying to focus on correctness within the core of the protocol support versus a rich set of features. For example: a more complete security model and cache layer are in the road map, but excluded from this release. Additionally, we have removed support for mmap operations at Al Viro's request.

PERFORMANCE

Detailed performance numbers and analysis are included in the FREENIX paper, but we show comparable performance to NFSv3 for large file operations based on the Bonnie benchmark, and superior performance for many small file operations based on the PostMark benchmark. Somewhat preliminary graphs (from the FREENIX paper) are available (http://v9fs.sourceforge.net/perf/index.html).

RESOURCES

The source code is available in a few different forms:

  tarballs: http://v9fs.sf.net
  CVSweb: http://cvs.sourceforge.net/viewcvs.py/v9fs/linux-9p/
  CVS: :pserver:anonymous@cvs.sourceforge.net:/cvsroot/v9fs/linux-9p
  Git: rsync://v9fs.graverobber.org/v9fs (webgit: http://v9fs.graverobber.org)
  9P: tcp!v9fs.graverobber.org!6564

The user-level server is available from either the Plan 9 distribution or from http://v9fs.sf.net. Other support applications are still being developed, but preliminary versions can be downloaded from sourceforge.

Documentation on the protocol has historically been the Plan 9 Man pages (http://plan9.bell-labs.com/sys/man/5/INDEX.html), but there is an effort under way to write a more complete Internet-Draft style specification (http://v9fs.sf.net/rfc).

There are a couple of mailing lists supporting v9fs, but the most used is v9fs-developer@lists.sourceforge.net -- please direct/cc your comments there so the other v9fs contributors can participate in the conversation.
There is also an IRC channel: irc://freenode.net/#v9fs

This part of the patch contains Documentation, Makefiles, and configuration file changes.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] files: lock-free fd look-upDipankar Sarma3-19/+27
With the use of RCU in files structure, the look-up of files using fds can now be lock-free. The lookup is protected by rcu_read_lock()/rcu_read_unlock(). This patch changes the readers to use lock-free lookup. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] files: files struct with RCUDipankar Sarma5-150/+303
Patch to eliminate the struct files_struct.file_lock spinlock on the reader side and use the RCU refcounting rcuref_xxx API for the f_count refcounter. The updates to the fdtable are done by allocating a new fdtable structure and setting files->fdt to point to the new structure. The fdtable structure is protected by RCU, thereby allowing lock-free lookup. For fd arrays/sets that are vmalloced, we use keventd to free them since RCU callbacks can't sleep. A global list of fdtables to be freed is not scalable, so we use a per-cpu list. If keventd is already handling the current cpu's work, we use a timer to defer queueing of that work. Since the last publication, this patch has been rewritten to avoid using explicit memory barriers and to use the rcu_assign_pointer() and rcu_dereference() primitives instead. This required that the fd information be kept in a separate structure (fdtable) and updated atomically. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
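The publish-by-pointer scheme described in this commit message can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions, not the kernel code: struct fdtable_sketch, struct files_sketch, lookup_fd() and expand_table() are hypothetical names, a release store stands in for rcu_assign_pointer(), an acquire load stands in for rcu_dereference(), and the deferred freeing of the old table (the keventd/per-cpu list part) is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fdtable: the array and its size live in
 * one structure, so a reader can never pair a new size with an old array. */
struct fdtable_sketch {
    unsigned int max_fds;
    void **fd;                      /* max_fds slots */
};

struct files_sketch {
    _Atomic(struct fdtable_sketch *) fdt;
};

/* Reader side: a single acquire load yields a consistent (size, array)
 * pair without taking any lock. */
static void *lookup_fd(struct files_sketch *files, unsigned int fd)
{
    struct fdtable_sketch *fdt =
        atomic_load_explicit(&files->fdt, memory_order_acquire);

    if (fdt == NULL || fd >= fdt->max_fds)
        return NULL;
    return fdt->fd[fd];
}

/* Writer side: build a larger copy, then publish it with one release
 * store. Returns the old table; the caller must only free it once no
 * reader can still be using it (the grace period is not modelled here). */
static struct fdtable_sketch *expand_table(struct files_sketch *files,
                                           unsigned int new_max)
{
    struct fdtable_sketch *old =
        atomic_load_explicit(&files->fdt, memory_order_relaxed);
    struct fdtable_sketch *nfdt = malloc(sizeof(*nfdt));

    if (nfdt == NULL)
        abort();
    nfdt->max_fds = new_max;
    nfdt->fd = calloc(new_max, sizeof(*nfdt->fd));
    if (nfdt->fd == NULL)
        abort();

    if (old != NULL) {
        assert(new_max >= old->max_fds);   /* the sketch only grows tables */
        for (unsigned int i = 0; i < old->max_fds; i++)
            nfdt->fd[i] = old->fd[i];
    }

    atomic_store_explicit(&files->fdt, nfdt, memory_order_release);
    return old;
}
```

Because the size and the array travel together behind one pointer, the reader needs no barrier beyond the acquire load, which is the property the commit message attributes to keeping the fd information in a separate structure.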
2005-09-09[PATCH] files: break up files structDipankar Sarma8-63/+104
In order for RCU to work, the file table array, sets and their sizes must be updated atomically. Instead of ensuring this through too many memory barriers, we put the arrays and their sizes in a separate structure. This patch takes the first step of putting the file table elements in a separate structure fdtable that is embedded within files_struct. It also changes all the users to refer to the file table using the files_fdtable() macro. Subsequent application of RCU becomes easier after this. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] aio: kiocb locking to serialise retry and cancelBenjamin LaHaise1-4/+25
Implement a per-kiocb lock to serialise retry operations and cancel. This is done using wait_on_bit_lock() on the KIF_LOCKED bit of kiocb->ki_flags. Also, make the cancellation path lock the kiocb and subsequently release all references to it if the cancel was successful. This version includes a fix for the deadlock with __aio_run_iocbs. Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] change io_cancel return code for no cancel caseWendy Cheng1-1/+1
Note that, with a few exceptions, most of the current filesystems and/or drivers do not have aio cancel specifically defined (the kiocb->ki_cancel field is mostly NULL). However, the sys_io_cancel system call universally sets the return code to -EAGAIN. This gives applications the wrong impression that this call is implemented but just never works. We have had customer inquiries about this issue. Changed by Benjamin LaHaise to EINVAL instead of ENOSYS Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] bfs: fix endianness, signedness; add trivial bugfixAndrew Stribblehill4-70/+81
* Makes BFS code endianness-clean.
* Fixes some signedness warnings.
* Fixes a problem in fs/bfs/inode.c:164 where inodes not synced to disk don't get fully marked as clean. Here's how to reproduce it:

  # mount -o loop -t bfs /bfs.img /mnt
  # df -i /mnt
  Filesystem  Inodes  IUsed  IFree  IUse%  Mounted on
  /bfs.img        48      1     47     3%  /mnt
  # df -k /mnt
  Filesystem  1K-blocks  Used  Available  Use%  Mounted on
  /bfs.img          512     5        508    1%  /mnt
  # cp 60k-archive.zip /mnt/mt.zip
  # df -k /mnt
  Filesystem  1K-blocks  Used  Available  Use%  Mounted on
  /bfs.img          512    65        447   13%  /mnt
  # df -i /mnt
  Filesystem  Inodes  IUsed  IFree  IUse%  Mounted on
  /bfs.img        48      2     46     5%  /mnt
  # rm /mnt/mt.zip
  # echo $?
  0

  [If the unlink happens before the buffers flush, the following happens:]

  # df -i /mnt
  Filesystem  Inodes  IUsed  IFree  IUse%  Mounted on
  /bfs.img        48      2     46     5%  /mnt
  # df -k /mnt
  Filesystem  1K-blocks  Used  Available  Use%  Mounted on
  /bfs.img          512    65        447   13%  /mnt

fs/bfs/bfs.h | 1

Signed-off-by: Andrew Stribblehill <ads@wompom.org>
Cc: <tigran@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] autofs: fix "busy inodes after umount..."Alexander Krizhanovsky3-4/+7
This patch for old autofs (version 3) cleans up dentries which are not put after the automount daemon is killed (it is the analogue of a recent patch for autofs4). Signed-off-by: Alexander Krizhanovsky <klx@yandex.ru> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] remove the inode_post_link and inode_post_rename LSM hooksStephen Smalley1-8/+2
This patch removes the inode_post_link and inode_post_rename LSM hooks as they are unused (and likely useless). Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] Remove security_inode_post_create/mkdir/symlink/mknod hooksStephen Smalley1-12/+4
This patch removes the inode_post_create/mkdir/mknod/symlink LSM hooks as they are obsoleted by the new inode_init_security hook that enables atomic inode security labeling. If anyone sees any reason to retain these hooks, please speak now. Also, is anyone using the post_rename/link hooks; if not, those could also be removed. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] ext3: Enable atomic inode security labelingStephen Smalley3-0/+38
This patch modifies ext3 to call the inode_init_security LSM hook to obtain the security attribute for a newly created inode and to set the resulting attribute on the new inode as part of the same transaction. This parallels the existing processing for setting ACLs on newly created inodes. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] ext2: Enable atomic inode security labelingStephen Smalley3-0/+35
This patch modifies ext2 to call the inode_init_security LSM hook to obtain the security attribute for a newly created inode and to set the resulting attribute on the new inode. This parallels the existing processing for setting ACLs on newly created inodes. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] update filesystems for new delete_inode behaviorMark Fasheh19-0/+29
Update the file systems in fs/ implementing a delete_inode() callback to call truncate_inode_pages(). One implementation note: In developing this patch I put the calls to truncate_inode_pages() at the very top of those filesystems' delete_inode() callbacks in order to retain the previous behavior. I'm guessing that some of those could probably be optimized. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] move truncate_inode_pages() into ->delete_inode()Mark Fasheh1-5/+7
Allow file systems supporting ->delete_inode() to call truncate_inode_pages() on their own. OCFS2 wants this so it can query the cluster before making a final decision on whether to wipe an inode from disk or not. In some corner cases an inode marked on the local node via voting may not actually get orphaned. A good example is node death before the transaction moving the inode to the orphan dir commits to the journal. Without this patch, the truncate_inode_pages() call in generic_delete_inode() would discard valid data for such inodes. During earlier discussion in the 2.6.13 merge plan thread, Christoph Hellwig indicated that other file systems might also find this useful. IMHO, the best solution would be to just allow ->drop_inode() to do the cluster query but it seems that would require a substantial reworking of that section of the code. Assuming it is safe to call write_inode_now() in ocfs2_delete_inode() for those inodes which won't actually get wiped, this solution should get us by for now. Trivial testing of this patch (and a related OCFS2 update) has shown this to avoid the corruption I'm seeing. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] bogus cast in bio.cviro@ZenIV.linux.org.uk1-1/+1
<qualifier> void * is not the same as void <qualifier> *... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 Anton Altaparmakov1-2/+13
2005-09-09[XFS] Revert recent quota Makefile change, not in a fit state for merging.Nathan Scott1-2/+13
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08Merge branch 'master' of /usr/src/linux-2.6 Anton Altaparmakov9-59/+58
2005-09-08NTFS: 2.1.24 release and some minor final fixes.Anton Altaparmakov4-9/+9
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Improve scalability by changing the driver global spin lock inAnton Altaparmakov2-6/+12
fs/ntfs/aops.c::ntfs_end_buffer_async_read() to a bit spin lock in the first buffer head of a page. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix page_has_buffers()/page_buffers() handling in fs/ntfs/aops.c.Anton Altaparmakov2-17/+22
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fixup handling of sparse, compressed, and encrypted attributes inAnton Altaparmakov2-20/+26
fs/ntfs/aops.c::ntfs_readpage(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix fs/ntfs/aops.c::ntfs_{read,write}_block() to handle the caseAnton Altaparmakov2-11/+42
where a concurrent truncate has truncated the runlist under our feet. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Optimize fs/ntfs/aops.c::ntfs_write_block() by extending the pageAnton Altaparmakov2-10/+6
lock protection over the buffer submission for i/o which allows the removal of the get_bh()/put_bh() pairs for each buffer. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fixup handling of sparse, compressed, and encrypted attributes inAnton Altaparmakov2-57/+49
fs/ntfs/aops.c::ntfs_writepage(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Make ntfs_write_block() not instantiate sparse blocks if they are zero.Anton Altaparmakov2-0/+23
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fixup handling of sparse, compressed, and encrypted attributes inAnton Altaparmakov2-101/+116
fs/ntfs/inode.c::ntfs_read_locked_{,attr_,index_}inode(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Truncate {a,c,m}time to the ntfs supported time granularity whenAnton Altaparmakov2-5/+9
updating the times in the inode in ntfs_setattr(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
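As a rough illustration of what truncating to the filesystem's time granularity means: NTFS stores timestamps in units of 100 nanoseconds, so any finer detail in a struct timespec is lost once the time reaches disk. The helper below is a userspace sketch with hypothetical names (NTFS_TIME_GRANULARITY_NS, truncate_to_ntfs_granularity()); it is not the function the driver uses.

```c
#include <assert.h>
#include <time.h>

/* NTFS records times in 100 ns units (macro name is hypothetical). */
#define NTFS_TIME_GRANULARITY_NS 100L

/* Round the nanosecond part down to a multiple of the granularity so the
 * in-memory time matches what can actually be represented on disk. */
static struct timespec truncate_to_ntfs_granularity(struct timespec ts)
{
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    ts.tv_nsec -= ts.tv_nsec % NTFS_TIME_GRANULARITY_NS;
    return ts;
}
```

Truncating at update time keeps the cached inode times identical to the values that will be read back from disk later, so a timestamp does not appear to change after the inode is evicted and reloaded.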
2005-09-08NTFS: Fix cluster (de)allocators to work when the runlist is NULL and moreAnton Altaparmakov4-33/+32
importantly to take a locked runlist rather than them locking it which leads to lock reversal. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix handling of sparse attributes in ntfs_attr_make_non_resident().Anton Altaparmakov2-17/+40
Also, add BUG() checks to ntfs_attr_make_non_resident() and ntfs_attr_set() to ensure that these functions are never called for compressed or encrypted attributes. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix several bugs in fs/ntfs/attrib.c.Anton Altaparmakov2-1/+37
- Fix a bug in ntfs_map_runlist_nolock() where we forgot to protect access to the allocated size in the ntfs inode with the size lock. - Fix ntfs_attr_vcn_to_lcn_nolock() and ntfs_attr_find_vcn_nolock() to return LCN_ENOENT when there is no runlist and the allocated size is zero. - Fix load_attribute_list() to handle the case of a NULL runlist. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Add fs/ntfs/attrib.[hc]::ntfs_resident_attr_value_resize().Anton Altaparmakov3-0/+43
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Remove bogus setting of PageError in ntfs_read_compressed_block().Anton Altaparmakov3-10/+8
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix a bug in fs/ntfs/index.c::ntfs_index_lookup(). When the returnedAnton Altaparmakov2-0/+4
index entry is in the index root, we forgot to set the @ir pointer in the index context. Thanks to Yura Pakhuchiy for finding this bug. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Add ntfs_rl_punch_nolock() which punches a caller specified hole into ↵Anton Altaparmakov3-0/+289
a runlist. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Change ntfs_rl_truncate_nolock() to throw away the runlist if the newAnton Altaparmakov2-1/+15
length is zero. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Report unrepresentable inodes during ntfs_readdir() as KERN_WARNINGAnton Altaparmakov3-2/+7
messages and include the inode number. Thanks to Yura Pakhuchiy for pointing this out. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix handling of valid but empty mapping pairs array inAnton Altaparmakov2-0/+5
fs/ntfs/runlist.c::ntfs_mapping_pairs_decompress(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Remove two bogus BUG_ON()s from fs/ntfs/mft.c.Anton Altaparmakov2-2/+1
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Fix two nasty runlist merging bugs that had gone unnoticed so far.Anton Altaparmakov2-2/+5
Thanks to Stefano Picerno for the bug report. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Use ntfs_malloc_nofs_nofail() in runlist.c::ntfs_runlists_merge()Anton Altaparmakov2-18/+53
in the two critical regions. This means we no longer need to panic() when the allocation fails as it now cannot fail. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Allow highmem kmalloc() in ntfs_malloc_nofs() and add _nofail() version.Anton Altaparmakov2-6/+48
- Modify fs/ntfs/malloc.h::ntfs_malloc_nofs() to do the kmalloc() based allocations with __GFP_HIGHMEM, analogous to how the vmalloc() based allocations are done. - Add fs/ntfs/malloc.h::ntfs_malloc_nofs_nofail() which is analogous to ntfs_malloc_nofs() but it performs allocations with __GFP_NOFAIL and hence cannot fail. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08NTFS: Support more clean journal ($LogFile) states.Anton Altaparmakov5-123/+167
- Support journals ($LogFile) which have been modified by chkdsk. This means users can boot into Windows after we marked the volume dirty. The Windows boot will run chkdsk and then reboot. The user can then immediately boot into Linux rather than having to do a full Windows boot first before rebooting into Linux and we will recognize such a journal and empty it as it is clean by definition. - Support journals ($LogFile) with only one restart page as well as journals with two different restart pages. We sanity check both and either use the only sane one or the more recent one of the two in the case that both are valid. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
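The restart page selection rule described above (use the only sane page, or the more recent of two sane pages) can be sketched as follows. This is a hedged illustration: struct restart_page_sketch and pick_restart_page() are hypothetical, and comparing a log sequence number to decide which page is "more recent" is an assumption of the sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one $LogFile restart page after sanity checks. */
struct restart_page_sketch {
    bool sane;        /* passed the sanity check */
    uint64_t lsn;     /* assumed recency marker: larger means newer */
};

/* Returns the page to use, or NULL if neither page is usable.
 * Either argument may be NULL to model a journal with one restart page. */
static const struct restart_page_sketch *
pick_restart_page(const struct restart_page_sketch *first,
                  const struct restart_page_sketch *second)
{
    bool ok1 = first != NULL && first->sane;
    bool ok2 = second != NULL && second->sane;

    if (ok1 && ok2)                   /* both valid: take the more recent */
        return second->lsn > first->lsn ? second : first;
    if (ok1)                          /* only one sane page: use it */
        return first;
    if (ok2)
        return second;
    return NULL;                      /* nothing usable */
}
```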
2005-09-08[XFS] Fix modular XFS builds (Makefile botch).Nathan Scott2-5/+5
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08[XFS] Remove special Kconfig XFS menu, make XFS options "inline".Nathan Scott1-24/+21
Signed-off-by: Eric Sandeen <sandeen@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08[XFS] Cleanup some -Wundef flag warnings in the endian macros (thanksNathan Scott6-30/+32
Christoph). SGI-PV: 942400 SGI-Modid: xfs-linux-melb:xfs-kern:23771a Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-07Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-for-linus-2.6 Linus Torvalds1-45/+182
2005-09-07Merge git://oss.sgi.com:8090/oss/git/xfs-2.6 Linus Torvalds52-1050/+1320
2005-09-07[PATCH] pivot_root() circular reference fixMiklos Szeredi1-0/+4
Fix http://bugzilla.kernel.org/show_bug.cgi?id=4857 When pivot_root is called from an init script in an initramfs environment, it causes a circular reference in the mount tree. The cause of this is that pivot_root() is not prepared to handle pivoting an unattached mount. In an initramfs environment, rootfs is the root of the namespace, and so it is not attached. This patch fixes this and related problems, by returning -EINVAL if either the current root or the new root is detached. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Cc: <bigfish@asmallpond.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
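The check this fix adds can be pictured with a small userspace sketch. It assumes, for illustration only, that a mount which is not attached anywhere is represented as being its own parent; struct mount_sketch, is_attached() and pivot_root_check() are hypothetical names and do not reproduce the kernel's data structures.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mount node: a detached mount (such as rootfs at the top of
 * the namespace) is modelled as pointing to itself as parent. */
struct mount_sketch {
    struct mount_sketch *parent;
};

static bool is_attached(const struct mount_sketch *m)
{
    return m->parent != m;
}

/* Refuse to pivot when either the current root or the new root is
 * detached, instead of building a circular mount tree. */
static int pivot_root_check(const struct mount_sketch *cur_root,
                            const struct mount_sketch *new_root)
{
    if (!is_attached(cur_root) || !is_attached(new_root))
        return -EINVAL;
    return 0;
}
```

In the initramfs case from the bug report, the current root is rootfs itself, so a check of this shape fails early with -EINVAL rather than letting the mount tree become circular.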
2005-09-07[PATCH] Fix race in do_get_write_access()Jan Kara1-18/+21
The attached patch should fix the following race:

  Proc 1                              Proc 2

  __flush_batch()
    ll_rw_block()
                                      do_get_write_access()
                                        lock_buffer
                                          jh is only waiting for checkpoint
                                          -> b_transaction == NULL ->
                                          do nothing
                                        unlock_buffer
      test_set_buffer_locked()
      test_clear_buffer_dirty()
                                        __journal_file_buffer()
                                        change the data
      submit_bh()

and we have sent wrong data to disk...

We now clean the dirty buffer flag under buffer lock in all cases and hence we know that whenever a buffer is starting to be journaled we either finish the pending write-out before attaching a buffer to a transaction or we won't write the buffer until the transaction is going to be committed.

The test in jbd_unexpected_dirty_buffer() is redundant - remove it. Furthermore we have to clear the buffer dirty bit under the buffer lock to prevent races with buffer write-out (and hence prevent returning a buffer with IO happening).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] Change HFS+ to not use ll_rw_block()Jan Kara1-4/+2
Use block layer predefined function. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] Change ll_rw_block() calls in UFSJan Kara3-18/+9
We need to be sure that current data are sent to disk. Hence we call ll_rw_block() with SWRITE. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] Change ll_rw_block() calls in ReiserJan Kara1-2/+2
We need to be sure that current data in buffer are sent to disk. Hence we need to call ll_rw_block() with SWRITE. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] Change ll_rw_block() calls in JBDJan Kara4-5/+5
We must be sure that the current data in buffer are sent to disk. Hence we have to call ll_rw_block() with SWRITE. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] Make ll_rw_block() wait for buffer lockJan Kara1-14/+16
Introduce new ll_rw_block() operation SWRITE meaning that block layer should wait for the buffer lock and write-out afterwards. Hence data in buffers at the time of call are guaranteed to be submitted to the disk. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
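The difference between a plain write request and the new SWRITE can be illustrated with a small userspace model. Everything here is a sketch under assumptions: struct buffer_sketch, enum rw_mode_sketch and write_buffer() are hypothetical, a C11 atomic_flag stands in for the buffer lock, and "submission" is reduced to a counter. It relies on the pre-existing behaviour that a plain write skips a buffer that is already locked, whereas SWRITE waits for the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum rw_mode_sketch { MODE_WRITE, MODE_SWRITE };

/* Hypothetical buffer: a spin flag models the buffer lock. */
struct buffer_sketch {
    atomic_flag locked;
    bool dirty;
    int submitted;     /* number of times the buffer was sent for I/O */
};

/* Returns true if the buffer was submitted for write-out. */
static bool write_buffer(struct buffer_sketch *bh, enum rw_mode_sketch mode)
{
    if (mode == MODE_SWRITE) {
        /* SWRITE: wait until the lock is free, then take it. */
        while (atomic_flag_test_and_set(&bh->locked))
            ;
    } else if (atomic_flag_test_and_set(&bh->locked)) {
        /* Plain write: the buffer is locked by someone else, skip it. */
        return false;
    }

    /* Under the lock: clear the dirty bit and submit if needed. */
    bool submit = bh->dirty;
    if (submit) {
        bh->dirty = false;
        bh->submitted++;
    }
    atomic_flag_clear(&bh->locked);   /* the real lock is held until I/O ends */
    return submit;
}
```

With the plain mode, a caller that must get the current contents to disk cannot tell a skipped buffer from a written one; waiting for the lock removes that ambiguity, which is why the JBD, Reiser and UFS callers in the neighbouring commits switch to SWRITE.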
2005-09-07[PATCH] Fix JBD race in t_forget list handlingJan Kara1-10/+24
Fix a race between journal_commit_transaction() and other places, such as journal_unmap_buffer(), that add buffers to the transaction's t_forget list. We have to protect against such places by holding j_list_lock even when traversing the t_forget list. The fact that other places can only add buffers to the list makes the locking easier. OTOH the lock ranking complicates the stuff... Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>