aboutsummaryrefslogtreecommitdiffstats
path: root/fs/Makefile
AgeCommit message (Collapse)AuthorFilesLines
2024-03-11Merge tag 'vfs-6.9.pidfd' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pdfd updates from Christian Brauner: - Until now pidfds could only be created for thread-group leaders but not for threads. There was no technical reason for this. We simply had no users that needed support for this. Now we do have users that need support for this. This introduces a new PIDFD_THREAD flag for pidfd_open(). If that flag is set pidfd_open() creates a pidfd that refers to a specific thread. In addition, we now allow clone() and clone3() to be called with CLONE_PIDFD | CLONE_THREAD which wasn't possible before. A pidfd that refers to an individual thread differs from a pidfd that refers to a thread-group leader: (1) Pidfds are pollable. A task may poll a pidfd and get notified when the task has exited. For thread-group leader pidfds the polling task is woken if the thread-group is empty. In other words, if the thread-group leader task exits when there are still threads alive in its thread-group the polling task will not be woken when the thread-group leader exits but rather when the last thread in the thread-group exits. For thread-specific pidfds the polling task is woken if the thread exits. (2) Passing a thread-group leader pidfd to pidfd_send_signal() will generate thread-group directed signals like kill(2) does. Passing a thread-specific pidfd to pidfd_send_signal() will generate thread-specific signals like tgkill(2) does. The default scope of the signal is thus determined by the type of the pidfd. Since use-cases exist where the default scope of the provided pidfd needs to be overriden the following flags are added to pidfd_send_signal(): - PIDFD_SIGNAL_THREAD Send a thread-specific signal. - PIDFD_SIGNAL_THREAD_GROUP Send a thread-group directed signal. - PIDFD_SIGNAL_PROCESS_GROUP Send a process-group directed signal. The scope change will only work if the struct pid is actually used for this scope. For example, in order to send a thread-group directed signal the provided pidfd must be used as a thread-group leader and similarly for PIDFD_SIGNAL_PROCESS_GROUP the struct pid must be used as a process group leader. - Move pidfds from the anonymous inode infrastructure to a tiny pseudo filesystem. This will unblock further work that we weren't able to do simply because of the very justified limitations of anonymous inodes. Moving pidfds to a tiny pseudo filesystem allows for statx on pidfds to become useful for the first time. They can now be compared by inode number which are unique for the system lifetime. Instead of stashing struct pid in file->private_data we can now stash it in inode->i_private. This makes it possible to introduce concepts that operate on a process once all file descriptors have been closed. A concrete example is kill-on-last-close. Another side-effect is that file->private_data is now freed up for per-file options for pidfds. Now, each struct pid will refer to a different inode but the same struct pid will refer to the same inode if it's opened multiple times. In contrast to now where each struct pid refers to the same inode. The tiny pseudo filesystem is not visible anywhere in userspace exactly like e.g., pipefs and sockfs. There's no lookup, there's no complex inode operations, nothing. Dentries and inodes are always deleted when the last pidfd is closed. We allocate a new inode and dentry for each struct pid and we reuse that inode and dentry for all pidfds that refer to the same struct pid. The code is entirely optional and fairly small. If it's not selected we fallback to anonymous inodes. Heavily inspired by nsfs. The dentry and inode allocation mechanism is moved into generic infrastructure that is now shared between nsfs and pidfs. The path_from_stashed() helper must be provided with a stashing location, an inode number, a mount, and the private data that is supposed to be used and it will provide a path that can be passed to dentry_open(). The helper will try retrieve an existing dentry from the provided stashing location. If a valid dentry is found it is reused. If not a new one is allocated and we try to stash it in the provided location. If this fails we retry until we either find an existing dentry or the newly allocated dentry could be stashed. Subsequent openers of the same namespace or task are then able to reuse it. - Currently it is only possible to get notified when a task has exited, i.e., become a zombie and userspace gets notified with EPOLLIN. We now also support waiting until the task has been reaped, notifying userspace with EPOLLHUP. - Ensure that ESRCH is reported for getfd if a task is exiting instead of the confusing EBADF. - Various smaller cleanups to pidfd functions. * tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) libfs: improve path_from_stashed() libfs: add stashed_dentry_prune() libfs: improve path_from_stashed() helper pidfs: convert to path_from_stashed() helper nsfs: convert to path_from_stashed() helper libfs: add path_from_stashed() pidfd: add pidfs pidfd: move struct pidfd_fops pidfd: allow to override signal scope in pidfd_send_signal() pidfd: change pidfd_send_signal() to respect PIDFD_THREAD signal: fill in si_code in prepare_kill_siginfo() selftests: add ESRCH tests for pidfd_getfd() pidfd: getfd should always report ESRCH if a task is exiting pidfd: clone: allow CLONE_THREAD | CLONE_PIDFD together pidfd: exit: kill the no longer used thread_group_exited() pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN)) pid: kill the obsolete PIDTYPE_PID code in transfer_pid() pidfd: kill the no longer needed do_notify_pidfd() in de_thread() pidfd_poll: report POLLHUP when pid_task() == NULL pidfd: implement PIDFD_THREAD flag for pidfd_open() ...
2024-02-28pidfd: move struct pidfd_fopsChristian Brauner1-1/+1
Move the pidfd file operations over to their own file in preparation of implementing pidfs and to isolate them from other mostly unrelated functionality in other files. Link: https://lore.kernel.org/r/20240213-vfs-pidfd_fs-v1-1-f863f58cfce1@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-24fs: Remove NTFS classicMatthew Wilcox (Oracle)1-1/+0
The replacement, NTFS3, was merged over two years ago. It is now time to remove the original from the tree as it is the last user of several APIs, and it is not worth changing. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240115072025.2071931-1-willy@infradead.org Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Dave Chinner <david@fromorbit.com> Cc: Anton Altaparmakov <anton@tuxera.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-19Merge tag 'vfs-6.8.netfs' of ↵Linus Torvalds1-1/+0
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use to replace their own implementations. Both afs and 9p are ported. cifs is ready as well but the patches are way bigger and will be routed separately once this is merged. That will remove lots of code as well. The overal goal is to get high-level I/O and knowledge of the page cache and ouf of the filesystem drivers. This includes knowledge about the existence of pages and folios The pull request converts afs and 9p. This removes about 800 lines of code from afs and 300 from 9p. For 9p it is now possible to do writes in larger than a page chunks. Additionally, multipage folio support can be turned on for 9p. Separate patches exist for cifs removing another 2000+ lines. I've included detailed information in the individual pulls I took. Summary: - Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. - Support for direct and unbuffered I/O. - Support for write-through caching in the page cache. - O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the page cache and then flushing afterwards. - Support for write-streaming. - Support for write grouping. - Skip reads for which the server could only return zeros or EOF. - The fscache module is now part of the netfs library and the corresponding maintainer entry is updated. - Some helpers from the fscache subsystem are renamed to mark them as belonging to the netfs library. - Follow-up fixes for the netfs library. - Follow-up fixes for the 9p conversion" * tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits) netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion ...
2023-12-24netfs, fscache: Move fs/fscache/* into fs/netfs/David Howells1-1/+0
There's a problem with dependencies between netfslib and fscache as each wants to access some functions of the other. Deal with this by moving fs/fscache/* into fs/netfs/ and renaming those files to begin with "fscache-". For the moment, the moved files are changed as little as possible and an fscache module is still built. A subsequent patch will integrate them. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Christian Brauner <christian@brauner.io> cc: linux-fsdevel@vger.kernel.org cc: linux-cachefs@redhat.com
2023-12-23fs: prepare for stackable filesystems backing file helpersAmir Goldstein1-0/+1
In preparation for factoring out some backing file io helpers from overlayfs, move backing_file_open() into a new file fs/backing-file.c and header. Add a MAINTAINERS entry for stackable filesystems and add a Kconfig FS_STACK which stackable filesystems need to select. For now, the backing_file struct, the backing_file alloc/free functions and the backing_file_real_path() accessor remain internal to file_table.c. We may change that in the future. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-22bcachefs: Initial commitKent Overstreet1-0/+1
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-08-02fs: add CONFIG_BUFFER_HEADChristoph Hellwig1-1/+1
Add a new config option that controls building the buffer_head code, and select it from all file systems and stacking drivers that need it. For the block device nodes and alternative iomap based buffered I/O path is provided when buffer_head support is not enabled, and iomap needs a a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call into the buffer_head code when it doesn't exist. Otherwise this is just Kconfig and ifdef changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-26Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linuxLinus Torvalds1-8/+2
Pull block updates from Jens Axboe: - NVMe pull request via Keith: - Various cleanups all around (Irvin, Chaitanya, Christophe) - Better struct packing (Christophe JAILLET) - Reduce controller error logs for optional commands (Keith) - Support for >=64KiB block sizes (Daniel Gomez) - Fabrics fixes and code organization (Max, Chaitanya, Daniel Wagner) - bcache updates via Coly: - Fix a race at init time (Mingzhe Zou) - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye) - use page pinning in the block layer for dio (David) - convert old block dio code to page pinning (David, Christoph) - cleanups for pktcdvd (Andy) - cleanups for rnbd (Guoqing) - use the unchecked __bio_add_page() for the initial single page additions (Johannes) - fix overflows in the Amiga partition handling code (Michael) - improve mq-deadline zoned device support (Bart) - keep passthrough requests out of the IO schedulers (Christoph, Ming) - improve support for flush requests, making them less special to deal with (Christoph) - add bdev holder ops and shutdown methods (Christoph) - fix the name_to_dev_t() situation and use cases (Christoph) - decouple the block open flags from fmode_t (Christoph) - ublk updates and cleanups, including adding user copy support (Ming) - BFQ sanity checking (Bart) - convert brd from radix to xarray (Pankaj) - constify various structures (Thomas, Ivan) - more fine grained persistent reservation ioctl capability checks (Jingbo) - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan, Jordy, Li, Min, Yu, Zhong, Waiman) * tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits) scsi/sg: don't grab scsi host module reference ext4: Fix warning in blkdev_put() block: don't return -EINVAL for not found names in devt_from_devname cdrom: Fix spectre-v1 gadget block: Improve kernel-doc headers blk-mq: don't insert passthrough request into sw queue bsg: make bsg_class a static const structure ublk: make ublk_chr_class a static const structure aoe: make aoe_class a static const structure block/rnbd: make all 'class' structures const block: fix the exclusive open mask in disk_scan_partitions block: add overflow checks for Amiga partition support block: change all __u32 annotations to __be32 in affs_hardblocks.h block: fix signed int overflow in Amiga partition support block: add capacity validation in bdev_add_partition() block: fine-granular CAP_SYS_ADMIN for Persistent Reservation block: disallow Persistent Reservation on partitions reiserfs: fix blkdev_put() warning from release_journal_dev() block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions() block: document the holder argument to blkdev_get_by_path ...
2023-05-24smb: move client and server files to common directory fs/smbSteve French1-3/+1
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko and helper modules) to new fs/smb subdirectory: fs/cifs --> fs/smb/client fs/ksmbd --> fs/smb/server fs/smbfs_common --> fs/smb/common Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-19fs: remove the special !CONFIG_BLOCK def_blk_fopsChristoph Hellwig1-8/+2
def_blk_fops always returns -ENODEV, which dosn't match the return value of a non-existing block device with CONFIG_BLOCK, which is -ENXIO. Just remove the extra implementation and fall back to the default no_open_fops that always returns -ENXIO. Fixes: 9361401eb761 ("[PATCH] BLOCK: Make it possible to disable the block layer [try #6]") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230508144405.41792-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-03Merge tag 'uml-for-linus-6.4-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull uml updates from Richard Weinberger: - Make stub data pages configurable - Make it harder to mix user and kernel code by accident * tag 'uml-for-linus-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: make stub data pages size tweakable um: prevent user code in modules um: further clean up user_syms um: don't export printf() um: hostfs: define our own API boundary um: add __weak for exported functions
2023-04-20um: hostfs: define our own API boundaryJohannes Berg1-1/+1
Instead of exporting the set of functions provided by glibc that are needed for hostfs_user.c, just build that into the kernel image whenever hostfs is built, and then export _those_ functions cleanly, to be independent of the libc implementation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-04-13fs: fix sysctls.c builtKefeng Wang1-2/+1
'obj-$(CONFIG_SYSCTL) += sysctls.o' must be moved after "obj-y :=", or it won't be built as it is overwrited. Note that there is nothing that is going to break by linking sysctl.o later, we were just being way to cautious and patches have been updated to reflect these considerations and sent for stable as well with the whole "base" stuff needing to be linked prior to child sysctl tables that use that directory. All of the kernel sysctl APIs always share the same directory, and races against using it should end up re-using the same single created directory. And so something we can do eventually is do away with all the base stuff. For now it's fine, it's not creating an issue. It is just a bit pedantic and careful. Fixes: ab171b952c6e ("fs: move namespace sysctls and declare fs base directory") Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> [mcgrof: enhanced commit log for stable criteria and clarify base stuff ] Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-02-20Merge tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linuxLinus Torvalds1-1/+2
Pull legacy dio update from Jens Axboe: "We only have a few file systems that use the old dio code, make them select it rather than build it unconditionally" * tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linux: fs: build the legacy direct I/O code conditionally fs: move sb_init_dio_done_wq out of direct-io.c
2023-01-26fs: build the legacy direct I/O code conditionallyChristoph Hellwig1-1/+2
Add a new LEGACY_DIRECT_IO config symbol that is only selected by the file systems that still use the legacy blockdev_direct_IO code, so that kernels without support for those file systems don't need to build the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230125065839.191256-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-19fs: move mnt_idmapChristian Brauner1-1/+1
Now that we converted everything to just rely on struct mnt_idmap move it all into a separate file. This ensure that no code can poke around in struct mnt_idmap without any dedicated helpers and makes it easier to extend it in the future. Filesystems will now not be able to conflate mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. We are now also able to extend struct mnt_idmap as we see fit. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-09-27a.out: Remove the a.out implementationEric W. Biederman1-1/+0
In commit 19e8b701e258 ("a.out: Stop building a.out/osf1 support on alpha and m68k") the last users of a.out were disabled. As nothing has turned up to cause this change to be reverted, let's remove the code implementing a.out support as well. There may be userspace users of the uapi bits left so the uapi headers have been left untouched. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Arnd Bergmann <arnd@arndb.de> # arm defconfigs Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/871qrx3hq3.fsf@email.froward.int.ebiederm.org
2022-07-24io_uring: move to separate directoryJens Axboe1-2/+0
In preparation for splitting io_uring up a bit, move it into its own top level directory. It didn't really belong in fs/ anyway, as it's not a file system only API. This adds io_uring/ and moves the core files in there, and updates the MAINTAINERS file for the new location. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-01Merge tag 'unicode-for-next-5.17-rc3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode Pull unicode cleanup from Gabriel Krisman Bertazi: "A fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the previous CONFIG_UNICODE. It is -rc material since we don't want to expose the former symbol on 5.17. This has been living on linux-next for the past week" * tag 'unicode-for-next-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode: unicode: clean up the Kconfig symbol confusion
2022-01-22fs: move namespace sysctls and declare fs base directoryLuis Chamberlain1-1/+2
This moves the namespace sysctls to its own file as part of the kernel/sysctl.c spring cleaning Since we have now removed all sysctls for "fs", we now have to declare it on the filesystem code, we do that using the new helper, which reduces boiler plate code. We rename init_fs_shared_sysctls() to init_fs_sysctls() to reflect that now fs/sysctls.c is taking on the burden of being the first to register the base directory as well. Lastly, since init code will load in the order in which we link it we have to move the sysctl code to be linked in early, so that its early init routine runs prior to other fs code. This way, other filesystem code can register their own sysctls using the helpers after this: * register_sysctl_init() * register_sysctl() Link: https://lkml.kernel.org/r/20211129211943.640266-3-mcgrof@kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Antti Palosaari <crope@iki.fi> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Lukas Middendorf <kernel@tuxforce.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Stephen Kitt <steve@sk2.org> Cc: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22fs: move shared sysctls to fs/sysctls.cLuis Chamberlain1-0/+1
To help with this maintenance let's start by moving sysctls to places where they actually belong. The proc sysctl maintainers do not want to know what sysctl knobs you wish to add for your own piece of code, we just care about the core logic. To help with this maintenance let's start by moving sysctls to places where they actually belong. The proc sysctl maintainers do not want to know what sysctl knobs you wish to add for your own piece of code, we just care about the core logic. So move sysctls which are shared between filesystems into a common file outside of kernel/sysctl.c. Link: https://lkml.kernel.org/r/20211129205548.605569-6-mcgrof@kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Antti Palosaari <crope@iki.fi> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Lukas Middendorf <kernel@tuxforce.de> Cc: Stephen Kitt <steve@sk2.org> Cc: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20unicode: clean up the Kconfig symbol confusionChristoph Hellwig1-1/+1
Turn the CONFIG_UNICODE symbol into a tristate that generates some always built in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol. Note that a lot of the IS_ENABLED() checks could be turned from cpp statements into normal ifs, but this change is intended to be fairly mechanic, so that should be cleaned up later. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
2021-09-12Merge tag '5.15-rc-cifs-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-1/+1
Pull smbfs updates from Steve French: "cifs/smb3 updates: - DFS reconnect fix - begin creating common headers for server and client - rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb (smb3 or smbfs is more accurate, as the very old cifs dialect has long been superseded by smb3 dialects). In the future we can rename the fs/cifs directory to fs/smbfs. This does not include the set of multichannel fixes nor the two deferred close fixes (they are still being reviewed and tested)" * tag '5.15-rc-cifs-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: properly invalidate cached root handle when closing it cifs: move SMB FSCTL definitions to common code cifs: rename cifs_common to smbfs_common cifs: update FSCTL definitions
2021-09-11Merge tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull block fixes from Jens Axboe: - NVMe pull request from Christoph: - fix nvmet command set reporting for passthrough controllers (Adam Manzanares) - update a MAINTAINERS email address (Chaitanya Kulkarni) - set QUEUE_FLAG_NOWAIT for nvme-multipth (me) - handle errors from add_disk() (Luis Chamberlain) - update the keep alive interval when kato is modified (Tatsuya Sasaki) - fix a buffer overrun in nvmet_subsys_attr_serial (Hannes Reinecke) - do not reset transport on data digest errors in nvme-tcp (Daniel Wagner) - only call synchronize_srcu when clearing current path (Daniel Wagner) - revalidate paths during rescan (Hannes Reinecke) - Split out the fs/block_dev into block/fops.c and block/bdev.c, which has been long overdue. Do this now before -rc1, to avoid annoying conflicts due to this (Christoph) - blk-throtl use-after-free fix (Li) - Improve plug depth for multi-device plugs, greatly increasing md resync performance (Song) - blkdev_show() locking fix (Tetsuo) - n64cart error check fix (Yang) * tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block: n64cart: fix return value check in n64cart_probe() blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues block: move fs/block_dev.c to block/bdev.c block: split out operations on block special files blk-throttle: fix UAF by deleteing timer in blk_throtl_exit() block: genhd: don't call blkdev_show() with major_names_lock held nvme: update MAINTAINERS email address nvme: add error handling support for add_disk() nvme: only call synchronize_srcu when clearing current path nvme: update keep alive interval when kato is modified nvme-tcp: Do not reset transport on data digest errors nvmet: fixup buffer overrun in nvmet_subsys_attr_serial() nvmet: return bool from nvmet_passthru_ctrl and nvmet_is_passthru_req nvmet: looks at the passthrough controller when initializing CAP nvme: move nvme_multi_css into nvme.h nvme-multipath: revalidate paths during rescan nvme-multipath: set QUEUE_FLAG_NOWAIT
2021-09-08cifs: rename cifs_common to smbfs_commonSteve French1-1/+1
As we move to common code between client and server, we have been asked to make the names less confusing, and refer less to "cifs" and more to words which include "smb" instead to e.g. "smbfs" for the client (we already have "ksmbd" for the kernel server, and "smbd" for the user space Samba daemon). So to be more consistent in the naming of common code between client and server and reduce the risk of merge conflicts as more common code is added - rename "cifs_common" to "smbfs_common" (in future releases we also will rename the fs/cifs directory to fs/smbfs) Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-07block: move fs/block_dev.c to block/bdev.cChristoph Hellwig1-1/+1
Move it together with the rest of the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210907141303.1371844-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-04Merge git://github.com/Paragon-Software-Group/linux-ntfs3Linus Torvalds1-0/+1
Merge NTFSv3 filesystem from Konstantin Komarov: "This patch adds NTFS Read-Write driver to fs/ntfs3. Having decades of expertise in commercial file systems development and huge test coverage, we at Paragon Software GmbH want to make our contribution to the Open Source Community by providing implementation of NTFS Read-Write driver for the Linux Kernel. This is fully functional NTFS Read-Write driver. Current version works with NTFS (including v3.1) and normal/compressed/sparse files and supports journal replaying. We plan to support this version after the codebase once merged, and add new features and fix bugs. For example, full journaling support over JBD will be added in later updates" Link: https://lore.kernel.org/lkml/20210729134943.778917-1-almaz.alexandrovich@paragon-software.com/ Link: https://lore.kernel.org/lkml/aa4aa155-b9b2-9099-b7a2-349d8d9d8fbd@paragon-software.com/ * git://github.com/Paragon-Software-Group/linux-ntfs3: (35 commits) fs/ntfs3: Change how module init/info messages are displayed fs/ntfs3: Remove GPL boilerplates from decompress lib files fs/ntfs3: Remove unnecessary condition checking from ntfs_file_read_iter fs/ntfs3: Fix integer overflow in ni_fiemap with fiemap_prep() fs/ntfs3: Restyle comments to better align with kernel-doc fs/ntfs3: Rework file operations fs/ntfs3: Remove fat ioctl's from ntfs3 driver for now fs/ntfs3: Restyle comments to better align with kernel-doc fs/ntfs3: Fix error handling in indx_insert_into_root() fs/ntfs3: Potential NULL dereference in hdr_find_split() fs/ntfs3: Fix error code in indx_add_allocate() fs/ntfs3: fix an error code in ntfs_get_acl_ex() fs/ntfs3: add checks for allocation failure fs/ntfs3: Use kcalloc/kmalloc_array over kzalloc/kmalloc fs/ntfs3: Do not use driver own alloc wrappers fs/ntfs3: Use kernel ALIGN macros over driver specific fs/ntfs3: Restyle comment block in ni_parse_reparse() fs/ntfs3: Remove unused including <linux/version.h> fs/ntfs3: Fix fall-through warnings for Clang fs/ntfs3: Fix one none utf8 char in source file ...
2021-08-31Merge tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-0/+1
Pull cifs client updates from Steve French: "Eleven cifs/smb3 client fixes: - mostly restructuring to allow disabling less secure algorithms (this will allow eventual removing rc4 and md4 from general use in the kernel) - four fixes, including two for stable - enable r/w support with fscache and cifs.ko I am working on a larger set of changes (the usual ... multichannel, auth and signing improvements), but wanted to get these in earlier to reduce chance of merge conflicts later in the merge window" * tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not leak EDEADLK to dgetents64 for STATUS_USER_SESSION_DELETED cifs: add cifs_common directory to MAINTAINERS file cifs: cifs_md4 convert to SPDX identifier cifs: create a MD4 module and switch cifs.ko to use it cifs: fork arc4 and create a separate module for it for cifs and other users cifs: remove support for NTLM and weaker authentication algorithms cifs: enable fscache usage even for files opened as rw oid_registry: Add OIDs for missing Spnego auth mechanisms to Macs smb3: fix posix extensions mount option cifs: fix wrong release in sess_alloc_buffer() failed path CIFS: Fix a potencially linear read overflow
2021-08-31Merge tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbdLinus Torvalds1-0/+1
Pull initial ksmbd implementation from Steve French: "Initial merge of kernel smb3 file server, ksmbd. The SMB family of protocols is the most widely deployed network filesystem protocol, the default on Windows and Macs (and even on many phones and tablets), with clients and servers on all major operating systems, but lacked a kernel server for Linux. For many cases the current userspace server choices were suboptimal either due to memory footprint, performance or difficulty integrating well with advanced Linux features. ksmbd is a new kernel module which implements the server-side of the SMB3 protocol. The target is to provide optimized performance, GPLv2 SMB server, and better lease handling (distributed caching). The bigger goal is to add new features more rapidly (e.g. RDMA aka "smbdirect", and recent encryption and signing improvements to the protocol) which are easier to develop on a smaller, more tightly optimized kernel server than for example in Samba. The Samba project is much broader in scope (tools, security services, LDAP, Active Directory Domain Controller, and a cross platform file server for a wider variety of purposes) but the user space file server portion of Samba has proved hard to optimize for some Linux workloads, including for smaller devices. This is not meant to replace Samba, but rather be an extension to allow better optimizing for Linux, and will continue to integrate well with Samba user space tools and libraries where appropriate. Working with the Samba team we have already made sure that the configuration files and xattrs are in a compatible format between the kernel and user space server. Various types of functional and regression tests are regularly run against it. One example is the automated 'buildbot' regression tests which use the Linux client to test against ksmbd, e.g. http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/8/builds/56 but other test suites, including Samba's smbtorture functional test suite are also used regularly" * tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd: (219 commits) ksmbd: fix __write_overflow warning in ndr_read_string MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry MAINTAINERS: ksmbd: update my email address ksmbd: fix permission check issue on chown and chmod ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default MAINTAINERS: add git adddress of ksmbd ksmbd: update SMB3 multi-channel support in ksmbd.rst ksmbd: smbd: fix kernel oops during server shutdown ksmbd: remove select FS_POSIX_ACL in Kconfig ksmbd: use proper errno instead of -1 in smb2_get_ksmbd_tcon() ksmbd: update the comment for smb2_get_ksmbd_tcon() ksmbd: change int data type to boolean ksmbd: Fix multi-protocol negotiation ksmbd: fix an oops in error handling in smb2_open() ksmbd: add ipv6_addr_v4mapped check to know if connection from client is ipv4 ksmbd: fix missing error code in smb2_lock ksmbd: use channel signingkey for binding SMB2 session setup ksmbd: don't set RSS capable in FSCTL_QUERY_NETWORK_INTERFACE_INFO ksmbd: Return STATUS_OBJECT_PATH_NOT_FOUND if smb2_creat() returns ENOENT ksmbd: fix -Wstringop-truncation warnings ...
2021-08-25cifs: fork arc4 and create a separate module for it for cifs and other usersRonnie Sahlberg1-0/+1
We can not drop ARC4 and basically destroy CIFS connectivity for almost all CIFS users so create a new forked ARC4 module that CIFS and other subsystems that have a hard dependency on ARC4 can use. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-13fs/ntfs3: Add NTFS3 in fs/Kconfig and fs/MakefileKonstantin Komarov1-0/+1
This adds NTFS3 in fs/Kconfig and fs/Makefile Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-07-25binfmt: remove support for em86 (alpha only)David Hildenbrand1-1/+0
We have a fairly specific alpha binary loader in Linux: running x86 (i386, i486) binaries via the em86 [1] emulator. As noted in the Kconfig option, the same behavior can be achieved via binfmt_misc, for example, more nowadays used for running qemu-user. An example on how to get binfmt_misc running with em86 can be found in Documentation/admin-guide/binfmt-misc.rst The defconfig does not have CONFIG_BINFMT_EM86=y set. And doing a make defconfig && make olddefconfig results in # CONFIG_BINFMT_EM86 is not set ... as we don't seem to have any supported Linux distirbution for alpha anymore, there isn't really any "default" user of that feature anymore. Searching for "CONFIG_BINFMT_EM86=y" reveals mostly discussions from around 20 years ago, like [2] describing how to get netscape via em86 running via em86, or [3] discussing that running wine or installing Win 3.11 through em86 would be a nice feature. The latest binaries available for em86 are from 2000, version 2.2.1 [4] -- which translates to "unsupported"; further, em86 doesn't even work with glibc-2.x but only with glibc-2.0 [4, 5]. These are clear signs that there might not be too many em86 users out there, especially users relying on modern Linux kernels. Even though the code footprint is relatively small, let's just get rid of this blast from the past that's effectively unused. [1] http://ftp.dreamtime.org/pub/linux/Linux-Alpha/em86/v0.4/docs/em86.html [2] https://static.lwn.net/1998/1119/a/alpha-netscape.html [3] https://groups.google.com/g/linux.debian.alpha/c/AkGuQHeCe0Y [4] http://zeniv.linux.org.uk/pub/linux/alpha/em86/v2.2-1/relnotes.2.2.1.html [5] https://forum.teamspeak.com/archive/index.php/t-1477.html Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Cc: linux-alpha@vger.kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com>
2021-06-28ksmbd: move fs/cifsd to fs/ksmbdNamjae Jeon1-1/+1
Move fs/cifsd to fs/ksmbd and rename the remaining cifsd name to ksmbd. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-10cifsd: add Kconfig and MakefileNamjae Jeon1-0/+1
This adds the Kconfig and Makefile for cifsd. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-23netfs: Provide readahead and readpage netfs helpersDavid Howells1-0/+1
Add a pair of helper functions: (*) netfs_readahead() (*) netfs_readpage() to do the work of handling a readahead or a readpage, where the page(s) that form part of the request may be split between the local cache, the server or just require clearing, and may be single pages and transparent huge pages. This is all handled within the helper. Note that while both will read from the cache if there is data present, only netfs_readahead() will expand the request beyond what it was asked to do, and only netfs_readahead() will write back to the cache. netfs_readpage(), on the other hand, is synchronous and only fetches the page (which might be a THP) it is asked for. The netfs gives the helper parameters from the VM, the cache cookie it wants to use (or NULL) and a table of operations (only one of which is mandatory): (*) expand_readahead() [optional] Called to allow the netfs to request an expansion of a readahead request to meet its own alignment requirements. This is done by changing rreq->start and rreq->len. (*) clamp_length() [optional] Called to allow the netfs to cut down a subrequest to meet its own boundary requirements. If it does this, the helper will generate additional subrequests until the full request is satisfied. (*) is_still_valid() [optional] Called to find out if the data just read from the cache has been invalidated and must be reread from the server. (*) issue_op() [required] Called to ask the netfs to issue a read to the server. The subrequest describes the read. The read request holds information about the file being accessed. The netfs can cache information in rreq->netfs_priv. Upon completion, the netfs should set the error, transferred and can also set FSCACHE_SREQ_CLEAR_TAIL and then call fscache_subreq_terminated(). (*) done() [optional] Called after the pages have been unlocked. The read request is still pinning the file and mapping and may still be pinning pages with PG_fscache. rreq->error indicates any error that has been accumulated. (*) cleanup() [optional] Called when the helper is disposing of a finished read request. This allows the netfs to clear rreq->netfs_priv. Netfs support is enabled with CONFIG_NETFS_SUPPORT=y. It will be built even if CONFIG_FSCACHE=n and in this case much of it should be optimised away, allowing the filesystem to use it even when caching is disabled. Changes: v5: - Comment why netfs_readahead() is putting pages[2]. - Use page_file_mapping() rather than page->mapping[2]. - Use page_index() rather than page->index[2]. - Use set_page_fscache()[3] rather then SetPageFsCache() as this takes an appropriate ref too[4]. v4: - Folded in a kerneldoc comment fix. - Folded in a fix for the error handling in the case that ENOMEM occurs. - Added flag to netfs_subreq_terminated() to indicate that the caller may have been running async and stuff that might sleep needs punting to a workqueue (can't use in_softirq()[1]). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20210216084230.GA23669@lst.de/ [1] Link: https://lore.kernel.org/r/20210321014202.GF3420@casper.infradead.org/ [2] Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk/ [3] Link: https://lore.kernel.org/r/CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com/ [4] Link: https://lore.kernel.org/r/160588497406.3465195.18003475695899726222.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118136849.1232039.8923686136144228724.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161032290.2537118.13400578415247339173.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340394873.1303470.6237319335883242536.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539537375.286939.16642940088716990995.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653795430.2770958.4947584573720000554.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789076581.6155.6745849361504760209.stgit@warthog.procyon.org.uk/ # v6
2021-01-29fs: Remove dcookies supportViresh Kumar1-1/+0
The dcookies stuff was only used by the kernel's old oprofile code. Now that oprofile's support is removed from the kernel, there is no need for dcookies as well. Remove it. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Robert Richter <rric@kernel.org> Acked-by: William Cohen <wcohen@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2020-10-23Merge tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-1/+1
Pull clone/dedupe/remap code refactoring from Darrick Wong: "Move the generic file range remap (aka reflink and dedupe) functions out of mm/filemap.c and fs/read_write.c and into fs/remap_range.c to reduce clutter in the first two files" * tag 'vfs-5.10-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: vfs: move the generic write and copy checks out of mm vfs: move the remap range helpers to remap_range.c vfs: move generic_remap_checks out of mm
2020-10-15Merge tag 'char-misc-5.10-rc1' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of char, misc, and other assorted driver subsystem patches for 5.10-rc1. There's a lot of different things in here, all over the drivers/ directory. Some summaries: - soundwire driver updates - habanalabs driver updates - extcon driver updates - nitro_enclaves new driver - fsl-mc driver and core updates - mhi core and bus updates - nvmem driver updates - eeprom driver updates - binder driver updates and fixes - vbox minor bugfixes - fsi driver updates - w1 driver updates - coresight driver updates - interconnect driver updates - misc driver updates - other minor driver updates All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits) binder: fix UAF when releasing todo list docs: w1: w1_therm: Fix broken xref, mistakes, clarify text misc: Kconfig: fix a HISI_HIKEY_USB dependency LSM: Fix type of id parameter in kernel_post_load_data prototype misc: Kconfig: add a new dependency for HISI_HIKEY_USB firmware_loader: fix a kernel-doc markup w1: w1_therm: make w1_poll_completion static binder: simplify the return expression of binder_mmap test_firmware: Test partial read support firmware: Add request_partial_firmware_into_buf() firmware: Store opt_flags in fw_priv fs/kernel_file_read: Add "offset" arg for partial reads IMA: Add support for file reads without contents LSM: Add "contents" flag to kernel_read_file hook module: Call security_kernel_post_load_data() firmware_loader: Use security_post_load_data() LSM: Introduce kernel_post_load_data() hook fs/kernel_read_file: Add file_size output argument fs/kernel_read_file: Switch buffer size arg to size_t fs/kernel_read_file: Remove redundant size argument ...
2020-10-14vfs: move generic_remap_checks out of mmDarrick J. Wong1-1/+2
I would like to move all the generic helpers for the vfs remap range functionality (aka clonerange and dedupe) into a separate file so that they won't be scattered across the vfs and the mm subsystems. The eventual goal is to be able to deselect remap_range.c if none of the filesystems need that code, but the tricky part here is picking a stable(ish) part of the merge window to rearrange code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-10-05fs/kernel_read_file: Split into separate source fileKees Cook1-1/+2
These routines are used in places outside of exec(2), so in preparation for refactoring them, move them into a separate source file, fs/kernel_read_file.c. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Scott Branden <scott.branden@broadcom.com> Link: https://lore.kernel.org/r/20201002173828.2099543-5-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-22fs: remove compat_sys_mountChristoph Hellwig1-1/+0
compat_sys_mount is identical to the regular sys_mount now, so remove it and use the native version everywhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-31init: add an init_mount helperChristoph Hellwig1-1/+1
Like do_mount, but takes a kernel pointer for the destination path. Switch over the mounts in the init code and devtmpfs to it, which just happen to work due to the implicit set_fs(KERNEL_DS) during early init right now. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-03-05exfat: add Kconfig and MakefileNamjae Jeon1-0/+1
This adds the Kconfig and Makefile for exfat. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-09Merge tag 'zonefs-5.6-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull new zonefs file system from Damien Le Moal: "Zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with native zoned block device support (e.g. f2fs or the on-going btrfs effort), zonefs does not hide the sequential write constraint of zoned block devices to the user. As a result, zonefs is not a POSIX compliant file system. Its goal is to simplify the implementation of zoned block devices support in applications by replacing raw block device file accesses with a richer file based API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application while at the same time allowing the use of zoned block devices with various programming languages other than C. Zonefs IO management implementation uses the new iomap generic code. Zonefs has been successfully tested using a functional test suite (available with zonefs userland format tool on github) and a prototype implementation of LevelDB on top of zonefs" * tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Add documentation fs: New zonefs file system
2020-02-09Merge branch 'work.vboxsf' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vboxfs from Al Viro: "This is the VirtualBox guest shared folder support by Hans de Goede, with fixups for fs_parse folded in to avoid bisection hazards from those API changes..." * 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Add VirtualBox guest shared folder (vboxsf) support
2020-02-08fs: Add VirtualBox guest shared folder (vboxsf) supportHans de Goede1-0/+1
VirtualBox hosts can share folders with guests, this commit adds a VFS driver implementing the Linux-guest side of this, allowing folders exported by the host to be mounted under Linux. This driver depends on the guest <-> host IPC functions exported by the vboxguest driver. Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07fs: New zonefs file systemDamien Le Moal1-0/+1
zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with zoned block device support (e.g. f2fs), zonefs does not hide the sequential write constraint of zoned block devices to the user. Files representing sequential write zones of the device must be written sequentially starting from the end of the file (append only writes). As such, zonefs is in essence closer to a raw block device access interface than to a full featured POSIX file system. The goal of zonefs is to simplify the implementation of zoned block device support in applications by replacing raw block device file accesses with a richer file API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application as well as introducing support for different application programming languages. Zonefs on-disk metadata is reduced to an immutable super block to persistently store a magic number and optional feature flags and values. On mount, zonefs uses blkdev_report_zones() to obtain the device zone configuration and populates the mount point with a static file tree solely based on this information. E.g. file sizes come from the device zone type and write pointer offset managed by the device itself. The zone files created on mount have the following characteristics. 1) Files representing zones of the same type are grouped together under a common sub-directory: * For conventional zones, the sub-directory "cnv" is used. * For sequential write zones, the sub-directory "seq" is used. These two directories are the only directories that exist in zonefs. Users cannot create other directories and cannot rename nor delete the "cnv" and "seq" sub-directories. 2) The name of zone files is the number of the file within the zone type sub-directory, in order of increasing zone start sector. 3) The size of conventional zone files is fixed to the device zone size. Conventional zone files cannot be truncated. 4) The size of sequential zone files represent the file's zone write pointer position relative to the zone start sector. Truncating these files is allowed only down to 0, in which case, the zone is reset to rewind the zone write pointer position to the start of the zone, or up to the zone size, in which case the file's zone is transitioned to the FULL state (finish zone operation). 5) All read and write operations to files are not allowed beyond the file zone size. Any access exceeding the zone size is failed with the -EFBIG error. 6) Creating, deleting, renaming or modifying any attribute of files and sub-directories is not allowed. 7) There are no restrictions on the type of read and write operations that can be issued to conventional zone files. Buffered, direct and mmap read & write operations are accepted. For sequential zone files, there are no restrictions on read operations, but all write operations must be direct IO append writes. mmap write of sequential files is not allowed. Several optional features of zonefs can be enabled at format time. * Conventional zone aggregation: ranges of contiguous conventional zones can be aggregated into a single larger file instead of the default one file per zone. * File ownership: The owner UID and GID of zone files is by default 0 (root) but can be changed to any valid UID/GID. * File access permissions: the default 640 access permissions can be changed. The mkzonefs tool is used to format zoned block devices for use with zonefs. This tool is available on Github at: git@github.com:damien-lemoal/zonefs-tools.git. zonefs-tools also includes a test suite which can be run against any zoned block device, including null_blk block device created with zoned mode. Example: the following formats a 15TB host-managed SMR HDD with 256 MB zones with the conventional zones aggregation feature enabled. $ sudo mkzonefs -o aggr_cnv /dev/sdX $ sudo mount -t zonefs /dev/sdX /mnt $ ls -l /mnt/ total 0 dr-xr-xr-x 2 root root 1 Nov 25 13:23 cnv dr-xr-xr-x 2 root root 55356 Nov 25 13:23 seq The size of the zone files sub-directories indicate the number of files existing for each type of zones. In this example, there is only one conventional zone file (all conventional zones are aggregated under a single file). $ ls -l /mnt/cnv total 137101312 -rw-r----- 1 root root 140391743488 Nov 25 13:23 0 This aggregated conventional zone file can be used as a regular file. $ sudo mkfs.ext4 /mnt/cnv/0 $ sudo mount -o loop /mnt/cnv/0 /data The "seq" sub-directory grouping files for sequential write zones has in this example 55356 zones. $ ls -lv /mnt/seq total 14511243264 -rw-r----- 1 root root 0 Nov 25 13:23 0 -rw-r----- 1 root root 0 Nov 25 13:23 1 -rw-r----- 1 root root 0 Nov 25 13:23 2 ... -rw-r----- 1 root root 0 Nov 25 13:23 55354 -rw-r----- 1 root root 0 Nov 25 13:23 55355 For sequential write zone files, the file size changes as data is appended at the end of the file, similarly to any regular file system. $ dd if=/dev/zero of=/mnt/seq/0 bs=4K count=1 conv=notrunc oflag=direct 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000452219 s, 9.1 MB/s $ ls -l /mnt/seq/0 -rw-r----- 1 root root 4096 Nov 25 13:23 /mnt/seq/0 The written file can be truncated to the zone size, preventing any further write operation. $ truncate -s 268435456 /mnt/seq/0 $ ls -l /mnt/seq/0 -rw-r----- 1 root root 268435456 Nov 25 13:49 /mnt/seq/0 Truncation to 0 size allows freeing the file zone storage space and restart append-writes to the file. $ truncate -s 0 /mnt/seq/0 $ ls -l /mnt/seq/0 -rw-r----- 1 root root 0 Nov 25 13:49 /mnt/seq/0 Since files are statically mapped to zones on the disk, the number of blocks of a file as reported by stat() and fstat() indicates the size of the file zone. $ stat /mnt/seq/0 File: /mnt/seq/0 Size: 0 Blocks: 524288 IO Block: 4096 regular empty file Device: 870h/2160d Inode: 50431 Links: 1 Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2019-11-25 13:23:57.048971997 +0900 Modify: 2019-11-25 13:52:25.553805765 +0900 Change: 2019-11-25 13:52:25.553805765 +0900 Birth: - The number of blocks of the file ("Blocks") in units of 512B blocks gives the maximum file size of 524288 * 512 B = 256 MB, corresponding to the device zone size in this example. Of note is that the "IO block" field always indicates the minimum IO size for writes and corresponds to the device physical sector size. This code contains contributions from: * Johannes Thumshirn <jthumshirn@suse.de>, * Darrick J. Wong <darrick.wong@oracle.com>, * Christoph Hellwig <hch@lst.de>, * Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> and * Ting Yao <tingyao@hust.edu.cn>. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-03compat_ioctl: move sys_compat_ioctl() to ioctl.cArnd Bergmann1-1/+1
The rest of the fs/compat_ioctl.c file is no longer useful now, so move the actual syscall as planned. Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-29io-wq: small threadpool implementation for io_uringJens Axboe1-0/+1
This adds support for io-wq, a smaller and specialized thread pool implementation. This is meant to replace workqueues for io_uring. Among the reasons for this addition are: - We can assign memory context smarter and more persistently if we manage the life time of threads. - We can drop various work-arounds we have in io_uring, like the async_list. - We can implement hashed work insertion, to manage concurrency of buffered writes without needing a) an extra workqueue, or b) needlessly making the concurrency of said workqueue very low which hurts performance of multiple buffered file writers. - We can implement cancel through signals, for cancelling interruptible work like read/write (or send/recv) to/from sockets. - We need the above cancel for being able to assign and use file tables from a process. - We can implement a more thorough cancel operation in general. - We need it to move towards a syslet/threadlet model for even faster async execution. For that we need to take ownership of the used threads. This list is just off the top of my head. Performance should be the same, or better, at least that's what I've seen in my testing. io-wq supports basic NUMA functionality, setting up a pool per node. io-wq hooks up to the scheduler schedule in/out just like workqueue and uses that to drive the need for more/less workers. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-18Merge tag 'fsverity-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fs-verity support from Eric Biggers: "fs-verity is a filesystem feature that provides Merkle tree based hashing (similar to dm-verity) for individual readonly files, mainly for the purpose of efficient authenticity verification. This pull request includes: (a) The fs/verity/ support layer and documentation. (b) fs-verity support for ext4 and f2fs. Compared to the original fs-verity patchset from last year, the UAPI to enable fs-verity on a file has been greatly simplified. Lots of other things were cleaned up too. fs-verity is planned to be used by two different projects on Android; most of the userspace code is in place already. Another userspace tool ("fsverity-utils"), and xfstests, are also available. e2fsprogs and f2fs-tools already have fs-verity support. Other people have shown interest in using fs-verity too. I've tested this on ext4 and f2fs with xfstests, both the existing tests and the new fs-verity tests. This has also been in linux-next since July 30 with no reported issues except a couple minor ones I found myself and folded in fixes for. Ted and I will be co-maintaining fs-verity" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: f2fs: add fs-verity support ext4: update on-disk format documentation for fs-verity ext4: add fs-verity read support ext4: add basic fs-verity support fs-verity: support builtin file signatures fs-verity: add SHA-512 support fs-verity: implement FS_IOC_MEASURE_VERITY ioctl fs-verity: implement FS_IOC_ENABLE_VERITY ioctl fs-verity: add data verification hooks for ->readpages() fs-verity: add the hook for file ->setattr() fs-verity: add the hook for file ->open() fs-verity: add inode and superblock fields fs-verity: add Kconfig and the helper functions for hashing fs: uapi: define verity bit for FS_IOC_GETFLAGS fs-verity: add UAPI header fs-verity: add MAINTAINERS file entry fs-verity: add a documentation file
2019-08-24erofs: move erofs out of stagingGao Xiang1-0/+1
EROFS filesystem has been merged into linux-staging for a year. EROFS is designed to be a better solution of saving extra storage space with guaranteed end-to-end performance for read-only files with the help of reduced metadata, fixed-sized output compression and decompression inplace technologies. In the past year, EROFS was greatly improved by many people as a staging driver, self-tested, betaed by a large number of our internal users, successfully applied to almost all in-service HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable enough to be moved out of staging. EROFS is a self-contained filesystem driver. Although there are still some TODOs to be more generic, we have a dedicated team actively keeping on working on EROFS in order to make it better with the evolution of Linux kernel as the other in-kernel filesystems. As Pavel suggested, it's better to do as one commit since git can do moves and all histories will be saved in this way. Let's promote it from staging and enhance it more actively as a "real" part of kernel for more wider scenarios! Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Pavel Machek <pavel@denx.de> Cc: David Sterba <dsterba@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J . Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Richard Weinberger <richard@nod.at> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Li Guifu <bluce.liguifu@huawei.com> Cc: Fang Wei <fangwei1@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190822213659.5501-1-hsiangkao@aol.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-28fs-verity: add Kconfig and the helper functions for hashingEric Biggers1-0/+1
Add the beginnings of the fs/verity/ support layer, including the Kconfig option and various helper functions for hashing. To start, only SHA-256 is supported, but other hash algorithms can easily be added. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-07-17iomap: move the main iteration code into a separate fileDarrick J. Wong1-1/+0
Move the main iteration code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-15iomap: start moving code to fs/iomap/Darrick J. Wong1-0/+1
Create the build infrastructure we need to start migrating iomap code to fs/iomap/ from fs/iomap.c. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-05-07Merge tag 'ext4_for_linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Add as a feature case-insensitive directories (the casefold feature) using Unicode 12.1. Also, the usual largish number of cleanups and bug fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits) ext4: export /sys/fs/ext4/feature/casefold if Unicode support is present ext4: fix ext4_show_options for file systems w/o journal unicode: refactor the rule for regenerating utf8data.h docs: ext4.rst: document case-insensitive directories ext4: Support case-insensitive file name lookups ext4: include charset encoding information in the superblock MAINTAINERS: add Unicode subsystem entry unicode: update unicode database unicode version 12.1.0 unicode: introduce test module for normalized utf8 implementation unicode: implement higher level API for string handling unicode: reduce the size of utf8data[] unicode: introduce code for UTF-8 normalization unicode: introduce UTF-8 character database ext4: actually request zeroing of inode table after grow ext4: cond_resched in work-heavy group loops ext4: fix use-after-free race with debug_want_extra_isize ext4: avoid drop reference to iloc.bh twice ext4: ignore e_value_offs for xattrs with value-in-ea-inode ext4: protect journal inode's blocks using block_validity ext4: use BUG() instead of BUG_ON(1) ...
2019-04-25unicode: introduce UTF-8 character databaseGabriel Krisman Bertazi1-0/+1
The decomposition and casefolding of UTF-8 characters are described in a prefix tree in utf8data.h, which is a generate from the Unicode Character Database (UCD), published by the Unicode Consortium, and should not be edited by hand. The structures in utf8data.h are meant to be used for lookup operations by the unicode subsystem, when decoding a utf-8 string. mkutf8data.c is the source for a program that generates utf8data.h. It was written by Olaf Weber from SGI and originally proposed to be merged into Linux in 2014. The original proposal performed the compatibility decomposition, NFKD, but the current version was modified by me to do canonical decomposition, NFD, as suggested by the community. The changes from the original submission are: * Rebase to mainline. * Fix out-of-tree-build. * Update makefile to build 11.0.0 ucd files. * drop references to xfs. * Convert NFKD to NFD. * Merge back robustness fixes from original patch. Requested by Dave Chinner. The original submission is archived at: <https://linux-xfs.oss.sgi.narkive.com/Xx10wjVY/rfc-unicode-utf-8-support-for-xfs> The utf8data.h file can be regenerated using the instructions in fs/unicode/README.utf8data. - Notes on the update from 8.0.0 to 11.0: The structure of the ucd files and special cases have not experienced any changes between versions 8.0.0 and 11.0.0. 8.0.0 saw the addition of Cherokee LC characters, which is an interesting case for case-folding. The update is accompanied by new tests on the test_ucd module to catch specific cases. No changes to mkutf8data script were required for the updates. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-20vfs: syscall: Add fsopen() to prepare for superblock creationDavid Howells1-1/+1
Provide an fsopen() system call that starts the process of preparing to create a superblock that will then be mountable, using an fd as a context handle. fsopen() is given the name of the filesystem that will be used: int mfd = fsopen(const char *fsname, unsigned int flags); where flags can be 0 or FSOPEN_CLOEXEC. For example: sfd = fsopen("ext4", FSOPEN_CLOEXEC); fsconfig(sfd, FSCONFIG_SET_PATH, "source", "/dev/sda1", AT_FDCWD); fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0); fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0); fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); fsconfig(sfd, FSCONFIG_SET_STRING, "sb", "1", 0); fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); fsinfo(sfd, NULL, ...); // query new superblock attributes mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_RELATIME); move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); sfd = fsopen("afs", -1); fsconfig(fd, FSCONFIG_SET_STRING, "source", "#grand.central.org:root.cell", 0); fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); mfd = fsmount(sfd, 0, MS_NODEV); move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); If an error is reported at any step, an error message may be available to be read() back (ENODATA will be reported if there isn't an error available) in the form: "e <subsys>:<problem>" "e SELinux:Mount on mountpoint not permitted" Once fsmount() has been called, further fsconfig() calls will incur EBUSY, even if the fsmount() fails. read() is still possible to retrieve error information. The fsopen() syscall creates a mount context and hangs it of the fd that it returns. Netlink is not used because it is optional and would make the core VFS dependent on the networking layer and also potentially add network namespace issues. Note that, for the moment, the caller must have SYS_CAP_ADMIN to use fsopen(). Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20Make anon_inodes unconditionalDavid Howells1-1/+1
Make the anon_inodes facility unconditional so that it can be used by core VFS code. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-12Merge branch 'work.mount' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount infrastructure updates from Al Viro: "The rest of core infrastructure; no new syscalls in that pile, but the old parts are switched to new infrastructure. At that point conversions of individual filesystems can happen independently; some are done here (afs, cgroup, procfs, etc.), there's also a large series outside of that pile dealing with NFS (quite a bit of option-parsing stuff is getting used there - it's one of the most convoluted filesystems in terms of mount-related logics), but NFS bits are the next cycle fodder. It got seriously simplified since the last cycle; documentation is probably the weakest bit at the moment - I considered dropping the commit introducing Documentation/filesystems/mount_api.txt (cutting the size increase by quarter ;-), but decided that it would be better to fix it up after -rc1 instead. That pile allows to do followup work in independent branches, which should make life much easier for the next cycle. fs/super.c size increase is unpleasant; there's a followup series that allows to shrink it considerably, but I decided to leave that until the next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits) afs: Use fs_context to pass parameters over automount afs: Add fs_context support vfs: Add some logging to the core users of the fs_context log vfs: Implement logging through fs_context vfs: Provide documentation for new mount API vfs: Remove kern_mount_data() hugetlbfs: Convert to fs_context cpuset: Use fs_context kernfs, sysfs, cgroup, intel_rdt: Support fs_context cgroup: store a reference to cgroup_ns into cgroup_fs_context cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper cgroup_do_mount(): massage calling conventions cgroup: stash cgroup_root reference into cgroup_fs_context cgroup2: switch to option-by-option parsing cgroup1: switch to option-by-option parsing cgroup: take options parsing into ->parse_monolithic() cgroup: fold cgroup1_mount() into cgroup1_get_tree() cgroup: start switching to fs_context ipc: Convert mqueue fs to fs_context proc: Add fs_context support to procfs ...
2019-03-09Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+0
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. This is a major simplification for block and mq in particular" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits) scsi: cxgb4i: validate tcp sequence number only if chip version <= T5 scsi: cxgb4i: get pf number from lldi->pf scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c scsi: mpt3sas: Add missing breaks in switch statements scsi: aacraid: Fix missing break in switch statement scsi: kill command serial number scsi: csiostor: drop serial_number usage scsi: mvumi: use request tag instead of serial_number scsi: dpt_i2o: remove serial number usage scsi: st: osst: Remove negative constant left-shifts scsi: ufs-bsg: Allow reading descriptors scsi: ufs: Allow reading descriptor via raw upiu scsi: ufs-bsg: Change the calling convention for write descriptor scsi: ufs: Remove unused device quirks Revert "scsi: ufs: disable vccq if it's not needed by UFS device" scsi: megaraid_sas: Remove a bunch of set but not used variables scsi: clean obsolete return values of eh_timed_out scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: MAINTAINERS: SCSI initiator and target tweaks scsi: fcoe: make use of fip_mode enum complete ...
2019-03-08Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+1
Pull io_uring IO interface from Jens Axboe: "Second attempt at adding the io_uring interface. Since the first one, we've added basic unit testing of the three system calls, that resides in liburing like the other unit tests that we have so far. It'll take a while to get full coverage of it, but we're working towards it. I've also added two basic test programs to tools/io_uring. One uses the raw interface and has support for all the various features that io_uring supports outside of standard IO, like fixed files, fixed IO buffers, and polled IO. The other uses the liburing API, and is a simplified version of cp(1). This adds support for a new IO interface, io_uring. io_uring allows an application to communicate with the kernel through two rings, the submission queue (SQ) and completion queue (CQ) ring. This allows for very efficient handling of IOs, see the v5 posting for some basic numbers: https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/ Outside of just efficiency, the interface is also flexible and extendable, and allows for future use cases like the upcoming NVMe key-value store API, networked IO, and so on. It also supports async buffered IO, something that we've always failed to support in the kernel. Outside of basic IO features, it supports async polled IO as well. This particular feature has already been tested at Facebook months ago for flash storage boxes, with 25-33% improvements. It makes polled IO actually useful for real world use cases, where even basic flash sees a nice win in terms of efficiency, latency, and performance. These boxes were IOPS bound before, now they are not. This series adds three new system calls. One for setting up an io_uring instance (io_uring_setup(2)), one for submitting/completing IO (io_uring_enter(2)), and one for aux functions like registrating file sets, buffers, etc (io_uring_register(2)). Through the help of Arnd, I've coordinated the syscall numbers so merge on that front should be painless. Jon did a writeup of the interface a while back, which (except for minor details that have been tweaked) is still accurate. Find that here: https://lwn.net/Articles/776703/ Huge thanks to Al Viro for helping getting the reference cycle code correct, and to Jann Horn for his extensive reviews focused on both security and bugs in general. There's a userspace library that provides basic functionality for applications that don't need or want to care about how to fiddle with the rings directly. It has helpers to allow applications to easily set up an io_uring instance, and submit/complete IO through it without knowing about the intricacies of the rings. It also includes man pages (thanks to Jeff Moyer), and will continue to grow support helper functions and features as time progresses. Find it here: git://git.kernel.dk/liburing Fio has full support for the raw interface, both in the form of an IO engine (io_uring), but also with a small test application (t/io_uring) that can exercise and benchmark the interface" * tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block: io_uring: add a few test tools io_uring: allow workqueue item to handle multiple buffered requests io_uring: add support for IORING_OP_POLL io_uring: add io_kiocb ref count io_uring: add submission polling io_uring: add file set registration net: split out functions related to registering inflight socket files io_uring: add support for pre-mapped user IO buffers block: implement bio helper to add iter bvec pages to bio io_uring: batch io_kiocb allocation io_uring: use fget/fput_many() for file references fs: add fget_many() and fput_many() io_uring: support for IO polling io_uring: add fsync support Add io_uring IO interface
2019-02-28Add io_uring IO interfaceJens Axboe1-0/+1
The submission queue (SQ) and completion queue (CQ) rings are shared between the application and the kernel. This eliminates the need to copy data back and forth to submit and complete IO. IO submissions use the io_uring_sqe data structure, and completions are generated in the form of io_uring_cqe data structures. The SQ ring is an index into the io_uring_sqe array, which makes it possible to submit a batch of IOs without them being contiguous in the ring. The CQ ring is always contiguous, as completion events are inherently unordered, and hence any io_uring_cqe entry can point back to an arbitrary submission. Two new system calls are added for this: io_uring_setup(entries, params) Sets up an io_uring instance for doing async IO. On success, returns a file descriptor that the application can mmap to gain access to the SQ ring, CQ ring, and io_uring_sqes. io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize) Initiates IO against the rings mapped to this fd, or waits for them to complete, or both. The behavior is controlled by the parameters passed in. If 'to_submit' is non-zero, then we'll try and submit new IO. If IORING_ENTER_GETEVENTS is set, the kernel will wait for 'min_complete' events, if they aren't already available. It's valid to set IORING_ENTER_GETEVENTS and 'min_complete' == 0 at the same time, this allows the kernel to return already completed events without waiting for them. This is useful only for polling, as for IRQ driven IO, the application can just check the CQ ring without entering the kernel. With this setup, it's possible to do async IO with a single system call. Future developments will enable polled IO with this interface, and polled submission as well. The latter will enable an application to do IO without doing ANY system calls at all. For IRQ driven IO, an application only needs to enter the kernel for completions if it wants to wait for them to occur. Each io_uring is backed by a workqueue, to support buffered async IO as well. We will only punt to an async context if the command would need to wait for IO on the device side. Any data that can be accessed directly in the page cache is done inline. This avoids the slowness issue of usual threadpools, since cached data is accessed as quickly as a sync interface. Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28vfs: Add configuration parser helpersDavid Howells1-1/+1
Because the new API passes in key,value parameters, match_token() cannot be used with it. Instead, provide three new helpers to aid with parsing: (1) fs_parse(). This takes a parameter and a simple static description of all the parameters and maps the key name to an ID. It returns 1 on a match, 0 on no match if unknowns should be ignored and some other negative error code on a parse error. The parameter description includes a list of key names to IDs, desired parameter types and a list of enumeration name -> ID mappings. [!] Note that for the moment I've required that the key->ID mapping array is expected to be sorted and unterminated. The size of the array is noted in the fsconfig_parser struct. This allows me to use bsearch(), but I'm not sure any performance gain is worth the hassle of requiring people to keep the array sorted. The parameter type array is sized according to the number of parameter IDs and is indexed directly. The optional enum mapping array is an unterminated, unsorted list and the size goes into the fsconfig_parser struct. The function can do some additional things: (a) If it's not ambiguous and no value is given, the prefix "no" on a key name is permitted to indicate that the parameter should be considered negatory. (b) If the desired type is a single simple integer, it will perform an appropriate conversion and store the result in a union in the parse result. (c) If the desired type is an enumeration, {key ID, name} will be looked up in the enumeration list and the matching value will be stored in the parse result union. (d) Optionally generate an error if the key is unrecognised. This is called something like: enum rdt_param { Opt_cdp, Opt_cdpl2, Opt_mba_mpbs, nr__rdt_params }; const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = { [Opt_cdp] = { fs_param_is_bool }, [Opt_cdpl2] = { fs_param_is_bool }, [Opt_mba_mpbs] = { fs_param_is_bool }, }; const const char *const rdt_param_keys[nr__rdt_params] = { [Opt_cdp] = "cdp", [Opt_cdpl2] = "cdpl2", [Opt_mba_mpbs] = "mba_mbps", }; const struct fs_parameter_description rdt_parser = { .name = "rdt", .nr_params = nr__rdt_params, .keys = rdt_param_keys, .specs = rdt_param_specs, .no_source = true, }; int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param) { struct fs_parse_result parse; struct rdt_fs_context *ctx = rdt_fc2context(fc); int ret; ret = fs_parse(fc, &rdt_parser, param, &parse); if (ret < 0) return ret; switch (parse.key) { case Opt_cdp: ctx->enable_cdpl3 = true; return 0; case Opt_cdpl2: ctx->enable_cdpl2 = true; return 0; case Opt_mba_mpbs: ctx->enable_mba_mbps = true; return 0; } return -EINVAL; } (2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or string value and performs an appropriate path lookup to convert it into a path object, which it will then return. If the desired type was a blockdev, the type of the looked up inode will be checked to make sure it is one. This can be used like: enum foo_param { Opt_source, nr__foo_params }; const struct fs_parameter_spec foo_param_specs[nr__foo_params] = { [Opt_source] = { fs_param_is_blockdev }, }; const char *char foo_param_keys[nr__foo_params] = { [Opt_source] = "source", }; const struct constant_table foo_param_alt_keys[] = { { "device", Opt_source }, }; const struct fs_parameter_description foo_parser = { .name = "foo", .nr_params = nr__foo_params, .nr_alt_keys = ARRAY_SIZE(foo_param_alt_keys), .keys = foo_param_keys, .alt_keys = foo_param_alt_keys, .specs = foo_param_specs, }; int foo_parse_param(struct fs_context *fc, struct fs_parameter *param) { struct fs_parse_result parse; struct foo_fs_context *ctx = foo_fc2context(fc); int ret; ret = fs_parse(fc, &foo_parser, param, &parse); if (ret < 0) return ret; switch (parse.key) { case Opt_source: return fs_lookup_param(fc, &foo_parser, param, &parse, &ctx->source); default: return -EINVAL; } } (3) lookup_constant(). This takes a table of named constants and looks up the given name within it. The table is expected to be sorted such that bsearch() be used upon it. Possibly I should require the table be terminated and just use a for-loop to scan it instead of using bsearch() to reduce hassle. Tables look something like: static const struct constant_table bool_names[] = { { "0", false }, { "1", true }, { "false", false }, { "no", false }, { "true", true }, { "yes", true }, }; and a lookup is done with something like: b = lookup_constant(bool_names, param->string, -1); Additionally, optional validation routines for the parameter description are provided that can be enabled at compile time. A later patch will invoke these when a filesystem is registered. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-05scsi: fs: remove exofsChristoph Hellwig1-1/+0
This was an example for using the SCSI OSD protocol, which we're trying to remove. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-30vfs: Introduce fs_context, switch vfs_kern_mount() to it.David Howells1-1/+2
Introduce a filesystem context concept to be used during superblock creation for mount and superblock reconfiguration for remount. This is allocated at the beginning of the mount procedure and into it is placed: (1) Filesystem type. (2) Namespaces. (3) Source/Device names (there may be multiple). (4) Superblock flags (SB_*). (5) Security details. (6) Filesystem-specific data, as set by the mount options. Accessor functions are then provided to set up a context, parameterise it from monolithic mount data (the data page passed to mount(2)) and tear it down again. A legacy wrapper is provided that implements what will be the basic operations, wrapping access to filesystems that aren't yet aware of the fs_context. Finally, vfs_kern_mount() is changed to make use of the fs_context and mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount(). [AV -- add missing kstrdup()] [AV -- put_cred() can be unconditional - fc->cred can't be NULL] [AV -- take legacy_validate() contents into legacy_parse_monolithic()] [AV -- merge KERNEL_MOUNT and USER_MOUNT] [AV -- don't unlock superblock on success return from vfs_get_tree()] [AV -- kill 'reference' argument of init_fs_context()] Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-21fs: common implementation of file typePhillip Potter1-1/+2
Many file systems use a copy&paste implementation of dirent to on-disk file type conversions. Create a common implementation to be used by file systems with some useful conversion helpers to reduce open coded file type conversions in file system code. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-11autofs: remove left-over autofs4 stubsLinus Torvalds1-1/+0
There's no need to retain the fs/autofs4 directory for backward compatibility. Adding an AUTOFS4_FS fragment to the autofs Kconfig and a module alias for autofs4 is sufficient for almost all cases. Not keeping fs/autofs4 remnants will prevent "insmod <path>/autofs4/autofs4.ko" from working but this shouldn't be used in automation scripts rather than modprobe(8). There were some comments about things to look out for with the module rename in the fs/autofs4/Kconfig that is removed by this patch, see the commit patch if you are interested. One potential problem with this change is that when the fs/autofs/Kconfig fragment for AUTOFS4_FS is removed any AUTOFS4_FS entries will be removed from the kernel config, resulting in no autofs file system being built if there is no AUTOFS_FS entry also. This would have also happened if the fs/autofs4 remnants had remained and is most likely to be a problem with automated builds. Please check your build configurations before the removal which will occur after the next couple of kernel releases. Acked-by: Ian Kent <raven@themaw.net> [ With edits and commit message from Ian Kent ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-07autofs: create autofs Kconfig and MakefileIan Kent1-0/+1
Create Makefile and Kconfig for autofs module. [raven@themaw.net: make autofs4 Kconfig depend on AUTOFS_FS] Link: http://lkml.kernel.org/r/152687649097.8263.7046086367407522029.stgit@pluto.themaw.net Link: http://lkml.kernel.org/r/152626705591.28589.356365986974038383.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-29split d_path() and friends into a separate fileAl Viro1-1/+1
Those parts of fs/dcache.c are pretty much self-contained. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-11-28ncpfs: move net/ncpfs to drivers/staging/ncpfsStephen Hemminger1-1/+0
The Netware Core Protocol is a file system that talks to Netware clients over IPX. Since IPX has been dead for many years move the file system into staging for eventual interment. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-14logfs: remove from treeChristoph Hellwig1-1/+0
Logfs was introduced to the kernel in 2009, and hasn't seen any non drive-by changes since 2012, while having lots of unsolved issues including the complete lack of error handling, with more and more issues popping up without any fixes. The logfs.org domain has been bouncing from a mail, and the maintainer on the non-logfs.org domain hasn't repsonded to past queries either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-21fs: introduce iomap infrastructureChristoph Hellwig1-0/+1
Add infrastructure for multipage buffered writes. This is implemented using an main iterator that applies an actor function to a range that can be written. This infrastucture is used to implement a buffered write helper, one to zero file ranges and one to implement the ->page_mkwrite VM operations. All of them borrow a fair amount of code from fs/buffers. for now by using an internal version of __block_write_begin that gets passed an iomap and builds the corresponding buffer head. The file system is gets a set of paired ->iomap_begin and ->iomap_end calls which allow it to map/reserve a range and get a notification once the write code is finished with it. Based on earlier code from Dave Chinner. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-26Merge tag 'ofs-pull-tag-1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs filesystem from Mike Marshall. This finally merges the long-pending orangefs filesystem, which has been much cleaned up with input from Al Viro over the last six months. From the documentation file: "OrangeFS is an LGPL userspace scale-out parallel storage system. It is ideal for large storage problems faced by HPC, BigData, Streaming Video, Genomics, Bioinformatics. Orangefs, originally called PVFS, was first developed in 1993 by Walt Ligon and Eric Blumer as a parallel file system for Parallel Virtual Machine (PVM) as part of a NASA grant to study the I/O patterns of parallel programs. Orangefs features include: - Distributes file data among multiple file servers - Supports simultaneous access by multiple clients - Stores file data and metadata on servers using local file system and access methods - Userspace implementation is easy to install and maintain - Direct MPI support - Stateless" see Documentation/filesystems/orangefs.txt for more in-depth details. * tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (174 commits) orangefs: fix orangefs_superblock locking orangefs: fix do_readv_writev() handling of error halfway through orangefs: have ->kill_sb() evict the VFS side of things first orangefs: sanitize ->llseek() orangefs-bufmap.h: trim unused junk orangefs: saner calling conventions for getting a slot orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer orangefs: get rid of readdir_handle_s ornagefs: ensure that truncate has an up to date inode size orangefs: move code which sets i_link to orangefs_inode_getattr orangefs: remove needless wrapper around GFP_KERNEL orangefs: remove wrapper around mutex_lock(&inode->i_mutex) orangefs: refactor inode type or link_target change detection orangefs: use new getattr for revalidate and remove old getattr orangefs: use new getattr in inode getattr and permission orangefs: use new orangefs_inode_getattr to get size in write and llseek orangefs: use new orangefs_inode_getattr to create new inodes orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr orangefs: remove inode->i_lock wrapper orangefs: put register_chrdev immediately before register_filesystem ...
2016-03-17fs crypto: move per-file encryption from f2fs tree to fs/cryptoJaegeuk Kim1-0/+1
This patch adds the renamed functions moved from the f2fs crypto files. 1. definitions for per-file encryption used by ext4 and f2fs. 2. crypto.c for encrypt/decrypt functions a. IO preparation: - fscrypt_get_ctx / fscrypt_release_ctx b. before IOs: - fscrypt_encrypt_page - fscrypt_decrypt_page - fscrypt_zeroout_range c. after IOs: - fscrypt_decrypt_bio_pages - fscrypt_pullback_bio_page - fscrypt_restore_control_page 3. policy.c supporting context management. a. For ioctls: - fscrypt_process_policy - fscrypt_get_policy b. For context permission - fscrypt_has_permitted_context - fscrypt_inherit_context 4. keyinfo.c to handle permissions - fscrypt_get_encryption_info - fscrypt_free_encryption_info 5. fname.c to support filename encryption a. general wrapper functions - fscrypt_fname_disk_to_usr - fscrypt_fname_usr_to_disk - fscrypt_setup_filename - fscrypt_free_filename b. specific filename handling functions - fscrypt_fname_alloc_buffer - fscrypt_fname_free_buffer 6. Makefile and Kconfig Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Ildar Muslukhov <ildarm@google.com> Signed-off-by: Uday Savagaonkar <savagaon@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-11-16Orangefs: Merge tag 'v4.4-rc1' into for-nextMike Marshall1-3/+4
Linux 4.4-rc1
2015-10-15ext4: promote ext4 over ext2 in the default probe orderDarrick J. Wong1-3/+4
Prevent clean ext3 filesystems from mounting by default with the ext2 driver (with no journal!) by putting ext4 ahead of ext2 in the default probe order. This will have the effect of mounting ext2 filesystems with ext4.ko by default, which is a safer failure than hoping the user notices that their journalled ext3 is now running without a journal! Users who require ext2.ko for ext2 can either disable ext4.ko or explicitly request ext2 via "mount -t ext2" or "rootfstype=ext2". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-10-03Orangefs: kernel client part 7Mike Marshall1-0/+1
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-09-04userfaultfd: buildsystem activationAndrea Arcangeli1-0/+1
This allows to select the userfaultfd during configuration to build it. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Cc: zhang.zhanghailiang@huawei.com Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Feiner <pfeiner@google.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-23fs: Remove ext3 filesystem driverJan Kara1-2/+0
The functionality of ext3 is fully supported by ext4 driver. Major distributions (SUSE, RedHat) already use ext4 driver to handle ext3 filesystems for quite some time. There is some ugliness in mm resulting from jbd cleaning buffers in a dirty page without cleaning page dirty bit and also support for buffer bouncing in the block layer when stable pages are required is there only because of jbd. So let's remove the ext3 driver. This saves us some 28k lines of duplicated code. Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>
2015-05-31um: Remove hppfsRichard Weinberger1-1/+0
hppfs (honeypot procfs) was an attempt to use UML as honeypot. It was never stable nor in heavy use. As Al Viro and Christoph Hellwig pointed some major issues out it is better to let it die. Signed-off-by: Richard Weinberger <richard@nod.at>
2015-04-14Merge tag 'trace-4.1-tracefs' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracefs from Steven Rostedt: "This adds the new tracefs file system. This has been in linux-next for more than one release, as I had it ready for the 4.0 merge window, but a last minute thing that needed to go into Linux first had to be done. That was that perf hard coded the file system number when reading /sys/kernel/debugfs/tracing directory making sure that the path had the debugfs mount # before it would parse the tracing file. This broke other use cases of perf, and the check is removed. Now when mounting /sys/kernel/debug, tracefs is automatically mounted in /sys/kernel/debug/tracing such that old tools will still see that path as expected. But now system admins can mount tracefs directly and not need to mount debugfs, which can expose security issues. A new directory is created when tracefs is configured such that system admins can now mount it separately (/sys/kernel/tracing)" * tag 'trace-4.1-tracefs' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Have mkdir and rmdir be part of tracefs tracefs: Add directory /sys/kernel/tracing tracing: Automatically mount tracefs on debugfs/tracing tracing: Convert the tracing facility over to use tracefs tracefs: Add new tracefs file system tracing: Create cmdline tracer options on tracing fs init tracing: Only create tracer options files if directory exists debugfs: Provide a file creation function that also takes an initial size
2015-02-17Merge branch 'parisc-3.20-1' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc update from Helge Deller: "The major change in here is the removal of the old HP-UX compat code which should have made it possible to load and execute 32-bit HP-UX binaries on PA-RISC Linux. Since it was never functional and since nobody cares about old 32-bit HPUX binaries any longer, it's now time to free up 3200 lines of kernel code (CONFIG_HPUX and CONFIG_BINFMT_SOM). Other than that we wire up the execveat() syscall, fix sparse errors and have some whitespace cleanups" * 'parisc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: fs/binfmt_som: Drop kernel support for HP-UX SOM binaries parisc: Remove unused function parisc: macro whitespace fixes parisc/uaccess: fix sparse errors parisc: hpux - Remove HPUX syscall numbers parisc: hpux - Remove hpux gateway page parisc: hpux - Delete files in hpux subdirectory parisc: hpux - Do not compile hpux subdirectory parisc: hpux - Drop support for HP-UX binaries parisc: Add error checks when building up signal trampoline handler parisc: Wire up execveat syscall
2015-02-17fs/binfmt_som: Drop kernel support for HP-UX SOM binariesHelge Deller1-1/+0
The parisc arch has been the only user of HP-UX SOM binaries. Support for HP-UX executables was never finished and since we now drop support for the HP-UX compat layer anyway, it does not makes sense to keep the BINFMT_SOM support. Cc: linux-fsdevel@vger.kernel.org Cc: linux-parisc@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de>
2015-02-16vfs,ext2: remove CONFIG_EXT2_FS_XIP and rename CONFIG_FS_XIP to CONFIG_FS_DAXMatthew Wilcox1-1/+1
The fewer Kconfig options we have the better. Use the generic CONFIG_FS_DAX to enable XIP support in ext2 as well as in the core. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Andreas Dilger <andreas.dilger@intel.com> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-16dax,ext2: replace XIP read and write with DAX I/OMatthew Wilcox1-0/+1
Use the generic AIO infrastructure instead of custom read and write methods. In addition to giving us support for AIO, this adds the missing locking between read() and truncate(). Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andreas Dilger <andreas.dilger@intel.com> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-03tracefs: Add new tracefs file systemSteven Rostedt (Red Hat)1-0/+1
Add a separate file system to handle the tracing directory. Currently it is part of debugfs, but that is starting to show its limits. One thing is that in order to access the tracing infrastructure, you need to mount debugfs. As that includes debugging from all sorts of sub systems in the kernel, it is not considered advisable to mount such an all encompassing debugging system. Having the tracing system in its own file systems gives access to the tracing sub system without needing to include all other systems. Another problem with tracing using the debugfs system is that the instances use mkdir to create sub buffers. debugfs does not support mkdir from userspace so to implement it, special hacks were used. By controlling the file system that the tracing infrastructure uses, this can be properly done without hacks. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-10Merge branch 'nsfs' into for-nextAl Viro1-1/+1
2014-12-10take the targets of /proc/*/ns/* symlinks to separate fsAl Viro1-1/+1
New pseudo-filesystem: nsfs. Targets of /proc/*/ns/* live there now. It's not mountable (not even registered, so it's not in /proc/filesystems, etc.). Files on it *are* bindable - we explicitly permit that in do_loopback(). This stuff lives in fs/nsfs.c now; proc_ns_fget() moved there as well. get_proc_ns() is a macro now (it's simply returning ->i_private; would have been an inline, if not for header ordering headache). proc_ns_inode() is an ex-parrot. The interface used in procfs is ns_get_path(path, task, ops) and ns_get_name(buf, size, task, ops). Dentries and inodes are never hashed; a non-counting reference to dentry is stashed in ns_common (removed by ->d_prune()) and reused by ns_get_path() if present. See ns_get_path()/ns_prune_dentry/nsfs_evict() for details of that mechanism. As the result, proc_ns_follow_link() has stopped poking in nd->path.mnt; it does nd_jump_link() on a consistent <vfsmount,dentry> pair it gets from ns_get_path(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-20ovl: rename filesystem type to "overlay"Miklos Szeredi1-1/+1
Some distributions carry an "old" format of overlayfs while mainline has a "new" format. The distros will possibly want to keep the old overlayfs alongside the new for compatibility reasons. To make it possible to differentiate the two versions change the name of the new one from "overlayfs" to "overlay". Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Andy Whitcroft <apw@canonical.com>
2014-10-24overlay filesystemMiklos Szeredi1-0/+1
Overlayfs allows one, usually read-write, directory tree to be overlaid onto another, read-only directory tree. All modifications go to the upper, writable layer. This type of mechanism is most often used for live CDs but there's a wide variety of other uses. The implementation differs from other "union filesystem" implementations in that after a file is opened all operations go directly to the underlying, lower or upper, filesystems. This simplifies the implementation and allows native performance in these cases. The dentry tree is duplicated from the underlying filesystems, this enables fast cached lookups without adding special support into the VFS. This uses slightly more memory than union mounts, but dentries are relatively small. Currently inodes are duplicated as well, but it is a possible optimization to share inodes for non-directories. Opening non directories results in the open forwarded to the underlying filesystem. This makes the behavior very similar to union mounts (with the same limitations vs. fchmod/fchown on O_RDONLY file descriptors). Usage: mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper/upper,workdir=/upper/work /overlay The following cotributions have been folded into this patch: Neil Brown <neilb@suse.de>: - minimal remount support - use correct seek function for directories - initialise is_real before use - rename ovl_fill_cache to ovl_dir_read Felix Fietkau <nbd@openwrt.org>: - fix a deadlock in ovl_dir_read_merged - fix a deadlock in ovl_remove_whiteouts Erez Zadok <ezk@fsl.cs.sunysb.edu> - fix cleanup after WARN_ON Sedat Dilek <sedat.dilek@googlemail.com> - fix up permission to confirm to new API Robin Dong <hao.bigrat@gmail.com> - fix possible leak in ovl_new_inode - create new inode in ovl_link Andy Whitcroft <apw@canonical.com> - switch to __inode_permission() - copy up i_uid/i_gid from the underlying inode AV: - ovl_copy_up_locked() - dput(ERR_PTR(...)) on two failure exits - ovl_clear_empty() - one failure exit forgetting to do unlock_rename(), lack of check for udir being the parent of upper, dropping and regaining the lock on udir (which would require _another_ check for parent being right). - bogus d_drop() in copyup and rename [fix from your mail] - copyup/remove and copyup/rename races [fix from your mail] - ovl_dir_fsync() leaving ERR_PTR() in ->realfile - ovl_entry_free() is pointless - it's just a kfree_rcu() - fold ovl_do_lookup() into ovl_lookup() - manually assigning ->d_op is wrong. Just use ->s_d_op. [patches picked from Miklos]: * copyup/remove and copyup/rename races * bogus d_drop() in copyup and rename Also thanks to the following people for testing and reporting bugs: Jordi Pujol <jordipujolp@gmail.com> Andy Whitcroft <apw@canonical.com> Michal Suchanek <hramrach@centrum.cz> Felix Fietkau <nbd@openwrt.org> Erez Zadok <ezk@fsl.cs.sunysb.edu> Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-08-07take fs_pin stuff to fs/*Al Viro1-1/+1
Add a new field to fs_pin - kill(pin). That's what umount and r/o remount will be calling for all pins attached to vfsmount and superblock resp. Called after bumping the refcount, so it won't go away under us. Dropping the refcount is responsibility of the instance. All generic stuff moved to fs/fs_pin.c; the next step will rip all the knowledge of kernel/acct.c from fs/super.c and fs/namespace.c. After that - death to mnt_pin(); it was intended to be usable as generic mechanism for code that wants to attach objects to vfsmount, so that they would not make the sucker busy and would get killed on umount. Never got it right; it remained acct.c-specific all along. Now it's very close to being killable. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-19block: move ioprio.c from fs/ to block/Jens Axboe1-1/+1
Like commit f9c78b2b, move this block related file outside of fs/ and into the core block directory, block/. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-05-19block: move bio.c and bio-integrity.c from fs/ to block/Jens Axboe1-2/+1
They really belong in block/, especially now since it's not in drivers/block/ anymore. Additionally, the get_maintainer script gets it wrong when in fs/. Suggested-by: Christoph Hellwig <hch@infradead.org> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-07kernfs: add CONFIG_KERNFSTejun Heo1-1/+2
As sysfs was kernfs's only user, kernfs has been piggybacking on CONFIG_SYSFS; however, kernfs is scheduled to grow a new user very soon. Introduce a separate config option CONFIG_KERNFS which is to be selected by kernfs users. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-28Merge branch 'for-linus' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus assorted cleanups and fixes all over the place... There will be another pile later this week" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits) __dentry_path() fixes vfs: Remove second variable named error in __dentry_path vfs: Is mounted should be testing mnt_ns for NULL or error. Fix race when checking i_size on direct i/o read hfsplus: remove can_set_xattr nfsd: use get_acl and ->set_acl fs: remove generic_acl nfs: use generic posix ACL infrastructure for v3 Posix ACLs gfs2: use generic posix ACL infrastructure jfs: use generic posix ACL infrastructure xfs: use generic posix ACL infrastructure reiserfs: use generic posix ACL infrastructure ocfs2: use generic posix ACL infrastructure jffs2: use generic posix ACL infrastructure hfsplus: use generic posix ACL infrastructure f2fs: use generic posix ACL infrastructure ext2/3/4: use generic posix ACL infrastructure btrfs: use generic posix ACL infrastructure fs: make posix_acl_create more useful fs: make posix_acl_chmod more useful ...
2014-01-26fs: remove generic_aclChristoph Hellwig1-1/+0
And instead convert tmpfs to use the new generic ACL code, with two stub methods provided for in-memory filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-25fs: merge xattr_acl.c into posix_acl.cChristoph Hellwig1-1/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-27sysfs, kernfs: add skeletons for kernfsTejun Heo1-1/+1
Core sysfs implementation will be separated into kernfs so that it can be used by other non-kobject users. This patch creates fs/kernfs/ directory and makes boilerplate changes. kernfs interface will be directly based on sysfs_dirent and its forward declaration is moved to include/linux/kernfs.h which is included from include/linux/sysfs.h. sysfs core implementation will be gradually separated out and moved to kernfs. This patch doesn't introduce any functional changes. v2: mount.c added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
2013-05-01Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/efi changes from Peter Anvin: "The bulk of these changes are cleaning up the efivars handling and breaking it up into a tree of files. There are a number of fixes as well. The entire changeset is pretty big, but most of it is code movement. Several of these commits are quite new; the history got very messed up due to a mismerge with the urgent changes for rc8 which completely broke IA64, and so Ingo requested that we rebase it to straighten it out." * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: remove "kfree(NULL)" efi: locking fix in efivar_entry_set_safe() efi, pstore: Read data from variable store before memcpy() efi, pstore: Remove entry from list when erasing efi, pstore: Initialise 'entry' before iterating efi: split efisubsystem from efivars efivarfs: Move to fs/efivarfs efivars: Move pstore code into the new EFI directory efivars: efivar_entry API efivars: Keep a private global pointer to efivars efi: move utf16 string functions to efi.h x86, efi: Make efi_memblock_x86_reserve_range more readable efivarfs: convert to use simple_open()
2013-04-30fs: make binfmt support for #! scripts modular and removableJosh Triplett1-4/+1
Add a new configuration option CONFIG_BINFMT_SCRIPT to configure support for interpreted scripts starting with "#!"; allow compiling out that support, or building it as a module. Embedded systems running exclusively compiled binaries could leave this support out, and systems that don't need scripts before mounting the root filesystem can build this as a module. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29fs: don't compile in drop_caches.c when CONFIG_SYSCTL=nJosh Triplett1-1/+2
drop_caches.c provides code only invokable via sysctl, so don't compile it in when CONFIG_SYSCTL=n. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-17efivarfs: Move to fs/efivarfsMatt Fleming1-0/+1
Now that efivarfs uses the efivar API, move it out of efivars.c and into fs/efivarfs where it belongs. This move will eventually allow us to enable the efivarfs code without having to also enable CONFIG_EFI_VARS built, and vice versa. Furthermore, things like, mount -t efivarfs none /sys/firmware/efi/efivars will now work if efivarfs is built as a module without requiring the use of MODULE_ALIAS(), which would have been necessary when the efivarfs code was part of efivars.c. Cc: Matthew Garrett <matthew.garrett@nebula.com> Cc: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Tom Gundersen <teg@jklm.no> Tested-by: Tom Gundersen <teg@jklm.no> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-04-09fold fifo.c into pipe.cAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-11f2fs: update Kconfig and MakefileJaegeuk Kim1-0/+1
This adds Makefile and Kconfig for f2fs, and updates Makefile and Kconfig files in the fs directory. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-10-06coredump: make core dump functionality optionalAlex Kelly1-1/+2
Adds an expert Kconfig option, CONFIG_COREDUMP, which allows disabling of core dump. This saves approximately 2.6k in the compiled kernel, and complements CONFIG_ELF_CORE, which now depends on it. CONFIG_COREDUMP also disables coredump-related sysctls, except for suid_dumpable and related functions, which are necessary for ptrace. [akpm@linux-foundation.org: fix binfmt_aout.c build] Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-02coredump: move core dump functionality into its own fileAlex Kelly1-1/+1
This prepares for making core dump functionality optional. The variable "suid_dumpable" and associated functions are left in fs/exec.c because they're used elsewhere, such as in ptrace. Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20fs: initial qnx6fs additionKai Bankett1-0/+1
Adds support for qnx6fs readonly support to the linux kernel. * Mount option The option mmi_fs can be used to mount Harman Becker/Audi MMI 3G HDD qnx6fs filesystems. * Documentation A high level filesystem stucture description can be found in the Documentation/filesystems directory. (qnx6.txt) * Additional features - Active (stable) superblock selection - Superblock checksum check (enforced) - Supports mount of qnx6 filesystems with to host different endianess - Automatic endianess detection - Longfilename support (with non-enfocing crc check) - All blocksizes (512, 1024, 2048 and 4096 supported) Signed-off-by: Kai Bankett <chaosman@ontika.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-06Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into ZAl Viro1-1/+0
2012-01-03vfs: take /proc/*/mounts and friends to fs/proc_namespace.cAl Viro1-0/+2
rationale: that stuff is far tighter bound to fs/namespace.c than to the guts of procfs proper. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03move fs/partitions to block/Al Viro1-1/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-10-27fs/Makefile: Stupid typo breakage of exofs inclusionBoaz Harrosh1-1/+1
In my last patch I did a stupid mistake and broke the exofs compilation completely. Fix it ASAP. Instead of obj-y I did obj-$(y) Really Really sorry. Me totally blushing :-{| Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-24fs/Makefile: Always inspect exofs/Boaz Harrosh1-1/+1
fs/exofs directory has multiple targets now, of which the ore.ko will be needed by the pnfs-objects-layout-driver (fs/nfs/objlayout). As suggested by: Michal Marek <mmarek@suse.cz> convert inclusion of exofs/ from obj-$(CONFIG_EXOFS_FS) => obj-$(y). So ORE can be selected also from fs/nfs/Kconfig CC: Michal Marek <mmarek@suse.cz> CC: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-07-15nfsd: Remove deprecated nfsctl system call and related code.NeilBrown1-1/+0
As promised in feature-removal-schedule.txt it is time to remove the nfsctl system call. Userspace has perferred to not use this call throughout 2.6 and it has been excluded in the default configuration since 2.6.36 (9 months ago). So this patch removes all the code that was being compiled out. There are still references to sys_nfsctl in various arch systemcall tables and related code. These should be cleaned out too, probably in the next merge window. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-16Merge branch 'release' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] tioca: Fix assignment from incompatible pointer warnings [IA64] mca.c: Fix cast from integer to pointer warning [IA64] setup.c Typo fix "Architechtuallly" [IA64] Add CONFIG_MISC_DEVICES=y to configs that need it. [IA64] disable interrupts at end of ia64_mca_cpe_int_handler() [IA64] Add DMA_ERROR_CODE define. pstore: fix build warning for unused return value from sysfs_create_file pstore: X86 platform interface using ACPI/APEI/ERST pstore: new filesystem interface to platform persistent storage
2011-03-15vfs: Add name to file handle conversion supportAneesh Kumar K.V1-0/+2
The syscall also return mount id which can be used to lookup file system specific information such as uuid in /proc/<pid>/mountinfo Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-12-28pstore: new filesystem interface to platform persistent storageTony Luck1-0/+1
Some platforms have a small amount of non-volatile storage that can be used to store information useful to diagnose the cause of a system crash. This is the generic part of a file system interface that presents information from the crash as a series of files in /dev/pstore. Once the information has been seen, the underlying storage is freed by deleting the files. Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-10-28Merge 'staging-next' to Linus's treeGreg Kroah-Hartman1-2/+0
This merges the staging-next tree to Linus's tree and resolves some conflicts that were present due to changes in other trees that were affected by files here. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-05smbfs: move to drivers/stagingArnd Bergmann1-1/+0
smbfs has been scheduled for removal in 2.6.27, so maybe we can now move it to drivers/staging on the way out. smbfs still uses the big kernel lock and nobody is going to fix that, so we should be getting rid of it soon. This removes the 32 bit compat mount and ioctl handling code, which is implemented in common fs code, and moves all smbfs related files into drivers/staging/smbfs. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-05autofs3: move to drivers/stagingArnd Bergmann1-1/+0
Nobody appears to be interested in fixing autofs3 bugs any more and it uses the BKL, which is going away. Move this to staging for retirement. Unless someone complains until 2.6.38, we can remove it for good. The include/linux/auto_fs.h header file is still used by autofs4, so it remains in place. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Ian Kent <raven@themaw.net> Cc: autofs@linux.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-22nfsd: allow deprecated interface to be compiled out.NeilBrown1-4/+1
Add CONFIG_NFSD_DEPRECATED, default to y. Only include deprecated interface if this is defined. This allows distros to remove this interface before the official removal, and allows developers to test without it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-05-21Take statfs variants to fs/statfs.cAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-19Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (205 commits) ceph: update for write_inode API change ceph: reset osd after relevant messages timed out ceph: fix flush_dirty_caps race with caps migration ceph: include migrating caps in issued set ceph: fix osdmap decoding when pools include (removed) snaps ceph: return EBADF if waiting for caps on closed file ceph: set osd request message front length correctly ceph: reset front len on return to msgpool; BUG on mismatched front iov ceph: fix snaptrace decoding on cap migration between mds ceph: use single osd op reply msg ceph: reset bits on connection close ceph: remove bogus mds forward warning ceph: remove fragile __map_osds optimization ceph: fix connection fault STANDBY check ceph: invalidate_authorizer without con->mutex held ceph: don't clobber write return value when using O_SYNC ceph: fix client_request_forward decoding ceph: drop messages on unregistered mds sessions; cleanup ceph: fix comments, locking in destroy_inode ceph: move dereference after NULL test ... Fix trivial conflicts in Documentation/ioctl/ioctl-number.txt
2009-11-20[LogFS] add new flash file systemJoern Engel1-0/+1
This is a new flash file system. See Documentation/filesystems/logfs.txt Signed-off-by: Joern Engel <joern@logfs.org>
2009-10-06ceph: Kconfig, MakefileSage Weil1-0/+1
Kconfig options and Makefile. Signed-off-by: Sage Weil <sage@newdream.net>
2009-04-07nilfs2: update makefile and KconfigRyusuke Konishi1-0/+1
This adds a Makefile for the nilfs2 file system, and updates the makefile and Kconfig file in the file system directory. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscacheLinus Torvalds1-0/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (41 commits) NFS: Add mount options to enable local caching on NFS NFS: Display local caching state NFS: Store pages from an NFS inode into a local cache NFS: Read pages from FS-Cache into an NFS inode NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching NFS: Add read context retention for FS-Cache to call back with NFS: FS-Cache page management NFS: Add some new I/O counters for FS-Cache doing things for NFS NFS: Invalidate FsCache page flags when cache removed NFS: Use local disk inode cache NFS: Define and create inode-level cache objects NFS: Define and create superblock-level objects NFS: Define and create server-level objects NFS: Register NFS for caching and retrieve the top-level index NFS: Permit local filesystem caching to be enabled for NFS NFS: Add FS-Cache option bit and debug bit NFS: Add comment banners to some NFS functions FS-Cache: Make kAFS use FS-Cache CacheFiles: A cache that backs onto a mounted filesystem CacheFiles: Export things for CacheFiles ...
2009-04-03Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds1-0/+1
* 'for-linus' of git://git.open-osd.org/linux-open-osd: fs: Add exofs to Kernel build exofs: Documentation exofs: export_operations exofs: super_operations and file_system_type exofs: dir_inode and directory operations exofs: address_space_operations exofs: symlink_inode and fast_symlink_inode operations exofs: file and file_inode operations exofs: Kbuild, Headers and osd utils
2009-04-03CacheFiles: A cache that backs onto a mounted filesystemDavid Howells1-0/+1
Add an FS-Cache cache-backend that permits a mounted filesystem to be used as a backing store for the cache. CacheFiles uses a userspace daemon to do some of the cache management - such as reaping stale nodes and culling. This is called cachefilesd and lives in /sbin. The source for the daemon can be downloaded from: http://people.redhat.com/~dhowells/cachefs/cachefilesd.c And an example configuration from: http://people.redhat.com/~dhowells/cachefs/cachefilesd.conf The filesystem and data integrity of the cache are only as good as those of the filesystem providing the backing services. Note that CacheFiles does not attempt to journal anything since the journalling interfaces of the various filesystems are very specific in nature. CacheFiles creates a misc character device - "/dev/cachefiles" - that is used to communication with the daemon. Only one thing may have this open at once, and whilst it is open, a cache is at least partially in existence. The daemon opens this and sends commands down it to control the cache. CacheFiles is currently limited to a single cache. CacheFiles attempts to maintain at least a certain percentage of free space on the filesystem, shrinking the cache by culling the objects it contains to make space if necessary - see the "Cache Culling" section. This means it can be placed on the same medium as a live set of data, and will expand to make use of spare space and automatically contract when the set of data requires more space. ============ REQUIREMENTS ============ The use of CacheFiles and its daemon requires the following features to be available in the system and in the cache filesystem: - dnotify. - extended attributes (xattrs). - openat() and friends. - bmap() support on files in the filesystem (FIBMAP ioctl). - The use of bmap() to detect a partial page at the end of the file. It is strongly recommended that the "dir_index" option is enabled on Ext3 filesystems being used as a cache. ============= CONFIGURATION ============= The cache is configured by a script in /etc/cachefilesd.conf. These commands set up cache ready for use. The following script commands are available: (*) brun <N>% (*) bcull <N>% (*) bstop <N>% (*) frun <N>% (*) fcull <N>% (*) fstop <N>% Configure the culling limits. Optional. See the section on culling The defaults are 7% (run), 5% (cull) and 1% (stop) respectively. The commands beginning with a 'b' are file space (block) limits, those beginning with an 'f' are file count limits. (*) dir <path> Specify the directory containing the root of the cache. Mandatory. (*) tag <name> Specify a tag to FS-Cache to use in distinguishing multiple caches. Optional. The default is "CacheFiles". (*) debug <mask> Specify a numeric bitmask to control debugging in the kernel module. Optional. The default is zero (all off). The following values can be OR'd into the mask to collect various information: 1 Turn on trace of function entry (_enter() macros) 2 Turn on trace of function exit (_leave() macros) 4 Turn on trace of internal debug points (_debug()) This mask can also be set through sysfs, eg: echo 5 >/sys/modules/cachefiles/parameters/debug ================== STARTING THE CACHE ================== The cache is started by running the daemon. The daemon opens the cache device, configures the cache and tells it to begin caching. At that point the cache binds to fscache and the cache becomes live. The daemon is run as follows: /sbin/cachefilesd [-d]* [-s] [-n] [-f <configfile>] The flags are: (*) -d Increase the debugging level. This can be specified multiple times and is cumulative with itself. (*) -s Send messages to stderr instead of syslog. (*) -n Don't daemonise and go into background. (*) -f <configfile> Use an alternative configuration file rather than the default one. =============== THINGS TO AVOID =============== Do not mount other things within the cache as this will cause problems. The kernel module contains its own very cut-down path walking facility that ignores mountpoints, but the daemon can't avoid them. Do not create, rename or unlink files and directories in the cache whilst the cache is active, as this may cause the state to become uncertain. Renaming files in the cache might make objects appear to be other objects (the filename is part of the lookup key). Do not change or remove the extended attributes attached to cache files by the cache as this will cause the cache state management to get confused. Do not create files or directories in the cache, lest the cache get confused or serve incorrect data. Do not chmod files in the cache. The module creates things with minimal permissions to prevent random users being able to access them directly. ============= CACHE CULLING ============= The cache may need culling occasionally to make space. This involves discarding objects from the cache that have been used less recently than anything else. Culling is based on the access time of data objects. Empty directories are culled if not in use. Cache culling is done on the basis of the percentage of blocks and the percentage of files available in the underlying filesystem. There are six "limits": (*) brun (*) frun If the amount of free space and the number of available files in the cache rises above both these limits, then culling is turned off. (*) bcull (*) fcull If the amount of available space or the number of available files in the cache falls below either of these limits, then culling is started. (*) bstop (*) fstop If the amount of available space or the number of available files in the cache falls below either of these limits, then no further allocation of disk space or files is permitted until culling has raised things above these limits again. These must be configured thusly: 0 <= bstop < bcull < brun < 100 0 <= fstop < fcull < frun < 100 Note that these are percentages of available space and available files, and do _not_ appear as 100 minus the percentage displayed by the "df" program. The userspace daemon scans the cache to build up a table of cullable objects. These are then culled in least recently used order. A new scan of the cache is started as soon as space is made in the table. Objects will be skipped if their atimes have changed or if the kernel module says it is still using them. =============== CACHE STRUCTURE =============== The CacheFiles module will create two directories in the directory it was given: (*) cache/ (*) graveyard/ The active cache objects all reside in the first directory. The CacheFiles kernel module moves any retired or culled objects that it can't simply unlink to the graveyard from which the daemon will actually delete them. The daemon uses dnotify to monitor the graveyard directory, and will delete anything that appears therein. The module represents index objects as directories with the filename "I..." or "J...". Note that the "cache/" directory is itself a special index. Data objects are represented as files if they have no children, or directories if they do. Their filenames all begin "D..." or "E...". If represented as a directory, data objects will have a file in the directory called "data" that actually holds the data. Special objects are similar to data objects, except their filenames begin "S..." or "T...". If an object has children, then it will be represented as a directory. Immediately in the representative directory are a collection of directories named for hash values of the child object keys with an '@' prepended. Into this directory, if possible, will be placed the representations of the child objects: INDEX INDEX INDEX DATA FILES ========= ========== ================================= ================ cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400 cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...FP1ry If the key is so long that it exceeds NAME_MAX with the decorations added on to it, then it will be cut into pieces, the first few of which will be used to make a nest of directories, and the last one of which will be the objects inside the last directory. The names of the intermediate directories will have '+' prepended: J1223/@23/+xy...z/+kl...m/Epqr Note that keys are raw data, and not only may they exceed NAME_MAX in size, they may also contain things like '/' and NUL characters, and so they may not be suitable for turning directly into a filename. To handle this, CacheFiles will use a suitably printable filename directly and "base-64" encode ones that aren't directly suitable. The two versions of object filenames indicate the encoding: OBJECT TYPE PRINTABLE ENCODED =============== =============== =============== Index "I..." "J..." Data "D..." "E..." Special "S..." "T..." Intermediate directories are always "@" or "+" as appropriate. Each object in the cache has an extended attribute label that holds the object type ID (required to distinguish special objects) and the auxiliary data from the netfs. The latter is used to detect stale objects in the cache and update or retire them. Note that CacheFiles will erase from the cache any file it doesn't recognise or any file of an incorrect type (such as a FIFO file or a device file). ========================== SECURITY MODEL AND SELINUX ========================== CacheFiles is implemented to deal properly with the LSM security features of the Linux kernel and the SELinux facility. One of the problems that CacheFiles faces is that it is generally acting on behalf of a process, and running in that process's context, and that includes a security context that is not appropriate for accessing the cache - either because the files in the cache are inaccessible to that process, or because if the process creates a file in the cache, that file may be inaccessible to other processes. The way CacheFiles works is to temporarily change the security context (fsuid, fsgid and actor security label) that the process acts as - without changing the security context of the process when it the target of an operation performed by some other process (so signalling and suchlike still work correctly). When the CacheFiles module is asked to bind to its cache, it: (1) Finds the security label attached to the root cache directory and uses that as the security label with which it will create files. By default, this is: cachefiles_var_t (2) Finds the security label of the process which issued the bind request (presumed to be the cachefilesd daemon), which by default will be: cachefilesd_t and asks LSM to supply a security ID as which it should act given the daemon's label. By default, this will be: cachefiles_kernel_t SELinux transitions the daemon's security ID to the module's security ID based on a rule of this form in the policy. type_transition <daemon's-ID> kernel_t : process <module's-ID>; For instance: type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t; The module's security ID gives it permission to create, move and remove files and directories in the cache, to find and access directories and files in the cache, to set and access extended attributes on cache objects, and to read and write files in the cache. The daemon's security ID gives it only a very restricted set of permissions: it may scan directories, stat files and erase files and directories. It may not read or write files in the cache, and so it is precluded from accessing the data cached therein; nor is it permitted to create new files in the cache. There are policy source files available in: http://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2 and later versions. In that tarball, see the files: cachefilesd.te cachefilesd.fc cachefilesd.if They are built and installed directly by the RPM. If a non-RPM based system is being used, then copy the above files to their own directory and run: make -f /usr/share/selinux/devel/Makefile semodule -i cachefilesd.pp You will need checkpolicy and selinux-policy-devel installed prior to the build. By default, the cache is located in /var/fscache, but if it is desirable that it should be elsewhere, than either the above policy files must be altered, or an auxiliary policy must be installed to label the alternate location of the cache. For instructions on how to add an auxiliary policy to enable the cache to be located elsewhere when SELinux is in enforcing mode, please see: /usr/share/doc/cachefilesd-*/move-cache.txt When the cachefilesd rpm is installed; alternatively, the document can be found in the sources. ================== A NOTE ON SECURITY ================== CacheFiles makes use of the split security in the task_struct. It allocates its own task_security structure, and redirects current->act_as to point to it when it acts on behalf of another process, in that process's context. The reason it does this is that it calls vfs_mkdir() and suchlike rather than bypassing security and calling inode ops directly. Therefore the VFS and LSM may deny the CacheFiles access to the cache data because under some circumstances the caching code is running in the security context of whatever process issued the original syscall on the netfs. Furthermore, should CacheFiles create a file or directory, the security parameters with that object is created (UID, GID, security label) would be derived from that process that issued the system call, thus potentially preventing other processes from accessing the cache - including CacheFiles's cache management daemon (cachefilesd). What is required is to temporarily override the security of the process that issued the system call. We can't, however, just do an in-place change of the security data as that affects the process as an object, not just as a subject. This means it may lose signals or ptrace events for example, and affects what the process looks like in /proc. So CacheFiles makes use of a logical split in the security between the objective security (task->sec) and the subjective security (task->act_as). The objective security holds the intrinsic security properties of a process and is never overridden. This is what appears in /proc, and is what is used when a process is the target of an operation by some other process (SIGKILL for example). The subjective security holds the active security properties of a process, and may be overridden. This is not seen externally, and is used whan a process acts upon another object, for example SIGKILLing another process or opening a file. LSM hooks exist that allow SELinux (or Smack or whatever) to reject a request for CacheFiles to run in a context of a specific security label, or to create files and directories with another security label. This documentation is added by the patch to: Documentation/filesystems/caching/cachefiles.txt Signed-Off-By: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03FS-Cache: Add main configuration option, module entry points and debuggingDavid Howells1-0/+1
Add the main configuration option, allowing FS-Cache to be selected; the module entry and exit functions and the debugging stuff used by these patches. The two configuration options added are: CONFIG_FSCACHE CONFIG_FSCACHE_DEBUG The first enables the facility, and the second makes the debugging statements enableable through the "debug" module parameter. The value of this parameter is a bitmask as described in: Documentation/filesystems/caching/fscache.txt The module can be loaded at this point, but all it will do at this point in the patch series is to start up the slow work facility and shut it down again. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-03-31Take fs_struct handling to new file (fs/fs_struct.c)Al Viro1-1/+1
Pure code move; two new helper functions for nfsd and daemonize (unshare_fs_struct() and daemonize_fs_struct() resp.; for now - the same code as used to be in callers). unshare_fs_struct() exported (for nfsd, as copy_fs_struct()/exit_fs() used to be), copy_fs_struct() and exit_fs() don't need exports anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31fs: Add exofs to Kernel buildBoaz Harrosh1-0/+1
- Add exofs to fs/Kconfig under "menu 'Miscellaneous filesystems'" - Add exofs to fs/Makefile Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-03-26quota: Move quota files into separate directoryJan Kara1-5/+1
Quota subsystem has more and more files. It's time to create a dir for it. Signed-off-by: Jan Kara <jack@suse.cz>
2009-02-28ext4: Reorder fs/Makefile so that ext2 root fs's are mounted using ext2Theodore Ts'o1-2/+4
In fs/Makefile, ext3 was placed before ext2 so that a root filesystem that possessed a journal, it would be mounted as ext3 instead of ext2. This was necessary because a cleanly unmounted ext3 filesystem was fully backwards compatible with ext2, and could be mounted by ext2 --- but it was desirable that it be mounted with ext3 so that the journaling would be enabled. The ext4 filesystem supports new incompatible features, so there is no danger of an ext4 filesystem being mistaken for an ext2 filesystem. At that point, the relative ordering of ext4 with respect to ext2 didn't matter until ext4 gained the ability to mount filesystems without a journal starting in 2.6.29-rc1. Now that this is the case, given that ext4 is before ext2, it means that root filesystems that were using the plain-jane ext2 format are getting mounted using the ext4 filesystem driver, which is a change in behavior which could be surprising to users. It's doubtful that there are that many ext2-only root filesystem users that would also have ext4 compiled into the kernel, but to adhere to the principle of least surprise, the correct ordering in fs/Makefile is ext3, followed by ext2, and finally ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linusLinus Torvalds1-0/+1
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus: MAINTAINERS: squashfs entry Squashfs: documentation Squashfs: initrd support Squashfs: Kconfig entry Squashfs: Makefiles Squashfs: header files Squashfs: block operations Squashfs: cache operations Squashfs: uid/gid lookup operations Squashfs: fragment block operations Squashfs: export operations Squashfs: super block operations Squashfs: symlink operations Squashfs: regular file operations Squashfs: directory readdir operations Squashfs: directory lookup operations Squashfs: inode operations
2009-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds1-0/+1
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (864 commits) Btrfs: explicitly mark the tree log root for writeback Btrfs: Drop the hardware crc32c asm code Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING Btrfs: kmap_atomic(KM_USER0) is safe for btrfs_readpage_end_io_hook Btrfs: Don't use kmap_atomic(..., KM_IRQ0) during checksum verifies Btrfs: tree logging checksum fixes Btrfs: don't change file extent's ram_bytes in btrfs_drop_extents Btrfs: Use btrfs_join_transaction to avoid deadlocks during snapshot creation Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code Btrfs: drop EXPORT symbols from extent_io.c Btrfs: Fix checkpatch.pl warnings Btrfs: Fix free block discard calls down to the block layer Btrfs: avoid orphan inode caused by log replay Btrfs: avoid potential super block corruption Btrfs: do not call kfree if kmalloc failed in btrfs_sysfs_add_super Btrfs: fix a memory leak in btrfs_get_sb Btrfs: Fix typo in clear_state_cb Btrfs: Fix memset length in btrfs_file_write Btrfs: update directory's size when creating subvol/snapshot Btrfs: add permission checks to the ioctls ...
2009-01-05quota: Split off quota tree handling into a separate fileJan Kara1-0/+1
There is going to be a new version of quota format having 64-bit quota limits and a new quota format for OCFS2. They are both going to use the same tree structure as VFSv0 quota format. So split out tree handling into a separate file and make size of leaf blocks, amount of space usable in each block (needed for checksumming) and structures contained in them configurable so that the code can be shared. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-01-05Squashfs: MakefilesPhillip Lougher1-0/+1
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
2008-12-31filesystem notification: create fs/notify to contain all fs notificationEric Paris1-4/+1
Creating a generic filesystem notification interface, fsnotify, which will be used by inotify, dnotify, and eventually fanotify is really starting to clutter the fs directory. This patch simply moves inotify and dnotify into fs/notify/inotify and fs/notify/dnotify respectively to make both current fs/ and future notification tidier. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-11-19Merge branch 'master' of ↵Chris Mason1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
2008-11-06fat: move fs/vfat/* and fs/msdos/* to fs/fatOGAWA Hirofumi1-2/+0
This just moves those files, but change link order from MSDOS, VFAT to VFAT, MSDOS. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-17ext4: Remove an old reference to ext4dev in Makefile commentMartin Michlmayr1-1/+1
Remove an old reference to ext4dev. Signed-off-by: Martin Michlmayr <tbm@cyrius.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-16Configure out AIO supportThomas Petazzoni1-1/+2
This patchs adds the CONFIG_AIO option which allows to remove support for asynchronous I/O operations, that are not necessarly used by applications, particularly on embedded devices. As this is a size-reduction option, it depends on CONFIG_EMBEDDED. It allows to save ~7 kilobytes of kernel code/data: text data bss dec hex filename 1115067 119180 217088 1451335 162547 vmlinux 1108025 119048 217088 1444161 160941 vmlinux.new -7042 -132 0 -7174 -1C06 +/- This patch has been originally written by Matt Mackall <mpm@selenic.com>, and is part of the Linux Tiny project. [randy.dunlap@oracle.com: build fix] Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Zach Brown <zach.brown@oracle.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-1/+2
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits) svcrdma: Fix IRD/ORD polarity svcrdma: Update svc_rdma_send_error to use DMA LKEY svcrdma: Modify the RPC reply path to use FRMR when available svcrdma: Modify the RPC recv path to use FRMR when available svcrdma: Add support to svc_rdma_send to handle chained WR svcrdma: Modify post recv path to use local dma key svcrdma: Add a service to register a Fast Reg MR with the device svcrdma: Query device for Fast Reg support during connection setup svcrdma: Add FRMR get/put services NLM: Remove unused argument from svc_addsock() function NLM: Remove "proto" argument from lockd_up() NLM: Always start both UDP and TCP listeners lockd: Remove unused fields in the nlm_reboot structure lockd: Add helper to sanity check incoming NOTIFY requests lockd: change nlmclnt_grant() to take a "struct sockaddr *" lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET lockd: Support non-AF_INET addresses in nlm_lookup_host() NLM: Convert nlm_lookup_host() to use a single argument svcrdma: Add Fast Reg MR Data Types ...
2008-10-10ext4: Rename ext4dev to ext4Theodore Ts'o1-1/+1
The ext4 filesystem is getting stable enough that it's time to drop the "dev" prefix. Also remove the requirement for the TEST_FILESYS flag. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-29Configure out file locking featuresThomas Petazzoni1-1/+2
This patch adds the CONFIG_FILE_LOCKING option which allows to remove support for advisory locks. With this patch enabled, the flock() system call, the F_GETLK, F_SETLK and F_SETLKW operations of fcntl() and NFS support are disabled. These features are not necessarly needed on embedded systems. It allows to save ~11 Kb of kernel code and data: text data bss dec hex filename 1125436 118764 212992 1457192 163c28 vmlinux.old 1114299 118564 212992 1445855 160fdf vmlinux -11137 -200 0 -11337 -2C49 +/- This patch has originally been written by Matt Mackall <mpm@selenic.com>, and is part of the Linux Tiny project. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: matthew@wil.cx Cc: linux-fsdevel@vger.kernel.org Cc: mpm@selenic.com Cc: akpm@linux-foundation.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-25Add Btrfs to fs/Kconfig and fs/MakefileChris Mason1-0/+1
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-07-26omfs: update kbuild to include OMFSBob Copeland1-0/+1
Adds OMFS to the fs Kconfig and Makefile Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-16Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6Linus Torvalds1-0/+1
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: include to compilation UBIFS: add new flash file system UBIFS: add brief documentation MAINTAINERS: add UBIFS section do_mounts: allow UBI root device name VFS: export sync_sb_inodes VFS: move inode_lock into sync_sb_inodes
2008-07-15UBIFS: include to compilationArtem Bityutskiy1-0/+1
Add UBIFS to Makefile and Kbuild. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-07-03block: Block layer data integrity supportMartin K. Petersen1-0/+1
Some block devices support verifying the integrity of requests by way of checksums or other protection information that is submitted along with the I/O. This patch implements support for generating and verifying integrity metadata, as well as correctly merging, splitting and cloning bios and requests that have this extra information attached. See Documentation/block/data-integrity.txt for more information. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-30x86: compat_binfmt_elf KconfigRoland McGrath1-0/+1
This adds Kconfig and Makefile bits to build fs/compat_binfmt_elf.c, just added. Each arch that wants to use this file needs to add a "select COMPAT_BINFMT_ELF" line in its Kconfig bits that enable COMPAT. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17Remove valueless definition of hard-selected RAMFS optionRobert P. J. Day1-1/+1
Since CONFIG_RAMFS is currently hard-selected to "y", and since Documentation/filesystems/ramfs-rootfs-initramfs.txt reads as follows: "The amount of code required to implement ramfs is tiny, because all the work is done by the existing Linux caching infrastructure. Basically, you're mounting the disk cache as a filesystem. Because of this, ramfs is not an optional component removable via menuconfig, since there would be negligible space savings." It seems pointless to leave this as a Kconfig entry. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: eventfd coreDavide Libenzi1-0/+1
This is a very simple and light file descriptor, that can be used as event wait/dispatch by userspace (both wait and dispatch) and by the kernel (dispatch only). It can be used instead of pipe(2) in all cases where those would simply be used to signal events. Their kernel overhead is much lower than pipes, and they do not consume two fds. When used in the kernel, it can offer an fd-bridge to enable, for example, functionalities like KAIO or syslets/threadlets to signal to an fd the completion of certain operations. But more in general, an eventfd can be used by the kernel to signal readiness, in a POSIX poll/select way, of interfaces that would otherwise be incompatible with it. The API is: int eventfd(unsigned int count); The eventfd API accepts an initial "count" parameter, and returns an eventfd fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and write(2). The POLLIN flag is raised when the internal counter is greater than zero. The POLLOUT flag is raised when at least a value of "1" can be written to the internal counter. The POLLERR flag is raised when an overflow in the counter value is detected. The write(2) operation can never overflow the counter, since it blocks (unless O_NONBLOCK is set, in which case -EAGAIN is returned). But the eventfd_signal() function can do it, since it's supposed to not sleep during its operation. The read(2) function reads the __u64 counter value, and reset the internal value to zero. If the value read is equal to (__u64) -1, an overflow happened on the internal counter (due to 2^64 eventfd_signal() posts that has never been retired - unlickely, but possible). The write(2) call writes an __u64 count value, and adds it to the current counter. The eventfd fd supports O_NONBLOCK also. On the kernel side, we have: struct file *eventfd_fget(int fd); int eventfd_signal(struct file *file, unsigned int n); The eventfd_fget() should be called to get a struct file* from an eventfd fd (this is an fget() + check of f_op being an eventfd fops pointer). The kernel can then call eventfd_signal() every time it wants to post an event to userspace. The eventfd_signal() function can be called from any context. An eventfd() simple test and bench is available here: http://www.xmailserver.org/eventfd-bench.c This is the eventfd-based version of pipetest-4 (pipe(2) based): http://www.xmailserver.org/pipetest-4.c Not that performance matters much in the eventfd case, but eventfd-bench shows almost as double as performance than pipetest-4. [akpm@linux-foundation.org: fix i386 build] [akpm@linux-foundation.org: add sys_eventfd to sys_ni.c] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: timerfd coreDavide Libenzi1-0/+1
This patch introduces a new system call for timers events delivered though file descriptors. This allows timer event to be used with standard POSIX poll(2), select(2) and read(2). As a consequence of supporting the Linux f_op->poll subsystem, they can be used with epoll(2) too. The system call is defined as: int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr); The "ufd" parameter allows for re-use (re-programming) of an existing timerfd w/out going through the close/open cycle (same as signalfd). If "ufd" is -1, s new file descriptor will be created, otherwise the existing "ufd" will be re-programmed. The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME. The time specified in the "utmr->it_value" parameter is the expiry time for the timer. If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute time, otherwise it's a relative time. If the time specified in the "utmr->it_interval" is not zero (.tv_sec == 0, tv_nsec == 0), this is the period at which the following ticks should be generated. The "utmr->it_interval" should be set to zero if only one tick is requested. Setting the "utmr->it_value" to zero will disable the timer, or will create a timerfd without the timer enabled. The function returns the new (or same, in case "ufd" is a valid timerfd descriptor) file, or -1 in case of error. As stated before, the timerfd file descriptor supports poll(2), select(2) and epoll(2). When a timer event happened on the timerfd, a POLLIN mask will be returned. The read(2) call can be used, and it will return a u32 variable holding the number of "ticks" that happened on the interface since the last call to read(2). The read(2) call supportes the O_NONBLOCK flag too, and EAGAIN will be returned if no ticks happened. A quick test program, shows timerfd working correctly on my amd64 box: http://www.xmailserver.org/timerfd-test.c [akpm@linux-foundation.org: add sys_timerfd to sys_ni.c] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event: signalfd coreDavide Libenzi1-0/+1
This patch series implements the new signalfd() system call. I took part of the original Linus code (and you know how badly it can be broken :), and I added even more breakage ;) Signals are fetched from the same signal queue used by the process, so signalfd will compete with standard kernel delivery in dequeue_signal(). If you want to reliably fetch signals on the signalfd file, you need to block them with sigprocmask(SIG_BLOCK). This seems to be working fine on my Dual Opteron machine. I made a quick test program for it: http://www.xmailserver.org/signafd-test.c The signalfd() system call implements signal delivery into a file descriptor receiver. The signalfd file descriptor if created with the following API: int signalfd(int ufd, const sigset_t *mask, size_t masksize); The "ufd" parameter allows to change an existing signalfd sigmask, w/out going to close/create cycle (Linus idea). Use "ufd" == -1 if you want a brand new signalfd file. The "mask" allows to specify the signal mask of signals that we are interested in. The "masksize" parameter is the size of "mask". The signalfd fd supports the poll(2) and read(2) system calls. The poll(2) will return POLLIN when signals are available to be dequeued. As a direct consequence of supporting the Linux poll subsystem, the signalfd fd can use used together with epoll(2) too. The read(2) system call will return a "struct signalfd_siginfo" structure in the userspace supplied buffer. The return value is the number of bytes copied in the supplied buffer, or -1 in case of error. The read(2) call can also return 0, in case the sighand structure to which the signalfd was attached, has been orphaned. The O_NONBLOCK flag is also supported, and read(2) will return -EAGAIN in case no signal is available. If the size of the buffer passed to read(2) is lower than sizeof(struct signalfd_siginfo), -EINVAL is returned. A read from the signalfd can also return -ERESTARTSYS in case a signal hits the process. The format of the struct signalfd_siginfo is, and the valid fields depends of the (->code & __SI_MASK) value, in the same way a struct siginfo would: struct signalfd_siginfo { __u32 signo; /* si_signo */ __s32 err; /* si_errno */ __s32 code; /* si_code */ __u32 pid; /* si_pid */ __u32 uid; /* si_uid */ __s32 fd; /* si_fd */ __u32 tid; /* si_fd */ __u32 band; /* si_band */ __u32 overrun; /* si_overrun */ __u32 trapno; /* si_trapno */ __s32 status; /* si_status */ __s32 svint; /* si_int */ __u64 svptr; /* si_ptr */ __u64 utime; /* si_utime */ __u64 stime; /* si_stime */ __u64 addr; /* si_addr */ }; [akpm@linux-foundation.org: fix signalfd_copyinfo() on i386] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11signal/timer/event fds: anonymous inode sourceDavide Libenzi1-0/+1
This patch add an anonymous inode source, to be used for files that need and inode only in order to create a file*. We do not care of having an inode for each file, and we do not even care of having different names in the associated dentries (dentry names will be same for classes of file*). This allow code reuse, and will be used by epoll, signalfd and timerfd (and whatever else there'll be). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-17Remove JFFS (version 1), as scheduled.Jeff Garzik1-1/+0
Unmaintained for years, few if any users. Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-08[PATCH] fsstack: Introduce fsstack_copy_{attr,inode}_*Josef "Jeff" Sipek1-1/+2
Introduce several fsstack_copy_* functions which allow stackable filesystems (such as eCryptfs and Unionfs) to easily copy over (currently only) inode attributes. This prevents code duplication and allows for code reuse. [akpm@osdl.org: Remove unneeded wrapper] [bunk@stusta.de: fs/stack.c should #include <linux/fs_stack.h>] Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11[PATCH] jbd2: enable building of jbd2 and have ext4 use it rather than jbdMingming Cao1-0/+1
Reworked from a patch by Mingming Cao and Randy Dunlap Signed-off-By: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11[PATCH] ext4: enable building of ext4Mingming Cao1-0/+1
Originally part of a patch from Mingming Cao and Randy Dunlap. Reorganized by Shaggy. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6Linus Torvalds1-0/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6: (292 commits) [GFS2] Fix endian bug for de_type [GFS2] Initialize SELinux extended attributes at inode creation time. [GFS2] Move logging code into log.c (mostly) [GFS2] Mark nlink cleared so VFS sees it happen [GFS2] Two redundant casts removed [GFS2] Remove uneeded endian conversion [GFS2] Remove duplicate sb reading code [GFS2] Mark metadata reads for blktrace [GFS2] Remove iflags.h, use FS_ [GFS2] Fix code style/indent in ops_file.c [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits) [GFS2] inode-diet: Eliminate i_blksize from the inode structure [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) [GFS2] Fix typo in last patch [GFS2] Fix direct i/o logic in filemap.c [GFS2] Fix bug in Makefiles for lock modules [GFS2] Remove (extra) fs_subsys declaration [GFS2/DLM] Fix trailing whitespace [GFS2] Tidy up meta_io code ...
2006-10-04[PATCH] ecryptfs: fs/Makefile and fs/KconfigMichael Halcrow1-0/+1
eCryptfs is a stacked cryptographic filesystem for Linux. It is derived from Erez Zadok's Cryptfs, implemented through the FiST framework for generating stacked filesystems. eCryptfs extends Cryptfs to provide advanced key management and policy features. eCryptfs stores cryptographic metadata in the header of each file written, so that encrypted files can be copied between hosts; the file will be decryptable with the proper key, and there is no need to keep track of any additional information aside from what is already in the encrypted file itself. [akpm@osdl.org: updates for ongoing API changes] [bunk@stusta.de: cleanups] [akpm@osdl.org: alpha build fix] [akpm@osdl.org: cleanups] [tytso@mit.edu: inode-diet updates] [pbadari@us.ibm.com: generic_file_*_read/write() interface updates] [rdunlap@xenotime.net: printk format fixes] [akpm@osdl.org: make slab creation and teardown table-driven] Signed-off-by: Phillip Hellewell <phillip@hellewell.homeip.net> Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02Merge branch 'master' into gfs2Steven Whitehouse1-4/+11
2006-10-01[PATCH] Create fs/utimes.cAlexey Dobriyan1-1/+1
* fs/open.c is getting bit crowdy * preparation to lutimes(2) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] BLOCK: Make it possible to disable the block layer [try #6]David Howells1-4/+10
Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: (*) Block I/O tracing. (*) Disk partition code. (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. (*) Various block-based device drivers, such as IDE and the old CDROM drivers. (*) MTD blockdev handling and FTL. (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK. (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: (*) Default blockdev file operations (to give error ENODEV on opening). (*) Makes some /proc changes: (*) /proc/devices does not list any blockdevs. (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK. (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-29[PATCH] Generic infrastructure for aclsAndreas Gruenbacher1-0/+1
The patches solve the following problem: We want to grant access to devices based on who is logged in from where, etc. This includes switching back and forth between multiple user sessions, etc. Using ACLs to define device access for logged-in users gives us all the flexibility we need in order to fully solve the problem. Device special files nowadays usually live on tmpfs, hence tmpfs ACLs. Different distros have come up with solutions that solve the problem to different degrees: SUSE uses a resource manager which tracks login sessions and sets ACLs on device inodes as appropriate. RedHat uses pam_console, which changes the primary file ownership to the logged-in user. Others use a set of groups that users must be in in order to be granted the appropriate accesses. The freedesktop.org project plans to implement a combination of a console-tracker and a HAL-device-list based solution to grant access to devices to users, and more distros will likely follow this approach. These patches have first been posted here on 2 February 2005, and again on 8 January 2006. We have been shipping them in SLES9 and SLES10 with no problems reported. The previous submission is archived here: http://lkml.org/lkml/2006/1/8/229 http://lkml.org/lkml/2006/1/8/230 http://lkml.org/lkml/2006/1/8/231 This patch: Add some infrastructure for access control lists on in-memory filesystems such as tmpfs. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03Merge rsync://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6Steven Whitehouse1-1/+1
Conflicts: include/linux/kernel.h
2006-06-26[PATCH] devfs: Remove devfs from the kernel treeGreg Kroah-Hartman1-1/+0
This is the first patch in a series of patches that removes devfs support from the kernel. This patch removes the core devfs code, and its private header file. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-20[PATCH] inotify (1/5): split kernel API from userspace supportAmy Griffis1-0/+1
The following series of patches introduces a kernel API for inotify, making it possible for kernel modules to benefit from inotify's mechanism for watching inodes. With these patches, inotify will maintain for each caller a list of watches (via an embedded struct inotify_watch), where each inotify_watch is associated with a corresponding struct inode. The caller registers an event handler and specifies for which filesystem events their event handler should be called per inotify_watch. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-25Merge branch 'master'Steven Whitehouse1-1/+1
2006-05-17configfs: Make sure configfs_init() is called before consumers.Joel Becker1-1/+1
configfs_init() needs to be called first to register configfs before anyconsumers try to access it. Move up configfs in fs/Makefile to make sure it is initialized early. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-03Merge branch 'master'Steven Whitehouse1-1/+1
2006-03-31Merge branch 'master'Steven Whitehouse1-2/+1
2006-03-31[PATCH] sys_sync_file_range()Andrew Morton1-1/+1
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT fadvise() additions, do it in a new sys_sync_file_range() syscall instead. Reasons: - It's more flexible. Things which would require two or three syscalls with fadvise() can be done in a single syscall. - Using fadvise() in this manner is something not covered by POSIX. The patch wires up the syscall for x86. The sycall is implemented in the new fs/sync.c. The intention is that we can move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later. Documentation for the syscall is in fs/sync.c. A test app (sync_file_range.c) is in http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz. The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for NFS_DATA_SYNC which is hopefully the more common." Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if the queue is congested. This is trivial to fix: add a new flag bit, set wbc->nonblocking. But I'm not sure that we want to expose implementation details down to that level. Note: it's notable that we can sync an fd which wasn't opened for writing. Same with fsync() and fdatasync()). Note: the code takes some care to handle attempts to sync file contents outside the 16TB offset on 32-bit machines. It makes such attempts appear to succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such requests fail... Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30[PATCH] Introduce sys_splice() system callJens Axboe1-1/+1
This adds support for the sys_splice system call. Using a pipe as a transport, it can connect to files or sockets (latter as output only). From the splice.c comments: "splice": joining two ropes together by interweaving their strands. This is the "extended pipe" functionality, where a pipe is used as an arbitrary in-memory buffer. Think of a pipe as a small kernel buffer that you can use to transfer data from one end to the other. The traditional unix read/write is extended with a "splice()" operation that transfers data buffers to or from a pipe buffer. Named by Larry McVoy, original implementation from Linus, extended by Jens to support splicing to files and fixing the initial implementation bugs. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] relay: migrate from relayfs to a generic relay APIJens Axboe1-1/+0
Original patch from Paul Mundt, sysfs parts removed by me since they were broken. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-18[DLM] The core of the DLM for GFS2/CLVMDavid Teigland1-0/+1
This is the core of the distributed lock manager which is required to use GFS2 as a cluster filesystem. It is also used by CLVM and can be used as a standalone lock manager independantly of either of these two projects. It implements VAX-style locking modes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
2006-01-16[GFS2] Hook GFS2 into the Kbuild systemDavid Teigland1-0/+1
Adds GFS2 into fs/Kconfig and adds a Makefile entry Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-10[PATCH] sanitize building of fs/compat_ioctl.cChristoph Hellwig1-1/+1
Now that all these entries in the arch ioctl32.c files are gone [1], we can build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a special do_ioctl32_pointer handler for s390 so the compat_ptr call is done. This is not needed but harmless on all other architectures. Also remove some superflous includes in fs/compat_ioctl.c Tested on ppc64. [1] parisc still had it's PPP handler left, which is not fully correct for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd kick in for all netdevice users. We can introduce a proper handler in one of the next patch series by adding a compat_ioctl method to struct net_device but for now let's just kill it - parisc doesn't compile in mainline anyway and I don't want this to block this patchset. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] drop-pagecacheAndrew Morton1-1/+1
Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to discard as much pagecache and/or reclaimable slab objects as it can. THis operation requires root permissions. It won't drop dirty data, so the user should run `sync' first. Caveats: a) Holds inode_lock for exorbitant amounts of time. b) Needs to be taught about NUMA nodes: propagate these all the way through so the discarding can be controlled on a per-node basis. This is a debugging feature: useful for getting consistent results between filesystem benchmarks. We could possibly put it under a config option, but it's less than 300 bytes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-03[PATCH] OCFS2: The Second Oracle Cluster FilesystemMark Fasheh1-0/+1
Link the code into the kernel build system. OCFS2 is marked as experimental. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
2006-01-03[PATCH] configfs: User-driven configuration filesystemJoel Becker1-0/+1
Configfs, a file system for userspace-driven kernel object configuration. The OCFS2 stack makes extensive use of this for propagation of cluster configuration information into kernel. Signed-off-by: Joel Becker <joel.becker@oracle.com>
2005-11-07[PATCH] beginning of the shared-subtree properRam Pai1-1/+1
A private mount does not forward or receive propagation. This patch provides user the ability to convert any mount to private. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] FUSE - MAINTAINERS, Kconfig and Makefile changesMiklos Szeredi1-0/+1
This patch adds FUSE filesystem to MAINTAINERS, fs/Kconfig and fs/Makefile. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] v9fs: Documentation, Makefiles, ConfigurationEric Van Hensbergen1-0/+1
OVERVIEW V9FS is a distributed file system for Linux which provides an implementation of the Plan 9 resource sharing protocol 9P. It can be used to share all sorts of resources: static files, synthetic file servers (such as /proc or /sys), devices, and application file servers (such as FUSE). BACKGROUND Plan 9 (http://plan9.bell-labs.com/plan9) is a research operating system and associated applications suite developed by the Computing Science Research Center of AT&T Bell Laboratories (now a part of Lucent Technologies), the same group that developed UNIX , C, and C++. Plan 9 was initially released in 1993 to universities, and then made generally available in 1995. Its core operating systems code laid the foundation for the Inferno Operating System released as a product by Lucent Bell-Labs in 1997. The Inferno venture was the only commercial embodiment of Plan 9 and is currently maintained as a product by Vita Nuova (http://www.vitanuova.com). After updated releases in 2000 and 2002, Plan 9 was open-sourced under the OSI approved Lucent Public License in 2003. The Plan 9 project was started by Ken Thompson and Rob Pike in 1985. Their intent was to explore potential solutions to some of the shortcomings of UNIX in the face of the widespread use of high-speed networks to connect machines. In UNIX, networking was an afterthought and UNIX clusters became little more than a network of stand-alone systems. Plan 9 was designed from first principles as a seamless distributed system with integrated secure network resource sharing. Applications and services were architected in such a way as to allow for implicit distribution across a cluster of systems. Configuring an environment to use remote application components or services in place of their local equivalent could be achieved with a few simple command line instructions. For the most part, application implementations operated independent of the location of their actual resources. Commercial operating systems haven't changed much in the 20 years since Plan 9 was conceived. Network and distributed systems support is provided by a patchwork of middle-ware, with an endless number of packages supplying pieces of the puzzle. Matters are complicated by the use of different complicated protocols for individual services, and separate implementations for kernel and application resources. The V9FS project (http://v9fs.sourceforge.net) is an attempt to bring Plan 9's unified approach to resource sharing to Linux and other operating systems via support for the 9P2000 resource sharing protocol. V9FS HISTORY V9FS was originally developed by Ron Minnich and Maya Gokhale at Los Alamos National Labs (LANL) in 1997. In November of 2001, Greg Watson setup a SourceForge project as a public repository for the code which supported the Linux 2.4 kernel. About a year ago, I picked up the initial attempt Ron Minnich had made to provide 2.6 support and got the code integrated into a 2.6.5 kernel. I then went through a line-for-line re-write attempting to clean-up the code while more closely following the Linux Kernel style guidelines. I co-authored a paper with Ron Minnich on the V9FS Linux support including performance comparisons to NFSv3 using Bonnie and PostMark - this paper appeared at the USENIX/FREENIX 2005 conference in April 2005: ( http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html ). CALL FOR PARTICIPATION/REQUEST FOR COMMENTS Our 2.6 kernel support is stabilizing and we'd like to begin pursuing its integration into the official kernel tree. We would appreciate any review, comments, critiques, and additions from this community and are actively seeking people to join our project and help us produce something that would be acceptable and useful to the Linux community. STATUS The code is reasonably stable, although there are no doubt corner cases our regression tests haven't discovered yet. It is in regular use by several of the developers and has been tested on x86 and PowerPC (32-bit and 64-bit) in both small and large (LANL cluster) deployments. Our current regression tests include fsx, bonnie, and postmark. It was our intention to keep things as simple as possible for this release -- trying to focus on correctness within the core of the protocol support versus a rich set of features. For example: a more complete security model and cache layer are in the road map, but excluded from this release. Additionally, we have removed support for mmap operations at Al Viro's request. PERFORMANCE Detailed performance numbers and analysis are included in the FREENIX paper, but we show comparable performance to NFSv3 for large file operations based on the Bonnie benchmark, and superior performance for many small file operations based on the PostMark benchmark. Somewhat preliminary graphs (from the FREENIX paper) are available (http://v9fs.sourceforge.net/perf/index.html). RESOURCES The source code is available in a few different forms: tarballs: http://v9fs.sf.net CVSweb: http://cvs.sourceforge.net/viewcvs.py/v9fs/linux-9p/ CVS: :pserver:anonymous@cvs.sourceforge.net:/cvsroot/v9fs/linux-9p Git: rsync://v9fs.graverobber.org/v9fs (webgit: http://v9fs.graverobber.org) 9P: tcp!v9fs.graverobber.org!6564 The user-level server is available from either the Plan 9 distribution or from http://v9fs.sf.net Other support applications are still being developed, but preliminary version can be downloaded from sourceforge. Documentation on the protocol has historically been the Plan 9 Man pages (http://plan9.bell-labs.com/sys/man/5/INDEX.html), but there is an effort under way to write a more complete Internet-Draft style specification (http://v9fs.sf.net/rfc). There are a couple of mailing lists supporting v9fs, but the most used is v9fs-developer@lists.sourceforge.net -- please direct/cc your comments there so the other v9fs contibutors can participate in the conversation. There is also an IRC channel: irc://freenode.net/#v9fs This part of the patch contains Documentation, Makefiles, and configuration file changes. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] relayfsTom Zanussi1-0/+1
Here's the latest version of relayfs, against linux-2.6.11-mm2. I'm hoping you'll consider putting this version back into your tree - the previous rounds of comment seem to have shaken out all the API issues and the number of comments on the code itself have also steadily dwindled. This patch is essentially the same as the relayfs redux part 5 patch, with some minor changes based on reviewer comments. Thanks again to Pekka Enberg for those. The patch size without documentation is now a little smaller at just over 40k. Here's a detailed list of the changes: - removed the attribute_flags in relay open and changed it to a boolean specifying either overwrite or no-overwrite mode, and removed everything referencing the attribute flags. - added a check for NULL names in relayfs_create_entry() - got rid of the unnecessary multiple labels in relay_create_buf() - some minor simplification of relay_alloc_buf() which got rid of a couple params - updated the Documentation In addition, this version (through code contained in the relay-apps tarball linked to below, not as part of the relayfs patch) tries to make it as easy as possible to create the cooperating kernel/user pieces of a typical and common type of logging application, one where kernel logging is kicked off when a user space data collection app starts and stops when the collection app exits, with the data being automatically logged to disk in between. To create this type of application, you basically just include a header file (relay-app.h, included in the relay-apps tarball) in your kernel module, define a couple of callbacks and call an initialization function, and on the user side call a single function that sets up and continuously monitors the buffers, and writes data to files as it becomes available. Channels are created when the collection app is started and destroyed when it exits, not when the kernel module is inserted, so different channel buffer sizes can be specified for each separate run via command-line options. See the README in the relay-apps tarball for details. Also included in the relay-apps tarball are a couple examples demonstrating how you can use this to create quick and dirty kernel logging/debugging applications. They are: - tprintk, short for 'tee printk', which temporarily puts a kprobe on printk() and writes a duplicate stream of printk output to a relayfs channel. This could be used anywhere there's printk() debugging code in the kernel which you'd like to exercise, but would rather not have your system logs cluttered with debugging junk. You'd probably want to kill klogd while you do this, otherwise there wouldn't be much point (since putting a kprobe on printk() doesn't change the output of printk()). I've used this method to temporarily divert the packet logging output of the iptables LOG target from the system logs to relayfs files instead, for instance. - klog, which just provides a printk-like formatted logging function on top of relayfs. Again, you can use this to keep stuff out of your system logs if used in place of printk. The example applications can be found here: http://prdownloads.sourceforge.net/dprobes/relay-apps.tar.gz?download From: Christoph Hellwig <hch@lst.de> avoid lookup_hash usage in relayfs Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12[PATCH] inotifyRobert Love1-0/+1
inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a system call that returns a fd, not SIGIO. You get a single fd, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure), Gamin (a FAM replacement), and other projects. See Documentation/filesystems/inotify.txt. Signed-off-by: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] Update cfq io scheduler to time sliced designJens Axboe1-0/+1
This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-22[PATCH] NFSD: Add server support for NFSv3 ACLs.Andreas Gruenbacher1-0/+1
This adds functions for encoding and decoding POSIX ACLs for the NFSACL protocol extension, and the GETACL and SETACL RPCs. The implementation is compatible with NFSACL in Solaris. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Olaf Kirch <okir@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-04-16Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds1-0/+97
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!