[PATCH] intrinsic automount and mountpoint degradation support

Here's a patch that I worked out with Al Viro that adds support for a filesystem (such as kAFS) to perform automounting intrinsically without the need for a userspace daemon. It also adds support for such mountpoints to be degraded at the filesystem's behest until they've been untouched long enough that they'll be removed. I've a patch (to follow) that removes some #ifdef's from fs/afs/* thus allowing it to make use of this facility. There are five pieces to this: (1) Any interested filesystem needs to have at least one list to which expirable mountpoints can be added. Access to this list is governed by the vfsmount_lock. (2) When a filesystem wants to create an expirable mount, it calls do_kern_mount() to get a handle on the filesystem it wants mounting, and then calls do_add_mount() to mount that filesystem on the designated mountpoint, supplying the list mentioned in (1) to which the vfsmount will be added. In kAFS's case, the mountpoint is a directory with a follow_link() method defined (fs/afs/mntpt.c). This uses the struct nameidata supplied as an argument as a determination of where the new filesystem should be mounted. (3) When something using a vfsmount finishes dealing with it, it calls mntput(). This unmarks the vfsmount for immediate expiry. There are two criteria for determining if a vfsmount may be expired - it mustn't be marked as in use for anything other than being a child of another vfsmount, and it must have an expiry mark against it already. (4) The filesystem then determines the policy on expiring the mounts created in (2). When it feels the need to, it passes the list mentioned in (1) to mark_mounts_for_expiry() to request everything on the list be expired. This function examines each mount listed. If the vfsmount meets the criteria mentioned in (3), then the vfsmount is deleted from the namespace and disposed of as for unmounting; otherwise the vfsmount is left untouched apart from now bearing an expiration mark if it didn't before. kAFS's expiration policy is simply to invoke this process at regular intervals for all the mounts on its list. (5) An expiration facility is also provided to userspace: by calling umount() with a MNT_EXPIRE flag, it can make a request to unmount only if the mountpoint hasn't been used since the last request and isn't in use now. This allows expiration to be driven by userspace instead of by the kernel if that is desirable. This also means that do_umount() has to use a different version of path_release() to everyone else... it can't call mntput() as that clears the expiration flag, thus rendering this unachievable; so it's version of path_release() calls _mntput(), which doesn't do the clear. My original idea was to give the kernel more knowledge of automounted things. This avoids a certain problem with stat() on a mountpoint causing it to mount (for example, do "ls -l /afs" on a machine with kAFS), but Al wanted it done this way. > Why is autofs unsuitable? Because: (1) Autofs is flat; AFS requires a tree - mounts on mounts on mounts on mounts... (2) AFS holds the data as to what the mountpoints are and where they go, and these may be cross-links to subtrees beyond your control. It's also not trivial to extract a list of mountpoints as is required for autofs. (3) Autofs is not namespace safe. (4) Ducking back to userspace to get that to do the mount is pretty tricky if namespaces are involved. In fact, autofs may well want to make use of this facility. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
author: David Howells <dhowells@redhat.com> 2004-07-10 19:30:11 -0700
committer: Linus Torvalds <torvalds@ppc970.osdl.org> 2004-07-10 19:30:11 -0700
commit: c8a6ba011918486195870aab9a8e3d61c76c1b02 (patch)
tree: 896c5960db3db754d7364802486044fa907eed50 /Documentation
parent: 694effe72032db1384b5a084977b8bbb2c3cdaa6 (diff)
download: history-c8a6ba011918486195870aab9a8e3d61c76c1b02.tar.gz
1 files changed, 118 insertions, 0 deletions
diff --git a/Documentation/filesystems/automount-support.txt b/Documentation/filesystems/automount-support.txt
new file mode 100644
index 00000000000000..58c65a1713e529
--- /dev/null
+++ b/Documentation/filesystems/automount-support.txt
@@ -0,0 +1,118 @@
+Support is available for filesystems that wish to do automounting support (such
+as kAFS which can be found in fs/afs/). This facility includes allowing
+in-kernel mounts to be performed and mountpoint degradation to be
+requested. The latter can also be requested by userspace.
+
+
+======================
+IN-KERNEL AUTOMOUNTING
+======================
+
+A filesystem can now mount another filesystem on one of its directories by the
+following procedure:
+
+ (1) Give the directory a follow_link() operation.
+
+     When the directory is accessed, the follow_link op will be called, and
+     it will be provided with the location of the mountpoint in the nameidata
+     structure (vfsmount and dentry).
+
+ (2) Have the follow_link() op do the following steps:
+
+     (a) Call do_kern_mount() to call the appropriate filesystem to set up a
+         superblock and gain a vfsmount structure representing it.
+
+     (b) Copy the nameidata provided as an argument and substitute the dentry
+	 argument into it the copy.
+
+     (c) Call do_add_mount() to install the new vfsmount into the namespace's
+	 mountpoint tree, thus making it accessible to userspace. Use the
+	 nameidata set up in (b) as the destination.
+
+	 If the mountpoint will be automatically expired, then do_add_mount()
+	 should also be given the location of an expiration list (see further
+	 down).
+
+     (d) Release the path in the nameidata argument and substitute in the new
+	 vfsmount and its root dentry. The ref counts on these will need
+	 incrementing.
+
+Then from userspace, you can just do something like:
+
+	[root@andromeda root]# mount -t afs \#root.afs. /afs
+	[root@andromeda root]# ls /afs
+	asd  cambridge  cambridge.redhat.com  grand.central.org
+	[root@andromeda root]# ls /afs/cambridge
+	afsdoc
+	[root@andromeda root]# ls /afs/cambridge/afsdoc/
+	ChangeLog  html  LICENSE  pdf  RELNOTES-1.2.2
+
+And then if you look in the mountpoint catalogue, you'll see something like:
+
+	[root@andromeda root]# cat /proc/mounts
+	...
+	#root.afs. /afs afs rw 0 0
+	#root.cell. /afs/cambridge.redhat.com afs rw 0 0
+	#afsdoc. /afs/cambridge.redhat.com/afsdoc afs rw 0 0
+
+
+===========================
+AUTOMATIC MOUNTPOINT EXPIRY
+===========================
+
+Automatic expiration of mountpoints is easy, provided you've mounted the
+mountpoint to be expired in the automounting procedure outlined above.
+
+To do expiration, you need to follow these steps:
+
+ (3) Create at least one list off which the vfsmounts to be expired can be
+     hung. Access to this list will be governed by the vfsmount_lock.
+
+ (4) In step (2c) above, the call to do_add_mount() should be provided with a
+     pointer to this list. It will hang the vfsmount off of it if it succeeds.
+
+ (5) When you want mountpoints to be expired, call mark_mounts_for_expiry()
+     with a pointer to this list. This will process the list, marking every
+     vfsmount thereon for potential expiry on the next call.
+
+     If a vfsmount was already flagged for expiry, and if its usage count is 1
+     (it's only referenced by its parent vfsmount), then it will be deleted
+     from the namespace and thrown away (effectively unmounted).
+
+     It may prove simplest to simply call this at regular intervals, using
+     some sort of timed event to drive it.
+
+The expiration flag is cleared by calls to mntput. This means that expiration
+will only happen on the second expiration request after the last time the
+mountpoint was accessed.
+
+If a mountpoint is moved, it gets removed from the expiration list. If a bind
+mount is made on an expirable mount, the new vfsmount will not be on the
+expiration list and will not expire.
+
+If a namespace is copied, all mountpoints contained therein will be copied,
+and the copies of those that are on an expiration list will be added to the
+same expiration list.
+
+
+=======================
+USERSPACE DRIVEN EXPIRY
+=======================
+
+As an alternative, it is possible for userspace to request expiry of any
+mountpoint (though some will be rejected - the current process's idea of the
+rootfs for example). It does this by passing the MNT_EXPIRE flag to
+umount(). This flag is considered incompatible with MNT_FORCE and MNT_DETACH.
+
+If the mountpoint in question is in referenced by something other than
+umount() or its parent mountpoint, an EBUSY error will be returned and the
+mountpoint will not be marked for expiration or unmounted.
+
+If the mountpoint was not already marked for expiry at that time, an EAGAIN
+error will be given and it won't be unmounted.
+
+Otherwise if it was already marked and it wasn't referenced, unmounting will
+take place as usual.
+
+Again, the expiration flag is cleared every time anything other than umount()
+looks at a mountpoint.
author	David Howells <dhowells@redhat.com>	2004-07-10 19:30:11 -0700
committer	Linus Torvalds <torvalds@ppc970.osdl.org>	2004-07-10 19:30:11 -0700
commit	c8a6ba011918486195870aab9a8e3d61c76c1b02 (patch)
tree	896c5960db3db754d7364802486044fa907eed50 /Documentation
parent	694effe72032db1384b5a084977b8bbb2c3cdaa6 (diff)
download	history-c8a6ba011918486195870aab9a8e3d61c76c1b02.tar.gz