kernel/git/cem/xfsprogs-dev.git - cem's fork of xfsprogs-dev.git

Age	Commit message	Author	Files	Lines
2024-04-17	xfsprogs: Release v6.7.0 (HEAD, v6.7.0, master)	Carlos Maiolino	4	-2/+18
	Update all the necessary files for a 6.7.0 release. Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for HDIO_GETGEO	Christoph Hellwig	5	-32/+1
	HDIO_GETGEO has been around longer than XFS. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for SG_IO	Christoph Hellwig	5	-39/+8
	SG_IO has been around longer than XFS. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fstatat	Christoph Hellwig	4	-18/+2
	fstatat has been supported since Linux 2.6.16 and glibc 2.4. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for openat	Christoph Hellwig	4	-18/+2
	openat has been supported since Linux 2.6.16 and glibc 2.4. Note that xfs_db already uses it without the ifdef. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for the f_flags field in statfs	Christoph Hellwig	5	-22/+0
	The f_flags field has been supported since Linux 2.6.36. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fsetxattr	Christoph Hellwig	4	-20/+0
	fsetxattr has been supported since Linux 2.4 and glibc 2.3. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for mremap	Christoph Hellwig	5	-27/+0
	mremap has been around since before the dawn of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for preadv and pwritev	Christoph Hellwig	6	-42/+0
	preadv and pwritev have been supported since Linux 2.6.30. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for syncfs	Christoph Hellwig	7	-40/+0
	syncfs has been supported since Linux 2.6.39. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fallocate	Christoph Hellwig	6	-38/+0
	fallocate has been supported since Linux 2.6.23 and glibc 2.10. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fls	Christoph Hellwig	4	-19/+0
	fls should never be provided by system headers. It seems like on MacOS it did, but as we're not supporting MacOS anymore there is no need to check for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for readdir	Christoph Hellwig	5	-32/+1
	readdir has been part of POSIX since the very beginning. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for sync_file_range	Christoph Hellwig	5	-30/+2
	sync_file_range has been supported since Linux 2.6.17. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fiemap	Christoph Hellwig	5	-36/+1
	fiemap has been supported since Linux 2.6.28. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for mincore	Christoph Hellwig	5	-33/+1
	mincore has been supported since Linux 2.3.99pre1 and glibc 2.2. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for madvise	Christoph Hellwig	5	-33/+1
	madvise has been supported since before the dawn of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for sendfile	Christoph Hellwig	5	-33/+1
	sendfile has been supported since Linux 2.2 and glibc 2.1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for fadvise	Christoph Hellwig	5	-33/+1
	fadvise has been supported since Linux 2.5.60 and glibc 2.2. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: require libblkid	Christoph Hellwig	3	-48/+1
	We can't support block device access (which is the reason for xfsprogs to exist) without blkid. Make it a hard requirement and remove the stubs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	configure: don't check for getmntent	Christoph Hellwig	6	-47/+1
	getmntent always exists on Linux (and always has), so don't bother checking for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	io: don't redefine SEEK_DATA and SEEK_HOLE	Christoph Hellwig	1	-5/+0
	HAVE_SEEK_DATA is never defined, so the code in xfs_io just unconditionally redefines SEEK_DATA and SEEK_HOLE. Switch to the system version instead, which has been around since Linux 3.1 and glibc of similar vintage. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	include: stop generating platform_defs.h	Christoph Hellwig	3	-13/+4
	Now that the sizeof checks are gone, we can stop generating platform_defs.h. The only caveat is that we need to stop undefining ENABLE_GETTEXT, which the generation process had removed before. The actual ENABLE_GETTEXT will be passed on the compiler command line, just like other ENABLE or HAVE values from autoconf. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	include: stop using SIZEOF_LONG	Christoph Hellwig	2	-5/+1
	SIZEOF_LONG together with the unused SIZEOF_CHAR_P is the last thing that really needs a generated configuration header. Switch to just using sizeof(long) so that we can stop generating platform_defs.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	repair: refactor the BLKMAP_NEXTS_MAX check	Christoph Hellwig	2	-21/+15
	Check the 32-bit limits using sizeof instead of cpp ifdefs so that we can get rid of BITS_PER_LONG. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
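The sizeof-based check described in this commit can be sketched as follows. This is a minimal illustration and not the xfs_repair code: `EXAMPLE_NEXTS_MAX` and `nexts_ok()` are hypothetical names, and the limit value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: the largest count a 32-bit signed long can hold. */
#define EXAMPLE_NEXTS_MAX	INT32_MAX

/*
 * Return true if a mapping with nex extents can be handled here.
 * sizeof(long) is a compile-time constant, so the compiler can drop
 * the dead branch on 64-bit builds, much as an #if would, while both
 * branches still get type-checked on every platform.
 */
static bool nexts_ok(uint64_t nex)
{
	/* Sanity check: this sketch only considers 32- and 64-bit longs. */
	assert(sizeof(long) == 4 || sizeof(long) == 8);

	if (sizeof(long) == 4 && nex > (uint64_t)EXAMPLE_NEXTS_MAX)
		return false;
	return true;
}
```

Unlike a preprocessor conditional, this form needs no BITS_PER_LONG-style macro from a generated header.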
2024-03-13	include: unconditionally define umode_t	Christoph Hellwig	5	-22/+0
	No system or kernel uapi header defines umode_t, so just define it unconditionally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-13	include: remove the filldir_t typedef	Christoph Hellwig	1	-2/+0
	Neither struct filldir, nor filldir_t is used anywhere in xfsprogs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-03-12	xfs_db: don't hardcode 'type data' size at 512b	Darrick J. Wong	2	-2/+4
	On a disk with 4096-byte LBAs, the xfs_db 'type data' subcommand doesn't work: # xfs_io -c 'sb' -c 'type data' /dev/sda xfs_db: read failed: Invalid argument no current object The cause of this is the hardcoded initialization of bb_count when we're setting type data -- it should be the filesystem sector size, not just 1. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
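The fix amounts to a unit conversion from the sector size in bytes to 512-byte basic blocks. A minimal sketch, where `sect_to_bb()` and `EXAMPLE_BBSHIFT` are hypothetical names rather than xfs_db identifiers:

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_BBSHIFT	9	/* 512-byte basic blocks, the "512b" in the subject */

/* Number of 512-byte basic blocks that make up one sector. */
static uint32_t sect_to_bb(uint32_t sectsize)
{
	/* Sector sizes are powers of two no smaller than one basic block. */
	assert(sectsize >= (1u << EXAMPLE_BBSHIFT));
	assert((sectsize & (sectsize - 1)) == 0);
	return sectsize >> EXAMPLE_BBSHIFT;
}
```

With 4096-byte sectors this yields 8 basic blocks, whereas a hardcoded 1 requests a 512-byte read, which is consistent with the failed read shown in the commit message.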
2024-02-16	debian: Increase build verbosity, add terse support	Bastian Germann	1	-0/+4
	Section 4.9 of the Debian Policy reads: "The package build should be as verbose as reasonably possible, except where the terse tag is included in DEB_BUILD_OPTIONS". Implement such behavior for xfsprogs by passing V=1 to make by default. Link: https://www.debian.org/doc/debian-policy/ch-source.html#main-building-script-debian-rules Link: https://bugs.debian.org/1063774 Reported-by: Emanuele Rocca <ema@debian.org> Signed-off-by: Bastian Germann <bage@debian.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-16	build: Request 64-bit time_t where possible	Sam James	1	-2/+2
	Suggested by Darrick during LFS review. We take the same approach as in 5c0599b721d1d232d2e400f357abdf2736f24a97 ('Fix building xfsprogs on 32-bit platforms') to avoid autoconf hell - just take the tried & tested approach which is working fine for us with LFS already. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-16	io: Adapt to >= 64-bit time_t	Sam James	1	-3/+3
	We now require (at least) 64-bit time_t, so we need to adjust some printf specifiers accordingly. Unfortunately, we've stumbled upon a ridiculous C moment whereby there's no neat format specifier (not even one of the inttypes ones) for time_t, so we cast to intmax_t and use %jd. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
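The cast-and-%jd approach named in this commit looks roughly like the sketch below. `format_time()` is a hypothetical helper, not a function from xfs_io.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Format a time_t without assuming its width. time_t has no printf
 * specifier of its own, so widen it to intmax_t and print with %jd.
 * Returns the number of characters snprintf() would have written.
 */
static int format_time(char *buf, size_t len, time_t t)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%jd", (intmax_t)t);
}
```

Because intmax_t is at least as wide as any signed integer type, the same call works whether time_t is 32 or 64 bits.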
2024-02-16	Remove use of LFS64 interfaces	Violet Purcell	26	-74/+75
	LFS64 interfaces are non-standard and are being removed in the upcoming musl 1.2.5. Setting _FILE_OFFSET_BITS=64 (which is currently being done) makes all interfaces on glibc 64-bit by default, so using the LFS64 interfaces is redundant. This commit replaces all occurrences of off64_t with off_t, stat64 with stat, and fstat64 with fstat. Link: https://bugs.gentoo.org/907039 Cc: Felix Janda <felix.janda@posteo.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Violet Purcell <vimproved@inventati.org> Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: inode recovery does not validate the recovered inode (libxfs-sync-6.7)	Dave Chinner	1	-0/+3
	Source kernel commit: 038ca189c0d2c1570b4d922f25b524007c85cf94 Discovered when trying to track down a weird recovery corruption issue that wasn't detected at recovery time. The specific corruption was a zero extent count field when big extent counts are in use, and it turns out the dinode verifier doesn't detect that specific corruption case, either. So fix it too. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: fix internal error from AGFL exhaustion	Omar Sandoval	1	-3/+24
	Source kernel commit: f63a5b3769ad7659da4c0420751d78958ab97675 We've been seeing XFS errors like the following: XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c. Caller xfs_btree_insert+0x1ec/0x280 ... Call Trace: xfs_corruption_error+0x94/0xa0 xfs_btree_insert+0x221/0x280 xfs_alloc_fixup_trees+0x104/0x3e0 xfs_alloc_ag_vextent_size+0x667/0x820 xfs_alloc_fix_freelist+0x5d9/0x750 xfs_free_extent_fix_freelist+0x65/0xa0 __xfs_free_extent+0x57/0x180 ... This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when xfs_btree_insrec() fails. After converting this into a panic and dissecting the core dump, I found that xfs_btree_insrec() is failing because it's trying to split a leaf node in the cntbt when the AG free list is empty. In particular, it's failing to get a block from the AGFL _while trying to refill the AGFL_. If a single operation splits every level of the bnobt and the cntbt (and the rmapbt if it is enabled) at once, the free list will be empty. Then, when the next operation tries to refill the free list, it allocates space. If the allocation does not use a full extent, it will need to insert records for the remaining space in the bnobt and cntbt. And if those new records go in full leaves, the leaves (and potentially more nodes up to the old root) need to be split. Fix it by accounting for the additional splits that may be required to refill the free list in the calculation for the minimum free list size. P.S. As far as I can tell, this bug has existed for a long time -- maybe back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS macros ...") in April 1994! It requires a very unlucky sequence of events, and in fact we didn't hit it until a particular sparse mmap workload updated from 5.12 to 5.19. But this bug existed in 5.12, so it must've been exposed by some other change in allocation or writeback patterns. 
It's also much less likely to be hit with the rmapbt enabled, since that increases the minimum free list size and is unlikely to split at the same time as the bnobt and cntbt. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: abort intent items when recovery intents fail	Long Li	2	-3/+4
	Source kernel commit: f8f9d952e42dd49ae534f61f2fa7ca0876cb9848 When recovering intents, we capture newly created intent items as part of committing recovered intent items. If intent recovery fails at a later point, we forget to remove those newly created intent items from the AIL and hang: [root@localhost ~]# cat /proc/539/stack [<0>] xfs_ail_push_all_sync+0x174/0x230 [<0>] xfs_unmount_flush_inodes+0x8d/0xd0 [<0>] xfs_mountfs+0x15f7/0x1e70 [<0>] xfs_fs_fill_super+0x10ec/0x1b20 [<0>] get_tree_bdev+0x3c8/0x730 [<0>] vfs_get_tree+0x89/0x2c0 [<0>] path_mount+0xecf/0x1800 [<0>] do_mount+0xf3/0x110 [<0>] __x64_sys_mount+0x154/0x1f0 [<0>] do_syscall_64+0x39/0x80 [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd When newly created intent items fail to commit via transaction, intent recovery hasn't created done items for these newly created intent items, so the capture structure is the sole owner of the captured intent items. We must release them explicitly or else they leak: unreferenced object 0xffff888016719108 (size 432): comm "mount", pid 529, jiffies 4294706839 (age 144.463s) hex dump (first 32 bytes): 08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff ..q.......q..... 18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff ..q.......q..... 
backtrace: [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0 [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150 [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340 [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0 [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980 [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70 [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0 [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0 [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70 [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20 [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0 [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0 [<ffffffff81a9f35f>] path_mount+0xecf/0x1800 [<ffffffff81a9fd83>] do_mount+0xf3/0x110 [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0 [<ffffffff83968739>] do_syscall_64+0x39/0x80 Fix the problem above by aborting intent items that don't have a done item when recovery intents fail. Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery") Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: factor out xfs_defer_pending_abort	Long Li	1	-8/+15
	Source kernel commit: 2a5db859c6825b5d50377dda9c3cc729c20cad43 Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(). The new helper does not use the transaction parameter, so it can be used after the transaction life cycle. Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: invert the realtime summary cache	Omar Sandoval	1	-3/+3
	Source kernel commit: e23aaf450de733044a74bc95528f728478b61c2a In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I added a cache of the minimum level of the realtime summary that has any free extents. However, it turns out that the _maximum_ level is more useful for upcoming optimizations, and basically equivalent for the existing usage. So, let's change the meaning of the cache to be the maximum level + 1, or 0 if there are no free extents. For example, if the cache contains: {0, 4} then there are no free extents starting in realtime bitmap block 0, and there are no free extents larger than or equal to 2^4 blocks starting in realtime bitmap block 1. The cache is a loose upper bound, so there may or may not be free extents smaller than 2^4 blocks in realtime bitmap block 1. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: simplify rt bitmap/summary block accessor functions	Darrick J. Wong	2	-44/+41
	Source kernel commit: e2cf427c91494ea0d1173a911090c39665c5fdef Simplify the calling convention of these functions since the xfs_rtalloc_args structure contains the parameters we need. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: simplify xfs_rtbuf_get calling conventions	Darrick J. Wong	2	-68/+69
	Source kernel commit: 5b1d0ae9753f0654ab56c1e06155b3abf2919d71 Now that xfs_rtalloc_args holds references to the last-read bitmap and summary blocks, we don't need to pass the buffer pointer out of xfs_rtbuf_get. Callers no longer have to xfs_trans_brelse on their own, though they are required to call xfs_rtbuf_cache_relse before the xfs_rtalloc_args goes out of scope. While we're at it, create some trivial helpers so that we don't have to remember if "0" means "bitmap" and "1" means "summary". Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: cache last bitmap block in realtime allocator	Omar Sandoval	2	-76/+88
	Source kernel commit: e94b53ff699c2674a9ec083342a5254866210ade Profiling a workload on a highly fragmented realtime device showed a ton of CPU cycles being spent in xfs_trans_read_buf() called by xfs_rtbuf_get(). Further tracing showed that much of that was repeated calls to xfs_rtbuf_get() for the same block of the realtime bitmap. These come from xfs_rtallocate_extent_block(): as it walks through ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block is very fragmented, then this is _a lot_ of buffer lookups. The realtime allocator already passes around a cache of the last used realtime summary block to avoid repeated reads (the parameters rbpp and rsb). We can do the same for the realtime bitmap. This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches the most recently used block for both the realtime bitmap and summary. xfs_rtbuf_get() now handles the caching instead of the callers, which requires plumbing xfs_rtbuf_cache to more functions but also makes sure we don't miss anything. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: consolidate realtime allocation arguments	Dave Chinner	2	-203/+211
	Source kernel commit: 41f33d82cfd310e344fc9183f02cc9e0d2d27663 Consolidate the arguments passed around the rt allocator into a struct xfs_rtalloc_arg similar to how the btree allocator arguments are consolidated in a struct xfs_alloc_arg.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: use accessor functions for summary info words	Darrick J. Wong	3	-9/+42
	Source kernel commit: 663b8db7b0256b81152b2f786e45ecf12bdf265f Create get and set functions for rtsummary words so that we can redefine the ondisk format with a specific endianness. Note that this requires the definition of a distinct type for ondisk summary info words so that the compiler can perform proper typechecking. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create helpers for rtsummary block/wordcount computations	Darrick J. Wong	2	-0/+36
	Source kernel commit: bd85af280de66a946022775a876edf0c553e3f35 Create helper functions that compute the number of blocks or words necessary to store the rt summary file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: use accessor functions for bitmap words	Darrick J. Wong	3	-77/+67
	Source kernel commit: 97e993830a1cdd86ad7d207308b9f55a00660edd Create get and set functions for rtbitmap words so that we can redefine the ondisk format with a specific endianness. Note that this requires the definition of a distinct type for ondisk rtbitmap words so that the compiler can perform proper typechecking as we go back and forth. In the upcoming rtgroups feature, we're going to fix the problem that rtwords are written in host endian order, which means we'll need the distinct rtword/rtword_raw types. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create a helper to handle logging parts of rt bitmap/summary blocks	Darrick J. Wong	1	-15/+40
	Source kernel commit: 312d61021b8947446aa9ec80b78b9230e8cb3691 Create an explicit helper function to log parts of rt bitmap and summary blocks. While we're at it, fix an off-by-one error in two of the rtbitmap logging calls that led to unnecessarily large log items but was otherwise benign. Note that the upcoming rtgroups patchset will add block headers to the rtbitmap and rtsummary files. The helpers in this and the next few patches take a less than direct route through xfs_rbmblock_wordptr and xfs_rsumblock_infoptr to avoid helper churn in that patchset. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create helpers for rtbitmap block/wordcount computations	Darrick J. Wong	3	-4/+44
	Source kernel commit: d0448fe76ac1a9ccbce574577a4c82246d17eec4 Create helper functions that compute the number of blocks or words necessary to store the rt bitmap. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert to new timestamp accessors	Jeff Layton	6	-10/+86
	Source kernel commit: 75d1e312bbbd175fa27ffdd4c4fe9e8cc7d047ec Convert to using the new inode timestamp accessor functions. [Carlos: Also partially port 077c212f0344ae and 12cd4402365166] Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-75-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert rt summary macros to helpers	Darrick J. Wong	7	-16/+69
	Source kernel commit: 097b4b7b64ef67a4703b89fd4064480b61557fd5 Convert the realtime summary file macros to helper functions so that we can improve type checking. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert open-coded xfs_rtword_t pointer accesses to helper	Darrick J. Wong	2	-28/+42
	Source kernel commit: a9948626849c2c65dfd201b5e9d855e62937de61 There are a bunch of places where we use open-coded logic to find a pointer to an xfs_rtword_t within a rt bitmap buffer. Convert all that to helper functions for better type safety. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: remove XFS_BLOCKWSIZE and XFS_BLOCKWMASK macros	Darrick J. Wong	3	-11/+9
	Source kernel commit: add3cddaea509071d01bf1d34df0d05db1a93a07 Remove these trivial macros since they're not even part of the ondisk format. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert the rtbitmap block and bit macros to static inline functions	Darrick J. Wong	3	-20/+42
	Source kernel commit: 90d98a6ada1da0f8797ff3f5adafd175dd8c0a81 Replace these macros with typechecked helper functions. Eventually we're going to add more logic to the helpers and it'll be easier if we don't have to macro it up. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: use shifting and masking when converting rt extents, if possible	Darrick J. Wong	4	-0/+57
	Source kernel commit: ef5a83b7e597038d1c734ddb4bc00638082c2bf1 Avoid the costs of integer division (32-bit and 64-bit) if the realtime extent size is a power of two. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create rt extent rounding helpers for realtime extent blocks	Darrick J. Wong	1	-0/+18
	Source kernel commit: 5f57f7309d9ab9d24d50c5707472b1ed8af4eabc Create a pair of functions to round rtblock numbers up or down to the nearest rt extent. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
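The pair of rounding functions might look like the sketch below. The names and the explicit `rextsize` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round an rt block number down to the start of its rt extent. */
static uint64_t ex_rtb_rounddown_rtx(uint64_t rtbno, uint32_t rextsize)
{
	assert(rextsize > 0);
	return rtbno - (rtbno % rextsize);
}

/* Round an rt block number up to the start of the next rt extent,
 * leaving it unchanged if it is already aligned. */
static uint64_t ex_rtb_roundup_rtx(uint64_t rtbno, uint32_t rextsize)
{
	uint64_t rem;

	assert(rextsize > 0);
	rem = rtbno % rextsize;
	return rem ? rtbno + (rextsize - rem) : rtbno;
}
```

Both results stay in units of rt blocks; only the alignment changes.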
2024-02-15	xfs: convert do_div calls to xfs_rtb_to_rtx helper calls	Darrick J. Wong	2	-6/+16
	Source kernel commit: 055641248f649b52620a5fe8774bea253690e057 Convert these calls to use the helpers, and clean up all these places where the same variable can have different units depending on where it is in the function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create helpers to convert rt block numbers to rt extent numbers	Darrick J. Wong	3	-6/+37
	Source kernel commit: 5dc3a80d46a450481df7f7e9fe673ba3eb4514c3 Create helpers to do unit conversions of rt block numbers to rt extent numbers. There are three variations -- one to compute the rt extent number from an rt block number; one to compute the offset of an rt block within an rt extent; and one to extract both. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create a helper to convert extlen to rtextlen	Darrick J. Wong	2	-1/+10
	Source kernel commit: 2c2b981b737a519907429f62148bbd9e40e01132 Create a helper to compute the realtime extent (xfs_rtxlen_t) from an extent length (xfs_extlen_t) value. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create a helper to compute leftovers of realtime extents	Darrick J. Wong	2	-2/+11
	Source kernel commit: 68db60bf01c131c09bbe35adf43bd957a4c124bc Create a helper to compute the misalignment between a file extent (xfs_extlen_t) and a realtime extent. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: create a helper to convert rtextents to rtblocks	Darrick J. Wong	1	-0/+16
	Source kernel commit: fa5a387230861116c2434c20d29fc4b3fd077d24 Create a helper to convert a realtime extent to a realtime block. Later on we'll change the helper to use bit shifts when possible. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert rt extent numbers to xfs_rtxnum_t	Darrick J. Wong	3	-57/+57
	Source kernel commit: 2d5f216b77e33f9b503bd42998271da35d4b7055 Further disambiguate the xfs_rtblock_t uses by creating a new type, xfs_rtxnum_t, to store the position of an extent within the realtime section, in units of rtextents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: rename xfs_verify_rtext to xfs_verify_rtbext	Darrick J. Wong	3	-6/+6
	Source kernel commit: 3d2b6d034f0feb7741b313f978a2fe45e917e1be This helper function validates that a range of blocks in the realtime section is completely contained within the realtime section. It does /not/ validate ranges of rtextents. Rename the function to avoid suggesting that it does, and change the type of the @len parameter since xfs_rtblock_t is a position unit, not a length unit. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert rt bitmap extent lengths to xfs_rtbxlen_t	Darrick J. Wong	3	-2/+3
	Source kernel commit: f29c3e745dc253bf9d9d06ddc36af1a534ba1dd0 XFS uses xfs_rtblock_t for many different uses, which makes it much more difficult to perform a unit analysis on the codebase. One of these (ab)uses is when we need to store the length of a free space extent as stored in the realtime bitmap. Because there can be up to 2^64 realtime extents in a filesystem, we need a new type that is larger than xfs_rtxlen_t for callers that are querying the bitmap directly. This means scrub and growfs. Create this type as "xfs_rtbxlen_t" and use it to store 64-bit rtx lengths. 'b' stands for 'bitmap' or 'big'; reader's choice. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert rt bitmap/summary block numbers to xfs_fileoff_t	Darrick J. Wong	2	-17/+17
	Source kernel commit: 03f4de332e2e79db36ed2156fb2350480f142bec We should use xfs_fileoff_t to store the file block offset of any location within the realtime bitmap or summary files. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: convert xfs_extlen_t to xfs_rtxlen_t in the rt allocator	Darrick J. Wong	3	-12/+12
	Source kernel commit: a684c538bc14410565e8939393089670fa1e19dd In most of the filesystem, we use xfs_extlen_t to store the length of a file (or AG) space mapping in units of fs blocks. Unfortunately, the realtime allocator also uses it to store the length of a rt space mapping in units of rt extents. This is confusing, since one rt extent can consist of many fs blocks. Separate the two by introducing a new type (xfs_rtxlen_t) to store the length of a space mapping (in units of realtime extents) that would be found in a file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: move the xfs_rtbitmap.c declarations to xfs_rtbitmap.h	Darrick J. Wong	4	-54/+86
	Source kernel commit: 13928113fc5b5e79c91796290a99ed991ac0efe2 Move all the declarations for functionality in xfs_rtbitmap.c into a separate xfs_rtbitmap.h header file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: fix units conversion error in xfs_bmap_del_extent_delay	Darrick J. Wong	1	-1/+1
	Source kernel commit: ddd98076d5c075c8a6c49d9e6e8ee12844137f23 The unit conversions in this function do not make sense. First we convert a block count to bytes, then divide that bytes value by rextsize, which is in blocks, to get an rt extent count. You can't divide bytes by blocks to get a (possibly multiblock) extent value. Fortunately nobody uses delalloc on the rt volume so this hasn't mattered. Fixes: fa5c836ca8eb5 ("xfs: refactor xfs_bunmapi_cow") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
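The unit mismatch described in the entry above can be illustrated with a small sketch. The function names and the sample geometry (4096-byte blocks, 4 blocks per rt extent) are assumptions chosen for illustration and are not taken from the kernel source; the second function only shows the general idea of dividing blocks by blocks.

```python
def rtextents_buggy(block_count, block_size, rextsize_blocks):
    # Bytes divided by blocks-per-extent: the units do not cancel.
    return (block_count * block_size) // rextsize_blocks


def rtextents_consistent(block_count, rextsize_blocks):
    # Blocks divided by blocks-per-extent yields a count of rt extents.
    return block_count // rextsize_blocks


# 16 blocks with 4 blocks per rt extent should be 4 rt extents.
buggy = rtextents_buggy(16, 4096, 4)
consistent = rtextents_consistent(16, 4)
```

With these sample numbers the mixed-unit version overstates the extent count by a factor equal to the block size.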
2024-02-15	xfs: hoist freeing of rt data fork extent mappings	Darrick J. Wong	4	-16/+38
	Source kernel commit: 6c664484337b37fa0cf6e958f4019623e30d40f7 Currently, xfs_bmap_del_extent_real contains a bunch of code to convert the physical extent of a data fork mapping for a realtime file into rt extents and pass that to the rt extent freeing function. Since the details of this aren't needed when CONFIG_XFS_REALTIME=n, move it to xfs_rtbitmap.c to reduce code size when realtime isn't enabled. This will (one day) enable realtime EFIs to reuse the same unit-converting call with less code duplication. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-15	xfs: bump max fsgeom struct version	Darrick J. Wong	1	-1/+1
	Source kernel commit: 9488062805943c2d63350d3ef9e4dc093799789a The latest version of the fs geometry structure is v5. Bump this constant so that xfs_db and mkfs calls to libxfs_fs_geometry will fill out all the fields. IOWs, this commit is a no-op for the kernel, but will be useful for userspace reporting in later changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-02-05	xfsprogs: Release v6.6.0	Carlos Maiolino	4	-2/+17
	Update all the necessary files for a 6.6.0 release. Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'scruball-service-fixes-6.6_2024-01-11' of ↵	Carlos Maiolino	1	-32/+125
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfs_scrub_all: fixes for systemd services [v28.3 5/6] This patchset ties up some problems in the xfs_scrub_all program and service, which are essential for finding mounted filesystems to scrub and creating the background service instances that do the scrub. First, we need to fix various errors in pathname escaping, because systemd does /not/ like slashes in service names. Then, teach xfs_scrub_all to deal with systemd restarts causing it to think that a scrub has finished before the service actually finishes. Finally, implement a signal handler so that SIGINT (console ^C) and SIGTERM (systemd stopping the service) shut down the xfs_scrub@ services correctly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'scrub-service-fixes-6.6_2024-01-11' of ↵	Carlos Maiolino	7	-47/+57
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfs_scrub: fixes for systemd services [v28.3 4/6] This series fixes deficiencies in the systemd services that were created to manage background scans. First, improve the debian packaging so that services get installed at package install time. Next, fix copyright and spdx header omissions. Then, fix bugs in the mailer scripts so that scrub failures are reported effectively. Finally, fix xfs_scrub_all to deal with systemd restarts causing it to think that a scrub has finished before the service actually finishes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'scrub-repair-fixes-6.6_2024-01-11' of ↵	Carlos Maiolino	2	-1/+14
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfs_scrub: fixes to the repair code [v28.3 3/6] Now that we've landed the new kernel code, it's time to reorganize the xfs_scrub code that handles repairs. Clean up various naming warts and misleading error messages. Move the repair code to scrub/repair.c as the first step. Then, fix various issues in the repair code before we start reorganizing things. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'scrub-fix-legalese-6.6_2024-01-11' of ↵	Carlos Maiolino	45	-113/+143
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfs_scrub: fix licensing and copyright notices [v28.3 2/6] Fix various attribution problems in the xfs_scrub source code, such as the author's contact information, out of date SPDX tags, and a rough estimate of when the feature was under heavy development. The most egregious parts are the files that are missing license information completely. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'xfsprogs-fixes-6.6_2024-01-11' of ↵	Carlos Maiolino	12	-204/+295
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfsprogs: various bug fixes for 6.6 [1/6] This series fixes a couple of bugs that I found in the userspace support libraries. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-23	Merge tag 'xfsprogs-fixes-6.6_2023-12-21' of ↵	Carlos Maiolino	9	-87/+171
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev into for-next xfsprogs: various bug fixes for 6.6 [1/8] This series fixes a couple of bugs that I found in the userspace support libraries. This has been running on the djcloud for months with no problems. Enjoy! Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-01-11	xfs_scrub_all: fix termination signal handling	Darrick J. Wong	1	-12/+52
	Currently, xfs_scrub_all does not handle termination signals well. SIGTERM and SIGINT are left to their default handlers, which are immediate termination of the process group in the case of SIGTERM and raising KeyboardInterrupt in the case of SIGINT. Terminating the process group is fine when the xfs_scrub processes are direct children, but this completely doesn't work if we're farming the work out to systemd services since we don't terminate the child service. Instead, they keep going. Raising KeyboardInterrupt doesn't work because once the main thread calls sys.exit at the bottom of main(), it blocks in the python runtime waiting for child threads to terminate. There's no longer any context to handle an exception, so the signal is ignored and no child processes are killed. In other words, if you try to kill a running xfs_scrub_all, chances are good it won't kill the child xfs_scrub processes. This is undesirable and egregious since we actually have the ability to track and kill all the subprocesses that we create. Solve the subproblem of getting stuck in the python runtime by calling it repeatedly until we no longer have subprocesses. This means that the main thread loops until all threads have exited. Solve the subproblem of the signals doing the wrong thing by setting up our own signal handler that can wake up the main thread and initiate subprocess shutdown, no matter whether the subprocesses are systemd services or directly fork/exec'd. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_all.cron: move to package data directory	Darrick J. Wong	2	-2/+1
	cron jobs don't belong in /usr/lib. Since the cron job is also secondary to the systemd timer, it's really only provided as a courtesy for distributions that don't use systemd. Move it to @datadir@, aka /usr/share/xfsprogs. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_all: simplify cleanup of run_killable	Darrick J. Wong	1	-7/+6
	Get rid of the nested lambda functions to simplify the code. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_fail: move executable script to /usr/libexec	Darrick J. Wong	3	-5/+5
	Per FHS 3.0, non-PATH executable binaries are supposed to live under /usr/libexec, not /usr/lib. xfs_scrub_fail is an executable script, so move it to libexec in case some distro some day tries to mount /usr/lib as noexec or something. Link: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_all: survive systemd restarts when waiting for services	Darrick J. Wong	1	-13/+65
	If xfs_scrub_all detects a running systemd, it will use it to invoke xfs_scrub subprocesses in a sandboxed and resource-controlled environment. Unfortunately, if you happen to restart dbus or systemd while it's running, you get this: systemd[1]: Reexecuting. xfs_scrub_all[9958]: Warning! D-Bus connection terminated. xfs_scrub_all[9956]: Warning! D-Bus connection terminated. xfs_scrub_all[9956]: Failed to wait for response: Connection reset by peer xfs_scrub_all[9958]: Failed to wait for response: Connection reset by peer xfs_scrub_all[9930]: Scrubbing / done, (err=1) xfs_scrub_all[9930]: Scrubbing /storage done, (err=1) The xfs_scrub units themselves are still running, it's just that the `systemctl start' command that xfs_scrub_all uses to start and wait for the unit lost its connection to dbus and hence is no longer monitoring sub-services. When this happens, we don't have great options -- systemctl doesn't have a command to wait on an activating (aka running) unit. Emulate the functionality we normally get by polling the failed/active statuses. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
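The polling fallback described above might look roughly like the following sketch. It is not the code from xfs_scrub_all: `wait_for_unit` and `query_state` are hypothetical names, and `query_state` stands in for whatever returns the unit's current state string (for example, a wrapper around `systemctl is-active`).

```python
import time


def wait_for_unit(query_state, poll_interval=1.0, sleep=time.sleep):
    """Poll a unit's state until it stops running.

    Returns 1 if the unit ended in the "failed" state, otherwise 0.
    """
    while True:
        state = query_state()
        if state == "failed":
            return 1
        if state not in ("activating", "active"):
            # The unit is no longer running and did not fail.
            return 0
        sleep(poll_interval)


# Simulated state sequence so the sketch runs without systemd.
states = iter(["activating", "activating", "inactive"])
result = wait_for_unit(lambda: next(states), sleep=lambda _: None)
```

Passing the state query and the sleep function as parameters keeps the loop testable without a running systemd.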
2024-01-11	xfs_scrub_fail: advise recipients not to reply	Darrick J. Wong	1	-0/+1
	Advise recipients of the service failure emails that they should not try to reply to the automated service message. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_all: fix argument passing when invoking xfs_scrub manually	Darrick J. Wong	1	-1/+3
	Currently, xfs_scrub_all will try to invoke xfs_scrub with argv[1] being "-n -x". This of course is recognized by C getopt as a weird looking string, not two individual arguments, and causes the child process to exit with complaints about CLI usage. What we really want is to split the string into a proper array and then add them to the xfs_scrub command line. The code here isn't strictly correct, but as @scrub_args@ is controlled by us in the Makefile, it'll do for now. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
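The difference between passing the option string as one argument and splitting it first can be shown in a few lines. This is only an illustration: the `/storage` mountpoint and the variable names are assumptions, and the actual patch may build the command differently.

```python
scrub_args = "-n -x"      # assumed value of the substituted option string
mountpoint = "/storage"   # hypothetical mountpoint

# One argv entry containing a space: getopt sees the single string "-n -x".
wrong_cmd = ["xfs_scrub", scrub_args, mountpoint]

# Split on whitespace so that each option becomes its own argv entry.
right_cmd = ["xfs_scrub"] + scrub_args.split() + [mountpoint]
```

Plain whitespace splitting does not honor shell quoting; `shlex.split` from the Python standard library does, at the cost of slightly different handling of backslashes and quotes.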
2024-01-11	xfs_scrub_fail: add content type header to failure emails	Darrick J. Wong	1	-0/+2
	Add content type and encoding metadata so that these emails display correctly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_fail: return the failure status of the mailer program	Darrick J. Wong	1	-0/+1
	We should return the exit code of the mailer program sending the scrub failure reports, since that's much more important to anyone watching the system. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub: don't report media errors for space with unknowable owner	Darrick J. Wong	1	-1/+12
	On filesystems that don't have the reverse mapping feature enabled, the GETFSMAP call cannot tell us much about the owner of a space extent -- we're limited to static fs metadata, free space, or "unknown". In this case, nothing is corrupt, so str_corrupt is not an appropriate logging function. Relax this to str_info so that the user sees a notice that media errors have been found so that the user knows something bad happened even if the directory tree walker cannot find the file owning the space where the media error was found. Filesystems with rmap enabled are never supposed to return OWN_UNKNOWN from a GETFSMAP report, so continue to report that as a corruption. This fixes a regression in xfs/556. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub: update copyright years for scrub/ files	Darrick J. Wong	39	-39/+39
	Update the copyright years in the scrub/ source code files. This isn't required, but it's helpful to remind myself just how long it's taken to develop this feature. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_fail: fix sendmail detection	Darrick J. Wong	1	-1/+2
	This script emails the results of failed scrub runs to root. We shouldn't be hardcoding the path to the mailer program because distros can change the path according to their whim. Modify this script to use command -v to find the program. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub: flush stdout after printing to it	Darrick J. Wong	1	-0/+2
	Make sure we flush stdout after printf'ing to it, especially before we start any operation that could take a while to complete. Most of scrub already does this, but we missed a couple of spots. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub: add missing license and copyright information	Darrick J. Wong	6	-0/+30
	These files are missing the required SPDX license and copyright information. Add them. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-01-11	xfs_scrub: fix pathname escaping across all service definitions	Darrick J. Wong	5	-33/+36
	systemd services provide an "instance name" that can be associated with a particular invocation of a service. This allows service users to invoke multiple copies of a service, each with a unique string. For xfs_scrub, we pass the mountpoint of the filesystem as the instance name. However, systemd services aren't supposed to have slashes in them, so we're supposed to escape them. The canonical escaping scheme for pathnames is defined by the systemd-escape --path command. Unfortunately, we've been adding our own opinionated sauce for years, to work around the fact that --path didn't exist in systemd before January 2017. The special sauce is incorrect, and we no longer care about systemd of 7 years past. Clean up this mess by following the systemd escaping scheme throughout the service units. Now we can use the '%f' specifier in them, which makes things a lot less complicated. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
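For readers unfamiliar with the escaping scheme, here is a rough Python approximation of the path-escaping rules documented in systemd.unit(5). It is a sketch for illustration only: `escape_path` is a hypothetical helper, it does not cover every corner case, and running `systemd-escape --path` remains the authoritative way to get the escaped name.

```python
def escape_path(path):
    """Approximate the documented systemd path escaping rules."""
    # Leading, trailing and duplicate slashes are dropped first.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"  # the root directory is encoded as a single dash
    out = []
    for i, byte in enumerate("/".join(parts).encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            # Everything else, including "-" itself, becomes a \xXX escape.
            out.append("\\x%02x" % byte)
    return "".join(out)


root_name = escape_path("/")
storage_name = escape_path("/storage/")
dashed_name = escape_path("/mnt/my-disk")
```

Because a literal dash is itself escaped, the mapping stays unambiguous: a plain "-" in the result always stands for a slash.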
2024-01-11	xfs_scrub: fix author and spdx headers on scrub/ files	Darrick J. Wong	38	-74/+74
	Fix the spdx tags to match current practice, and update the author contact information. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	xfs_scrub_all: escape service names consistently	Darrick J. Wong	1	-15/+17
	This program is not consistent as to whether or not it escapes the pathname that is being used as the xfs_scrub service instance name. Fix it to be consistent, and to fall back to direct invocation if escaping doesn't work. The escaping itself is also broken, but we'll fix that in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	debian: install scrub services with dh_installsystemd	Darrick J. Wong	1	-0/+1
	Use dh_installsystemd to handle the installation and activation of the scrub systemd services. This requires bumping the compat version to 11. Note that the services are /not/ activated on installation. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-01-11	libxfs: fix krealloc to allow freeing data	Darrick J. Wong	1	-0/+10
	A recent refactoring to xfs_idata_realloc in the kernel made it depend on krealloc returning NULL if the new size is zero. The xfsprogs wrapper instead aborts, so we need to make it follow the kernel behavior. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2023-12-21	xfs_db: report the device associated with each io cursor	Darrick J. Wong	3	-4/+48
	When db is reporting on an io cursor, have it print out the device that the cursor is pointing to. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_scrub: try to use XFS_SCRUB_IFLAG_FORCE_REBUILD	Darrick J. Wong	5	-22/+56
	Now that we have a FORCE_REBUILD flag to the scrub ioctl, try to use that over the (much noisier) error injection knob, which may or may not even be enabled in the kernel config. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_copy: actually do directio writes to block devices	Darrick J. Wong	1	-6/+6
	Not sure why block device targets don't get O_DIRECT in !buffered mode, but it's misleading when the copy completes instantly only to stall forever due to fsync-on-close. Adjust the "write last sector" code to allocate a properly aligned buffer. In removing the onstack buffer for EOD writes, this also corrects the buffer being larger than necessary -- the old code declared an array of 32768 pointers, whereas all we really need is an aligned 32768-byte buffer. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_scrub: don't retry unsupported optimizations	Darrick J. Wong	1	-7/+9
	If the kernel says it doesn't support optimizing a data structure, we should mark it done and move on. This is much better than requeuing the repair, in which case it will likely keep failing. Eventually these requeued repairs end up in the single-threaded last resort at the end of phase 4, which makes things /very/ slow. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_copy: distinguish short writes to EOD from runtime errors	Darrick J. Wong	1	-1/+11
	Detect short writes to the end of the destination device and report them. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_scrub: handle spurious wakeups in scan_fs_tree	Darrick J. Wong	1	-1/+1
	Coverity reminded me that the pthread_cond_wait can wake up and return without the predicate variable (sft.nr_dirs > 0) actually changing. Therefore, one has to retest the condition after each wakeup. Coverity-id: 1554280 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
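The same rule applies to condition variables in other languages. The sketch below uses Python's `threading.Condition` as an analogue of `pthread_cond_wait`; it is not the xfs_scrub code, and `DirCounter` and its methods are hypothetical names. The point is the `while` loop: the predicate is rechecked after every wakeup instead of being tested once with `if`.

```python
import threading


class DirCounter:
    def __init__(self, nr_dirs):
        self.nr_dirs = nr_dirs
        self.cond = threading.Condition()

    def finish_one(self):
        # Called by a worker when it finishes one directory.
        with self.cond:
            self.nr_dirs -= 1
            self.cond.notify_all()

    def wait_until_done(self):
        with self.cond:
            # 'while', not 'if': a wakeup does not prove the predicate changed.
            while self.nr_dirs > 0:
                self.cond.wait()


counter = DirCounter(3)
workers = [threading.Thread(target=counter.finish_one) for _ in range(3)]
for w in workers:
    w.start()
counter.wait_until_done()
for w in workers:
    w.join()
remaining = counter.nr_dirs
```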
2023-12-21	xfs_io: support passing the FORCE_REBUILD flag to online repair	Darrick J. Wong	2	-2/+12
	Add CLI options to the scrubv and repair commands so that the user can pass FORCE_REBUILD to force the kernel to rebuild metadata. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_io: extract control number parsing routines	Darrick J. Wong	1	-43/+85
	Break out the parts of parse_args that extract control numbers from the CLI arguments, so that the function isn't as long. This isn't all that exciting now, but the scrub vectorization speedups will introduce a new ioctl. For the new command that comes with that, we'll want the control number parsing helpers. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_io: collapse trivial helpers	Darrick J. Wong	1	-84/+40
	Simplify the call chain by having parse_args set the scrub ioctl parameters in the caller's object. The parse_args callers can then invoke the ioctl directly, eliminating one function and one indirect call. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_io: set exitcode = 1 on parsing errors in scrub/repair command	Darrick J. Wong	1	-7/+17
	Set exitcode to 1 if there is an error parsing the CLI arguments to the scrub or repair commands, like we do most other places in xfs_io. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_mdrestore: refactor progress printing and sb fixup code	Darrick J. Wong	1	-50/+59
	Now that we've fixed the dissimilarities between the two progress printing callsites, refactor them into helpers. Do the same for the duplicate code that clears sb_inprogress from the primary superblock after the copy succeeds. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-21	xfs_mdrestore: fix missed progress reporting	Darrick J. Wong	1	-6/+19
	Currently, the progress reporting only triggers when the number of bytes read is exactly a multiple of a megabyte. This isn't always guaranteed, since AG headers can be 512 bytes in size. Fix the algorithm by recording the number of megabytes we've reported as being read, and emit a new report any time the bytes_read count, once converted to megabytes, doesn't match. Fix the v2 code to emit one final status message in case the last extent restored is more than a megabyte. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
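The reporting approach described above can be sketched as follows. This is an illustration rather than the mdrestore code: `progress_reports` and the sample chunk sizes are assumptions, and the old behaviour is modelled only as "report when the byte count is an exact multiple of a megabyte".

```python
MIB = 1 << 20


def progress_reports(chunk_sizes):
    """Yield the megabyte count whenever it differs from the last one reported."""
    bytes_read = 0
    mb_reported = 0
    for size in chunk_sizes:
        bytes_read += size
        mb_now = bytes_read // MIB
        if mb_now != mb_reported:
            mb_reported = mb_now
            yield mb_now


# A 512-byte header followed by 1 MiB chunks never lands on an exact multiple.
chunks = [512] + [MIB] * 3
reports = list(progress_reports(chunks))

# Model of the old check: report only on exact multiples of a megabyte.
totals = [sum(chunks[: i + 1]) for i in range(len(chunks))]
old_style = [t // MIB for t in totals if t % MIB == 0]
```

With this input the exact-multiple check never fires, while tracking the last reported megabyte produces one report per megabyte read.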
2023-12-21	xfs_mdrestore: EXTERNALLOG is a compat value, not incompat	Darrick J. Wong	1	-4/+3
	Fix this check to look at the correct header field. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-21	xfs_mdrestore: emit newlines for fatal errors	Darrick J. Wong	1	-2/+2
	Spit out a newline after a fatal error message. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-21	xfs_mdrestore: fix uninitialized variables in mdrestore main	Darrick J. Wong	1	-5/+4
	Coverity complained about the "is fd a file?" flags being uninitialized. Clean this up. Coverity-id: 1554270 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-21	xfs_metadump.8: update for external log device options	Darrick J. Wong	2	-3/+10
	Update the documentation to reflect that we can metadump external log device contents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-21	libxfs: don't UAF a requeued EFI	Darrick J. Wong	1	-0/+7
	In the kernel, commit 8ebbf262d4684 ("xfs: don't block in busy flushing when freeing extents") changed the allocator behavior such that AGFL fixing can return -EAGAIN in response to detection of a deadlock with the transaction busy extent list. If this happens, we're supposed to requeue the EFI so that we can roll the transaction and try the item again. If a requeue happens, we should not free the xefi pointer in xfs_extent_free_finish_item or else the retry will walk off a dangling pointer. There is no extent busy list in userspace so this should never happen, but let's fix the logic bomb anyway. We should have ported kernel commit 0853b5de42b47 ("xfs: allow extent free intents to be retried") to userspace, but neither Carlos nor I noticed this fine detail. :( Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-21	libfrog: move 64-bit division wrappers to libfrog	Darrick J. Wong	4	-76/+99
	We want to keep the rtgroup unit conversion functions as static inlines, so share the div64 functions via libfrog instead of libxfs_priv.h. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-18	libxfs: split out a libxfs_dev structure from struct libxfs_init	Christoph Hellwig	15	-292/+265
	Most of the content of libxfs_init is members duplicated for each of the data, log and RT devices. Split those members into a separate libxfs_dev structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: stash away the device fd in struct xfs_buftarg	Christoph Hellwig	5	-94/+34
	Cache the open file descriptor for each device in the buftarg structure and remove the now unused dev_map infrastructure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	xfs_repair: remove various libxfs_device_to_fd calls	Christoph Hellwig	1	-8/+5
	A few places in xfs_repair call libxfs_device_to_fd to get the data device fd from the data device dev_t stored in the libxfs_init structure. Just use the file descriptor stored right there directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: pass the device fd to discard_blocks	Christoph Hellwig	1	-10/+6
	No need to do a dev_t to fd lookup when the caller already has the fd. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: return the opened fd from libxfs_device_open	Christoph Hellwig	1	-7/+5
	So that the caller can stash it away without having to call xfs_device_to_fd. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: mark libxfs_device_{open,close} static	Christoph Hellwig	2	-4/+2
	libxfs_device_open and libxfs_device_close are only used in init.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: remove dead size < 0 checks in libxfs_init	Christoph Hellwig	1	-15/+0
	libxfs_init initializes the device size to 0 at the start of the function and libxfs_open_device never sets the size to a negative value. Remove these checks as they are dead code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libfrog: make platform_set_blocksize exit on fatal failure	Christoph Hellwig	3	-23/+23
	platform_set_blocksize has a fatal argument that is currently only used to change the printed message. Make it actually fatal similar to other libfrog platform helpers to simplify the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: remove the setblksize == 1 case in libxfs_device_open	Christoph Hellwig	1	-4/+1
	All callers of libxfs_init always pass an actual sector size or zero in the setblksize member. Remove the unreachable setblksize == 1 case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: make passing flags to libxfs_init less confusing	Christoph Hellwig	12	-42/+47
	The libxfs_xinit structure has four different ways to pass flags to libxfs_init: - the isreadonly argument, despite its name, contains various LIBXFS_ flags that go beyond just the readonly flag - the isdirect flag contains a single LIBXFS_ flag of the same name - the usebuflock is an integer used as bool - the bcache_flags member is used to pass flags directly to cache_init() for the buffer cache While there are good arguments for keeping the last one separate, all the others are rather confusing. Consolidate them into a single flags member using flags in the LIBXFS_* namespace. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: merge the file vs device cases in libxfs_init	Christoph Hellwig	1	-51/+23
	The only special handling for an XFS device on a regular file is that we skip the checks in check_open. Simply perform those conditionally instead of duplicating the entire sequence. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: pass a struct libxfs_init to libxfs_alloc_buftarg	Christoph Hellwig	7	-20/+20
	Pass a libxfs_init structure to libxfs_alloc_buftarg instead of three separate dev_t values. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: pass a struct libxfs_init to libxfs_mount	Christoph Hellwig	6	-12/+11
	Pass a libxfs_init structure to libxfs_mount instead of three separate dev_t values. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: rename struct libxfs_xinit to libxfs_init	Christoph Hellwig	14	-31/+34
	Make the struct name more usual, and remove the libxfs_init_t typedef. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxlog: remove the global libxfs_xinit x structure	Christoph Hellwig	6	-4/+6
	There is no need to export a libxfs_xinit with the somewhat unsuitable name x from libxlog. Move it into the tools linking against libxlog that actually need it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxlog: don't require a libxfs_xinit structure for xlog_init	Christoph Hellwig	9	-25/+19
	xlog_init currently requires a libxfs_xinit structure to be passed in, and then clobbers various log-related arguments to it. There is no good reason for that as all the required information can be calculated without it. Remove the x argument to xlog_init and xlog_is_dirty and the now unused logBBstart member in struct libxfs_xinit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxlog: add a helper to initialize a xlog without clobbering the x structure	Christoph Hellwig	4	-46/+28
	xfsprogs has three copies of a code sequence to initialize an xlog structure from a libxfs_init structure. Factor the code into a helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxlog: remove the verbose argument to xlog_is_dirty	Christoph Hellwig	5	-13/+6
	No caller passes a non-zero verbose argument to xlog_is_dirty. Remove the argument and the code keyed off it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	xfs_logprint: move all code to set up the fake xlog into logstat()	Christoph Hellwig	1	-10/+11
	Isolate the code that sets up the fake xlog into the logstat() helper to prepare for upcoming changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: remove the volname concept	Christoph Hellwig	10	-101/+33
	IRIX has the concept of a volume that has data/log/rt subvolumes (that's where the subvolume name in Linux comes from), but in the current Linux-only xfsprogs, pretending that we do anything with that concept is just utterly confusing. The volname is basically just a very obfuscated second way to pass the data device name, so get rid of it in the libxfs and progs internals. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs/frog: remove platform_find{raw,block}path	Christoph Hellwig	3	-42/+14
	Stop pretending we try to distinguish between the legacy Unix raw and block device nodes. Linux, as the only currently supported platform, never had them, and other modern Unix variants like FreeBSD also got rid of this distinction years ago. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: remove the dead {d,log,rt}path variables in libxfs_init	Christoph Hellwig	1	-10/+0
	These variables are only initialized, and then unlink is called if they were changed from the initial value, which can't happen. Remove the variables and the conditional unlink calls. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	libxfs: remove the unused icache_flags member from struct libxfs_xinit	Christoph Hellwig	1	-1/+0
	Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-12-18	xfs_io/encrypt: support specifying crypto data unit size	Eric Biggers	5	-19/+84
	Add an '-s' option to the 'set_encpolicy' command of xfs_io to allow exercising the log2_data_unit_size field that is being added to struct fscrypt_policy_v2 (kernel patch: https://lore.kernel.org/linux-fscrypt/20230925055451.59499-6-ebiggers@kernel.org). The xfs_io support is needed for xfstests (https://lore.kernel.org/fstests/20231013061403.138425-1-ebiggers@kernel.org), which currently relies on xfs_io to access the encryption ioctls. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Add support for passing log device as an argument	Chandan Babu R	2	-2/+17
	metadump v2 format allows dumping metadata from external log devices. This commit allows passing the device file to which log data must be restored from the corresponding metadump file. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Define mdrestore ops for v2 format	Chandan Babu R	1	-12/+228
	This commit adds functionality to restore metadump stored in v2 format. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Extract target device size verification into a function	Chandan Babu R	1	-17/+26
	A future commit will need to perform the device size verification on an external log device. In preparation for this, this commit extracts the relevant portions into a new function. No functional changes have been introduced. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Introduce mdrestore v1 operations	Chandan Babu R	1	-23/+28
	In order to indicate the version of metadump files that they can work with, this commit renames read_header(), show_info() and restore() functions to read_header_v1(), show_info_v1() and restore_v1() respectively. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Replace metadump header pointer argument with a union pointer	Chandan Babu R	1	-32/+34
	We will need two variants of the read_header(), show_info() and restore() helper functions to support two versions of the metadump format. To this end, a future commit will introduce a vector of function pointers to work with the two metadump formats. To have a common function signature for the function pointers, this commit replaces the first argument of the previously listed functions from "struct xfs_metablock *" with "union mdrestore_headers *". Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Add open_device(), read_header() and show_info() functions	Chandan Babu R	1	-57/+84
	This commit moves functionality associated with opening the target device, reading metadump header information and printing information about the metadump into their respective functions. There are no functional changes made by this commit. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Detect metadump v1 magic before reading the header	Chandan Babu R	1	-3/+15
	In order to support both v1 and v2 versions of metadump, mdrestore will have to detect the format in which the metadump file has been stored on the disk and then read the ondisk structures accordingly. In a step in that direction, this commit splits the work of reading the metadump header from disk into two parts, 1. Read the first 4 bytes containing the metadump magic code. 2. Read the remaining part of the header. A future commit will take appropriate action based on the value of the magic code. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
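The two-step header read described above might look roughly like this. It is a sketch under assumptions: `struct ex_header` and `ex_read_header()` are hypothetical, and only the idea of a 4-byte magic at the start of the file comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header; only the leading 4-byte magic follows the commit text. */
struct ex_header {
	uint32_t	magic;
	uint32_t	payload[3];
};

/* Read the magic first, then the rest of the header. Returns 0 or -1. */
static int ex_read_header(FILE *fp, struct ex_header *h)
{
	assert(fp && h);

	/* Step 1: read only the first 4 bytes containing the magic. */
	if (fread(&h->magic, sizeof(h->magic), 1, fp) != 1)
		return -1;

	/* A format-aware caller could inspect h->magic here and pick a layout. */

	/* Step 2: read the remaining part of the header. */
	if (fread((char *)h + sizeof(h->magic),
		  sizeof(*h) - sizeof(h->magic), 1, fp) != 1)
		return -1;
	return 0;
}
```

Splitting the read this way leaves a natural point between the two steps where a later change can branch on the magic value.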
2023-11-21	mdrestore: Define and use struct mdrestore	Chandan Babu R	1	-10/+17
	This commit collects all state tracking variables in a new "struct mdrestore" structure. This is done to collect all the global variables in one place rather than having them spread across the file. A new structure member of type "struct mdrestore_ops *" will be added by a future commit to support the two versions of metadump. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Declare boolean variables with bool type	Chandan Babu R	1	-6/+6
	Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	xfs_db: Add support to read from external log device	Chandan Babu R	5	-25/+131
	This commit introduces a new function set_log_cur() allowing xfs_db to read from an external log device. This is required by a future commit which will add the ability to dump metadata from external log devices. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Define metadump ops for v2 format	Chandan Babu R	1	-3/+71
	This commit adds functionality to dump metadata from an XFS filesystem in newly introduced v2 format. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Define metadump v2 ondisk format structures and macros	Chandan Babu R	1	-0/+68
	The corresponding metadump file's disk layout is as shown below,
	|------------------------------|
	| struct xfs_metadump_header   |
	|------------------------------|
	| struct xfs_meta_extent 0     |
	| Extent 0's data              |
	| struct xfs_meta_extent 1     |
	| Extent 1's data              |
	| ...                          |
	| struct xfs_meta_extent (n-1) |
	| Extent (n-1)'s data          |
	|------------------------------|
	The "struct xfs_metadump_header" is followed by an alternating series of "struct xfs_meta_extent" and the extent itself. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
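To illustrate how a reader could walk such an alternating layout, here is a minimal sketch over an in-memory buffer. The descriptor `struct ex_extent`, its fields, and the choice of a byte length are assumptions made for the example and do not reproduce the real struct xfs_meta_extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extent descriptor; the real on-disk fields are not shown here. */
struct ex_extent {
	uint64_t	addr;	/* where the extent belongs on the target */
	uint32_t	len;	/* length of the data that follows, in bytes */
};

/*
 * Walk "header, then (descriptor, data) pairs" and count the extents.
 * Returns the extent count, or -1 if the buffer is truncated.
 */
static long ex_count_extents(const unsigned char *buf, size_t size,
			     size_t hdr_size)
{
	size_t off = hdr_size;
	long n = 0;

	assert(buf);
	if (hdr_size > size)
		return -1;

	while (off < size) {
		struct ex_extent e;

		if (size - off < sizeof(e))
			return -1;
		memcpy(&e, buf + off, sizeof(e));	/* avoids unaligned access */
		off += sizeof(e);

		if (size - off < e.len)
			return -1;
		off += e.len;				/* skip the extent's data */
		n++;
	}
	return n;
}
```

Because each descriptor carries the length of the data that follows it, the file can be processed as a stream without seeking.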
2023-11-21	metadump: Rename XFS_MD_MAGIC to XFS_MD_MAGIC_V1	Chandan Babu R	3	-3/+3
	Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Introduce metadump v1 operations	Chandan Babu R	1	-62/+62
	This commit moves functionality associated with writing metadump to disk into a new function. It also renames metadump initialization, write and release functions to reflect the fact that they work with v1 metadump files. The metadump initialization, write and release functions are now invoked via metadump_ops->init(), metadump_ops->write() and metadump_ops->release() respectively. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Introduce struct metadump_ops	Chandan Babu R	1	-0/+25
	We will need two sets of functions to implement two versions of metadump. This commit adds the definition for 'struct metadump_ops' to hold pointers to version specific metadump functions. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
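A version-specific operations table of this kind can be sketched as below. The init, write and release roles are taken from the commit messages; the `ex_` names, the byte counter, and the header size are hypothetical and exist only so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table with one function pointer per version-specific step. */
struct ex_dump_ops {
	int	(*init)(void);
	int	(*write)(const void *buf, size_t len);
	void	(*release)(void);
};

#define EX_V2_HEADER_SIZE	32	/* made-up size for the illustration */

static size_t ex_bytes_written;		/* stands in for the output file */

static int ex_init_v1(void) { ex_bytes_written = 0; return 0; }

/* In this sketch the v2 init step accounts for a header written up front. */
static int ex_init_v2(void) { ex_bytes_written = EX_V2_HEADER_SIZE; return 0; }

static int ex_write_common(const void *buf, size_t len)
{
	assert(buf);
	ex_bytes_written += len;
	return 0;
}

static void ex_release_common(void) { }

static const struct ex_dump_ops ex_ops_v1 = {
	.init = ex_init_v1, .write = ex_write_common, .release = ex_release_common,
};
static const struct ex_dump_ops ex_ops_v2 = {
	.init = ex_init_v2, .write = ex_write_common, .release = ex_release_common,
};

/* Pick the ops table for a format version; NULL for unknown versions. */
static const struct ex_dump_ops *ex_select_ops(int version)
{
	switch (version) {
	case 1:	return &ex_ops_v1;
	case 2:	return &ex_ops_v2;
	default: return NULL;
	}
}
```

The calling code then only deals with `ops->init()`, `ops->write()` and `ops->release()`, and the format version is decided in one place.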
2023-11-21	metadump: Postpone invocation of init_metadump()	Chandan Babu R	1	-5/+6
	The metadump v2 initialization function (introduced in a later commit) writes the header structure into the metadump file. This will require the program to open the metadump file before the initialization function has been invoked. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Add initialization and release functions	Chandan Babu R	1	-36/+52
	Move metadump initialization and release functionality into corresponding functions. There are no functional changes made in this commit. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Define and use struct metadump	Chandan Babu R	1	-214/+244
	This commit collects all state tracking variables in a new "struct metadump" structure. This is done to collect all the global variables in one place rather than having them spread across the file. A new structure member of type "struct metadump_ops *" will be added by a future commit to support the two versions of metadump. Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	metadump: Declare boolean variables with bool type	Chandan Babu R	1	-16/+16
	Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	mdrestore: Fix logic used to check if target device is large enough	Chandan Babu R	1	-1/+1
	The device size verification code should be writing XFS_MAX_SECTORSIZE bytes to the end of the device rather than "sizeof(char *) * XFS_MAX_SECTORSIZE" bytes. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
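The size mix-up can be shown with a small sketch. `EX_MAX_SECTORSIZE` is an assumed stand-in for XFS_MAX_SECTORSIZE, and both helpers are hypothetical; they only compute the lengths and do no I/O.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_SECTORSIZE	32768	/* assumed stand-in for XFS_MAX_SECTORSIZE */

static_assert(EX_MAX_SECTORSIZE > 0, "sector size must be positive");

/* Buggy length: scales the sector size by the size of a pointer. */
static size_t ex_probe_len_buggy(void)
{
	return sizeof(char *) * EX_MAX_SECTORSIZE;
}

/* Intended length: exactly one maximum-sized sector. */
static size_t ex_probe_len_fixed(void)
{
	return EX_MAX_SECTORSIZE;
}
```

On a typical 64-bit system the buggy expression yields eight times the intended length, so a size probe near the end of the device covers far more bytes than one sector.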
2023-11-21	metadump: Use boolean values true/false instead of 1/0	Chandan Babu R	1	-5/+5
	Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	repair: fix the call to search_rt_dup_extent in scan_bmapbt	Christoph Hellwig	1	-2/+4
	search_rt_dup_extent expects an RT extent number and not a fsbno. Convert the units before the call. Without this we are unlikely to ever find a legit duplicate extent on the RT subvolume because the search will always be off the end. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
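The unit conversion can be illustrated with a short sketch. It assumes that a realtime extent covers a fixed number of filesystem blocks (`rextsize`), and `ex_rtb_to_rtx()` is a hypothetical helper, not the function used in xfs_repair.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a realtime block number to a realtime extent number, assuming
 * each RT extent spans rextsize filesystem blocks.
 */
static uint64_t ex_rtb_to_rtx(uint64_t rtbno, uint32_t rextsize)
{
	assert(rextsize > 0);
	return rtbno / rextsize;
}
```

With `rextsize` of 16, block 4096 belongs to extent 256. Passing 4096 itself to an extent-indexed lookup would search far beyond the intended extent, which matches the "off the end" failure described above.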
2023-11-21	xfs_quota: fix missing mount point warning	Pavel Reichl	1	-7/+11
	When a user has mounted an XFS volume and defined a project in the /etc/projects file that points to a directory on a different volume, then: `xfs_quota -xc "report -a" $path_to_mounted_volume' complains with: "xfs_quota: cannot find mount point for path `directory_from_projects': Invalid argument" unlike `xfs_quota -xc "report -a"', which works as expected and prints no warning. This happens because the first call passes the $path_to_mounted_volume argument to xfs_quota, which tells it not to look at all mounted volumes on the system but to use only those passed to the command and ignore all others (this behavior is intended as an optimization for systems with a huge number of mounted volumes). Later, while projects are initialized, the project directories on other volumes are obviously not in the searched subset of volumes, and the warning is printed. I propose to fix this behavior by printing the warning only if all mounted volumes are searched. Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	db: fix unsigned char related warnings	Christoph Hellwig	1	-22/+23
	Clean up the code in hash.c to use the normal char type for all high-level code, only casting to uint8_t when calling into low-level code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-21	Polish translation update for xfsprogs 6.5.0.	Jakub Bogusz	1	-7558/+7757
	Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-16	xfs: adjust the incore perag block_count when shrinking	Darrick J. Wong	1	-0/+6
	Source kernel commit: 6868b8505c807ad9397d78cc4e07cb1cb3582152 If we reduce the number of blocks in an AG, we must update the incore geometry values as well. Fixes: 0800169e3e2c9 ("xfs: Pre-calculate per-AG agbno geometry") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-16	Revert "xfs: switch to multigrain timestamps"	Christian Brauner	2	-12/+4
	Source kernel commit: f798accd5987dc2280e0ba9055edf1124af46a5f This reverts commit e44df2664746aed8b6dd5245eb711a0ce33c5cf5. Users reported regressions due to enabling multi-grained timestamps unconditionally. As no clear consensus on a solution has come up and the discussion has gone back to the drawing board, revert the infrastructure changes. If it isn't code that's here to stay, make it go away. Message-ID: <20230920-keine-eile-c9755b5825db@brauner> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-14	xfs: fix log recovery when unknown rocompat bits are set	Darrick J. Wong	1	-1/+2
	Source kernel commit: 74ad4693b6473950e971b3dc525b5ee7570e05d0 Log recovery has always run on read only mounts, even where the primary superblock advertises unknown rocompat bits. Due to a misunderstanding between Eric and Darrick back in 2018, we accidentally changed the superblock write verifier to shut down the fs over that exact scenario. As a result, the log cleaning that occurs at the end of the mounting process fails if there are unknown rocompat bits set. As we now allow writing of the superblock if there are unknown rocompat bits set on a RO mount, we no longer want to turn off RO state to allow log recovery to succeed on a RO mount. Hence we also remove all the (now unnecessary) RO state toggling from the log recovery path. Fixes: 9e037cb7972f ("xfs: check for unknown v5 feature bits in superblock write verifier") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-14	xfs: switch to multigrain timestamps	Jeff Layton	2	-5/+13
	Source kernel commit: e44df2664746aed8b6dd5245eb711a0ce33c5cf5 Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Also, anytime the mtime changes, the ctime must also change, and those are now the only two options for xfs_trans_ichgtime. Have that function unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is always set. Acked-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-11-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-14	xfs: allow userspace to rebuild metadata structures	Darrick J. Wong	1	-1/+5
	Source kernel commit: 5c83df2e54b6af870e3e02ccd2a8ecd54e36668c Add a new (superuser-only) flag to the online metadata repair ioctl to force it to rebuild structures, even if they're not broken. We will use this to move metadata structures out of the way during a free space defragmentation operation. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-11-14	xfs: convert to ctime accessor functions	Jeff Layton	3	-4/+25
	Source kernel commit: a0a415e34b57368acd262e1172720252c028b936 In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230705190309.579783-80-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-12	xfsprogs: Release v6.5.0 [v6.5.0]	Carlos Maiolino	4	-2/+22
	Update all the necessary files for a 6.5.0 release. Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libfrog: drop build host crc32 selftest	Krzesimir Nowak	2	-33/+2
	CRC selftests running on a build host were useful a long time ago, when CRC support was added to the on-disk support. Their purpose is now served by fstests. Also, mkfs.xfs and xfs_repair have their own selftests. On top of that, the selftest adds a dependency on liburcu on the build host for no reason - liburcu is not used by the crc32 selftest. Signed-off-by: Krzesimir Nowak <knowak@microsoft.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libxfs: fix atomic64_t detection on x86 32-bit architectures	Darrick J. Wong	1	-1/+8
	During compilation, xfsprogs tries to detect whether liburcu supports atomic 64-bit ops on the platform it is being compiled on, and if not, it falls back to using pthread mutex locks. The detection logic for that fallback relies on _uatomic_link_error(), which is a link-time trick used by liburcu that will cause compilation errors on archs that lack the required support. That only works for the generic liburcu code though, and it is not implemented for the x86-specific code. In practice this means that when xfsprogs is compiled on 32-bit x86 archs, it will successfully link to liburcu for atomic ops, but liburcu does not support atomic64_t on those archs. It indicates this during runtime by generating an illegal instruction that aborts execution, and thus causes various xfsprogs utils to segfault. Fix this by requiring that unsigned longs are at least 64 bits in size, which /usually/ means that 64-bit atomic counters are supported. We can't simply execute the liburcu atomic64_t detection code during configure instead of only relying on the linker error because that doesn't work for cross-compiled packages. Fixes: 7448af588a2e ("libxfs: fix atomic64_t poorly for 32-bit architectures") Reported-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	xfs_repair: set aformat and anextents correctly when clearing the attr fork	Darrick J. Wong	1	-2/+0
	Ever since commit b42db0860e130 ("xfs: enhance dinode verifier"), we've required that inodes with zero di_forkoff must also have di_aformat == EXTENTS and di_anextents == 0. clear_dinode_attr actually does this, but then both callers inexplicably set di_aformat = LOCAL. That in turn causes a verifier failure the next time the xattrs of that file are read by the kernel. Get rid of the bogus field write. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	xfs_scrub: actually return errno from check_xattr_ns_names	Darrick J. Wong	1	-0/+1
	Actually return the error code when extended attribute checks fail. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libxfs: use XFS_IGET_CREATE when creating new files	Darrick J. Wong	1	-1/+1
	Use this flag to check that newly allocated inodes are, in fact, unallocated. This matches the kernel, and prevents userspace programs from making latent corruptions worse by unintentionally crosslinking files. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libfrog: don't fail on XFS_FSOP_GEOM_FLAGS_NREXT64 in xfrog_bulkstat_single5	Darrick J. Wong	1	-1/+1
	This flag is perfectly acceptable for bulkstatting a single file; there's no reason not to allow it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libfrog: fix overly sleepy workqueues	Darrick J. Wong	2	-10/+25
	I discovered the following bad behavior in the workqueue code when I noticed that xfs_scrub was running single-threaded despite having 4 virtual CPUs allocated to the VM. I observed this sequence:
	Thread 1                        WQ1                       WQ2...N
	workqueue_create
	                                <start up>
	                                pthread_cond_wait
	                                                          <start up>
	                                                          pthread_cond_wait
	workqueue_add
	next_item == NULL
	pthread_cond_signal

	workqueue_add
	next_item != NULL
	<do not pthread_cond_signal>

	                                <receives wakeup>
	                                <run first item>

	workqueue_add
	next_item != NULL
	<do not pthread_cond_signal>

	                                <run second item>
	                                <run third item>
	                                pthread_cond_wait

	workqueue_terminate
	pthread_cond_broadcast
	                                                          <receives wakeup>
	                                                          <nothing to do, exits>
	                                <wakes up again>
	                                <nothing to do, exits>
	Notice how threads WQ2...N are completely idle while WQ1 ends up doing all the work! That wasn't the point of a worker pool! Observe that thread 1 manages to queue two work items before WQ1 pulls the first item off the queue. When thread 1 queues the third item, it sees that next_item is not NULL, so it doesn't wake a worker. If thread 1 queues all the N work that it has before WQ1 empties the queue, then none of the other threads get woken up. Fix this by maintaining a count of the number of active threads, and using that to wake either the sole idle thread, or all the threads if there are many that are idle. This dramatically improves startup behavior of the workqueue and eliminates the collapse case. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
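The wake-up policy described in the fix can be modelled as a pure decision function. This is a sketch of the policy as worded in the message, not the libfrog code: `ex_wake_policy()` and the `EX_WAKE_*` values are hypothetical, and a real implementation would call pthread_cond_signal() or pthread_cond_broadcast() under the queue lock.

```c
#include <assert.h>

/* What the enqueuing thread should do after adding a work item. */
enum ex_wake {
	EX_WAKE_NONE,	/* every worker is already busy */
	EX_WAKE_ONE,	/* exactly one idle worker: signal it */
	EX_WAKE_ALL,	/* several idle workers: broadcast */
};

/* Decide whom to wake from the count of active workers. */
static enum ex_wake ex_wake_policy(unsigned int active_threads,
				   unsigned int thread_count)
{
	unsigned int idle;

	assert(active_threads <= thread_count);
	idle = thread_count - active_threads;

	if (idle == 0)
		return EX_WAKE_NONE;
	if (idle == 1)
		return EX_WAKE_ONE;
	return EX_WAKE_ALL;
}
```

The decision depends on how many workers are idle rather than on whether the queue was empty, so items queued in a burst still reach the sleeping workers.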
2023-10-05	xfs_db: use directio for device access	Darrick J. Wong	1	-0/+1
	XFS and tools (mkfs, copy, repair) don't generally rely on the block device page cache, preferring instead to use directio. For whatever reason, the debugger was never made to do this, but let's do that now. This should eliminate the weird fstests failures resulting from udev/blkid pinning a cache page while the unmounting filesystem writes to the superblock such that xfs_db finds the stale pagecache instead of the post-unmount superblock. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	libxfs: make platform_set_blocksize optional with directio	Darrick J. Wong	1	-2/+6
	If we're accessing the block device with directio (and hence bypassing the page cache), then don't fail on BLKBSZSET not working. We don't care what happens to the pagecache bufferheads. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	mkfs: add a config file for 6.6 LTS kernels	Darrick J. Wong	2	-1/+16
	Enable 64-bit extent counts and reverse mapping for 6.6. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	mkfs: enable reverse mapping by default	Darrick J. Wong	2	-3/+3
	Now that online fsck is feature complete, there's actually a compelling story for having the reverse mappings enabled. Turn it on by default. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	mkfs: enable large extent counts by default	Darrick J. Wong	2	-4/+5
	Format filesystems with the large extent counter feature turned on. We shall now support 64-bit extent counts for the data fork and 32-bit extent counts for the attr fork. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	xfs_db: create unlinked inodes	Darrick J. Wong	3	-0/+208
	Create an expert-mode debugger command to create unlinked inodes. This will hopefully aid in simulation of leaked unlinked inode handling in the kernel and elsewhere. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-10-05	xfs_db: dump unlinked buckets	Darrick J. Wong	6	-1/+227
	Create a new command to dump the resource usage of files in the unlinked buckets. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: convert flex-array declarations in xfs attr shortform objects [libxfs-sync-6.5]	Darrick J. Wong	1	-1/+1
	Source kernel commit: f6250e205691a58c81be041b1809a2e706852641 As of 6.5-rc1, UBSAN trips over the ondisk extended attribute shortform definitions using an array length of 1 to pretend to be a flex array. Kernel compilers have to support unbounded array declarations, so let's correct this. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
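The difference between a one-element array used as a fake flex array and a real C99 flexible array member can be sketched as follows. `struct ex_sf_entry` is a hypothetical, simplified record and not the real shortform attribute layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length record using a flexible array member. */
struct ex_sf_entry {
	uint8_t	namelen;
	uint8_t	name[];		/* old style would declare name[1] here */
};

/* Allocate an entry with room for the name bytes (no NUL terminator). */
static struct ex_sf_entry *ex_sf_entry_new(const char *name)
{
	size_t len = strlen(name);
	struct ex_sf_entry *e;

	assert(len <= UINT8_MAX);
	e = malloc(sizeof(*e) + len);
	if (!e)
		return NULL;
	e->namelen = (uint8_t)len;
	memcpy(e->name, name, len);
	return e;
}
```

With `name[1]`, a bounds checker treats any index above 0 as out of range even though the allocation is larger. Declaring `name[]` states that the length is determined at run time.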
2023-09-07	xfs: convert flex-array declarations in xfs attr leaf blocks	Darrick J. Wong	2	-10/+67
	Source kernel commit: a49bbce58ea90b14d4cb1d00681023a8606955f2 As of 6.5-rc1, UBSAN trips over the ondisk extended attribute leaf block definitions using an array length of 1 to pretend to be a flex array. Kernel compilers have to support unbounded array declarations, so let's correct this.
	================================================================================
	UBSAN: array-index-out-of-bounds in fs/xfs/libxfs/xfs_attr_leaf.c:2535:24
	index 2 is out of range for type '__u8 [1]'
	Call Trace:
	 <TASK>
	 dump_stack_lvl+0x33/0x50
	 __ubsan_handle_out_of_bounds+0x9c/0xd0
	 xfs_attr3_leaf_getvalue+0x2ce/0x2e0 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09]
	 xfs_attr_leaf_get+0x148/0x1c0 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09]
	 xfs_attr_get_ilocked+0xae/0x110 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09]
	 xfs_attr_get+0xee/0x150 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09]
	 xfs_xattr_get+0x7d/0xc0 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09]
	 __vfs_getxattr+0xa3/0x100
	 vfs_getxattr+0x87/0x1d0
	 do_getxattr+0x17a/0x220
	 getxattr+0x89/0xf0
	Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: convert flex-array declarations in struct xfs_attrlist*	Darrick J. Wong	1	-2/+2
	Source kernel commit: 371baf5c9750a258fee21d0cb8c8d683bb057429 As of 6.5-rc1, UBSAN trips over the attrlist ioctl definitions using an array length of 1 to pretend to be a flex array. Kernel compilers have to support unbounded array declarations, so let's correct this. This may cause friction with userspace header declarations, but suck is life. ================================================================================ UBSAN: array-index-out-of-bounds in fs/xfs/xfs_ioctl.c:345:18 index 1 is out of range for type '__s32 [1]' Call Trace: <TASK> dump_stack_lvl+0x33/0x50 __ubsan_handle_out_of_bounds+0x9c/0xd0 xfs_ioc_attr_put_listent+0x413/0x420 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] xfs_attr_list_ilocked+0x170/0x850 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] xfs_attr_list+0xb7/0x120 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] xfs_ioc_attr_list+0x13b/0x2e0 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] xfs_attrlist_by_handle+0xab/0x120 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] xfs_file_ioctl+0x1ff/0x15e0 [xfs 4a986a89a77bb77402ab8a87a37da369ef6a3f09] vfs_ioctl+0x1f/0x60 The kernel and xfsprogs code that uses these structures will not have problems, but the long tail of external user programs might. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
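The three flex-array conversions above share one pattern. The following is a minimal sketch of it, using hypothetical `demo_list_old` and `demo_list` structures rather than the real XFS definitions: the old idiom declares a one-element array and indexes past it, while the C99 flexible array member leaves the bound open so a bounds checker has nothing to trip over.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Old idiom (hypothetical struct): a one-element array posing as variable length. */
struct demo_list_old {
	int32_t	count;
	int32_t	offset[1];	/* offset[1] and beyond are outside the declared bound */
};

/* C99 flexible array member: the array bound is left unspecified. */
struct demo_list {
	int32_t	count;
	int32_t	offset[];
};

/* Bytes needed for a demo_list that carries n offsets. */
static size_t demo_list_size(size_t n)
{
	return offsetof(struct demo_list, offset) + n * sizeof(int32_t);
}

/* Allocate a list with offsets 0, 8, 16, ...; the caller frees it. */
static struct demo_list *demo_list_alloc(int32_t n)
{
	struct demo_list *l = malloc(demo_list_size((size_t)n));

	if (!l)
		return NULL;
	l->count = n;
	for (int32_t i = 0; i < n; i++)
		l->offset[i] = i * 8;
	return l;
}
```

Note that `sizeof(struct demo_list_old)` counts the placeholder element while `sizeof(struct demo_list)` does not. That difference is one way external programs built against the old declaration could be affected.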
2023-09-07	xfs: AGI length should be bounds checked	Darrick J. Wong	3	-39/+60
	Source kernel commit: 2d7d1e7ea321b0b2810eb00183e21332ee9c4b6f Similar to the recent patch strengthening the AGF agf_length verification, the AGI verifier does not check that the AGI length field is within known good bounds. This isn't currently checked by runtime kernel code, yet we assume in many places that it is correct and verify other metadata against it. Add length verification to the AGI verifier. Just like the AGF length checking, the length of the AGI must be equal to the size of the AG specified in the superblock, unless it is the last AG in the filesystem. In that case, it must be less than or equal to sb->sb_agblocks and greater than XFS_MIN_AG_BLOCKS, which is the smallest AG a growfs operation will allow to exist. There's only one place in the filesystem that actually uses agi_length, but let's not leave it vulnerable to the same weird nonsense that generates syzbot bugs, eh? Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: fix xfs_btree_query_range callers to initialize btree rec fully	Darrick J. Wong	3	-20/+13
	Source kernel commit: 75dc0345312221971903b2e28279b7e24b7dbb1b Use struct initializers to ensure that the xfs_btree_irecs passed into the query_range function are completely initialized. No functional changes, just closing some sloppy hygiene. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
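As a rough illustration of the initializer change described above, here is a sketch with a hypothetical `demo_rec` structure (not the real btree record union). A designated initializer sets the named field and zero-fills every other member, so no field is left holding stack garbage.

```c
#include <stdint.h>

/* Hypothetical record type, standing in for a btree query key. */
struct demo_rec {
	uint32_t start;
	uint32_t count;
	uint32_t flags;
};

/* Build a low key: only .start is named, the rest is zero-initialized. */
static struct demo_rec demo_make_low_key(uint32_t start)
{
	struct demo_rec low = { .start = start };

	return low;
}
```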
2023-09-07	xfs: fix bounds check in xfs_defer_agfl_block()	Dave Chinner	1	-5/+6
	Source kernel commit: 2bed0d82c2f78b91a0a9a5a73da57ee883a0c070 Need to happen before we allocate and then leak the xefi. Found by coverity via an xfsprogs libxfs scan. [djwong: This also fixes the type of the @agbno argument.] Fixes: 7dfee17b13e5 ("xfs: validate block number being freed before adding to xefi") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: AGF length has never been bounds checked	Dave Chinner	1	-34/+56
	Source kernel commit: edd8276dd70279c29d412d99b99c2c0cac1b2cdd The AGF verifier does not check that the AGF length field is within known good bounds. This has never been checked by runtime kernel code (i.e. the lack of verification goes back to 1993) yet we assume in many places that it is correct and verify other metadata against it. Add length verification to the AGF verifier. The length of the AGF must be equal to the size of the AG specified in the superblock, unless it is the last AG in the filesystem. In that case, it must be less than or equal to sb->sb_agblocks and greater than XFS_MIN_AG_BLOCKS, which is the smallest AG a growfs operation will allow to exist. This requires a bit of rework of the verifier function. We want to verify metadata before we use it to verify other metadata. Hence we need to verify the AGF sequence numbers before using them to verify the length of the AGF. Then we can verify the AGF length before we verify AGFL fields. Then we can verify other fields that are bounds limited by the AGF length. And, finally, by calculating agf_length only once into a local variable, we can collapse repeated "if (xfs_has_foo() &&" conditional checks into single checks. This makes the code much easier to follow as all the checks for a given feature are obviously in the same place. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
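The length rule described above can be sketched as follows. This is only an illustration: `demo_sb` and `demo_ag_length_ok` are hypothetical names, `min_ag_blocks` stands in for XFS_MIN_AG_BLOCKS, and the sketch assumes the minimum itself counts as valid because it is described as the smallest AG allowed to exist.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock fields the check needs. */
struct demo_sb {
	uint32_t agblocks;	/* size of a full AG, in blocks */
	uint32_t agcount;	/* number of AGs in the filesystem */
	uint32_t min_ag_blocks;	/* stand-in for XFS_MIN_AG_BLOCKS */
};

/* Return true if 'length' is plausible for AG number 'agno'. */
static bool demo_ag_length_ok(const struct demo_sb *sb, uint32_t agno,
			      uint32_t length)
{
	/* Validate the sequence number before using it to judge the length. */
	if (agno >= sb->agcount)
		return false;

	/* Every AG except the last must be exactly one full AG long. */
	if (agno != sb->agcount - 1)
		return length == sb->agblocks;

	/* The last AG may be short, within bounds (boundary is an assumption). */
	return length >= sb->min_ag_blocks && length <= sb->agblocks;
}
```

Checking the AG number first mirrors the ordering argument in the message: metadata is verified before it is used to verify other metadata.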
2023-09-07	xfs: journal geometry is not properly bounds checked	Dave Chinner	1	-1/+55
	Source kernel commit: f1e1765aad7de7a8b8102044fc6a44684bc36180 If the journal geometry results in a sector or log stripe unit validation problem, it indicates that we cannot set the log up to safely write to the journal. In these cases, we must abort the mount because the corruption needs external intervention to resolve. Similarly, a journal that is too large cannot be written to safely, either, so we shouldn't allow those geometries to mount, either. If the log is too small, we risk having transaction reservations overrunning the available log space and the system hanging waiting for space it can never provide. This is purely a runtime hang issue, not a corruption issue as per the first cases listed above. We abort mounts if the log is too small for V5 filesystems, but we must allow v4 filesystems to mount because, historically, there was no log size validity checking and so some systems may still be out there with undersized logs. The problem is that on V4 filesystems, when we discover a log geometry problem, we skip all the remaining checks and then allow the log to continue mounting. This means that if one of the log size checks fails, we skip the log stripe unit check. i.e. we allow the mount because a "non-fatal" geometry is violated, and then fail to check the hard fail geometries that should fail the mount. Move all these fatal checks to the superblock verifier, and add a new check for the two log sector size geometry variables having the same values. This will prevent any attempt to mount a log that has invalid or inconsistent geometries long before we attempt to mount the log. However, for the minimum log size checks, we can only do that once we've set up the log and calculated all the iclog sizes and roundoffs. Hence this needs to remain in the log mount code after the log has been initialised. It is also the only case where we should allow a v4 filesystem to continue running, so leave that handling in place, too. 
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: don't block in busy flushing when freeing extents	Dave Chinner	3	-24/+57
	Source kernel commit: 8ebbf262d4684e035af5e7aa2a71cab636673a9b If the current transaction holds a busy extent and we are trying to allocate a new extent to fix up the free list, we can deadlock if the AG is entirely empty except for the busy extent held by the transaction. This can occur at runtime processing an XEFI with multiple extents in this path: __schedule+0x22f at ffffffff81f75e8f schedule+0x46 at ffffffff81f76366 xfs_extent_busy_flush+0x69 at ffffffff81477d99 xfs_alloc_ag_vextent_size+0x16a at ffffffff8141711a xfs_alloc_ag_vextent+0x19b at ffffffff81417edb xfs_alloc_fix_freelist+0x22f at ffffffff8141896f xfs_free_extent_fix_freelist+0x6a at ffffffff8141939a __xfs_free_extent+0x99 at ffffffff81419499 xfs_trans_free_extent+0x3e at ffffffff814a6fee xfs_extent_free_finish_item+0x24 at ffffffff814a70d4 xfs_defer_finish_noroll+0x1f7 at ffffffff81441407 xfs_defer_finish+0x11 at ffffffff814417e1 xfs_itruncate_extents_flags+0x13d at ffffffff8148b7dd xfs_inactive_truncate+0xb9 at ffffffff8148bb89 xfs_inactive+0x227 at ffffffff8148c4f7 xfs_fs_destroy_inode+0xb8 at ffffffff81496898 destroy_inode+0x3b at ffffffff8127d2ab do_unlinkat+0x1d1 at ffffffff81270df1 do_syscall_64+0x40 at ffffffff81f6b5f0 entry_SYSCALL_64_after_hwframe+0x44 at ffffffff8200007c This can also happen in log recovery when processing an EFI with multiple extents through this path: context_switch() kernel/sched/core.c:3881 __schedule() kernel/sched/core.c:5111 schedule() kernel/sched/core.c:5186 xfs_extent_busy_flush() fs/xfs/xfs_extent_busy.c:598 xfs_alloc_ag_vextent_size() fs/xfs/libxfs/xfs_alloc.c:1641 xfs_alloc_ag_vextent() fs/xfs/libxfs/xfs_alloc.c:828 xfs_alloc_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:2362 xfs_free_extent_fix_freelist() fs/xfs/libxfs/xfs_alloc.c:3029 __xfs_free_extent() fs/xfs/libxfs/xfs_alloc.c:3067 xfs_trans_free_extent() fs/xfs/xfs_extfree_item.c:370 xfs_efi_recover() fs/xfs/xfs_extfree_item.c:626 xlog_recover_process_efi() fs/xfs/xfs_log_recover.c:4605 
xlog_recover_process_intents() fs/xfs/xfs_log_recover.c:4893 xlog_recover_finish() fs/xfs/xfs_log_recover.c:5824 xfs_log_mount_finish() fs/xfs/xfs_log.c:764 xfs_mountfs() fs/xfs/xfs_mount.c:978 xfs_fs_fill_super() fs/xfs/xfs_super.c:1908 mount_bdev() fs/super.c:1417 xfs_fs_mount() fs/xfs/xfs_super.c:1985 legacy_get_tree() fs/fs_context.c:647 vfs_get_tree() fs/super.c:1547 do_new_mount() fs/namespace.c:2843 do_mount() fs/namespace.c:3163 ksys_mount() fs/namespace.c:3372 __do_sys_mount() fs/namespace.c:3386 __se_sys_mount() fs/namespace.c:3383 __x64_sys_mount() fs/namespace.c:3383 do_syscall_64() arch/x86/entry/common.c:296 entry_SYSCALL_64() arch/x86/entry/entry_64.S:180 To avoid this deadlock, we should not block in xfs_extent_busy_flush() if we hold a busy extent in the current transaction. Now that the EFI processing code can handle requeuing a partially completed EFI, we can detect this situation in xfs_extent_busy_flush() and return -EAGAIN rather than going to sleep forever. The -EAGAIN gets propagated back out to the xfs_trans_free_extent() context, where the EFD is populated and the transaction is rolled, thereby moving the busy extents into the CIL. At this point, we can retry the extent free operation again with a clean transaction. If we hit the same "all free extents are busy" situation when trying to fix up the free list, we can safely call xfs_extent_busy_flush() and wait for the busy extents to resolve and wake us. At this point, the allocation search can make progress again and we can fix up the free list. This deadlock was first reported by Chandan in mid-2021, but I couldn't make myself understood during review, and didn't have time to fix it myself. It was reported again in March 2023, and again I have found myself unable to explain the complexities of the solution needed during review. As such, I don't have hours more time to waste trying to get the fix written the way it needs to be written, so I'm just doing it myself. 
This patchset is largely based on Wengang Wang's last patch, but with all the unnecessary stuff removed, split up into multiple patches and cleaned up somewhat. Reported-by: Chandan Babu R <chandanrlinux@gmail.com> Reported-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
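The retry flow described above can be sketched in a few lines. Everything here is a hypothetical model (`demo_tp`, `demo_busy_flush`, `demo_roll`, `demo_free_extent`), not the kernel code: the flush refuses to sleep while the caller's own transaction still holds busy extents, and the caller rolls the transaction before trying again.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical transaction state: busy extents held, and rolls performed. */
struct demo_tp {
	int busy_held;
	int rolls;
};

/*
 * Return -EAGAIN instead of waiting when the caller is freeing extents and
 * still holds busy extents itself; waiting then could never be satisfied.
 */
static int demo_busy_flush(const struct demo_tp *tp, bool freeing)
{
	if (freeing && tp->busy_held > 0)
		return -EAGAIN;
	return 0;	/* a real implementation would sleep here until woken */
}

/* Rolling the transaction hands its busy extents off (modelled as a reset). */
static void demo_roll(struct demo_tp *tp)
{
	tp->busy_held = 0;
	tp->rolls++;
}

/* Free an extent, rolling and retrying once if the flush asks for it. */
static int demo_free_extent(struct demo_tp *tp)
{
	int error = demo_busy_flush(tp, true);

	if (error == -EAGAIN) {
		demo_roll(tp);
		error = demo_busy_flush(tp, true);
	}
	return error;
}
```

After the roll the transaction is clean, so the second attempt is free to wait for the remaining busy extents.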
2023-09-07	xfs: pass alloc flags through to xfs_extent_busy_flush()	Dave Chinner	3	-46/+54
	Source kernel commit: 6a2a9d776c4ae24a797e25eed2b9f7f33f756296 To avoid blocking in xfs_extent_busy_flush() when freeing extents and the only busy extents are held by the current transaction, we need to pass the XFS_ALLOC_FLAG_FREEING flag context all the way into xfs_extent_busy_flush(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: use deferred frees for btree block freeing	Dave Chinner	9	-24/+29
	Source kernel commit: b742d7b4f0e03df25c2a772adcded35044b625ca Btrees that aren't freespace management trees use the normal extent allocation and freeing routines for their blocks. Hence when a btree block is freed, a direct call to xfs_free_extent() is made and the extent is immediately freed. This puts the entire free space management btrees under this path, so we are stacking btrees on btrees in the call stack. The inobt, finobt and refcount btrees all do this. However, the bmap btree does not do this - it calls xfs_free_extent_later() to defer the extent free operation via an XEFI and hence it gets processed in deferred operation processing during the commit of the primary transaction (i.e. via intent chaining). We need to change xfs_free_extent() to behave in a non-blocking manner so that we can avoid deadlocks with busy extents near ENOSPC in transactions that free multiple extents. Inserting or removing a record from a btree can cause a multi-level tree merge operation and that will free multiple blocks from the btree in a single transaction. i.e. we can call xfs_free_extent() multiple times, and hence the btree manipulation transaction is vulnerable to this busy extent deadlock vector. To fix this, convert all the remaining callers of xfs_free_extent() to use xfs_free_extent_later() to queue XEFIs and hence defer processing of the extent frees to a context that can be safely restarted if a deadlock condition is detected. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: remove redundant initializations of pointers drop_leaf and save_leaf	Colin Ian King	1	-2/+0
	Source kernel commit: 347eb95b27eb97bebdc3ea7de23558216f4e2c90 Pointers drop_leaf and save_leaf are initialized with values that are never read, they are being re-assigned later on just before they are used. Remove the redundant early initializations and keep the later assignments at the point where they are used. Cleans up two clang scan build warnings: fs/xfs/libxfs/xfs_attr_leaf.c:2288:29: warning: Value stored to 'drop_leaf' during its initialization is never read [deadcode.DeadStores] fs/xfs/libxfs/xfs_attr_leaf.c:2289:29: warning: Value stored to 'save_leaf' during its initialization is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-09-07	xfs: fix ag count overflow during growfs	Long Li	1	-0/+2
	Source kernel commit: c3b880acadc95d6e019eae5d669e072afda24f1b I found a corruption during growfs: XFS (loop0): Internal error agbno >= mp->m_sb.sb_agblocks at line 3661 of file fs/xfs/libxfs/xfs_alloc.c. Caller __xfs_free_extent+0x28e/0x3c0 CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257 Call Trace: <TASK> dump_stack_lvl+0x50/0x70 xfs_corruption_error+0x134/0x150 __xfs_free_extent+0x2c1/0x3c0 xfs_ag_extend_space+0x291/0x3e0 xfs_growfs_data+0xd72/0xe90 xfs_file_ioctl+0x5f9/0x14a0 __x64_sys_ioctl+0x13e/0x1c0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd XFS (loop0): Corruption detected. Unmount and run xfs_repair XFS (loop0): Internal error xfs_trans_cancel at line 1097 of file fs/xfs/xfs_trans.c. Caller xfs_growfs_data+0x691/0xe90 CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257 Call Trace: <TASK> dump_stack_lvl+0x50/0x70 xfs_error_report+0x93/0xc0 xfs_trans_cancel+0x2c0/0x350 xfs_growfs_data+0x691/0xe90 xfs_file_ioctl+0x5f9/0x14a0 __x64_sys_ioctl+0x13e/0x1c0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f2d86706577 The bug can be reproduced with the following sequence: # truncate -s 1073741824 xfs_test.img # mkfs.xfs -f -b size=1024 -d agcount=4 xfs_test.img # truncate -s 2305843009213693952 xfs_test.img # mount -o loop xfs_test.img /mnt/test # xfs_growfs -D 1125899907891200 /mnt/test The root cause is that during growfs, user space passed a large value of newblocks to xfs_growfs_data_private(). Because the current sb_agblocks is too small, the new AG count exceeds UINT_MAX. Because the AG number type is unsigned int, it overflows, which makes nagcount much smaller than the actual value. When extending the AG space, the delta blocks in xfs_resizefs_init_new_ags() will be much larger than the actual value due to the incorrect nagcount, and can even exceed UINT_MAX. This causes corruption that is detected in __xfs_free_extent. 
Fix it by growing the filesystem up to the maximum allowed number of AGs, rather than returning EINVAL, when the new AG count overflows. Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
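The arithmetic behind this overflow can be reproduced in a small sketch. It assumes the 1 GiB image with 1024-byte blocks is split evenly into four AGs of 262144 blocks each, ignores the handling of a partial last AG, and uses a hypothetical cap `DEMO_MAX_AGCOUNT` that is not taken from the XFS headers. With those assumptions the requested size of 1125899907891200 blocks gives a quotient of 2^32 + 4, which truncates to 4 in a 32-bit counter.

```c
#include <stdint.h>

/* Assumed cap on the AG count for this sketch only. */
#define DEMO_MAX_AGCOUNT	UINT32_MAX

/* Naive version: the 64-bit quotient is silently truncated to 32 bits. */
static uint32_t demo_agcount_truncating(uint64_t nblocks, uint32_t agblocks)
{
	return (uint32_t)(nblocks / agblocks);
}

/*
 * Clamped version: divide in 64 bits, cap the AG count, and shrink the
 * requested size to match instead of failing the request.
 */
static uint32_t demo_agcount_clamped(uint64_t *nblocks, uint32_t agblocks)
{
	uint64_t count = *nblocks / agblocks;

	if (count > DEMO_MAX_AGCOUNT) {
		count = DEMO_MAX_AGCOUNT;
		*nblocks = count * agblocks;
	}
	return (uint32_t)count;
}
```

The clamped variant follows the approach stated in the message: grow as far as the AG count allows rather than returning an error.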
2023-09-07	overflow: Add struct_size_t() helper	Kees Cook	2	-1/+17
	Source kernel commit: d67790ddf0219aa0ad3e13b53ae0a7619b3425a2 While struct_size() is normally used in situations where the structure type already has a pointer instance, there are places where no variable is available. In the past, this has been worked around by using a typed NULL first argument, but this is a bit ugly. Add a helper to do this, and replace the handful of instances of the code pattern with it. Instances were found with this Coccinelle script: @struct_size_t@ identifier STRUCT, MEMBER; expression COUNT; @@ - struct_size((struct STRUCT *)\(0\|NULL\), + struct_size_t(struct STRUCT, MEMBER, COUNT) Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: James Smart <james.smart@broadcom.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: HighPoint Linux Team <linux@highpoint-tech.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: Don Brace <don.brace@microchip.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: Guo Xuenan <guoxuenan@huawei.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: kernel test robot <lkp@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-scsi@vger.kernel.org Cc: megaraidlinux.pdl@broadcom.com Cc: storagedev@microchip.com Cc: linux-xfs@vger.kernel.org Cc: linux-hardening@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230522211810.never.421-kees@kernel.org Signed-off-by: Carlos Maiolino <cem@kernel.org>
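To make the typed-NULL workaround and the type-based helper concrete, here is a simplified sketch. The macros `demo_struct_size` and `demo_struct_size_t` and the structure `demo_msg` are hypothetical stand-ins written for this illustration; in particular they omit the overflow protection that the kernel helpers provide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a trailing flexible array. */
struct demo_msg {
	uint16_t len;
	uint8_t	 data[];
};

/* Size of *p plus n trailing elements of 'member' (no overflow checks). */
#define demo_struct_size(p, member, n) \
	(sizeof(*(p)) + (n) * sizeof((p)->member[0]))

/* Type-based variant: hides the typed NULL so call sites need no instance. */
#define demo_struct_size_t(type, member, n) \
	demo_struct_size((type *)NULL, member, n)
```

Both forms are evaluated entirely inside `sizeof`, so the NULL pointer is never dereferenced. The second form simply moves the cast out of the call site.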
2023-08-25	xfsprogs: don't allow udisks to automount XFS filesystems with no prompt	Darrick J. Wong	5	-0/+69
	The unending stream of syzbot bug reports and overwrought filing of CVEs for corner case handling (i.e. things that distract from actual user complaints) in XFS has generated all sorts of overheated rhetoric about how every bug is a Serious Security Issue(tm) because anyone can craft a malicious filesystem on a USB stick, insert the stick into a victim machine, and mount will trigger a bug in the kernel driver that leads to some compromise or DoS or something. I thought that nobody would be foolish enough to automount an XFS filesystem. What a fool I was! It turns out that udisks can be told that it's okay to automount things, and then GNOME will do exactly that. Including mounting mangled XFS filesystems! <delete angry rant about poor decisionmaking and armchair fs developers blasting us on X while not actually doing any of the work> Turn off /this/ idiocy by adding a udev rule to tell udisks not to automount XFS filesystems. This will not stop a logged in user from unwittingly inserting a malicious storage device and pressing [mount] and getting breached. This is not a substitute for a thorough audit. This is not a substitute for lklfuse. This does not solve the general problem of in-kernel fs drivers being a huge attack surface. I just want a vacation from the sh*tstorm of bad ideas and threat models that I never agreed to support. v2: Add external logs to the list too, and document the var we set Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2023-08-02	xfs_repair: fix the problem of repair failure caused by dirty flag being abnormally set on buffer	Wu Guanghao	1	-1/+1
	We found an issue where repair failed in the fault injection. $ xfs_repair test.img ... Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 Metadata CRC error detected at 0x55a30e420c7d, xfs_bmbt block 0x51d68/0x1000 - agno = 3 Metadata CRC error detected at 0x55a30e420c7d, xfs_bmbt block 0x51d68/0x1000 btree block 0/41901 is suspect, error -74 bad magic # 0x58534c4d in inode 3306572 (data fork) bmbt block 41901 bad data fork in inode 3306572 cleared inode 3306572 ... Phase 7 - verify and correct link counts... Metadata corruption detected at 0x55a30e420b58, xfs_bmbt block 0x51d68/0x1000 libxfs_bwrite: write verifier failed on xfs_bmbt bno 0x51d68/0x8 xfs_repair: Releasing dirty buffer to free list! xfs_repair: Refusing to write a corrupt buffer to the data device! xfs_repair: Lost a write to the data device! fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair. $ xfs_db test.img xfs_db> inode 3306572 xfs_db> p core.magic = 0x494e core.mode = 0100666 // regular file core.version = 3 core.format = 3 (btree) ... u3.bmbt.keys[1] = [startoff] 1:[6] u3.bmbt.ptrs[1] = 41901 // btree root ... $ hexdump -C -n 4096 41901.img 00000000 58 53 4c 4d 00 00 00 00 00 00 01 e8 d6 f4 03 14 |XSLM............| 00000010 09 f3 a6 1b 0a 3c 45 5a 96 39 41 ac 09 2f 66 99 |.....<EZ.9A../f.| 00000020 00 00 00 00 00 05 1f fb 00 00 00 00 00 05 1d 68 |...............h| ... The block data associated with inode 3306572 is abnormal, but the CRC is checked first when the block is read. If the CRC check fails, badcrc is set, and the dirty flag is then set on bp because badcrc is set. In the final stage of repair, the dirty buffers are flushed in batches. When a buffer is flushed to disk, the data in bp is verified, and if that verification fails, the repair ends with an error. After scan_bmapbt returns an error, the inode will be cleaned up. The dirty flag therefore does not need to be set on bp, and leaving it clear avoids triggering the writeback verification failure. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-08-02	mkfs.xfs.8: correction on mkfs.xfs manpage since reflink and dax are compatible	Bill O'Donnell	1	-7/+0
	Merged early in 2023: Commit 480017957d638 xfs: remove restrictions for fsdax and reflink. There needs to be a corresponding change to the mkfs.xfs manpage to remove the incompatibility statement. Signed-off-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2023-07-19	xfsprogs: Release v6.4.0	Carlos Maiolino	4	-2/+25
	Update all the necessary files for a 6.4.0 release. Signed-off-by: Carlos Maiolino <cem@kernel.org>
2023-07-12	xfs_db: expose the unwritten flag in rmapbt keys	Darrick J. Wong	1	-0/+4
	Teach the debugger to expose the "unwritten" flag in rmapbt keys so that we can simulate an old filesystem writing out bad keys for testing. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>