author | Wu Guanghao <wuguanghao3@huawei.com> | 2023-07-26 09:43:16 +0800 |
---|---|---|
committer | Carlos Maiolino <cem@kernel.org> | 2023-08-02 10:59:18 +0200 |
commit | a86308c98d33e921eb133f47faedf1d9e62f2e77 (patch) | |
tree | 5c157bef31dd45d255e2241d957c0ad00f9ab304 | |
parent | 780e93c5103d3c19d53c36ab7f4794d14912f3a5 (diff) | |
download | xfsprogs-dev-a86308c98d33e921eb133f47faedf1d9e62f2e77.tar.gz |
xfs_repair: fix the problem of repair failure caused by dirty flag being abnormally set on buffer
We found an issue where xfs_repair failed during fault injection testing.
$ xfs_repair test.img
...
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
Metadata CRC error detected at 0x55a30e420c7d, xfs_bmbt block 0x51d68/0x1000
- agno = 3
Metadata CRC error detected at 0x55a30e420c7d, xfs_bmbt block 0x51d68/0x1000
btree block 0/41901 is suspect, error -74
bad magic # 0x58534c4d in inode 3306572 (data fork) bmbt block 41901
bad data fork in inode 3306572
cleared inode 3306572
...
Phase 7 - verify and correct link counts...
Metadata corruption detected at 0x55a30e420b58, xfs_bmbt block 0x51d68/0x1000
libxfs_bwrite: write verifier failed on xfs_bmbt bno 0x51d68/0x8
xfs_repair: Releasing dirty buffer to free list!
xfs_repair: Refusing to write a corrupt buffer to the data device!
xfs_repair: Lost a write to the data device!
fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair.
$ xfs_db test.img
xfs_db> inode 3306572
xfs_db> p
core.magic = 0x494e
core.mode = 0100666 // regular file
core.version = 3
core.format = 3 (btree)
...
u3.bmbt.keys[1] = [startoff]
1:[6]
u3.bmbt.ptrs[1] = 41901 // btree root
...
$ hexdump -C -n 4096 41901.img
00000000 58 53 4c 4d 00 00 00 00 00 00 01 e8 d6 f4 03 14 |XSLM............|
00000010 09 f3 a6 1b 0a 3c 45 5a 96 39 41 ac 09 2f 66 99 |.....<EZ.9A../f.|
00000020 00 00 00 00 00 05 1f fb 00 00 00 00 00 05 1d 68 |...............h|
...
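As a cross-check of the numbers above, the following sketch decodes the magic
and converts the filesystem block number into the disk address printed by the
verifier. It assumes 4096-byte filesystem blocks (suggested by the 0x1000
length), 512-byte basic blocks, and that the block lives in AG 0 (from "btree
block 0/41901"). fsb_to_daddr and magic_to_ascii are hypothetical helper names
used only for illustration, not libxfs functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry: 4096-byte fs blocks, 512-byte basic blocks. */
#define FS_BLOCK_SIZE		4096u
#define BASIC_BLOCK_SIZE	512u

/*
 * Convert a filesystem block number to a basic-block disk address.
 * Only valid for blocks in AG 0, where no AG offset has to be added.
 */
static uint64_t fsb_to_daddr(uint64_t fsbno)
{
	return fsbno * (FS_BLOCK_SIZE / BASIC_BLOCK_SIZE);
}

/* Unpack a 32-bit magic, most significant byte first, into a C string. */
static void magic_to_ascii(uint32_t magic, char out[5])
{
	assert(out != NULL);
	out[0] = (char)(magic >> 24);
	out[1] = (char)(magic >> 16);
	out[2] = (char)(magic >> 8);
	out[3] = (char)magic;
	out[4] = '\0';
}
```

Under those assumptions, magic_to_ascii(0x58534c4d, buf) yields "XSLM", matching
the first four bytes of the hexdump, and fsb_to_daddr(41901) yields 0x51d68, the
address reported in the verifier messages.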
The block referenced by inode 3306572 contains bad data, but the CRC is checked
first when the block is read. If the CRC check fails, badcrc is set, and a set
badcrc in turn causes the dirty flag to be set on bp. In the final stage of
repair, dirty buffers are flushed in batches, and each buffer is verified as it
is written to disk. If that verification fails, the repair fails.
When scan_bmapbt returns an error, the inode is cleared. In that case bp does
not need to be marked dirty, which avoids the writeback verification failure.
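The effect of the change on the flag test can be sketched as a small standalone
model. This is only an illustration: struct scan_state, mark_dirty_old and
mark_dirty_new are hypothetical names, and the model covers just the condition
in scan_lbtree, not the buffer handling around it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the state tested in scan_lbtree(); not a libxfs type. */
struct scan_state {
	int err;	/* nonzero if the per-block scan function failed */
	int dirty;	/* the scan function modified the block */
	int badcrc;	/* the read verifier reported a CRC mismatch */
	int no_modify;	/* repair must not change the filesystem */
};

/* Condition before the patch: a bad CRC alone dirties the buffer. */
static bool mark_dirty_old(const struct scan_state *s)
{
	assert(s->dirty == 0 || (s->dirty && !s->no_modify));
	return (s->dirty || s->badcrc) && !s->no_modify;
}

/* Condition after the patch: a failed scan never dirties the buffer. */
static bool mark_dirty_new(const struct scan_state *s)
{
	assert(s->dirty == 0 || (s->dirty && !s->no_modify));
	return !s->err && (s->dirty || s->badcrc) && !s->no_modify;
}
```

For the reported case (err != 0, badcrc set, no_modify clear), mark_dirty_old
returns true and mark_dirty_new returns false, so the corrupt block is no
longer queued for writeback. A block with a bad CRC whose scan succeeds is still
marked dirty as before.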
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-rw-r--r-- | repair/scan.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/repair/scan.c b/repair/scan.c
index 008ef65ac7..27a33286a2 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -185,7 +185,7 @@ scan_lbtree(
 	ASSERT(dirty == 0 || (dirty && !no_modify));
-	if ((dirty || badcrc) && !no_modify) {
+	if (!err && (dirty || badcrc) && !no_modify) {
 		libxfs_buf_mark_dirty(bp);
 		libxfs_buf_relse(bp);
 	}