Age | Commit message | Author | Files | Lines
2021-01-25e2fsck: Annotating fields in e2fsck_struct (pfsck)Saranya Muruganandam1-6/+14
Adding information on fields in e2fsck_struct on how they are used when running parallel fsck. Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: propagate number of threadsSaranya Muruganandam1-0/+3
Sometimes, such as in orphan_inode case, e2fsck_pass1 is called after reading the block bitmaps. This results in reading the block bitmap sequentially and multithreading only gets kicked in later. Fix the thread count earlier while setting up the file system. Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: misc cleanups for pfsckAndreas Dilger5-28/+31
Add -m option description to e2fsck.8 man page. Rename e2fsck_struct fs_num_threads to pfs_num_threads to avoid confusion with the ext2_filsys fs_num_threads field, and move thread_info to be together with the other HAVE_PTHREAD fields. Move ext2_filsys fs_num_threads to fit into the __u16 "pad" field to avoid consuming one of the few remaining __u32 reserved fields. Fix a few print format warnings. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Wang Shilong <wshilong@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: fix memory leaks with pfsck enabledWang Shilong2-0/+5
valgrind detected two memory leaks: 1) quota context is not released after merging. 2) @refcount_orig should be released Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25tests: add pfsck testWang Shilong4-0/+29
pfsck run on a clean fs should not return any errors. Generate an image with possible features enabled, especially EA shared blocks etc. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: reset @inodes_to_rebuild if restartWang Shilong1-0/+4
Running multi-threaded fsck on a corrupted image hits the following bug:
    pass1.c:2902: e2fsck_pass1_thread_prepare: Assertion `global_ctx->inodes_to_rebuild == NULL' failed.
    Signal (6) SIGABRT si_code=SI_TKILL
    ./e2fsck/e2fsck[0x43829e]
    /lib64/libpthread.so.0(+0x14b20)[0x7f3b45135b20]
    /lib64/libc.so.6(gsignal+0x145)[0x7f3b44f2c625]
    /lib64/libc.so.6(abort+0x12b)[0x7f3b44f158d9]
    /lib64/libc.so.6(+0x257a9)[0x7f3b44f157a9]
    /lib64/libc.so.6(+0x34a66)[0x7f3b44f24a66]
    ./e2fsck/e2fsck(e2fsck_pass1+0x1662)[0x423572]
    ./e2fsck/e2fsck(e2fsck_run+0x5a)[0x41611a]
    ./e2fsck/e2fsck(main+0x1608)[0x4121b8]
    /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f3b44f171a3]
    ./e2fsck/e2fsck(_start+0x2e)[0x413dde]
@inodes_to_rebuild can be non-NULL after pass1 is restarted. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: update mmp block in one threadWang Shilong2-2/+33
With multiple threads, different threads will try to update the MMP block at the same time; only allow one thread to update it. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: make default smallest RA size to 1MWang Shilong1-0/+4
If the number of inodes per group is small, the default readahead size can be very small (e.g. 128KiB), which hurts performance. After raising the minimum from 128K to 1M, I see pass1 time drop from 677.12 seconds to 246 seconds with 32 threads. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: avoid too much memory allocation for pfsckWang Shilong2-0/+6
e2fsck sizes its initial memory allocations according to the inode and directory counts recorded in the superblock. With pfsck this must take the number of threads into account; otherwise, OOM can happen. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: cleanup e2fsck_pass1_thread_join()Wang Shilong1-26/+8
Use e2fsck_reset_context() to free memory and simplify the code. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: wait fix thread finish before checkingWang Shilong4-12/+117
Before proceeding to the next inodes, wait for any in-progress fix to finish. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: set E2F_FLAG_ALLOC_OK after threadsWang Shilong1-2/+8
Only flag ALLOC OK after all threads finished without problem. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: simplify e2fsck context merging codesWang Shilong1-120/+31
We tried to copy thread context to global context directly and then copy back some saved variables before merging. Since we have finished almost all necessary variables in the e2fsck context, we could simplify codes, and this could help us understand what is missing rather than hide problems. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge extent depth count after threads finishWang Shilong1-1/+10
tests covered by f_extent_htree. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: reset lost_and_found after threads finishWang Shilong1-0/+5
This should not be kept; the reason is similar to what e2fsck_pass1 has done before. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge options after threads finishWang Shilong1-5/+4
Threads might set E2F_OPT_YES, so we need to merge options back into the global context. Test f_yesall covers this. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: fix readahead for pfsck of pass1Wang Shilong1-9/+20
Several improvements in this patch: 1) move readahead_kb detection to the preparing phase. 2) inode readahead should be aware of the thread's block group boundary. 3) make readahead_kb aware of multiple threads. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: adjust number of threadsWang Shilong2-18/+46
The number of threads should not exceed the number of flex_bg groups; output a message if we adjust the thread count. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
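The cap described in this commit can be sketched as below. This is a minimal illustration, not the code from the patch: the function names and the way the flex group count is derived (block group count divided by groups per flex group, rounded up) are assumptions.

```c
/* Hypothetical helper: number of flex groups, rounding up.
 * log_groups_per_flex must be smaller than the bit width of unsigned int. */
static unsigned int flex_group_count(unsigned int group_count,
                                     unsigned int log_groups_per_flex)
{
    unsigned int per_flex = 1u << log_groups_per_flex;

    return (group_count + per_flex - 1) / per_flex;
}

/* Hypothetical helper: cap the requested thread count at the number of
 * flex groups, and never return fewer than one thread. */
static unsigned int clamp_threads(unsigned int requested,
                                  unsigned int group_count,
                                  unsigned int log_groups_per_flex)
{
    unsigned int max_threads = flex_group_count(group_count,
                                                log_groups_per_flex);

    if (max_threads == 0)
        max_threads = 1;
    if (requested == 0)
        requested = 1;
    return requested > max_threads ? max_threads : requested;
}
```

With 64 block groups and 16 groups per flex group, a request for 32 threads would be reduced to 4 under these assumptions.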
2021-01-25e2fsck: allow admin specify number of threadsWang Shilong9-68/+128
-m option is introduced to specify number of threads for pfsck. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: kickoff mutex lock for block found mapWang Shilong6-81/+172
Now @block_found_map is no longer shared by multiple threads, and @block_dup_map needs to be checked again after threads finish. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge EA blocks properlyWang Shilong2-33/+213
EA blocks might be shared, merge them carefully. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: split and merge invalid bitmapsWang Shilong1-0/+71
Invalid bitmaps are split per thread, and we should merge them after the threads finish. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: move some fixes out of parallel pthreadsWang Shilong6-136/+216
We can only use @found_map_block to find free blocks after we have collected all used blocks, so handle_fs_bad_blocks(), ext2fs_create_resize_inode(), and e2fsck_pass1_dupblocks() really should be handled after all threads have finished. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: serialize fix operationsWang Shilong3-7/+100
Allowing different threads to fix problems at the same time could be dangerous and error-prone for now, and most of the time it is the parallel scanning and checking that matters. So this patch adds a mutex to serialize fix operations during pass1. A side benefit is that block allocation and freeing and superblock updates no longer need their own protection, since only fix operations during pass1 can touch them. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge quota context after threads finishWang Shilong3-1/+56
Every thread calculates its own quota accounting; merge them after the threads finish. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge context flags properlyWang Shilong1-3/+1
e2fsck might restart after pass1, so we should keep flags if possible. This patch tries to fix the f_illitable_flexbg failure. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge dirs_to_hash when threads finishWang Shilong1-0/+30
The @dirs_to_hash list needs to be merged after threads finish; covered by test t_dangerous. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge dx_dir_info after threads finishWang Shilong3-0/+88
Merge properly. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge fs flags when threads finishLi Xi1-0/+5
merge fs flags properly. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge counts after threads finishWang Shilong1-0/+34
Merge counts properly. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: add debug codes for multiple threadsLi Xi2-0/+45
These debug codes are added to run the multiple pass1 check thread one by one in order. If all the codes are correct, fsck of multiple threads should have exactly the same outcome with single thread. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge dblist after thread finishesLi Xi3-0/+56
Merge dblist properly. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge icounts after thread finishesLi Xi3-0/+151
Merge inode_count and inode_link_info properly after threads finish. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge badblocks after thread finishesWang Shilong4-13/+94
Badblocks should be merged properly after threads finish. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: rbtree bitmap for dirWang Shilong1-0/+1
Only rbtree bitmaps support the merge operation now, so use them for bitmaps. Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge dir_info after thread finishesLi Xi3-3/+123
dir_info needs to be merged after the threads finish. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: optimize the inserting of dir_info_dbLi Xi1-60/+112
Binary search is now used when inserting a dir info into the array, and memmove is now used when shifting the array. Both improve insertion performance. This patch is also a preparation for merging two dir db arrays. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
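The combination of binary search and memmove mentioned in this commit can be illustrated roughly as follows. This is a sketch under assumptions, not the dir_info code itself: `struct dir_rec`, `lower_bound()` and `sorted_insert()` are hypothetical names, and the array is assumed to be kept sorted by inode number with room for one more entry.

```c
#include <string.h>

/* Hypothetical record standing in for one dir_info entry. */
struct dir_rec {
    unsigned int ino;     /* sort key */
    unsigned int parent;
};

/* Return the first index in arr[0..count) whose ino is >= key. */
static size_t lower_bound(const struct dir_rec *arr, size_t count,
                          unsigned int key)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (arr[mid].ino < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Insert rec into the sorted array and return the new element count.
 * The caller must guarantee space for count + 1 entries. */
static size_t sorted_insert(struct dir_rec *arr, size_t count,
                            struct dir_rec rec)
{
    size_t pos = lower_bound(arr, count, rec.ino);

    /* Shift the tail up by one slot in a single call. */
    memmove(&arr[pos + 1], &arr[pos], (count - pos) * sizeof(arr[0]));
    arr[pos] = rec;
    return count + 1;
}
```

The search costs O(log n) comparisons; the shift is still O(n) in the worst case, but a single memmove is typically much cheaper than moving entries one at a time in a loop.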
2021-01-25e2fsck: do not change global variablesLi Xi1-28/+47
Global variables used in pass1 check are changed to local variables in this patch. This will avoid conflict between threads. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: merge bitmaps after thread completesWang Shilong6-91/+264
A new method merge_bmap has been added to bitmap operations. But only red-black bitmap has that operation now. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: print thread log properlyLi Xi7-7/+98
When multi-thread fsck is enabled, logs printed from multiple threads could overlap with each other. The overlap sometimes makes the logs unreadable because log_out() is used multiple times for a single line. This patch adds a leading [Thread XXX] to each log line if multi-thread is enabled by the -m option. This patch also adds a message to show the group ranges and inode numbers for each thread, which is useful for debugging multi-thread check. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: split groups to different threadsLi Xi1-6/+17
The start/end groups of a thread are calculated according to the thread number. But still, only one thread is used for the check. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
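One plausible way to derive a per-thread group range from the thread number is an even split with the remainder spread over the first threads. The sketch below assumes that policy; the patch may divide groups differently, and `struct group_range` and `thread_groups()` are hypothetical names.

```c
/* Hypothetical range type: the thread handles groups [start, end). */
struct group_range {
    unsigned int start;
    unsigned int end;
};

/* Split total_groups as evenly as possible across num_threads.
 * num_threads must be non-zero and thread_index < num_threads. */
static struct group_range thread_groups(unsigned int total_groups,
                                        unsigned int num_threads,
                                        unsigned int thread_index)
{
    unsigned int base = total_groups / num_threads;
    unsigned int extra = total_groups % num_threads;
    struct group_range r;

    /* The first `extra` threads each take one additional group. */
    r.start = thread_index * base +
              (thread_index < extra ? thread_index : extra);
    r.end = r.start + base + (thread_index < extra ? 1 : 0);
    return r;
}
```

For 10 groups and 3 threads this yields the ranges [0, 4), [4, 7) and [7, 10), so every group is covered exactly once.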
2021-01-25e2fsck: add start/end group for threadLi Xi6-10/+78
When multi-threads are used for check, each thread needs to jump to different group in pass1 check. This patch adds the group jumping support. But still, only one thread is used to check. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: configure one pfsck threadLi Xi5-18/+168
This patch creates only one thread to do pass1 check if pthreads are enabled. The same codes can be used to create multiple threads, but other functions need to be modified to get ready for that. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: create logs for mult-threadsLi Xi7-8/+88
When multi-threads are used, different logs should be created for different threads. Each thread has log files with suffix of ".$THREAD_INDEX". And this patch adds f_multithread_logfile test case. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: open io-channel when copying fsLi Xi8-23/+127
This patch also adds a writethrough flag to the thread io-channel. When multiple threads write to the same disk, we don't want the data to be held in the memory cache. This will be useful in the future, but the tests pass even without that flag. This patch also cleans up the io channel cache of the global context. Otherwise, after pass1, the next steps would use old data saved in the cache, and the cached data might already have been overwritten in pass1. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: copy badblocks when copying fsWang Shilong1-0/+13
This patch copies badblocks when copying the fs. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: copy bitmaps when copying contextLi Xi1-12/+139
This patch copies the bitmaps when copying the context. In multi-thread fsck, each thread uses a different bitmap that is copied from the global bitmap. Bitmaps from multiple threads will be merged into a global one after pass1 finishes. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: add assert when copying contextLi Xi1-0/+14
Adding the assert would simplify the copying of context. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: clear icache when using multi-thread fsckWang Shilong1-3/+34
icache of fs will be rebuilt when needed, so after copying fs, icache can be inited to NULL. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: copy fs when using multi-thread fsckLi Xi1-4/+37
This patch only copies the fs to a new one when -m is enabled. It doesn't actually start any thread. When the pass1 test finishes, the new fs is copied back to the original context. This patch handles the fs fields in dblist, inode_map and block_map properly. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: copy context when using multi-thread fsckLi Xi2-8/+107
This patch only copies the context to a new one when -m is enabled. It doesn't actually start any thread. When the pass1 test finishes, the new context is copied back to the original context. Since the signal handler only changes the original context, add global_ctx in "struct e2fsck_struct" and use that to check whether there is any cancel signal. This patch handles the long jump properly so that all the existing tests pass even when the context has been copied. Otherwise, test f_expisize_ea_del would fail when aborting. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-25e2fsck: add -m option for multithreadLi Xi27-6/+217
-m option is added but no actual functionality is added. This patch only adds the logic that when -m is specified, one of -p/-y/-n options should be specified. And when -m is specified, -C shouldn't be specified and the completion progress report won't be triggered by sending SIGUSR1/SIGUSR2 signals. This simplifies the implementation of multi-thread fsck in the future. Completion progress support with multi-thread fsck will be added back after multi-thread fsck implementation is finished. Right now, disable it to simplify the implementation of multi-thread fsck. Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-23Merge branch 'maint' into nextTheodore Ts'o13-35/+54
2021-01-23Include PTHREAD_CFLAGS in LDFLAGS* macrosTheodore Ts'o1-3/+3
PTHREAD_CFLAGS is set by AX_PTHREADS, and these flags need to be included when linking executables. Fixes: bdcd5f22203f ("Add configure and build support for the pthreads library") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-23Fix clang warningsTheodore Ts'o4-8/+9
Clang gets unhappy when passing an unsigned char to string functions. For better or for worse we use __u8[] in the definition of the superblock. So cast these to "char *" to prevent clang build-time warnings. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-23libext2fs: use compiler built-in offsetof() if availableTheodore Ts'o3-12/+26
This avoids UBSAN sanitizer warnings, since &(0->member) is technically undefined per the C standard. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
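A minimal sketch of the idea follows; it is not the macro from the patch. `MY_OFFSETOF` and `struct demo_sb` are hypothetical names, and the sketch falls back to the standard `offsetof()` from `<stddef.h>` when the builtin cannot be detected, rather than to a hand-written null-pointer expression.

```c
#include <stddef.h>

/* Prefer the compiler builtin when the compiler reports it. */
#if defined(__has_builtin)
# if __has_builtin(__builtin_offsetof)
#  define MY_OFFSETOF(type, member) __builtin_offsetof(type, member)
# endif
#endif

/* Otherwise use the standard macro, which is also well defined. */
#ifndef MY_OFFSETOF
# define MY_OFFSETOF(type, member) offsetof(type, member)
#endif

/* Hypothetical structure used only to demonstrate the macro. */
struct demo_sb {
    unsigned char magic[2];
    unsigned int  blocks;
};
```

Either branch avoids dereferencing a null pointer in source code, which is what the sanitizer objects to in the `&(0->member)` form.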
2021-01-23Only build resize2fs.static when running "make all-static"Theodore Ts'o1-1/+1
Fixes: 93df80d2409d ("Teach makefiles... the target all-static") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: fix UBSAN warning in ext2fs_mmp_new_seq()Theodore Ts'o1-1/+3
Left shifting the pid by 16 bits can cause a UBSAN warning if the pid is greater than or equal to 2**16. It doesn't matter since we're just using the pid to seed for a pseudo-random number generator, but silence the warning by just swapping the high and low 16 bits of the pid instead. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
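Swapping the two 16-bit halves can be sketched as below. The helper name is hypothetical and this is not the code from ext2fs_mmp_new_seq(); the point is only that doing the shifts on an unsigned 32-bit value keeps every operation well defined, whereas left-shifting a signed value past what the type can represent is undefined behavior.

```c
#include <stdint.h>

/* Hypothetical helper: exchange the high and low 16 bits of v.
 * All shifts operate on an unsigned type, so no bits reach a sign bit. */
static uint32_t swap_halves16(uint32_t v)
{
    return (v >> 16) | (v << 16);
}
```

A caller would convert the pid to `uint32_t` first, for example `swap_halves16((uint32_t) getpid())`, before mixing it into the seed.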
2021-01-21mke2fs.8: Improve valid block size documentationJan Kara1-4/+6
Explain which valid block sizes mke2fs supports in more detail. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21build: Add SYSLIBS to e4crypt linkingHauke Mehrtens1-1/+1
The $(SYSLIBS) was missing when linking the e4crypt application. This is available in the e4crypt.profiled variant, so I assume this was just missing in the normal variant and is not left out intentionally. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21tune2fs: abort clearing the dir_index when the fs needs to be fsck'ed firstTheodore Ts'o1-2/+3
We were not checking the return value of check_fsck_needed() when checking to clear the dir_index feature. As a result, tune2fs would print that the file system needed to be checked first, but then go ahead and clear the dir_index flag. Addresses-Coverity-Bug: 1467671 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21e2fsck: remove dead code when recreating the journalTheodore Ts'o1-7/+0
params.num_journal_blocks is an unsigned value so it can never be less than zero. Addresses-Coverity-Bug: 1472250 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21debugfs: fix the printf specifier when dumping a fast commit blockTheodore Ts'o1-1/+1
Addresses-Coverity-Bug: 1472249 Addresses-Coverity-Bug: 1472253 Addresses-Coverity-Bug: 1472254 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: fix minor Coverity nits in ext2fs_rw_bitmaps()Theodore Ts'o1-13/+11
Addresses-Coverity-Bug: 1472252 Addresses-Coverity-Bug: 1472253 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21tune2fs: fix resource leak in handle_quota_options()Theodore Ts'o1-2/+4
Addresses-Coverity-Bug: 1467672 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21debugfs: fix double free in realloc() error path in read_list()Theodore Ts'o1-4/+2
Fixes-Coverity-Bug: 1464575 Fixes-Coverity-Bug: 1464571 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: fix incorrect negative error return in unix and sparse io managersTheodore Ts'o2-3/+3
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: fix incorrect negative error return in ext2fs_rw_bitmaps()Theodore Ts'o1-1/+1
Fixes: e2e58d312804 ("ext2fs: parallel bitmap loading") Fixes-Coverity-Bug: 147255 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21debugfs: add fast commit support to logdumpHarshad Shirwadkar2-5/+320
Add fast commit support for debugfs logdump. This commit also adds fast_commit.h that contains the necessary helpers needed for fast commit replay. Note that this file is also byte by byte identical with kernel's fast_commit.h. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21ext4: fix tests to account for new dumpe2fs outputHarshad Shirwadkar22-54/+74
dumpe2fs tool now is capable of reporting number of fast commit blocks. There were slight changes in the output of dumpe2fs outside of fast commits. This patch fixes the regression tests to expect the new output. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21Make userspace tools number of fast commits blocks awareHarshad Shirwadkar8-39/+142
This patch makes number of fast commit blocks configurable. Also, the number of fast commit blocks can now be seen in dumpe2fs output.
    $ ./misc/mke2fs -O fast_commit -t ext4 image
    mke2fs 1.46-WIP (20-Mar-2020)
    Discarding device blocks: done
    Creating filesystem with 5120 1k blocks and 1280 inodes
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (1040 blocks): done
    Writing superblocks and filesystem accounting information: done
    $ ./misc/dumpe2fs image
    dumpe2fs 1.46-WIP (20-Mar-2020)
    ...
    Journal features: (none)
    Total journal size: 1040k
    Total journal blocks: 1040
    Max transaction length: 1024
    Fast commit length: 16
    Journal sequence: 0x00000001
    Journal start: 0
    $ ./misc/mke2fs -O fast_commit -t ext4 image -J fast_commit_size=256,size=1
    mke2fs 1.46-WIP (20-Mar-2020)
    Creating filesystem with 5120 1k blocks and 1280 inodes
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (1280 blocks): done
    Writing superblocks and filesystem accounting information: done
    $ ./misc/dumpe2fs image
    dumpe2fs 1.46-WIP (20-Mar-2020)
    ...
    Journal features: (none)
    Total journal size: 1280k
    Total journal blocks: 1280
    Max transaction length: 1024
    Fast commit length: 256
    Journal sequence: 0x00000001
    Journal start: 0
This patch also adds information about fast commit feature in mke2fs and tune2fs man pages. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: provide APIs to configure fast commit blocksHarshad Shirwadkar6-24/+133
This patch adds new libext2fs APIs that allow configuring the number of fast commit blocks in the journal superblock. We also add a struct ext2fs_journal_params which contains the number of fast commit blocks and the number of normal journal blocks. With this patch, the preferred way of configuring the number of blocks with and without fast commits is:
    struct ext2fs_journal_params params;
    ext2fs_get_journal_params(&params, ...);
    params.num_journal_blocks = ...;
    params.num_fc_blocks = ...;
    ext2fs_create_journal_superblock2(..., &params, ...);
    OR
    ext2fs_add_journal_inode3(..., &params, ...);
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21e2fsck: port fc changes from kernel's recovery.c to e2fsckHarshad Shirwadkar5-69/+194
This patch makes recovery.c identical with fast commit kernel changes. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21e2fsck: add kernel endian-ness conversion macrosHarshad Shirwadkar2-32/+16
In order to make recovery.c identical with kernel, we need endianness conversion macros (such as cpu_to_be32 and friends) defined in e2fsprogs. This patch defines these macros and also fixes recovery.c to use these. These macros are also needed for fast commit recovery patches later in this series. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21ext2fs: move calculate_summary_stats to ext2fs libHarshad Shirwadkar4-61/+99
The function calculate_summary_stats sets the global metadata of the file system. Tune2fs had this function defined statically in tune2fs.c. Fast commit replay needs this function to set global metadata at the end of the replay phase. So, move this function to libext2fs. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21Enable threaded support for e2fsprogs' applications.Theodore Ts'o9-11/+16
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21ext2fs: parallel bitmap loadingWang Shilong2-52/+288
In our benchmarking of a PiB-size filesystem, pass5 takes 10446s to finish and 99.5% of that time is spent reading bitmaps. It makes sense to read bitmaps using multiple threads; a quick benchmark shows a drop from 10446s to 626s with 64 threads. [ This has all of the many bug fixes for rw_bitmaps.c from the original Lustre patch set collapsed into a single commit. In addition it has the new ext2fs_rw_bitmaps() api proposed by Ted. ] Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: allow the unix_io manager's cache to be disabled and re-enabledTheodore Ts'o2-0/+20
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21libext2fs: add threading support to the I/O manager abstractionTheodore Ts'o6-17/+133
Add initial implementation support for the unix_io manager. Applications which want to use threading should pass in IO_FLAG_THREADS when opening the channel. Channels which support threading (which as of this commit is unix_io and test_io if the backing io_manager supports threading) will set the CHANNEL_FLAGS_THREADS bit in io->flags. Library code or applications can test if threading is enabled by checking this flag. Applications using libext2fs can pass in EXT2_FLAG_THREADS to ext2fs_open() or ext2fs_open2() to request threading support. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-21Add configure and build support for the pthreads libraryTheodore Ts'o5-1077/+3157
Support for pthreads can be forcibly disabled by passing "--without-pthread" to the configure script. The actual changes in this commit are in configure.ac and MCONFIG.in; the other files were generated as a result of running aclocal, autoconf, and autoheader on a Debian testing system. Note: the autoconf-archive package must now be installed before rerunning aclocal, to supply the AX_PTHREAD macro. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-19Merge branch 'maint' into nextTheodore Ts'o29-186/+53
2021-01-19mke2fs: Escape double quotes when parsing mke2fs.confLukas Czerner1-0/+1
Currently, when constructing the <default> configuration pseudo-file using the profile-to-c.awk script we will just pass the double quotes as they appear in the mke2fs.conf.

This is problematic, because the resulting default_profile.c will either fail to compile because of syntax error, or leave the resulting configuration invalid.

It can be reproduced by adding the following line somewhere into mke2fs.conf configuration and forcing mke2fs to use the <default> configuration by specifying nonexistent mke2fs.conf

  MKE2FS_CONFIG="nonexistent" ./misc/mke2fs -T ext4 /dev/device

  default_mntopts = "acl,user_xattr"
  ^ this will fail to compile

  default_mntopts = ""
  ^ this will result in invalid config file

  Syntax error in mke2fs config file (<default>, line #4)
  Unknown code prof 17

Fix it by escaping the double quotes with a backslash in profile-to-c.awk script.

Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
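The escaping step described in the commit above lives in an awk script; as a rough illustration of the same idea in C, the sketch below copies a configuration line while putting a backslash in front of every double quote so the result can sit inside a C string literal. The function name `demo_escape_quotes` and its return convention are hypothetical and do not come from e2fsprogs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of e2fsprogs): copy `in` to `out`,
 * inserting a backslash before each double quote.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int demo_escape_quotes(const char *in, char *out, size_t out_len)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (out_len == 0)
        return -1;
    for (; *in; in++) {
        size_t need = (*in == '"') ? 2 : 1;

        if (o + need >= out_len)    /* keep room for the trailing NUL */
            return -1;
        if (*in == '"')
            out[o++] = '\\';
        out[o++] = *in;
    }
    out[o] = '\0';
    return 0;
}
```

With this sketch, the problematic line `default_mntopts = ""` would come out as `default_mntopts = \"\"`, which is safe to embed in a generated C string.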
2021-01-19libext2fs: add gnu.translator supportRomain Naour1-0/+1
Support for setting (and reading) passive translators from GNU/Linux has been added to the Linux kernel by the commit [1]. The name index '10' has been reserved for GNU/Hurd. Hurd passive translators are stored as an xattr value with the name "gnu.translator" [2].

If the "gnu.translator" xattr value has been set before calling mkfs.ext2, it will segfault since "gnu." is not present in ea_names[].

  $ setfattr -n gnu.translator -v "/hurd/exec\0" ${TARGET_DIR}/servers/exec
  $ mkfs.ext2 -d ${TARGET_DIR} -o hurd -O ext_attr rootfs.ext2 "1G"

Adding "gnu." to ea_names[] allows creating an ext2 filesystem for GNU/Hurd with a passive translator already set.

[1] https://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?id=a04c7bf83172faa7cb080fbe3b6c04a8415ca645
[2] https://lists.gnu.org/archive/html/bug-hurd/2016-08/msg00075.html

Signed-off-by: Romain Naour <romain.naour@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-01-19filefrag: handle invalid st_dev and blksize casesLuis Henriques1-2/+2
It is possible to crash filefrag with a "Floating point exception" in two different scenarios:

1. When fstat() returns a device ID set to 0
2. When FIGETBSZ ioctl returns a blocksize of 0

In both scenarios a divide-by-zero will occur in frag_report() because variable blksize will be set to zero. I've managed to trigger this crash with an old CephFS kernel client, using xfstest generic/519. The first scenario has been fixed by kernel commit 75c9627efb72 ("ceph: map snapid to anonymous bdev ID"). The second scenario is also fixed with commit 8f97d1e99149 ("vfs: fix FIGETBSZ ioctl on an overlayfs file"). However, it is desirable to handle these two scenarios gracefully by checking these conditions explicitly.

Signed-off-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
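The explicit check described in the filefrag commit above can be pictured with a small sketch. This is not the actual filefrag code: the helper names and the 1024-byte fallback are assumptions chosen only to show how a zero block size can be caught before it reaches a division.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fallback; the value filefrag actually uses may differ. */
#define DEMO_FALLBACK_BLKSIZE 1024u

/* Return a block size that is always safe to divide by. */
static unsigned int demo_safe_blksize(unsigned int reported)
{
    return reported ? reported : DEMO_FALLBACK_BLKSIZE;
}

/* Round a byte count up to whole blocks without risking SIGFPE. */
static uint64_t demo_bytes_to_blocks(uint64_t bytes, unsigned int reported)
{
    unsigned int blksize = demo_safe_blksize(reported);

    assert(blksize != 0);
    return (bytes + blksize - 1) / blksize;
}
```

Calling `demo_bytes_to_blocks(4096, 0)` uses the fallback instead of dividing by zero, which mirrors the "handle gracefully" goal stated in the commit message.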
2020-12-15create_inode: set xattrs to the root directory as wellAntoine Tenart1-0/+8
populate_fs does copy the xattrs for all files and directories, but the root directory is skipped and as a result its extended attributes aren't set. This is an issue when using mkfs to build a full system image that can be used with SELinux in enforcing mode without making any runtime fix at first boot. This patch adds logic to set the root directory's extended attributes. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-25po: reapply local e2fsprogs changes to po/Makefile.in.inTheodore Ts'o1-6/+16
These are the changes which are needed after running gettextize to update to gettext 0.19.8 in the previous commit.

* Add support for maintainer mode (which doesn't do as much given that gettext now has settings in Makevars which allows us to suppress automatic updates of the po and gmo files)
* Add support to expand the '@' abbreviations in e2fsck/problem.c and give an explanation of how they work for translators
* Add support for configure --enable-verbose-makecmds and default to "kernel-style" quieter make output --- this makes it easier to see warnings and errors by suppressing the distracting details.
* Teach the makefile where to find the generated error table C files in the build directory.
* Add make targets (e.g., all-static, fullcheck, coverage.txt) which are required by the top-level Makefile.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-25Update gettext files to version 0.19.8Theodore Ts'o60-20026/+2178
This also removes the built-in "intl" directory as this is now considered deprecated by the gettext package. This means that we won't try to use an internal version of gettext if it's not installed on the build system. We will simply disable NLS support in that case. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-24po: update sr.po (from translationproject.org)Мирослав Николић1-1699/+1429
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-24po: update nl.po (from translationproject.org)Benno Schulenberg1-64/+74
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-24po: update ms.po (from translationproject.org)Sharuzzaman Ahmat Raslan1-253/+227
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-27release notes: delete two files that are fully contained within v1.41.txtBenno Schulenberg2-176/+0
They are pure duplicates. Signed-off-by: Benno Schulenberg <bensberg@telfort.nl> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-06debugfs: fix parse_uint for 64-bit fieldsTheodore Ts'o1-5/+5
The logic for handling 64-bit structure elements was reversed, which caused attempts to set fields like kbytes_written to fail:

  % debugfs -w /tmp/foo.img
  debugfs 1.45.6 (20-Mar-2020)
  debugfs: set_super_value kbytes_written 1024
  64-bit field kbytes_written has a second 64-bit field defined; BUG?!?

https://github.com/tytso/e2fsprogs/issues/36 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-04Define MKDIR_P in the Makefile.in files instead in MCONFIG.inTheodore Ts'o19-1/+18
In the case where mkdir -p is not thread-safe (for example, if the build environment is using busybox's mkdir) the configure script will fall back to the slow (but safe) install-sh script. In that case MKDIR_P will be using a relative pathname; so we can't use speed optimization of defining configure substitutions in MCONFIG.in, since the substitution will be different depending on depth of the subdirectory in the Makefile.in file. https://github.com/tytso/e2fsprogs/issues/51 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-02resize2fs: prevent block group descriptors from overflowing the first bgTheodore Ts'o1-0/+14
For 1k block file systems, resizing a file system to more than 1073610752 blocks will result in the block group descriptors being so large that they overlap with the backup superblock in block group #1. This problem can be reproduced via:

  mke2fs -t ext4 /tmp/foo.img 200M
  resize2fs /tmp/foo.img 1T
  e2fsck -f /tmp/foo.img

https://github.com/tytso/e2fsprogs/issues/50 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01AOSP: Fix a trivial type errorYi Kong1-2/+3
Comparing unsigned int with ULONG_MAX is always false. Signed-off-by: Yi Kong <yikong@google.com> Change-Id: Iae02aad1bcb271d3468828977be288ad04333821 From AOSP commit: 757a4d672dae1a15c57f5f0705ba90ed007da7e6
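To illustrate why such a comparison is dead code where `unsigned long` is wider than `unsigned int`, here is a minimal sketch of range-checked parsing. It is not the AOSP patch itself: `demo_parse_u32_arg` is a hypothetical helper. It keeps the `strtoul()` result in an `unsigned long`, checks for overflow there, and only then narrows it.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned int argument.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int demo_parse_u32_arg(const char *s, unsigned int *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoul(s, &end, 0);
    if (end == s || *end != '\0')
        return -1;                  /* no digits, or trailing junk */
    if (v == ULONG_MAX && errno == ERANGE)
        return -1;                  /* overflow reported by strtoul() */
    if (v > UINT_MAX)
        return -1;                  /* does not fit in unsigned int */
    *out = (unsigned int)v;
    return 0;
}
```

The overflow test is done on the `unsigned long` value; had the result been stored in an `unsigned int` first, a comparison against `ULONG_MAX` could never be true on an LP64 system.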
2020-10-01AOSP: Include private/fs_config.h directly when neededTom Cherry1-0/+1
This is no longer a transitive include of android_filesystem_config.h Bug: 149785767 Test: build Change-Id: I954adf7879fa811eb2b290a0983c84d47ecae26c From AOSP commit: 9fa6bed983d5ddd7226af647a0c4c0d922227be4
2020-10-01Merge branch 'maint' into nextTheodore Ts'o32-150/+285
2020-10-01chattr/lsattr: Support dax attributeXiao Yang4-4/+16
Use the letter 'x' to set/get dax attribute on a directory/file. Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01mke2fs: fix up check for hardlinks always false if inode > 0xFFFFFFFFHongxu Jia1-1/+1
When a file has a large inode number (> 0xFFFFFFFF), mkfs.ext4 does not detect its hard links correctly.

Prepare three hard-linked files for mkfs.ext4:

  $ ls -il rootfs_ota/a rootfs_ota/boot/b rootfs_ota/boot/c
  11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/a
  11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/b
  11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/c
  $ truncate -s 1M rootfs_ota.ext4
  $ mkfs.ext4 -F -i 8192 rootfs_ota.ext4 -L otaroot -U fd5f8768-c779-4dc9-adde-165a3d863349 -d rootfs_ota
  $ mkdir mnt && sudo mount rootfs_ota.ext4 mnt
  $ ls -il mnt/a mnt/boot/b mnt/boot/c
  12 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/a
  14 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/b
  15 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/c

After applying this fix:

  $ ls -il mnt/a mnt/boot/b mnt/boot/c
  12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/a
  12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/b
  12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/c

Since commit [382ed4a1 e2fsck: use proper types for variables][1] was applied, ext2_ino_t is used instead of ino_t for referencing inode numbers, but the type of is_hardlink's `ino' should not have been changed. ext2_ino_t is 32 bits wide, so if inode > 0xFFFFFFFF its value will be truncated.

Adding a debug printf to show the value of the inode confirms that the check for hardlinked files will always return false if inode > 0xFFFFFFFF:

  |--- a/misc/create_inode.c
  |+++ b/misc/create_inode.c
  |@@ -605,6 +605,7 @@ static int is_hardlink(struct hdlinks_s *hdlinks, dev_t dev, ext2_ino_t ino)
  | {
  | int i;
  |
  |+ printf("%s %d, %lX, %lX\n", __FUNCTION__, __LINE__, hdlinks->hdl[i].src_ino, ino);
  | for (i = 0; i < hdlinks->count; i++) {
  | if (hdlinks->hdl[i].src_dev == dev &&
  | hdlinks->hdl[i].src_ino == ino)

Here is the debug message:

  is_hardlink 608, 2913DB886, 913DB886

The length of ext2_ino_t is 32 bits (typedef __u32 __bitwise ext2_ino_t;), and ino_t is 64 bits on 64-bit systems (such as x86-64), so restore `ino' to ino_t.
[1] https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/commit/?id=382ed4a1c2b60acb9db7631e86dda207bde6076e Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
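The truncation reported in the debug message above (2913DB886 becoming 913DB886) can be reproduced in isolation. The sketch below uses stand-in typedefs rather than the real `ext2_ino_t` and `ino_t`, and the comparison helpers are hypothetical; it only shows why a stored 64-bit inode number never matches once the argument has been narrowed to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_ext2_ino_t;   /* stand-in for ext2_ino_t (__u32) */
typedef uint64_t demo_host_ino_t;   /* stand-in for a 64-bit ino_t */

/* Comparison as it behaved with the 32-bit parameter type. */
static int demo_match_narrow(demo_host_ino_t stored, demo_ext2_ino_t ino)
{
    return stored == ino;
}

/* Comparison with the parameter restored to the host inode type. */
static int demo_match_wide(demo_host_ino_t stored, demo_host_ino_t ino)
{
    return stored == ino;
}

/* Walk through the values from the commit message. */
void demo_hardlink_truncation(void)
{
    demo_host_ino_t host_ino = 0x2913DB886ULL;  /* 11026675846 */
    demo_ext2_ino_t narrowed = (demo_ext2_ino_t)host_ino;

    assert(narrowed == 0x913DB886U);            /* upper bits are lost */
    assert(!demo_match_narrow(host_ino, narrowed));
    assert(demo_match_wide(host_ino, host_ino));
}
```

Because the narrowed value differs from the stored one, every lookup misses and each hard link is created as a separate inode, which matches the link counts of 1 shown before the fix.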
2020-10-01mke2fs: Warn if fs block size is incompatible with DAXJan Kara2-29/+55
If we are creating filesystem on DAX capable device, warn if set block size is incompatible with DAX to give admin some hint why DAX might not be available. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01e4crypt: if salt is explicitly provided to add_key, then use itFlorian Schmaus2-5/+17
Providing -S and a path to 'add_key' previously exhibited an unintuitive behavior: instead of using the salt explicitly provided by the user, e4crypt would use the salt obtained via EXT4_IOC_GET_ENCRYPTION_PWSALT on the path. This was because set_policy() was still called with NULL as salt. With this change we now remember the explicitly provided salt (if any) and use it as argument for set_policy(). As a result,

  e4crypt add_key -S s:my-spicy-salt /foo

will now actually use 'my-spicy-salt' and not something else as salt for the policy set on /foo.

Signed-off-by: Florian Schmaus <flo@geekplace.eu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01tune2fs: reset MMP state on error exitAndreas Dilger5-38/+95
If tune2fs cannot perform the requested change, ensure that the MMP block is reset to the unused state before exiting. Otherwise, the filesystem will be left with mmp_seq = EXT4_MMP_SEQ_FSCK set, which prevents it from being mounted afterward:

  EXT4-fs warning (device dm-9): ext4_multi_mount_protect:311: fsck is running on the filesystem

Add a test to try some failed tune2fs operations and verify that the MMP block is left in a clean state afterward.

Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13672 Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01ext2fs: remove unused variable 'left'Lukas Czerner1-2/+1
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01e2fsck: use the right conversion specifier in e2fsck_allocate_memory()Lukas Czerner1-1/+1
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-01e2fsck: use size_t instead of int in string_copy()Lukas Czerner2-2/+2
len argument in string_copy() is int, but it is used with malloc(), strlen(), strncpy() and some callers use sizeof() to pass value in. So it really ought to be size_t rather than int. Fix it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
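A minimal sketch of a copy helper whose length parameter is `size_t`, in the spirit of the change described above. It is a hypothetical stand-in, not the e2fsck function: it takes no e2fsck context argument, and the convention that a length of 0 means "use strlen()" is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a string_copy()-style helper.
 * `len` is size_t so it matches strlen(), sizeof and malloc().
 */
static char *demo_string_copy(const char *str, size_t len)
{
    char *ret;

    if (!str)
        return NULL;
    if (len == 0)
        len = strlen(str);          /* assumed convention: 0 = whole string */
    assert(len <= SIZE_MAX - 1);    /* len + 1 must not wrap around */
    ret = malloc(len + 1);
    if (ret) {
        strncpy(ret, str, len);
        ret[len] = '\0';
    }
    return ret;
}
```

With a `size_t` parameter, a caller can pass `sizeof(buf)` or a `strlen()` result directly, with no implicit conversion to a signed type on the way in.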
2020-09-30tests: remove unnecessary uncompressed image fileTheodore Ts'o1-0/+0
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-08-26libext2fs: fix potential buffer overrun in __get_dirent_tail()Theodore Ts'o1-2/+1
If the file system is corrupted, there is a potential of a read-only buffer overrun. Fortunately, we don't actually use the result of that pointer dereference, and the overrun is at most 64k. Google-Bug-Id: #158564737 Fixes: eb88b751745b ("libext2fs: make ext2fs_dirent_has_tail() more strict") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-17debugfs: fix building rdebugfs (with READ_ONLY define)Theodore Ts'o4-41/+48
Fix bitrot for building a restricted version of debugfs, which does not require read/write access to the file system, and which only allows access to the file system metadata. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-05-17libext2fs: retry reading superblock on open when checksum is badTheodore Ts'o1-1/+6
When opening a file system which is mounted, it's possible that when ext2fs_open2() is racing with the kernel modifying the orphaned inode list, the superblock's checksum could be incorrect. So retry reading the superblock in the hopes that the problem will self-correct. Google-Bug-Id: 151453112 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-25libext2fs: batch calls to ext2fs_zero_blocks2()Theodore Ts'o1-5/+23
When allocating blocks for an indirect block mapped file, accumulate blocks to be zero'ed and then call ext2fs_zero_blocks2() to zero them in large chunks instead of block by block. This significantly speeds up mkfs.ext3 since we don't send a large number of ZERO_RANGE requests to the kernel, and while the kernel does batch write requests, it is not batching ZERO_RANGE requests. It's more efficient to batch in userspace in any case, since it avoids unnecessary system calls. Reported-by: Mario Schuknecht <mario.schuknecht@dresearch-fe.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-13e2fsck: fix off-by-one check when validating depth of an htreeTheodore Ts'o1-1/+1
Fixes: 3f0cf6475399 ("e2fsprogs: add support for 3-level htree") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-13libext2fs: avoid pointer arithmetic on `void *`Michael Forney1-1/+1
The pointer operand to the binary `+` operator must be to a complete object type. Signed-off-by: Michael Forney <mforney@mforney.org> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
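As a small illustration of the point above: `void` is an incomplete type, so ISO C does not define `+` on a `void *` (GCC accepts it as an extension). A portable way to step by bytes is to cast to `char *` first. The helper below is a hypothetical sketch, not the changed libext2fs line.

```c
#include <assert.h>
#include <stddef.h>

/* Advance a generic pointer by `off` bytes in a portable way. */
static void *demo_advance_bytes(void *base, size_t off)
{
    assert(base != NULL);
    return (char *)base + off;  /* char is a complete type of size 1 */
}
```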
2020-04-10tune2fs.8: document the stable_inodes featureEric Biggers1-0/+7
Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10ext4.5: document the stable_inodes featureEric Biggers1-0/+16
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10tune2fs: prevent stable_inodes feature from being clearedEric Biggers1-2/+1
Similar to encrypt and verity, once the stable_inodes feature has been enabled there may be files anywhere on the filesystem that require this feature. Therefore, in general it's unsafe to allow clearing it. Don't allow tune2fs to do so. Like encrypt and verity, it can still be cleared with debugfs if someone really knows what they're doing. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10tune2fs: prevent changing UUID of fs with stable_inodes featureEric Biggers1-0/+7
The stable_inodes feature is intended to indicate that it's safe to use IV_INO_LBLK_64 encryption policies, where the encryption depends on the inode numbers and thus filesystem shrinking is not allowed. However since inode numbers are not unique across filesystems, the encryption also depends on the filesystem UUID, and I missed that there is a supported way to change the filesystem UUID (tune2fs -U). So, make 'tune2fs -U' report an error if stable_inodes is set. We could add a separate stable_uuid feature flag, but it seems unlikely it would be useful enough on its own to warrant another flag. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10ext2fs: fix off-by-one in dx_grow_tree()Jan Kara1-1/+1
There is an off-by-one error in dx_grow_tree() when checking whether we can add another level to the tree. Thus we can grow tree too much leading to possible crashes in the library or corrupted filesystem. Fix the bug. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10ext2fs: fix error checking in dx_link()Jan Kara1-1/+1
dx_lookup() uses errcode_t return values. As such anything non-zero is an error, not values less than zero. Fix the error checking to avoid crashes on corrupted filesystems. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
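The difference between the two checks can be shown with a stand-in error type. In the sketch below, `demo_errcode_t` and the positive error value are assumptions made for illustration (the real `errcode_t` comes from the com_err library); the point is only that a `< 0` test lets a positive error code slip through, while a non-zero test catches it.

```c
#include <assert.h>

typedef long demo_errcode_t;                /* stand-in for errcode_t */
#define DEMO_ET_DIR_CORRUPTED 2133571400L   /* hypothetical positive code */

/* Pretend lookup: fails with a positive code on a corrupted directory. */
static demo_errcode_t demo_lookup(int dir_is_corrupt)
{
    return dir_is_corrupt ? DEMO_ET_DIR_CORRUPTED : 0;
}

/* Old-style check: only negative values are treated as errors. */
static int demo_notices_error_lt0(int dir_is_corrupt)
{
    return demo_lookup(dir_is_corrupt) < 0;
}

/* Corrected check: anything non-zero is an error. */
static int demo_notices_error_nonzero(int dir_is_corrupt)
{
    return demo_lookup(dir_is_corrupt) != 0;
}

void demo_errcode_check(void)
{
    assert(!demo_notices_error_lt0(1));     /* the error is missed */
    assert(demo_notices_error_nonzero(1));  /* the error is caught */
    assert(!demo_notices_error_nonzero(0)); /* success stays success */
}
```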
2020-04-10Teach makefiles to build all static programs using the target all-staticTheodore Ts'o6-2/+22
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-04-10ext4: add support for printing the error code associated with an errorTheodore Ts'o6-16/+80
The error code allows the kernel to bucket the possible cause of an ext4 corruption by Unix errno codes (e.g., EIO, EFSBADCRC, ESHUTDOWN, etc.) Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-23tests: use grep -E for better portability in r_inline_xattrTheodore Ts'o2-6/+6
Since r_inline_xattr is using an extended regexp, we need grep -E on some implementations of grep. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-23e2fsck: fix various gcc -Wall nitsTheodore Ts'o3-5/+5
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-23Use ext2_loff_t instead of loff_tTheodore Ts'o2-6/+6
The loff_t type is a glibc'ism and is not fully portable. Use ext2_loff_t instead. Fixes: 382ed4a1c2b6 ("e2fsck: use proper types for variables") Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Matthias Andree <matthias.andree@gmx.de>
2020-03-21Merge tag 'v1.45.6' into nextTheodore Ts'o31-525/+832
v1.45.6
2020-03-21Update release notes, etc., for the 1.45.6 releasev1.45.6Theodore Ts'o10-323/+420
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-20AOSP: tune2fs, resize2fs: make ramdisk binaries.Jaegeuk Kim2-18/+39
Bug: 149391799 Change-Id: I5183755915710e28a603e3f718f16813ea9991a0 Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> From AOSP commit: a13a88d0d557a396f63702fb8db008487e2384d7
2020-03-20AOSP: e2fsdroid: Don't skip unusable blocks in BaseFS.David Anderson1-26/+38
Currently, basefs_allocator will iterate through blocks owned by an inode until it finds a block that is free. This effectively ignores the logical to physical block mapping, which can lead to a bigger delta in the final image.

An example of how this can happen is if the BaseFS has a deduplicated block (D), that is not deduplicated in the new image:

  Old image: 1 2 3 D 4 5
  New image: 1 2 3 ? 4 5

If the allocator sees that "D" is not usable, and skips to block "4", we will have a non-ideal assignment.

  Bad image: 1 2 3 4 5 ?

This patch refactors get_next_block() to acquire at most one block. It's called a single time, and then only called in a loop if absolutely no blocks can be acquired from anywhere else. In a Virtual A/B simulation, this reduces the COW snapshot size by about 90MB.

Bug: 139201772 Test: manual test Change-Id: I354f0dee1ee191dba0e1f90491ed591dba388f7f From AOSP commit: a495b54f89b2ec0e46be8e3564e4852c6434687c
2020-03-20AOSP: e2fsdroid: Fix logical block sequencing in BaseFS.David Anderson2-5/+99
By iterating over blocks to write BaseFS, holes in the extent tree are skipped. This is problematic because the purpose of BaseFS is to preserve the logical to physical block assignment between builds. By not preserving the location of holes, the assignment can be incorrect.

For example, consider the following block list for a file:

  1 2 3 0 4 5

If this is recorded as:

  1 2 3 4 5

If the first block changes to a hole, the intended mapping will not be preserved at all:

  0 1 2 0 3

This patch makes two changes to e2fsdroid to fix this. The first change is that holes are now recorded in BaseFS, by iterating over the extent tree rather than the block list, and inserting zeroes where appropriate. The second change is that the block allocator now recognizes when blocks have been skipped (either to deduplication or to holes), and skips the same number of logical blocks in BaseFS as well. In a Virtual A/B simulation, this reduces the COW snapshot size by approximately 100MB.

Bug: 139201772 Test: m target-files-package, inspect .map files From AOSP commit: d391f3bf38cbe51718d5c3c0bb3e72b1a9978625
2020-03-20AOSP: e2fsdroid: Properly free the dedup block map.David Anderson1-0/+23
When BaseFS specifies the same block for two files, it gets added to a separate "dedup" bitmap, and removed from the free block bitmap. If the new build does not use every block in this bitmap, there will be an inconsistency: the block bitmap marks blocks as in-use when they are actually free. Although this doesn't matter for AOSP's read-only file systems, it does cause e2fsck to complain, which breaks the build. Fix the inconsistency by properly freeing all unused blocks within the dedup block set. Bug: 139201772 Test: build AOSP using BaseFS Change-Id: I6b6511eb713a56fec932f1d5668f1766d64d9479 From AOSP commit: 346bee6f8b97aefe7714688f738606c116099fbc
2020-03-20AOSP: Build e2freefragAlessio Balsini1-0/+19
Enable the build of e2freefrag to monitor the free space fragmentation in ext2/3/4 filesystems. Bug: 146078546 Test: m + e2freefrag on device Change-Id: Ia56e443a789ae881a03bdaeae1093567e1736c4c Signed-off-by: Alessio Balsini <balsini@google.com> From AOSP commit: ab77f6c79f3dab697cd56ad3b793e7d555ac9415
2020-03-20AOSP: Add -e2fsprogs to the e2fsprogs chattr and lsattr.Elliott Hughes1-2/+2
We want to start shipping the toybox chattr and lsattr on the device all the time, so the build system rightly complains that then we'd have two modules with the same name. I went with a suffix rather than a prefix so that tab completion works for folks still wanting to use the e2fsprogs versions. Bug: http://b/147769529 Test: builds Change-Id: Ib904fa6c709d29ce709302c61e452383c02cb9a3 From AOSP commit: 8525a455e7410461560a99a42feb0dbfabab5c8e
2020-03-20AOSP: Make ramdisk_available.Yifan Hong8-0/+17
Test: pass Bug: 147347110 Change-Id: Ie800ba1b56773dcc1b6563c4f19c27eccb9ffc1a From AOSP commit: f5a8e8fdefd78deae971a475a7fa43734eef205e
2020-03-20AOSP: Change #define to _BLKID_TYPES_HKousik Kumar2-2/+9
blkid_types.h and ext_types.h having the exact same content results in mismatches in remote RBE builds. Given blkid_types.h is actually supposed to be different, changing this to remove the mismatch. Test: Ran a build, and all e2fsprogs mismatches went away between local/remote. Change-Id: I63ab1719ee1d0ccd28907f0bc99531260251fa99 From AOSP commit: ec10b513c283706f984edeec47301b0661f7d283
2020-03-20AOSP: Allow resize2fs to compile with BUILD_HOST_staticDario Freni1-7/+22
Bug: 144477678 Test: BUILD_HOST_static=1 m resize2fs; m resize2fs Change-Id: I0986deccfe496153e662dcc3cc2fb1ffb6973c21 From AOSP commit: 2c767b86591c64cd7b84c5619e8d8b8a0afd557e
2020-03-20AOSP: Allow debugfs_static to be compiled as host tool.Dario Freni1-0/+1
Bug: 144477678 Test: m debugfs_static Change-Id: I7c360a2a381f8508578d14c32bbb280f386dd925 From AOSP commit: 742bb05a401eb2feb6caaee1c8d66fc1c37eef77
2020-03-20po: update ms.po (from translationproject.org)Sharuzzaman Ahmat Raslan1-149/+135
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-20misc: add text describing the impact of an inode size < 128 bytes in man pagesTheodore Ts'o2-8/+11
Addresses-Debian-Bug: #953493 Addresses-Debian-Bug: #953494 Addresses-Debian-Bug: #951808 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-20libe2p: add a thread-safe variant of e2p_feature2stringTheodore Ts'o2-4/+16
This commit adds the function e2p_feature_to_string() which allows the caller to pass in a preallocated buffer. Google-Bug-Id: 16978603 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
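The pattern behind this change, a caller-supplied buffer instead of a shared static one, can be sketched as follows. The two functions below are hypothetical illustrations with made-up names and a made-up "FLAG_n" format; they are not the libe2p functions and do not reproduce their signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Not thread-safe: every caller shares the same static buffer. */
static const char *demo_flag2string(unsigned int bit)
{
    static char buf[32];

    snprintf(buf, sizeof(buf), "FLAG_%u", bit);
    return buf;
}

/* Thread-safe variant: the caller owns the storage. */
static void demo_flag_to_string(unsigned int bit, char *buf, size_t buf_len)
{
    assert(buf != NULL && buf_len > 0);
    snprintf(buf, buf_len, "FLAG_%u", bit);
}
```

Two calls to the static-buffer version return the same pointer, so the second result overwrites the first; with the caller-buffer version each thread (or each call site) keeps its own result.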
2020-03-20libext2fs: fix the {set_get}_bitmap_range functions when bitmap->start > 7Theodore Ts'o1-2/+2
The bitmap array's set/get bitmap_range functions were not subtracting out bitmap->start. This doesn't matter for normal file systems, since the bitmap->start is zero or one, and the passed-in starting range is a multiple of eight, and the starting range is then divided by 8. But with a non-standard/fuzzed file system, bitmap->start could be significantly larger, and this could then lead to an out-of-bounds array memory reference. Google-Bug-Id: 147849134 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-20e2fsck: clarify overflow link count error messageJan Kara3-4/+24
When directory link count is set to overflow value (1) but during pass 4 we find out the exact link count would fit, we either silently fix this (which is not great because e2fsck then reports the fs was modified but output doesn't indicate why in any way), or we report that link count is wrong and ask whether we should fix it (in case -n option was specified). The second case is even more misleading because it suggests non-trivial fs corruption which then gets silently fixed on the next run. Similarly to how we fix up other non-problems, just create a new error message for the case directory link count is not overflown anymore and always report it to clarify what is going on. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> (cherry picked from commit 4ebce13292f54c96f43dcb1bd1d5b8df5dc8749d)
2020-03-16Merge branch 'maint' into nextTheodore Ts'o8-25/+19
2020-03-15tune2fs: update dir checksums when clearing dir_index featureJan Kara1-48/+95
When clearing dir_index feature while metadata_csum is enabled, we have to rewrite checksums of all indexed directories to update checksums of internal tree nodes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15tests: add test to exercise indexed directories with metadata_csumJan Kara4-0/+117
Indexed directories have somewhat different format when metadata_csum is enabled. Add test to exercise linking in indexed directories and e2fsck rehash code in this case. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15tests: modify f_large_dir test to exercise indexed dir handlingJan Kara2-5/+32
Modify f_large_dir test to create indexed directory and create entries in it. That way the new code in ext2fs_link() for addition of entries into indexed directories gets executed including various special cases when growing htree. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15ext2fs: implement dir entry creation in htree directoriesJan Kara1-52/+497
Implement proper creation of new directory entries in htree directories in ext2fs_link(). So far we just cleared EXT2_INDEX_FL and treated directory as unindexed however this results in mismatched checksums if metadata checksums are in use because checksums are placed in different places depending on htree node type. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15ext2fs: update allocation info earlier in ext2fs_mkdir() and ext2fs_symlink()Jan Kara2-14/+30
Currently, ext2fs_mkdir() and ext2fs_symlink() update allocation bitmaps and other information only close to the end of the function, in particular after calling to ext2fs_link(). When ext2fs_link() will support indexed directories, it will also need to allocate blocks and that would cause filesystem corruption in case allocation info isn't properly updated. So make sure ext2fs_mkdir() and ext2fs_symlink() update allocation info before calling into ext2fs_link(). [ Added error handling so the calls to ext2fs_{block,inode}_alloc_stats() can be undone if the newly created directory or symlink can not be linked into the directory. -- TYT ] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15debian: drop libattr1-dev from the build dependencies listTheodore Ts'o1-1/+1
The libattr has stopped providing attr/xattr.h; we now use sys/xattr.h. So there is no longer any reason to require that the libattr1-dev package be present when building e2fsprogs, so drop it. Addresses-Debian-Bug: #953926 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-15e2fsck: fix "make check" when using static librariesTheodore Ts'o1-2/+2
Fixes: 70303df16ca6 ("e2fsck: consistently use ext2fs_get_mem()") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14libext2fs: make ext2fs_dirent_has_tail() more strictTheodore Ts'o4-17/+14
Previously ext2fs_dirent_has_tail() would return true if the directory was corrupted. If the directory is corrupted, then by definition it doesn't have a valid checksum tail. (This fixes a big-endian failure on the master branch.) Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-10misc: fix typos in chattr's man pageSawood Alam1-2/+2
Plural form "directories" should be used along with "files". "id's" should be "ids" (i.e., plural form, not apostrophe). "much" should be "must". Signed-off-by: Sawood Alam <ibnesayeed@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-09mke2fs.conf: remove options.fname_encodingPino Toscano1-3/+0
Introduced with commit e7236a9476cd1fa5296fbc4aa573b36426901a08, it was later renamed to encoding, and turned into a fs_type-only option with commit 28887533bb64db318e74c38cd9c0ad6d0bb2ced2. Hence, remove an option that does not exist in the default configuration. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-07e2fsck: fix indexed dir rehash failure with metadata_csum enabledJan Kara1-1/+7
E2fsck directory rehashing code can fail with ENOSPC due to a bug in ext2fs_htree_intnode_maxrecs() which fails to take metadata checksum into account and thus e.g. e2fsck can decide to create 1 indirect level of index tree when two are actually needed. Fix the logic to account for metadata checksum. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
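The capacity calculation at issue can be sketched with a few constants. The sizes below (an 8-byte count/limit header, 8-byte index entries, and an 8-byte checksum tail) are assumptions stated for illustration, and the function is a simplified stand-in rather than the actual ext2fs_htree_intnode_maxrecs() code.

```c
#include <assert.h>

/* Assumed on-disk sizes, for illustration only. */
#define DEMO_DX_HEADER_SIZE 8u  /* count/limit header of an interior node */
#define DEMO_DX_ENTRY_SIZE  8u  /* one hash/block index entry */
#define DEMO_DX_TAIL_SIZE   8u  /* checksum tail when metadata_csum is on */

/* Maximum index entries that fit in one interior htree block. */
static unsigned int demo_intnode_maxrecs(unsigned int blocksize, int has_csum)
{
    unsigned int overhead = DEMO_DX_HEADER_SIZE;

    if (has_csum)
        overhead += DEMO_DX_TAIL_SIZE;
    assert(blocksize > overhead);
    return (blocksize - overhead) / DEMO_DX_ENTRY_SIZE;
}
```

Under these assumptions a 4096-byte block holds 511 entries without the checksum tail and 510 with it. Ignoring the tail overestimates capacity by one entry per node, which is enough to pick too few index levels for a directory that sits right at the boundary.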
2020-03-07e2fsck: clarify overflow link count error messageJan Kara4-4/+26
When directory link count is set to overflow value (1) but during pass 4 we find out the exact link count would fit, we either silently fix this (which is not great because e2fsck then reports the fs was modified but output doesn't indicate why in any way), or we report that link count is wrong and ask whether we should fix it (in case -n option was specified). The second case is even more misleading because it suggests non-trivial fs corruption which then gets silently fixed on the next run. Similarly to how we fix up other non-problems, just create a new error message for the case directory link count is not overflown anymore and always report it to clarify what is going on. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-07Merge branch 'maint' into nextTheodore Ts'o47-346/+388
2020-03-07libext2fs: check open(O_EXCL) first in ismounted.cLukas Czerner1-15/+27
Currently the ext2fs_check_mount_point() will use the open(O_EXCL) check on Linux after all the other checks are done. However it is not necessary to check mntent if open(O_EXCL) succeeds because it means that the device is not mounted. Moreover the commit ea4d53b7 introduced a regression where the following set of commands fails: vgcreate mygroup /dev/sda lvcreate -L 1G -n lvol0 mygroup mkfs.ext4 /dev/mygroup/lvol0 mount /dev/mygroup/lvol0 /mnt lvrename /dev/mygroup/lvol0 /dev/mygroup/lvol1 lvcreate -L 1G -n lvol0 mygroup mkfs.ext4 /dev/mygroup/lvol0 <<<--- This fails It fails because it thinks that /dev/mygroup/lvol0 is mounted because the device name in /proc/mounts is not updated following the lvrename. Move the open(O_EXCL) check before the mntent check and return immediately if the device is not busy. Fixes: ea4d53b7 ("libext2fs/ismounted.c: check device id in advance to skip false device names") Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reported-by: Karel Zak <kzak@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
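The exclusive-open check described above can be sketched roughly as follows. This is a minimal illustration, not the code in ismounted.c: the helper name device_is_busy() is hypothetical, and it relies on the Linux-specific behavior that open() with O_EXCL (and without O_CREAT) on a block device fails with EBUSY when the device is in use.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the libext2fs implementation.
 * Returns 0 if the exclusive open succeeded (device not in use),
 * 1 if the kernel reported EBUSY, and -1 (errno set) on any other error.
 */
static int device_is_busy(const char *path)
{
	int fd = open(path, O_RDONLY | O_EXCL);

	if (fd >= 0) {
		/* Nobody else holds the device; no need to scan mntent. */
		close(fd);
		return 0;
	}
	return (errno == EBUSY) ? 1 : -1;
}
```

A caller would presumably fall back to the slower /proc/mounts scan only when this returns a non-zero value, which is what lets the stale device name left behind by lvrename be ignored.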
2020-03-07mke2fs: fix permissions setting with "mke2fs -d /path/files"Theodore Ts'o1-2/+2
Set the permissions for directories in cases where the owner permissions are not rwx. This was reported[1] by Robert Yang but we are using a different approach to fixing the issue. [1] https://lore.kernel.org/r/1582542522-97508-1-git-send-email-liezhi.yang@windriver.com Also set the permissions in a more portable way by making a distinction between the host OS's permissions stats and Linux's permissions. We still assume the low 12 bits are the historical Unix assignments, but we don't assume S_IFMT bits are the same as Linux's. Reported-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-07libext2fs: don't use O_DIRECT for files on tmpfsAndreas Dilger9-86/+14
If a filesystem image is on tmpfs, opening it with O_DIRECT for reading the MMP will fail. This is unnecessary, since the image file can't really be open on another node at this point. If the open with O_DIRECT fails, retry without it when plausible. Remove the special-casing of tmpfs from the mmp test cases. Change-Id: I41f4b31657b06f62f10be8d6e524d303dd36a321 Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-04e2fsck: avoid overflow with very large dirsAndreas Dilger3-49/+81
alloc_size_dir() multiplies signed ints when allocating the buffer for rehashing an htree-indexed directory. This will overflow when the directory size is above 4GB, which is possible with largedir directories having about 100M entries, assuming an average 3/4 leaf fullness and 24-byte filenames, or fewer with longer filenames. The same problem exists in get_next_block(). Similarly, the out_dir struct used a signed int for the number of blocks in the directory, which may result in a negative size if the directory is over 2GB (about 50M entries or fewer). Use appropriate unsigned variables for block counts, and use larger types for calculating the byte count for memory offsets/sizes. Such large directories have not been seen yet, but are not too far away. The ext2fs_get_array() function will properly calculate the needed memory allocation, and detect overflow on 32-bit systems. Add ext2fs_resize_array() to do the same for array resize. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
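As a rough illustration of the two fixes described above (overflow-checked array allocation and wider types for byte counts), here is a minimal sketch. The helper names alloc_array_checked() and block_byte_offset() are hypothetical and are not the libext2fs functions; the real ext2fs_get_array() has a different signature.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate count * size bytes, failing cleanly
 * when the product does not fit in size_t (e.g. on 32-bit systems).
 */
static void *alloc_array_checked(size_t count, size_t size)
{
	if (size != 0 && count > SIZE_MAX / size)
		return NULL;	/* count * size would overflow */
	return malloc(count * size);
}

/*
 * Hypothetical helper: byte offset of a directory block, computed in
 * 64 bits so that offsets above 4GB do not wrap.
 */
static uint64_t block_byte_offset(unsigned int blk, unsigned int blocksize)
{
	/* Widen one operand before multiplying, not the result after. */
	return (uint64_t)blk * blocksize;
}
```

The key point is that the cast must be applied to an operand before the multiplication; casting the already-computed product would preserve the wrapped value.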
2020-03-04misc: handle very large files with filefragAndreas Dilger2-14/+26
Avoid overflowing the column-width calc printing files over 4B blocks. Document the [KMG] suffixes for the "-b <blocksize>" option. The blocksize is limited to at most 1GiB blocksize to avoid shifting all extents down to zero GB in size. Even the use of 1GB blocksize is unlikely, but non-ext4 filesystems may use multi-GB extents. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-04e2fsck: consistently use ext2fs_get_mem()Andreas Dilger7-40/+41
Consistently use ext2fs_get_mem() and ext2fs_free_mem() instead of calling malloc() and free() directly in e2fsck. In several places it is possible to use ext2fs_get_memzero() instead of explicitly calling memset() on the memory afterward. This is just a code cleanup, and does not fix any specific bugs. [ Fix up library dependencies in e2fsck/Makefile.in to fix "make check" breakages. -- TYT ] Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29e2fsck: fix overflow if more than 4B inodesAndreas Dilger4-5/+8
Even though we don't have support for filesystems with over 4B inodes in the current e2fsprogs, this may happen in the future. There are latent overflow bugs when calculating the number of inodes in the filesystem that can trivially be fixed now, rather than waiting for them to be hit at some point in the future. The block number calcs are already correct in this code. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29debugfs: print inode numbers as unsignedAndreas Dilger4-14/+14
Print inode numbers as unsigned values, to avoid printing negative numbers for inodes above 2B. Flags should be printed as hex instead of signed decimal values. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
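A small sketch of the formatting change described above, assuming 32-bit inode numbers and flags. The function names are hypothetical and not taken from debugfs.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format an inode number as unsigned decimal. */
static int format_ino(char *buf, size_t len, uint32_t ino)
{
	/* "%d" would print inodes above 2^31 - 1 as negative numbers. */
	return snprintf(buf, len, "%u", (unsigned int)ino);
}

/* Hypothetical helper: format a flags word as hex. */
static int format_flags(char *buf, size_t len, uint32_t flags)
{
	return snprintf(buf, len, "0x%x", (unsigned int)flags);
}
```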
2020-02-29debugfs: allow comment lines in command fileAndreas Dilger1-0/+4
Allow comment lines with '#' at the start of the line in the command file passed in to debugfs via the "-f" option or from standard input. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
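The comment handling described above amounts to a check on the first character of each line. A minimal sketch, with a hypothetical helper name and without claiming this is how debugfs implements it:

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if a command-file line should be
 * skipped as a comment, i.e. its first character is '#'.
 */
static int is_comment_line(const char *line)
{
	return line != NULL && line[0] == '#';
}
```

Under this sketch a '#' that appears later in a line is not treated as a comment marker, matching the "at the start of the line" wording above.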
2020-02-29e2fsck: reduce memory usage for many directoriesAndreas Dilger3-15/+14
Pack struct dx_dir_info and dx_dirblock_info properly in memory, to avoid holes, and fields are not larger than necessary. This reduces the memory needed for each hashed dir, according to pahole(1) from: struct dx_dir_info { /* size: 32, cachelines: 1, members: 6 */ /* sum members: 26, holes: 1, sum holes: 2 */ /* padding: 4 */ }; struct dx_dirblock_info { /* size: 56, cachelines: 1, members: 9 */ /* sum members: 48, holes: 2, sum holes: 8 */ /* last cacheline: 56 bytes */ }; to 8 bytes less for each directory and directory block, and leaves space for future use if needed (e.g. larger numblocks): struct dx_dir_info { /* size: 24, cachelines: 1, members: 6 */ /* sum members: 20, holes: 1, sum holes: 4 */ /* bit holes: 1, sum bit holes: 7 bits */ }; struct dx_dirblock_info { /* size: 48, cachelines: 1, members: 9 */ }; Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29e2fsck: avoid mallinfo() if over 2GB allocatedAndreas Dilger2-15/+15
Don't use mallinfo() for determining the amount of memory used if it is over 2GB. Otherwise, the signed ints used by this interface can overflow and return garbage values. This makes the actual amount of memory used by e2fsck misleading and hard to determine. Instead, use brk() to get the total amount of memory allocated, and print this if the more detailed mallinfo() information is not suitable for use. There does not appear to be a mallinfo64() variant of this function. There does appear to be an abomination named malloc_info() that writes XML-formatted malloc stats to a FILE stream that would need to be read and parsed in order to get these stats, but that doesn't seem worthwhile. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Shilong Wang <wshilong@ddn.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29e2fsck: use proper types for variablesAndreas Dilger10-33/+38
Use ext2_ino_t instead of ino_t for referencing inode numbers. Use loff_t for file offsets, and dgrp_t for group numbers. Cast products to ssize_t before multiplication to avoid overflow. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Shilong Wang <wshilong@ddn.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29e2fsck: fix e2fsck_allocate_memory() overflowAndreas Dilger6-37/+39
e2fsck_allocate_memory() takes an "unsigned int size" argument, which will overflow for allocations above 4GB. This happens for dir_info and dx_dir_info arrays when there are more than 350M directories in a filesystem, and for the dblist array above 180M directories. There is also a risk of overflow during the binary search in both e2fsck_get_dir_info() and e2fsck_get_dx_dir_info() when the midpoint of the array is calculated, if there would be more than 2B directories in the filesystem and working above the half way point. Also, in some places inode numbers are "int" instead of "ext2_ino_t", which can also cause problems with the array size calculations, and makes it hard to identify where inode numbers are used. Fix e2fsck_allocate_memory() to take an "unsigned long" argument to match ext2fs_get_mem(), so that it can do single memory allocations over 4GB. Fix e2fsck_get_dir_info() and e2fsck_get_dx_dir_info() to temporarily use an unsigned long long value to calculate the midpoint (which will always fit into an ext2_ino_t again afterward). Change variables that hold inode numbers to be ext2_ino_t, and print them as unsigned values instead of printing negative inode numbers. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Shilong Wang <wshilong@ddn.com> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
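The midpoint overflow described above is the classic binary-search bug: adding two 32-bit indices can wrap before the division by two. A minimal sketch of the widened calculation, using a stand-in typedef for ext2_ino_t and hypothetical function names (this is not the e2fsck_get_dir_info() code):

```c
#include <stdint.h>

typedef uint32_t demo_ino_t;	/* stand-in for ext2_ino_t */

/* Midpoint of [low, high] computed in 64 bits so the sum cannot wrap. */
static uint32_t midpoint(uint32_t low, uint32_t high)
{
	/* The result always fits back into 32 bits after the division. */
	return (uint32_t)(((unsigned long long)low + high) / 2);
}

/*
 * Hypothetical lookup: binary search over a sorted array of inode
 * numbers. Returns the index of ino, or -1 if it is not present.
 */
static long long find_ino(const demo_ino_t *arr, uint32_t count,
			  demo_ino_t ino)
{
	uint32_t low = 0, high = count;	/* search the range [low, high) */

	while (low < high) {
		uint32_t mid = midpoint(low, high);

		if (arr[mid] == ino)
			return mid;
		if (arr[mid] < ino)
			low = mid + 1;
		else
			high = mid;
	}
	return -1;
}
```

With plain 32-bit arithmetic, midpoint(3000000000, 4000000000) would wrap around; in the widened form it yields 3500000000.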
2020-02-27tst_libext2fs: Avoid multiple definition of global variablesLukas Czerner3-1/+9
gcc version 10 changed the default from -fcommon to -fno-common and as a result e2fsprogs make check tests fail because tst_libext2fs.c ends up with a build error. This is because it defines two global variables debug_prog_name and extra_cmds that are already defined in debugfs/debugfs.c. With -fcommon the linker was able to resolve those into the same object, however with -fno-common it's no longer able to do it and we end up with multiple definition errors. Fix the problem by using SKIP_GLOBDEFS macro to skip the variables definition in debugfs.c. Note that debug_prog_name is also defined in lib/ext2fs/extent.c when DEBUG macro is used, but this does not work even with older gcc versions and is never used regardless so I am not going to bother with it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-24chattr.1: improve attribute formatting with labels and indented paragraphsJeremy Visser1-19/+40
By convention, lists of options in man pages use a label followed by an indented description, such as this example from the Options section: -R Recursively change attributes of directories and their contents. But the Attributes section places the available attributes mid-sentence, which makes it visually more difficult to parse: A file with the 'a' attribute set can only be opened in append mode for writing. [...] When a file with the 'A' attribute set is accessed, its atime record is not modified. [...] This patch places a label beside each attribute description, which (in my opinion) improves readability, especially when visually skimming the list. For example: a A file with the 'a' attribute set can only be opened in append mode for writing. A When a file with the 'A' attribute set is accessed, its atime record is not modified. Signed-off-by: Jeremy Visser <jeremyvisser@google.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-15libext2fs: avoid array buffer overruns caused by insane directory blocksTheodore Ts'o1-2/+10
Reported-by: canardo909@gmx.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-12libext2fs: fix potential OOB read check_for_inode_bad_blocks()Theodore Ts'o1-0/+7
If the bad block list has been reset in the middle of an inode scan, it's possible for bb->list[scan->bad_blocks_ptr] to result in an out-of-bounds read access. This is highly unlikely to happen under normal circumstances; in particular, we generally don't use bad block inodes any more. In addition, this would only happen if the bad block inode itself is corrupt so e2fsck needs to wipe it out. This might cause e2fsck to crash, but it will more likely cause a part of the inode table to be wrongly considered invalid, causing file system to be incorrectly fixed. This was reported by TALOS as TALOS-2020-0974 and CVE-2020-6057, but after closer examination, we don't believe this can be used in any way to exploit the system or release information about the system, since all this can do is to cause part of the inode table to be skipped when it shouldn't be, and this can't be leveraged since any information about the ASLR of the process is obsolete once e2fsck exits. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-26mke2fs: set overhead in super blockLi Dongyang35-44/+156
If overhead is not recorded in the super block, it is calculated during mount in kernel, for bigalloc file systems it takes O(groups**2) in time. For a 1PB device with 32K cluster size it takes ~12 mins to mount, with most of the time spent on figuring out overhead. While we can not improve the overhead algorithm in kernel due to the nature of bigalloc, we can work out the overhead during mke2fs and set it in the super block, avoiding calculating it every time when it mounts. Overhead is s_first_data_block plus internal journal blocks plus the block and inode bitmaps, inode table, super block backups and group descriptor blocks for every group. This patch introduces ext2fs_count_used_clusters(), which calculates the clusters used in the block bitmap for the given range. When bad blocks are involved, it gets tricky because the blocks counted as overhead and the bad blocks can end up in the same allocation cluster. In this case we will unmark the bad blocks from the block bitmap, convert to cluster bitmap and get the overhead, then mark the bad blocks back in the cluster bitmap. Reset the overhead to zero when resizing, we can not simply count the used blocks as overhead like we do when mke2fs. The overhead can be calculated by kernel side during mount. Signed-off-by: Li Dongyang <dongyangli@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-26ext2fs: rename "s_overhead_blocks" to "s_overhead_clusters"Li Dongyang5-7/+7
Rename s_overhead_blocks field from struct ext2_super_block to make it consistent with the kernel counterpart. Signed-off-by: Li Dongyang <dongyangli@ddn.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-26libext2fs: optimize ext2fs_convert_subcluster_bitmap()Li Dongyang1-13/+7
For a bigalloc filesystem, converting the block bitmap from blocks to chunks in ext2fs_convert_subcluster_bitmap() can take a long time when the device is huge, because we test the bitmap bit-by-bit using ext2fs_test_block_bitmap2(). Use ext2fs_find_first_set_block_bitmap2() which is more efficient for mke2fs when the fs is mostly empty. e2fsck can also benefit from this during pass1 block scanning. Time taken for "mke2fs -O bigalloc,extent -C 131072 -b 4096" on a 1PB device: without patch: real 27m49.457s user 21m36.474s sys 6m9.514s with patch: real 6m31.908s user 0m1.806s sys 6m29.697s Signed-off-by: Li Dongyang <dongyangli@ddn.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-25Merge branch 'maint' into nextTheodore Ts'o14-67/+66
2020-01-24mmp: abstract out repeated 'sizeof(buf), buf' usageAndreas Dilger13-70/+53
The printf("%.*s") format requires both the buffer size and buffer pointer to be specified for each use. This is repeatedly given as "(int)sizeof(buf), (char *)buf" for mmp_nodename and mmp_bdevname fields, with typecasts to avoid compiler warnings. Add a helper macro EXT2_LEN_STR() to avoid repeated boilerplate code. This can also be used for other superblock buffer fields that may not have NUL-terminated strings (e.g. s_volume_name, s_last_mounted, s_{first,last}_error_func, s_mount_opts) to simplify code and avoid the need for temporary buffers for NUL-termination. Annotate the superblock string fields that may not be NUL-terminated. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
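A sketch of what such a helper macro can look like, built directly from the "(int)sizeof(buf), (char *)buf" pair quoted above. The macro name LEN_STR, the struct, and the function are hypothetical stand-ins; the exact definition of EXT2_LEN_STR() in e2fsprogs may differ.

```c
#include <stdio.h>

/*
 * Hypothetical macro modeled on the helper described above: expands to
 * the two arguments that a "%.*s" conversion consumes. It only works on
 * true arrays, since sizeof() on a pointer would give the pointer size.
 */
#define LEN_STR(buf)	(int)sizeof(buf), (char *)(buf)

/* Hypothetical struct with a fixed-size, possibly unterminated field. */
struct demo_mmp {
	unsigned char nodename[8];	/* may be completely filled, no NUL */
};

/* Format the field safely: at most sizeof(nodename) bytes are read. */
static int format_nodename(char *out, size_t outlen, struct demo_mmp *m)
{
	return snprintf(out, outlen, "%.*s", LEN_STR(m->nodename));
}
```

Because the precision caps how many bytes printf reads, a field that fills its buffer completely is printed in full without reading past the end, and a shorter NUL-terminated value is printed up to the NUL as usual.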
2020-01-24mmp: don't assume NUL termination for MMP stringsAndreas Dilger6-13/+29
Don't assume that mmp_nodename and mmp_bdevname are NUL terminated, since very long node/device names may completely fill the buffers. Limit string printing to the maximum buffer size for safety, and change the field definitions to __u8 to make it more clear that they are not NUL-terminated strings, as is done with other strings in the superblock that do not have NUL termination. Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-23filefrag: add -E option to display the extent status cacheTheodore Ts'o3-9/+80
Ext4 has an extent status cache; add the fiemap extensions so we can query the kernel for the extent status cache information. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-17e2fsck: restart the full e2fsck run if the bad block inode is invalidateTheodore Ts'o2-11/+5
Previously, we just cleared the bad block list and restarted the inode scan, but we didn't do a full reset of all of e2fsck's state. When the code handling this case was written, we didn't have the framework to do a restarted run. Now that we do, we can simplify the code and make it more correct. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-16e2fsck: clean up unwind_pass1() as it's no longer really neededTheodore Ts'o1-12/+2
We now restart the full e2fsck instead of unwinding and restarting pass1. So most of what used to be in unwind_pass1() has been moved elsewhere. Let's get rid of it entirely, which simplifies and shrinks pass1.c slightly. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-16libext2fs: don't needlessly byte swap the group descriptors in ext2fs_flushTheodore Ts'o1-10/+14
If the EXT2_FLAG_SUPER_ONLY is set, there's no reason to allocate the shadow block group descriptors and byte swap the group descriptors. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-16libext2fs: teach ext2fs_flush() to check if group descriptors are loadedTheodore Ts'o1-0/+5
If the EXT2_FLAG_SUPER_ONLY is not set, and the group descriptors are not loaded, ext2fs_flush[2]() will return EXT2_ET_NO_GDESC. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-16Merge branch 'maint' into nextTheodore Ts'o4-4/+24
2020-01-16libext2fs: reserve the error code EXT2_ET_NO_GDESCTheodore Ts'o2-0/+6
This is really only needed in the 1.46+ where the EXT2_FLAG_SUPER_ONLY is honored by ext2fs_open to only read the superblock, so that fs->group_desc can be NULL. We define it in the maint branch so that we can be sure the error tables are kept in sync (in the unlikely case that a new error code needs to be assigned in the maint branch). Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-14libext2fs: fix crash in ext2fs_image_super_write() on Big Endian systemsTheodore Ts'o1-4/+4
This is a similar fix to c9a8c53b17cc ("libext2fs: fix crash in ext2fs_open2() on Big Endian systems"). Commit e6069a05: ("Teach ext2fs_open2() to honor the EXT2_FLAG_SUPER_ONLY flag") changed how the function ext2fs_group_desc() handled a request for a gdp pointer for a group larger than the number of groups in the file system; it now returns NULL, instead of returning a pointer beyond the end of the array. Previously, the ext2fs_image_super_write() function would swap all of the block group descriptors in a block, even if they are beyond the end of the file system. This was OK, since we were not overrunning the allocated memory, since it was rounded to a block boundary. But now that ext2fs_group_desc() would return NULL for those gdp, it would cause ext2fs_open2(), when it was byte swapping the block group descriptors on Big Endian systems, to dereference a null pointer and crash. This commit adds a NULL pointer check to avoid byte swapping those block group descriptors that are in a bg descriptor block but beyond the end of the file system, to address this crash. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Anatoly Pugachev <matorola@gmail.com>
2020-01-08libcom_err: deal with the fact that the Hurd error messages are not zero-basedTheodore Ts'o2-4/+18
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-07Merge tag 'v1.45.5' into nextTheodore Ts'o16-202/+410
v1.45.5
2020-01-06Update release notes, etc., for the 1.45.5 releasev1.45.5Theodore Ts'o10-192/+321
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-06libext2fs: always compile swapfs functions on all architecturesTheodore Ts'o2-3/+13
Only compiling the ext2fs_swap_* functions on big-endian systems causes debian/libext2fs2.symbols to need to differ between little-endian and big-endian architectures. Including the ext2fs_swap_* functions increases the size of the library by ~6k, which is not a big deal. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-06e2scrub, e2scrub_all: don't sleep unnecessarily in exitcodeTheodore Ts'o2-8/+2
The two second sleep is only needed in e2scrub, and when there is a failure, so that systemd has a chance to gather the log output before e2scrub exits. It's not needed if the script is exiting successfully, and it's never needed for e2scrub_all ever. Addresses-Debian-Bug: #948193 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-06debian: add autopkgtest filesTheodore Ts'o3-0/+75
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-06libext2fs: don't scan /etc/mtab if file system not found in /proc/mountsTheodore Ts'o1-1/+1
Previously we would scan /etc/mtab if the device is not found in /proc/mounts. This is because previously, /etc/mtab would have the filename for a loopback mount, while /proc/mounts would only have something like /dev/loop0. Since on many systems /etc/mtab is now a symlink to /proc/mounts, ismounted.c has a special function, check_loop_mounted. For this reason, it's not necessary to fall back to trying to scan /etc/mtab if a device / filename is not found from scanning /proc/mounts. This also prevents failures if the file /etc/mtab does not exist but /proc/mounts does exist when checking to see if a device is mounted when it isn't. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-02Merge branch 'maint' into nextTheodore Ts'o36-2941/+3306
2020-01-01e2fsck: don't check for future superblock times if checkinterval == 0Theodore Ts'o1-2/+2
We are no longer enabling periodic file system checks by default in mke2fs. The only reason why we force file system checks if the last mount time or last write time in the superblock is in the future is that this might bypass the periodic file system checks. So if the checkinterval is zero, skip the last mount/write time checks since there's no reason to force a check just because the system clock is incorrect. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01debian: update debhelper compat level to 12Theodore Ts'o1-1/+1
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01config: update config.{guess,sub}Theodore Ts'o2-1322/+1505
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01po: update zh_CN.po (from translationproject.org)Boyuan Yang1-1302/+1348
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01po: update ms.po (from translationproject.org)Sharuzzaman Ahmat Raslan1-95/+64
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01Drop use of -pedantic when doing gcc-wallTheodore Ts'o3-47/+3
With newer versions of gcc -pedantic is *super* pedantic, and generates way too much noise. So we drop it, and thus we don't need util/gcc-wall-cleanup and util/static-analysis-cleanup. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-01-01libext2fs: use ext2fs_file_llseek in inode_io.cTheodore Ts'o1-6/+6
Enable the use of files > 2GB when using the inode_io manager. Signed-off-by: Theodore Ts'o <tytso@mit.edu>