diff options
author | Ævar Arnfjörð Bjarmason <avarab@gmail.com> | 2022-07-28 01:13:39 +0200 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2022-07-27 16:35:40 -0700 |
commit | faececa53f904845b615631f61f158a622c2ce08 (patch) | |
tree | 7abea90431f950e2cd3c06eac4f4e5ef0cfe1012 /t/test-lib.sh | |
parent | e92684e1a26a2e61f53c691d65e01b446914784d (diff) | |
download | git-faececa53f904845b615631f61f158a622c2ce08.tar.gz |
test-lib: have the "check" mode for SANITIZE=leak consider leak logs
As noted in previous on-list discussions[1] we have various tests that
will falsely report being leak-free because we're missing the relevant
exit code from LSAN as summarized below.
We should fix those issues, but in the meantime and as an additional
sanity check we can and should consider our own ASAN logs before
reporting that a test is leak-free.
Before this compiling with SANITIZE=leak and running:
./t6407-merge-binary.sh
Will exit successfully, now we'll get an error and an informative
message on:
GIT_TEST_SANITIZE_LEAK_LOG=true ./t6407-merge-binary.sh
Even better, as noted in the updated t/README we'll now error out when
combined with the "check" mode:
GIT_TEST_PASSING_SANITIZE_LEAK=check \
GIT_TEST_SANITIZE_LEAK_LOG=true \
./t4058-diff-duplicates.sh
Why do we miss these leaks? Because:
* We have leaks inside "test_expect_failure" blocks, which by design
will not distinguish a "normal" failure from an abort() or
segfault. See [1] for a discussion of it shortcomings.
* We have "git" invocations outside of "test_expect_success",
e.g. setup code in the main body of the test, or in test helper
functions that don't use &&-chaining.
* Our tests will otherwise catch segfaults and abort(), but if we
invoke a command that invokes another command it needs to ferry the
exit code up to us.
Notably a command that e.g. might invoke "git pack-objects" might
itself exit with status 128 if that "pack-objects" segfaults or
abort()'s. If the test invoking the parent command(s) is using
"test_must_fail" we'll consider it an expected "ok" failure.
* run-command.c doesn't (but probably should) ferry up such exit
codes, so for e.g. "git push" tests where we expect a failure and an
underlying "git" command fails we won't ferry up the segfault or
abort exit code.
* We have gitweb.perl and some other perl code ignoring return values
from close(), i.e. ignoring exit codes from "git rev-parse" et al.
* We have in-tree shellscripts like "git-merge-one-file.sh" invoking
git commands, they'll usually return their own exit codes on "git"
failure, rather then ferrying up segfault or abort() exit code.
E.g. these invocations in git-merge-one-file.sh leak, but aren't
reflected in the "git merge" exit code:
src1=$(git unpack-file $2)
src2=$(git unpack-file $3)
That case would be easily "fixed" by adding a line like this after
each assignment:
test $? -ne 0 && exit $?
But we'd then in e.g. "t6407-merge-binary.sh" run into
write_tree_trivial() in "builtin/merge.c" calling die() instead of
ferrying up the relevant exit code.
Let's remove "TEST_PASSES_SANITIZE_LEAK=true" from tests we
were falsely marking as leak-free.
In the case of t6407-merge-binary.sh it was marked as leak-free in
9081a421a6d (checkout: fix "branch info" memory leaks,
2021-11-16). I'd previously removed other bad
"TEST_PASSES_SANITIZE_LEAK=true" opt-ins in the series merged in
ea05fd5fbf7 (Merge branch 'ab/keep-git-exit-codes-in-tests',
2022-03-16). The case of t1060-object-corruption.sh is more subtle,
and will be discussed in a subsequent commit.
1. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 't/test-lib.sh')
-rw-r--r-- | t/test-lib.sh | 80 |
1 files changed, 80 insertions, 0 deletions
diff --git a/t/test-lib.sh b/t/test-lib.sh index 3d0c8896e0..10258def7b 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -309,6 +309,7 @@ TEST_RESULTS_SAN_FILE_PFX=trace TEST_RESULTS_SAN_DIR_SFX=leak TEST_RESULTS_SAN_FILE= TEST_RESULTS_SAN_DIR="$TEST_RESULTS_DIR/$TEST_NAME.$TEST_RESULTS_SAN_DIR_SFX" +TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP= TRASH_DIRECTORY="trash directory.$TEST_NAME$TEST_STRESS_JOB_SFX" test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY" case "$TRASH_DIRECTORY" in @@ -316,6 +317,16 @@ case "$TRASH_DIRECTORY" in *) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$TRASH_DIRECTORY" ;; esac +# Utility functions using $TEST_RESULTS_* variables +nr_san_dir_leaks_ () { + # stderr piped to /dev/null because the directory may have + # been "rmdir"'d already. + find "$TEST_RESULTS_SAN_DIR" \ + -type f \ + -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null | + wc -l +} + # If --stress was passed, run this test repeatedly in several parallel loops. if test "$GIT_TEST_STRESS_STARTED" = "done" then @@ -1191,6 +1202,66 @@ test_atexit_handler () { teardown_malloc_check } +sanitize_leak_log_message_ () { + local new="$1" && + local old="$2" && + local file="$3" && + + printf "With SANITIZE=leak at exit we have %d leak logs, but started with %d + +This means that we have a blindspot where git is leaking but we're +losing the exit code somewhere, or not propagating it appropriately +upwards! + +See the logs at \"%s.*\"; +those logs are reproduced below." \ + "$new" "$old" "$file" +} + +check_test_results_san_file_ () { + if test -z "$TEST_RESULTS_SAN_FILE" + then + return + fi && + local old="$TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP" && + local new="$(nr_san_dir_leaks_)" && + + if test $new -le $old + then + return + fi && + local out="$(sanitize_leak_log_message_ "$new" "$old" "$TEST_RESULTS_SAN_FILE")" && + say_color error "$out" && + if test "$old" != 0 + then + echo && + say_color error "The logs include output from past runs to avoid" && + say_color error "that remove 'test-results' between runs." + fi && + say_color error "$(cat "$TEST_RESULTS_SAN_FILE".*)" && + + if test -n "$passes_sanitize_leak" && test "$test_failure" = 0 + then + say "As TEST_PASSES_SANITIZE_LEAK=true and our logs show we're leaking, exit non-zero!" && + invert_exit_code=t + elif test -n "$passes_sanitize_leak" + then + say "As TEST_PASSES_SANITIZE_LEAK=true and our logs show we're leaking, and we're failing for other reasons too..." && + invert_exit_code= + elif test -n "$sanitize_leak_check" && test "$test_failure" = 0 + then + say "As TEST_PASSES_SANITIZE_LEAK=true isn't set the above leak is 'ok' with GIT_TEST_PASSING_SANITIZE_LEAK=check" && + invert_exit_code= + elif test -n "$sanitize_leak_check" + then + say "As TEST_PASSES_SANITIZE_LEAK=true isn't set the above leak is 'ok' with GIT_TEST_PASSING_SANITIZE_LEAK=check" && + invert_exit_code=t + else + say "With GIT_TEST_SANITIZE_LEAK_LOG=true our logs revealed a memory leak, exit non-zero!" && + invert_exit_code=t + fi +} + test_done () { # Run the atexit commands _before_ the trash directory is # removed, so the commands can access pidfiles and socket files. @@ -1270,6 +1341,8 @@ test_done () { error "Tests passed but test cleanup failed; aborting" fi + check_test_results_san_file_ "$test_failure" + if test -z "$skip_all" && test -n "$invert_exit_code" then say_color warn "# faking up non-zero exit with --invert-exit-code" @@ -1286,6 +1359,8 @@ test_done () { say_color error "# failed $test_failure among $msg" say "1..$test_count" + check_test_results_san_file_ "$test_failure" + if test -n "$invert_exit_code" then _invert_exit_code_failure_end_blurb @@ -1464,6 +1539,7 @@ then if test "$GIT_TEST_PASSING_SANITIZE_LEAK" = "check" then + sanitize_leak_check=t if test -n "$invert_exit_code" then BAIL_OUT "cannot use --invert-exit-code under GIT_TEST_PASSING_SANITIZE_LEAK=check" @@ -1489,6 +1565,10 @@ then fi && TEST_RESULTS_SAN_FILE="$TEST_RESULTS_SAN_DIR/$TEST_RESULTS_SAN_FILE_PFX" + # In case "test-results" is left over from a previous + # run: Only report if new leaks show up. + TEST_RESULTS_SAN_DIR_NR_LEAKS_STARTUP=$(nr_san_dir_leaks_) + # Don't litter *.leak dirs if there was nothing to report test_atexit "rmdir \"$TEST_RESULTS_SAN_DIR\" 2>/dev/null || :" |