aboutsummaryrefslogtreecommitdiffstats
path: root/diff.c
AgeCommit message (Collapse)AuthorFilesLines
2024-03-15diff: add diff.srcPrefix and diff.dstPrefix configuration variablesPeter Hutterer1-2/+12
Allow the default prefixes "a/" and "b/" to be tweaked by the diff.srcPrefix and diff.dstPrefix configuration variables. Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15Merge branch 'jx/dirstat-parseopt-help'Junio C Hamano1-3/+3
The mark-up of diff options has been updated to help translators. * jx/dirstat-parseopt-help: diff: mark param1 and param2 as placeholders
2024-02-14diff: mark param1 and param2 as placeholdersJiang Xin1-3/+3
Some l10n translators translated the parameters "files", "param1" and "param2" in the following message: "synonym for --dirstat=files,param1,param2..." Translating "param1" and "param2" is OK, but changing the parameter "files" is wrong. The parameters that are not meant to be used verbatim should be marked as placeholders, but the verbatim parameter not marked as a placeholder should be left as is. This change is a complement for commit 51e846e673 (doc: enforce placeholders in documentation, 2023-12-25). With the help of Jean-Noël,some parameter combinations in one placeholder (e.g. "<param1,param2>...") are splited into seperate placeholders. Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06Merge branch 'jk/diff-external-with-no-index'Junio C Hamano1-1/+2
"git diff --no-index file1 file2" segfaulted while invoking the external diff driver, which has been corrected. * jk/diff-external-with-no-index: diff: handle NULL meta-info when spawning external diff
2024-01-29diff: handle NULL meta-info when spawning external diffJeff King1-1/+2
Running this: $ touch foo bar $ chmod +x foo $ git -c diff.external=echo diff --ext-diff --no-index foo bar results in a segfault. The issue is that run_diff_cmd() passes a NULL "xfrm_msg" variable to run_external_diff(), which feeds it to strvec_push(), causing the segfault. The bug dates back to 82fbf269b9 (run_external_diff: use an argv_array for the command line, 2014-04-19), though it mostly only ever worked accidentally. Before then, we just stuck the NULL pointer into a "const char **" array, so our NULL ended up acting as an extra end-of-argv sentinel (which was OK, because it was the last thing in the array). Curiously, though, this is only a problem with --no-index. We set up xfrm_msg by calling fill_metainfo(). This result may be empty, or may have text like "index 1234..5678\n", "rename from foo\nrename from bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the "other" variable is not NULL. That variable is set when the paths of the two sides of the diff pair aren't the same (in which case the destination path becomes "other"). So normally it would kick in only for a rename, in which case xfrm_msg should not be NULL (it would have the rename information in it). But with a "--no-index" of two blobs, we of course have two different pathnames, and thus end up with a non-NULL "other" filename (which is always just a repeat of the file2-name), but possibly a NULL xfrm_msg. So how to fix it? I have a feeling that --no-index always passing "other" to the external diff command is probably a bug. There was no rename, and the name is always redundant with existing information we pass (and this may even cause us to pass a useless "xfrm_msg" that contains an "index 1234..5678" line). So one option would be to change that behavior. We don't seem to have ever documented the "other" or "xfrm_msg" parameters for external diffs. But I'm not sure what fallout we might have from changing that behavior now. So this patch takes the less-risky option, and simply teaches run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes it agnostic to whether "other" and "xfrm_msg" always come as a pair. It fixes the segfault now, and if we want to change the --no-index "other" behavior on top, it will handle that, too. Reported-by: Wilfred Hughes <me@wilfred.me.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano1-2/+0
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren1-2/+0
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09diff: give more detailed messages for bogus diff.* configJeff King1-2/+6
The config callbacks for a few diff.* variables simply return -1 when we encounter an error. The message you get mentions the offending location, like: fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 but is vague about "bad" (as it must be, since the message comes from the generic config code). Most callbacks add their own messages here, so let's do the same. E.g.: error: unknown value for config 'diff.algorithm': foo fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 I've written the string in a way that should be reusable for translators, and matches another similar message in transport.c (there doesn't yet seem to be a popular generic message to reuse here, so hopefully this will get the ball rolling). Note that in the case of diff.algorithm, our parse_algorithm_value() helper does detect a NULL value string. But it's still worth detecting it ourselves here, since we can give a more specific error message (and which is the usual one for unexpected implicit-bool values). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: handle NULL value when parsing non-boolsJeff King1-3/+16
When the config parser sees an "implicit" bool like: [core] someVariable it passes NULL to the config callback. Any callback code which expects a string must check for NULL. This usually happens via helpers like git_config_string(), etc, but some custom code forgets to do so and will segfault. These are all fairly vanilla cases where the solution is just the usual pattern of: if (!value) return config_error_nonbool(var); though note that in a few cases we have to split initializers like: int some_var = initializer(); into: int some_var; if (!value) return config_error_nonbool(var); some_var = initializer(); There are still some broken instances after this patch, which I'll address on their own in individual patches after this one. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02Merge branch 'jk/diff-result-code-cleanup' into maint-2.42Junio C Hamano1-4/+2
"git diff --no-such-option" and other corner cases around the exit status of the "diff" command has been corrected. * jk/diff-result-code-cleanup: diff: drop useless "status" parameter from diff_result_code() diff: drop useless return values in git-diff helpers diff: drop useless return from run_diff_{files,index} functions diff: die when failing to read index in git-diff builtin diff: show usage for unknown builtin_diff_files() options diff-files: avoid negative exit value diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-11-02Merge branch 'jc/diff-exit-code-with-w-fixes' into maint-2.42Junio C Hamano1-15/+25
"git diff -w --exit-code" with various options did not work correctly, which is being addressed. * jc/diff-exit-code-with-w-fixes: diff: the -w option breaks --exit-code for --raw and other output modes t4040: remove test that succeeded for a wrong reason diff: teach "--stat -w --exit-code" to notice differences diff: mode-only change should be noticed by "--patch -w --exit-code" diff: move the fallback "--exit-code" code down
2023-09-29diff --stat: set the width defaults in a helper functionDragan Simic1-0/+7
Extract the commonly used initialization of the --stat-width=<width>, --stat-name-width=<width> and --stat-graph-with=<width> parameters to their internal default values into a helper function, to avoid repeating the same initialization code in a few places. Add a couple of tests to additionally cover existing configuration options diff.statNameWidth=<width> and diff.statGraphWidth=<width> when used by git-merge to generate --stat outputs. This closes the gap that existed previously in the --stat tests, and reduces the chances for having any regressions introduced by this commit. While there, perform a small bunch of minor wording tweaks in the improved unit test, to improve its test-level consistency a bit. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18diff --stat: add config option to limit filename widthDragan Simic1-2/+9
Add new configuration option diff.statNameWidth=<width> that is equivalent to the command-line option --stat-name-width=<width>, but it is ignored by format-patch. This follows the logic established by the already existing configuration option diff.statGraphWidth=<width>. Limiting the widths of names and graphs in the --stat output makes sense for interactive work on wide terminals with many columns, hence the support for these configuration options. They don't affect format-patch because it already adheres to the traditional 80-column standard. Update the documentation and add more tests to cover new configuration option diff.statNameWidth=<width>. While there, perform a few minor code and whitespace cleanups here and there, as spotted. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-01Merge branch 'jk/diff-result-code-cleanup'Junio C Hamano1-4/+2
"git diff --no-such-option" and other corner cases around the exit status of the "diff" command has been corrected. * jk/diff-result-code-cleanup: diff: drop useless "status" parameter from diff_result_code() diff: drop useless return values in git-diff helpers diff: drop useless return from run_diff_{files,index} functions diff: die when failing to read index in git-diff builtin diff: show usage for unknown builtin_diff_files() options diff-files: avoid negative exit value diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-08-30Merge branch 'jc/diff-exit-code-with-w-fixes'Junio C Hamano1-15/+25
"git diff -w --exit-code" with various options did not work correctly, which is being addressed. * jc/diff-exit-code-with-w-fixes: diff: the -w option breaks --exit-code for --raw and other output modes t4040: remove test that succeeded for a wrong reason diff: teach "--stat -w --exit-code" to notice differences diff: mode-only change should be noticed by "--patch -w --exit-code" diff: move the fallback "--exit-code" code down
2023-08-21diff: the -w option breaks --exit-code for --raw and other output modesJunio C Hamano1-0/+6
The output from "--raw", "--name-status", and "--name-only" modes in "git diff" does depend on and does not reflect how certain different contents are considered equal, unlike "--patch" and "--stat" output modes do, when used with options like "-w" (another way of thinking about it is that it is not like we recompute the hash of the blob after removing all whitespaces to show "git diff --raw -w" output). But the fact that "--raw" and friends ignore "-w" is not a good excuse for "diff --raw -w --exit-code" to also ignore the request to report the differences with its exit status. When run without "-w", "git diff --exit-code --raw" does report with its exit status the differences as requested, and we should do the same when run with "-w", too. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21diff: drop useless "status" parameter from diff_result_code()Jeff King1-4/+2
Many programs use diff_result_code() to get a user-visible program exit code from a diff result (e.g., checking opts.found_changes if --exit-code was requested). This function also takes a "status" parameter, which seems at first glance that it could be used to propagate an error encountered when computing the diff. But it doesn't work that way: - negative values are passed through as-is, but are not appropriate as program exit codes - when --exit-code or --check is in effect, we _ignore_ the passed-in status completely. So a failed diff which did not have a chance to set opts.found_changes would erroneously report "success, no changes" instead of propagating the error. After recent cleanups, neither of these bugs is possible to trigger, as every caller just passes in "0". So rather than fixing them, we can simply drop the useless parameter instead. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: teach "--stat -w --exit-code" to notice differencesJunio C Hamano1-0/+1
When options like "-w" is used while "--exit-code" option is in effect, instead of the usual "do we have any filepair whose preimage and postimage have different <mode,object>?" check, we need to compare the contents of the blobs, taking into account that certain changes are considered no-op. With the previous step, we taught "--patch" codepath to set the .found_changes bit correctly, even for a change that only affects the mode and not object. The "--stat" codepath, however, did not set the .found_changes bit at all. This lead to $ git diff --stat -w --exit-code for a change that does have an output to exit with status 0. Set the bit by inspecting the list of paths the diffstat output is given for (a mode-only change will still appear as a "0-line added 0-line deleted" change) to fix it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: mode-only change should be noticed by "--patch -w --exit-code"Junio C Hamano1-0/+3
The codepath to notice the content-level changes, taking certain no-op changes like "ignore whitespace" into account, forgot that a mode-only change is still a change. This resulted in $ git diff --patch --exit-code -w to exit with status 0 even when there is such a mode-only change, breaking both "--patch" and "--quiet" output formats. Teach the builtin_diff() codepath that creation and deletion as well as mode changes are all interesting changes. Note that the test specifically checks removal of an empty file, because if there is anything in the preimage (i.e. the removed file is not empty), the removal would still trigger textual patch output and the codepath for that does update .found_changes bit to report that it found an interesting change. We need to make sure that the .found_changes bit is set even without triggering textual patch output. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: move the fallback "--exit-code" code downJunio C Hamano1-15/+15
When "--exit-code" is asked and the code cannot just answer by comparing the object names on both sides but needs to inspect and compare the contents, there are two ways that the result is found out. Some output modes, like "--stat" and "--patch", inherently have to inspect the contents in order to show the differences in the way they do. The codepaths for these modes set the .found_changes bit as they compute what to show. However, other output modes do not need to inspect the contents to show the differences in the way they do. The most notable example is "--quiet", which does not need to compute any output to show. When they are asked to report "--exit-code", they run the codepaths for the "--patch" output with their output redirected to "/dev/null", only to set the .found_changes bit. Currently, this fallback invocation of "--patch" output is done after the "--stat" output format and its friends and before the "--patch" and internal callback logic. Move it to the end of the sequence to clarify the fallback status of this code block. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano1-2/+0
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-8/+11
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-06Merge branch 'pb/complete-diff-options'Junio C Hamano1-0/+4
Completion updates. * pb/complete-diff-options: (24 commits) diff.c: mention completion above add_diff_options completion: complete --remerge-diff completion: complete --diff-merges, its options and --no-diff-merges completion: move --pickaxe-{all,regex} to __git_diff_common_options completion: complete --ws-error-highlight completion: complete --unified completion: complete --output-indicator-{context,new,old} completion: complete --output completion: complete --no-stat completion: complete --no-relative completion: complete --line-prefix completion: complete --ita-invisible-in-index and --ita-visible-in-index completion: complete --irreversible-delete completion: complete --ignore-matching-lines completion: complete --function-context completion: complete --find-renames completion: complete --find-object completion: complete --find-copies completion: complete --default-prefix completion: complete --compact-summary ...
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan1-1/+0
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: pass kvi to die_bad_number()Glen Choo1-4/+5
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-4/+6
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26diff.c: mention completion above add_diff_optionsPhilippe Blain1-0/+4
Add a comment on top of add_diff_options, where common diff options are listed, mentioning __git_diff_common_options in the completion script, in the hope that contributors update it when they add new diff flags. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21merge-ll: rename from ll-mergeElijah Newren1-1/+1
A long term (but rather minor) pet-peeve of mine was the name ll-merge.[ch]. I thought it made it harder to realize what stuff was related to merging when I was working on the merge machinery and trying to improve it. Further, back in d1cbe1e6d8a ("hash-ll.h: split out of hash.h to remove dependency on repository.h", 2023-04-22), we have split the portions of hash.h that do not depend upon repository.h into a "hash-ll.h" (due to the recommendation to use "ll" for "low-level" in its name[1], but which I used as a suffix precisely because of my distaste for "ll-merge"). When we discussed adding additional "*-ll.h" files, a request was made that we use "ll" consistently as either a prefix or a suffix. Since it is already in use as both a prefix and a suffix, the only way to do so is to rename some files. Besides my distaste for the ll-merge.[ch] name, let me also note that the files ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h would have essentially nothing to do with each other and make no sense to group. But giving them the common "ll-" prefix would group them. Using "-ll" as a suffix thus seems just much more logical to me. Rename ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to ensure we get a more logical grouping of files. [1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren1-1/+1
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21read-cache*.h: move declarations for read-cache.c functions from cache.hElijah Newren1-0/+1
For the functions defined in read-cache.c, move their declarations from cache.h to a new header, read-cache-ll.h. Also move some related inline functions from cache.h to read-cache.h. The purpose of the read-cache-ll.h/read-cache.h split is that about 70% of the sites don't need the inline functions and the extra headers they include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-20Merge branch 'jk/log-follow-with-non-literal-pathspec'Junio C Hamano1-2/+27
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf. * jk/log-follow-with-non-literal-pathspec: diff: detect pathspec magic not supported by --follow diff: factor out --follow pathspec check pathspec: factor out magic-to-name function
2023-06-13Merge branch 'jc/diff-s-with-other-options'Junio C Hamano1-11/+13
The "-s" (silent, squelch) option of the "diff" family of commands did not interact with other options that specify the output format well. This has been cleaned up so that it will clear all the formatting options given before. * jc/diff-s-with-other-options: diff: fix interaction between the "-s" option and other options
2023-06-03diff: detect pathspec magic not supported by --followJeff King1-0/+15
The --follow code doesn't handle most forms of pathspec magic. We check that no unexpected ones have made it to try_to_follow_renames() with a runtime GUARD_PATHSPEC() check, which gives behavior like this: $ git log --follow ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted The same is true of ":(glob)", ":(attr)", and so on. It's good that we notice the problem rather than continuing and producing a wrong answer. But there are two non-ideal things: 1. The idea of GUARD_PATHSPEC() is to catch programming errors where low-level code gets unexpected pathspecs. We'd usually try to catch unsupported pathspecs by passing a magic_mask to parse_pathspec(), which would give the user a much better message like: pathspec magic not supported by this command: 'icase' That doesn't happen here because git-log usually _does_ support all types of pathspec magic, and so it passes "0" for the mask (this call actually happens in setup_revisions()). It needs to distinguish the normal case from the "--follow" one but currently doesn't. 2. In addition to --follow, we have the log.follow config option. When that is set, we try to turn on --follow mode only when there is a single pathspec (since --follow doesn't handle anything else). But really, that ought to be expanded to "use --follow when the pathspec supports it". Otherwise, we'd complain any time you use an exotic pathspec: $ git config log.follow true $ git log ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted We should instead just avoid enabling follow mode if it's not supported by this particular invocation. This patch expands our diff_check_follow_pathspec() function to cover pathspec magic, solving both problems. A few final notes: - we could also solve (1) by passing the appropriate mask to parse_pathspec(). But that's not great for two reasons. One is that the error message is less precise. It says "magic not supported by this command", but really it is not the command, but rather the --follow option which is the problem. The second is that it always calls die(). But for our log.follow code, we want to speculatively ask "is this pathspec OK?" and just get a boolean result. - This is obviously the right thing to do for ':(icase)' and most other magic options. But ':(glob)' is a bit odd here. The --follow code doesn't support wildcards, but we allow them anyway. From try_to_follow_renames(): #if 0 /* * We should reject wildcards as well. Unfortunately we * haven't got a reliable way to detect that 'foo\*bar' in * fact has no wildcards. nowildcard_len is merely a hint for * optimization. Let it slip for now until wildmatch is taught * about dry-run mode and returns wildcard info. */ if (opt->pathspec.has_wildcard) BUG("wildcards are not supported"); #endif So something like "git log --follow 'Make*'" is already doing the wrong thing, since ":(glob)" behavior is already the default (it is used only to countermand an earlier --noglob-pathspecs). So we _could_ loosen the guard to allow :(glob), since it just behaves the same as pathspecs do by default. But it seems like a backwards step to do so. It already doesn't work (it hits the BUG() case currently), and given that the user took an explicit step to say "this pathspec should glob", it is reasonable for us to say "no, --follow does not support globbing" (or in the case of log.follow, avoid turning on follow mode). Which is what happens after this patch. - The set of allowed pathspec magic is obviously the same as in GUARD_PATHSPEC(). We could perhaps factor these out to avoid repetition. The point of having separate masks and GUARD calls is that we don't necessarily know which parsed pathspecs will be used where. But in this case, the two are heavily correlated. Still, there may be some value in keeping them separate; it would make anyone think twice about adding new magic to the list in diff_check_follow_pathspec(). They'd need to touch try_to_follow_renames() as well, which is the code that would actually need to be updated to handle more exotic pathspecs. - The documentation for log.follow says that it enables --follow "...when a single <path> is given". We could possibly expand that to say "with no unsupported pathspec magic", but that raises the question of documenting which magic is supported. I think the existing wording of "single <path>" sufficiently encompasses the idea (the forbidden magic is stuff that might match multiple entries), and the spirit remains the same. Reported-by: Jim Pryor <dubiousjim@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03diff: factor out --follow pathspec checkJeff King1-2/+12
In --follow mode, we require exactly one pathspec. We check this condition in two places: - in diff_setup_done(), we complain if --follow is used with an inapropriate pathspec - in git-log's revision "tweak" function, we enable log.follow only if the pathspec allows it The duplication isn't a big deal right now, since the logic is so simple. But in preparation for it becoming more complex, let's pull it into a shared function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-15Merge branch 'jc/dirstat-plug-leaks'Junio C Hamano1-14/+20
"git diff --dirstat" leaked memory, which has been plugged. * jc/dirstat-plug-leaks: diff: plug leaks in dirstat diff: refactor common tail part of dirstat computation
2023-05-09Merge branch 'en/header-split-cache-h-part-2'Junio C Hamano1-0/+2
More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...
2023-05-05diff: fix interaction between the "-s" option and other optionsJunio C Hamano1-11/+13
Sergey Organov noticed and reported "--patch --no-patch --raw" behaves differently from just "--raw". It turns out that there are a few interesting bugs in the implementation and documentation. * First, the documentation for "--no-patch" was unclear that it could be read to mean "--no-patch" countermands an earlier "--patch" but not other things. The intention of "--no-patch" ever since it was introduced at d09cd15d (diff: allow --no-patch as synonym for -s, 2013-07-16) was to serve as a synonym for "-s", so "--raw --patch --no-patch" should have produced no output, but it can be (mis)read to allow showing only "--raw" output. * Then the interaction between "-s" and other format options were poorly implemented. Modern versions of Git uses one bit each to represent formatting options like "--patch", "--stat" in a single output_format word, but for historical reasons, "-s" also is represented as another bit in the same word. This allows two interesting bugs to happen, and we have both X-<. (1) After setting a format bit, then setting NO_OUTPUT with "-s", the code to process another "--<format>" option drops the NO_OUTPUT bit to allow output to be shown again. However, the code to handle "-s" only set NO_OUTPUT without unsetting format bits set earlier, so the earlier format bit got revealed upon seeing the second "--<format>" option. This is the problem Sergey observed. (2) After setting NO_OUTPUT with "-s", code to process "--<format>" option can forget to unset NO_OUTPUT, leaving the command still silent. It is tempting to change the meaning of "--no-patch" to mean "disable only the patch format output" and reimplement "-s" as "not showing anything", but it would be an end-user visible change in behavior. Let's fix the interactions of these bits to first make "-s" work as intended. The fix is conceptually very simple. * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo" option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is given), we make sure we drop DIFF_FORMAT_NO_OUTPUT. We forgot to do so in some of the options and caused (2) above. * When processing "-s" option, we should not just set DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits. We didn't do so and retained format bits set by options previously seen, causing (1) above. It is even more tempting to lose NO_OUTPUT bit and instead take output_format word being 0 as its replacement, but that would break the mechanism "git show" uses to default to "--patch" output, where the distinction between telling the command to be silent with "-s" and having no output format specified on the command line matters, and an explicit output format given on the command line should not be "combined" with the default "--patch" format. So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we may want to replace it with OPTION_GIVEN bit, and * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and DIFF_FORMAT_OPTION_GIVEN bit on for each format. "--no-raw", etc. will set off DIFF_FORMAT_$format bit but still record the fact that we saw an option from the command line by setting DIFF_FORMAT_OPTION_GIVEN bit. * make "-s" (and its synonym "--no-patch") clear all other bits and set only the DIFF_FORMAT_OPTION_GIVEN bit on. which I suspect would make the code much cleaner without breaking any end-user expectations. Once that is in place, transitioning "--no-patch" to mean the counterpart of "--patch", just like "--no-raw" only defeats an earlier "--raw", would be quite simple at the code level. The social cost of migrating the end-user expectations might be too great for it to be worth, but at least the "GIVEN" bit clean-up alone may be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05diff: plug leaks in dirstatJunio C Hamano1-6/+11
The array of dirstat_file contained in the dirstat_dir structure is not freed after the processing ends. Unfortunately, the member that points at the array, .files, is incremented as the gather_dirstat() function recursively walks it, and this needs to be plugged by remembering the beginning of the array before gather_dirstat() mucks with it and freeing it after we are done. We can mark t4047 as leak-free. t4000, which is marked as leak-free, now can exercise dirstat in it, which will happen next. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05diff: refactor common tail part of dirstat computationJunio C Hamano1-14/+15
This will become useful when we plug leaks in these two functions. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02Merge branch 'tb/ban-strtok'Junio C Hamano1-1/+1
Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`
2023-04-25Merge branch 'en/header-split-cache-h'Junio C Hamano1-0/+5
Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...
2023-04-24string-list: multi-delimiter `string_list_split_in_place()`Taylor Blau1-1/+1
Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24ws.h: move declarations for ws.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24base85.h: move declarations for base85.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11pager.h: move declarations for pager.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-file.h: move declarations for object-file.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-name.h: move declarations for object-name.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on oid-array.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on convert.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06Merge branch 'en/header-split-cleanup'Junio C Hamano1-0/+5
Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06Merge branch 'ab/remove-implicit-use-of-the-repository'Junio C Hamano1-3/+3
Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04Merge branch 'ab/remove-implicit-use-of-the-repository' into ↵Junio C Hamano1-3/+3
en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository *" argument, not "the_repository" post-cocci: adjust comments for recent repo_* migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"
2023-03-28cocci: apply the "promisor-remote.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-1/+1
Apply the part of "the_repository.pending.cocci" pertaining to "promisor-remote.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28cocci: apply the "cache.h" part of "the_repository.pending"Ævar Arnfjörð Bjarmason1-2/+2
Apply the part of "the_repository.pending.cocci" pertaining to "cache.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21Merge branch 'jk/format-patch-ignore-noprefix'Junio C Hamano1-5/+28
"git format-patch" honors the src/dst prefixes set to nonstandard values with configuration variables like "diff.noprefix", causing receiving end of the patch that expects the standard -p1 format to break. Teach "format-patch" to ignore end-user configuration and always use the standard prefixes. This is a backward compatibility breaking change. * jk/format-patch-ignore-noprefix: rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch format-patch: add format.noprefix option format-patch: do not respect diff.noprefix diff: add --default-prefix option t4013: add tests for diff prefix options diff: factor out src/dst prefix setup
2023-03-21setup.h: move declarations for setup.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21environment.h: move declarations for environment.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21wrapper.h: move declarations for wrapper.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21abspath.h: move absolute path functions from cache.hElijah Newren1-0/+1
This is another step towards letting us remove the include of cache.h in strbuf.c. It does mean that we also need to add includes of abspath.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17Merge branch 'en/header-cleanup'Junio C Hamano1-0/+2
Code clean-up to clarify the rule that "git-compat-util.h" must be the first to be included. * en/header-cleanup: diff.h: remove unnecessary include of object.h Remove unnecessary includes of builtin.h treewide: replace cache.h with more direct headers, where possible replace-object.h: move read_replace_refs declaration from cache.h to here object-store.h: move struct object_info from cache.h dir.h: refactor to no longer need to include cache.h object.h: stop depending on cache.h; make cache.h depend on object.h ident.h: move ident-related declarations out of cache.h pretty.h: move has_non_ascii() declaration from commit.h cache.h: remove dependence on hex.h; make other files include it explicitly hex.h: move some hex-related declarations from cache.h hash.h: move some oid-related declarations from cache.h alloc.h: move ALLOC_GROW() functions from cache.h treewide: remove unnecessary cache.h includes in source files treewide: remove unnecessary cache.h includes treewide: remove unnecessary git-compat-util.h includes in headers treewide: ensure one of the appropriate headers is sourced first
2023-03-09diff: add --default-prefix optionJeff King1-0/+14
You can change the output of prefixes with diff.noprefix and diff.mnemonicprefix, but there's no easy way to override them from the command-line. We do have "--no-prefix", but there's no way to get back to the default prefix. So let's add an option to do that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09diff: factor out src/dst prefix setupJeff King1-5/+14
We directly manipulate diffopt's a_prefix and b_prefix to set up either the default "a/foo" prefix or the "--no-prefix" variant. Although this is only a few lines, it's worth pulling these into their own functions. That lets us avoid one repetition already in this patch, but will also give us a cleaner interface for callers which want to tweak this setting. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27Merge branch 'jc/diff-algo-attribute'Junio C Hamano1-23/+67
The "diff" drivers specified by the "diff" attribute attached to paths can now specify which algorithm (e.g. histogram) to use. * jc/diff-algo-attribute: diff: teach diff to read algorithm from diff driver diff: consolidate diff algorithm option parsing
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren1-0/+1
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21diff: teach diff to read algorithm from diff driverJohn Cai1-9/+24
It can be useful to specify diff algorithms per file type. For example, one may want to use the minimal diff algorithm for .json files, another for .c files, etc. The diff machinery already checks attributes for a diff driver. Teach the diff driver parser a new type "algorithm" to look for in the config, which will be used if a driver has been specified through the attributes. Enforce precedence of the diff algorithm by favoring the command line option, then looking at the driver attributes & config combination, then finally the diff.algorithm config. To enforce precedence order, use a new `ignore_driver_algorithm` member during options parsing to indicate the diff algorithm was set via command line args. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-21diff: consolidate diff algorithm option parsingJohn Cai1-14/+43
A subsequent commit will need the ability to tell if the diff algorithm was set through the command line through setting a new member of diff_options. While this logic can be added to the diff_opt_diff_algorithm() callback, the `--minimal` and `--histogram` options are handled via OPT_BIT without a callback. Remedy this by consolidating the options parsing logic for --minimal and --histogram into one callback. This way we can modify `diff_options` in that function. As an additional refactor, the logic that sets the diff algorithm in diff_opt_diff_algorithm() can be refactored into a helper that will allow multiple callsites to set the diff algorithm. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-16Merge branch 'jk/ext-diff-with-relative'Junio C Hamano1-17/+13
"git diff --relative" did not mix well with "git diff --ext-diff", which has been corrected. * jk/ext-diff-with-relative: diff: drop "name" parameter from prepare_temp_file() diff: clean up external-diff argv setup diff: use filespec path to set up tempfiles for ext-diff
2023-01-06diff: drop "name" parameter from prepare_temp_file()Jeff King1-11/+10
The prepare_temp_file() function takes a diff_filespec as well as a filename. But it is almost certainly an error to pass in a name that isn't the filespec's "path" parameter, since that is the only thing that reliably tells us how to find the content (and indeed, this was the source of a recently-fixed bug). So let's drop the redundant "name" parameter and just use one->path throughout the function. This simplifies the interface a little bit, and makes it impossible for calling code to get it wrong. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06diff: clean up external-diff argv setupJeff King1-6/+3
Since the previous commit, setting up the tempfile for an external diff uses df->path from the diff_filespec, rather than the logical name. This means add_external_diff_name() does not need to take a "name" parameter at all, and we can drop it. And that in turn lets us simplify the conditional for handling renames (when the "other" name is non-NULL). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06diff: use filespec path to set up tempfiles for ext-diffJeff King1-1/+1
When we're going to run an external diff, we have to make the contents of the pre- and post-images available either by dumping them to a tempfile, or by pointing at a valid file in the worktree. The logic of this is all handled by prepare_temp_file(), and we just pass in the filename and the diff_filespec. But there's a gotcha here. The "filename" we have is a logical filename and not necessarily a path on disk or in the repository. This matters in at least one case: when using "--relative", we may have a name like "foo", even though the file content is found at "subdir/foo". As a result, we look for the wrong path, fail to find "foo", and claim that the file has been deleted (passing "/dev/null" to the external diff, rather than the correct worktree path). We can fix this by passing the pathname from the diff_filespec, which should always be a full repository path (and that's what we want even if reusing a worktree file, since we're always operating from the top-level of the working tree). The breakage seems to go all the way back to cd676a5136 (diff --relative: output paths as relative to the current subdirectory, 2008-02-12). As far as I can tell, before then "name" would always have been the same as the filespec's "path". There are two related cases I looked at that aren't buggy: 1. the only other caller of prepare_temp_file() is run_textconv(). But it always passes the filespec's path field, so it's OK. 2. I wondered if file renames/copies might cause similar confusion. But they don't, because run_external_diff() receives two names in that case: "name" and "other", which correspond to the two sides of the diff. And we did correctly pass "other" when handling the post-image side. Barring the use of "--relative", that would always match "two->path", the path of the second filespec (and the rename destination). So the only bug is just the interaction with external diff drivers and --relative. Reported-by: Carl Baldwin <carl@ecbaldwin.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26Merge branch 'pg/diff-stat-unmerged-regression-fix'Junio C Hamano1-1/+1
The output from "git diff --stat" on an unmerged path lost the terminating LF in Git 2.39, which has been corrected. * pg/diff-stat-unmerged-regression-fix: diff: fix regression with --stat and unmerged file
2022-12-26Merge branch 'jk/unused-post-2.39'Junio C Hamano1-9/+9
Code clean-up around unused function parameters. * jk/unused-post-2.39: userdiff: mark unused parameter in internal callback list-objects-filter: mark unused parameters in virtual functions diff: mark unused parameters in callbacks xdiff: mark unused parameter in xdl_call_hunk_func() xdiff: drop unused parameter in def_ff() ws: drop unused parameter from ws_blank_line() list-objects: drop process_gitlink() function blob: drop unused parts of parse_blob_buffer() ls-refs: use repository parameter to iterate refs
2022-12-19Merge branch 'rs/diff-parseopts'Junio C Hamano1-11/+8
The way the diff machinery prepares the options array for the parse_options API has been refactored to avoid resource leaks. * rs/diff-parseopts: diff: remove parseopts member from struct diff_options diff: use add_diff_options() in diff_opt_parse() diff: factor out add_diff_options()
2022-12-15diff: fix regression with --stat and unmerged filePeter Grayson1-1/+1
A regression was introduced in 12fc4ad89e (diff.c: use utf8_strwidth() to count display width, 2022-09-14) that causes missing newlines after "Unmerged" entries in `git diff --cached --stat` output. This problem affects v2.39.0-rc0 through v2.39.0. Add the missing newline along with a new test to cover this behavior. Signed-off-by: Peter Grayson <pete@jpgrayson.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13diff: mark unused parameters in callbacksJeff King1-3/+4
The diff code provides a format_callback interface, but not every callback needs each parameter (e.g., the "opt" and "data" parameters are frequently left unused). Likewise for the output_prefix callback, the low-level change/add_remove interfaces, the callbacks used by xdi_diff(), etc. Mark unused arguments in the callback implementations to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13ws: drop unused parameter from ws_blank_line()Jeff King1-6/+5
We take a ws_rule parameter, but have never looked at it since the function was added in 877f23ccb8 (Teach "diff --check" about new blank lines at end, 2008-06-26). A comment in the function does mention how we _could_ use it, but nobody has felt the need to do so for over a decade. We could keep it around as reminder of what could be done, but the comment serves that purpose. And in the meantime, it triggers -Wunused-parameter. So let's drop it, which in turn allows us to drop similar arguments further up the callstack. I've left the comment intact. It does still say "ws_rule", but that name is used consistently in the whitespace code, so the meaning is clear. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02diff: remove parseopts member from struct diff_optionsRené Scharfe1-14/+1
repo_diff_setup() builds the struct option array with git diff's command line options and stores a pointer to it in the parseopts member of struct diff_options. The array is freed by diff_setup_done(), but not by release_revisions(). Thus calling only repo_diff_setup() and release_revisions() leaks that array. We could free it in release_revisions() as well to plug that leak, but there is a better way: Only build it when needed. Absorb prep_parse_options() into the last place that uses the parseopts member of struct diff_options, add_diff_parseopts(), and get rid of said member. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02diff: use add_diff_options() in diff_opt_parse()René Scharfe1-1/+5
Prepare the removal of the parseopts member of struct diff_options by using the API function add_diff_options() instead of accessing it directly to get the command line option definitions. Building the copy by concatenating with an empty option array is slightly awkward, but simpler than a non-concat version of add_diff_options() would be to use in places that need concatenation. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02diff: factor out add_diff_options()René Scharfe1-0/+6
Add a function for appending the parseopts member of struct diff_options to a struct option array. Use it in two sites instead of accessing the parseopts member directly. Decoupling callers from diff internals like that allows us to change the latter. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-28Merge branch 'sg/plug-line-log-leaks'Junio C Hamano1-8/+9
A handful of leaks in the line-log machinery have been plugged. * sg/plug-line-log-leaks: diff.c: use diff_free_queue() line-log: free the diff queues' arrays when processing merge commits line-log: free diff queue when processing non-merge commits
2022-11-08Merge branch 'rs/no-more-run-command-v'Taylor Blau1-14/+13
Simplify the run-command API. * rs/no-more-run-command-v: replace and remove run_command_v_opt() replace and remove run_command_v_opt_cd_env_tr2() replace and remove run_command_v_opt_tr2() replace and remove run_command_v_opt_cd_env() use child_process members "args" and "env" directly use child_process member "args" instead of string array variable sequencer: simplify building argument list in do_exec() bisect--helper: factor out do_bisect_run() bisect: simplify building "checkout" argument list am: simplify building "show" argument list run-command: fix return value comment merge: remove always-the-same "verbose" arguments
2022-11-02diff.c: use diff_free_queue()SZEDER Gábor1-8/+2
Use diff_free_queue() instead of open-coding it. This shortens the code and make it less repetitive. Note that the second hunk in diff_flush() is interesting, because the 'free_queue' label separates the loop freeing the queue's filepairs from free()-ing the queue's internal array. This is somewhat suspicious, but it was not an issue before: there is only one place from where we jump to this label with a goto, and that is protected by an 'if (!q->nr && ...)' condition, i.e. we only skipped the loop freeing the filepairs when there were no filepairs in the queue to begin with. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02line-log: free diff queue when processing non-merge commitsSZEDER Gábor1-0/+7
When processing a non-merge commit, the line-level log first asks the tree-diff machinery whether any of the files in the given line ranges were modified between the current commit and its parent, and if some of them were, then it loads the contents of those files from both commits to see whether their line ranges were modified and/or need to be adjusted. Alas, it doesn't free() the diff queue holding the results of that query and the contents of those files once its done. This can add up to a substantial amount of leaked memory, especially when the file in question is big and is frequently modified: a user reported "Out of memory, malloc failed" errors with a 2MB text file that was modified ~2800 times [1] (I estimate the leak would use up almost 11GB memory in that case). Free that diff queue to plug this memory leak. However, instead of simply open-coding the necessary three lines, add them as a helper function to the diff API, because it will be useful elsewhere as well. [1] https://public-inbox.org/git/CAFOPqVXz2XwzX8vGU7wLuqb2ZuwTuOFAzBLRM_QPk+NJa=eC-g@mail.gmail.com/ Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30Merge branch 'jz/patch-id'Taylor Blau1-37/+38
A new "--include-whitespace" option is added to "git patch-id", and existing bugs in the internal patch-id logic that did not match what "git patch-id" produces have been corrected. * jz/patch-id: builtin: patch-id: remove unused diff-tree prefix builtin: patch-id: add --verbatim as a command mode patch-id: fix patch-id for mode changes builtin: patch-id: fix patch-id with binary diffs patch-id: use stable patch-id for rebases patch-id: fix stable patch id for binary / header-only
2022-10-30use child_process members "args" and "env" directlyRené Scharfe1-14/+13
Build argument list and environment of child processes by using struct child_process and populating its members "args" and "env" directly instead of maintaining separate strvecs and letting run_command_v_opt() and friends populate these members. This is simpler, shorter and slightly more efficient. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-28Merge branch 'tb/diffstat-with-utf8-strwidth'Junio C Hamano1-11/+31
"git diff --stat" etc. were invented back when everything was ASCII and strlen() was a way to measure the display width of a string; adjust them to compute the display width assuming UTF-8 pathnames. * tb/diffstat-with-utf8-strwidth: diff: leave NEEDWORK notes in show_stats() function diff.c: use utf8_strwidth() to count display width
2022-10-24patch-id: fix patch-id for mode changesJerry Zhang1-0/+5
Currently patch-id as used in rebase and cherry-pick does not account for file modes if the file is modified. One consequence of this is that if you have a local patch that changes modes, but upstream has applied an outdated version of the patch that doesn't include that mode change, "git rebase" will drop your local version of the patch along with your mode changes. It also means that internal patch-id doesn't produce the same output as the builtin, which does account for mode changes due to them being part of diff output. Fix by adding mode to the patch-id if it has changed, in the same format that would be produced by diff, so that it is compatible with builtin patch-id. Signed-off-by: Jerry Zhang <Jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24patch-id: use stable patch-id for rebasesJerry Zhang1-8/+4
Git doesn't persist patch-ids during the rebase process, so there is no need to specifically invoke the unstable variant. Use the stable logic for all internal patch-id calculations to minimize the number of code paths and improve test coverage. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-24patch-id: fix stable patch id for binary / header-onlyJerry Zhang1-29/+29
Patch-ids for binary patches are found by hashing the object ids of the before and after objects in succession. However in the --stable case, there is a bug where hunks are not flushed for binary and header-only patch ids, which would always result in a patch-id of 0000. The --unstable case is currently correct. Reorder the logic to branch into 3 cases for populating the patch body: header-only which populates nothing, binary which populates the object ids, and normal which populates the text diff. All branches will end up flushing the hunk. Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond to those lines not being present in the "git diff" text output. This is necessary because we advertise that the patch-id calculated internally and used in format-patch is the same that what the builtin "git patch-id" would produce when piped from a diff. Update the test to run on both binary and normal files. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-21diff: leave NEEDWORK notes in show_stats() functionJunio C Hamano1-0/+15
The previous step made an attempt to correctly compute display columns allocated and padded different parts of diffstat output. There are at least two known codepaths in the function that still mixes up display widths and byte length that need to be fixed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17diffstat_consume(): assert non-zero lengthJeff King1-0/+3
The callback interface for xdiff_emit_line_fn gives us a line/len pair, but diffstat_consume() never looks at "len". At first glance this seems like a bug that could cause us to read further than xdiff intends. But in practice, we read only the first character, and xdiff would never pass us an empty line. Let's add a run-time assertion that this is true, which clarifies our assumption and silences -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15Merge branch 'en/remerge-diff-fixes'Junio C Hamano1-6/+26
Fix a few "git log --remerge-diff" bugs. * en/remerge-diff-fixes: diff: fix filtering of merge commits under --remerge-diff diff: fix filtering of additional headers under --remerge-diff diff: have submodule_format logic avoid additional diff headers
2022-09-14diff.c: use utf8_strwidth() to count display widthTorsten Bögershausen1-11/+16
When unicode filenames (encoded in UTF-8) are used, the visible width on the screen is not the same as strlen(). For example, `git log --stat` may produce an output like this: [snip the header] Arger.txt | 1 + Ärger.txt | 1 + 2 files changed, 2 insertions(+) A side note: the original report was about cyrillic filenames. After some investigations it turned out that a) This is not a problem with "ambiguous characters" in unicode b) The same problem exists for all unicode code points (so we can use Latin based Umlauts for demonstrations below) The 'Ä' takes the same space on the screen as the 'A'. But needs one more byte in memory, so the the `git log --stat` output for "Arger.txt" (!) gets mis-aligned: The maximum length is derived from "Ärger.txt", 10 bytes in memory, 9 positions on the screen. That is why "Arger.txt" gets one extra ' ' for aligment, it needs 9 bytes in memory. If there was a file "Ö", it would be correctly aligned by chance, but "Öhö" would not. The solution is of course, to use utf8_strwidth() instead of strlen() when dealing with the width on screen. And then there is another problem, code like this: strbuf_addf(&out, "%-*s", len, name); (or using the underlying snprintf() function) does not align the buffer to a minimum of len measured in screen-width, but uses the memory count. One could be tempted to wish that snprintf() was UTF-8 aware. That doesn't seem to be the case anywhere (tested on Linux and Mac), probably snprintf() uses the "bytes in memory"/strlen() approach to be compatible with older versions and this will never change. The basic idea is to change code in diff.c like this strbuf_addf(&out, "%-*s", len, name); into something like this: int padding = len - utf8_strwidth(name); if (padding < 0) padding = 0; strbuf_addf(&out, " %s%*s", name, padding, ""); The real change is slighty bigger, as it, as well, integrates two calls of strbuf_addf() into one. Tests: Two things need to be tested: - The calculation of the maximum width - The calculation of padding The name "textfile" is changed into "tëxtfilë", both have a width of 8. If strlen() was used, to get the maximum width, the shorter "binfile" would have been mis-aligned: binfile | [snip] tëxtfilë | [snip] If only "binfile" would be renamed into "binfilë": binfilë | [snip] textfile | [snip] In order to verify that the width is calculated correctly everywhere, "binfile" is renamed into "binfilë", giving 1 bytes more in strlen() "tëxtfile" is renamed into "tëxtfilë", 2 byte more in strlen(). The updated t4012-diff-binary.sh checks the correct aligment: binfilë | [snip] tëxtfilë | [snip] Reported-by: Alexander Meshcheryakov <alexander.s.m@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14Merge branch 'ab/unused-annotation'Junio C Hamano1-2/+2
Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14Merge branch 'jk/unused-annotation'Junio C Hamano1-2/+3
Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro
2022-09-02diff: fix filtering of merge commits under --remerge-diffElijah Newren1-0/+1
Commit 95433eeed9 ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers. Because there could be conflicts with no content differences (e.g. a modify/delete conflict resolved in favor of taking the modified file as-is), that commit also modified the diff_queue_is_empty() and diff_flush_patch() logic to ensure these headers were included even if there was no associated content diff. However, the added logic was a bit inconsistent between these two functions. diff_queue_is_empty() overlooked the fact that the additional headers strmap could be non-NULL and empty, which would cause it to display commits that should have been filtered out. Fix the diff_queue_is_empty() logic to also account for additional_path_headers being empty. Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02diff: fix filtering of additional headers under --remerge-diffElijah Newren1-0/+2
Commit 95433eeed9 ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers. Because there could be conflicts with no content differences (e.g. a modify/delete conflict resolved in favor of taking the modified file as-is), that commit also modified the diff_queue_is_empty() and diff_flush_patch() logic to ensure these headers were included even if there was no associated content diff. However, when the pickaxe is active, we really only want the remerge conflict headers to be shown when there is an associated content diff. Adjust the logic in these two functions accordingly. This also removes the TEST_PASSES_SANITIZE_LEAK=true declaration from t4069, as there is apparently some kind of memory leak with the pickaxe code. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02diff: have submodule_format logic avoid additional diff headersElijah Newren1-6/+23
Commit 95433eeed9 ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers, created in create_filepairs_for_header_only_notifications(). These are represented by inserting additional pairs in diff_queued_diff which always have a mode of 0 and a null_oid. When these were added, one code path was noted to assume that at least one of the diff_filespecs in the pair were valid, and that codepath was corrected. The submodule_format handling is another codepath with the same issue; it would operate on these additional headers and attempt to display them as submodule changes. Prevent that by explicitly checking for "phoney" filepairs (i.e. filepairs with both modes being 0). Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-2/+2
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19hashmap: mark unused callback parametersJeff King1-1/+1
Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19config: mark unused callback parametersJeff King1-1/+2
The callback passed to git_config() must conform to a particular interface. But most callbacks don't actually look at the extra "void *data" parameter. Let's mark the unused parameters to make -Wunused-parameter happy. Note there's one unusual case here in get_remote_default() where we actually ignore the "value" parameter. That's because it's only checking whether the option is found at all, and not parsing its value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --optionsSZEDER Gábor1-1/+1
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown arguments instead of erroring out". This is a bit misleading, as this flag only applies to unknown --options, while non-option arguments are kept even without this flag. Update the description to clarify this, and rename the flag to PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at the flag name. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18Merge branch 'ab/cocci-unused'Junio C Hamano1-2/+0
Add Coccinelle rules to detect the pattern of initializing and then finalizing a structure without using it in between at all, which happens after code restructuring and the compilers fail to recognize as an unused variable. * ab/cocci-unused: cocci: generalize "unused" rule to cover more than "strbuf" cocci: add and apply a rule to find "unused" strbufs cocci: have "coccicheck{,-pending}" depend on "coccicheck-test" cocci: add a "coccicheck-test" target and test *.cocci rules Makefile & .gitignore: ignore & clean "git.res", not "*.res" Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
2022-07-06cocci: add and apply a rule to find "unused" strbufsÆvar Arnfjörð Bjarmason1-2/+0
Add a coccinelle rule to remove "struct strbuf" initialization followed by calling "strbuf_release()" function, without any uses of the strbuf in the same function. See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's intended to find and replace. The inclusion of "contrib/scalar/scalar.c" is because "spatch" was manually run on it (we don't usually run spatch on contrib). Per the "buggy code" comment we also match a strbuf_init() before the xmalloc(), but we're not seeking to be so strict as to make checks that the compiler will catch for us redundant. Saying we'll match either "init" or "xmalloc" lines makes the rule simpler. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-22merge-ort: make `path_messages` a strmap to a string_listJohannes Schindelin1-7/+20
This allows us once again to get away with less data copying. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-20diff: use mks_tempfile_dt()René Scharfe1-7/+1
Git uses temporary files to pass the contents of blobs to external diff programs and textconv filters. It calls mks_tempfile_ts() to create them, which puts them all in the same directory. This requires adding a random name prefix. Use mks_tempfile_dt() instead, which allows the files to have arbitrary names, each in their own separate temporary directory. This way they can have the same basename as the original blob, which looks nicer in graphical diff programs. The test in t4020 to check the prettiness of the temporary paths was neutered by 5476bdf0e8 (diff tests: don't ignore "git diff" exit code in "read" loop, 2022-03-07), which removed its grep check without replacing it with an equivalent test_cmp check. Add one that only checks the basename of the temporary file and nothing else. And make the test more robust while at it, by using test_when_finished to get rid of the added file even if the test fails. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-23Merge branch 'ab/plug-random-leaks'Junio C Hamano1-2/+9
Double-free fix for a recently merged topic. * ab/plug-random-leaks: diff.c: fix a double-free regression in a18d66cefb tests: demonstrate "show --word-diff --color-moved" regression
2022-03-17diff.c: fix a double-free regression in a18d66cefbÆvar Arnfjörð Bjarmason1-2/+9
My a18d66cefb9 (diff.c: free "buf" in diff_words_flush(), 2022-03-04) has what it retrospect is a rather obvious bug (I don't know what I was thinking, if it all): We use the "emitted_symbols" allocation in append_emitted_diff_symbol() N times, but starting with a18d66cefb9 we'd free it after its first use! The correct way to free this data would have been to add the free() to the existing free_diff_words_data() function, so let's do that. The "ecbdata->diff_words->opt->emitted_symbols" might be NULL, so let's add a trivial free_emitted_diff_symbols() helper next to the function that appends to it. This fixes the "no effect on show from" leak tested for in the preceding commit. Perhaps confusingly this change will skip that test under SANITIZE=leak, but otherwise opt-in the "t4015-diff-whitespace.sh" test. The reason is that a18d66cefb9 "fixed" the leak in the preceding "no effect on diff" test, but for the first call to diff_words_flush() the "wol->buf" would be NULL, so we wouldn't double-free (and SANITIZE=address would see nothing amiss). With this change we'll still pass that test, showing that we've also fixed leaks on this codepath. We then have to skip the new "no effect on show" test because it happens to trip over an unrelated memory leak (in revision.c). The same goes for "move detection with submodules". Both of them pass with SANITIZE=address though, which would error on the "no effect on show" test before this change. Reported-by: Michael J Gruber <git@grubix.eu> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-13Merge branch 'ab/plug-random-leaks'Junio C Hamano1-0/+1
Plug random memory leaks. * ab/plug-random-leaks: repository.c: free the "path cache" in repo_clear() range-diff: plug memory leak in read_patches() range-diff: plug memory leak in common invocation lockfile API users: simplify and don't leak "path" commit-graph: stop fill_oids_from_packs() progress on error and free() commit-graph: fix memory leak in misused string_list API submodule--helper: fix trivial leak in module_add() transport: stop needlessly copying bundle header references bundle: call strvec_clear() on allocated strvec remote-curl.c: free memory in cmd_main() urlmatch.c: add and use a *_release() function diff.c: free "buf" in diff_words_flush() merge-base: free() allocated "struct commit **" list index-pack: fix memory leaks
2022-03-06Merge branch 'ac/usage-string-fixups'Junio C Hamano1-1/+1
Usage-string normalization. * ac/usage-string-fixups: amend remaining usage strings according to style guide
2022-03-04diff.c: free "buf" in diff_words_flush()Ævar Arnfjörð Bjarmason1-0/+1
Amend the freeing logic added in e6e045f8031 (diff.c: buffer all output if asked to, 2017-06-29) to free the containing "buf" in addition to its members. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-25Merge branch 'ab/diff-free-more'Junio C Hamano1-0/+2
Leakfixes. * ab/diff-free-more: diff.[ch]: have diff_free() free options->parseopts diff.[ch]: have diff_free() call clear_pathspec(opts.pathspec)
2022-02-23amend remaining usage strings according to style guideAbhradeep Chakraborty1-1/+1
Usage strings for git (sub)command flags has a style guide that suggests - first letter should not capitalized (unless required) and it should skip full-stop at the end of line. But there are some files where usage-strings do not follow the above mentioned guide. Amend the usage strings that don't follow the style convention/guide. Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16Merge branch 'js/diff-filter-negation-fix'Junio C Hamano1-53/+44
"git diff --diff-filter=aR" is now parsed correctly. * js/diff-filter-negation-fix: diff-filter: be more careful when looking for negative bits diff.c: move the diff filter bits definitions up a bit docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
2022-02-16Merge branch 'en/remerge-diff'Junio C Hamano1-4/+120
"git log --remerge-diff" shows the difference from mechanical merge result and the result that is actually recorded in a merge commit. * en/remerge-diff: diff-merges: avoid history simplifications when diffing merges merge-ort: mark conflict/warning messages from inner merges as omittable show, log: include conflict/warning messages in --remerge-diff headers diff: add ability to insert additional headers for paths merge-ort: format messages slightly different for use in headers merge-ort: mark a few more conflict messages as omittable merge-ort: capture and print ll-merge warnings in our preferred fashion ll-merge: make callers responsible for showing warnings log: clean unneeded objects during `log --remerge-diff` show, log: provide a --remerge-diff capability
2022-02-16diff.[ch]: have diff_free() free options->parseoptsÆvar Arnfjörð Bjarmason1-0/+1
The "struct option" added in 4a288478394 (diff.c: prepare to use parse_options() for parsing, 2019-01-27) would be free'd in the case of diff_setup_done() being called. But not all codepaths that allocate it reach that, e.g. "t6427-diff3-conflict-markers.sh" will now free memory that it didn't free before. By using FREE_AND_NULL() here (which diff_setup_done() also does) we ensure that we free the memory, and that we won't have double-free's. Before this running: ./t6427-diff3-conflict-markers.sh -vixd --run=7 Would report: SUMMARY: LeakSanitizer: 7823 byte(s) leaked in 6 allocation(s). But now we'll report: SUMMARY: LeakSanitizer: 703 byte(s) leaked in 5 allocation(s). I.e. the largest leak in that particular test has now been addressed. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16diff.[ch]: have diff_free() call clear_pathspec(opts.pathspec)Ævar Arnfjörð Bjarmason1-0/+1
Have the diff_free() function call clear_pathspec(). Since the diff_flush() function calls this all its callers can be simplified to rely on it instead. When I added the diff_free() function in e900d494dcf (diff: add an API for deferred freeing, 2021-02-11) I simply missed this, or wasn't interested in it. Let's consolidate this now. This means that any future callers (and I've got revision.c in mind) that embed a "struct diff_options" can simply call diff_free() instead of needing know that it has an embedded pathspec. This does fix a bunch of leaks, but I can't mark any test here as passing under the SANITIZE=leak testing mode because in 886e1084d78 (builtin/: add UNLEAKs, 2017-10-01) an UNLEAK(rev) was added, which plasters over the memory leak. E.g. "t4011-diff-symlink.sh" would report fewer leaks with this fix, but because of the UNLEAK() reports none. I'll eventually loop around to removing that UNLEAK(rev) annotation as I'll fix deeper issues with the revisions API leaking. This is one small step on the way there, a new freeing function in revisions.c will want to call this diff_free(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-02diff: add ability to insert additional headers for pathsElijah Newren1-4/+120
When additional headers are provided, we need to * add diff_filepairs to diff_queued_diff for each paths in the additional headers map which, unless that path is part of another diff_filepair already found in diff_queued_diff * format the headers (colorization, line_prefix for --graph) * make sure the various codepaths that attempt to return early if there are "no changes" take into account the headers that need to be shown. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-28diff-filter: be more careful when looking for negative bitsJohannes Schindelin1-16/+7
The `--diff-filter=<bits>` option allows to filter the diff by certain criteria, for example `R` to only show renamed files. It also supports negating a filter via a down-cased letter, i.e. `r` to show _everything but_ renamed files. However, the code is a bit overzealous when trying to figure out whether `git diff` should start with all diff-filters turned on because the user provided a lower-case letter: if the `--diff-filter` argument starts with an upper-case letter, we must not start with all bits turned on. Even worse, it is possible to specify the diff filters in multiple, separate options, e.g. `--diff-filter=AM [...] --diff-filter=m`. Let's accumulate the include/exclude filters independently, and only special-case the "only exclude filters were specified" case after parsing the options altogether. Note: The code replaced by this commit took pains to avoid setting any unused bits of `options->filter`. That was unnecessary, though, as all accesses happen via the `filter_bit_tst()` function using specific bits, and setting the unused bits has no effect. Therefore, we can simplify the code by using `~0` (or in this instance, `~<unwanted-bit>`). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-28diff.c: move the diff filter bits definitions up a bitJohannes Schindelin1-37/+37
This prepares for a more careful handling of the `--diff-filter` options over the next few commits. This commit is best viewed with `--color-moved`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-10Merge branch 'ja/i18n-similar-messages'Junio C Hamano1-4/+8
Similar message templates have been consolidated so that translators need to work on fewer number of messages. * ja/i18n-similar-messages: i18n: turn even more messages into "cannot be used together" ones i18n: ref-filter: factorize "%(foo) atom used without %(bar) atom" i18n: factorize "--foo outside a repository" i18n: refactor "unrecognized %(foo) argument" strings i18n: factorize "no directory given for --foo" i18n: factorize "--foo requires --bar" and the like i18n: tag.c factorize i18n strings i18n: standardize "cannot open" and "cannot read" i18n: turn "options are incompatible" into "cannot be used together" i18n: refactor "%s, %s and %s are mutually exclusive" i18n: refactor "foo and bar are mutually exclusive"
2022-01-05Merge branch 'pw/diff-color-moved-fix'Junio C Hamano1-245/+186
Correctness and performance update to "diff --color-moved" feature. * pw/diff-color-moved-fix: diff --color-moved: intern strings diff: use designated initializers for emitted_diff_symbol diff --color-moved-ws=allow-indentation-change: improve hash lookups diff --color-moved: stop clearing potential moved blocks diff --color-moved: shrink potential moved blocks as we go diff --color-moved: unify moved block growth functions diff --color-moved: call comparison function directly diff --color-moved-ws=allow-indentation-change: simplify and optimize diff: simplify allow-indentation-change delta calculation diff --color-moved: avoid false short line matches and bad zebra coloring diff --color-moved=zebra: fix alternate coloring diff --color-moved: rewind when discarding pmb diff --color-moved: factor out function diff --color-moved: clear all flags on blocks that are too short diff --color-moved: add perf tests
2022-01-05i18n: turn even more messages into "cannot be used together" onesJean-Noël Avila1-4/+8
Even if some of these messages are not subject to gettext i18n, this helps bring a single style of message for a given error type. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-05i18n: refactor "%s, %s and %s are mutually exclusive"Jean-Noël Avila1-1/+1
Use placeholders for constant tokens. The strings are turned into "cannot be used together" Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: intern stringsPhillip Wood1-78/+96
Taking inspiration from xdl_classify_record() assign an id to each addition and deletion such that lines that match for the current --color-moved-ws mode share the same unique id. This reduces the number of hash lookups a little (calculating the ids still involves one hash lookup per line) but the main benefit is that when growing blocks of potentially moved lines we can replace string comparisons which involve chasing a pointer with a simple integer comparison. On a large diff this commit reduces the time to run 'diff --color-moved' by 37% compared to the previous commit and 31% compared to master, for 'diff --color-moved-ws=allow-indentation-change' the reduction is 28% compared to the previous commit and 96% compared to master. There is little change in the performance of 'git log --patch' as the diffs are smaller. Test HEAD^ HEAD --------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.88(0.81+0.06) 0.55(0.50+0.04) -37.5% 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.85(0.79+0.06) 0.61(0.54+0.06) -28.2% 4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.08) 1.15(1.09+0.05) -0.9% 4002.5: log --color-moved --no-color-moved-ws 1.31(1.22+0.08) 1.29(1.19+0.09) -1.5% 4002.6: log --color-moved-ws=allow-indentation-change 1.32(1.24+0.08) 1.31(1.18+0.13) -0.8% Test master HEAD --------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.33+0.05) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.55(0.50+0.04) -31.2% 4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.61(0.54+0.06) -95.7% 4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.15(1.09+0.05) +0.0% 4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.29(1.19+0.09) -0.8% 4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.31(1.18+0.13) -22.9% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff: use designated initializers for emitted_diff_symbolPhillip Wood1-1/+3
This makes it clearer which fields are being explicitly initialized and will simplify the next commit where we add a new field to the struct. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved-ws=allow-indentation-change: improve hash lookupsPhillip Wood1-46/+19
As libxdiff does not have a whitespace flag to ignore the indentation the code for --color-moved-ws=allow-indentation-change uses XDF_IGNORE_WHITESPACE and then filters out any hash lookups where there are non-indentation changes. This filtering is inefficient as we have to perform another string comparison. By using the offset data that we have already computed to skip the indentation we can avoid using XDF_IGNORE_WHITESPACE and safely remove the extra checks which improves the performance by 11% and paves the way for the elimination of string comparisons in the next commit. This change slightly increases the run time of other --color-moved modes. This could be avoided by using different comparison functions for the different modes but after the next two commits there is no measurable benefit in doing so. There is a change in behavior for lines that begin with a form-feed or vertical-tab character. Since b46054b374 ("xdiff: use git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as whitespace characters. This means that lines starting with those characters are never considered to be blank and never match a line that does not start with the same character. After this patch a line matching "^[\f\v\r]*[ \t]*$" is considered to be blank by --color-moved-ws=allow-indentation-change and lines beginning "^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This changes the output of git show for d18f76dccf ("compat/regex: use the regex engine from gawk for compat", 2010-08-17) as some lines in the pre-image before a moved block that contain '\f' are now considered moved as well as they match a blank line before the moved lines in the post-image. This commit updates one of the tests to reflect this change. Test HEAD^ HEAD -------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.86(0.82+0.04) 0.88(0.84+0.04) +2.3% 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.94+0.03) 0.86(0.81+0.05) -11.3% 4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.09) 1.16(1.06+0.09) +0.0% 4002.5: log --color-moved --no-color-moved-ws 1.32(1.26+0.06) 1.33(1.27+0.05) +0.8% 4002.6: log --color-moved-ws=allow-indentation-change 1.35(1.29+0.06) 1.33(1.24+0.08) -1.5% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: stop clearing potential moved blocksPhillip Wood1-11/+0
moved_block_clear() was introduced in 74d156f4a1 ("diff --color-moved-ws: fix double free crash", 2018-10-04) to free the memory that was allocated when initializing a potential moved block. However since 21536d077f ("diff --color-moved-ws: modify allow-indentation-change", 2018-11-23) initializing a potential moved block no longer allocates any memory. Up until the last commit we were relying on moved_block_clear() to set the `match` pointer to NULL when a block stopped matching, but since that commit we do not clear a moved block that does not match so it does not make sense to clear them elsewhere. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: shrink potential moved blocks as we goPhillip Wood1-36/+8
Rather than setting `match` to NULL and then looping over the list of potential matched blocks for a second time to remove blocks with no matches just filter out the blocks with no matches as we go. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: unify moved block growth functionsPhillip Wood1-29/+12
After the last two commits pmb_advance_or_null() and pmb_advance_or_null_multi_match() differ only in the comparison they perform. Lets simplify the code by combining them into a single function. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: call comparison function directlyPhillip Wood1-4/+7
This change will allow us to easily combine pmb_advance_or_null() and pmb_advance_or_null_multi_match() in the next commit. Calling xdiff_compare_lines() directly rather than using a function pointer from the hash map has little effect on the run time. Test HEAD^ HEAD ------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.35+0.03) 0.38(0.32+0.06) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.87(0.83+0.04) 0.87(0.80+0.06) +0.0% 4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.92+0.04) 0.97(0.93+0.04) +0.0% 4002.4: log --no-color-moved --no-color-moved-ws 1.17(1.06+0.10) 1.16(1.10+0.05) -0.9% 4002.5: log --color-moved --no-color-moved-ws 1.32(1.24+0.08) 1.31(1.22+0.09) -0.8% 4002.6: log --color-moved-ws=allow-indentation-change 1.36(1.25+0.10) 1.35(1.25+0.10) -0.7% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved-ws=allow-indentation-change: simplify and optimizePhillip Wood1-50/+20
If we already have a block of potentially moved lines then as we move down the diff we need to check if the next line of each potentially moved line matches the current line of the diff. The implementation of --color-moved-ws=allow-indentation-change was needlessly performing this check on all the lines in the diff that matched the current line rather than just the current line. To exacerbate the problem finding all the other lines in the diff that match the current line involves a fuzzy lookup so we were wasting even more time performing a second comparison to filter out the non-matching lines. Fixing this reduces time to run git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0 by 93% compared to master and simplifies the code. Test HEAD^ HEAD --------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.35+0.03) 0.38(0.35+0.03) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.86 (0.80+0.06) 0.87(0.83+0.04) +1.2% 4002.3: diff --color-moved-ws=allow-indentation-change large change 19.01(18.93+0.06) 0.97(0.92+0.04) -94.9% 4002.4: log --no-color-moved --no-color-moved-ws 1.16 (1.06+0.09) 1.17(1.06+0.10) +0.9% 4002.5: log --color-moved --no-color-moved-ws 1.32 (1.25+0.07) 1.32(1.24+0.08) +0.0% 4002.6: log --color-moved-ws=allow-indentation-change 1.71 (1.64+0.06) 1.36(1.25+0.10) -20.5% Test master HEAD --------------------------------------------------------------------------------------------------------------- 4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.35+0.03) +0.0% 4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.87(0.83+0.04) +8.7% 4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.97(0.92+0.04) -93.2% 4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.17(1.06+0.10) +1.7% 4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.32(1.24+0.08) +1.5% 4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.36(1.25+0.10) -20.0% Helped-by: Jeff King <peff@peff.net> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff: simplify allow-indentation-change delta calculationPhillip Wood1-11/+2
Now that we reliably end a block when the sign changes we don't need the whitespace delta calculation to rely on the sign. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: avoid false short line matches and bad zebra coloringPhillip Wood1-6/+11
When marking moved lines it is possible for a block of potential matched lines to extend past a change in sign when there is a sequence of added lines whose text matches the text of a sequence of deleted and added lines. Most of the time either `match` will be NULL or `pmb_advance_or_null()` will fail when the loop encounters a change of sign but there are corner cases where `match` is non-NULL and `pmb_advance_or_null()` successfully advances the moved block despite the change in sign. One consequence of this is highlighting a short line as moved when it should not be. For example -moved line # Correctly highlighted as moved +short line # Wrongly highlighted as moved context +moved line # Correctly highlighted as moved +short line context -short line The other consequence is coloring a moved addition following a moved deletion in the wrong color. In the example below the first "+moved line 3" should be highlighted as newMoved not newMovedAlternate. -moved line 1 # Correctly highlighted as oldMoved -moved line 2 # Correctly highlighted as oldMovedAlternate +moved line 3 # Wrongly highlighted as newMovedAlternate context # Everything else is highlighted correctly +moved line 2 +moved line 3 context +moved line 1 -moved line 3 These false matches are more likely when using --color-moved-ws with the exception of --color-moved-ws=allow-indentation-change which ties the sign of the current whitespace delta to the sign of the line to avoid this problem. The fix is to check that the sign of the new line being matched is the same as the sign of the line that started the block of potential matches. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved=zebra: fix alternate coloringPhillip Wood1-2/+2
b0a2ba4776 ("diff --color-moved=zebra: be stricter with color alternation", 2018-11-23) sought to avoid using the alternate colors unless there are two adjacent moved blocks of the same sign. Unfortunately it contains two bugs that prevented it from fixing the problem properly. Firstly `last_symbol` is reset at the start of each iteration of the loop losing the symbol of the last line and secondly when deciding whether to use the alternate color it should be checking if the current line is the same sign of the last line, not a different sign. The combination of the two errors means that we still use the alternate color when we should do but we also use it when we shouldn't. This is most noticable when using --color-moved-ws=allow-indentation-change with hunks like -this line gets indented + this line gets indented where the post image is colored with newMovedAlternate rather than newMoved. While this does not matter much, the next commit will change the coloring to be correct in this case, so lets fix the bug here to make it clear why the output is changing and add a regression test. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: rewind when discarding pmbPhillip Wood1-5/+23
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: factor out functionPhillip Wood1-17/+34
This code is quite heavily indented and having it in its own function simplifies an upcoming change. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09diff --color-moved: clear all flags on blocks that are too shortPhillip Wood1-3/+3
If a block of potentially moved lines is not long enough then the DIFF_SYMBOL_MOVED_LINE flag is cleared on the matching lines so they are not marked as moved. To avoid problems when we start rewinding after an unsuccessful match in a couple of commits time make sure all the move related flags are cleared, not just DIFF_SYMBOL_MOVED_LINE. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25run-command API users: use strvec_push(), not argv constructionÆvar Arnfjörð Bjarmason1-6/+2
Change a pattern of hardcoding an "argv" array size, populating it and assigning to the "argv" member of "struct child_process" to instead use "strvec_push()" to add data to the "args" member. As noted in the preceding commit this moves us further towards being able to remove the "argv" member in a subsequent commit These callers could have used strvec_pushl(), but moving to strvec_push() makes the diff easier to read, and keeps the arguments aligned as before. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27*.[ch] *_INIT macros: use { 0 } for a "zero out" idiomÆvar Arnfjörð Bjarmason1-2/+2
In C it isn't required to specify that all members of a struct are zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will accomplish that. Let's also change code that provided N zero'd fields to just provide one, and change e.g. "{ NULL }" to "{ 0 }" for consistency. I.e. even if the first member is a pointer let's use "0" instead of "NULL". The point of using "0" consistently is to pick one, and to not have the reader wonder why we're not using the same pattern everywhere. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09diff: ignore sparse paths in diffstatDerrick Stolee1-0/+8
The diff_populate_filespec() method is used to describe the diff after a merge operation is complete. In order to avoid expanding a sparse index, the reuse_worktree_file() needs to be adapted to ignore files that are outside of the sparse-checkout cone. The file names and OIDs used for this check come from the merged tree in the case of the ORT strategy, not the index, hence the ability to look into these paths without having already expanded the index. The work done by reuse_worktree_file() is only an optimization, and requires the file being on disk for it to be of any value. Thus, it is safe to exit the method early if we do not expect the file on disk. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-06Merge branch 'ab/pickaxe-pcre2'Junio C Hamano1-1/+1
* ab/pickaxe-pcre2: diff: --pickaxe-all typofix
2021-08-04diff: --pickaxe-all typofixBagas Sanjaya1-1/+1
When I was fixing fuzzies as I updating po/id.po for 2.33.0 l10n round, I noticed a triple-dash typo (--pickaxe-all) at diff.c, which according to git-diff(1) manpage, the correct option name should be --pickaxe-all. Fix the typo. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28Merge branch 'en/rename-limits-doc'Junio C Hamano1-2/+2
Documentation on "git diff -l<n>" and diff.renameLimit have been updated, and the defaults for these limits have been raised. * en/rename-limits-doc: rename: bump limit defaults yet again diffcore-rename: treat a rename_limit of 0 as unlimited doc: clarify documentation for rename/copy limits diff: correct warning message when renameLimit exceeded
2021-07-15rename: bump limit defaults yet againElijah Newren1-1/+1
These were last bumped in commit 92c57e5c1d29 (bump rename limit defaults (again), 2011-02-19), and were bumped both because processors had gotten faster, and because people were getting ugly merges that caused problems and reporting it to the mailing list (suggesting that folks were willing to spend more time waiting). Since that time: * Linus has continued recommending kernel folks to set diff.renameLimit=0 (maps to 32767, currently) * Folks with repositories with lots of renames were happy to set merge.renameLimit above 32767, once the code supported that, to get correct cherry-picks * Processors have gotten faster * It has been discovered that the timing methodology used last time probably used too large example files. The last point is probably worth explaining a bit more: * The "average" file size used appears to have been average blob size in the linux kernel history at the time (probably v2.6.25 or something close to it). * Since bigger files are modified more frequently, such a computation weights towards larger files. * Larger files may be more likely to be modified over time, but are not more likely to be renamed -- the mean and median blob size within a tree are a bit higher than the mean and median of blob sizes in the history leading up to that version for the linux kernel. * The mean blob size in v2.6.25 was half the average blob size in history leading to that point * The median blob size in v2.6.25 was about 40% of the mean blob size in v2.6.25. * Since the mean blob size is more than double the median blob size, any file as big as the mean will not be compared to any files of median size or less (because they'd be more than 50% dissimilar). * Since it is the number of files compared that provides the O(n^2) behavior, median-sized files should matter more than mean-sized ones. The combined effect of the above is that the file size used in past calculations was likely about 5x too large. Combine that with a CPU performance improvement of ~30%, and we can increase the limits by a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time limits. Keeping the same approximate time limit probably makes sense for diff.renameLimit (there is no progress feedback in e.g. git log -p), but the experience above suggests merge.renameLimit could be extended significantly. In fact, it probably would make sense to have an unlimited default setting for merge.renameLimit, but that would likely need to be coupled with changes to how progress is displayed. (See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/ for details in that area.) For now, let's just bump the approximate time limit from 10s to 1m. (Note: We do not want to use actual time limits, because getting results that depend on how loaded your system is that day feels bad, and because we don't discover that we won't get all the renames until after we've put in a lot of work rather than just upfront telling the user there are too many files involved.) Using the original time limit of 2s for diff.renameLimit, and bumping merge.renameLimit from 10s to 60s, I found the following timings using the simple script at the end of this commit message (on an AWS c5.xlarge which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"): N Timing 1300 1.995s 7100 59.973s So let's round down to nice even numbers and bump the limits from 400->1000, and from 1000->7000. Here is the measure_rename_perf script (adapted from https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/ in particular to avoid triggering the linear handling from basename-guided rename detection): #!/bin/bash n=$1; shift rm -rf repo mkdir repo && cd repo git init -q -b main mkdata() { mkdir $1 for i in `seq 1 $2`; do (sed "s/^/$i /" <../sample echo tag: $1 ) >$1/$i done } mkdata initial $n git add . git commit -q -m initial mkdata new $n git add . cd new for i in *; do git mv $i $i.renamed; done cd .. git rm -q -rf initial git commit -q -m new time git diff-tree -M -l0 --summary HEAD^ HEAD Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15diff: correct warning message when renameLimit exceededElijah Newren1-1/+1
The warning when quadratic rename detection was skipped referred to "inexact rename detection". For years, the only linear portion of rename detection was looking for exact renames, so "inexact rename detection" was an accurate way to refer to the quadratic portion of rename detection. However, that changed with commit bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based on basenames, 2021-02-14). Let's instead use the term "exhaustive rename detection" to refer to the quadratic portion. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13Merge branch 'ab/pickaxe-pcre2'Junio C Hamano1-14/+25
Rewrite the backend for "diff -G/-S" to use pcre2 engine when available. * ab/pickaxe-pcre2: (22 commits) xdiff-interface: replace discard_hunk_line() with a flag xdiff users: use designated initializers for out_line pickaxe -G: don't special-case create/delete pickaxe -G: terminate early on matching lines xdiff-interface: allow early return from xdiff_emit_line_fn xdiff-interface: prepare for allowing early return pickaxe -S: slightly optimize contains() pickaxe: rename variables in has_changes() for brevity pickaxe -S: support content with NULs under --pickaxe-regex pickaxe: assert that we must have a needle under -G or -S pickaxe: refactor function selection in diffcore-pickaxe() perf: add performance test for pickaxe pickaxe/style: consolidate declarations and assignments diff.h: move pickaxe fields together again pickaxe: die when --find-object and --pickaxe-all are combined pickaxe: die when -G and --pickaxe-regex are combined pickaxe tests: add missing test for --no-pickaxe-regex being an error pickaxe tests: test for -G, -S and --find-object incompatibility pickaxe tests: add test for "log -S" not being a regex pickaxe tests: add test for diffgrep_consume() internals ...
2021-05-14Merge branch 'pw/word-diff-zero-width-matches'Junio C Hamano1-3/+7
The word-diff mode has been taught to work better with a word regexp that can match an empty string. * pw/word-diff-zero-width-matches: word diff: handle zero length matches
2021-05-11xdiff-interface: replace discard_hunk_line() with a flagÆvar Arnfjörð Bjarmason1-3/+4
Remove the dummy discard_hunk_line() function added in 3b40a090fd4 (diff: avoid generating unused hunk header lines, 2018-11-02) in favor of having a new XDL_EMIT_NO_HUNK_HDR flag, for use along with the two existing and similar XDL_EMIT_* flags. Unlike the recently amended xdiff_emit_line_fn interface which'll be called in a loop in xdl_emit_diff(), the hunk header is only emitted once. It makes more sense to pass this as a flag than provide a dummy callback because that function may be able to skip doing certain work if it knows the caller is doing nothing with the hunk header. It would be possible to do so in the case of -U0 now, but the benefit of doing so is so small that I haven't bothered. But this leaves the door open to that, and more importantly makes the API use more intuitive. The reason we're putting a flag in the gap between 1<<0 and 1<<2 is that the old 1<<1 flag was removed in 907681e940d (xdiff: drop XDL_EMIT_COMMON, 2016-02-23) without re-ordering the remaining flags. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11xdiff-interface: prepare for allowing early returnÆvar Arnfjörð Bjarmason1-11/+15
Change the function prototype of xdiff_emit_line_fn to return an "int" instead of "void". Change all of those functions to "return 0", nothing checks those return values yet, and no behavior is being changed. In subsequent commits the interface will be changed to allow early return via this new return value. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11pickaxe: die when --find-object and --pickaxe-all are combinedÆvar Arnfjörð Bjarmason1-0/+3
Neither the --pickaxe-all documentation nor --find-object's has ever suggested that you can combine the two. See f506b8e8b5 (git log/diff: add -G<regexp> that greps in the patch text, 2010-08-23) and 15af58c1ad (diffcore: add a pickaxe option to find a specific blob, 2018-01-04). But we've silently tolerated it, which makes the logic in diffcore_pickaxe() harder to reason about. Let's assert that we won't have the two combined. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11pickaxe: die when -G and --pickaxe-regex are combinedÆvar Arnfjörð Bjarmason1-0/+3
When the -G and --pickaxe-regex options are combined we simply ignore the --pickaxe-regex option. Let's die instead as suggested by our documentation, since -G is always a regex. When --pickaxe-regex was added in d01d8c6782 (Support for pickaxe matching regular expressions, 2006-03-29) only the -S option existed. Then when -G was added in f506b8e8b5 (git log/diff: add -G<regexp> that greps in the patch text, 2010-08-23) neither the documentation for --pickaxe-regex was updated accordingly, nor was something like this assertion added. Since 5bc3f0b567 (diffcore-pickaxe doc: document -S and -G properly, 2013-05-31) we've claimed that --pickaxe-regex should only be used with -S, but have silently tolerated combining it with -G, let's die instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05word diff: handle zero length matchesPhillip Wood1-3/+7
If find_word_boundaries() encounters a zero length match (which can be caused by matching a newline or using '*' instead of '+' in the regex) we stop splitting the input into words which generates an inaccurate diff. To fix this increment the start point when there is a zero length match and try a new match. This is safe as posix regular expressions always return the longest available match so a zero length match means there are no longer matches available from the current position. Commit bf82940dbf1 (color-words: enable REG_NEWLINE to help user, 2009-01-17) prevented matching newlines in negated character classes but it is still possible for the user to have an explicit newline match in the regex which could cause a zero length match. One could argue that having explicit newline matches or using '*' rather than '+' are user errors but it seems to be better to work round them than produce inaccurate diffs. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27hash: provide per-algorithm null OIDsbrian m. carlson1-2/+2
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27Use the final_oid_fn to finalize hashing of object IDsbrian m. carlson1-1/+1
When we're hashing a value which is going to be an object ID, we want to zero-pad that value if necessary. To do so, use the final_oid_fn instead of the final_fn anytime we're going to create an object ID to ensure we perform this operation. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13use CALLOC_ARRAYRené Scharfe1-6/+4
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25Merge branch 'jc/diffcore-rotate'Junio C Hamano1-0/+21
"git {diff,log} --{skip,rotate}-to=<path>" allows the user to discard diff output for early paths or move them to the end of the output. * jc/diffcore-rotate: diff: --{rotate,skip}-to=<path>
2021-02-16diff: --{rotate,skip}-to=<path>Junio C Hamano1-0/+21
In the implementation of "git difftool", there is a case where the user wants to start viewing the diffs at a specific path and continue on to the rest, optionally wrapping around to the beginning. Since it is somewhat cumbersome to implement such a feature as a post-processing step of "git diff" output, let's support it internally with two new options. - "git diff --rotate-to=C", when the resulting patch would show paths A B C D E without the option, would "rotate" the paths to shows patch to C D E A B instead. It is an error when there is no patch for C is shown. - "git diff --skip-to=C" would instead "skip" the paths before C, and shows patch to C D E. Again, it is an error when there is no patch for C is shown. - "git log [-p]" also accepts these two options, but it is not an error if there is no change to the specified path. Instead, the set of output paths are rotated or skipped to the specified path or the first path that sorts after the specified path. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11diff: plug memory leak from regcomp() on {log,diff} -IÆvar Arnfjörð Bjarmason1-0/+12
Fix a memory leak in 296d4a94e7 (diff: add -I<regex> that ignores matching changes, 2020-10-20) by freeing the memory it allocates in the newly introduced diff_free(). See the previous commit for details on that. This memory leak was intentionally introduced in 296d4a94e7, see the discussion on a previous iteration of it in https://lore.kernel.org/git/xmqqeelycajx.fsf@gitster.c.googlers.com/ At that time freeing the memory was somewhat tedious, but since it isn't anymore with the newly introduced diff_free() let's use it. Let's retain the pattern for diff_free_file() and add a diff_free_ignore_regex(), even though (unlike "diff_free_file") we don't need to call it elsewhere. I think this'll make for more readable code than gradually accumulating a giant diff_free() function, sharing "int i" across unrelated code etc. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11diff: add an API for deferred freeingÆvar Arnfjörð Bjarmason1-4/+16
Add a diff_free() function to free anything we may have allocated in the "diff_options" struct, and the ability to make calling it a noop by setting "no_free" in "diff_options". This is required because when e.g. "git diff" is run we'll allocate things in that struct, use the diff machinery once, and then exit. But if we run e.g. "git log -p" we're going to re-use what we allocated across multiple diff_flush() calls, and only want to free things at the end. We've thus ended up with features like the recently added "diff -I"[1] where we'll leak memory. As it turns out it could have simply used the pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse the diffopt.close_file attribute, 2016-06-22). Manually adding more such flags to things log_tree_commit() every time we need to allocate something would be tedious. Let's instead move that fclose() code it to a new diff_free(), in anticipation of freeing more things in that function in follow-up commits. Some functions such as log_tree_commit() need an idiom of optionally retaining a previous "no_free", as they may either free the memory themselves, or their caller may do so. I'm keeping that idiom in log_show_early() for good measure, even though I don't think it's currently called in this manner. It also gets passed an existing "struct rev_info", so future callers may want to set the "no_free" flag. This change is a bit hard to read because while the freeing pattern we're introducing isn't unusual, the "file" member is a special snowflake. We usually don't want to fclose() it. This is because "file" is usually stdout, in which case we don't want to fclose() it. We only want to opt-in to closing it when we e.g. open a file on the filesystem. Thus the opt-in "close_file" flag. So the API in general just needs a "no_free" flag to defer freeing, but the "file" member still needs its "close_file" flag. This is made more confusing because while refactoring this code we could replace some "close_file=0" with "no_free=1", whereas others need to set both flags. This is because there were some cases where an existing "close_file=0" meant "let's defer deallocation", and others where it meant "we don't want to close this file handle at all". 1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes, 2020-10-20) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty'Junio C Hamano1-0/+3
"git diff" showed a submodule working tree with untracked cruft as "Submodule commit <objectname>-dirty", but a natural expectation is that the "-dirty" indicator would align with "git describe --dirty", which does not consider having untracked files in the working tree as source of dirtiness. The inconsistency has been fixed. * sj/untracked-files-in-submodule-directory-is-not-dirty: diff: do not show submodule with untracked files as "-dirty"
2020-12-18Merge branch 'jc/diff-I-status-fix'Junio C Hamano1-1/+2
"git diff -I<pattern> -exit-code" should exit with 0 status when all the changes match the ignored pattern, but it didn't. * jc/diff-I-status-fix: diff: correct interaction between --exit-code and -I<pattern>
2020-12-16diff: correct interaction between --exit-code and -I<pattern>Junio C Hamano1-1/+2
Just like "git diff -w --exit-code" should exit with 0 when ignoring whitespace differences results in no changes shown, if ignoring certain changes with "git diff -I<pattern> --exit-code" result in an empty patch, we should exit with 0. The test suite did not cover the interaction between "--exit-code" and "-w"; add one while adding a new test for "--exit-code" + "-I". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08diff: do not show submodule with untracked files as "-dirty"Sangeeta Jain1-0/+3
Git diff reports a submodule directory as -dirty even when there are only untracked files in the submodule directory. This is inconsistent with what `git describe --dirty` says when run in the submodule directory in that state. Make `--ignore-submodules=untracked` the default for `git diff` when there is no configuration variable or command line option, so that the command would not give '-dirty' suffix to a submodule whose working tree has untracked files, to make it consistent with `git describe --dirty` that is run in the submodule working tree. And also make `--ignore-submodules=none` the default for `git status` so that the user doesn't end up deleting a submodule that has uncommitted (untracked) files. Signed-off-by: Sangeeta Jain <sangunb09@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21Merge branch 'en/strmap'Junio C Hamano1-2/+2
A specialization of hashmap that uses a string as key has been introduced. Hopefully it will see wider use over time. * en/strmap: shortlog: use strset from strmap.h Use new HASHMAP_INIT macro to simplify hashmap initialization strmap: take advantage of FLEXPTR_ALLOC_STR when relevant strmap: enable allocations to come from a mem_pool strmap: add a strset sub-type strmap: split create_entry() out of strmap_put() strmap: add functions facilitating use as a string->int map strmap: enable faster clearing and reusing of strmaps strmap: add more utility functions strmap: new utility functions hashmap: provide deallocation function names hashmap: introduce a new hashmap_partial_clear() hashmap: allow re-use after hashmap_free() hashmap: adjust spacing to fix argument alignment hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-21Merge branch 'jk/diff-release-filespec-fix'Junio C Hamano1-0/+3
Running "git diff" while allowing external diff in a state with unmerged paths used to segfault, which has been corrected. * jk/diff-release-filespec-fix: t7800: simplify difftool test diff: allow passing NULL to diff_free_filespec_data()
2020-11-06diff: allow passing NULL to diff_free_filespec_data()Jinoh Kang1-0/+3
Commit 3aef54e8b8 ("diff: munmap() file contents before running external diff") introduced calls to diff_free_filespec_data in run_external_diff, which may pass NULL pointers. Fix this and prevent any such bugs in the future by making `diff_free_filespec_data(NULL)` a no-op. Fixes: 3aef54e8b8 ("diff: munmap() file contents before running external diff") Signed-off-by: Jinoh Kang <luke1337@theori.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02Merge branch 'mk/diff-ignore-regex'Junio C Hamano1-0/+23
"git diff" family of commands learned the "-I<regex>" option to ignore hunks whose changed lines all match the given pattern. * mk/diff-ignore-regex: diff: add -I<regex> that ignores matching changes merge-base, xdiff: zero out xpparam_t structures
2020-11-02hashmap: provide deallocation function namesElijah Newren1-2/+2
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20diff: add -I<regex> that ignores matching changesMichał Kępień1-0/+23
Add a new diff option that enables ignoring changes whose all lines (changed, removed, and added) match a given regular expression. This is similar to the -I/--ignore-matching-lines option in standalone diff utilities and can be used e.g. to ignore changes which only affect code comments or to look for unrelated changes in commits containing a large number of automatically applied modifications (e.g. a tree-wide string replacement). The difference between -G/-S and the new -I option is that the latter filters output on a per-change basis. Use the 'ignore' field of xdchange_t for marking a change as ignored or not. Since the same field is used by --ignore-blank-lines, identical hunk emitting rules apply for --ignore-blank-lines and -I. These two options can also be used together in the same git invocation (they are complementary to each other). Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to indicate that it is logically a "sibling" of xdl_mark_ignorable_regex() rather than its "parent". Signed-off-by: Michał Kępień <michal@isc.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24diff: fix modified lines stats with --stat and --numstatThomas Guyot-Sionnest1-5/+7
Only skip diffstats when both oids are valid and identical. This check was causing both false-positives (files included in diffstats with no actual changes (0 lines modified) and false-negatives (showing 0 lines modified in stats when files had actually changed). Also replaced same_contents with may_differ to avoid confusion. Signed-off-by: Thomas Guyot-Sionnest <tguyot@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18Merge branch 'jc/quote-path-cleanup'Junio C Hamano1-4/+4
"git status --short" quoted a path with SP in it when tracked, but not those that are untracked, ignored or unmerged. They are all shown quoted consistently. * jc/quote-path-cleanup: quote: turn 'nodq' parameter into a set of flags quote: rename misnamed sq_lookup[] to cq_lookup[] wt-status: consistently quote paths in "status --short" output quote_path: code clarification quote_path: optionally allow quoting a path with SP in it quote_path: give flags parameter to quote_path() quote_path: rename quote_path_relative() to quote_path()
2020-09-10quote: turn 'nodq' parameter into a set of flagsJunio C Hamano1-4/+4
quote_c_style() and its friend quote_two_c_style() both take an optional "please omit the double quotes around the quoted body" parameter. Turn it into a flag word, assign one bit out of it, and call it CQUOTE_NODQ bit. No behaviour change intended. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-09Merge branch 'ss/submodule-summary-in-c'Junio C Hamano1-1/+1
Yet another subcommand of "git submodule" is getting rewritten in C. * ss/submodule-summary-in-c: submodule: port submodule subcommand 'summary' from shell to C t7421: introduce a test script for verifying 'summary' output submodule: rename helper functions to avoid ambiguity submodule: remove extra line feeds between callback struct and macro
2020-09-03Merge branch 'mr/diff-hide-stat-wo-textual-change'Junio C Hamano1-7/+31
"git diff --stat -w" showed 0-line changes for paths whose changes were only whitespaces, which was not intuitive. We now omit such paths from the stat output. * mr/diff-hide-stat-wo-textual-change: diff: teach --stat to ignore uninteresting modifications
2020-08-31Merge branch 'dd/diff-customize-index-line-abbrev'Junio C Hamano1-1/+4
The output from the "diff" family of the commands had abbreviated object names of blobs involved in the patch, but its length was not affected by the --abbrev option. Now it is. * dd/diff-customize-index-line-abbrev: diff: index-line: respect --abbrev in object's name t4013: improve diff-post-processor logic
2020-08-24Merge branch 'rs/patch-id-with-incomplete-line'Junio C Hamano1-0/+2
The patch-id computation did not ignore the "incomplete last line" marker like whitespaces. * rs/patch-id-with-incomplete-line: patch-id: ignore newline at end of file in diff_flush_patch_id()
2020-08-21diff: index-line: respect --abbrev in object's nameĐoàn Trần Công Danh1-1/+4
A handful of Git's commands respect `--abbrev' for customizing length of abbreviation of object names. For diff-family, Git supports 2 different options for 2 different purposes, `--full-index' for showing diff-patch object's name in full, and `--abbrev' to customize the length of object names in diff-raw and diff-tree header lines, without any options to customise the length of object names in diff-patch format. When working with diff-patch format, we only have two options, either full index, or default abbrev length. Although, that behaviour is documented, it doesn't stop users from trying to use `--abbrev' with the hope of customising diff-patch's objects' name's abbreviation. Let's allow the blob object names shown on the "index" line to be abbreviated to arbitrary length given via the "--abbrev" option. To preserve backward compatibility with old script that specify both `--full-index' and `--abbrev', always show full object id if `--full-index' is specified. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-19diff: teach --stat to ignore uninteresting modificationsMatthew Rogers1-7/+31
When options such as --ignore-space-change are in use, files with modifications can have no interesting textual changes worth showing. In such cases, "git diff --stat" shows 0 lines of additions and deletions. Teach "git diff --stat" not to show such a path in its output, which would be more natural. However, we don't want to prevent the display of all files that have 0 effective diffs since they could be the result of a rename, permission change, or other similar operation that may still be of interest so we special case additions and deletions as they are always interesting. Signed-off-by: Matthew Rogers <mattr94@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18patch-id: ignore newline at end of file in diff_flush_patch_id()René Scharfe1-0/+2
Whitespace is ignored when calculating patch IDs. This is done by removing all whitespace from diff lines before hashing them, including a newline at the end of a file. If that newline is missing, however, diff reports that fact in a separate line containing "\ No newline at end of file\n", and this marker is hashed like a context line. This goes against our goal of making patch IDs independent of whitespace. Use the same heuristic that 2485eab55cc (git-patch-id: do not trip over "no newline" markers, 2011-02-17) added to git patch-id instead and skip diff lines that start with a backslash and a space and are longer than twelve characters. Reported-by: Tilman Vogel <tilman.vogel@web.de> Initial-test-by: Tilman Vogel <tilman.vogel@web.de> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-12submodule: rename helper functions to avoid ambiguityShourya Shukla1-1/+1
The helper functions: show_submodule_summary(), prepare_submodule_summary() and print_submodule_summary() are used by the builtin_diff() function in diff.c to generate a summary of submodules in the context of a diff. Functions with similar names are to be introduced in the upcoming port of submodule's summary subcommand. So, rename the helper functions to '*_diff_submodule_summary()' to avoid ambiguity. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30strvec: rename struct fieldsJeff King1-1/+1
The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: convert more callers away from argv_array nameJeff King1-14/+14
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: rename files from argv-array to strvecJeff King1-1/+1
This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '*.c' '*.h' | xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-17Merge branch 'jk/diff-memuse-optim-with-stat-unmatch'Junio C Hamano1-1/+4
Reduce memory usage during "diff --quiet" in a worktree with too many stat-unmatched paths. * jk/diff-memuse-optim-with-stat-unmatch: diff: discard blob data from stat-unmatched pairs
2020-06-02diff: discard blob data from stat-unmatched pairsJeff King1-1/+4
When performing a tree-level diff against the working tree, we may find that our index stat information is dirty, so we queue a filepair to be examined later. If the actual content hasn't changed, we call this a stat-unmatch; the stat information was out of date, but there's no actual diff. Normally diffcore_std() would detect and remove these identical filepairs via diffcore_skip_stat_unmatch(). However, when "--quiet" is used, we want to stop the diff as soon as we see any changes, so we check for stat-unmatches immediately in diff_change(). That check may require us to actually load the file contents into the pair of diff_filespecs. If we find that the pair isn't a stat-unmatch, then no big deal; we'd likely load the contents later anyway to generate a patch, do rename detection, etc, so we want to hold on to it. But if it is a stat-unmatch, then we have no more use for that data; the whole point is that we're going discard the pair. However, we never free the allocated diff_filespec data. In most cases, keeping that data isn't a problem. We don't expect a lot of stat-unmatch entries, and since we're using --quiet, we'd quit as soon as we saw such a real change anyway. However, there are extreme cases where it makes a big difference: 1. We'd generally mmap() the working tree half of the pair. And since the OS may limit the total number of maps, we can run afoul of this in large repositories. E.g.: $ cd linux $ git ls-files | wc -l 67959 $ sysctl vm.max_map_count vm.max_map_count = 65530 $ git ls-files | xargs touch ;# everything is stat-dirty! $ git diff --quiet fatal: mmap failed: Cannot allocate memory It should be unusual to have so many files stat-dirty, but it's possible if you've just run a script like "sed -i" or similar. After this patch, the above correctly exits with code 0. 2. Even if you don't hit mmap limits, the index half of the pair will have been pulled from the object database into heap memory. Again in a clone of linux.git, running: $ git ls-files | head -n 10000 | xargs touch $ git diff --quiet peaks at 145MB heap before this patch, and 94MB after. This patch solves the problem by freeing any diff_filespec data we picked up during the "--quiet" stat-unmatch check in diff_changes. Nobody is going to need that data later, so there's no point holding on to it. There are a few things to note: - we could skip queueing the pair entirely, which could in theory save a little work. But there's not much to save, as we need a diff_filepair to feed to diff_filespec_check_stat_unmatch() anyway. And since we cache the result of the stat-unmatch checks, a later call to diffcore_skip_stat_unmatch() call will quickly skip over them. The diffcore code also counts up the number of stat-unmatched pairs as it removes them. It's doubtful any callers would care about that in combination with --quiet, but we'd have to reimplement the logic here to be on the safe side. So it's not really worth the trouble. - I didn't write a test, because we always produce the correct output unless we run up against system mmap limits, which are both unportable and expensive to test against. Measuring peak heap would be interesting, but our perf suite isn't yet capable of that. - note that diff without "--quiet" does not suffer from the same problem. In diffcore_skip_stat_unmatch(), we detect the stat-unmatch entries and drop them immediately, so we're not carrying their data around. - you _can_ still trigger the mmap limit problem if you truly have that many files with actual changes. But it's rather unlikely. The stat-unmatch check avoids loading the file contents if the sizes don't match, so you'd need a pretty trivial change in every single file. Likewise, inexact rename detection might load the data for many files all at once. But you'd need not just 64k changes, but that many deletions and additions. The most likely candidate is perhaps break-detection, which would load the data for all pairs and keep it around for the content-level diff. But again, you'd need 64k actually changed files in the first place. So it's still possible to trigger this case, but it seems like "I accidentally made all my files stat-dirty" is the most likely case in the real world. Reported-by: Jan Christoph Uhde <Jan@UhdeJc.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24diff: add config option relativeLaurent Arnoud1-3/+8
The `diff.relative` boolean option set to `true` shows only changes in the current directory/value specified by the `path` argument of the `relative` option and shows pathnames relative to the aforementioned directory. Teach `--no-relative` to override earlier `--relative` Add for git-format-patch(1) options documentation `--relative` and `--no-relative` Signed-off-by: Laurent Arnoud <laurent@spkdev.net> Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-28Merge branch 'jt/avoid-prefetch-when-able-in-diff'Junio C Hamano1-50/+107
"git diff" in a partial clone learned to avoid lazy loading blob objects in more casese when they are not needed. * jt/avoid-prefetch-when-able-in-diff: diff: restrict when prefetching occurs diff: refactor object read diff: make diff_populate_filespec_options struct promisor-remote: accept 0 as oid_nr in function
2020-04-07diff: restrict when prefetching occursJonathan Tan1-22/+51
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched. In these cases, any command that uses the diff machinery will unnecessarily fetch blobs. diffcore_std() may read blobs when it calls the following functions: (1) diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex) (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite detection) (3) diffcore_rename() (for rename detection) (4) diffcore_pickaxe() (for detecting addition/deletion of specified string) Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object. This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-07diff: refactor object readJonathan Tan1-8/+21
Refactor the object reads in diff_populate_filespec() to have the first object read not be in an if/else branch, because in a future patch, a retry will be added to that first object read. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-07diff: make diff_populate_filespec_options structJonathan Tan1-19/+35
The behavior of diff_populate_filespec() currently can be customized through a bitflag, but a subsequent patch requires it to support a non-boolean option. Replace the bitflag with an options struct. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-02promisor-remote: accept 0 as oid_nr in functionJonathan Tan1-6/+5
There are 3 callers to promisor_remote_get_direct() that first check if the number of objects to be fetched is equal to 0. Fold that check into promisor_remote_get_direct(), and in doing so, be explicit as to what promisor_remote_get_direct() does if oid_nr is 0 (it returns 0, success, immediately). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16convert: provide additional metadata to filtersbrian m. carlson1-1/+4
Now that we have the codebase wired up to pass any additional metadata to filters, let's collect the additional metadata that we'd like to pass. The two main places we pass this metadata are checkouts and archives. In these two situations, reading HEAD isn't a valid option, since HEAD isn't updated for checkouts until after the working tree is written and archives can accept an arbitrary tree. In other situations, HEAD will usually reflect the refname of the branch in current use. We pass a smaller amount of data in other cases, such as git cat-file, where we can really only logically know about the blob. This commit updates only the parts of the checkout code where we don't use unpack_trees. That function and callers of it will be handled in a future commit. In the archive code, we leak a small amount of memory, since nothing we pass in the archiver argument structure is freed. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16convert: permit passing additional metadata to filter processesbrian m. carlson1-1/+1
There are a variety of situations where a filter process can make use of some additional metadata. For example, some people find the ident filter too limiting and would like to include the commit or the branch in their smudged files. This information isn't available during checkout as HEAD hasn't been updated at that point, and it wouldn't be available in archives either. Let's add a way to pass this metadata down to the filter. We pass the blob we're operating on, the treeish (preferring the commit over the tree if one exists), and the ref we're operating on. Note that we won't pass this information in all cases, such as when renormalizing or when we're performing diffs, since it doesn't make sense in those cases. The data we currently get from the filter process looks like the following: command=smudge pathname=git.c 0000 With this change, we'll get data more like this: command=smudge pathname=git.c refname=refs/tags/v2.25.1 treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b blob=7be7ad34bd053884ec48923706e70c81719a8660 0000 There are a couple things to note about this approach. For operations like checkout, treeish will always be a commit, since we cannot check out individual trees, but for other operations, like archive, we can end up operating on only a particular tree, so we'll provide only a tree as the treeish. Similar comments apply for refname, since there are a variety of cases in which we won't have a ref. This commit wires up the code to print this information, but doesn't pass any of it at this point. In a future commit, we'll have various code paths pass the actual useful data down. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14Merge branch 'mt/use-passed-repo-more-in-funcs'Junio C Hamano1-1/+1
Some codepaths were given a repository instance as a parameter to work in the repository, but passed the_repository instance to its callees, which has been cleaned up (somewhat). * mt/use-passed-repo-more-in-funcs: sha1-file: allow check_object_signature() to handle any repo sha1-file: pass git_hash_algo to hash_object_file() sha1-file: pass git_hash_algo to write_object_file_prepare() streaming: allow open_istream() to handle any repo pack-check: use given repo's hash_algo at verify_packfile() cache-tree: use given repo's hash_algo at verify_one() diff: make diff_populate_filespec() honor its repo argument
2020-01-31diff: move diff.wsErrorHighlight to "basic" configJeff King1-8/+8
We parse diff.wsErrorHighlight in git_diff_ui_config(), meaning that it doesn't take effect for plumbing commands, only for porcelains like git-diff itself. This is mildly annoying as it means scripts like add--interactive, which produce a user-visible diff with color, don't respect the option. We could teach that script to parse the config and pass it along as --ws-error-highlight to the diff plumbing. But there's a simpler solution. It should be reasonably safe for plumbing to respect this option, as it only kicks in when color is otherwise enabled. And anybody parsing colorized output must already deal with the fact that color.diff.* may change the exact output they see; those options have been part of git_diff_basic_config() since its inception in 9a1805a872 (add a "basic" diff config callback, 2008-01-04). So we can just move it to the "basic" config, which fixes add--interactive, along with any other script in the same boat, with a very low risk of hurting any plumbing users. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>