aboutsummaryrefslogtreecommitdiffstats
path: root/xdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-13xdiff: mark unused parameter in xdl_call_hunk_func()Jeff King1-1/+1
This function is used interchangeably with xdl_emit via a function pointer, so we can't just drop the unused parameter. Mark it to silence -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13xdiff: drop unused parameter in def_ff()Jeff King1-2/+2
The def_ff() function is the default "find_func" for finding hunk headers. It has never used its "priv" argument since it was introduced in f258475a6e (Per-path attribute based hunk header selection., 2007-07-06). But back then we used a function pointer to switch between a caller-provided function and the default, so the two had to conform to the same interface. In ff2981f724 (xdiff: factor out match_func_rec(), 2016-05-28), that pointer indirection went away in favor of code which directly calls either of the two functions. So there's no need for def_ff() to retain this unused parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-20xdiff: drop unused mmfile parameters from xdl_do_patience_diff()Jeff King3-19/+9
The entry point to the patience-diff algorithm takes two mmfile_t structs with the original file contents, but it doesn't actually do anything useful with them. This is similar to the case recently cleaned up in the histogram code via f1d019071e (xdiff: drop unused mmfile parameters from xdl_do_histogram_diff(), 2022-08-19), but there's a bit more subtlety going on. We pass them into the recursive patience_diff(), which in turn passes them into fill_hashmap(), which stuffs the pointers into a struct. But the only thing which reads the struct fields is our recursion into patience_diff()! So it's unlikely that something like -Wunused-parameter could find this case: it would have to detect the circular dependency caused by the recursion (not to mention tracing across struct field assignments). But once found, it's easy to have the compiler confirm what's going on: 1. Drop the "file1" and "file2" fields from the hashmap struct definition. Remove the assignments in fill_hashmap(), and temporarily substitute NULL in the recursive call to patience_diff(). Compiling shows that no other code touched those fields. 2. Now fill_hashmap() will trigger -Wunused-parameter. Drop "file1" and "file2" from its definition and callsite. 3. Now patience_diff() will trigger -Wunused-parameter. Drop them there, too. One of the callsites is the recursion with our NULL values, so those temporary values go away. 4. Now xdl_do_patience_diff() will trigger -Wunused-parameter. Drop them there. And we're done. Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19xdiff: drop unused mmfile parameters from xdl_do_histogram_diff()Jeff King3-5/+3
These are no longer used since 9df0fc3d57 (xdiff: fix a memory leak, 2022-02-16), as the caller is expected to call xdl_prepare_env() itself. After that change the histogram code only examines the prepared xdfenv_t, not the original buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08xdiff: introduce XDL_ALLOC_GROW()Phillip Wood4-16/+33
Add a helper to grow an array. This is analogous to ALLOC_GROW() in the rest of the codebase but returns −1 on allocation failure to accommodate other users of libxdiff such as libgit2. It will also return a error if the multiplication overflows while calculating the new allocation size. Note that this keeps doubling on reallocation like the code it is replacing rather than increasing the existing size by half like ALLOC_GROW(). It does however copy ALLOC_GROW()'s trick of adding a small amount to the new allocation to avoid a lot of reallocations at small sizes. Note that xdl_alloc_grow_helper() uses long rather than size_t for `nr` and `alloc` to match the existing code. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08xdiff: introduce XDL_CALLOC_ARRAY()Phillip Wood4-14/+11
Add a helper for allocating an array and initialize the elements to zero. This is analogous to CALLOC_ARRAY() in the rest of the codebase but it returns NULL on allocation failures rather than dying to accommodate other users of libxdiff such as libgit2. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08xdiff: introduce xdl_callocPhillip Wood4-24/+15
In preparation for introducing XDL_CALLOC_ARRAY() use calloc() to obtain zeroed out memory rather than malloc() followed by memset(). To try and keep the lines a reasonable length this commit also stops casting the pointer returned by calloc() as this is unnecessary. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-08xdiff: introduce XDL_ALLOC_ARRAY()Phillip Wood4-7/+12
Add a helper to allocate an array that automatically calculates the allocation size. This is analogous to ALLOC_ARRAY() in the rest of the codebase but returns NULL if the allocation fails to accommodate other users of libxdiff such as libgit2. The helper will also return NULL if the multiplication in the allocation calculation overflows. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-20Merge branch 'ep/maint-equals-null-cocci'Junio C Hamano3-3/+3
Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02Merge branch 'ep/maint-equals-null-cocci' for maint-2.35Junio C Hamano3-3/+3
* ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano3-3/+3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-01xdiff/xmacros.h: remove unused XDL_PTRFREEÆvar Arnfjörð Bjarmason1-1/+0
This macro was added in 3443546f6ef (Use a *real* built-in diff generator, 2006-03-24), but none of the xdiff code uses it, it uses xdl_free() directly. If we need its functionality again we'll use the FREE_AND_NULL() macro added in 481df65f4f7 (git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL, 2017-06-15). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16xdiff: handle allocation failure when mergingPhillip Wood1-1/+6
Other users of xdiff such as libgit2 need to be able to handle allocation failures. These allocation failures were previously ignored. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16xdiff: refactor a functionPhillip Wood1-19/+16
Use the standard "goto out" pattern rather than repeating very similar code after checking for each error. This will simplify the next commit that starts handling allocation failures that are currently ignored. On error xdl_do_diff() frees the environment so we need to take care to avoid a double free in that case. xdl_build_script() does not assign a result unless it is successful so there is no possibility of a double free if it fails. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16xdiff: handle allocation failure in patience diffPhillip Wood1-5/+12
Other users of libxdiff such as libgit2 need to be able to handle allocation failures. As NULL is a valid return value the function signature is changed to be able report allocation failures. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16xdiff: fix a memory leakPhillip Wood3-23/+17
Although the patience and histogram algorithms initialize the environment they do not free it if there is an error. In contrast for the Myers algorithm the environment is initalized in xdl_do_diff() and it is freed if there is an error. Fix this by always initializing the environment in xdl_do_diff() and freeing it there if there is an error. Remove the comment in do_patience_diff() about the environment being freed by xdl_diff() as it is not accurate because (a) xdl_diff() does not do that if there is an error and (b) xdl_diff() is not the only caller. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-21Merge branch 'pw/xdiff-classify-record-in-histogram'Junio C Hamano3-42/+29
"diff --histogram" optimization. * pw/xdiff-classify-record-in-histogram: xdiff: drop unused flags parameter from recs_match xdiff: drop xpparam_t parameter from histogram cmp_recs() xdiff: drop CMP_ENV macro from xhistogram xdiff: simplify comparison xdiff: avoid unnecessary memory allocations diff histogram: intern strings
2021-12-04xdiff: drop unused flags parameter from recs_matchJeff King1-9/+9
Since 6b13bc3232 (xdiff: simplify comparison, 2021-11-17), we do not call xdl_recmatch() from xdiffi.c's recs_match(), so we no longer need the "flags" parameter. That in turn lets us drop the flags parameters from the group-slide functions which call it. There's no functional change here; it's just making these functions a little simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-04xdiff: drop xpparam_t parameter from histogram cmp_recs()Jeff King1-3/+2
Since 663c5ad035 (diff histogram: intern strings, 2021-11-17), our cmp_recs() does not call xdl_recmatch(), and thus no longer needs an xpparam_t struct from which to get the flags. We can drop the unused parameter from the function, as well as the macro which wraps it. There's no functional change here; it's just simplifying things (and making -Wunused-parameter happier). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-04xdiff: drop CMP_ENV macro from xhistogramJeff King1-3/+0
This macro has been unused since 43ca7530df (xdiff/xhistogram: rely on xdl_trim_ends(), 2011-08-01). The function that called it went away there, and its use in the CMP() macro was inlined. It probably should have been deleted then, but nobody noticed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-01xdiff: implement a zealous diff3, or "zdiff3"Phillip Wood2-6/+58
"zdiff3" is identical to ordinary diff3 except that it allows compaction of common lines on the two sides of history at the beginning or end of the conflict hunk. For example, the following diff3 conflict: 1 2 3 4 <<<<<< A B C D E |||||| 5 6 ====== A X C Y E >>>>>> 7 8 9 has common lines 'A', 'C', and 'E' on the two sides. With zdiff3, one would instead get the following conflict: 1 2 3 4 A <<<<<< B C D |||||| 5 6 ====== X C Y >>>>>> E 7 8 9 Note that the common lines, 'A', and 'E' were moved outside the conflict. Unlike with the two-way conflicts from the 'merge' conflictStyle, the zdiff3 conflict is NOT split into multiple conflict regions to allow the common 'C' lines to be shown outside a conflict, because zdiff3 shows the base version too and the base version cannot be reasonably split. Note also that the removing of lines common to the two sides might make the remaining text inside the conflict region match the base text inside the conflict region (for example, if the diff3 conflict had '5 6 E' on the right side of the conflict, then the common line 'E' would be moved outside and both the base and right side's remaining conflict text would be the lines '5' and '6'). This has the potential to surprise users and make them think there should not have been a conflict, but there definitely was a conflict and it should remain. Based-on-patch-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Co-authored-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18xdiff: simplify comparisonPhillip Wood1-4/+1
Now that the histogram algorithm calls xdl_classify_record() it is no longer necessary to use xdl_recmatch() to compare lines, it is sufficient just to compare the hash values. This has a negligible effect on performance. Test HEAD~1 HEAD ----------------------------------------------------------------------------- 4000.1: log -3000 (baseline) 0.19(0.12+0.07) 0.18(0.14+0.04) -5.3% 4000.2: log --raw -3000 (tree-only) 0.98(0.81+0.16) 0.98(0.79+0.18) +0.0% 4000.3: log -p -3000 (Myers) 4.81(4.23+0.56) 4.80(4.26+0.53) -0.2% 4000.4: log -p -3000 --histogram 5.83(5.11+0.70) 5.82(5.15+0.65) -0.2% 4000.5: log -p -3000 --patience 5.31(4.61+0.69) 5.30(4.54+0.75) -0.2% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18xdiff: avoid unnecessary memory allocationsPhillip Wood1-4/+7
rindex and ha are only used by xdl_cleanup_records() which is not called by the histogram or patience algorithms. The perf test results show a small reduction in run time but that is probably within the noise. Test HEAD^ HEAD ----------------------------------------------------------------------------- 4000.1: log -3000 (baseline) 0.19(0.17+0.02) 0.19(0.12+0.07) +0.0% 4000.2: log --raw -3000 (tree-only) 0.98(0.78+0.20) 0.98(0.81+0.16) +0.0% 4000.3: log -p -3000 (Myers) 4.81(4.15+0.64) 4.81(4.23+0.56) +0.0% 4000.4: log -p -3000 --histogram 5.87(5.19+0.66) 5.83(5.11+0.70) -0.7% 4000.5: log -p -3000 --patience 5.35(4.60+0.73) 5.31(4.61+0.69) -0.7% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18diff histogram: intern stringsPhillip Wood2-19/+10
Histogram is the only diff algorithm not to call xdl_classify_record(). xdl_classify_record() ensures that the hash values of two strings that are not equal differ which means that it is not necessary to use xdl_recmatch() when comparing lines, all that is necessary is to compare the hash values. This gives a 7% reduction in the runtime of "git log --patch" when using the histogram diff algorithm. Test HEAD^ HEAD ----------------------------------------------------------------------------- 4000.1: log -3000 (baseline) 0.18(0.14+0.04) 0.19(0.17+0.02) +5.6% 4000.2: log --raw -3000 (tree-only) 0.99(0.77+0.21) 0.98(0.78+0.20) -1.0% 4000.3: log -p -3000 (Myers) 4.84(4.31+0.51) 4.81(4.15+0.64) -0.6% 4000.4: log -p -3000 --histogram 6.34(5.86+0.46) 5.87(5.19+0.66) -7.4% 4000.5: log -p -3000 --patience 5.39(4.60+0.76) 5.35(4.60+0.73) -0.7% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13Merge branch 'ab/pickaxe-pcre2'Junio C Hamano2-1/+3
Rewrite the backend for "diff -G/-S" to use pcre2 engine when available. * ab/pickaxe-pcre2: (22 commits) xdiff-interface: replace discard_hunk_line() with a flag xdiff users: use designated initializers for out_line pickaxe -G: don't special-case create/delete pickaxe -G: terminate early on matching lines xdiff-interface: allow early return from xdiff_emit_line_fn xdiff-interface: prepare for allowing early return pickaxe -S: slightly optimize contains() pickaxe: rename variables in has_changes() for brevity pickaxe -S: support content with NULs under --pickaxe-regex pickaxe: assert that we must have a needle under -G or -S pickaxe: refactor function selection in diffcore-pickaxe() perf: add performance test for pickaxe pickaxe/style: consolidate declarations and assignments diff.h: move pickaxe fields together again pickaxe: die when --find-object and --pickaxe-all are combined pickaxe: die when -G and --pickaxe-regex are combined pickaxe tests: add missing test for --no-pickaxe-regex being an error pickaxe tests: test for -G, -S and --find-object incompatibility pickaxe tests: add test for "log -S" not being a regex pickaxe tests: add test for diffgrep_consume() internals ...
2021-07-08Merge branch 'ab/xdiff-bug-cleanup'Junio C Hamano1-14/+8
Code clean-up. * ab/xdiff-bug-cleanup: xdiff: use BUG(...), not xdl_bug(...)
2021-06-08xdiff: use BUG(...), not xdl_bug(...)Ævar Arnfjörð Bjarmason1-14/+8
The xdl_bug() function was introduced in e8adf23d1e (xdl_change_compact(): introduce the concept of a change group, 2016-08-22), let's use our usual BUG() function instead. We'll now have meaningful line numbers if we encounter bugs in xdiff, and less code duplication. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-14Merge branch 'pw/patience-diff-clean-up'Junio C Hamano1-11/+3
Code clean-up. * pw/patience-diff-clean-up: patience diff: remove unused variable patience diff: remove unnecessary string comparisons
2021-05-11xdiff-interface: replace discard_hunk_line() with a flagÆvar Arnfjörð Bjarmason2-1/+3
Remove the dummy discard_hunk_line() function added in 3b40a090fd4 (diff: avoid generating unused hunk header lines, 2018-11-02) in favor of having a new XDL_EMIT_NO_HUNK_HDR flag, for use along with the two existing and similar XDL_EMIT_* flags. Unlike the recently amended xdiff_emit_line_fn interface which'll be called in a loop in xdl_emit_diff(), the hunk header is only emitted once. It makes more sense to pass this as a flag than provide a dummy callback because that function may be able to skip doing certain work if it knows the caller is doing nothing with the hunk header. It would be possible to do so in the case of -U0 now, but the benefit of doing so is so small that I haven't bothered. But this leaves the door open to that, and more importantly makes the API use more intuitive. The reason we're putting a flag in the gap between 1<<0 and 1<<2 is that the old 1<<1 flag was removed in 907681e940d (xdiff: drop XDL_EMIT_COMMON, 2016-02-23) without re-ordering the remaining flags. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05patience diff: remove unused variablePhillip Wood1-3/+0
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05patience diff: remove unnecessary string comparisonsPhillip Wood1-8/+3
xdl_prepare_env() calls xdl_classify_record() which arranges for the hashes of non-matching lines to be different so lines can be tested for equality by comparing just their hashes. This reduces the time taken to calculate the diff of v2.28.0 to v2.29.0 by ~3-4%. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20diff: add -I<regex> that ignores matching changesMichał Kępień2-2/+49
Add a new diff option that enables ignoring changes whose all lines (changed, removed, and added) match a given regular expression. This is similar to the -I/--ignore-matching-lines option in standalone diff utilities and can be used e.g. to ignore changes which only affect code comments or to look for unrelated changes in commits containing a large number of automatically applied modifications (e.g. a tree-wide string replacement). The difference between -G/-S and the new -I option is that the latter filters output on a per-change basis. Use the 'ignore' field of xdchange_t for marking a change as ignored or not. Since the same field is used by --ignore-blank-lines, identical hunk emitting rules apply for --ignore-blank-lines and -I. These two options can also be used together in the same git invocation (they are complementary to each other). Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to indicate that it is logically a "sibling" of xdl_mark_ignorable_regex() rather than its "parent". Signed-off-by: Michał Kępień <michal@isc.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20merge-base, xdiff: zero out xpparam_t structuresMichał Kępień2-0/+4
xpparam_t structures are usually zero-initialized before their specific fields are assigned to, but there are three locations in the tree where that does not happen. Add the missing memset() calls in order to make initialization of xpparam_t structures consistent tree-wide and to prevent stack garbage from being used as field values. Signed-off-by: Michał Kępień <michal@isc.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-16Merge branch 'rs/xdiff-ignore-ws-w-func-context'Junio C Hamano1-0/+17
The "diff" machinery learned not to lose added/removed blank lines in the context when --ignore-blank-lines and --function-context are used at the same time. * rs/xdiff-ignore-ws-w-func-context: xdiff: unignore changes in function context
2019-12-05xdiff: unignore changes in function contextRené Scharfe1-0/+17
Changes involving only blank lines are hidden with --ignore-blank-lines, unless they appear in the context lines of other changes. This is handled by xdl_get_hunk() for context added by --inter-hunk-context, -u and -U. Function context for -W and --function-context added by xdl_emit_diff() doesn't pay attention to such ignored changes; it relies fully on xdl_get_hunk() and shows just the post-image of ignored changes appearing in function context. That's inconsistent and confusing. Improve the result of using --ignore-blank-lines and --function-context together by fully showing ignored changes if they happen to fall within function context. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-09xdiffi: fix typos and touch up commentsJohannes Schindelin1-44/+55
Inspired by the thoroughly stale https://github.com/git/git/pull/159, this patch fixes a couple of typos, rewraps and clarifies some comments. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31Merge branch 'cb/xdiff-no-system-includes-in-dot-c'Junio C Hamano3-8/+0
Compilation fix. * cb/xdiff-no-system-includes-in-dot-c: xdiff: remove duplicate headers from xpatience.c xdiff: remove duplicate headers from xhistogram.c xdiff: drop system includes in xutils.c
2019-07-29Merge branch 'jk/xdiff-clamp-funcname-context-index'Junio C Hamano1-2/+2
The internal diff machinery can be made to read out of bounds while looking for --funcion-context line in a corner case, which has been corrected. * jk/xdiff-clamp-funcname-context-index: xdiff: clamp function context indices in post-image
2019-07-28xdiff: remove duplicate headers from xpatience.cCarlo Marcelo Arenas Belón1-2/+0
92b7de93fb (Implement the patience diff algorithm, 2009-01-07) added them but were already part of xinclude.h Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-28xdiff: remove duplicate headers from xhistogram.cCarlo Marcelo Arenas Belón1-2/+0
8c912eea94 ("teach --histogram to diff", 2011-07-12) included them, but were already part of xinclude.h Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-28xdiff: drop system includes in xutils.cCarlo Marcelo Arenas Belón1-4/+0
After b46054b374 ("xdiff: use git-compat-util", 2019-04-11), two system headers added with 6942efcfa9 ("xdiff: load full words in the inner loop of xdl_hash_record", 2012-04-06) to xutils.c are no longer needed and could conflict as shown below from an OpenIndiana build: In file included from xdiff/xinclude.h:26:0, from xdiff/xutils.c:25: ./git-compat-util.h:4:0: warning: "_FILE_OFFSET_BITS" redefined #define _FILE_OFFSET_BITS 64 In file included from /usr/include/limits.h:37:0, from xdiff/xutils.c:23: /usr/include/sys/feature_tests.h:231:0: note: this is the location of the previous definition #define _FILE_OFFSET_BITS 32 Make sure git-compat-util.h is the first header (through xinclude.h) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-23xdiff: clamp function context indices in post-imageJeff King1-2/+2
After finding a function line for --function-context in the pre-image, xdl_emit_diff() calculates the equivalent line in the post-image. It assumes that the lines between changes are the same on both sides. If the option --ignore-blank-lines was also given then this is not necessarily true. Clamp the calculation results for start and end of the function context to prevent out-of-bounds array accesses. Note that this _just_ fixes the case where our mismatch sends us off the beginning of the file. There are likely other cases where our assumption causes us to go to the wrong line within the file. Nobody has developed a test case yet, and the ultimate fix is likely more complicated than this patch. But this at least prevents a segfault in the meantime. Credit for finding the bug goes to "Liu Wei of Tencent Security Xuanwu Lab". Reported-by: 刘炜 <lw17qhdz@gmail.com> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12xdiff: use xmalloc/xreallocJeff King1-2/+2
Most of xdiff uses a bare malloc() to allocate memory, and returns an error when we get NULL. However, there are a few spots which don't check the return value and may segfault, including at least xdl_merge() and xpatience.c's find_longest_common_sequence(). Let's use xmalloc() everywhere instead, so that we get a graceful die() for these cases, without having to do further auditing. This does mean the existing cases which check errors will now die() instead of returning an error up the stack. But: - that's how the rest of Git behaves already for malloc errors - all of the callers of xdi_diff(), etc, die upon seeing an error So while we might one day want to fully lib-ify the diff code and make it possible to use as part of a long-running process, we're not close to that now. And because we're just tweaking the xdl_malloc() macro here, we're not really moving ourselves any further away from that. We could, for example, simplify some of the functions which handle malloc() errors which can no longer occur. But that would probably be taking us in the wrong direction. This also makes our malloc handling more consistent with the rest of Git, including enforcing GIT_ALLOC_LIMIT and trying to reclaim pack memory when needed. Reported-by: 王健强 <jianqiang.wang@securitygossip.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12xdiff: use git-compat-utilJeff King1-7/+1
Since the xdiff library was not originally part of Git, it does its own system includes. Let's instead use git-compat-util, which has two benefits: 1. It adjusts for any system-specific quirks in how or what we should include (though xdiff's needs are light enough that this hasn't been a problem in the past). 2. It lets us use wrapper functions like xmalloc(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02xdiff: provide a separate emit callback for hunksJeff King2-5/+22
The xdiff library always emits hunk header lines to our callbacks as formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're going to output a diff, but less so if we actually need to compute using those numbers, which requires re-parsing the line. In preparation for moving away from this, let's teach xdiff a new callback function which gets the broken-out hunk information. To help callers that don't want to use this new callback, if it's NULL we'll continue to format the hunk header into a string. Note that this function renames the "outf" callback to "out_line", as well. This isn't strictly necessary, but helps in two ways: 1. Now that there are two callbacks, it's nice to use more descriptive names. 2. Many callers did not zero the emit_callback_data struct, and needed to be modified to set ecb.out_hunk to NULL. By changing the name of the existing struct member, that guarantees that any new callers from in-flight topics will break the build and be examined manually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-17Merge branch 'sb/indent-heuristic-optim'Junio C Hamano1-1/+11
"git diff --indent-heuristic" had a bad corner case performance. * sb/indent-heuristic-optim: xdiff: reduce indent heuristic overhead
2018-08-15Merge branch 'sb/histogram-less-memory'Junio C Hamano1-55/+78
"git diff --histogram" had a bad memory usage pattern, which has been rearranged to reduce the peak usage. * sb/histogram-less-memory: xdiff/histogram: remove tail recursion xdiff/xhistogram: move index allocation into find_lcs xdiff/xhistogram: factor out memory cleanup into free_index() xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff
2018-08-01xdiff: reduce indent heuristic overheadStefan Beller1-1/+11
Skip searching for better indentation heuristics if we'd slide a hunk more than its size. This is the easiest fix proposed in the analysis[1] in response to a patch that mercurial took for xdiff to limit searching by a constant. Using a performance test as: #!python open('a', 'w').write(" \n" * 1000000) open('b', 'w').write(" \n" * 1000001) This patch reduces the execution of "git diff --no-index a b" from 0.70s to 0.31s. However limiting the sliding to the size of the diff hunk, which was proposed as a solution (that I found easiest to implement for now) is not optimal for cases like open('a', 'w').write(" \n" * 1000000) open('b', 'w').write(" \n" * 2000000) as then we'd still slide 1000000 times. In addition to limiting the sliding to size of the hunk, also limit by a constant. Choose 100 lines as the constant as that fits more than a screen, which really means that the diff sliding is probably not providing a lot of benefit anyway. [1] https://public-inbox.org/git/72ac1ac2-f567-f241-41d6-d0f83072e0b3@alum.mit.edu/ Reported-by: Jun Wu <quark@fb.com> Analysis-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23xdiff/histogram: remove tail recursionStefan Beller1-6/+14
When running the same reproduction script as the previous patch, it turns out the stack is too small, which can be easily avoided. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: move index allocation into find_lcsStefan Beller1-43/+53
This fixes a memory issue when recursing a lot, which can be reproduced as seq 1 100000 >one seq 1 4 100000 >two git diff --no-index --histogram one two Before this patch, histogram_diff would call itself recursively before calling free_index, which would mean a lot of memory is allocated during the recursion and only freed afterwards. By moving the memory allocation (and its free call) into find_lcs, the memory is free'd before we recurse, such that memory is reused in the next step of the recursion instead of using new memory. This addresses only the memory pressure, not the run time complexity, that is also awful for the corner case outlined above. Helpful in understanding the code (in addition to the sparse history of this file), was https://stackoverflow.com/a/32367597 which reproduces most of the code comments of the JGit implementation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: factor out memory cleanup into free_index()Stefan Beller1-4/+9
This will be useful in the next patch as we'll introduce multiple callers. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diffStefan Beller1-11/+11
By passing the 'xpp' and 'env' argument directly to the function 'fall_back_to_classic_diff', we eliminate an occurrence of the 'index' in histogram_diff, which will prove useful in a bit. While at it, move it up in the file. This will make the diff of one of the next patches more legible. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17xdiff/xdiffi.c: remove unneeded function declarationsStefan Beller1-17/+0
There is no need to forward-declare these functions, as they are used after their implementation only. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17xdiff/xdiff.h: remove unused flagsStefan Beller1-8/+0
These flags were there since the beginning (3443546f6e (Use a *real* built-in diff generator, 2006-03-24), but were never used. Remove them. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19Merge branch 'jt/diff-anchored-patience'Junio C Hamano2-5/+41
"git diff" learned a variant of the "--patience" algorithm, to which the user can specify which 'unique' line to be used as anchoring points. * jt/diff-anchored-patience: diff: support anchoring line(s)
2017-11-28Merge branch 'rs/include-comments-before-the-function-header'Junio C Hamano1-3/+10
"git grep -W", "git diff -W" and their friends learned a heuristic to extend a pre-context beyond the line that matches the "function pattern" (aka "diff.*.xfuncname") to include a comment block, if exists, that immediately precedes it. * rs/include-comments-before-the-function-header: grep: show non-empty lines before functions with -W grep: update boundary variable for pre-context t7810: improve check of -W with user-defined function lines xdiff: show non-empty lines before functions with -W xdiff: factor out is_func_rec() t4051: add test for comments preceding function lines
2017-11-28diff: support anchoring line(s)Jonathan Tan2-5/+41
Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result. The end user can use this by specifying "--anchored=<text>" one or more times when using Git commands like "diff" and "show". Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-27Merge branch 'jc/ignore-cr-at-eol'Junio C Hamano2-12/+52
The "diff" family of commands learned to ignore differences in carriage return at the end of line. * jc/ignore-cr-at-eol: diff: --ignore-cr-at-eol xdiff: reassign xpparm_t.flags bits
2017-11-21xdiff: show non-empty lines before functions with -WRené Scharfe1-0/+3
Non-empty lines before a function definition are most likely comments for that function and thus relevant. Include them in function context. Such a non-empty line might also belong to the preceeding function if there is no separating blank line. Stop extending the context upwards also at the next function line to make sure only one extra function body is shown at most. Original-patch-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21xdiff: factor out is_func_rec()René Scharfe1-3/+7
Add a helper for checking if a given record is a function line. It frees callers from having to deal with the buffer arguments of match_func_rec(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-09Replace Free Software Foundation address in license noticesTodd Zullinger14-28/+28
The mailing address for the FSF has changed over the years. Rather than updating the address across all files, refer readers to gnu.org, as the GNU GPL documentation now suggests for license notices. The mailing address is retained in the full license files (COPYING and LGPL-2.1). The old address is still present in t/diff-lib/COPYING. This is intentional, as the file is used in tests and the contents are not expected to change. Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08diff: --ignore-cr-at-eolJunio C Hamano2-3/+39
A new option --ignore-cr-at-eol tells the diff machinery to treat a carriage-return at the end of a (complete) line as if it does not exist. Just like other "--ignore-*" options to ignore various kinds of whitespace differences, this will help reviewing the real changes you made without getting distracted by spurious CRLF<->LF conversion made by your editor program. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> [jch: squashed in command line completion by Dscho] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-27xdiff: reassign xpparm_t.flags bitsJunio C Hamano1-10/+14
We have packed the bits too tightly in such a way that it is not easy to add a new type of whitespace ignoring option, a new type of LCS algorithm, or a new type of post-cleanup heuristics. Reorder bits a bit to give room for these three classes of options to grow. Also make use of XDF_WHITESPACE_FLAGS macro where we check any of these bits are on, instead of using DIFF_XDL_TST() macro on individual possibilities. That way, the "is any of the bits on?" code does not have to change when we add more ways to ignore whitespaces. While at it, add a comment in front of the bit definitions to clarify in which structure these defined bits may appear. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-10cleanup: fix possible overflow errors in binary searchDerrick Stolee1-1/+1
A common mistake when writing binary search is to allow possible integer overflow by using the simple average: mid = (min + max) / 2; Instead, use the overflow-safe version: mid = min + (max - min) / 2; This translation is safe since the operation occurs inside a loop conditioned on "min < max". The included changes were found using the following git grep: git grep '/ *2;' '*.c' Making this cleanup will prevent future review friction when a new binary search is contructed based on existing code. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-15xdiff -W: relax end-of-file function detectionVegard Nossum1-8/+6
When adding a new function to the end of a file, it's enough to know that 1) the addition is at the end of the file; and 2) there is a function _somewhere_ in there. If we had simply been changing the end of an existing function, then we would also be deleting something from the old version. This fixes the case where we add e.g. // Begin of dummy static int dummy(void) { } to the end of the file. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10Merge branch 'jc/retire-compaction-heuristics'Junio C Hamano2-35/+1
"git diff" and its family had two experimental heuristics to shift the contents of a hunk to make the patch easier to read. One of them turns out to be better than the other, so leave only the "--indent-heuristic" option and remove the other one. * jc/retire-compaction-heuristics: diff: retire "compaction" heuristics
2016-12-23diff: retire "compaction" heuristicsJunio C Hamano2-35/+1
When a patch inserts a block of lines, whose last lines are the same as the existing lines that appear before the inserted block, "git diff" can choose any place between these existing lines as the boundary between the pre-context and the added lines (adjusting the end of the inserted block as appropriate) to come up with variants of the same patch, and some variants are easier to read than others. We have been trying to improve the choice of this boundary, and Git 2.11 shipped with an experimental "compaction-heuristic". Since then another attempt to improve the logic further resulted in a new "indent-heuristic" logic. It is agreed that the latter gives better result overall, and the former outlived its usefulness. Retire "compaction", and keep "indent" as an experimental feature. The latter hopefully will be turned on by default in a future release, but that should be done as a separate step. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06xdiff: drop XDL_FAST_HASHJeff King1-106/+0
The xdiff code hashes every line of both sides of a diff, and then compares those hashes to find duplicates. The overall performance depends both on how fast we can compute the hashes, but also on how many hash collisions we see. The idea of XDL_FAST_HASH is to speed up the hash computation. But the generated hashes have worse collision behavior. This means that in some cases it speeds diffs up (running "git log -p" on git.git improves by ~8% with it), but in others it can slow things down. One pathological case saw over a 100x slowdown[1]. There may be a better hash function that covers both properties, but in the meantime we are better off with the original hash. It's slightly slower in the common case, but it has fewer surprising pathological cases. [1] http://public-inbox.org/git/20141222041944.GA441@peff.net/ Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03Merge branch 'mh/diff-indent-heuristic'Junio C Hamano1-7/+7
Clean-up for a recently graduated topic. * mh/diff-indent-heuristic: xdiff: rename "struct group" to "struct xdlgroup"
2016-09-29Merge branch 'rs/xdiff-merge-overlapping-hunks-for-W-context' into maintJunio C Hamano1-1/+1
"git diff -W" output needs to extend the context backward to include the header line of the current function and also forward to include the body of the entire current function up to the header line of the next one. This process may have to merge to adjacent hunks, but the code forgot to do so in some cases. * rs/xdiff-merge-overlapping-hunks-for-W-context: xdiff: fix merging of hunks with -W context and -u context
2016-09-27xdiff: rename "struct group" to "struct xdlgroup"Jeff King1-7/+7
Commit e8adf23 (xdl_change_compact(): introduce the concept of a change group, 2016-08-22) added a "struct group" type to xdiff/xdiffi.c. But the POSIX system header "grp.h" already defines "struct group" (it is part of the getgrnam interface). This happens to work because the new type is local to xdiffi.c, and the xdiff code includes a relatively small set of system headers. But it will break compilation if xdiff ever switches to using git-compat-util.h. It can also probably cause confusion with tools that look at the whole code base, like coccinelle or ctags. Let's resolve by giving the xdiff variant a scoped name, which is closer to other xdiff types anyway (e.g., xdlfile_t, though note that xdiff is fond if typedefs when Git usually is not). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26Merge branch 'mh/diff-indent-heuristic'Junio C Hamano2-98/+538
Output from "git diff" can be made easier to read by selecting which lines are common and which lines are added/deleted intelligently when the lines before and after the changed section are the same. A command line option is added to help with the experiment to find a good heuristics. * mh/diff-indent-heuristic: blame: honor the diff heuristic options and config parse-options: add parse_opt_unknown_cb() diff: improve positioning of add/delete blocks in diffs xdl_change_compact(): introduce the concept of a change group recs_match(): take two xrecord_t pointers as arguments is_blank_line(): take a single xrecord_t as argument xdl_change_compact(): only use heuristic if group can't be matched xdl_change_compact(): fix compaction heuristic to adjust ixo
2016-09-21Merge branch 'rs/xdiff-merge-overlapping-hunks-for-W-context'Junio C Hamano1-1/+1
"git diff -W" output needs to extend the context backward to include the header line of the current function and also forward to include the body of the entire current function up to the header line of the next one. This process may have to merge to adjacent hunks, but the code forgot to do so in some cases. * rs/xdiff-merge-overlapping-hunks-for-W-context: xdiff: fix merging of hunks with -W context and -u context
2016-09-19diff: improve positioning of add/delete blocks in diffsMichael Haggerty2-0/+326
Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: | repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 | | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) | | alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) | | angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) | | animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | | ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | * | bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | * | corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) | | couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | * | cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | * | discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) | | docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | * | electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) | | git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | * | gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) | | ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | * | junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | * | lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) | | magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) | | neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | * | phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | * | react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) | | rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) | | spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) | | tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) | | test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | * | test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | * | xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | * | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) | | totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) | | totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) | In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got *wrong*. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. This patch adds a new command-line option `--indent-heuristic`, and a new configuration setting `diff.indentHeuristic`, that activate this heuristic. This interface is only meant for testing purposes, and should be finalized before including this change in any release. [1] https://github.com/mhagger/diff-slider-tools Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-14xdiff: fix merging of hunks with -W context and -u contextRené Scharfe1-1/+1
If the function context for a hunk (with -W) reaches the beginning of the next hunk then we need to merge these two -- otherwise we'd show some lines twice, which looks strange and even confuses git apply. We already do this checking and merging in xdl_emit_diff(), but forget to consider regular context (with -u or -U). Fix that by merging hunks already if function context of the first one touches or overlaps regular context of the second one. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07xdiff: remove unneeded declarationsStefan Beller1-9/+0
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23xdl_change_compact(): introduce the concept of a change groupMichael Haggerty1-90/+203
The idea of xdl_change_compact() is fairly simple: * Proceed through groups of changed lines in the file to be compacted, keeping track of the corresponding location in the "other" file. * If possible, slide the group up and down to try to give the most aesthetically pleasing diff. Whenever it is slid, the current location in the other file needs to be adjusted. But these simple concepts are obfuscated by a lot of index handling that is written in terse, subtle, and varied patterns. I found it very hard to convince myself that the function was correct. So introduce a "struct group" that represents a group of changed lines in a file. Add some functions that perform elementary operations on groups: * Initialize a group to the first group in a file * Move to the next or previous group in a file * Slide a group up or down Even though the resulting code is longer, I think it is easier to understand and review. Its performance is not changed appreciably (though it would be if `group_next()` and `group_previous()` were not inlined). ...and in fact, the rewriting helped me discover another bug in the --compaction-heuristic code: The update of blank_lines was never done for the highest possible position of the group. This means that it could fail to slide the group to its highest possible position, even if that position had a blank line as its last line. So for example, it yielded the following diff: $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 A + +B + +A 2 when in fact the following diff is better (according to the rules of --compaction-heuristic): $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 +A + +B + A 2 The new code gives the bottom answer. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23recs_match(): take two xrecord_t pointers as argumentsMichael Haggerty1-7/+7
There is no reason for it to take an array and two indexes as argument, as it only accesses two elements of the array. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23is_blank_line(): take a single xrecord_t as argumentMichael Haggerty1-4/+4
There is no reason for it to take an array and index as argument, as it only accesses a single element of the array. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23xdl_change_compact(): only use heuristic if group can't be matchedMichael Haggerty1-19/+19
If the changed group of lines can be matched to a group in the other file, then that positioning should take precedence over the compaction heuristic. The old code tried the heuristic unconditionally, which cost redundant effort and also was broken if the matching code had already shifted the group higher than the blank line. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23xdl_change_compact(): fix compaction heuristic to adjust ixoMichael Haggerty1-0/+1
The code branch used for the compaction heuristic forgot to keep ixo in sync while the group was shifted. This is certainly wrong, as it causes the two counters to get out of sync. I *think* that this bug could also have caused the function to read past the end of the rchgo array, though I haven't done the work to prove it for sure. Here is my reasoning: If ixo is not decremented correctly during one iteration of the outer while loop, then it will loose sync with the ix counter. In particular, ixo will be too large. Suppose that the next iterations of the outer while loop (i.e., processing the next block of add/delete lines) don't have any sliders. Then the ixo counter would be incremented by the number of non-changed lines in xdf, which is the same as the number of non-changed lines in xdfo that *should have* followed the group that experienced the malfunction. But since ixo was too large at the end of that iteration, it will be incremented past the end of the xdfo->rchg array, and will try to read that memory illegally. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-08Merge branch 'js/ignore-space-at-eol' into maintJunio C Hamano2-3/+5
An age old bug that caused "git diff --ignore-space-at-eol" misbehave has been fixed. * js/ignore-space-at-eol: diff: fix a double off-by-one with --ignore-space-at-eol diff: demonstrate a bug with --patience and --ignore-space-at-eol
2016-07-11diff: fix a double off-by-one with --ignore-space-at-eolJohannes Schindelin2-3/+5
When comparing two lines, ignoring any whitespace at the end, we first try to match as many bytes as possible and break out of the loop only upon mismatch, to let the remainder be handled by the code shared with the other whitespace-ignoring code paths. When comparing the bytes, however, we incremented the counters always, even if the bytes did not match. And because we fall through to the space-at-eol handling at that point, it is as if that mismatch never happened. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27Merge branch 'rs/xdiff-hunk-with-func-line' into maintJunio C Hamano1-8/+57
"git show -W" (extend hunks to cover the entire function, delimited by lines that match the "funcname" pattern) used to show the entire file when a change added an entire function at the end of the file, which has been fixed. * rs/xdiff-hunk-with-func-line: xdiff: fix merging of appended hunk with -W grep: -W: don't extend context to trailing empty lines t7810: add test for grep -W and trailing empty context lines xdiff: don't trim common tail with -W xdiff: -W: don't include common trailing empty lines in context xdiff: ignore empty lines before added functions with -W xdiff: handle appended chunks better with -W xdiff: factor out match_func_rec() t4051: rewrite, add more tests
2016-06-20Merge branch 'rs/xdiff-hunk-with-func-line'Junio C Hamano1-8/+57
"git show -W" (extend hunks to cover the entire function, delimited by lines that match the "funcname" pattern) used to show the entire file when a change added an entire function at the end of the file, which has been fixed. * rs/xdiff-hunk-with-func-line: xdiff: fix merging of appended hunk with -W grep: -W: don't extend context to trailing empty lines t7810: add test for grep -W and trailing empty context lines xdiff: don't trim common tail with -W xdiff: -W: don't include common trailing empty lines in context xdiff: ignore empty lines before added functions with -W xdiff: handle appended chunks better with -W xdiff: factor out match_func_rec() t4051: rewrite, add more tests
2016-06-09xdiff: fix merging of appended hunk with -WRené Scharfe1-1/+2
When -W is given we search the lines between the end of the current context and the next change for a function line. If there is none then we merge those two hunks as they must be part of the same function. If the next change is an appended chunk we abort the search early in get_func_line(), however, because its line number is out of range. Fix that by searching from the end of the pre-image in that case instead. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31xdiff: -W: don't include common trailing empty lines in contextRené Scharfe1-0/+2
Empty lines between functions are shown by diff -W, as it considers them to be part of the function preceding them. They are not interesting in most languages. The previous patch stopped showing them in the special case of a function added at the end of a file. Stop extending context to those empty lines by skipping back over them from the start of the next function. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31xdiff: ignore empty lines before added functions with -WRené Scharfe1-2/+20
If a new function and a preceding empty line is appended, diff -W shows the previous function in full in order to provide context for that empty line. In most languages empty lines between sections are not interesting in and off themselves and showing a whole extra function for them is not what we want. Skip empty lines when checking of the appended chunk starts with a function line, thereby avoiding to extend the context just for them. Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31xdiff: handle appended chunks better with -WRené Scharfe1-3/+24
If lines are added at the end of a file, diff -W shows the whole file. That's because get_func_line() only considers the pre-image and gives up if it sees a record index beyond its end. Consider the post-image as well to see if the added lines already make up a full function. If it doesn't then search for the previous function line by starting from the bottom of the pre-image, thereby avoiding to confuse get_func_line(). Reuse the existing label called "again", as it's exactly where we need to jump to when we're done handling the pre-context, but rename it to "post_context_calculation" in order to document its new purpose better. Reported-by: Junio C Hamano <gitster@pobox.com> Initial-patch-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31xdiff: factor out match_func_rec()René Scharfe1-4/+11
Add match_func_rec(), a helper that wraps accessing a record and calling the appropriate function for checking if it contains a function line. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-06Merge branch 'jk/diff-compact-heuristic'Junio C Hamano2-4/+38
Patch output from "git diff" and friends has been tweaked to be more readable by using a blank line as a strong hint that the contents before and after it belong to a logically separate unit. * jk/diff-compact-heuristic: diff: undocument the compaction heuristic knobs for experimentation xdiff: implement empty line chunk heuristic xdiff: add recs_match helper function
2016-04-19xdiff: implement empty line chunk heuristicStefan Beller2-0/+28
In order to produce the smallest possible diff and combine several diff hunks together, we implement a heuristic from GNU Diff which moves diff hunks forward as far as possible when we find common context above and below a diff hunk. This sometimes produces less readable diffs when writing C, Shell, or other programming languages, ie: ... /* + * + * + */ + +/* ... instead of the more readable equivalent of ... +/* + * + * + */ + /* ... Implement the following heuristic to (optionally) produce the desired output. If there are diff chunks which can be shifted around, shift each hunk such that the last common empty line is below the chunk with the rest of the context above. This heuristic appears to resolve the above example and several other common issues without producing significantly weird results. However, as with any heuristic it is not really known whether this will always be more optimal. Thus, it can be disabled via diff.compactionHeuristic. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-18xdiff: add recs_match helper functionJacob Keller1-4/+10
It is a common pattern in xdl_change_compact to check that hashes and strings match. The resulting code to perform this change causes very long lines and makes it hard to follow the intention. Introduce a helper function recs_match which performs both checks to increase code readability. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-03Merge branch 'rj/xdiff-prepare-plug-leak-on-error-codepath'Junio C Hamano1-1/+2
A small memory leak in an error codepath has been plugged in xdiff code. * rj/xdiff-prepare-plug-leak-on-error-codepath: xdiff/xprepare: fix a memory leak xdiff/xprepare: use the XDF_DIFF_ALG() macro to access flag bits
2016-03-04xdiff/xprepare: fix a memory leakRamsay Jones1-0/+1
The xdl_prepare_env() function may initialise an xdlclassifier_t data structure via xdl_init_classifier(), which allocates memory to several fields, for example 'rchash', 'rcrecs' and 'ncha'. If this function later exits due to the failure of xdl_optimize_ctxs(), then this xdlclassifier_t structure, and the memory allocated to it, is not cleaned up. In order to fix the memory leak, insert a call to xdl_free_classifier() before returning. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-04xdiff/xprepare: use the XDF_DIFF_ALG() macro to access flag bitsRamsay Jones1-1/+1
Commit 307ab20b3 ("xdiff: PATIENCE/HISTOGRAM are not independent option bits", 19-02-2012) introduced the XDF_DIFF_ALG() macro to access the flag bits used to represent the diff algorithm requested. In addition, code which had used explicit manipulation of the flag bits was changed to use the macros. However, one example of direct manipulation remains. Update this code to use the XDF_DIFF_ALG() macro. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-26Merge branch 'ps/plug-xdl-merge-leak'Junio C Hamano1-2/+7
* ps/plug-xdl-merge-leak: xdiff/xmerge: fix memory leak in xdl_merge
2016-02-26Merge branch 'jk/no-diff-emit-common'Junio C Hamano2-19/+0
"git merge-tree" used to mishandle "both sides added" conflict with its own "create a fake ancestor file that has the common parts of what both sides have added and do a 3-way merge" logic; this has been updated to use the usual "3-way merge with an empty blob as the fake common ancestor file" approach used in the rest of the system. * jk/no-diff-emit-common: xdiff: drop XDL_EMIT_COMMON merge-tree: drop generate_common strategy merge-one-file: use empty blob for add/add base
2016-02-23xdiff/xmerge: fix memory leak in xdl_mergePatrick Steinhardt1-2/+7
When building the script for the second file that is to be merged we have already allocated memory for data structures related to the first file. When we encounter an error in building the second script we only free allocated memory related to the second file before erroring out. Fix this memory leak by also releasing allocated memory related to the first file. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22xdiff: drop XDL_EMIT_COMMONJeff King2-19/+0
There are no more callers that use this mode, and none likely to be added (as our xdl_merge() eliminates the common use of it for generating 3-way merge bases). This is effectively a revert of a9ed376 (xdiff: generate "anti-diffs" aka what is common to two files, 2006-06-28), though of course trying to revert that ancient commit directly produces many textual conflicts. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-27merge-file: ensure that conflict sections match eol styleJohannes Schindelin1-14/+23
In the previous patch, we made sure that the conflict markers themselves match the end-of-line style of the input files. However, this still left out the conflicting text itself: if it lacks a trailing newline, we add one, and should add a carriage return when appropriate, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-27merge-file: let conflict markers match end-of-line style of the contextJohannes Schindelin1-4/+57
When merging files with CR/LF line endings, the conflict markers should match those, lest the output file has mixed line endings. This is particularly of interest on Windows, where some editors get *really* confused by mixed line endings. The original version of this patch by Beat Bolli respected core.eol, and a subsequent improvement by this developer also respected gitattributes. This approach was suboptimal, though: `git merge-file` was invented as a drop-in replacement for GNU merge and as such has no problem operating outside of any repository at all! Another problem with the original approach was pointed out by Junio Hamano: legacy repositories might have their text files committed using CR/LF line endings (and core.eol and the gitattributes would give us a false impression there). Therefore, the much superior approach is to simply match the context's line endings, if any. We actually do not have to look at the *entire* context at all: if the files are all LF-only, or if they all have CR/LF line endings, it is sufficient to look at just a *single* line to match that style. And if the line endings are mixed anyway, it is *still* okay to imitate just a single line's eol: we will just add to the pile of mixed line endings, and there is nothing we can do about that. So what we do is: we look at the line preceding the conflict, falling back to the line preceding that in case it was the last line and had no line ending, falling back to the first line, first in the first post-image, then the second post-image, and finally the pre-image. If we find consistent CR/LF (or undecided) end-of-line style, we match that, otherwise we use LF-only line endings for the conflict markers. Note that while it is true that there have to be at least two lines we can look at (otherwise there would be no conflict), the same is not true for line *endings*: the three files in question could all consist of a single line without any line ending, each. In this case we fall back to using LF-only. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-30git-merge-file: do not add LF at EOF while applying unrelated changeMax Kirillov1-2/+2
If 'current-file' does not contain LF at EOF, and change between 'base-file' and 'other-file' does not change any line close to EOF, the 3-way merge should not add LF to EOF. This is what 'diff3 -m' does, and seems to be a reasonable expectation. The change which introduced the behavior is cd1d61c44f. It always calls function xdl_recs_copy() for sides with add_nl == 1. In fact, it looks like the only case when this is needed is when 2 files are being union-merged, and they do not have LF at EOF (strictly speaking, the first of them). Add tests: * "merge without conflict (missing LF at EOF, away from change in the other file)" and "merge does not add LF away of change", to demonstrate the changed behavior. * "conflict at EOF without LF resolved by --union", to verify that the union-merge at the end inerts newline between versions. * some more tests which I felt like not covering the functionality well Signed-off-by: Max Kirillov <max@max630.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-16C: have space around && and || operatorsJunio C Hamano1-1/+1
Correct all hits from git grep -e '\(&&\|||\)[^ ]' -e '[^ ]\(&&\|||\)' -- '*.c' i.e. && or || operators that are followed by anything but a SP, or that follow something other than a SP or a HT, so that these operators have a SP around it when necessary. We usually refrain from making this kind of a tree-wide change in order to avoid unnecessary conflicts with other "real work" patches, but in this case, the end result does not have a potentially cumbersome tree-wide impact, while this is a tree-wide cleanup. Fixes to compat/regex/regcomp.c and xdiff/xemit.c are to replace a HT immediately after && with a SP. This is based on Felipe's patch to bultin/symbolic-ref.c; I did all the finding out what other files in the whole tree need to be fixed and did the fix and also the log message while reviewing that single liner, so any screw-ups in this version are mine. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-19diff: add --ignore-blank-lines optionAntoine Pelisse7-8/+89
The goal of the patch is to introduce the GNU diff -B/--ignore-blank-lines as closely as possible. The short option is not available because it's already used for "break-rewrites". When this option is used, git-diff will not create hunks that simply add or remove empty lines, but will still show empty lines addition/suppression if they are close enough to "valuable" changes. There are two differences between this option and GNU diff -B option: - GNU diff doesn't have "--inter-hunk-context", so this must be handled - The following sequence looks like a bug (context is displayed twice): $ seq 5 >file1 $ cat <<EOF >file2 change 1 2 3 4 5 change EOF $ diff -u -B file1 file2 --- file1 2013-06-08 22:13:04.471517834 +0200 +++ file2 2013-06-08 22:13:23.275517855 +0200 @@ -1,5 +1,7 @@ +change 1 2 + 3 4 5 @@ -3,3 +5,4 @@ 3 4 5 +change So here is a more thorough description of the option: - real changes are interesting - blank lines that are close enough (less than context size) to interesting changes are considered interesting (recursive definition) - "context" lines are used around each hunk of interesting changes - If two hunks are separated by less than "inter-hunk-context", they will be merged into one. The implementation does the "interesting changes selection" in a single pass. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12Correct common spelling mistakes in comments and testsStefano Lattarini2-2/+2
Most of these were found using Lucas De Marchi's codespell tool. Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-25Merge branch 'rs/xdiff-fast-hash-fix'Junio C Hamano1-15/+15
Fixes compilation issue on 32-bit in an earlier series.
2012-05-23xdiff: import new 32-bit version of count_masked_bytes()René Scharfe1-13/+5
Import the latest 32-bit implementation of count_masked_bytes() from Linux (arch/x86/include/asm/word-at-a-time.h). It's shorter and avoids overflows and negative numbers. This fixes test failures on 32-bit, where negative partial results had been shifted right using the "wrong" method (logical shift right instead of arithmetic short right). The compiler is free to chose the method, so it was only wrong in the sense that it didn't work as intended by us. Reported-by: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-23xdiff: avoid more compiler warnings with XDL_FAST_HASH on 32-bit machinesRené Scharfe1-1/+7
Hide literals that can cause compiler warnings for 32-bit architectures in expressions that evaluate to small numbers there. Some compilers warn that 0x0001020304050608 won't fit into a 32-bit long, others that shifting right by 56 bits clears a 32-bit value completely. The correct values are calculated in the 64-bit case, which is all that matters in this if-branch. Reported-by: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Acked-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22xdiff: avoid compiler warnings with XDL_FAST_HASH on 32-bit machinesRené Scharfe1-3/+5
Import macro REPEAT_BYTE from Linux (arch/x86/include/asm/word-at-a-time.h) to avoid 64-bit integer literals, which cause some 32-bit compilers to print warnings. Reported-by: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09xdiff: remove unused functionsRené Scharfe2-46/+0
The functions xdl_cha_first(), xdl_cha_next() and xdl_atol() are not used by us. While removing them increases the difference to the upstream version of libxdiff, it only adds a bit to the more than 600 differing lines in xutils.c (mmfile_t management was simplified significantly when the library was imported initially). Besides, if upstream modifies these functions in the future, we won't need to think about importing those changes, so in that sense it makes tracking modifications easier. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09xdiff: remove emit_func() and xdi_diff_hunks()René Scharfe2-6/+1
The functions are unused now, remove them. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-09xdiff: add hunk_func()René Scharfe2-0/+22
Add a way to register a callback function that is gets passed the start line and line count of each hunk of a diff. Only standard types are used. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-02Merge branch 'tr/xdiff-fast-hash'Junio C Hamano1-0/+106
Use word-at-a-time comparison to find end of line or NUL (end of buffer), borrowed from the linux-kernel discussion. By Thomas Rast * tr/xdiff-fast-hash: xdiff: choose XDL_FAST_HASH code on sizeof(long) instead of __WORDSIZE xdiff: load full words in the inner loop of xdl_hash_record
2012-05-01xdiff: choose XDL_FAST_HASH code on sizeof(long) instead of __WORDSIZEThomas Rast1-29/+23
Darwin does not define __WORDSIZE, and compiles the 32-bit code path on 64-bit systems, resulting in a totally broken git. I could not find an alternative -- other than the platform symbols (__x86_64__ etc.) -- that does the test in the preprocessor. However, we can also just test for the size of a 'long', which is what really matters here. Any compiler worth its salt will leave only the branch relevant for its platform, and indeed on Linux/GCC the numbers don't change: Test tr/darwin-xdl-fast-hash origin/next origin/master ------------------------------------------------------------------------------------------------------------------ 4000.1: log -3000 (baseline) 0.09(0.07+0.01) 0.09(0.07+0.01) -5.5%* 0.09(0.07+0.01) -4.1% 4000.2: log --raw -3000 (tree-only) 0.47(0.41+0.05) 0.47(0.40+0.05) -0.5% 0.45(0.38+0.06) -3.5%. 4000.3: log -p -3000 (Myers) 1.81(1.67+0.12) 1.81(1.67+0.13) +0.3% 1.99(1.84+0.12) +10.2%*** 4000.4: log -p -3000 --histogram 1.79(1.66+0.11) 1.80(1.67+0.11) +0.4% 1.96(1.82+0.10) +9.2%*** 4000.5: log -p -3000 --patience 2.17(2.02+0.13) 2.20(2.04+0.13) +1.3%. 2.33(2.18+0.13) +7.4%*** ------------------------------------------------------------------------------------------------------------------ Significance hints: '.' 0.1 '*' 0.05 '**' 0.01 '***' 0.001 Noticed-by: Brian Gernhardt <brian@gernhardtsoftware.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-15Merge branch 'jc/diff-algo-cleanup'Junio C Hamano5-21/+18
Resurrects the preparatory clean-up patches from another topic that was discarded, as this would give a saner foundation to build on diff.algo configuration option series. * jc/diff-algo-cleanup: xdiff: PATIENCE/HISTOGRAM are not independent option bits xdiff: remove XDL_PATCH_* macros
2012-04-09xdiff: load full words in the inner loop of xdl_hash_recordThomas Rast1-0/+112
Redo the hashing loop in xdl_hash_record in a way that loads an entire 'long' at a time, using masking tricks to see when and where we found the terminating '\n'. I stole inspiration and code from the posts by Linus Torvalds around https://lkml.org/lkml/2012/3/2/452 https://lkml.org/lkml/2012/3/5/6 His method reads the buffers in sizeof(long) increments, and may thus overrun it by at most sizeof(long)-1 bytes before it sees the final newline (or hits the buffer length check). I considered padding out all buffers by a suitable amount to "catch" the overrun, but * this does not work for mmap()'d buffers: if you map 4096+8 bytes from a 4096 byte file, accessing the last 8 bytes results in a SIGBUS on my machine; and * it would also be extremely ugly because it intrudes deep into the unpacking machinery. So I adapted it to not read beyond the buffer at all. Instead, it reads the final partial word byte-by-byte and strings it together. Then it can use the same logic as before to finish the hashing. So far we enable this only on x86_64, where it provides nice speedup for diff-related work: Test origin/next tr/xdiff-fast-hash ----------------------------------------------------------------------------- 4000.1: log -3000 (baseline) 0.07(0.05+0.02) 0.08(0.06+0.02) +14.3% 4000.2: log --raw -3000 (tree-only) 0.37(0.33+0.04) 0.37(0.32+0.04) +0.0% 4000.3: log -p -3000 (Myers) 1.75(1.65+0.09) 1.60(1.49+0.10) -8.6% 4000.4: log -p -3000 --histogram 1.73(1.62+0.09) 1.58(1.49+0.08) -8.7% 4000.5: log -p -3000 --patience 2.11(2.00+0.10) 1.94(1.80+0.11) -8.1% Perhaps other platforms could also benefit. However it does NOT work on big-endian systems! [jc: minimum style and compilation fixes] Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19xdiff: PATIENCE/HISTOGRAM are not independent option bitsJunio C Hamano5-16/+18
Because the default Myers, patience and histogram algorithms cannot be in effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not independent bits. Instead of wasting one bit per algorithm, define a few macros to access the few bits they occupy and update the code that access them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19xdiff: remove XDL_PATCH_* macrosJunio C Hamano1-5/+0
These are not used anywhere in our codebase, and the bit assignment definition is merely confusing. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-29Merge branch 'rs/diff-postimage-in-context'Junio C Hamano1-6/+6
* rs/diff-postimage-in-context: xdiff: print post-image for common records instead of pre-image
2012-01-06xdiff: print post-image for common records instead of pre-imageRené Scharfe1-6/+6
Normally it doesn't matter if we show the pre-image or th post-image for the common parts of a diff because they are the same. If white-space changes are ignored they can differ, though. The new text after applying the diff is more interesting in that case, so show that instead of the old contents. Note: GNU diff shows the pre-image. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-19Merge branch 'rs/diff-whole-function'Junio C Hamano2-16/+70
* rs/diff-whole-function: diff: add option to show whole functions as context xdiff: factor out get_func_line()
2011-10-13Merge branch 'rs/diff-cleanup-records-fix'Junio C Hamano1-3/+7
* rs/diff-cleanup-records-fix: diff: resurrect XDF_NEED_MINIMAL with --minimal Revert removal of multi-match discard heuristic in 27af01
2011-10-10diff: add option to show whole functions as contextRené Scharfe2-6/+49
Add the option -W/--function-context to git diff. It is similar to the same option of git grep and expands the context of change hunks so that the whole surrounding function is shown. This "natural" context can allow changes to be understood better. Note: GNU patch doesn't like diffs generated with the new option; it seems to expect context lines to be the same before and after changes. git apply doesn't complain. This implementation has the same shortcoming as the one in grep, namely that there is no way to explicitly find the end of a function. That means that a few lines of extra context are shown, right up to the next recognized function begins. It's already useful in its current form, though. The function get_func_line() in xdiff/xemit.c is extended to work forward as well as backward to find post-context as well as pre-context. It returns the position of the first found matching line. The func_line parameter is made optional, as we don't need it for -W. The enhanced function is then used in xdl_emit_diff() to extend the context as needed. If the added context overlaps with the next change, it is merged into the current hunk. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-10xdiff: factor out get_func_line()René Scharfe1-16/+27
Move the code to search for a function line to be shown in the hunk header into its own function and to make returning the length-limited result string easier, introduce struct func_line. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-26Revert removal of multi-match discard heuristic in 27af01René Scharfe1-3/+7
27af01d (xdiff/xprepare: improve O(n*m) performance in xdl_cleanup_records(), 2011-08-17) was supposed to be a performance boost only. However, it unexpectedly changed the behaviour of diff. Revert a part of 27af01d that removes logic that mark lines as "multi-match" (ie. dis[i] == 2). This was preventing the multi-match discard heuristic (performed in xdl_cleanup_records() and xdl_clean_mmatch()) from executing. Reported-by: Alexander Pepper <pepper@inf.fu-berlin.de> Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-06Merge branch 'rc/histogram-diff'Junio C Hamano1-0/+2
* rc/histogram-diff: xdiff/xprepare: initialise xdlclassifier_t cf in xdl_prepare_env()
2011-08-31xdiff/xprepare: initialise xdlclassifier_t cf in xdl_prepare_env()Tay Ray Chuan1-0/+2
Ensure that the xdl_free_classifier() call on xdlclassifier_t cf is safe even if xdl_init_classifier() isn't called. This may occur in the case where diff is run with --histogram and a call to, say, xdl_prepare_ctx() fails. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-17Merge branch 'rc/histogram-diff' into HEADJunio C Hamano8-125/+470
* rc/histogram-diff: xdiff/xhistogram: drop need for additional variable xdiff/xhistogram: rely on xdl_trim_ends() xdiff/xhistogram: rework handling of recursed results xdiff: do away with xdl_mmfile_next() Make test number unique xdiff/xprepare: use a smaller sample size for histogram diff xdiff/xprepare: skip classification teach --histogram to diff t4033-diff-patience: factor out tests xdiff/xpatience: factor out fall-back-diff function xdiff/xprepare: refactor abort cleanups xdiff/xprepare: use memset() Conflicts: xdiff/xprepare.c
2011-08-17xdiff/xprepare: improve O(n*m) performance in xdl_cleanup_records()Tay Ray Chuan1-36/+50
In xdl_cleanup_records(), we see O(n*m) performance, where n is the number of records from xdf->dstart to xdf->dend, and m is the size of a bucket in xdf->rhash (<= by mlim). Here, we improve this to O(n) by pre-computing nm (in rcrec->len(1|2)) in xdl_classify_record(). Reported-by: Marat Radchenko <marat@slonopotamus.org> Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: drop need for additional variableTay Ray Chuan1-5/+4
Having an additional variable (ptr) instead of changing line(1|2) and count(1|2) was for debugging purposes. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: rely on xdl_trim_ends()Tay Ray Chuan1-27/+4
Do away with reduce_common_start_end() and use xdf->dstart and xdf->dend set by xdl_trim_ends() that similarly tells us where the first unmatched line from the start and end occurs. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: rework handling of recursed resultsTay Ray Chuan1-6/+9
Previously we were over-complicating matters by trying to combine the recursed results. Now, terminate immediately if a recursive call failed and return its result. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-03xdiff: do away with xdl_mmfile_next()Tay Ray Chuan3-20/+2
Given our simple mmfile structure, xdl_mmfile_next() calls are redundant. Do away with calls to them. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12xdiff/xprepare: use a smaller sample size for histogram diffTay Ray Chuan3-10/+17
For histogram diff, we can afford a smaller sample size and thus a poorer estimate of the number of lines, as the hash table (rhash) won't be filled up/grown. This is safe as the final count of lines (xdf.nrecs) will be updated correctly anyway by xdl_prepare_ctx(). This gives us a small boost in performance. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12xdiff/xprepare: skip classificationTay Ray Chuan1-8/+16
xdiff performs "classification" of records (xdl_classify_record()), replacing hashes (xrecord_t.ha) with a unique identifier of the record/line and building a hash table (xrecord_t.rhash) of records. This is then used to "cleanup" records (xdl_cleanup_records()). We don't need any of that in histogram diff, so we omit calls to these functions. We also skip allocating memory to the hash table, rhash, as it is no longer used. This gives us a small boost in performance. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12teach --histogram to diffTay Ray Chuan4-0/+390
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show that it is faster than its --patience cousin, as well as the default Meyers algorithm. The implementation has been reworked to use structs and pointers, instead of bitmasks, thus doing away with JGit's 2^28 line limit. We also use xdiff's default hash table implementation (xdl_hash_bits() with XDL_HASHLONG()) for convenience. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-07xdiff/xpatience: factor out fall-back-diff functionTay Ray Chuan3-25/+35
This is in preparation for the histogram diff algorithm, which will also re-use much of the code to call the default Meyers diff algorithm. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-07xdiff/xprepare: refactor abort cleanupsTay Ray Chuan1-59/+32
Group free()'s that are called when a malloc() fails in xdl_prepare_ctx(), making for more readable code. Also add a free() on ha, in case future git hackers add allocs after the ha malloc. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-07xdiff/xprepare: use memset()Tay Ray Chuan1-7/+3
Use memset() instead of a for loop to initialize. This could give a performance advantage. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-17Merge branch 'cb/diff-fname-optim'Junio C Hamano1-24/+14
* cb/diff-fname-optim: diff: avoid repeated scanning while looking for funcname do not search functions for patch ID add rebase patch id tests
2010-10-06xdiff: cast arguments for ctype functions to unsigned charJonathan Nieder3-10/+11
The ctype functions isspace(), isalnum(), et al take an integer argument representing an unsigned character, or -1 for EOF. On platforms with a signed char, it is unsafe to pass a char to them without casting it to unsigned char first. Most of git is already shielded against this by the ctype implementation in git-compat-util.h, but xdiff, which uses libc ctype.h, ought to be fixed. Noticed-by: der Mouse <mouse@Rodents-Montreal.ORG> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-30diff: avoid repeated scanning while looking for funcnameRené Scharfe1-24/+14
For each hunk, xdl_find_func searches the preimage for a function name until the beginning of the file. If the file does not contain any function names, this search has complexity O(n^2) in the number of hunks n. Instead, inline xdl_find_func() and keep track of up to which line we have scanned already and the contents of the last funcname line that we have found. Noticed and a different approach proposed by Clemens Buchacher. This alternative solution was done by René Scharfe. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-05xdiff: optimise for no whitespace difference when ignoring whitespace.Dylan Reid1-1/+3
In xdl_recmatch, do the memcmp to check if the two lines are equal before checking if whitespace flags are set. If the lines are identical, then there is no need to check if they differ only in whitespace. This makes the common case (there is no whitespace difference) faster. It costs the case where lines are the same length and contain whitespace differences, but the common case is more than 20% faster. Signed-off-by: Dylan Reid <dgreid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-01xdiff/xmerge.c: use memset() instead of explicit for-loopAlexey Mahotkin1-9/+8
memset() is heavily optimized, and resulting assembler code is about 150 lines less for that file. Signed-off-by: Alexey Mahotkin <squadette@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-20xdl_merge(): move file1 and file2 labels to xmparam structureJonathan Nieder2-10/+14
The labels for the three participants in a potential conflict are all optional arguments for the xdiff merge routine; if they are NULL, then xdl_merge() can cope by omitting the labels from its output. Move them to the xmparam structure to allow new callers to save some keystrokes where they are not needed. This also has the virtue of making the xdiff merge interface more similar to merge_trees, which might make it easier to learn. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-20xdl_merge(): add optional ancestor label to diff3-style outputJonathan Nieder2-2/+15
The ‘git checkout --conflict=diff3’ command can be used to present conflicts hunks including text from the common ancestor: <<<<<<< ours ourside ||||||| original ======= theirside >>>>>>> theirs The added information is helpful for resolving merges by hand, and merge tools can usually grok it because it is very similar to the output from diff3 -m. A subtle change can help more tools to understand the output. ‘diff3’ includes the name of the merge base on the ||||||| line of the output, and some tools misparse the conflict hunks without it. Add a new xmp->ancestor parameter to xdl_merge() for use with conflict style XDL_MERGE_DIFF3 as a label on the ||||||| line for any conflict hunks. If xmp->ancestor is NULL, the output format is unchanged. Thus, this change only provides unexposed plumbing for the new feature; it does not affect the outward behavior of git. Requested-by: Stefan Monnier <monnier@iro.umontreal.ca> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Bert Wesarg <Bert.Wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-02refactor merge flags into xmparam_tBert Wesarg2-12/+11
Include the merge level, favor, and style flags into the xmparam_t struct. This removes the bit twiddling with these three values into the one flags parameter. Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-02make union merge an xdl merge favorBert Wesarg2-7/+14
The current union merge driver is implemented as an post process. But the xdl_merge code is quite capable to produce the result by itself. Therefore move it there. Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-20Merge branch 'jc/conflict-marker-size'Junio C Hamano2-10/+25
* jc/conflict-marker-size: rerere: honor conflict-marker-size attribute rerere: prepare for customizable conflict marker length conflict-marker-size: new attribute rerere: use ll_merge() instead of using xdl_merge() merge-tree: use ll_merge() not xdl_merge() xdl_merge(): allow passing down marker_size in xmparam_t xdl_merge(): introduce xmparam_t for merge specific parameters git_attr(): fix function signature Conflicts: builtin-merge-file.c ll-merge.c xdiff/xdiff.h xdiff/xmerge.c
2010-01-16xdl_merge(): allow passing down marker_size in xmparam_tJunio C Hamano2-8/+18
This allows the callers of xdl_merge() to pass marker_size (defaults to 7) in xmparam_t argument, to use conflict markers of non-default length. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16xdl_merge(): introduce xmparam_t for merge specific parametersJunio C Hamano2-2/+7
So far we have only needed to be able to pass an option that is generic to xdiff family of functions to this function. Extend the interface so that we can give it merge specific parameters. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-29git-merge-file --ours, --theirsJunio C Hamano2-3/+14
Sometimes people want their conflicting merges autoresolved by favouring upstream changes. The standard answer they are given is to run "git diff --name-only | xargs git checkout MERGE_HEAD --" in such a case. This is to accept automerge results for the paths that are fully resolved automatically, while taking their version of the file in full for paths that have conflicts. This is problematic on two counts. One is that this is not exactly what these people want. It discards all changes they did on their branch for any paths that conflicted. They usually want to salvage as much automerge result as possible in a conflicted file, and want to take the upstream change only in the conflicted part. This patch teaches two new modes of operation to the lowest-lever merge machinery, xdl_merge(). Instead of leaving the conflicted lines from both sides enclosed in <<<, ===, and >>> markers, the conflicts are resolved favouring our side or their side of changes. A larger problem is that this tends to encourage a bad workflow by allowing people to record such a mixed up half-merged result as a full commit without auditing. This commit does not tackle this issue at all. In git, we usually give long enough rope to users with strange wishes as long as the risky features are not enabled by default, and this is such a risky feature. Signed-off-by: Avery Pennarun <apenwarr@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31Merge branch 'tf/diff-whitespace-incomplete-line'Junio C Hamano1-33/+53
* tf/diff-whitespace-incomplete-line: xutils: Fix xdl_recmatch() on incomplete lines xutils: Fix hashing an incomplete line with whitespaces at the end
2009-08-23xutils: Fix xdl_recmatch() on incomplete linesJunio C Hamano1-31/+49
Thell Fowler noticed that various "ignore whitespace" options to git diff do not work well on an incomplete line. The loop control of the function responsible for these bugs was extremely difficult to follow. This patch restructures the loops for three variants of "ignore whitespace" logic. The basic idea of the re-written logic is: - A loop runs while the characters from both strings we are looking at match. We declare unmatch immediately when we find something that does not match and return false from the function. We break out of the loop if we ran out of either side of the string. The way we skip spaces inside this loop varies depending on the style of ignoring whitespaces. - After the above loop breaks, we know that the parts of the strings we inspected so far match, ignoring the whitespaces. The lines can match only if the remainder consists of nothing but whitespaces. This part of the logic is shared across all three styles. The new code is more obvious and should be much easier to follow. Tested-by: Thell Fowler <git@tbfowler.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23xutils: Fix hashing an incomplete line with whitespaces at the endJunio C Hamano1-2/+4
Upon seeing a whitespace, xdl_hash_record_with_whitespace() first skipped the run of whitespaces (excluding LF) that begins there, ensuring that the pointer points at the last whitespace character in the run, and assumed that the next character must be LF at the end of the line. This does not work when hashing an incomplete line, which lacks the LF at the end. Introduce "at_eol" variable that is true when either we are at the end of line (looking at LF) or at the end of an incomplete line, and use that instead throughout the code. Noticed by Thell Fowler. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-22refactor: use bitsizeof() instead of 8 * sizeof()Pierre Habouzit1-1/+1
Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-02Merge branch 'cb/maint-1.6.0-xdl-merge-fix' into maintJunio C Hamano1-16/+15
* cb/maint-1.6.0-xdl-merge-fix: Change xdl_merge to generate output even for null merges t6023: merge-file fails to output anything for a degenerate merge Conflicts: xdiff/xmerge.c
2009-05-24Change xdl_merge to generate output even for null mergesCharles Bailey1-16/+15
xdl_merge used to have a check to ensure that there was at least some change in one or other side being merged but this suppressed output for the degenerate case when base, local and remote contents were all identical. Removing this check enables correct output in the degenerate case and xdl_free_script handles freeing NULL scripts so there is no need to have the check for these calls. Signed-off-by: Charles Bailey <charles@hashpling.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-22Fix typos / spelling in commentsMike Ralphson1-1/+1
Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-15Fix various dead stores found by the clang static analyzerBenjamin Kramer2-4/+3
http-push.c::finish_request(): request is initialized by the for loop index-pack.c::free_base_data(): b is initialized by the for loop merge-recursive.c::process_renames(): move compare to narrower scope, and remove unused assignments to it remove unused variable renames2 xdiff/xdiffi.c::xdl_recs_cmp(): remove unused variable ec xdiff/xemit.c::xdl_emit_diff(): xche is always overwritten Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-05Merge branch 'kc/maint-diff-bwi-fix' into maintJunio C Hamano1-2/+4
* kc/maint-diff-bwi-fix: Fix combined use of whitespace ignore options to diff test more combinations of ignore-whitespace options to diff
2009-01-23Merge branch 'js/patience-diff'Junio C Hamano5-1/+389
* js/patience-diff: bash completions: Add the --patience option Introduce the diff option '--patience' Implement the patience diff algorithm Conflicts: contrib/completion/git-completion.bash
2009-01-21Merge branch 'kc/maint-diff-bwi-fix'Junio C Hamano1-2/+4
* kc/maint-diff-bwi-fix: Fix combined use of whitespace ignore options to diff
2009-01-19Fix combined use of whitespace ignore options to diffKeith Cascio1-2/+4
The code used to misbehave when options to ignore certain whitespaces (-w -b and --ignore-at-eol) were combined. Signed-off-by: Keith Cascio <keith@cs.ucla.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-07Implement the patience diff algorithmJohannes Schindelin5-1/+389
The patience diff algorithm produces slightly more intuitive output than the classic Myers algorithm, as it does not try to minimize the number of +/- lines first, but tries to preserve the lines that are unique. To this end, it first determines lines that are unique in both files, then the maximal sequence which preserves the order (relative to both files) is extracted. Starting from this initial set of common lines, the rest of the lines is handled recursively, with Myers' algorithm as a fallback when the patience algorithm fails (due to no common unique lines). This patch includes memory leak fixes by Pierre Habouzit. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-29diff: add option to show context between close hunksRené Scharfe2-1/+3
Merge two hunks if there is only the specified number of otherwise unshown context between them. For --inter-hunk-context=1, the resulting patch has the same number of lines but shows uninterrupted context instead of a context header line in between. Patches generated with this option are easier to read but are also more likely to conflict if the file to be patched contains other changes. This patch keeps the default for this option at 0. It is intended to just make the feature available in order to see its advantages and downsides. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-02xdiff: give up scanning similar lines earlyDavide Libenzi1-2/+13
In a corner case of large files whose lines do not match uniquely, the loop to eliminate a line that matches multiple locations adjacent to a run of lines that do not uniquely match wasted too much cycles. Fix this by giving up early after scanning 100 lines in both direction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12Merge branch 'dl/xdiff'Junio C Hamano1-2/+13
* dl/xdiff: xdiff: give up scanning similar lines early
2008-11-08xdiff: give up scanning similar lines earlyDavide Libenzi1-2/+13
In a corner case of large files whose lines do not match uniquely, the loop to eliminate a line that matches multiple locations adjacent to a run of lines that do not uniquely match wasted too much cycles. Fix this by giving up early after scanning 100 lines in both direction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-25Allow alternate "low-level" emit function from xdl_diffBrian Downing4-3/+8
For some users (e.g. git blame), getting textual patch output is just extra work, as they can get all the information they need from the low- level diff structures. Allow for an alternate low-level emit function to be defined to allow bypassing the textual patch generation; set xemitconf_t's emit_func member to enable this. The (void (*)()) type is pretty ugly, but the alternative would be to include most of the private xdiff headers in xdiff.h to get the types required for the "proper" function prototype. Also, a (void *) won't work, as ANSI C doesn't allow a function pointer to be cast to an object pointer. Signed-off-by: Brian Downing <bdowning@lavos.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30xmerge.c: "diff3 -m" style clips merge reduction level to EAGER or lessJunio C Hamano1-0/+9
When showing a conflicting merge result, and "--diff3 -m" style is asked for, this patch makes sure that the merge reduction level does not exceed XDL_MERGE_EAGER. This is because "diff3 -m" style output would not make sense for anything more aggressive than XDL_MERGE_EAGER, because of the way how the merge reduction works. "git merge-file" no longer has to force MERGE_EAGER when "--diff3" is asked for because of this change. Suppose a common ancestor (shared preimage) is modified to postimage #1 and #2 (each letter represents one line): ##### postimage#1: 1234ABCDE789 | / | / preimage: 123456789 | \ postimage#2: 1234AXYE789 #### XDL_MERGE_MINIMAL and XDL_MERGE_EAGER would: (1) find the s/56/ABCDE/ done on one side and s/56/AXYE/ done on the other side, (2) notice that they touch an overlapping area, and (3) mark it as a conflict, "ABCDE vs AXYE". The difference between the two algorithms is that EAGER drops the hunk altogether if the postimages match (i.e. both sides modified the same way), while MINIMAL keeps it. There is no other operation performed to the hunk. As the result, lines marked with "#" in the above picure will be in the RCS merge style output like this (letters <, = and > represent conflict marker lines): output: 1234<ABCDE=AXYE>789 ; with MINIMAL/EAGER The part from the preimage that corresponds to these conflicting changes is "56", which is what "diff3 -m" style output adds to it: output: 1234<ABCDE|56=AXYE>789 ; in "diff3 -m" style Now, XDL_MERGE_ZEALOUS looks at the differences between the changes two postimages made in order to reduce the number of lines in the conflicting regions. It notices that both sides start their new contents with "A", and excludes it from the output (it also excludes "E" for the same reason). The conflict that used to be "ABCDE vs AXYE" is now "BCD vs XY": output: 1234A<BCD=XY>E789 ; with ZEALOUS There could even be matching parts between two postimages in the middle. Instead of one side rewriting the shared "56" to "ABCDE" and the other side to "AXYE", imagine the case where the postimages are "ABCDE" and "AXCYE", in which case instead of having one conflicted hunk "BCD vs XY", you would have two conflicting hunks "B vs X" and "D vs Y". In either case, once you reduce "ABCDE vs AXYE" to "BCD vs XY" (or "ABCDE vs AXCYE" to "B vs X" and "D vs Y"), there is no part from the preimage that corresponds to the conflicting change made in both postimages anymore. In other words, conflict reduced by ZEALOUS algorithm cannot be expressed in "diff3 -m" style. Representing the last illustration like this is misleading to say the least: output: 1234A<BCD|56=XY>E789 ; broken "diff3 -m" style because the preimage was not ...4A56E... to begin with. "A" and "E" are common only between the postimages. Even worse, once a single conflicting hunk is split into multiple ones (recall the example of breaking "ABCDE vs AXCYE" to "B vs X" and "D vs Y"), there is no sane way to distribute the preimage text across split conflicting hunks. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30xmerge.c: minimum readability fixupsJunio C Hamano1-7/+7
This replaces hardcoded magic constants with symbolic ones for readability, and swaps one if/else blocks to better match the order in which 0/1/2 variables are handled to nearby codepath. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30xdiff-merge: optionally show conflicts in "diff3 -m" styleJunio C Hamano2-19/+90
When showing conflicting merges, we traditionally followed RCS's merge output format. The output shows: <<<<<<< postimage from one side; ======= postimage of the other side; and >>>>>>> Some poeple find it easier to be able to understand what is going on when they can view the common ancestor's version, which is used by "diff3 -m", which shows: <<<<<<< postimage from one side; ||||||| shared preimage; ======= postimage of the other side; and >>>>>>> This is an initial step to bring that as an optional feature to git. Only "git merge-file" has been converted, with "--diff3" option. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30xdl_fill_merge_buffer(): separate out a too deeply nested functionJunio C Hamano1-51/+70
This simply moves code around to make a separate function that prepares a single conflicted hunk with markers into the buffer. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-18xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUMJohannes Schindelin2-4/+28
When a merge conflicts, there are often common lines that are not really common, such as empty lines or lines containing a single curly bracket. With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a hunk does not contain any letters or digits, it is treated as conflicting. In other words, a conflict which used to look like this: <<<<<<< a = 1; ======= output(); >>>>>>> } } } <<<<<<< output(); ======= b = 1; >>>>>>> will look like this with ZEALOUS_ALNUM: <<<<<<< a = 1; } } } output(); ======= output(); } } } b = 1; >>>>>>> To demonstrate this, git-merge-file has been switched from XDL_MERGE_ZEALOUS to XDL_MERGE_ZEALOUS_ALNUM. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-18xdl_merge(): make XDL_MERGE_ZEALOUS output simplerJohannes Schindelin1-1/+46
When a merge conflicts, there are often less than three common lines between two conflicting regions. Since a conflict takes up as many lines as are conflicting, plus three lines for the commit markers, the output will be shorter (and thus, simpler) in this case, if the common lines will be merged into the conflicting regions. This patch merges up to three common lines into the conflicts. For example, what looked like this before this patch: <<<<<<< if (a == 1) ======= if (a != 0) >>>>>>> { int i; <<<<<<< a = 0; ======= a = !a; >>>>>>> will now look like this: <<<<<<< if (a == 1) { int i; a = 0; ======= if (a != 0) { int i; a = !a; >>>>>>> Suggested Linus (based on ideas by "Voltage Spike" -- if that name is real, it is mighty cool). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-15Remove unreachable statementsGuido Ostkamp2-4/+0
Solaris Workshop Compiler found a few unreachable statements. Signed-off-by: Guido Ostkamp <git@ostkamp.fastmail.fm> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-06Per-path attribute based hunk header selection.Junio C Hamano2-14/+27
This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) *.java funcname=java *.perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-08Missing statics.Pierre Habouzit1-2/+2
Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano12-12/+0
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-03-19xdiff/xutils.c(xdl_hash_record): factor out whitespace handlingJohannes Schindelin1-2/+20
Since in at least one use case, xdl_hash_record() takes over 15% of the CPU time, it makes sense to even micro-optimize it. For many cases, no whitespace special handling is needed, and in these cases we should not even bother to check for whitespace in _every_ iteration of the loop. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-13teach diff machinery about --ignore-space-at-eolJohannes Schindelin2-1/+26
`git diff --ignore-space-at-eol` will ignore whitespace at the line ends. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-30Fix yet another subtle xdl_merge() bugJohannes Schindelin1-2/+5
In very obscure cases, a merge can hit an unexpected code path (where the original code went as far as saying that this was a bug). This failing merge was noticed by Alexandre Juillard. The problem is that the original file contains something like this: -- snip -- two non-empty lines before two empty lines after two empty lines -- snap -- and this snippet is reduced to _one_ empty line in _both_ new files. However, it is ambiguous as to which hunk takes the empty line: the first or the second one? Indeed in Alexandre's example files, the xdiff algorithm attributes the empty line to the first hunk in one case, and to the second hunk in the other case. (Trimming down the example files _changes_ that behaviour!) Thus, the call to xdl_merge_cmp_lines() has no chance to realize that the change is actually identical in both new files. Therefore, xdl_refine_conflicts() finds an empty diff script, which was not expected there, because (the original author of xdl_merge() thought) xdl_merge_cmp_lines() would catch that case earlier. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-28xdl_merge(): fix a segmentation fault when refining conflictsJohannes Schindelin1-0/+4
The function xdl_refine_conflicts() tries to break down huge conflicts by doing a diff on the conflicting regions. However, this does not make sense when one side is empty. Worse, when one side is not only empty, but after EOF, the code accessed unmapped memory. Noticed by Luben Tuikov, Shawn Pearce and Alexandre Julliard, the latter providing a test case. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-12Merge branch 'master' into js/mergeJunio C Hamano1-1/+2
* master: (42 commits) git-svn: correctly handle packed-refs in refs/remotes/ add test case for recursive merge git-svn: correctly display fatal() error messages git-svn: allow dcommit to take an alternate head git-svn: enable logging of information not supported by git Clarify fetch error for missing objects. Move Fink and Ports check to after config file shortlog: fix segfault on empty authorname shortlog: remove "[PATCH]" prefix from shortlog output Make sure the empty tree exists when needed in merge-recursive. Don't use memcpy when source and dest. buffers may overlap no need to install manpages as executable Documentation: simpler shared repository creation shortlog: fix segfault on empty authorname Add branch.*.merge warning and documentation update Fix perl/ build. git-svn: use do_switch for --follow-parent if the SVN library supports it Fix documentation copy&paste typo git-svn: extra error check to ensure we open a file correctly Documentation: update git-clone man page with new behavior ...
2006-12-05xdl_merge(): fix and simplify conflict handlingJohannes Schindelin1-16/+5
Suppose you have changes in new1 to the original lines 10-20, and changes in new2 to the original lines 15-25, then the changes to 10-25 conflict. But it is possible that the next changes in new1 still overlap with this change to new2. So, in the next iteration we have to look at the same change to new2 again. The old code tried to be a bit too clever. The new code is shorter and more to the point: do not fiddle with the ranges at all. Also, xdl_append_merge() tries harder to combine conflicts. This is necessary, because with the above simplification, some conflicts would not be recognized as conflicts otherwise: In the above scenario, it is possible that there is no other change to new1. Absent the combine logic, the change in new2 would be recorded _again_, but as a non-conflict. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2006-12-04diff -b: ignore whitespace at end of lineJohannes Schindelin1-1/+2
This is _not_ the same as "treat eol as whitespace", since that would mean that multiple empty lines would be treated as equal to e.g. a space. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-02xdl_merge(): fix thinkoJohannes Schindelin1-2/+2
If one side's block (of changed lines) ends later than the other side's block, the former should be tested against the next block of the other side, not vice versa. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-02xdl_merge(): fix an off-by-one bugJohannes Schindelin1-5/+5
The line range is i1 .. (i1 + chg1 - 1), not i1 .. (i1 + chg1). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-02xmerge: make return value from xdl_merge() more usable.Junio C Hamano1-10/+7
The callers would want to know if the resulting merge is clean; do not discard that information away after calling xdl_do_merge(). Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-02xdiff: add xdl_merge()Johannes Schindelin4-2/+442
This new function implements the functionality of RCS merge, but in-memory. It returns < 0 on error, otherwise the number of conflicts. Finding the conflicting lines can be a very expensive task. You can control the eagerness of this algorithm: - a level value of 0 means that all overlapping changes are treated as conflicts, - a value of 1 means that if the overlapping changes are identical, it is not treated as a conflict. - If you set level to 2, overlapping changes will be analyzed, so that almost identical changes will not result in huge conflicts. Rather, only the conflicting lines will be shown inside conflict markers. With each increasing level, the algorithm gets slower, but more accurate. Note that the code for level 2 depends on the simple definition of mmfile_t specific to git, and therefore it will be harder to port that to LibXDiff. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-23Increase length of function name bufferAndy Parkins1-1/+1
In xemit.c:xdl_emit_diff() a buffer for showing the function name as commentary is allocated; this buffer was 40 characters. This is a bit small; particularly for C++ function names where there is often an identical prefix (like void LongNamespace::LongClassName) on multiple functions, which makes the context the same everywhere. In other words the context is useless. This patch increases that buffer to 80 characters - which may still not be enough, but is better Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-25xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of linesPetr Baudis1-2/+1
This removes the '#' and '(' tests and adds a '$' test instead although I have no idea what it is actually good for - but hey, if that's what GNU diff does... Pasky only went and did as Junio sayeth. Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-23xdiff/xemit.c (xdl_find_func): Elide trailing white space in a context header.Jim Meyering1-1/+1
This removes trailing blanks from git-generated diff headers the same way a similar patch did that for GNU diff: http://article.gmane.org/gmane.comp.gnu.utils.bugs/13839 That is, it removes trailing blanks on the hunk header line that shows the function name. Signed-off-by: Jim Meyering <jim@meyering.net> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-16Merge branch 'maint'Junio C Hamano1-2/+3
* maint: Fix hash function in xdiff library
2006-10-16Fix hash function in xdiff libraryv1.4.2.4Linus Torvalds1-2/+3
Jim Mayering noticed that xdiff library took insanely long time when comparing files with many identical lines. This was because the hash function used in the library is broken on 64-bit architectures and caused too many collisions. http://thread.gmane.org/gmane.comp.version-control.git/28962/focus=28994 Acked-by: Davide Libenzi <davidel@xmaliserver.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-12diff: fix 2 whitespace issuesJohannes Schindelin1-17/+12
When whitespace or whitespace change was ignored, the function xdl_recmatch() returned memcmp() style differences, which is wrong, since it should return 0 on non-match. Also, there were three horrible off-by-one bugs, even leading to wrong hashes in the whitespace special handling. The issue was noticed by Ray Lehtiniemi. For good measure, this commit adds a test. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-12Merge branch 'lt/merge-tree'Junio C Hamano2-0/+19
* lt/merge-tree: Improved three-way blob merging code Prepare "git-merge-tree" for future work xdiff: generate "anti-diffs" aka what is common to two files
2006-07-10Fix more typos, primarily in the codePavel Roskin1-3/+3
The only visible change is that git-blame doesn't understand "--compability" anymore, but it does accept "--compatibility" instead, which is already documented. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net>