aboutsummaryrefslogtreecommitdiffstats
path: root/environment.c
AgeCommit message (Collapse)AuthorFilesLines
2024-04-05Merge branch 'jk/core-comment-string'Junio C Hamano1-1/+1
core.commentChar used to be limited to a single byte, but has been updated to allow an arbitrary multi-byte sequence. * jk/core-comment-string: config: add core.commentString config: allow multi-byte core.commentChar environment: drop comment_line_char compatibility macro wt-status: drop custom comment-char stringification sequencer: handle multi-byte comment characters when writing todo list find multi-byte comment chars in unterminated buffers find multi-byte comment chars in NUL-terminated strings prefer comment_line_str to comment_line_char for printing strbuf: accept a comment string for strbuf_add_commented_lines() strbuf: accept a comment string for strbuf_commented_addf() strbuf: accept a comment string for strbuf_stripspace() environment: store comment_line_char as a string strbuf: avoid shadowing global comment_line_char name commit: refactor base-case of adjust_comment_line_char() strbuf: avoid static variables in strbuf_add_commented_lines() strbuf: simplify comment-handling in add_lines() helper config: forbid newline as core.commentChar
2024-03-12environment: store comment_line_char as a stringJeff King1-1/+1
We'd like to eventually support multi-byte comment prefixes, but the comment_line_char variable is referenced in many spots, making the transition difficult. Let's start by storing the character in a NUL-terminated string. That will let us switch code over incrementally to the string format, and we can easily support the existing code with a macro wrapper (since we'll continue to allow only a single-byte prefix, this will behave identically). Once all references to the "char" variable have been converted, we can drop it and enable longer strings. We'll still have to touch all of the spots that create or set the variable in this patch, but there are only a few (reading the config, and the "auto" character selector). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07Merge branch 'jc/no-lazy-fetch'Junio C Hamano1-0/+3
"git --no-lazy-fetch cmd" allows to run "cmd" while disabling lazy fetching of objects from the promisor remote, which may be handy for debugging. * jc/no-lazy-fetch: git: extend --no-lazy-fetch to work across subprocesses git: document GIT_NO_REPLACE_OBJECTS environment variable git: --no-lazy-fetch option
2024-02-27git: extend --no-lazy-fetch to work across subprocessesJunio C Hamano1-0/+3
Modeling after how the `--no-replace-objects` option is made usable across subprocess spawning (e.g., cURL based remote helpers are spawned as a separate process while running "git fetch"), allow the `--no-lazy-fetch` option to be passed across process boundaries. Do not model how the value of GIT_NO_REPLACE_OBJECTS environment variable is ignored, though. Just use the usual git_env_bool() to allow "export GIT_NO_LAZY_FETCH=0" and "unset GIT_NO_LAZY_FETCH" to be equivalents. Also do not model how the request is not propagated to subprocesses we spawn (e.g. "git clone --local" that spawns a new process to work in the origin repository, while the original one working in the newly created one) by the "--no-replace-objects" option, as this "do not lazily fetch from the promisor" is more about a per-request debugging aid, not "this repository's promisor should not be relied upon" property specific to a repository. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: use git_config_string() for core.checkRoundTripEncodingJeff King1-1/+1
Since this code path was recently converted to check for a NULL value, it now behaves exactly like git_config_string(). We can shorten the code a bit by using that helper. Note that git_config_string() takes a const pointer, but our storage variable is non-const. We're better off making this "const", though, since the default value points to a string literal (and thus it would be an error if anybody tried to write to it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02max_tree_depth: lower it for MSVC to avoid stack overflowsJohannes Schindelin1-1/+14
There seems to be some internal stack overflow detection in MSVC's `malloc()` machinery that seems to be independent of the `stack reserve` and `heap reserve` sizes specified in the executable (editable via `EDITBIN /STACK:<n> <exe>` and `EDITBIN /HEAP:<n> <exe>`). In the newly test cases added by `jk/tree-name-and-depth-limit`, this stack overflow detection is unfortunately triggered before Git can print out the error message about too-deep trees and exit gracefully. Instead, it exits with `STATUS_STACK_OVERFLOW`. This corresponds to the numeric value -1073741571, something the MSYS2 runtime we sadly need to use to run Git's test suite cannot handle and which it internally maps to the exit code 127. Git's test suite, in turn, mistakes this to mean that the command was not found, and fails both test cases. Here is an example stack trace from an example run: [0x0] ntdll!RtlpAllocateHeap+0x31 0x4212603f50 0x7ff9d6d4cd49 [0x1] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126041b0 0x7ff9d6e14512 [0x2] ntdll!RtlDebugAllocateHeap+0x102 0x42126042b0 0x7ff9d6dcd8b0 [0x3] ntdll!RtlpAllocateHeap+0x7ec70 0x4212604350 0x7ff9d6d4cd49 [0x4] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126045b0 0x7ff9596ed480 [0x5] ucrtbased!heap_alloc_dbg_internal+0x210 0x42126046b0 0x7ff9596ed20d [0x6] ucrtbased!heap_alloc_dbg+0x4d 0x4212604750 0x7ff9596f037f [0x7] ucrtbased!_malloc_dbg+0x2f 0x42126047a0 0x7ff9596f0dee [0x8] ucrtbased!malloc+0x1e 0x42126047d0 0x7ff730fcc1ef [0x9] git!do_xmalloc+0x2f 0x4212604800 0x7ff730fcc2b9 [0xa] git!do_xmallocz+0x59 0x4212604840 0x7ff730fca779 [0xb] git!xmallocz_gently+0x19 0x4212604880 0x7ff7311b0883 [0xc] git!unpack_compressed_entry+0x43 0x42126048b0 0x7ff7311ac9a4 [0xd] git!unpack_entry+0x554 0x42126049a0 0x7ff7311b0628 [0xe] git!cache_or_unpack_entry+0x58 0x4212605250 0x7ff7311ad3a8 [0xf] git!packed_object_info+0x98 0x42126052a0 0x7ff7310a92da [0x10] git!do_oid_object_info_extended+0x3fa 0x42126053b0 0x7ff7310a44e7 [0x11] git!oid_object_info_extended+0x37 0x4212605460 0x7ff7310a38ba [0x12] git!repo_read_object_file+0x9a 0x42126054a0 0x7ff7310a6147 [0x13] git!read_object_with_reference+0x97 0x4212605560 0x7ff7310b4656 [0x14] git!fill_tree_descriptor+0x66 0x4212605620 0x7ff7310dc0a5 [0x15] git!traverse_trees_recursive+0x3f5 0x4212605680 0x7ff7310dd831 [0x16] git!unpack_callback+0x441 0x4212605790 0x7ff7310b4c95 [0x17] git!traverse_trees+0x5d5 0x42126058a0 0x7ff7310dc0f2 [0x18] git!traverse_trees_recursive+0x442 0x4212605980 0x7ff7310dd831 [0x19] git!unpack_callback+0x441 0x4212605a90 0x7ff7310b4c95 [0x1a] git!traverse_trees+0x5d5 0x4212605ba0 0x7ff7310dc0f2 [0x1b] git!traverse_trees_recursive+0x442 0x4212605c80 0x7ff7310dd831 [0x1c] git!unpack_callback+0x441 0x4212605d90 0x7ff7310b4c95 [0x1d] git!traverse_trees+0x5d5 0x4212605ea0 0x7ff7310dc0f2 [0x1e] git!traverse_trees_recursive+0x442 0x4212605f80 0x7ff7310dd831 [0x1f] git!unpack_callback+0x441 0x4212606090 0x7ff7310b4c95 [0x20] git!traverse_trees+0x5d5 0x42126061a0 0x7ff7310dc0f2 [0x21] git!traverse_trees_recursive+0x442 0x4212606280 0x7ff7310dd831 [...] [0xfad] git!cmd_main+0x2a2 0x42126ff740 0x7ff730fb6345 [0xfae] git!main+0xe5 0x42126ff7c0 0x7ff730fbff93 [0xfaf] git!wmain+0x2a3 0x42126ff830 0x7ff731318859 [0xfb0] git!invoke_main+0x39 0x42126ff8a0 0x7ff7313186fe [0xfb1] git!__scrt_common_main_seh+0x12e 0x42126ff8f0 0x7ff7313185be [0xfb2] git!__scrt_common_main+0xe 0x42126ff960 0x7ff7313188ee [0xfb3] git!wmainCRTStartup+0xe 0x42126ff990 0x7ff9d5ed257d [0xfb4] KERNEL32!BaseThreadInitThunk+0x1d 0x42126ff9c0 0x7ff9d6d6aa78 [0xfb5] ntdll!RtlUserThreadStart+0x28 0x42126ff9f0 0x0 I verified manually that `traverse_trees_cur_depth` was 562 when that happened, which is far below the 2048 that were already accepted into Git as a hard limit. Despite many attempts to figure out which of the internals trigger this `STATUS_STACK_OVERFLOW` and how to maybe increase certain sizes to avoid running into this issue and let Git behave the same way as under Linux, I failed to find any build-time/runtime knob we could turn to that effect. Note: even switching to using a different allocator (I used mimalloc because that's what Git for Windows uses for its GCC builds) does not help, as the zlib code used to unpack compressed pack entries _still_ uses the regular `malloc()`. And runs into the same issue. Note also: switching to using a different allocator _also_ for zlib code seems _also_ not to help. I tried that, and it still exited with `STATUS_STACK_OVERFLOW` that seems to have been triggered by a `mi_assert_internal()`, i.e. an internal assertion of mimalloc... So the best bet to work around this for now seems to just lower the maximum allowed tree depth _even further_ for MSVC builds. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31lower core.maxTreeDepth default to 2048Jeff King1-1/+1
On my Linux system, all of our recursive tree walking algorithms can run up to the 4096 default limit without segfaulting. But not all platforms will have stack sizes as generous (nor might even Linux if we kick off a recursive walk within a thread). In particular, several of the tests added in the previous few commits fail in our Windows CI environment. Through some guess-and-check pushing, I found that 3072 is still too much, but 2048 is OK. These are obviously vague heuristics, and there is nothing to promise that another system might not have trouble at even lower values. But it seems unlikely anybody will be too angry about a 2048-depth limit (this is close to the default max-pathname limit on Linux even for a pathological path like "a/a/a/..."). So let's just lower it. Some alternatives are: - configure separate defaults for Windows versus other platforms. - just skip the tests on Windows. This leaves Windows users with the annoying case that they can be crashed by running out of stack space, but there shouldn't be any security implications (they can't go deep enough to hit integer overflow problems). Since the original default was arbitrary, it seems less confusing to just lower it, keeping behavior consistent across platforms. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31add core.maxTreeDepth configJeff King1-0/+1
Most of our tree traversal algorithms use recursion to visit sub-trees. For pathologically large trees, this can cause us to run out of stack space and abort in an uncontrolled way. Let's put our own limit here so that we can fail gracefully rather than segfaulting. In similar cases where we recursed along the commit graph, we rewrote the algorithms to avoid recursion and keep any stack data on the heap. But the commit graph is meant to grow without bound, whereas it's not an imposition to put a limit on the maximum size of tree we'll handle. And this has a bonus side effect: coupled with a limit on individual tree entry names, this limits the total size of a path we may encounter. This gives us an extra protection against code handling long path names which may suffer from integer overflows in the size (which could then be exploited by malicious trees). The default of 4096 is set to be much longer than anybody would care about in the real world. Even with single-letter interior tree names (like "a/b/c"), such a path is at least 8191 bytes. While most operating systems will let you create such a path incrementally, trying to reference the whole thing in a system call (as Git would do when actually trying to access it) will result in ENAMETOOLONG. Coupled with the recent fsck.largePathname warning, the maximum total pathname Git will handle is (by default) 16MB. This config option doesn't do anything yet; future patches will convert various algorithms to respect the limit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-28Merge branch 'rs/pack-objects-parseopt-fix'Junio C Hamano1-1/+1
Command line parser fix. * rs/pack-objects-parseopt-fix: pack-objects: fix --no-quiet pack-objects: fix --no-keep-true-parents
2023-07-21pack-objects: fix --no-keep-true-parentsRené Scharfe1-1/+1
Since 99fb6e04cb (pack-objects: convert to use parse_options(), 2012-02-01) git pack-objects has accepted --no-keep-true-parents, but this option does the same as --keep-true-parents. That's because it's defined using OPT_SET_INT with a value of 0, which sets 0 when negated as well. Turn --no-keep-true-parents into the opposite of --keep-true-parents by using OPT_BOOL and storing the option's status directly in a variable named "grafts_keep_true_parents" instead of in negative form in "grafts_replace_parents". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan1-1/+0
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29Merge branch 'en/header-split-cache-h-part-3'Junio C Hamano1-2/+3
Header files cleanup. * en/header-split-cache-h-part-3: (28 commits) fsmonitor-ll.h: split this header out of fsmonitor.h hash-ll, hashmap: move oidhash() to hash-ll object-store-ll.h: split this header out of object-store.h khash: name the structs that khash declares merge-ll: rename from ll-merge git-compat-util.h: remove unneccessary include of wildmatch.h builtin.h: remove unneccessary includes list-objects-filter-options.h: remove unneccessary include diff.h: remove unnecessary include of oidset.h repository: remove unnecessary include of path.h log-tree: replace include of revision.h with simple forward declaration cache.h: remove this no-longer-used header read-cache*.h: move declarations for read-cache.c functions from cache.h repository.h: move declaration of the_index from cache.h merge.h: move declarations for merge.c from cache.h diff.h: move declaration for global in diff.c from cache.h preload-index.h: move declarations for preload-index.c from elsewhere sparse-index.h: move declarations for sparse-index.c from cache.h name-hash.h: move declarations for name-hash.c from cache.h run-command.h: move declarations for run-command.c from cache.h ...
2023-06-22Merge branch 'ds/disable-replace-refs'Junio C Hamano1-2/+1
Introduce a mechanism to disable replace refs globally and per repository. * ds/disable-replace-refs: repository: create read_replace_refs setting replace-objects: create wrapper around setting repository: create disable_replace_refs()
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren1-0/+1
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren1-1/+1
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12repository: create read_replace_refs settingDerrick Stolee1-1/+0
The 'read_replace_refs' global specifies whether or not we should respect the references of the form 'refs/replace/<oid>' to replace which object we look up when asking for '<oid>'. This global has caused issues when it is not initialized properly, such as in b6551feadfd (merge-tree: load default git config, 2023-05-10). To make this more robust, move its config-based initialization out of git_default_config and into prepare_repo_settings(). This provides a repository-scoped version of the 'read_replace_refs' global. The global still has its purpose: it is disabled process-wide by the GIT_NO_REPLACE_OBJECTS environment variable or by a call to disable_replace_refs() in some specific Git commands. Since we already encapsulated the use of the constant inside replace_refs_enabled(), we can perform the initialization inside that method, if necessary. This solves the problem of forgetting to check the config, as we will check it before returning this value. Due to this encapsulation, the global can move to be static within replace-object.c. There is an interesting behavior change possible here: we now have a repository-scoped understanding of this config value. Thus, if there was a command that recurses into submodules and might follow replace refs, then it would now respect the core.useReplaceRefs config value in each repository. 'git grep --recurse-submodules' is such a command that recurses into submodules in-process. We can demonstrate the granularity of this config value via a test in t7814. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12repository: create disable_replace_refs()Derrick Stolee1-1/+1
Several builtins depend on being able to disable the replace references so we actually operate on each object individually. These currently do so by directly mutating the 'read_replace_refs' global. A future change will move this global into a different place, so it will be necessary to change all of these lines. However, we can simplify that transition by abstracting the purpose of these global assignments with a method call. We will need to keep this read_replace_refs global forever, as we want to make sure that we never use replace refs throughout the life of the process if this method is called. Future changes may present a repository-scoped version of the variable to represent that repository's core.useReplaceRefs config value, but a zero-valued read_replace_refs will always override such a setting. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26repository: move 'repository_format_worktree_config' to repo scopeVictoria Dye1-1/+0
Move 'repository_format_worktree_config' out of the global scope and into the 'repository' struct. This change is similar to how 'repository_format_partial_clone' was moved in ebaf3bcf1ae (repository: move global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository' struct and updating 'setup.c' & 'repository.c' functions to assign the value appropriately. The primary goal of this change is to be able to load the worktree config of a submodule depending on whether that submodule - not its superproject - has 'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()' has access to the newly repo-scoped configuration, add a 'struct repository' argument to 'do_git_config_sequence()' and pass it the 'repo' value from 'config_with_options()'. Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to verify 'extensions.worktreeConfig' is read an used independently by superprojects and submodules. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24treewide: remove cache.h inclusion due to previous changesElijah Newren1-1/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24ws.h: move declarations for ws.c functions from cache.hElijah Newren1-1/+0
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11pager.h: move declarations for pager.c functions from cache.hElijah Newren1-1/+0
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11object-file.h: move declarations for object-file.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on convert.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11treewide: be explicit about dependence on trace.h & trace2.hElijah Newren1-0/+1
Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21write-or-die.h: move declarations for write-or-die.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21setup.h: move declarations for setup.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21wrapper.h: move declarations for wrapper.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21abspath.h: move absolute path functions from cache.hElijah Newren1-0/+1
This is another step towards letting us remove the include of cache.h in strbuf.c. It does mean that we also need to add includes of abspath.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23replace-object.h: move read_replace_refs declaration from cache.h to hereElijah Newren1-0/+1
Adjust several files to be more explicit about their dependency on replace-objects to accommodate this change. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-26read-tree: add "--super-prefix" option, eliminate globalÆvar Arnfjörð Bjarmason1-13/+0
The "--super-prefix" option to "git" was initially added in [1] for use with "ls-files"[2], and shortly thereafter "submodule--helper"[3] and "grep"[4]. It wasn't until [5] that "read-tree" made use of it. At the time [5] made sense, but since then we've made "ls-files" recurse in-process in [6], "grep" in [7], and finally "submodule--helper" in the preceding commits. Let's also remove it from "read-tree", which allows us to remove the option to "git" itself. We can do this because the only remaining user of it is the submodule API, which will now invoke "read-tree" with its new "--super-prefix" option. It will only do so when the "submodule_move_head()" function is called. That "submodule_move_head()" function was then only invoked by "read-tree" itself, but now rather than setting an environment variable to pass "--super-prefix" between cmd_read_tree() we: - Set a new "super_prefix" in "struct unpack_trees_options". The "super_prefixed()" function in "unpack-trees.c" added in [5] will now use this, rather than get_super_prefix() looking up the environment variable we set earlier in the same process. - Add the same field to the "struct checkout", which is only needed to ferry the "super_prefix" in the "struct unpack_trees_options" all the way down to the "entry.c" callers of "submodule_move_head()". Those calls which used the super prefix all originated in "cmd_read_tree()". The only other caller is the "unlink_entry()" caller in "builtin/checkout.c", which now passes a "NULL". 1. 74866d75793 (git: make super-prefix option, 2016-10-07) 2. e77aa336f11 (ls-files: optionally recurse into submodules, 2016-10-07) 3. 89c86265576 (submodule helper: support super prefix, 2016-12-08) 4. 0281e487fd9 (grep: optionally recurse into submodules, 2016-12-16) 5. 3d415425c7b (unpack-trees: support super-prefix option, 2017-01-17) 6. 188dce131fa (ls-files: use repository object, 2017-06-22) 7. f9ee2fcdfa0 (grep: recurse in-process using 'struct repository', 2017-08-02) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14Merge branch 'ab/unused-annotation'Junio C Hamano1-2/+2
Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14Merge branch 'jk/unused-annotation'Junio C Hamano1-2/+2
Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-2/+2
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19hashmap: mark unused callback parametersJeff King1-2/+2
Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05refs: use ref_namespaces for replace refs baseDerrick Stolee1-2/+1
The git_replace_ref_base global is used to store the value of the GIT_REPLACE_REF_BASE environment variable or the default of "refs/replace/". This is initialized within setup_git_env(). The ref_namespaces array is a new centralized location for information such as the ref namespace used for replace refs. Instead of having this namespace stored in two places, use the ref_namespaces array instead. For simplicity, create a local git_replace_ref_base variable wherever the global was previously used. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05refs: add array of ref namespacesDerrick Stolee1-0/+2
Git interprets different meanings to different refs based on their names. Some meanings are cosmetic, like how refs in 'refs/remotes/*' are colored differently from refs in 'refs/heads/*'. Others are more critical, such as how replace refs are interpreted. Before making behavior changes based on ref namespaces, collect all known ref namespaces into a array of ref_namespace_info structs. This array is indexed by the new ref_namespace enum for quick access. As of this change, this array is purely documentation. Future changes will add dependencies on this array. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-04Merge branch 'ds/midx-normalize-pathname-before-comparison'Junio C Hamano1-1/+1
The path taken by "git multi-pack-index" command from the end user was compared with path internally prepared by the tool withut first normalizing, which lead to duplicated paths not being noticed, which has been corrected. * ds/midx-normalize-pathname-before-comparison: cache: use const char * for get_object_directory() multi-pack-index: use --object-dir real path midx: use real paths in lookup_multi_pack_index()
2022-04-25cache: use const char * for get_object_directory()Derrick Stolee1-1/+1
The get_object_directory() method returns the exact string stored at the_repository->objects->odb->path. The return type of "char *" implies that the caller must keep track of the buffer and free() it when complete. This causes significant problems later when the ODB is accessed. Use "const char *" as the return type to avoid this confusion. There are no current callers that care about the non-const definition. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-04Merge branch 'jh/builtin-fsmonitor-part2'Junio C Hamano1-1/+0
Built-in fsmonitor (part 2). * jh/builtin-fsmonitor-part2: (30 commits) t7527: test status with untracked-cache and fsmonitor--daemon fsmonitor: force update index after large responses fsmonitor--daemon: use a cookie file to sync with file system fsmonitor--daemon: periodically truncate list of modified files t/perf/p7519: add fsmonitor--daemon test cases t/perf/p7519: speed up test on Windows t/perf/p7519: fix coding style t/helper/test-chmtime: skip directories on Windows t/perf: avoid copying builtin fsmonitor files into test repo t7527: create test for fsmonitor--daemon t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon help: include fsmonitor--daemon feature flag in version info fsmonitor--daemon: implement handle_client callback compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows fsmonitor--daemon: create token-based changed path cache fsmonitor--daemon: define token-ids fsmonitor--daemon: add pathname classification fsmonitor--daemon: implement 'start' command ...
2022-03-25Merge branch 'ns/core-fsyncmethod'Junio C Hamano1-1/+3
Replace core.fsyncObjectFiles with two new configuration variables, core.fsync and core.fsyncMethod. * ns/core-fsyncmethod: core.fsync: documentation and user-friendly aggregate options core.fsync: new option to harden the index core.fsync: add configuration parsing core.fsync: introduce granular fsync control infrastructure core.fsyncmethod: add writeout-only mode wrapper: make inclusion of Windows csprng header tightly scoped
2022-03-25fsmonitor: config settings are repository-specificJeff Hostetler1-1/+0
Move fsmonitor config settings to a new and opaque `struct fsmonitor_settings` structure. Add a lazily-loaded pointer to this into `struct repo_settings` Create an `enum fsmonitor_mode` type in `struct fsmonitor_settings` to represent the state of fsmonitor. This lets us represent which, if any, fsmonitor provider (hook or IPC) is enabled. Create `fsm_settings__get_*()` getters to lazily look up fsmonitor- related config settings. Get rid of the `core_fsmonitor` global variable. Move the code to lookup the existing `core.fsmonitor` config value into the fsmonitor settings. Create a hook pathname variable in `struct fsmonitor-settings` and only set it when in hook mode. Extend the definition of `core.fsmonitor` to be either a boolean or a hook pathname. When true, the builtin FSMonitor is used. When false or unset, no FSMonitor (neither builtin nor hook) is used. The existing `core_fsmonitor` global variable was used to store the pathname to the fsmonitor hook *and* it was used as a boolean to see if fsmonitor was enabled. This dual usage and global visibility leads to confusion when we add the IPC-based provider. So lets hide the details in fsmonitor-settings.c and let it decide which provider to use in the case of multiple settings. This avoids cluttering up repo-settings.c with these private details. A future commit in builtin-fsmonitor series will add the ability to disqualify worktrees for various reasons, such as being mounted from a remote volume, where fsmonitor should not be started. Having the config settings hidden in fsmonitor-settings.c allows such worktree restrictions to override the config values used. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10core.fsync: add configuration parsingNeeraj Singh1-1/+1
This change introduces code to parse the core.fsync setting and configure the fsync_components variable. core.fsync is configured as a comma-separated list of component names to sync. Each time a core.fsync variable is encountered in the configuration heirarchy, we start off with a clean state with the platform default value. Passing 'none' resets the value to indicate nothing will be synced. We gather all negative and positive entries from the comma separated list and then compute the new value by removing all the negative entries and adding all of the positive entries. We issue a warning for components that are not recognized so that the configuration code is compatible with configs from future versions of Git with more repo components. Complete documentation for the new setting is included in a later patch in the series so that it can be reviewed once in final form. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10core.fsync: introduce granular fsync control infrastructureNeeraj Singh1-0/+1
This commit introduces the infrastructure for the core.fsync configuration knob. The repository components we want to sync are identified by flags so that we can turn on or off syncing for specific components. If core.fsyncObjectFiles is set and the core.fsync configuration also includes FSYNC_COMPONENT_LOOSE_OBJECT, we will fsync any loose objects. This picks the strictest data integrity behavior if core.fsync and core.fsyncObjectFiles are set to conflicting values. This change introduces the currently unused fsync_component helper, which will be used by a later patch that adds fsyncing to the refs backend. Actual configuration and documentation of the fsync components list are in other patches in the series to separate review of the underlying mechanism from the policy of how it's configured. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10core.fsyncmethod: add writeout-only modeNeeraj Singh1-0/+1
This commit introduces the `core.fsyncMethod` configuration knob, which can currently be set to `fsync` or `writeout-only`. The new writeout-only mode attempts to tell the operating system to flush its in-memory page cache to the storage hardware without issuing a CACHE_FLUSH command to the storage controller. Writeout-only fsync is significantly faster than a vanilla fsync on common hardware, since data is written to a disk-side cache rather than all the way to a durable medium. Later changes in this patch series will take advantage of this primitive to implement batching of hardware flushes. When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the caller is expected to do an ordinary fsync as needed. On Apple platforms, the fsync system call does not issue a CACHE_FLUSH directive to the storage controller. This change updates fsync to do fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity with existing behavior on Apple platforms by setting the default value of the new core.fsyncMethod option. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-01repo_read_index: add config to expect files outside sparse patternsElijah Newren1-0/+1
Typically with sparse checkouts, we expect files outside the sparsity patterns to be marked as SKIP_WORKTREE and be missing from the working tree. Sometimes this expectation would be violated however; including in cases such as: * users grabbing files from elsewhere and writing them to the worktree (perhaps by editing a cached copy in an editor, copying/renaming, or even untarring) * various git commands having incomplete or no support for the SKIP_WORKTREE bit[1,2] * users attempting to "abort" a sparse-checkout operation with a not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the working tree is not atomic)[3]. When the SKIP_WORKTREE bit in the index did not reflect the presence of the file in the working tree, it traditionally caused confusion and was difficult to detect and recover from. So, in a sparse checkout, since af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present in worktree, 2022-01-14), Git automatically clears the SKIP_WORKTREE bit at index read time for entries corresponding to files that are present in the working tree. There is another workflow, however, where it is expected that paths outside the sparsity patterns appear to exist in the working tree and that they do not lose the SKIP_WORKTREE bit, at least until they get modified. A Git-aware virtual file system[4] takes advantage of its position as a file system driver to expose all files in the working tree, fetch them on demand using partial clone on access, and tell Git to pay attention to them on demand by updating the sparse checkout pattern on writes. This means that commands like "git status" only have to examine files that have potentially been modified, whereas commands like "ls" are able to show the entire codebase without requiring manual updates to the sparse checkout pattern. Thus since af6a51875a, Git with such Git-aware virtual file systems unsets the SKIP_WORKTREE bit for all files and commands like "git status" have to fetch and examine them all. Introduce a configuration setting sparse.expectFilesOutsideOfPatterns to allow limiting the tracked set of files to a small set once again. A Git-aware virtual file system or other application that wants to maintain files outside of the sparse checkout can set this in a repository to instruct Git not to check for the presence of SKIP_WORKTREE files. The setting defaults to false, so most users of sparse checkout will still get the benefit of an automatically updating index to recover from the variety of difficult issues detailed in af6a51875a for paths with SKIP_WORKTREE set despite the path being present. [1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/ [2] The three long paragraphs in the middle of https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/ [3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/ [4] such as the vfsd described in https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/ Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-03Merge branch 'ns/tmp-objdir'Junio C Hamano1-0/+9
New interface into the tmp-objdir API to help in-core use of the quarantine feature. * ns/tmp-objdir: tmp-objdir: disable ref updates when replacing the primary odb tmp-objdir: new API for creating temporary writable databases
2021-12-15Merge branch 'ew/test-wo-fsync'Junio C Hamano1-0/+1
Allow running our tests while disabling fsync. * ew/test-wo-fsync: tests: disable fsync everywhere
2021-12-08tmp-objdir: disable ref updates when replacing the primary odbNeeraj Singh1-0/+4
When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Now both mechanisms set the disable_ref_updates flag on the odb, which is queried by the ref_transaction_prepare function. Peff's test case [1] was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden. See the call to notes_cache_put in fill_textconv where errors are ignored. [1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/ Reported-by: Jeff King <peff@peff.net> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-08tmp-objdir: new API for creating temporary writable databasesNeeraj Singh1-0/+5
The tmp_objdir API provides the ability to create temporary object directories, but was designed with the goal of having subprocesses access these object stores, followed by the main process migrating objects from it to the main object store or just deleting it. The subprocesses would view it as their primary datastore and write to it. Here we add the tmp_objdir_replace_primary_odb function that replaces the current process's writable "main" object directory with the specified one. The previous main object directory is restored in either tmp_objdir_migrate or tmp_objdir_destroy. For the --remerge-diff usecase, add a new `will_destroy` flag in `struct object_database` to mark ephemeral object databases that do not require fsync durability. Add 'git prune' support for removing temporary object databases, and make sure that they have a name starting with tmp_ and containing an operation-specific name. Based-on-patch-by: Elijah Newren <newren@gmail.com> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29tests: disable fsync everywhereEric Wong1-0/+1
The "GIT_TEST_FSYNC" environment variable now exists for disabling fsync() even on packfiles and other "critical" data. Running "make test -j8 NO_SVN_TESTS=1" on a noisy 8-core system on an HDD, test runtime drops from ~4 minutes down to ~3 minutes. Using "GIT_TEST_FSYNC=1" re-enables fsync() for comparison purposes. SVN interopability tests are minimally affected since SVN will still use fsync in various places. This will also be useful for 3rd-party tools which create throwaway git repositories of temporary data, but remains undocumented for end users. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-11Merge branch 'jk/ref-paranoia'Junio C Hamano1-1/+0
The ref iteration code used to optionally allow dangling refs to be shown, which has been tightened up. * jk/ref-paranoia: refs: drop "broken" flag from for_each_fullref_in() ref-filter: drop broken-ref code entirely ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN repack, prune: drop GIT_REF_PARANOIA settings refs: turn on GIT_REF_PARANOIA by default refs: omit dangling symrefs when using GIT_REF_PARANOIA refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag refs-internal.h: reorganize DO_FOR_EACH_* flag documentation refs-internal.h: move DO_FOR_EACH_* flags next to each other t5312: be more assertive about command failure t5312: test non-destructive repack t5312: create bogus ref as necessary t5312: drop "verbose" helper t5600: provide detached HEAD for corruption failures t5516: don't use HEAD ref for invalid ref-deletion tests t7900: clean up some more broken refs
2021-10-06Merge branch 'ab/repo-settings-cleanup'Junio C Hamano1-9/+1
Code cleanup. * ab/repo-settings-cleanup: repository.h: don't use a mix of int and bitfields repo-settings.c: simplify the setup read-cache & fetch-negotiator: check "enum" values in switch() environment.c: remove test-specific "ignore_untracked..." variable wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
2021-09-27repack, prune: drop GIT_REF_PARANOIA settingsJeff King1-1/+0
Now that GIT_REF_PARANOIA is the default, we don't need to selectively enable it for destructive operations. In fact, it's harmful to do so, because it overrides any GIT_REF_PARANOIA=0 setting that the user may have provided (because they're trying to work around some corruption). With these uses gone, we can further clean up the ref_paranoia global, and make it a static variable inside the refs code. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22environment.c: remove test-specific "ignore_untracked..." variableÆvar Arnfjörð Bjarmason1-7/+0
Instead of the global ignore_untracked_cache_config variable added in dae6c322fa1 (test-dump-untracked-cache: don't modify the untracked cache, 2016-01-27) we can make use of the new facility to set config via environment variables added in d8d77153eaf (config: allow specifying config entries via envvar pairs, 2021-01-12). It's arguably a bit hacky to use setenv() and getenv() to pass messages between the same program, but since the test helpers are not the main intended audience of repo-settings.c I think it's better than hardcoding the test-only special-case in prepare_repo_settings(). This uses the xsetenv() wrapper added in the preceding commit, if we don't set these in the environment we'll fail in t7063-status-untracked-cache.sh, but let's fail earlier anyway if that were to happen. This breaks any parent process that's potentially using the GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot config setting down to a git subprocess, but in this case we don't care about the general case of such potential parents. This process neither spawns other "git" processes, nor is it interested in other configuration. We might want to pick up other test modes here, but those will be passed via GIT_TEST_* environment variables. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-22wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.cÆvar Arnfjörð Bjarmason1-2/+1
Add fatal wrappers for setenv() and unsetenv(). In d7ac12b25d3 (Add set_git_dir() function, 2007-08-01) we started checking its return value, and since 48988c4d0c3 (set_git_dir: die when setenv() fails, 2018-03-30) we've had set_git_dir_1() die if we couldn't set it. Let's provide a wrapper for both, this will be useful in many other places, a subsequent patch will make another use of xsetenv(). The checking of the return value here is over-eager according to setenv(3) and POSIX. It's documented as returning just -1 or 0, so perhaps we should be checking -1 explicitly. Let's just instead die on any non-zero, if our C library is so broken as to return something else than -1 on error (and perhaps not set errno?) the worst we'll do is die with a nonsensical errno value, but we'll want to die in either case. Let's make these return "void" instead of "int". As far as I can tell there's no other x*() wrappers that needed to make the decision of deviating from the signature in the C library, but since their return value is only used to indicate errors (so we'd die here), we can catch unreachable code such as if (xsetenv(...) < 0) [...]; I think it would be OK skip the NULL check of the "name" here for the calls to die_errno(). Almost all of our setenv() callers are taking a constant string hardcoded in the source as the first argument, and for the rest we can probably assume they've done the NULL check themselves. Even if they didn't, modern C libraries are forgiving about it (e.g. glibc formatting it as "(null)"), on those that aren't, well, we were about to die anyway. But let's include the check anyway for good measure. 1. https://pubs.opengroup.org/onlinepubs/009604499/functions/setenv.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12compression: drop write-only core_compression_* variablesRené Scharfe1-1/+0
Since 8de7eeb54b (compression: unify pack.compression configuration parsing, 2016-11-15) the variables core_compression_level and core_compression_seen are only set, but never read. Remove them. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26environment: move strbuf into block to plug leakAndrzej Hunt1-4/+3
realpath is only populated if we execute the git_work_tree_initialized block. However that block also causes us to return early, meaning we never actually release the strbuf in the case where we populated it. Therefore we move all strbuf related code into the block to guarantee that we can't leak it. LSAN output from t0095: Direct leak of 129 byte(s) in 1 object(s) allocated from: #0 0x49a9b9 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x78f585 in xrealloc wrapper.c:126:8 #2 0x713ff4 in strbuf_grow strbuf.c:98:2 #3 0x713ff4 in strbuf_getcwd strbuf.c:597:3 #4 0x4f0c18 in strbuf_realpath_1 abspath.c:99:7 #5 0x5ae4a4 in set_git_work_tree environment.c:259:3 #6 0x6fdd8a in setup_discovered_git_dir setup.c:931:2 #7 0x6fdd8a in setup_git_directory_gently setup.c:1235:12 #8 0x4cb50d in get_bloom_filter_for_commit t/helper/test-bloom.c:41:2 #9 0x4cb50d in cmd__bloom t/helper/test-bloom.c:95:3 #10 0x4caa1f in cmd_main t/helper/test-tool.c:124:11 #11 0x4caded in main common-main.c:52:11 #12 0x7f0869f02349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 129 byte(s) leaked in 1 allocation(s). It looks like this leak has existed since realpath was first added to set_git_work_tree() in: 3d7747e318 (real_path: remove unsafe API, 2020-03-10) Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15config: allow specifying config entries via envvar pairsPatrick Steinhardt1-0/+1
While we currently have the `GIT_CONFIG_PARAMETERS` environment variable which can be used to pass runtime configuration data to git processes, it's an internal implementation detail and not supposed to be used by end users. Next to being for internal use only, this way of passing config entries has a major downside: the config keys need to be parsed as they contain both key and value in a single variable. As such, it is left to the user to escape any potentially harmful characters in the value, which is quite hard to do if values are controlled by a third party. This commit thus adds a new way of adding config entries via the environment which gets rid of this shortcoming. If the user passes the `GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each `i` in `[0,n)`. While the same can be achieved with `git -c <name>=<value>`, one may wish to not do so for potentially sensitive information. E.g. if one wants to set `http.extraHeader` to contain an authentication token, doing so via `-c` would trivially leak those credentials via e.g. ps(1), which typically also shows command arguments. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15environment: make `getenv_safe()` a public functionPatrick Steinhardt1-5/+2
The `getenv_safe()` helper function helps to safely retrieve multiple environment values without the need to depend on platform-specific behaviour for the return value's lifetime. We'll make use of this function in a following patch, so let's make it available by making it non-static and adding a declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27Merge branch 'jk/leakfix'Junio C Hamano1-2/+2
Code clean-up. * jk/leakfix: submodule--helper: fix leak of core.worktree value config: fix leak in git_config_get_expiry_in_days() config: drop git_config_get_string_const() config: fix leaks from git_config_get_string_const() checkout: fix leak of non-existent branch names submodule--helper: use strbuf_release() to free strbufs clear_pattern_list(): clear embedded hashmaps
2020-08-17config: drop git_config_get_string_const()Jeff King1-2/+2
As evidenced by the leak fixes in the previous commit, the "const" in git_config_get_string_const() clearly misleads people into thinking that it does not allocate a copy of the string. We can fix this by renaming it, but it's easier still to just drop it. Of the four remaining callers: - The one in git_config_parse_expiry() still needs to allocate, since that's what its callers expect. We can just use the non-const version and cast our pointer. Slightly ugly, but the damage is contained in one spot. - The two in apply are writing to global "const char *" variables, and need to continue allocating. We often mark these as const because we assign default string literals to them. But in this case we don't do that, so we can just declare them as real "char *" pointers and use the non-const version. - The call in checkout doesn't actually need a copy; it can just use the non-allocating "tmp" version of the function. The function is also mentioned in the MyFirstContribution document. We can swap that call out for the non-allocating "tmp" variant, which fits well in the example given. We'll drop the "configset" and "repo" variants, as well (which are unused). Note that this frees up the "const" name, so we could rename the "tmp" variant back to that. But let's give some time for topics in flight to adapt to the new code before doing so (if we do it too soon, the function semantics will change but the compiler won't alert us). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30strvec: rename struct fieldsJeff King1-1/+1
The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: convert more callers away from argv_array nameJeff King1-4/+4
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: rename files from argv-array to strvecJeff King1-1/+1
This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '*.c' '*.h' | xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-13Merge branch 'tb/shallow-cleanup'Junio C Hamano1-0/+1
Code cleanup. * tb/shallow-cleanup: shallow: use struct 'shallow_lock' for additional safety shallow.h: document '{commit,rollback}_shallow_file' shallow: extract a header file for shallow-related functions commit: make 'commit_graft_pos' non-static
2020-04-30shallow: extract a header file for shallow-related functionsTaylor Blau1-0/+1
There are many functions in commit.h that are more related to shallow repositories than they are to any sort of generic commit machinery. Likely this began when there were only a few shallow-related functions, and commit.h seemed a reasonable enough place to put them. But, now there are a good number of shallow-related functions, and placing them all in 'commit.h' doesn't make sense. This patch extracts a 'shallow.h', which takes all of the declarations from 'commit.h' for functions which already exist in 'shallow.c'. We will bring the remaining shallow-related functions defined in 'commit.c' in a subsequent patch. For now, move only the ones that already are implemented in 'shallow.c', and update the necessary includes. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-10real_path: remove unsafe APIAlexandr Miloslavskiy1-1/+6
Returning a shared buffer invites very subtle bugs due to reentrancy or multi-threading, as demonstrated by the previous patch. There was an unfinished effort to abolish this [1]. Let's finally rid of `real_path()`, using `strbuf_realpath()` instead. This patch uses a local `strbuf` for most places where `real_path()` was previously called. However, two places return the value of `real_path()` to the caller. For them, a `static` local `strbuf` was added, effectively pushing the problem one level higher: read_gitfile_gently() get_superproject_working_tree() [1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/ Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-06set_git_dir: fix crash when used with real_path()Alexandr Miloslavskiy1-1/+10
`real_path()` returns result from a shared buffer, inviting subtle reentrance bugs. One of these bugs occur when invoked this way: set_git_dir(real_path(git_dir)) In this case, `real_path()` has reentrance: real_path read_gitfile_gently repo_set_gitdir setup_git_env set_git_dir_1 set_git_dir Later, `set_git_dir()` uses its now-dead parameter: !is_absolute_path(path) Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`. Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-25Merge branch 'ds/sparse-cone'Junio C Hamano1-0/+1
Management of sparsely checked-out working tree has gained a dedicated "sparse-checkout" command. * ds/sparse-cone: (21 commits) sparse-checkout: improve OS ls compatibility sparse-checkout: respect core.ignoreCase in cone mode sparse-checkout: check for dirty status sparse-checkout: update working directory in-process for 'init' sparse-checkout: cone mode should not interact with .gitignore sparse-checkout: write using lockfile sparse-checkout: use in-process update for disable subcommand sparse-checkout: update working directory in-process sparse-checkout: sanitize for nested folders unpack-trees: add progress to clear_ce_flags() unpack-trees: hash less in cone mode sparse-checkout: init and set in cone mode sparse-checkout: use hashmaps for cone patterns sparse-checkout: add 'cone' mode trace2: add region in clear_ce_flags sparse-checkout: create 'disable' subcommand sparse-checkout: add '--stdin' option to set subcommand sparse-checkout: 'set' subcommand clone: add --sparse mode sparse-checkout: create 'init' subcommand ...
2019-12-06Sync with 2.23.1Johannes Schindelin1-1/+1
* maint-2.23: (44 commits) Git 2.23.1 Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters ...
2019-12-06Sync with 2.20.2Johannes Schindelin1-1/+1
* maint-2.20: (36 commits) Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories ...
2019-12-06Sync with 2.19.3Johannes Schindelin1-1/+1
* maint-2.19: (34 commits) Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams ...
2019-12-06Sync with 2.18.2Johannes Schindelin1-1/+1
* maint-2.18: (33 commits) Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up ...
2019-12-06Sync with 2.17.3Johannes Schindelin1-1/+1
* maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ...
2019-12-06Sync with 2.16.6Johannes Schindelin1-1/+1
* maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ...
2019-12-06Sync with 2.15.4Johannes Schindelin1-1/+1
* maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...
2019-12-06Sync with 2.14.6Johannes Schindelin1-1/+1
* maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...
2019-12-05protect_ntfs: turn on NTFS protection by defaultJohannes Schindelin1-1/+1
Back in the DOS days, in the FAT file system, file names always consisted of a base name of length 8 plus a file extension of length 3. Shorter file names were simply padded with spaces to the full 8.3 format. Later, the FAT file system was taught to support _also_ longer names, with an 8.3 "short name" as primary file name. While at it, the same facility allowed formerly illegal file names, such as `.git` (empty base names were not allowed), which would have the "short name" `git~1` associated with it. For backwards-compatibility, NTFS supports alternative 8.3 short filenames, too, even if starting with Windows Vista, they are only generated on the system drive by default. We addressed the problem that the `.git/` directory can _also_ be accessed via `git~1/` (when short names are enabled) in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e. since Git v1.9.5, by introducing the config setting `core.protectNTFS` and enabling it by default on Windows. In the meantime, Windows 10 introduced the "Windows Subsystem for Linux" (short: WSL), i.e. a way to run Linux applications/distributions in a thinly-isolated subsystem on Windows (giving rise to many a "2016 is the Year of Linux on the Desktop" jokes). WSL is getting increasingly popular, also due to the painless way Linux application can operate directly ("natively") on files on Windows' file system: the Windows drives are mounted automatically (e.g. `C:` as `/mnt/c/`). Taken together, this means that we now have to enable the safe-guards of Git v1.9.5 also in WSL: it is possible to access a `.git` directory inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation was disabled manually). Since regular Linux distributions run in WSL, this means we have to enable `core.protectNTFS` at least on Linux, too. To enable Services for Macintosh in Windows NT to store so-called resource forks, NTFS introduced "Alternate Data Streams". Essentially, these constitute additional metadata that are connected to (and copied with) their associated files, and they are accessed via pseudo file names of the form `filename:<stream-name>:<stream-type>`. In a recent patch, we extended `core.protectNTFS` to also protect against accesses via NTFS Alternate Data Streams, e.g. to prevent contents of the `.git/` directory to be "tracked" via yet another alternative file name. While it is not possible (at least by default) to access files via NTFS Alternate Data Streams from within WSL, the defaults on macOS when mounting network shares via SMB _do_ allow accessing files and directories in that way. Therefore, we need to enable `core.protectNTFS` on macOS by default, too, and really, on any Operating System that can mount network shares via SMB/CIFS. A couple of approaches were considered for fixing this: 1. We could perform a dynamic NTFS check similar to the `core.symlinks` check in `init`/`clone`: instead of trying to create a symbolic link in the `.git/` directory, we could create a test file and try to access `.git/config` via 8.3 name and/or Alternate Data Stream. 2. We could simply "flip the switch" on `core.protectNTFS`, to make it "on by default". The obvious downside of 1. is that it won't protect worktrees that were clone with a vulnerable Git version already. We considered patching code paths that check out files to check whether we're running on an NTFS system dynamically and persist the result in the repository-local config setting `core.protectNTFS`, but in the end decided that this solution would be too fragile, and too involved. The obvious downside of 2. is that everybody will have to "suffer" the performance penalty incurred from calling `is_ntfs_dotgit()` on every path, even in setups where. After the recent work to accelerate `is_ntfs_dotgit()` in most cases, it looks as if the time spent on validating ten million random file names increases only negligibly (less than 20ms, well within the standard deviation of ~50ms). Therefore the benefits outweigh the cost. Another downside of this is that paths that might have been acceptable previously now will be forbidden. Realistically, though, this is an improvement because public Git hosters already would reject any `git push` that contains such file names. Note: There might be a similar problem mounting HFS+ on Linux. However, this scenario has been considered unlikely and in light of the cost (in the aforementioned benchmark, `core.protectHFS = true` increased the time from ~440ms to ~610ms), it was decided _not_ to touch the default of `core.protectHFS`. This change addresses CVE-2019-1353. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Helped-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-11-22sparse-checkout: add 'cone' modeDerrick Stolee1-0/+1
The sparse-checkout feature can have quadratic performance as the number of patterns and number of entries in the index grow. If there are 1,000 patterns and 1,000,000 entries, this time can be very significant. Create a new Boolean config option, core.sparseCheckoutCone, to indicate that we expect the sparse-checkout file to contain a more limited set of patterns. This is a separate config setting from core.sparseCheckout to avoid breaking older clients by introducing a tri-state option. The config option does nothing right now, but will be expanded upon in a later commit. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25Move core_partial_clone_filter_default to promisor-remote.cChristian Couder1-1/+0
Now that we can have a different default partial clone filter for each promisor remote, let's hide core_partial_clone_filter_default as a static in promisor-remote.c to avoid it being use for anything other than managing backward compatibility. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25Move repository_format_partial_clone to promisor-remote.cChristian Couder1-1/+0
Now that we have has_promisor_remote() and can use many promisor remotes, let's hide repository_format_partial_clone as a static in promisor-remote.c to avoid it being use for anything other than managing backward compatibility. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29Merge branch 'jk/save-getenv-result'Junio C Hamano1-2/+2
There were many places the code relied on the string returned from getenv() to be non-volatile, which is not true, that have been corrected. * jk/save-getenv-result: builtin_diff(): read $GIT_DIFF_OPTS closer to use merge-recursive: copy $GITHEAD strings init: make a copy of $GIT_DIR string config: make a copy of $GIT_CONFIG string commit: copy saved getenv() result get_super_prefix(): copy getenv() result
2019-01-11get_super_prefix(): copy getenv() resultJeff King1-2/+2
The return value of getenv() is not guaranteed to remain valid across multiple calls (nor across calls to setenv()). Since this function caches the result for the length of the program, we must make a copy to ensure that it is still valid when we need it. Reported-by: Yngve N. Pettersen <yngve@vivaldi.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-04Merge branch 'jk/loose-object-cache'Junio C Hamano1-2/+2
Code clean-up with optimization for the codepath that checks (non-)existence of loose objects. * jk/loose-object-cache: odb_load_loose_cache: fix strbuf leak fetch-pack: drop custom loose object cache sha1-file: use loose object cache for quick existence check object-store: provide helpers for loose_objects_cache sha1-file: use an object_directory for the main object dir handle alternates paths the same as the main object dir sha1_file_name(): overwrite buffer instead of appending rename "alternate_object_database" to "object_directory" submodule--helper: prefer strip_suffix() to ends_with() fsck: do not reuse child_process structs
2018-11-13Merge branch 'js/mingw-perl5lib'Junio C Hamano1-1/+0
Windows fix. * js/mingw-perl5lib: mingw: unset PERL5LIB by default config: move Windows-specific config settings into compat/mingw.c config: allow for platform-specific core.* config settings config: rename `dummy` parameter to `cb` in git_default_config()
2018-11-13sha1-file: use an object_directory for the main object dirJeff King1-2/+2
Our handling of alternate object directories is needlessly different from the main object directory. As a result, many places in the code basically look like this: do_something(r->objects->objdir); for (odb = r->objects->alt_odb_list; odb; odb = odb->next) do_something(odb->path); That gets annoying when do_something() is non-trivial, and we've resorted to gross hacks like creating fake alternates (see find_short_object_filename()). Instead, let's give each raw_object_store a unified list of object_directory structs. The first will be the main store, and everything after is an alternate. Very few callers even care about the distinction, and can just loop over the whole list (and those who care can just treat the first element differently). A few observations: - we don't need r->objects->objectdir anymore, and can just mechanically convert that to r->objects->odb->path - object_directory's path field needs to become a real pointer rather than a FLEX_ARRAY, in order to fill it with expand_base_dir() - we'll call prepare_alt_odb() earlier in many functions (i.e., outside of the loop). This may result in us calling it even when our function would be satisfied looking only at the main odb. But this doesn't matter in practice. It's not a very expensive operation in the first place, and in the majority of cases it will be a noop. We call it already (and cache its results) in prepare_packed_git(), and we'll generally check packs before loose objects. So essentially every program is going to call it immediately once per program. Arguably we should just prepare_alt_odb() immediately upon setting up the repository's object directory, which would save us sprinkling calls throughout the code base (and forgetting to do so has been a source of subtle bugs in the past). But I've stopped short of that here, since there are already a lot of other moving parts in this patch. - Most call sites just get shorter. The check_and_freshen() functions are an exception, because they have entry points to handle local and nonlocal directories separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31config: move Windows-specific config settings into compat/mingw.cJohannes Schindelin1-1/+0
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-22worktree: add per-worktree config filesNguyễn Thái Ngọc Duy1-0/+1
A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. (*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'en/incl-forward-decl'Junio C Hamano1-0/+1
Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations
2018-08-15Merge branch 'nd/i18n'Junio C Hamano1-2/+2
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-15Merge branch 'jk/core-use-replace-refs'Junio C Hamano1-2/+2
A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether. * jk/core-use-replace-refs: add core.usereplacerefs config option check_replace_refs: rename to read_replace_refs check_replace_refs: fix outdated comment
2018-08-15Move definition of enum branch_track from cache.h to branch.hElijah Newren1-0/+1
'branch_track' feels more closely related to branching, and it is needed later in branch.h; rather than #include'ing cache.h in branch.h for this small enum, just move the enum and the external declaration for git_branch_track to branch.h. Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23environment.c: mark more strings for translationNguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18check_replace_refs: rename to read_replace_refsJeff King1-2/+2
This was added as a NEEDSWORK in c3c36d7de2 (replace-object: check_replace_refs is safe in multi repo environment, 2018-04-11), waiting for a calmer period. Since doing so now doesn't conflict with anything in 'pu', it seems as good a time as any. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18Merge branch 'sb/object-store-grafts'Junio C Hamano1-4/+4
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-07-17commit-graph: add repo arg to graph readersJonathan Tan1-1/+0
Add a struct repository argument to the functions in commit-graph.h that read the commit graph. (This commit does not affect functions that write commit graphs.) Because the commit graph functions can now read the commit graph of any repository, the global variable core_commit_graph has been removed. Instead, the config option core.commitGraph is now read on the first time in a repository that a commit is attempted to be parsed using its commit graph. This commit includes a test that exercises the functionality on an arbitrary repository that is not the_repository. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29Merge branch 'sb/object-store-grafts' into sb/object-store-lookupJunio C Hamano1-4/+4
* sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-05-18cache: convert get_graft_file to handle arbitrary repositoriesStefan Beller1-3/+3
This conversion was done without the #define trick used in the earlier series refactoring to have better repository access, because this function is easy to review, as all lines are converted and it has only one caller. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18shallow: add repository argument to set_alternate_shallow_fileStefan Beller1-1/+1
Add a repository argument to allow callers of set_alternate_shallow_file to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'ls/checkout-encoding'Junio C Hamano1-0/+1
The new "checkout-encoding" attribute can ask Git to convert the contents to the specified encoding when checking out to the working tree (and the other way around when checking in). * ls/checkout-encoding: convert: add round trip check based on 'core.checkRoundtripEncoding' convert: add tracing for 'working-tree-encoding' attribute convert: check for detectable errors in UTF encodings convert: add 'working-tree-encoding' attribute utf8: add function to detect a missing UTF-16/32 BOM utf8: add function to detect prohibited UTF-16/32 BOM utf8: teach same_encoding() alternative UTF encoding names strbuf: add a case insensitive starts_with() strbuf: add xstrdup_toupper() strbuf: remove unnecessary NUL assignment in xstrdup_tolower()
2018-05-08Merge branch 'sb/object-store-replace'Junio C Hamano1-1/+1
The effort to pass the repository in-core structure throughout the API continues. This round deals with the code that implements the refs/replace/ mechanism. * sb/object-store-replace: replace-object: allow lookup_replace_object to handle arbitrary repositories replace-object: allow do_lookup_replace_object to handle arbitrary repositories replace-object: allow prepare_replace_object to handle arbitrary repositories refs: allow for_each_replace_ref to handle arbitrary repositories refs: store the main ref store inside the repository struct replace-object: add repository argument to lookup_replace_object replace-object: add repository argument to do_lookup_replace_object replace-object: add repository argument to prepare_replace_object refs: add repository argument to for_each_replace_ref refs: add repository argument to get_main_ref_store replace-object: check_replace_refs is safe in multi repo environment replace-object: eliminate replace objects prepared flag object-store: move lookup_replace_object to replace-object.h replace-object: move replace_map to object store replace_object: use oidmap
2018-05-08Merge branch 'ds/commit-graph'Junio C Hamano1-0/+1
Precompute and store information necessary for ancestry traversal in a separate file to optimize graph walking. * ds/commit-graph: commit-graph: implement "--append" option commit-graph: build graph from starting commits commit-graph: read only from specific pack-indexes commit: integrate commit graph with commit parsing commit-graph: close under reachability commit-graph: add core.commitGraph setting commit-graph: implement git commit-graph read commit-graph: implement git-commit-graph write commit-graph: implement write_commit_graph() commit-graph: create git-commit-graph builtin graph: add commit graph design document commit-graph: add format document csum-file: refactor finalize_hashfile() method csum-file: rename hashclose() to finalize_hashfile()
2018-04-25Merge branch 'jk/relative-directory-fix'Junio C Hamano1-3/+23
Some codepaths, including the refs API, get and keep relative paths, that go out of sync when the process does chdir(2). The chdir-notify API is introduced to let these codepaths adjust these cached paths to the new current directory. * jk/relative-directory-fix: refs: use chdir_notify to update cached relative paths set_work_tree: use chdir_notify add chdir-notify API trace.c: export trace_setup_key set_git_dir: die when setenv() fails
2018-04-16convert: add round trip check based on 'core.checkRoundtripEncoding'Lars Schneider1-0/+1
UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are known to have round trip issues [1]. Add 'core.checkRoundtripEncoding', which contains a comma separated list of encodings, to define for what encodings Git should check the conversion round trip if they are used in the 'working-tree-encoding' attribute. Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'. [1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-12replace-object: check_replace_refs is safe in multi repo environmentStefan Beller1-1/+1
In e1111cef23 (inline lookup_replace_object() calls, 2011-05-15) a shortcut for checking the object replacement was added by setting check_replace_refs to 0 once the replacements were evaluated to not exist. This works fine in with the assumption of only one repository in existence. The assumption won't hold true any more when we work on multiple instances of a repository structs (e.g. one struct per submodule), as the first repository to be inspected may have no replacements and would set the global variable. Other repositories would then completely omit their evaluation of replacements. This reverts back the meaning of the flag `check_replace_refs` of "Do we need to check with the lookup table?" to "Do we need to read the replacement definition?", adding the bypassing logic to lookup_replace_object after the replacement definition was read. As with the original patch, delay the renaming of the global variable Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11Merge branch 'sb/object-store'Junio C Hamano1-2/+3
Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...
2018-04-11commit-graph: add core.commitGraph settingDerrick Stolee1-0/+1
The commit graph feature is controlled by the new core.commitGraph config setting. This defaults to 0, so the feature is opt-in. The intention of core.commitGraph is that a user can always stop checking for or parsing commit graph files if core.commitGraph=0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-10Merge branch 'nd/remove-ignore-env-field'Junio C Hamano1-3/+28
Code clean-up for the "repository" abstraction. * nd/remove-ignore-env-field: repository.h: add comment and clarify repo_set_gitdir repository: delete ignore_env member sha1_file.c: move delayed getenv(altdb) back to setup_git_env() repository.c: delete dead functions repository.c: move env-related setup code back to environment.c repository: initialize the_repository in main()
2018-03-30set_work_tree: use chdir_notifyJeff King1-1/+22
When we change to the top of the working tree, we manually re-adjust $GIT_DIR and call set_git_dir() again, in order to update any relative git-dir we'd compute earlier. Instead of the work-tree code having to know to call the git-dir code, let's use the new chdir_notify interface. There are two spots that need updating, with a few subtleties in each: 1. the set_git_dir() code needs to chdir_notify_register() so it can be told when to update its path. Technically we could push this down into repo_set_gitdir(), so that even repository structs besides the_repository could benefit from this. But that opens up a lot of complications: - we'd still need to touch set_git_dir(), because it does some other setup (like setting $GIT_DIR in the environment) - submodules using other repository structs get cleaned up, which means we'd need to remove them from the chdir_notify list - it's unlikely to fix any bugs, since we shouldn't generally chdir() in the middle of working on a submodule 2. setup_work_tree now needs to call chdir_notify(), and can lose its manual set_git_dir() call. Note that at first glance it looks like this undoes the absolute-to-relative optimization added by 044bbbcb63 (Make git_dir a path relative to work_tree in setup_work_tree(), 2008-06-19). But for the most part that optimization was just _undoing_ the relative-to-absolute conversion which the function was doing earlier (and which is now gone). It is true that if you already have an absolute git_dir that the setup_work_tree() function will no longer make it relative as a side effect. But: - we generally do have relative git-dir's due to the way the discovery code works - if we really care about making git-dir's relative when possible, then we should be relativizing them earlier (e.g., when we see an absolute $GIT_DIR we could turn it relative, whether we are going to chdir into a worktree or not). That would cover all cases, including ones that 044bbbcb63 did not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-30set_git_dir: die when setenv() failsJeff King1-3/+2
The set_git_dir() function returns an error if setenv() fails, but there are zero callers who pay attention to this return value. If this ever were to happen, it could cause confusing results, as sub-processes would see a potentially stale GIT_DIR (e.g., if it is relative and we chdir()-ed to the root of the working tree). We _could_ try to fix each caller, but there's really nothing useful to do after this failure except die. Let's just lump setenv() failure into the same category as malloc failure: things that should never happen and cause us to abort catastrophically. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-23repository: introduce raw object store fieldStefan Beller1-2/+3
The raw object store field will contain any objects needed for access to objects in a given repository. This patch introduces the raw object store and populates it with the `objectdir`, which used to be part of the repository struct. As the struct gains members, we'll also populate the function to clear the memory for these members. In a later step, we'll introduce a struct object_parser, that will complement the object parsing in a repository struct: The raw object parser is the layer that will provide access to raw object content, while the higher level object parser code will parse raw objects and keeps track of parenthood and other object relationships using 'struct object'. For now only add the lower level to the repository struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-06Merge branch 'bw/c-plus-plus'Junio C Hamano1-12/+12
Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...
2018-03-05sha1_file.c: move delayed getenv(altdb) back to setup_git_env()Nguyễn Thái Ngọc Duy1-0/+1
getenv() is supposed to work on the main repository only. This delayed getenv() code in sha1_file.c makes it more difficult to convert sha1_file.c to a generic object store that could be used by both submodule and main repositories. Move the getenv() back in setup_git_env() where other env vars are also fetched. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05repository.c: move env-related setup code back to environment.cNguyễn Thái Ngọc Duy1-3/+27
It does not make sense that generic repository code contains handling of environment variables, which are specific for the main repository only. Refactor repo_set_gitdir() function to take $GIT_DIR and optionally _all_ other customizable paths. These optional paths can be NULL and will be calculated according to the default directory layout. Note that some dead functions are left behind to reduce diff noise. They will be deleted in the next patch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22environment: rename 'namespace' variablesBrandon Williams1-5/+5
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22environment: rename 'template' variablesBrandon Williams1-7/+7
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-13Merge branch 'tb/crlf-conv-flags'Junio C Hamano1-1/+1
Code clean-up. * tb/crlf-conv-flags: convert_to_git(): safe_crlf/checksafe becomes int conv_flags
2018-02-13Merge branch 'jh/partial-clone'Junio C Hamano1-0/+1
The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone
2018-02-13Merge branch 'jh/fsck-promisors'Junio C Hamano1-0/+1
In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension
2018-01-16convert_to_git(): safe_crlf/checksafe becomes int conv_flagsTorsten Bögershausen1-1/+1
When calling convert_to_git(), the checksafe parameter defined what should happen if the EOL conversion (CRLF --> LF --> CRLF) does not roundtrip cleanly. In addition, it also defined if line endings should be renormalized (CRLF --> LF) or kept as they are. checksafe was an safe_crlf enum with these values: SAFE_CRLF_FALSE: do nothing in case of EOL roundtrip errors SAFE_CRLF_FAIL: die in case of EOL roundtrip errors SAFE_CRLF_WARN: print a warning in case of EOL roundtrip errors SAFE_CRLF_RENORMALIZE: change CRLF to LF SAFE_CRLF_KEEP_CRLF: keep all line endings as they are In some cases the integer value 0 was passed as checksafe parameter instead of the correct enum value SAFE_CRLF_FALSE. That was no problem because SAFE_CRLF_FALSE is defined as 0. FALSE/FAIL/WARN are different from RENORMALIZE and KEEP_CRLF. Therefore, an enum is not ideal. Let's use a integer bit pattern instead and rename the parameter to conv_flags to make it more generically usable. This allows us to extend the bit pattern in a subsequent commit. Reported-By: Randall S. Becker <rsbecker@nexbridge.com> Helped-By: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19Merge branch 'ar/unconfuse-three-dots'Junio C Hamano1-0/+15
Ancient part of codebase still shows dots after an abbreviated object name just to show that it is not a full object name, but these ellipses are confusing to people who newly discovered Git who are used to seeing abbreviated object names and find them confusing with the range syntax. * ar/unconfuse-three-dots: t2020: test variations that matter t4013: test new output from diff --abbrev --raw diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value t4013: prepare for upcoming "diff --raw --abbrev" output format change checkout: describe_detached_head: remove ellipsis after committish print_sha1_ellipsis: introduce helper Documentation: user-manual: limit usage of ellipsis Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").
2017-12-08partial-clone: define partial clone settings in configJeff Hostetler1-0/+1
Create get and set routines for "partial clone" config settings. These will be used in a future commit by clone and fetch to remember the promisor remote and the default filter-spec. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05extension.partialclone: introduce partial clone extensionJonathan Tan1-0/+1
Introduce new repository extension option: `extensions.partialclone` See the update to Documentation/technical/repository-version.txt in this patch for more information. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-04print_sha1_ellipsis: introduce helperAnn T Ropea1-0/+15
Introduce a helper print_sha1_ellipsis() that pays attention to the GIT_PRINT_SHA1_ELLIPSIS environment variable, and prepare the tests to unconditionally set it for the test pieces that will be broken once the code stops showing the extra dots by default. The removal of these dots is merely a plan at this step and has not happened yet but soon will. Document GIT_PRINT_SHA1_ELLIPSIS. Signed-off-by: Ann T Ropea <bedhanger@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21Merge branch 'bp/fsmonitor'Junio C Hamano1-0/+1
We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64
2017-10-03Merge branch 'jk/no-optional-locks'Junio C Hamano1-0/+5
Some commands (most notably "git status") makes an opportunistic update when performing a read-only operation to help optimize later operations in the same repository. The new "--no-optional-locks" option can be passed to Git to disable them. * jk/no-optional-locks: git: add --no-optional-locks option
2017-10-01fsmonitor: teach git to optionally utilize a file system monitor to speed up ↵Ben Peart1-0/+1
detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27git: add --no-optional-locks optionJeff King1-0/+5
Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the state of the repository. Some of these commands may refresh the index and write out the result in an opportunistic way: if they can get the index lock, then they update the on-disk index with any updates they find. And if not, then their in-core refresh is lost and just has to be recomputed by the next caller. But taking the index lock may conflict with other operations in the repository. Especially ones that the user is doing themselves, which _aren't_ opportunistic. In other words, "git status" knows how to back off when somebody else is holding the lock, but other commands don't know that status would be happy to drop the lock if somebody else wanted it. There are a couple possible solutions: 1. Have some kind of "pseudo-lock" that allows other commands to tell status that they want the lock. This is likely to be complicated and error-prone to implement (and maybe even impossible with just dotlocks to work from, as it requires some inter-process communication). 2. Avoid background runs of commands like "git status" that want to do opportunistic updates, preferring instead plumbing like diff-files, etc. This is awkward for a couple of reasons. One is that "status --porcelain" reports a lot more about the repository state than is available from individual plumbing commands. And two is that we actually _do_ want to see the refreshed index. We just don't want to take a lock or write out the result. Whereas commands like diff-files expect us to refresh the index separately and write it to disk so that they can depend on the result. But that write is exactly what we're trying to avoid. 3. Ask "status" not to lock or write the index. This is easy to implement. The big downside is that any work done in refreshing the index for such a call is lost when the process exits. So a background process may end up re-hashing a changed file multiple times until the user runs a command that does an index refresh themselves. This patch implements the option 3. The idea (and the test) is largely stolen from a Git for Windows patch by Johannes Schindelin, 67e5ce7f63 (status: offer *not* to lock the index and update it, 2016-08-12). The twist here is that instead of making this an option to "git status", it becomes a "git" option and matching environment variable. The reason there is two-fold: 1. An environment variable is carried through to sub-processes. And whether an invocation is a background process or not should apply to the whole process tree. So you could do "git --no-optional-locks foo", and if "foo" is a script or alias that calls "status", you'll still get the effect. 2. There may be other programs that want the same treatment. I've punted here on finding more callers to convert, since "status" is the obvious one to call as a repeated background job. But "git diff"'s opportunistic refresh of the index may be a good candidate. The test is taken from 67e5ce7f63, and it's worth repeating Johannes's explanation: Note that the regression test added in this commit does not *really* verify that no index.lock file was written; that test is not possible in a portable way. Instead, we verify that .git/index is rewritten *only* when `git status` is run without `--no-optional-locks`. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06repository: free fields before overwriting themJeff King1-1/+3
It's possible that the repository data may be initialized twice (e.g., after doing a chdir() to the top of the worktree we may have to adjust a relative git_dir path). We should free() any existing fields before assigning to them to avoid leaks. This should be safe, as the fields are set based on the environment or on other strings like the gitdir or commondir. That makes it impossible that we are feeding an alias to the just-freed string. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23environment: store worktree in the_repositoryBrandon Williams1-5/+4
Migrate 'work_tree' to be stored in 'the_repository'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23environment: place key repository state in the_repositoryBrandon Williams1-45/+13
Migrate 'git_dir', 'git_common_dir', 'git_object_dir', 'git_index_file', 'git_graft_file', and 'namespace' to be stored in 'the_repository'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23environment: remove namespace_len variableBrandon Williams1-5/+4
Use 'skip_prefix' instead of 'starts_with' so that we can drop the need to keep around 'namespace_len'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23setup: don't perform lazy initialization of repository stateBrandon Williams1-9/+8
Under some circumstances (bogus GIT_DIR value or the discovered gitdir is '.git') 'setup_git_directory()' won't initialize key repository state. This leads to inconsistent state after running the setup code. To account for this inconsistent state, lazy initialization is done once a caller asks for the repository's gitdir or some other piece of repository state. This is confusing and can be error prone. Instead let's tighten the expected outcome of 'setup_git_directory()' and ensure that it initializes repository state in all cases that would have been handled by lazy initialization. This also lets us drop the requirement to have 'have_git_dir()' check if the environment variable GIT_DIR was set as that will be handled by the end of the setup code. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23Merge branches 'bw/ls-files-sans-the-index' and 'bw/config-h' into ↵Junio C Hamano1-0/+1
bw/repo-object * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h alias: use the early config machinery to expand aliases t7006: demonstrate a problem with aliases in subdirectories t1308: relax the test verifying that empty alias values are disallowed help: use early config when autocorrecting aliases config: report correct line number upon error discover_git_directory(): avoid setting invalid git_dir
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'jk/bug-to-abort'Junio C Hamano1-1/+1
Introduce the BUG() macro to improve die("BUG: ..."). * jk/bug-to-abort: usage: add NORETURN to BUG() function definitions config: complain about --local outside of a git repo setup_git_env: convert die("BUG") to BUG() usage.c: add BUG() function
2017-05-16Merge branch 'nd/worktree-kill-parse-ref'Junio C Hamano1-0/+2
"git gc" did not interact well with "git worktree"-managed per-worktree refs. * nd/worktree-kill-parse-ref: refs: kill set_worktree_head_symref() worktree.c: kill parse_ref() in favor of refs_resolve_ref_unsafe() refs: introduce get_worktree_ref_store() refs: add REFS_STORE_ALL_CAPS refs.c: make submodule ref store hashmap generic environment.c: fix potential segfault by get_git_common_dir()
2017-05-15setup_git_env: convert die("BUG") to BUG()Jeff King1-1/+1
Converting to BUG() makes it easier to detect and debug cases where we hit this assertion. Coupled with a new test in t1300, this shows that the test suite can detect such corner cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16Merge branch 'jk/snprintf-cleanups'Junio C Hamano1-8/+6
Code clean-up. * jk/snprintf-cleanups: daemon: use an argv_array to exec children gc: replace local buffer with git_path transport-helper: replace checked snprintf with xsnprintf convert unchecked snprintf into xsnprintf combine-diff: replace malloc/snprintf with xstrfmt replace unchecked snprintf calls with heap buffers receive-pack: print --pack-header directly into argv array name-rev: replace static buffer with strbuf create_branch: use xstrfmt for reflog message create_branch: move msg setup closer to point of use avoid using mksnpath for refs avoid using fixed PATH_MAX buffers for refs fetch: use heap buffer to format reflog tag: use strbuf to format tag header diff: avoid fixed-size buffer for patch-ids odb_mkstemp: use git_path_buf odb_mkstemp: write filename into strbuf do not check odb_mkstemp return value for errors
2017-04-16environment.c: fix potential segfault by get_git_common_dir()Nguyễn Thái Ngọc Duy1-0/+2
setup_git_env() must be called before this function to initialize git_common_dir so that it returns a non NULL string. And it must return a non NULL string or segfault can happen because all callers expect so. It does not do so explicitly though and depends on get_git_dir() being called first (which will guarantee setup_git_env()). Avoid this dependency and call setup_git_env() by itself. test-ref-store.c will hit this problem because it's very lightweight, just enough initialization to exercise refs code, and get_git_dir() will never be called until get_worktrees() is, which uses get_git_common_dir and hits a segfault. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-30Merge branch 'jk/no-looking-at-dotgit-outside-repo-final'Junio C Hamano1-1/+4
This is the endgame of the topic to avoid blindly falling back to ".git" when the setup sequence said we are _not_ in Git repository. A corner case that happens to work right now may be broken by a call to die("BUG"). * jk/no-looking-at-dotgit-outside-repo-final: setup_git_env: avoid blind fall-back to ".git"
2017-03-28odb_mkstemp: use git_path_bufJeff King1-4/+2
Since git_path_buf() is smart enough to replace "objects/" with the correct object path, we can use it instead of manually assembling the path. That's slightly shorter, and will clean up any non-canonical bits in the path. Signed-off-by: Jeff King <peff@peff.net>
2017-03-28odb_mkstemp: write filename into strbufJeff King1-8/+8
The odb_mkstemp() function expects the caller to provide a fixed buffer to write the resulting tempfile name into. But it creates the template using snprintf without checking the return value. This means we could silently truncate the filename. In practice, it's unlikely that the truncation would end in the template-pattern that mkstemp needs to open the file. So we'd probably end up failing either way, unless the path was specially crafted. The simplest fix would be to notice the truncation and die. However, we can observe that most callers immediately xstrdup() the result anyway. So instead, let's switch to using a strbuf, which is easier for them (and isn't a big deal for the other 2 callers, who can just strbuf_release when they're done with it). Note that many of the callers used static buffers, but this was purely to avoid putting a large buffer on the stack. We never passed the static buffers out of the function, so there's no complicated memory handling we need to change. Signed-off-by: Jeff King <peff@peff.net>
2017-03-21Merge branch 'jk/pack-name-cleanups'Junio C Hamano1-4/+2
Code clean-up. * jk/pack-name-cleanups: index-pack: make pointer-alias fallbacks safer replace snprintf with odb_pack_name() odb_pack_keep(): stop generating keepfile name sha1_file.c: make pack-name helper globally accessible move odb_* declarations out of git-compat-util.h
2017-03-16odb_pack_keep(): stop generating keepfile nameJeff King1-4/+2
The odb_pack_keep() function generates the name of a .keep file and opens it. This has two problems: 1. It requires a fixed-size buffer to create the filename and doesn't notice when the result is truncated. 2. Of the two callers, one sometimes wants to open a filename it already has, which makes things awkward (it has to do so manually, and skips the leading-directory creation). Instead, let's have odb_pack_keep() just open the file. Generating the name isn't hard, and a future patch will switch callers over to odb_pack_name() anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-08real_pathdup(): fix callsites that wanted it to die on errorJohannes Schindelin1-1/+1
In 4ac9006f832 (real_path: have callers use real_pathdup and strbuf_realpath, 2016-12-12), we changed the xstrdup(real_path()) pattern to use real_pathdup() directly. The problem with this change is that real_path() calls strbuf_realpath() with die_on_error = 1 while real_pathdup() calls it with die_on_error = 0. Meaning that in cases where real_path() causes Git to die() with an error message, real_pathdup() is silent and returns NULL instead. The callers, however, are ill-prepared for that change, as they expect the return value to be non-NULL (and otherwise the function died with an appropriate error message). Fix this by extending real_pathdup()'s signature to accept the die_on_error flag and simply pass it through to strbuf_realpath(), and then adjust all callers after a careful audit whether they would handle NULLs well. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-31refs: add option core.logAllRefUpdates = alwaysCornelius Weig1-1/+1
When core.logallrefupdates is true, we only create a new reflog for refs that are under certain well-known hierarchies. The reason is that we know that some hierarchies (like refs/tags) are not meant to change, and that unknown hierarchies might not want reflogs at all (e.g., a hypothetical refs/foo might be meant to change often and drop old history immediately). However, sometimes it is useful to override this decision and simply log for all refs, because the safety and audit trail is more important than the performance implications of keeping the log around. This patch introduces a new "always" mode for the core.logallrefupdates option which will log updates to everything under refs/, regardless where in the hierarchy it is (we still will not log things like ORIG_HEAD and FETCH_HEAD, which are known to be transient). Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-18Merge branch 'bw/grep-recurse-submodules'Junio C Hamano1-1/+1
"git grep" has been taught to optionally recurse into submodules. * bw/grep-recurse-submodules: grep: search history of moved submodules grep: enable recurse-submodules to work on <tree> objects grep: optionally recurse into submodules grep: add submodules as a grep source type submodules: load gitmodules file from commit sha1 submodules: add helper to determine if a submodule is initialized submodules: add helper to determine if a submodule is populated real_path: canonicalize directory separators in root parts real_path: have callers use real_pathdup and strbuf_realpath real_path: create real_pathdup real_path: convert real_path_internal to strbuf_realpath real_path: resolve symlinks by hand
2016-12-12real_path: have callers use real_pathdup and strbuf_realpathBrandon Williams1-1/+1
Migrate callers of real_path() who duplicate the retern value to use real_pathdup or strbuf_realpath. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-15compression: unify pack.compression configuration parsingJunio C Hamano1-1/+1
There are three codepaths that use a variable whose name is pack_compression_level to affect how objects and deltas sent to a packfile is compressed. Unlike zlib_compression_level that controls the loose object compression, however, this variable was static to each of these codepaths. Two of them read the pack.compression configuration variable, using core.compression as the default, and one of them also allowed overriding it from the command line. The other codepath in bulk-checkin did not pay any attention to the configuration. Unify the configuration parsing to git_default_config(), where we implement the parsing of core.loosecompression and core.compression and make the former override the latter, by moving code to parse pack.compression and also allow core.compression to give default to this variable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-27Merge branch 'lt/abbrev-auto'Junio C Hamano1-1/+1
Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
2016-10-26setup_git_env: avoid blind fall-back to ".git"Jeff King1-1/+4
When we default to ".git" without having done any kind of repository setup, the results quite often do what the user expects. But this has also historically been the cause of some poorly behaved corner cases. These cases can be hard to find, because they happen at the conjunction of two relatively rare circumstances: 1. We are running some code which assumes there's a repository present, but there isn't necessarily one (e.g., low-level diff code triggered by "git diff --no-index" might try to look at some repository data). 2. We have an unusual setup, like being in a subdirectory of the working tree, or we have a .git file (rather than a directory), or we are running a tool like "init" or "clone" which may operate on a repository in a different directory. Our test scripts often cover (1), but miss doing (2) at the same time, and so the fallback appears to work but has lurking bugs. We can flush these bugs out by refusing to do the fallback entirely., This makes potential problems a lot more obvious by complaining even for "usual" setups. This passes the test suite (after the adjustments in the previous patches), but there's a risk of regression for any cases where the fallback usually works fine but the code isn't exercised by the test suite. So by itself, this commit is a potential step backward, but lets us take two steps forward once we've identified and fixed any such instances. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26Merge branch 'bw/ls-files-recurse-submodules'Junio C Hamano1-0/+13
"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files). This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject. * bw/ls-files-recurse-submodules: ls-files: add pathspec matching for submodules ls-files: pass through safe options for --recurse-submodules ls-files: optionally recurse into submodules git: make super-prefix option
2016-10-10git: make super-prefix optionBrandon Williams1-0/+13
Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' which can be used to specify a path from above a repository down to its root. When such a super-prefix is specified, the paths reported by Git are prefixed with it to make them relative to that directory "above". The paths given by the user on the command line (e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken relative to the directory "above" to match. The immediate use of this option is by commands which have a --recurse-submodule option in order to give context to submodules about how they were invoked. This option is currently only allowed for builtins which support a super-prefix. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03abbrev: auto size the default abbreviationLinus Torvalds1-1/+1
In fairly early days we somehow decided to abbreviate object names down to 7-hexdigits, but as projects grow, it is becoming more and more likely to see such a short object names made in earlier days and recorded in the log messages no longer unique. Currently the Linux kernel project needs 11 to 12 hexdigits, while Git itself needs 10 hexdigits to uniquely identify the objects they have, while many smaller projects may still be fine with the original 7-hexdigit default. One-size does not fit all projects. Introduce a mechanism, where we estimate the number of objects in the repository upon the first request to abbreviate an object name with the default setting and come up with a sane default for the repository. Based on the expectation that we would see collision in a repository with 2^(2N) objects when using object names shortened to first N bits, use sufficient number of hexdigits to cover the number of objects in the repository. Each hexdigit (4-bits) we add to the shortened name allows us to have four times (2-bits) as many objects in the repository. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizingJunio C Hamano1-1/+1
We'll be introducing a new way to decide the default abbreviation length by initialising DEFAULT_ABBREV to -1 to signal the first call to "find unique abbreviation" codepath to compute a reasonable value based on the number of objects we have to avoid collisions. We have long relied on DEFAULT_ABBREV being a positive concrete value that is used as the abbreviation length when no extra configuration or command line option has overridden it. Some codepaths wants to use such a positive concrete default value even before making their first request to actually trigger the computation for the auto sized default. Introduce FALLBACK_DEFAULT_ABBREV and use it to the code that attempts to align the report from "git fetch". For now, this macro is also used to initialize the default_abbrev variable, but the auto-sizing code will use -1 and then use the value of FALLBACK_DEFAULT_ABBREV as the starting point of auto-sizing. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21Merge branch 'jk/setup-sequence-update'Junio C Hamano1-1/+12
There were numerous corner cases in which the configuration files are read and used or not read at all depending on the directory a Git command was run, leading to inconsistent behaviour. The code to set-up repository access at the beginning of a Git process has been updated to fix them. * jk/setup-sequence-update: t1007: factor out repeated setup init: reset cached config when entering new repo init: expand comments explaining config trickery config: only read .git/config from configured repos test-config: setup git directory t1302: use "git -C" pager: handle early config pager: use callbacks instead of configset pager: make pager_program a file-local static pager: stop loading git_default_config() pager: remove obsolete comment diff: always try to set up the repository diff: handle --no-index prefixes consistently diff: skip implicit no-index check when given --no-index patch-id: use RUN_SETUP_GENTLY hash-object: always try to set up the git repository
2016-09-13init: reset cached config when entering new repoJeff King1-0/+5
After we copy the templates into place, we re-read the config in case we copied in a default config file. But since git_config() is backed by a cache these days, it's possible that the call will not actually touch the filesystem at all; we need to tell it that something has changed behind the scenes. Note that we also need to reset the shared_repository config. At first glance, it seems like this should probably just be folded into git_config_clear(). But unfortunately that is not quite right. The shared repository value may come from config, _or_ it may have been set manually. So only the caller who knows whether or not they set it is the one who can clear it (and indeed, if you _do_ put it into git_config_clear(), then many tests fail, as we have to clear the config cache any time we set a new config variable). There are three tests here. The first two actually pass already, though it's largely luck: they just don't happen to actually read any config before we enter the new repo. But the third one does fail without this patch; we look at core.sharedrepository while creating the directory, but need to make sure the value from the template config overrides it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13config: only read .git/config from configured reposJeff King1-0/+7
When git_config() runs, it looks in the system, user-wide, and repo-level config files. It gets the latter by calling git_pathdup(), which in turn calls get_git_dir(). If we haven't set up the git repository yet, this may simply return ".git", and we will look at ".git/config". This seems like it would be helpful (presumably we haven't set up the repository yet, so it tries to find it), but it turns out to be a bad idea for a few reasons: - it's not sufficient, and therefore hides bugs in a confusing way. Config will be respected if commands are run from the top-level of the working tree, but not from a subdirectory. - it's not always true that we haven't set up the repository _yet_; we may not want to do it at all. For instance, if you run "git init /some/path" from inside another repository, it should not load config from the existing repository. - there might be a path ".git/config", but it is not the actual repository we would find via setup_git_directory(). This may happen, e.g., if you are storing a git repository inside another git repository, but have munged one of the files in such a way that the inner repository is not valid (e.g., by removing HEAD). We have at least two bugs of the second type in git-init, introduced by ae5f677 (lazily load core.sharedrepository, 2016-03-11). It causes init to use git_configset(), which loads all of the config, including values from the current repo (if any). This shows up in two ways: 1. If we happen to be in an existing repository directory, we'll read and respect core.sharedrepository from it, even though it should have no bearing on the new repository. A new test in t1301 covers this. 2. Similarly, if we're in an existing repo that sets core.logallrefupdates, that will cause init to fail to set it in a newly created repository (because it thinks that the user's templates already did so). A new test in t0001 covers this. We also need to adjust an existing test in t1302, which gives another example of why this patch is an improvement. That test creates an embedded repository with a bogus core.repositoryformatversion of "99". It wants to make sure that we actually stop at the bogus repo rather than continuing upward to find the outer repo. So it checks that "git config core.repositoryformatversion" returns 99. But that only works because we blindly read ".git/config", even though we _know_ we're in a repository whose vintage we do not understand. After this patch, we avoid reading config from the unknown vintage repository at all, which is a safer choice. But we need to tweak the test, since core.repositoryformatversion will not return 99; it will claim that it could not find the variable at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13pager: make pager_program a file-local staticJeff King1-1/+0
This variable is only ever used by the routines in pager.c, and other parts of the code should always use those routines (like git_pager()) to make decisions about which pager to use. Let's reduce its scope to prevent accidents. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-26Merge branch 'js/windows-dotgit' into maintJunio C Hamano1-0/+1
On Windows, .git and optionally any files whose name starts with a dot are now marked as hidden, with a core.hideDotFiles knob to customize this behaviour. * js/windows-dotgit: mingw: remove unnecessary definition mingw: introduce the 'core.hideDotFiles' setting
2016-05-17Merge branch 'js/windows-dotgit'Junio C Hamano1-0/+1
On Windows, .git and optionally any files whose name starts with a dot are now marked as hidden, with a core.hideDotFiles knob to customize this behaviour. * js/windows-dotgit: mingw: remove unnecessary definition mingw: introduce the 'core.hideDotFiles' setting
2016-05-17Merge branch 'ab/hooks'Junio C Hamano1-0/+1
A new configuration variable core.hooksPath allows customizing where the hook directory is. * ab/hooks: hooks: allow customizing where the hook directory is githooks.txt: minor improvements to the grammar & phrasing githooks.txt: amend dangerous advice about 'update' hook ACL githooks.txt: improve the intro section
2016-05-11mingw: introduce the 'core.hideDotFiles' settingJohannes Schindelin1-0/+1
On Unix (and Linux), files and directories whose names start with a dot are usually not shown by default. This convention is used by Git: the .git/ directory should be left alone by regular users, and only accessed through Git itself. On Windows, no such convention exists. Instead, there is an explicit flag to mark files or directories as hidden. In the early days, Git for Windows did not mark the .git/ directory (or for that matter, any file or directory whose name starts with a dot) hidden. This lead to quite a bit of confusion, and even loss of data. Consequently, Git for Windows introduced the core.hideDotFiles setting, with three possible values: true, false, and dotGitOnly, defaulting to marking only the .git/ directory as hidden. The rationale: users do not need to access .git/ directly, and indeed (as was demonstrated) should not really see that directory, either. However, not all dot files should be hidden by default, as e.g. Eclipse does not show them (and the user would therefore be unable to see, say, a .gitattributes file). In over five years since the last attempt to bring this patch into core Git, a slightly buggy version of this patch has served Git for Windows' users well: no single report indicated problems with the hidden .git/ directory, and the stream of problems caused by the previously non-hidden .git/ directory simply stopped. The bugs have been fixed during the process of getting this patch upstream. Note that there is a funny quirk we have to pay attention to when creating hidden files: we use Win32's _wopen() function which transmogrifies its arguments and hands off to Win32's CreateFile() function. That latter function errors out with ERROR_ACCESS_DENIED (the equivalent of EACCES) when the equivalent of the O_CREAT flag was passed and the file attributes (including the hidden flag) do not match an existing file's. And _wopen() accepts no parameter that would be transmogrified into said hidden flag. Therefore, we simply try again without O_CREAT. A slightly different method is required for our fopen()/freopen() function as we cannot even *remove* the implicit O_CREAT flag. Therefore, we briefly mark existing files as unhidden when opening them via fopen()/freopen(). The ERROR_ACCESS_DENIED error can also be triggered by opening a file that is marked as a system file (which is unlikely to be tracked in Git), and by trying to create a file that has *just* been deleted and is awaiting the last open handles to be released (which would be handled better by the "Try again?" logic, a story for a different patch series, though). In both cases, it does not matter much if we try again without the O_CREAT flag, read: it does not hurt, either. For details how ERROR_ACCESS_DENIED can be triggered, see https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858 Original-patch-by: Erik Faye-Lund <kusmabite@gmail.com> Initial-Test-By: Pat Thoyts <patthoyts@users.sourceforge.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-04hooks: allow customizing where the hook directory isÆvar Arnfjörð Bjarmason1-0/+1
Change the hardcoded lookup for .git/hooks/* to optionally lookup in $(git config core.hooksPath)/* instead. This is essentially a more intrusive version of the git-init ability to specify hooks on init time via init templates. The difference between that facility and this feature is that this can be set up after the fact via e.g. ~/.gitconfig or /etc/gitconfig to apply for all your personal repositories, or all repositories on the system. I plan on using this on a centralized Git server where users can create arbitrary repositories under /gitroot, but I'd like to manage all the hooks that should be run centrally via a unified dispatch mechanism. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-02Merge branch 'jk/check-repository-format' into maintJunio C Hamano1-2/+21
The repository set-up sequence has been streamlined (the biggest change is that there is no longer git_config_early()), so that we do not attempt to look into refs/* when we know we do not have a Git repository. * jk/check-repository-format: verify_repository_format: mark messages for translation setup: drop repository_format_version global setup: unify repository version callbacks init: use setup.c's repo version verification setup: refactor repo format reading and verification config: drop git_config_early check_repository_format_gently: stop using git_config_early lazily load core.sharedrepository wrap shared_repository global in get/set accessors setup: document check_repository_format()
2016-04-13Merge branch 'jk/check-repository-format'Junio C Hamano1-2/+21
The repository set-up sequence has been streamlined (the biggest change is that there is no longer git_config_early()), so that we do not attempt to look into refs/* when we know we do not have a Git repository. * jk/check-repository-format: verify_repository_format: mark messages for translation setup: drop repository_format_version global setup: unify repository version callbacks init: use setup.c's repo version verification setup: refactor repo format reading and verification config: drop git_config_early check_repository_format_gently: stop using git_config_early lazily load core.sharedrepository wrap shared_repository global in get/set accessors setup: document check_repository_format()
2016-03-11setup: drop repository_format_version globalJeff King1-1/+0
Nobody reads this anymore, and they're not likely to; the interesting thing is whether or not we passed check_repository_format(), and possibly the individual "extension" variables. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11lazily load core.sharedrepositoryJeff King1-0/+9
The "shared_repository" config is loaded as part of check_repository_format_version, but it's not quite like the other values we check there. Something like core.repositoryformatversion only makes sense in per-repo config, but core.sharedrepository can be set in a per-user config (e.g., to make all "git init" invocations shared by default). So it would make more sense as part of git_default_config. Commit 457f06d (Introduce core.sharedrepository, 2005-12-22) says: [...]the config variable is set in the function which checks the repository format. If this were done in git_default_config instead, a lot of programs would need to be modified to call git_config(git_default_config) first. This is still the case today, but we have one extra trick up our sleeve. Now that we have the git_configset infrastructure, it's not so expensive for us to ask for a single value. So we can simply lazy-load it on demand. This should be OK to do in general. There are some problems with loading config before setup_git_directory() is called, but we shouldn't be accessing the value before then (if we were, then it would already be broken, as the variable would not have been set by check_repository_format_version!). The trickiest caller is git-init, but it handles the values manually itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11wrap shared_repository global in get/set accessorsJeff King1-1/+12
It would be useful to control access to the global shared_repository, so that we can lazily load its config. The first step to doing so is to make sure all access goes through a set of functions. This step is purely mechanical, and should result in no change of behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-06setup: make startup_info available everywhereJeff King1-1/+0
Commit a60645f (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-10Merge branch 'cc/untracked'Junio C Hamano1-0/+7
Update the untracked cache subsystem and change its primary UI from "git update-index" to "git config". * cc/untracked: t7063: add tests for core.untrackedCache test-dump-untracked-cache: don't modify the untracked cache config: add core.untrackedCache dir: simplify untracked cache "ident" field dir: add remove_untracked_cache() dir: add {new,add}_untracked_cache() update-index: move 'uc' var declaration update-index: add untracked cache notifications update-index: add --test-untracked-cache update-index: use enum for untracked cache options dir: free untracked cache when removing it
2016-01-27test-dump-untracked-cache: don't modify the untracked cacheChristian Couder1-0/+7
To correctly perform its testing function, test-dump-untracked-cache should not change the state of the untracked cache in the index. As a previous patch makes read_index_from() change the state of the untracked cache and as test-dump-untracked-cache indirectly calls this function, we need a mechanism to prevent read_index_from() from changing the untracked cache state when it's called from test-dump-untracked-cache. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-12Merge branch 'nd/stop-setenv-work-tree'Junio C Hamano1-2/+0
An earlier change in 2.5.x-era broke users' hooks and aliases by exporting GIT_WORK_TREE to point at the root of the working tree, interfering when they tried to use a different working tree without setting GIT_WORK_TREE environment themselves. * nd/stop-setenv-work-tree: Revert "setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR"
2015-12-22Revert "setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR"Nguyễn Thái Ngọc Duy1-2/+0
This reverts d95138e6 (setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR, 2015-06-26). It has caused three regression reports so far. http://article.gmane.org/gmane.comp.version-control.git/281608 http://article.gmane.org/gmane.comp.version-control.git/281979 http://article.gmane.org/gmane.comp.version-control.git/282691 All of them are about spawning git subprocesses, where the new presence of GIT_WORK_TREE either changes command behaviour (git-init or git-clone), or how repo/worktree is detected (from aliases), with or without $GIT_DIR. The original bug will be re-fixed another way. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-26Merge branch 'jk/repository-extension'Junio C Hamano1-0/+1
Prepare for Git on-disk repository representation to undergo backward incompatible changes by introducing a new repository format version "1", with an extension mechanism. * jk/repository-extension: introduce "preciousObjects" repository extension introduce "extensions" form of core.repositoryformatversion
2015-09-25replace trivial malloc + sprintf / strcpy calls with xstrfmtJeff King1-5/+2
It's a common pattern to do: foo = xmalloc(strlen(one) + strlen(two) + 1 + 1); sprintf(foo, "%s %s", one, two); (or possibly some variant with strcpy()s or a more complicated length computation). We can switch these to use xstrfmt, which is shorter, involves less error-prone manual computation, and removes many sprintf and strcpy calls which make it harder to audit the code for real buffer overflows. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-03Merge branch 'nd/export-worktree'Junio C Hamano1-0/+2
Running an aliased command from a subdirectory when the .git thing in the working tree is a gitfile pointing elsewhere did not work. * nd/export-worktree: setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR
2015-06-26setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIRNguyễn Thái Ngọc Duy1-0/+2
In the test case, we run setup_git_dir_gently() the first time to read $GIT_DIR/config so that we can resolve aliases. We'll enter setup_discovered_git_dir() and may or may not call set_git_dir() near the end of the function, depending on whether the detected git dir is ".git" or not. This set_git_dir() will set env var $GIT_DIR. For normal repo, git dir detected via setup_discovered_git_dir() will be ".git", and set_git_dir() is not called. If .git file is used however, the git dir can't be ".git" and set_git_dir() is called and $GIT_DIR set. This is the key of this problem. If we expand an alias (or autocorrect command names), then setup_git_dir_gently() is run the second time. If $GIT_DIR is not set in the first run, we run the same setup_discovered_git_dir() as before. Nothing to see. If it is, however, we'll enter setup_explicit_git_dir() this time. This is where the "fun" is. If $GIT_WORK_TREE is not set but $GIT_DIR is, you are supposed to be at the root level of the worktree. But if you are in a subdir "foo/bar" (real worktree's top is "foo"), this rule bites you: your detected worktree is now "foo/bar", even though the first run correctly detected worktree as "foo". You get "internal error: work tree has already been set" as a result. Bottom line is, when $GIT_DIR is set, $GIT_WORK_TREE should be set too unless there's no work tree. But setting $GIT_WORK_TREE inside set_git_dir() may backfire. We don't know at that point if work tree is already configured by the caller. So set it when work tree is detected. It does not harm if $GIT_WORK_TREE is set while $GIT_DIR is not. Reported-by: Bjørnar Snoksrud <snoksrud@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-24introduce "preciousObjects" repository extensionJeff King1-0/+1
If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see. For instance, if you do: $ git clone -s parent child $ git -C parent config extensions.preciousObjects true $ git -C parent config core.repositoryformatversion 1 you now have additional safety when running git in the parent repository. Prunes and repacks will bail with an error, and `git gc` will skip those operations (it will continue to pack refs and do other non-object operations). Older versions of git, when run in the repository, will fail on every operation. Note that we do not set the preciousObjects extension by default when doing a "clone -s", as doing so breaks backwards compatibility. It is a decision the user should make explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-12Allow to control where the replace refs are looked forMike Hommey1-0/+6
It can be useful to have grafts or replace refs for specific use-cases while keeping the default "view" of the repository pristine (or with a different set of grafts/replace refs). It is possible to use a different graft file with GIT_GRAFT_FILE, but while replace refs are more powerful, they don't have an equivalent override. Add a GIT_REPLACE_REF_BASE environment variable to control where git is going to look for replace refs. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-11Merge branch 'nd/multiple-work-trees'Junio C Hamano1-6/+28
A replacement for contrib/workdir/git-new-workdir that does not rely on symbolic links and make sharing of objects and refs safer by making the borrowee and borrowers aware of each other. * nd/multiple-work-trees: (41 commits) prune --worktrees: fix expire vs worktree existence condition t1501: fix test with split index t2026: fix broken &&-chain t2026 needs procondition SANITY git-checkout.txt: a note about multiple checkout support for submodules checkout: add --ignore-other-wortrees checkout: pass whole struct to parse_branchname_arg instead of individual flags git-common-dir: make "modules/" per-working-directory directory checkout: do not fail if target is an empty directory t2025: add a test to make sure grafts is working from a linked checkout checkout: don't require a work tree when checking out into a new one git_path(): keep "info/sparse-checkout" per work-tree count-objects: report unused files in $GIT_DIR/worktrees/... gc: support prune --worktrees gc: factor out gc.pruneexpire parsing code gc: style change -- no SP before closing parenthesis checkout: clean up half-prepared directories in --to mode checkout: reject if the branch is already checked out elsewhere prune: strategies for linked checkouts checkout: support checking out into a new working directory ...
2015-03-20refs: introduce a "ref paranoia" flagJeff King1-0/+1
Most operations that iterate over refs are happy to ignore broken cruft. However, some operations should be performed with knowledge of these broken refs, because it is better for the operation to choke on a missing object than it is to silently pretend that the ref did not exist (e.g., if we are computing the set of reachable tips in order to prune objects). These processes could just call for_each_rawref, except that ref iteration is often hidden behind other interfaces. For instance, for a destructive "repack -ad", we would have to inform "pack-objects" that we are destructive, and then it would in turn have to tell the revision code that our "--all" should include broken refs. It's much simpler to just set a global for "dangerous" operations that includes broken refs in all iterations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17Sync with v2.0.5Junio C Hamano1-0/+10
* maint-2.0: Git 2.0.5 Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.9.5Junio C Hamano1-0/+10
* maint-1.9: Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.8.5.6Junio C Hamano1-0/+10
* maint-1.8.5: Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17read-cache: optionally disallow NTFS .git variantsJohannes Schindelin1-0/+5
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for NTFS and FAT32; let's use it in verify_path(). We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on NTFS nor FAT32. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectNTFS config option. Though this is expected to be most useful on Windows, we allow it to be set everywhere, as NTFS may be mounted on other platforms. The variable does default to on for Windows, though. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17read-cache: optionally disallow HFS+ .git variantsJeff King1-0/+5
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for HFS+; let's use it in verify_path. We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on HFS+. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectHFS config option. Though this is expected to be most useful on OS X, we allow it to be set everywhere, as HFS+ may be mounted on other platforms. The variable does default to on for OS X, though. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01setup.c: support multi-checkout repo setupNguyễn Thái Ngọc Duy1-5/+3
The repo setup procedure is updated to detect $GIT_DIR/commondir and set $GIT_COMMON_DIR properly. The core.worktree is ignored when $GIT_COMMON_DIR is set. This is because the config file is shared in multi-checkout setup, but checkout directories _are_ different. Making core.worktree effective in all checkouts mean it's back to a single checkout. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01$GIT_COMMON_DIR: a new environment variableNguyễn Thái Ngọc Duy1-7/+22
This variable is intended to support multiple working directories attached to a repository. Such a repository may have a main working directory, created by either "git init" or "git clone" and one or more linked working directories. These working directories and the main repository share the same repository directory. In linked working directories, $GIT_COMMON_DIR must be defined to point to the real repository directory and $GIT_DIR points to an unused subdirectory inside $GIT_COMMON_DIR. File locations inside the repository are reorganized from the linked worktree view point: - worktree-specific such as HEAD, logs/HEAD, index, other top-level refs and unrecognized files are from $GIT_DIR. - the rest like objects, refs, info, hooks, packed-refs, shallow... are from $GIT_COMMON_DIR (except info/sparse-checkout, but that's a separate patch) Scripts are supposed to retrieve paths in $GIT_DIR with "git rev-parse --git-path", which will take care of "$GIT_DIR vs $GIT_COMMON_DIR" business. The redirection is done by git_path(), git_pathdup() and strbuf_git_path(). The selected list of paths goes to $GIT_COMMON_DIR, not the other way around in case a developer adds a new worktree-specific file and it's accidentally promoted to be shared across repositories (this includes unknown files added by third party commands) The list of known files that belong to $GIT_DIR are: ADD_EDIT.patch BISECT_ANCESTORS_OK BISECT_EXPECTED_REV BISECT_LOG BISECT_NAMES CHERRY_PICK_HEAD COMMIT_MSG FETCH_HEAD HEAD MERGE_HEAD MERGE_MODE MERGE_RR NOTES_EDITMSG NOTES_MERGE_WORKTREE ORIG_HEAD REVERT_HEAD SQUASH_MSG TAG_EDITMSG fast_import_crash_* logs/HEAD next-index-* rebase-apply rebase-merge rsync-refs-* sequencer/* shallow_* Path mapping is NOT done for git_path_submodule(). Multi-checkouts are not supported as submodules. Helped-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01git_path(): be aware of file relocation in $GIT_DIRNguyễn Thái Ngọc Duy1-5/+14
We allow the user to relocate certain paths out of $GIT_DIR via environment variables, e.g. GIT_OBJECT_DIRECTORY, GIT_INDEX_FILE and GIT_GRAFT_FILE. Callers are not supposed to use git_path() or git_pathdup() to get those paths. Instead they must use get_object_directory(), get_index_file() and get_graft_file() respectively. This is inconvenient and could be missed in review (for example, there's git_path("objects/info/alternates") somewhere in sha1_file.c). This patch makes git_path() and git_pathdup() understand those environment variables. So if you set GIT_OBJECT_DIRECTORY to /foo/bar, git_path("objects/abc") should return /foo/bar/abc. The same is done for the two remaining env variables. "git rev-parse --git-path" is the wrapper for script use. This patch kinda reverts a0279e1 (setup_git_env: use git_pathdup instead of xmalloc + sprintf - 2014-06-19) because using git_pathdup here would result in infinite recursion: setup_git_env() -> git_pathdup("objects") -> .. -> adjust_git_path() -> get_object_directory() -> oops, git_object_directory is NOT set yet -> setup_git_env() I wanted to make git_pathdup_literal() that skips adjust_git_path(). But that won't work because later on when $GIT_COMMON_DIR is introduced, git_pathdup_literal("objects") needs adjust_git_path() to replace $GIT_DIR with $GIT_COMMON_DIR. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-25setup_git_env(): introduce git_path_from_env() helperJeff King1-9/+9
"Check the value of an environment and fall back to a known path inside $GIT_DIR" is repeated a few times to determine the location of the data store, the index and the graft file, but the return value of getenv is not guaranteed to survive across further invocations of setenv or even getenv. Make sure to xstrdup() the value we receive from getenv(3), and encapsulate the pattern into a helper function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19setup_git_env: use git_pathdup instead of xmalloc + sprintfJeff King1-8/+4
This is shorter, harder to get wrong, and more clearly captures the intent. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-16Merge branch 'sh/enable-preloadindex'Junio C Hamano1-1/+1
* sh/enable-preloadindex: environment.c: enable core.preloadindex by default
2014-06-06Merge branch 'nd/status-auto-comment-char'Junio C Hamano1-0/+1
* nd/status-auto-comment-char: commit: allow core.commentChar=auto for character auto selection config: be strict on core.commentChar
2014-06-03environment.c: enable core.preloadindex by defaultSteve Hoelzer1-1/+1
Many people are on filesystems with horrible stat latency (not limited to Windows but also NFS), which core.preloadindex was designed to help. We discussed enabling it by default early in 2013 but didn't. Per http://thread.gmane.org/gmane.comp.version-control.git/219273/focus=219322 let's enable the setting by default, with the original choice of max 20 threads / min 500 paths per thread parameters. Signed-off-by: Steve Hoelzer <shoelzer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-19commit: allow core.commentChar=auto for character auto selectionNguyễn Thái Ngọc Duy1-0/+1
When core.commentChar is "auto", the comment char starts with '#' as in default but if it's already in the prepared message, find another char in a small subset. This should stop surprises because git strips some lines unexpectedly. Note that git is not smart enough to recognize '#' as the comment char in custom templates and convert it if the final comment char is different. It thinks '#' lines in custom templates as part of the commit message. So don't use this with custom templates. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-06Bump core.deltaBaseCacheLimit to 96mDavid Kastrup1-1/+1
The default of 16m causes serious thrashing for large delta chains combined with large files. Here are some benchmarks (pu variant of git blame): time git blame -C src/xdisp.c >/dev/null for a repository of Emacs repacked with git gc --aggressive (v1.9, resulting in a window size of 250) located on an SSD drive. The file in question has about 30000 lines, 1Mb of size, and a history with about 2500 commits. 16m (previous default): real 3m33.936s user 2m15.396s sys 1m17.352s 32m: real 3m1.319s user 2m8.660s sys 0m51.904s 64m: real 2m20.636s user 1m55.780s sys 0m23.964s 96m: real 2m5.668s user 1m50.784s sys 0m14.288s 128m: real 2m4.337s user 1m50.764s sys 0m12.832s 192m: real 2m3.567s user 1m49.508s sys 0m13.312s Signed-off-by: David Kastrup <dak@gnu.org> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>