git.git - The core git plumbing

Age	Commit message (Collapse)	Author	Files	Lines
12 days	refs/files: use correct repository	Patrick Steinhardt	1	-10/+13
	There are several places in the "files" backend where we use `the_repository` instead of the repository associated with the ref store itself. Adapt those to use the correct repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: remove `dwim_log()`	Patrick Steinhardt	5	-10/+4
	Remove `dwim_log()` in favor of `repo_dwim_log()` so that we can get rid of one more dependency on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: drop `git_default_branch_name()`	Patrick Steinhardt	6	-20/+18
	The `git_default_branch_name()` function is a thin wrapper around `repo_default_branch_name()` with two differences: - We implicitly rely on `the_repository`. - We cache the default branch name. None of the callsites of `git_default_branch_name()` are hot code paths though, so the caching of the branch name is not really required. Refactor the callsites to use `repo_default_branch_name()` instead and drop `git_default_branch_name()`, thus getting rid of one more case where we rely on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: pass repo when peeling objects	Patrick Steinhardt	20	-36/+37
	Both `peel_object()` and `peel_iterated_oid()` implicitly rely on `the_repository` to look up objects. Despite the fact that we want to get rid of `the_repository`, it also leads to some restrictions in our ref iterators when trying to retrieve the peeled value for a repository other than `the_repository`. Refactor these functions such that both take a repository as argument and remove the now-unnecessary restrictions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: move object peeling into "object.c"	Patrick Steinhardt	4	-56/+55
	Peeling an object has nothing to do with refs, but we still have the code in "refs.c". Move it over into "object.c", which is a more natural place to put it. Ideally, we'd also move `peel_iterated_oid()` over into "object.c". But this function is tied to the refs interfaces because it uses a global ref iterator variable to optimize peeling when the iterator already has the peeled object ID readily available. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: pass ref store when detecting dangling symrefs	Patrick Steinhardt	4	-25/+28
	Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the ref store via `the_repository`. Adapt them to instead take in the ref store as a parameter. While at it, rename the functions to have a `ref_` prefix to align them with other functions that take a ref store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: convert iteration over replace refs to accept ref store	Patrick Steinhardt	6	-81/+28
	The function `for_each_replace_ref()` is a bit of an oddball across the refs interfaces as it accepts a pointer to the repository instead of a pointer to the ref store. The only reason for us to accept a repository is so that we can eventually pass it back to the callback function that the caller has provided. This is somewhat arbitrary though, as callers that need the repository can instead make it accessible via the callback payload. Refactor the function to instead accept the ref store and adjust callers accordingly. This allows us to get rid of some of the boilerplate that we had to carry to pass along the repository and brings us in line with the other functions that iterate through refs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: retrieve worktree ref stores via associated repository	Patrick Steinhardt	5	-13/+28
	Similar as with the preceding commit, the worktree ref stores are always looked up via `the_repository`. Also, again, those ref stores are stored in a global map. Refactor the code so that worktrees have a pointer to their repository. Like this, we can move the global map into `struct repository` and stop using `the_repository`. With this change, we can now in theory look up worktree ref stores for repositories other than `the_repository`. In practice, the worktree code will need further changes to look up arbitrary worktrees. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: refactor `resolve_gitlink_ref()` to accept a repository	Patrick Steinhardt	11	-18/+29
	In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to look up the submodule ref store. Now that we can look up submodule ref stores for arbitrary repositories we can improve this function to instead accept a repository as parameter for which we want to resolve the gitlink. Do so and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: pass repo when retrieving submodule ref store	Patrick Steinhardt	8	-21/+33
	Looking up submodule ref stores has two deficiencies: - The initialized subrepo will be attributed to `the_repository`. - The submodule ref store will be tracked in a global map. This makes it impossible to have submodule ref stores for a repository other than `the_repository`. Modify the function to accept the parent repository as parameter and move the global map into `struct repository`. Like this it becomes possible to look up submodule ref stores for arbitrary repositories. Note that this also adds a new reference to `the_repository` in `resolve_gitlink_ref()`, which is part of the refs interfaces. This will get adjusted in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: track ref stores via strmap	Patrick Steinhardt	1	-57/+14
	The refs code has two global maps that track the submodule and worktree ref stores. Even though both of these maps track values by strings, we still use a `struct hashmap` instead of a `struct strmap`. This has the benefit of saving us an allocation because we can combine key and value in a single struct. But it does introduce significant complexity that is completely unneeded. Refactor the code to use `struct strmap`s instead to reduce complexity. It's unlikely that this will have any real-world impact on performance given that most repositories likely won't have all that many ref stores. Furthermore, this refactoring allows us to de-globalize those maps and move them into `struct repository` in a subsequent commit more easily. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: implement releasing ref storages	Patrick Steinhardt	7	-0/+68
	Ref storages are typically only initialized once for `the_repository` and then never released. Until now we got away with that without causing memory leaks because `the_repository` stays reachable, and because the ref backend is reachable via `the_repository` its memory basically never leaks. This is about to change though because of the upcoming migration logic, which will create a secondary ref storage. In that case, we will either have to release the old or new ref storage to avoid leaks. Implement a new `release` callback and expose it via a new `ref_storage_release()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: rename `init_db` callback to avoid confusion	Patrick Steinhardt	9	-29/+29
	Reference backends have two callbacks `init` and `init_db`. The similarity of these two callbacks has repeatedly confused me whenever I was looking at them, where I always had to look up which of them does what. Rename the `init_db` callback to `create_on_disk`, which should hopefully be clearer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	refs: adjust names for `init` and `init_db` callbacks	Patrick Steinhardt	3	-16/+16
	The names of the functions that implement the `init` and `init_db` callbacks in the "files" and "packed" backends do not match the names of the callbacks, which is inconsistent. Rename them so that they match, which makes it easier to discover their respective implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	SubmittingPatches: add section for iterating patches	Karthik Nayak	1	-0/+79
	Add a section to explain how to work around other in-flight patches and how to navigate conflicts which arise as a series is being iterated. This provides the necessary steps that users can follow to reduce friction with other ongoing topics and also provides guidelines on how the users can also communicate this to the list efficiently. Co-authored-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
12 days	Merge branch 'jc/patch-flow-updates' into kn/patch-iteration-doc	Junio C Hamano	1	-51/+70
	* jc/patch-flow-updates: SubmittingPatches: extend the "flow" section SubmittingPatches: move the patch-flow section earlier
12 days	completion: adapt git-config(1) to complete subcommands	Patrick Steinhardt	2	-25/+73
	With fe3ccc7aab (Merge branch 'ps/config-subcommands', 2024-05-15), git-config(1) has gained support for subcommands. These subcommands live next to the old, action-based mode, so that both the old and new way continue to work. The manpage for this command has been updated to prominently show the subcommands, and the action-based modes are marked as deprecated. Update Bash completion scripts accordingly to advertise subcommands instead of actions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 days	t0017: clarify dubious test set-up	Junio C Hamano	1	-1/+8
	1ff750b1 (tests: make GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21) added this test, in which "test-tool -C" is fed a name of a directory that does not exist, and expects that it dies because of a failure to read the configuration file(s), because the configuration setting is screwed up to contain mutual inclusion loop, before it notices that the directory to chdir into does not exist and dies. It is of dubious value to etch the current order of events, i.e., the configuration needs to be read that early (for initializing trace2 subsystem) before we even notice the lack of the directory and have a chance to fail, into stone. Indeed, if you completely compile out trace2 subsystem so that it does not even attempt to read the configuration that early, we would die with a different error message (i.e. "unable to chdir to 'cycle'") and this test will fail. At least give a bogus argument to "test-tool -C" a name that is clearly bogus to make sure we can more easily see what is going on with plenty of comments. We may want to remove this test altogether, instead, though. Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 days	The fifth batch	Junio C Hamano	1	-0/+7
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 days	Merge branch 'ps/refs-without-the-repository'	Junio C Hamano	77	-680/+1052
	The refs API lost functions that implicitly assumes to work on the primary ref_store by forcing the callers to pass a ref_store as an argument. * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
13 days	Merge branch 'jl/git-no-advice'	Junio C Hamano	5	-10/+104
	A new global "--no-advice" option can be used to disable all advice messages, which is meant to be used only in scripts. * jl/git-no-advice: t0018: two small fixes advice: add --no-advice global option doc: add spacing around paginate options doc: clean up usage documentation for --no-* opts
13 days	Merge branch 'rs/external-diff-with-exit-code'	Junio C Hamano	2	-38/+3
	* rs/external-diff-with-exit-code: Revert "diff: fix --exit-code with external diff"
13 days	Revert "diff: fix --exit-code with external diff"	Junio C Hamano	2	-38/+3
	This reverts commit 11be65cfa43416219e85384a3a80d672b65b76ba, per original author's request to come up with a better strategy.
13 days	Merge branch 'ps/refs-without-the-repository' into ↵	Junio C Hamano	77	-680/+1052
	ps/refs-without-the-repository-updates * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
13 days	t/t0211-trace2-perf.sh: fix typo patern -> pattern	Marcel Telka	1	-1/+1
	The bug went unnoticed because grep with null RE matches everything. Signed-off-by: Marcel Telka <marcel@telka.sk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	osxkeychain: state to skip unnecessary store operations	Koji Nakamaru	1	-0/+11
	git passes a credential that has been used successfully to the helpers to record. If a credential is already stored, "git-credential-osxkeychain store" just records the credential returned by "git-credential-osxkeychain get", and unnecessary (sometimes problematic) SecItemAdd() and/or SecItemUpdate() are performed. We can skip such unnecessary operations by marking a credential returned by "git-credential-osxkeychain get". This marking can be done by utilizing the "state[]" feature: - The "get" command sets the field "state[]=osxkeychain:seen=1". - The "store" command skips its actual operation if the field "state[]=osxkeychain:seen=1" exists. Introduce a new state "state[]=osxkeychain:seen=1". Suggested-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	osxkeychain: exclusive lock to serialize execution of operations	Koji Nakamaru	1	-0/+3
	git passes a credential that has been used successfully to the helpers to record. If "git-credential-osxkeychain store" commands run in parallel (with fetch.parallel configuration and/or by running multiple git commands simultaneously), some of them may exit with the error "failed to store: -25299". This is because SecItemUpdate() in add_internet_password() may return errSecDuplicateItem (-25299) in this situation. Apple's documentation [1] also states as below: In macOS, some of the functions of this API block while waiting for input from the user (for example, when the user is asked to unlock a keychain or give permission to change trust settings). In general, it is safe to use this API in threads other than your main thread, but avoid calling the functions from multiple operations, work queues, or threads concurrently. Instead, serialize function calls or confine them to a single thread. The error has not been noticed before, because the former implementation ignored the error. Introduce an exclusive lock to serialize execution of operations. [1] https://developer.apple.com/documentation/security/certificate_key_and_trust_services/working_with_concurrency Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	The fourth batch	Junio C Hamano	1	-1/+19
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	Merge branch 'ds/scalar-reconfigure-all-fix'	Junio C Hamano	2	-3/+45
	Scalar fix. * ds/scalar-reconfigure-all-fix: scalar: avoid segfault in reconfigure --all
14 days	Merge branch 'vd/doc-merge-tree-x-option'	Junio C Hamano	1	-0/+5
	Doc update. * vd/doc-merge-tree-x-option: Documentation/git-merge-tree.txt: document -X
14 days	Merge branch 'rs/external-diff-with-exit-code'	Junio C Hamano	3	-3/+47
	The "--exit-code" option of "git diff" command learned to work with the "--ext-diff" option. * rs/external-diff-with-exit-code: diff: fix --exit-code with external diff diff: report unmerged paths as changes in run_diff_cmd()
14 days	Merge branch 'jt/port-ci-whitespace-check-to-gitlab'	Junio C Hamano	4	-64/+109
	The "whitespace check" task that was enabled for GitHub Actions CI has been ported to GitLab CI. * jt/port-ci-whitespace-check-to-gitlab: gitlab-ci: add whitespace error check ci: make the whitespace report optional ci: separate whitespace check script github-ci: fix link to whitespace error ci: pre-collapse GitLab CI sections
14 days	Merge branch 'ow/refspec-glossary-update'	Junio C Hamano	1	-1/+2
	Doc update. * ow/refspec-glossary-update: Documentation: Mention that refspecs are explained elsewhere
14 days	Merge branch 'jp/tag-trailer'	Junio C Hamano	6	-28/+181
	"git tag" learned the "--trailer" option to futz with the trailers in the same way as "git commit" does. * jp/tag-trailer: builtin/tag: add --trailer option builtin/commit: refactor --trailer logic builtin/commit: use ARGV macro to collect trailers
14 days	Merge branch 'ps/config-subcommands'	Junio C Hamano	6	-370/+812
	The operation mode options (like "--get") the "git config" command uses have been deprecated and replaced with subcommands (like "git config get"). * ps/config-subcommands: builtin/config: display subcommand help builtin/config: introduce "edit" subcommand builtin/config: introduce "remove-section" subcommand builtin/config: introduce "rename-section" subcommand builtin/config: introduce "unset" subcommand builtin/config: introduce "set" subcommand builtin/config: introduce "get" subcommand builtin/config: introduce "list" subcommand builtin/config: pull out function to handle `--null` builtin/config: pull out function to handle config location builtin/config: use `OPT_CMDMODE()` to specify modes builtin/config: move "fixed-value" option to correct group builtin/config: move option array around config: clarify memory ownership when preparing comment strings
14 days	Merge branch 'js/unit-test-suite-runner'	Junio C Hamano	11	-30/+74
	The "test-tool" has been taught to run testsuite tests in parallel, bypassing the need to use the "prove" tool. * js/unit-test-suite-runner: cmake: let `test-tool` run the unit tests, too ci: use test-tool as unit test runner on Windows t/Makefile: run unit tests alongside shell tests unit tests: add rule for running with test-tool test-tool run-command testsuite: support unit tests test-tool run-command testsuite: remove hardcoded filter test-tool run-command testsuite: get shell from env t0080: turn t-basic unit test into a helper
14 days	refs: refuse to write pseudorefs	Patrick Steinhardt	2	-3/+10
	Pseudorefs are not stored in the ref database as by definition, they carry additional metadata that essentially makes them not a ref. As such, writing pseudorefs via the ref backend does not make any sense whatsoever as the ref backend wouldn't know how exactly to store the data. Restrict writing pseudorefs via the ref backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	ref-filter: properly distinuish pseudo and root refs	Patrick Steinhardt	5	-27/+31
	The ref-filter interfaces currently define root refs as either a detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so let's properly distinguish those ref types. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	refs: pseudorefs are no refs	Patrick Steinhardt	2	-32/+50
	The `is_root_ref()` function will happily clarify a pseudoref as a root ref, even though pseudorefs are no refs. Next to being wrong, it also leads to inconsistent behaviour across ref backends: while the "files" backend accidentally knows to parse those pseudorefs and thus yields them to the caller, the "reftable" backend won't ever see the pseudoref at all because they are never stored in the "reftable" backend. Fix this issue by filtering out pseudorefs in `is_root_ref()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	refs: classify HEAD as a root ref	Patrick Steinhardt	4	-15/+5
	Root refs are those refs that live in the root of the ref hierarchy. Our old and venerable "HEAD" reference falls into this category, but we don't yet classify it as such in `is_root_ref()`. Adapt the function to also treat "HEAD" as a root ref. This change is safe to do for all current callers: - `ref_kind_from_refname()` already handles "HEAD" explicitly before calling `is_root_ref()`. - The "files" and "reftable" backends explicitly call both `is_root_ref()` and `is_headref()` together. This also aligns behaviour or `is_root_ref()` and `is_headref()` such that we stop checking for ref existence. This changes semantics for our backends: - In the reftable backend we already know that the ref must exist because `is_headref()` is called as part of the ref iterator. The existence check is thus redundant, and the change is safe to do. - In the files backend we use it when populating root refs, where we would skip adding the "HEAD" file if it was not possible to resolve it. The new behaviour is to instead mark "HEAD" as broken, which will cause us to emit warnings in various places. As there are no callers of `is_headref()` left afer the refactoring, we can absorb it completely into `is_root_ref()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	refs: do not check ref existence in `is_root_ref()`	Patrick Steinhardt	6	-20/+29
	Before this patch series, root refs except for "HEAD" and our special refs were classified as pseudorefs. Furthermore, our terminology clarified that pseudorefs must not be symbolic refs. This restriction is enforced in `is_root_ref()`, which explicitly checks that a supposed root ref resolves to an object ID without recursing. This has been extremely confusing right from the start because (in old terminology) a ref name may sometimes be a pseudoref and sometimes not depending on whether it is a symbolic or regular ref. This behaviour does not seem reasonable at all and I very much doubt that it results in anything sane. Last but not least, the current behaviour can actually lead to a segfault when calling `is_root_ref()` with a reference that either does not exist or that is a symbolic ref because we never initialized `oid`, but then read it via `is_null_oid()`. We have now changed terminology to clarify that pseudorefs are really only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live in the root of the ref hierarchy are just plain refs. Thus, we do not need to check whether the ref is symbolic or not. In fact, we can now avoid looking up the ref completely as the name is sufficient for us to figure out whether something would be a root ref or not. This change of course changes semantics for our callers. As there are only three of them we can assess each of them individually: - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs. It's clear that the intent is to classify based on the ref name, only. - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to filter root refs. Again, using existence checks is pointless here as the iterator has just surfaced the ref, so we know it does exist. - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to determine whether it should add a ref to the root directory of its iterator. This had the effect that we skipped over any files that are either a symbolic ref, or which are not a ref at all. The new behaviour is to include symbolic refs know, which aligns us with the adapted terminology. Furthermore, files which look like root refs but aren't are now mark those as "broken". As broken refs are not surfaced by our tooling, this should not lead to a change in user-visible behaviour, but may cause us to emit warnings. This feels like the right thing to do as we would otherwise just silently ignore corrupted root refs completely. So in all cases the existence check was either superfluous, not in line with the adapted terminology or masked potential issues. This commit thus changes the behaviour as proposed and drops the existence check altogether. Add a test that verifies that this does not change user-visible behaviour. Namely, we still don't want to show broken refs to the user by default in git-for-each-ref(1). What this does allow though is for internal callers to surface dangling root refs when they pass in the `DO_FOR_EACH_INCLUDE_BROKEN` flag. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	refs: rename `is_special_ref()` to `is_pseudo_ref()`	Patrick Steinhardt	1	-10/+10
	Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly defined terminology in our gitglossary(7). Note that in the preceding commit we have just renamed `is_pseudoref()` to `is_root_ref()`, where there may be confusion for in-flight patch series that add new calls to `is_pseudoref()`. In order to intentionally break such patch series we have thus picked `is_pseudo_ref()` instead of `is_pseudoref()` as the new name. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	refs: rename `is_pseudoref()` to `is_root_ref()`	Patrick Steinhardt	5	-15/+37
	Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined terminology in our gitglossary(7). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	Documentation/glossary: define root refs as refs	Patrick Steinhardt	1	-7/+25
	Except for the pseudorefs MERGE_HEAD and FETCH_HEAD, all refs that live in the root of the ref hierarchy behave the exact same as normal refs. They can be symbolic refs or direct refs and can be read, iterated over and written via normal tooling. All of these refs are stored in the ref backends, which further demonstrates that they are just normal refs. Extend the definition of "ref" to also cover such root refs. The only additional restriction for root refs is that they must conform to a specific naming schema. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	Documentation/glossary: clarify limitations of pseudorefs	Patrick Steinhardt	1	-2/+2
	Clarify limitations that pseudorefs have: - They can be read via git-rev-parse(1) and similar tools. - They are not surfaced when iterating through refs, like when using git-for-each-ref(1). They are not refs, so iterating through refs should not surface them. - They cannot be written via git-update-ref(1) and related commands. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	Documentation/glossary: redefine pseudorefs as special refs	Patrick Steinhardt	1	-28/+12
	Nowadays, Git knows about three different kinds of refs. As defined in gitglossary(7): - Regular refs that start with "refs/", like "refs/heads/main". - Pseudorefs, which live in the root directory. These must have all-caps names and must be a file that start with an object hash. Consequently, symbolic refs are not pseudorefs because they do not start with an object hash. - Special refs, of which we only have "FETCH_HEAD" and "MERGE_HEAD". This state is extremely confusing, and I would claim that most folks don't fully understand what is what here. The current definitions also have several problems: - Where does "HEAD" fit in? It's not a pseudoref because it can be a symbolic ref. It's not a regular ref because it does not start with "refs/". And it's not a special ref, either. - There is a strong overlap between pseudorefs and special refs. The pseudoref section for example mentions "MERGE_HEAD", even though it is a special ref. Is it thus both a pseudoref and a special ref? - Why do we even need to distinguish refs that live in the root from other refs when they behave just like a regular ref anyway? In other words, the current state is quite a mess and leads to wild inconsistencies without much of a good reason. The original reason why pseudorefs were introduced is that there are some refs that sometimes behave like a ref, even though they aren't a ref. And we really only have two of these nowadays, namely "MERGE_HEAD" and "FETCH_HEAD". Those files are never written via the ref backends, but are instead written by git-fetch(1), git-pull(1) and git-merge(1). They contain additional metadata that highlights where a ref has been fetched from or the list of commits that have been merged. This original intent in fact matches the definition of special refs that we have recently introduced in 8df4c5d205 (Documentation: add "special refs" to the glossary, 2024-01-19). Due to the introduction of the new reftable backend we were forced to distinguish those refs more clearly such that we don't ever try to read or write them via the reftable backend. In the same series, we also addressed all the other cases where we used to write those special refs via the filesystem directly, thus circumventing the ref backend, to instead write them via the backends. Consequently, there are no other refs left anymore which are special. Let's address this mess and return the pseudoref terminology back to its original intent: a ref that sometimes behave like a ref, but which isn't really a ref because it gets written to the filesystem directly. Or in other words, let's redefine pseudorefs to match the current definition of special refs. As special refs and pseudorefs are now the same per definition, we can drop the "special refs" term again. It's not exposed to our users and thus they wouldn't ever encounter that term anyway. Refs that live in the root of the ref hierarchy but which are not pseudorefs will be further defined in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: pass data between callbacks via local variables	Patrick Steinhardt	1	-38/+52
	We use several global variables to pass data between callers and callbacks in `get_color()` and `get_colorbool()`. Convert those to use callback data structures instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert flags to a local variable	Patrick Steinhardt	1	-19/+29
	Both the `do_all` and `use_key_regexp` bits essentially act like flags to `get_value()`. Let's convert them to actual flags so that we can get rid of the last two remaining global variables that track options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: track "fixed value" option via flags only	Patrick Steinhardt	1	-7/+7
	We track the "fixed value" option via two separate bits: once via the global variable `fixed_value`, and once via the CONFIG_FLAGS_FIXED_VALUE bit in `flags`. This is confusing and may easily lead to issues when one is not aware that this is tracked via two separate mechanisms. Refactor the code to use the flag exclusively. We already pass it to all the required callsites anyway, except for `collect_config()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert `key` to a local variable	Patrick Steinhardt	1	-2/+5
	The `key` variable is used by the `get_value()` function for two purposes: - It is used to store the result of `git_config_parse_key()`, which is then passed on to `collect_config()`. - It is used as a store to convert the provided key to an all-lowercase key when `use_key_regexp` is set. Neither of these cases warrant a global variable at all. In the former case we can pass the key via `struct collect_config_data`. And in the latter case we really only want to have it as a temporary local variable such that we can free associated memory. Refactor the code accordingly to reduce our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert `key_regexp` to a local variable	Patrick Steinhardt	1	-8/+8
	The `key_regexp` variable is used by the `format_config()` callback when `use_key_regexp` is set. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert `regexp` to a local variable	Patrick Steinhardt	1	-9/+9
	The `regexp` variable is used by the `format_config()` callback when `CONFIG_FLAGS_FIXED_VALUE` is not set. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert `value_pattern` to a local variable	Patrick Steinhardt	1	-3/+3
	The `value_pattern` variable is used by the `format_config()` callback when `CONFIG_FLAGS_FIXED_VALUE` is used. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: convert `do_not_match` to a local variable	Patrick Steinhardt	1	-3/+3
	The `do_not_match` variable is used by the `format_config()` callback as an indicator whether or not the passed regular expression is negated. It is only ever set up by its only caller, `collect_config()` and can thus easily be moved into the `collect_config_data` structure. Do so to remove our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move `respect_includes_opt` into location options	Patrick Steinhardt	1	-7/+12
	The variable tracking whether or not we want to honor includes is tracked via a global variable. Move it into the location options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move default value into display options	Patrick Steinhardt	1	-8/+11
	The default value is tracked via a global variable. Move it into the display options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move type options into display options	Patrick Steinhardt	1	-31/+29
	The type options are tracked via a global variable. Move it into the display options instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move display options into local variables	Patrick Steinhardt	1	-70/+101
	The display options are tracked via a set of global variables. Move them into a self-contained structure so that we can easily parse all relevant options and hand them over to the various functions that require them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move location options into local variables	Patrick Steinhardt	1	-137/+176
	The location options are tracked via a set of global variables. Move them into a self-contained structure so that we can easily parse all relevant options and hand them over to the various functions that require them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: refactor functions to have common exit paths	Patrick Steinhardt	1	-26/+38
	Refactor functions to have a single exit path. This will make it easier in subsequent commits to add common cleanup code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	config: make the config source const	Patrick Steinhardt	2	-3/+3
	The `struct git_config_source` passed to `config_with_options()` is never modified. Let's mark it as `const` to clarify. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: check for writeability after source is set up	Patrick Steinhardt	2	-5/+11
	The `check_write()` function verifies that we do not try to write to a config source that cannot be written to, like for example stdin. But while the new subcommands do call this function, they do so before calling `handle_config_location()`. Consequently, we only end up checking the default config location for writeability, not the location that was actually specified by the caller of git-config(1). Fix this by calling `check_write()` after `handle_config_location()`. We will further clarify the relationship between those two functions in a subsequent commit where we remove the global state that both implicitly rely on. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move actions into `cmd_config_actions()`	Patrick Steinhardt	1	-25/+23
	We only use actions in the legacy mode. Convert them to an enum and move them into `cmd_config_actions()` to clearly demonstrate that they are not used anywhere else. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move legacy options into `cmd_config()`	Patrick Steinhardt	1	-30/+30
	Move the legacy options as well some of the variables it references into `cmd_config_action()`. This reduces our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move subcommand options into `cmd_config()`	Patrick Steinhardt	1	-14/+14
	Move the subcommand options as well as the `subcommand` variable into `cmd_config()`. This reduces our reliance on global state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: move legacy mode into its own function	Patrick Steinhardt	1	-19/+24
	In `cmd_config()` we first try to parse the provided arguments as subcommands and, if this is successful, call the respective functions of that subcommand. Otherwise we continue with the "legacy" mode that uses implicit actions and/or flags. Disentangle this by moving the legacy mode into its own function. This allows us to move the options into the respective functions and clearly separates concerns. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
14 days	builtin/config: stop printing full usage on misuse	Patrick Steinhardt	2	-18/+13
	When invoking git-config(1) with a wrong set of arguments we end up calling `usage_builtin_config()` after printing an error message that says what was wrong. As that function ends up printing the full list of options, which is quite long, the actual error message will be buried by a wall of text. This makes it really hard to figure out what exactly caused the error. Furthermore, now that we have recently introduced subcommands, the usage information may actually be misleading as we unconditionally print options of the subcommand-less mode. Fix both of these issues by just not printing the options at all anymore. Instead, we call `usage()` that makes us report in a single line what has gone wrong. This should be way more discoverable for our users and addresses the inconsistency. Furthermore, this change allow us to inline the options into the respective functions that use them to parse the command line. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: introduce `bitmap_writer_free()`	Taylor Blau	4	-1/+27
	Now that there is clearer memory ownership around the bitmap_writer structure, introduce a bitmap_writer_free() function that callers may use to free any memory associated with their instance of the bitmap_writer structure. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap-write.c: avoid uninitialized 'write_as' field	Taylor Blau	1	-0/+1
	Prepare to free() memory associated with bitmapped_commit structs by zero'ing the 'write_as' field. In ideal cases, it is fine to do something like: for (i = 0; i < writer->selected_nr; i++) { struct bitmapped_commit *bc = &writer->selected[i]; if (bc->write_as != bc->bitmap) ewah_free(bc->write_as); ewah_free(bc->bitmap); } but if not all of the 'write_as' fields were populated (e.g., because the packing_data given does not form a reachability closure), then we may attempt to free uninitialized memory. Guard against this by preemptively zero'ing this field just in case. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: drop unused `max_bitmaps` parameter	Taylor Blau	4	-12/+4
	The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was introduced back in 7cc8f97108 (pack-objects: implement bitmap writing, 2013-12-21), making it original to the bitmap implementation in Git itself. When that patch was merged via 0f9e62e084 (Merge branch 'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c passed a value of "-1" for `max_bitmaps`, indicating no limit. Since then, the only other caller (in midx.c, added via c528e17966 (pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value of "-1" for `max_bitmaps`. Since no callers have needed a finite limit for the `max_bitmaps` parameter in the nearly decade that has passed since 0f9e62e084, let's remove the parameter and any dead pieces of code connected to it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap: avoid use of static `bitmap_writer`	Taylor Blau	4	-123/+159
	The pack-bitmap machinery uses a structure called 'bitmap_writer' to collect the data necessary to write out .bitmap files. Since its introduction in 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), there has been a single static bitmap_writer structure, which is responsible for all bitmap writing-related operations. In practice, this is OK, since we are only ever writing a single .bitmap file in a single process (e.g., `git multi-pack-index write --bitmap`, `git pack-objects --write-bitmap-index`, `git repack -b`, etc.). However, having a single static variable makes issues like data ownership unclear, when to free variables, what has/hasn't been initialized unclear. Refactor this code to be written in terms of a given bitmap_writer structure instead of relying on a static global. Note that this exposes the structure definition of the bitmap_writer at the pack-bitmap.h level. We could work around this by, e.g., forcing callers to declare their writers as: struct bitmap_writer writer; bitmap_writer_init(&bitmap_writer); and then declaring `bitmap_writer_init()` as taking in a double-pointer like so: void bitmap_writer_init(struct bitmap_writer *writer); which would avoid us having to expose the definition of the structure itself. This patch takes a different approach, since future patches (like for the ongoing pseudo-merge bitmaps work) will want to modify the innards of this structure (in the previous example, via pseudo-merge.c). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	pack-bitmap-write.c: move commit_positions into commit_pos fields	Taylor Blau	1	-25/+16
	In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the bitmapped_commit struct was introduced, including the 'commit_pos' field, which has been unused ever since its introduction more than a decade ago. Instead, we have used the nearby `commit_positions` array leaving the bitmapped_commit struct with an unused 4-byte field. We could drop the `commit_pos` field as unused, and continue to store the values in the auxiliary array. But we could also drop the array and store the data for each bitmapped_commit struct inside of the structure itself, which is what this patch does. In any spot that we previously read `commit_positions[i]`, we can now instead read `writer.selected[i].commit_pos`. There are a few spots that need changing as a result: - write_selected_commits_v1() is a simple transformation, since we're just reading the field. As a result, the function no longer needs an explicit argument to pass the commit_positions array. - write_lookup_table() also no longer needs the explicit commit_positions array passed in as an argument. But it still needs to sort an array of indices into the writer.selected array to read them in commit_pos order, so table_cmp() is adjusted accordingly. - bitmap_writer_finish() no longer needs to allocate, populate, and free the commit_positions table. Instead, we can just write the data directly into each struct bitmapped_commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15	object.h: add flags allocated by pack-bitmap.h	Taylor Blau	1	-0/+1
	In commit 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) the NEEDS_BITMAP flag was introduced into pack-bitmap.h, but no object flags allocation table existed at the time. In 208acbfb82f (object.h: centralize object flag allocation, 2014-03-25) when that table was first introduced, we never added the flags from 7cc8f971085, which has remained the case since. Rectify this by including the flag bit used by pack-bitmap.h into the centralized table in object.h. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	Sync with Git 2.45.1	Junio C Hamano	43	-86/+1283
	* tag 'v2.45.1': (42 commits) Git 2.45.1 Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks ...
2024-05-13	reftable/merged: adapt interface to allow reuse of iterators	Patrick Steinhardt	4	-61/+22
	Refactor the interfaces exposed by `struct reftable_merged_table` and `struct merged_iter` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/stack: provide convenience functions to create iterators	Patrick Steinhardt	5	-30/+63
	There exist a bunch of call sites in the reftable backend that want to create iterators for a reftable stack. This is rather convoluted right now, where you always have to go via the merged table. And it is about to become even more convoluted when we split up iterator initialization and seeking in the next commit. Introduce convenience functions that allow the caller to create an iterator from a reftable stack directly without going through the merged table. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: adapt interface to allow reuse of iterators	Patrick Steinhardt	3	-76/+35
	Refactor the interfaces exposed by `struct reftable_reader` and `struct table_iterator` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/generic: adapt interface to allow reuse of iterators	Patrick Steinhardt	4	-22/+68
	Refactor the interfaces exposed by `struct reftable_table` and `struct reftable_iterator` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/generic: move seeking of records into the iterator	Patrick Steinhardt	5	-117/+177
	Reftable iterators are created by seeking on the parent structure of a corresponding record. For example, to create an iterator for the merged table you would call `reftable_merged_table_seek_ref()`. Most notably, it is not posible to create an iterator and then seek it afterwards. While this may be a bit easier to reason about, it comes with two significant downsides. The first downside is that the logic to find records is split up between the parent data structure and the iterator itself. Conceptually, it is more straight forward if all that logic was contained in a single place, which should be the iterator. The second and more significant downside is that it is impossible to reuse iterators for multiple seeks. Whenever you want to look up a record, you need to re-create the whole infrastructure again, which is quite a waste of time. Furthermore, it is impossible to optimize seeks, such as when seeking the same record multiple times. To address this, we essentially split up the concerns properly such that the parent data structure is responsible for setting up the iterator via a new `init_iter()` callback, whereas the iterator handles seeks via a new `seek()` callback. This will eventually allow us to call `seek()` on the iterator multiple times, where every iterator can potentially optimize for certain cases. Note that at this point in time we are not yet ready to reuse the iterators. This will be left for a future patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/merged: simplify indices for subiterators	Patrick Steinhardt	1	-5/+4
	When seeking on a merged table, we perform the seek for each of the subiterators. If the subiterator has the desired record we add it to the priority queue, otherwise we skip it and don't add it to the stack of subiterators hosted by the merged table. The consequence of this is that the index of the subiterator in the merged table does not necessarily correspond to the index of it in the merged iterator. Next to being potentially confusing, it also means that we won't easily be able to re-seek the merged iterator because we have no clear connection between both of the data structures. Refactor the code so that the index stays the same in both structures. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/merged: split up initialization and seeking of records	Patrick Steinhardt	1	-37/+22
	To initialize a `struct merged_iter`, we need to seek all subiterators to the wanted record and then add their results to the priority queue used to sort the records. This logic is split up across two functions, `merged_table_seek_record()` and `merged_iter_init()`. The scope of these functions is somewhat weird though, where `merged_iter_init()` is only responsible for adding the records of the subiterators to the priority queue. Clarify the scope of those functions such that `merged_iter_init()` is only responsible for initializing the iterator's structure. Performing the subiterator seeks are now part of `merged_table_seek_record()`. This step is required to move seeking of records into the generic `struct reftable_iterator` infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: set up the reader when initializing table iterator	Patrick Steinhardt	1	-17/+22
	All the seeking functions accept a `struct reftable_reader` as input such that they can use the reader to look up the respective blocks. Refactor the code to instead set up the reader as a member of `struct table_iter` during initialization such that we don't have to pass the reader on every single call. This step is required to move seeking of records into the generic `struct reftable_iterator` infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: inline `reader_seek_internal()`	Patrick Steinhardt	1	-22/+12
	We have both `reader_seek()` and `reader_seek_internal()`, where the former function only exists so that we can exit early in case the given table has no records of the sought-after type. Merge these two functions into one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: separate concerns of table iter and reftable reader	Patrick Steinhardt	1	-17/+15
	In "reftable/reader.c" we implement two different interfaces: - The reftable reader contains the logic to read reftables. - The table iterator is used to iterate through a single reftable read by the reader. The way those two types are used in the code is somewhat confusing though because seeking inside a table is implemented as if it was part of the reftable reader, even though it is ultimately more of a detail implemented by the table iterator. Make the boundary between those two types clearer by renaming functions that seek records in a table such that they clearly belong to the table iterator's logic. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: unify indexed and linear seeking	Patrick Steinhardt	1	-26/+16
	In `reader_seek_internal()` we either end up doing an indexed seek when there is one or a linear seek otherwise. These two code paths are disjunct without a good reason, where the indexed seek will cause us to exit early. Refactor the two code paths such that it becomes possible to share a bit more code between them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/reader: avoid copying index iterator	Patrick Steinhardt	1	-24/+14
	When doing an indexed seek we need to walk down the multi-level index until we finally hit a record of the desired indexed type. This loop performs a copy of the index iterator on every iteration, which is both hard to understand and completely unnecessary. Refactor the code so that we use a single iterator to walk down the indices, only. Note that while this should improve performance, the improvement is negligible in all but the most unreasonable repositories. This is because the effect is only really noticeable when we have to walk down many levels of indices, which is not something that a repository would typically have. So the motivation for this change is really only about readability. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/block: use `size_t` to track restart point index	Patrick Steinhardt	1	-2/+2
	The function `block_reader_restart_offset()` gets the offset of the `i`th restart point. `i` is a signed integer though, which is certainly not the correct type to track indices like this. Furthermore, both callers end up passing a `size_t`. Refactor the code to use a `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	refs/reftable: allow configuring geometric factor	Patrick Steinhardt	2	-0/+15
	Allow configuring the geometric factor used by the auto-compaction algorithm whenever a new table is appended to the stack of tables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable: make the compaction factor configurable	Patrick Steinhardt	5	-7/+21
	When auto-compacting, the reftable library packs references such that the sizes of the tables form a geometric sequence. The factor for this geometric sequence is hardcoded to 2 right now. We're about to expose this as a config option though, so let's expose the factor via write options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	refs/reftable: allow disabling writing the object index	Patrick Steinhardt	3	-0/+77
	Besides the expected "ref" and "log" records, the reftable library also writes "obj" records. These are basically a reverse mapping of object IDs to their respective ref records so that it becomes efficient to figure out which references point to a specific object. The motivation for this data structure is the "uploadpack.allowTipSHA1InWant" config, which allows a client to fetch any object by its hash that has a ref pointing to it. This reverse index is not used by Git at all though, and the expectation is that most hosters nowadays use "uploadpack.allowAnySHA1InWant". It may thus be preferable for many users to disable writing these optional object indices altogether to safe some precious disk space. Add a new config "reftable.indexObjects" that allows the user to disable the object index altogether. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	refs/reftable: allow configuring restart interval	Patrick Steinhardt	3	-0/+66
	Add a new option `reftable.restartInterval` that allows the user to control the restart interval when writing reftable records used by the reftable library. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable: use `uint16_t` to track restart interval	Patrick Steinhardt	2	-2/+2
	The restart interval can at most be `UINT16_MAX` as specified in the technical documentation of the reftable format. Furthermore, it cannot ever be negative. Regardless of that we use an `int` to track the restart interval. Change the type to use an `uint16_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	refs/reftable: allow configuring block size	Patrick Steinhardt	4	-1/+118
	Add a new option `reftable.blockSize` that allows the user to control the block size used by the reftable library. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/dump: support dumping a table's block structure	Patrick Steinhardt	4	-1/+174
	We're about to introduce new configs that will allow users to have more control over how exactly reftables are written. To verify that these configs are effective we will need to take a peak into the actual blocks written by the reftable backend. Introduce a new mode to the dumping logic that prints out the block structure. This logic can be invoked via `test-tool dump-reftables -b`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/writer: improve error when passed an invalid block size	Patrick Steinhardt	1	-4/+2
	The reftable format only supports block sizes up to 16MB. When the writer is being passed a value bigger than that it simply calls abort(3P), which isn't all that helpful due to the lack of a proper error message. Improve this by calling `BUG()` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable/writer: drop static variable used to initialize strbuf	Patrick Steinhardt	1	-3/+1
	We have a static variable in the reftable writer code that is merely used to initialize the `last_key` of the writer. Convert the code to instead use `strbuf_init()` and drop the variable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable: pass opts as constant pointer	Patrick Steinhardt	7	-38/+46
	We sometimes pass the refatble write options as value and sometimes as a pointer. This is quite confusing and makes the reader wonder whether the options get modified sometimes. In fact, `reftable_new_writer()` does cause the caller-provided options to get updated when some values aren't set up. This is quite unexpected, but didn't cause any harm until now. Adapt the code so that we do not modify the caller-provided values anymore. While at it, refactor the code to code to consistently pass the options as a constant pointer to clarify that the caller-provided opts will not ever get modified. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	reftable: consistently refer to `reftable_write_options` as `opts`	Patrick Steinhardt	5	-89/+74
	Throughout the reftable library the `reftable_write_options` are sometimes referred to as `cfg` and sometimes as `opts`. Unify these to consistently use `opts` to avoid confusion. While at it, touch up the coding style a bit by removing unneeded braces around one-line statements and newlines between variable declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	documentation: git-update-index: add --show-index-version to synopsis	Dov Murik	1	-0/+1
	In 606e088d5d (update-index: add --show-index-version, 2023-09-12), we added the new '--show-index-version' option to 'git-update-index' and documented it, but forgot to add it to the synopsis section. Add '--show-index-version' to the synopsis of 'git-update-index'. Signed-off-by: Dov Murik <dov.murik@linux.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	fetch-pack: remove unused 'struct loose_object_iter'	Dr. David Alan Gilbert	1	-5/+0
	'struct loose_object_iter' in fetch-pack.c is unused since commit 97b2fa08 (fetch-pack: drop custom loose object cache, 2018-11-12). Remove it. Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	Merge branch 'ps/undecided-is-not-necessarily-sha1' into ↵	Junio C Hamano	17	-92/+168
	jc/undecided-is-not-necessarily-sha1-fix * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes
2024-05-13	The third batch	Junio C Hamano	1	-0/+25
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13	Merge branch 'jc/git-gui-maintainer-update'	Junio C Hamano	2	-3/+3
	* jc/git-gui-maintainer-update: SubmittingPatches: welcome the new maintainer of git-gui part
2024-05-13	Merge branch 'fa/p4-error'	Junio C Hamano	1	-11/+13
	P4 update. * fa/p4-error: git-p4: show Perforce error to the user
2024-05-13	Merge branch 'ps/ci-fuzzers-at-gitlab-fix'	Junio C Hamano	1	-0/+9
	CI fix. * ps/ci-fuzzers-at-gitlab-fix: gitlab-ci: fix installing dependencies for fuzz smoke tests gitlab-ci: add smoke test for fuzzers
2024-05-13	Merge branch 'jk/ci-test-with-jgit-fix'	Junio C Hamano	1	-1/+1
	CI fix. * jk/ci-test-with-jgit-fix: ci: update coverity runs_on_pool reference
2024-05-13	Merge branch 'jk/ci-macos-gcc13-fix'	Junio C Hamano	2	-4/+1
	CI fix. * jk/ci-macos-gcc13-fix: ci: stop installing "gcc-13" for osx-gcc ci: avoid bare "gcc" for osx-gcc job ci: drop mention of BREW_INSTALL_PACKAGES variable
2024-05-13	Merge branch 'jc/no-default-attr-tree-in-bare'	Junio C Hamano	3	-10/+10
	Git 2.43 started using the tree of HEAD as the source of attributes in a bare repository, which has severe performance implications. For now, revert the change, without ripping out a more explicit support for the attr.tree configuration variable. * jc/no-default-attr-tree-in-bare: stop using HEAD for attributes in bare repository by default
2024-05-13	Merge branch 'ps/ci-python-2-deprecation'	Junio C Hamano	1	-2/+6
	Unbreak CI jobs so that we do not attempt to use Python 2 that has been removed from the platform. * ps/ci-python-2-deprecation: ci: fix Python dependency on Ubuntu 24.04
2024-05-13	Merge branch 'tb/attr-limits'	Junio C Hamano	2	-10/+19
	The maximum size of attribute files is enforced more consistently. * tb/attr-limits: attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf()
2024-05-13	Merge branch 'jc/test-workaround-broken-mv'	Junio C Hamano	1	-1/+2
	Tests that try to corrupt in-repository files in chunked format did not work well on macOS due to its broken "mv", which has been worked around. * jc/test-workaround-broken-mv: t/lib-chunk: work around broken "mv" on some vintage of macOS
2024-05-13	Merge branch 'ma/win32-unix-domain-socket'	Junio C Hamano	1	-0/+2
	Build fix. * ma/win32-unix-domain-socket: win32: fix building with NO_UNIX_SOCKETS
2024-05-13	compat/regex: fix argument order to calloc(3)	Junio C Hamano	3	-13/+13
	Windows compiler suddenly started complaining that calloc(3) takes its arguments in <nmemb, size> order. Indeed, there are many calls that has their arguments in a _wrong_ order. Fix them all. A sample breakage can be seen at https://github.com/git/git/actions/runs/9046793153/job/24857988702#step:4:272 Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-11	SubmittingPatches: welcome the new maintainer of git-gui part	Junio C Hamano	2	-3/+3
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-10	Merge branch 'ps/config-subcommands' into ps/builtin-config-cleanup	Junio C Hamano	6	-370/+812
	* ps/config-subcommands: builtin/config: display subcommand help builtin/config: introduce "edit" subcommand builtin/config: introduce "remove-section" subcommand builtin/config: introduce "rename-section" subcommand builtin/config: introduce "unset" subcommand builtin/config: introduce "set" subcommand builtin/config: introduce "get" subcommand builtin/config: introduce "list" subcommand builtin/config: pull out function to handle `--null` builtin/config: pull out function to handle config location builtin/config: use `OPT_CMDMODE()` to specify modes builtin/config: move "fixed-value" option to correct group builtin/config: move option array around config: clarify memory ownership when preparing comment strings
2024-05-10	SubmittingPatches: extend the "flow" section	Junio C Hamano	1	-42/+61
	Explain a full lifecycle of a patch series upfront, so that it is clear when key decisions to "accept" a series is made and how a new patch series becomes a part of a new release. Fold the "you need to monitor the progress of your topic" section into the primary "patch lifecycle" section, as that is one of the things the patch submitter is responsible for. It is not like "I sent a patch and responded to review messages, and now it is their problem". They need to see their patch through the patch life cycle. Earlier versions of this document outlined a slightly different patch flow in an idealized world, where the original submitter gathered agreements from the participants of the discussion and sent the final "we all agreed that this is the good version--please apply" patches to the maintainer. In practice, this almost never happened. Instead, describe what flow was used in practice for the past decade that worked well for us. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-10	SubmittingPatches: move the patch-flow section earlier	Junio C Hamano	1	-49/+49
	Before discussing the small details of how the patch gets sent, we'd want to give people a larger picture first to set the expectation straight. The existing patch-flow section covers materials that are suitable for that purpose, so move it to the beginning of the document. We'll update the contents of the section to clarify what goal the patch submitter is working towards in the next step, which will make it easier to understand the reason behind the individual rules presented in latter parts of the document. This step only moves two sections (patch-flow and patch-status) without changing their contents, except that their section levels are demoted from Level 1 to Level 2 to fit better in the document structure at their new place. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	ci: stop installing "gcc-13" for osx-gcc	Jeff King	1	-1/+0
	Our osx-gcc job explicitly asks to install gcc-13. But since the GitHub runner image already comes with gcc-13 installed, this is mostly doing nothing (or in some cases it may install an incremental update over the runner image). But worse, it recently started causing errors like: ==> Fetching gcc@13 ==> Downloading https://ghcr.io/v2/homebrew/core/gcc/13/blobs/sha256:fb2403d97e2ce67eb441b54557cfb61980830f3ba26d4c5a1fe5ecd0c9730d1a ==> Pouring gcc@13--13.2.0.ventura.bottle.tar.gz Error: The `brew link` step did not complete successfully The formula built, but is not symlinked into /usr/local Could not symlink bin/c++-13 Target /usr/local/bin/c++-13 is a symlink belonging to gcc. You can unlink it: brew unlink gcc which cause the whole CI job to bail. I didn't track down the root cause, but I suspect it may be related to homebrew recently switching the "gcc" default to gcc-14. And it may even be fixed when a new runner image is released. But if we don't need to run brew at all, it's one less thing for us to worry about. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	ci: avoid bare "gcc" for osx-gcc job	Jeff King	1	-1/+1
	On macOS, a bare "gcc" (without a version) will invoke a wrapper for clang, not actual gcc. Even when gcc is installed via homebrew, that only provides version-specific links in /usr/local/bin (like "gcc-13"), and never a version-agnostic "gcc" wrapper. As far as I can tell, this has been the case for a long time, and this osx-gcc job has largely been doing nothing. We can point it at "gcc-13", which will pick up the homebrew-installed version. The fix here is specific to the github workflow file, as the gitlab one does not have a matching job. It's a little unfortunate that we cannot just ask for the latest version of gcc which homebrew provides, but as far as I can tell there is no easy alias (you'd have to find the highest number gcc-* in /usr/local/bin yourself). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	ci: drop mention of BREW_INSTALL_PACKAGES variable	Jeff King	1	-2/+0
	The last user of this variable went away in 4a6e4b9602 (CI: remove Travis CI support, 2021-11-23), so it's doing nothing except making it more confusing to find out which packages _are_ installed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	ci: update coverity runs_on_pool reference	Jeff King	1	-1/+1
	Commit 2d65e5b6a6 (ci: rename "runs_on_pool" to "distro", 2024-04-12) renamed this variable for the main CI workflow, as well as in the ci/ scripts. Because the coverity workflow also relies on those scripts to install dependencies, it needs to be updated, too. Without this patch, the coverity build fails because we lack libcurl. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	gitlab-ci: fix installing dependencies for fuzz smoke tests	Patrick Steinhardt	1	-1/+1
	There was a semantic merge conflict between 9cdeb34b96 (ci: merge scripts which install dependencies, 2024-04-12), which has merged "ci/install-docker-dependencies.sh" into "ci/install-dependencies.sh" and c7b228e000 (gitlab-ci: add smoke test for fuzzers, 2024-04-29), which has added a new fuzz smoke test job that makes use of the now-removed script. Adapt the job to instead use the new script to install dependencies. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-09	Merge branch 'ps/ci-python-2-deprecation' into ps/ci-fuzzers-at-gitlab-fix	Junio C Hamano	1	-2/+6
	* ps/ci-python-2-deprecation: ci: fix Python dependency on Ubuntu 24.04
2024-05-09	Merge branch 'ps/ci-enable-minimal-fuzzers-at-gitlab' into ↵	Junio C Hamano	1	-0/+9
	ps/ci-fuzzers-at-gitlab-fix * ps/ci-enable-minimal-fuzzers-at-gitlab: gitlab-ci: add smoke test for fuzzers
2024-05-08	git-p4: show Perforce error to the user	Fahad Alrashed	1	-11/+13
	During "git p4 clone" if p4 process returns an error from the server, it will store the message in the 'err' variable. Then it will send a text command "die-now" to git-fast-import. However, git-fast-import raises an exception: "fatal: Unsupported command: die-now" and err is never displayed. This patch ensures that err is shown to the end user. Signed-off-by: Fahad Alrashed <fahad@keylock.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-08	The second batch	Junio C Hamano	1	-1/+38
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-08	Merge branch 'bb/rgb-12-bit-colors'	Junio C Hamano	4	-12/+41
	The color parsing code learned to handle 12-bit RGB colors, spelled as "#RGB" (in addition to "#RRGGBB" that is already supported). * bb/rgb-12-bit-colors: color: add support for 12-bit RGB colors t/t4026-color: add test coverage for invalid RGB colors t/t4026-color: remove an extra double quote character
2024-05-08	Merge branch 'rs/diff-parseopts-cleanup'	Junio C Hamano	1	-1/+0
	Code clean-up to remove code that is now a noop. * rs/diff-parseopts-cleanup: diff-lib: stop calling diff_setup_done() in do_diff_cache()
2024-05-08	Merge branch 'dk/zsh-git-repo-path-fix'	Junio C Hamano	1	-0/+1
	Command line completion support for zsh (in contrib/) has been updated to stop exposing internal state to end-user shell interaction. * dk/zsh-git-repo-path-fix: completion: zsh: stop leaking local cache variable
2024-05-08	Merge branch 'bc/zsh-compatibility'	Junio C Hamano	2	-9/+10
	zsh can pretend to be a normal shell pretty well except for some glitches that we tickle in some of our scripts. Work them around so that "vimdiff" and our test suite works well enough with it. * bc/zsh-compatibility: vimdiff: make script and tests work with zsh t4046: avoid continue in &&-chain for zsh
2024-05-08	Merge branch 'rj/add-p-typo-reaction'	Junio C Hamano	2	-15/+31
	When the user responds to a prompt given by "git add -p" with an unsupported command, list of available commands were given, which was too much if the user knew what they wanted to type but merely made a typo. Now the user gets a much shorter error message. * rj/add-p-typo-reaction: add-patch: response to unknown command add-patch: do not show UI messages on stderr
2024-05-08	Merge branch 'jt/doc-submitting-rerolled-series'	Junio C Hamano	1	-4/+6
	Developer doc update. * jt/doc-submitting-rerolled-series: doc: clarify practices for submitting updated patch versions
2024-05-08	Merge branch 'rh/complete-symbolic-ref'	Junio C Hamano	2	-3/+51
	Command line completion script (in contrib/) learned to complete "git symbolic-ref" a bit better (you need to enable plumbing commands to be completed with GIT_COMPLETION_SHOW_ALL_COMMANDS). * rh/complete-symbolic-ref: completion: add docs on how to add subcommand completions completion: improve docs for using __git_complete completion: add 'symbolic-ref'
2024-05-08	Merge branch 'ps/the-index-is-no-more'	Junio C Hamano	41	-455/+435
	The singleton index_state instance "the_index" has been eliminated by always instantiating "the_repository" and replacing references to "the_index" with references to its .index member. * ps/the-index-is-no-more: repository: drop `initialize_the_repository()` repository: drop `the_index` variable builtin/clone: stop using `the_index` repository: initialize index in `repo_init()` builtin: stop using `the_index` t/helper: stop using `the_index`
2024-05-08	Merge branch 'bc/credential-scheme-enhancement'	Junio C Hamano	16	-120/+1025
	The credential helper protocol, together with the HTTP layer, have been enhanced to support authentication schemes different from username & password pair, like Bearer and NTLM. * bc/credential-scheme-enhancement: credential: add method for querying capabilities credential-cache: implement authtype capability t: add credential tests for authtype credential: add support for multistage credential rounds t5563: refactor for multi-stage authentication docs: set a limit on credential line length credential: enable state capability credential: add an argument to keep state http: add support for authtype and credential docs: indicate new credential protocol fields credential: add a field called "ephemeral" credential: gate new fields on capability credential: add a field for pre-encoded credentials http: use new headers for each object request remote-curl: reset headers on new request credential: add an authtype field
2024-05-08	Merge branch 'ps/ci-test-with-jgit'	Junio C Hamano	9	-109/+228
	Tests to ensure interoperability between reftable written by jgit and our code have been added and enabled in CI. * ps/ci-test-with-jgit: t0612: add tests to exercise Git/JGit reftable compatibility t0610: fix non-portable variable assignment t06xx: always execute backend-specific tests ci: install JGit dependency ci: make Perforce binaries executable for all users ci: merge scripts which install dependencies ci: fix setup of custom path for GitLab CI ci: merge custom PATH directories ci: convert "install-dependencies.sh" to use "/bin/sh" ci: drop duplicate package installation for "linux-gcc-default" ci: skip sudo when we are already root ci: expose distro name in dockerized GitHub jobs ci: rename "runs_on_pool" to "distro"
2024-05-08	Merge branch 'ps/reftable-write-optim'	Junio C Hamano	16	-556/+230
	Code to write out reftable has seen some optimization and simplification. * ps/reftable-write-optim: reftable/block: reuse compressed array reftable/block: reuse zstream when writing log blocks reftable/writer: reset `last_key` instead of releasing it reftable/writer: unify releasing memory reftable/writer: refactorings for `writer_flush_nonempty_block()` reftable/writer: refactorings for `writer_add_record()` refs/reftable: don't recompute committer ident reftable: remove name checks refs/reftable: skip duplicate name checks refs/reftable: perform explicit D/F check when writing symrefs refs/reftable: fix D/F conflict error message on ref copy
2024-05-07	scalar: avoid segfault in reconfigure --all	Derrick Stolee	2	-3/+45
	During the latest v2.45.0 update, 'scalar reconfigure --all' started to segfault on my machine. Breaking it down via the debugger, it was faulting on a NULL reference to the_hash_algo, which is a macro pointing to the_repository->hash_algo. In my case, this is due to one of my repositories having a detached HEAD, which requires get_oid_hex() to parse that the HEAD reference is valid. Another way to cause a failure is to use the "includeIf.onbranch" config key, which will lead to a BUG() statement. My first inclination was to try to refactor cmd_reconfigure() to execute 'git for-each-repo' instead of this loop. In addition to the difficulty of executing 'scalar reconfigure' within 'git for-each-repo', it would be difficult to perform the clean-up logic for non-existent repos if we relied on that child process. Instead, I chose to move the temporary repo to be within the loop and reinstate the_repository to its old value after we are done performing logic on the current array item. Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with multiple registered repos. There are two different ways that the old use of the_repository could trigger bugs. These issues are being solved independently to be more careful about the_repository being uninitialized, but the change in this patch around the use of the_repository is still a good safety precaution. Co-authored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	t0018: two small fixes	Junio C Hamano	1	-19/+22
	Even though the three tests that were recently added started their here-doc with "<<-\EOF", it did not take advantage of that and instead wrote the here-doc payload abut to the left edge. Use a tabs to indent these lines. More importantly, because these all hardcode the expected output, which contains the current branch name, they break the CI job that uses 'main' as the default branch name. Use GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=trunk export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME between the test_description line and ". ./test-lib.sh" line to force the initial branch name to 'trunk' and expect it to show in the output. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	Documentation/git-merge-tree.txt: document -X	Victoria Dye	1	-0/+5
	Add an entry in the 'merge-tree' builtin documentation for -X/--strategy-option (added in 6a4c9e7b32 (merge-tree: add -X strategy option, 2023-09-24)). The same option is documented for 'merge', 'rebase', 'revert', etc. in their respective Documentation/ files, so let's do the same for 'merge-tree'. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: remove functions without ref store	Patrick Steinhardt	2	-268/+209
	The preceding commit has rewritten all callers of ref-related functions to use the equivalents that accept a `struct ref_store`. Consequently, the respective variants without the ref store are now unused. Remove them. There are likely patch series in-flight that use the now-removed functions. To help the authors, the old implementations have been added to "refs.c" in an ifdef'd section as a reference for how to migrate each of the respective callers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	cocci: apply rules to rewrite callers of "refs" interfaces	Patrick Steinhardt	75	-436/+711
	Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	cocci: introduce rules to transform "refs" to pass ref store	Patrick Steinhardt	1	-0/+103
	Most of the functions in "refs.h" have two flavors: one that accepts a `struct ref_store`, and one that figures it out via `the_repository`. As part of the libification efforts we want to get rid of the latter variant and stop relying on `the_repository` altogether. Introduce a set of Coccinelle rules that transform callers of the "refs" interfaces to pass a `struct ref_store`. These rules are not yet applied by this patch so that it can be reviewed standalone more easily. This will be done in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: add `exclude_patterns` parameter to `for_each_fullref_in()`	Patrick Steinhardt	5	-13/+16
	The `for_each_fullref_in()` function is supposedly the ref-store-less equivalent of `refs_for_each_fullref_in()`, but the latter has gained a new parameter `exclude_patterns` over time. Bring these two functions back in sync again by adding the parameter to the former function, as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: introduce missing functions that accept a `struct ref_store`	Patrick Steinhardt	2	-14/+64
	While most of the functions in "refs.h" have a variant that accepts a `struct ref_store`, some don't. Callers of these functions are thus forced to implicitly rely on `the_repository` to figure out the ref store that is to be used. Introduce those missing functions to address this shortcoming. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/tag: add --trailer option	John Passaro	3	-11/+157
	git-tag supports interpreting trailers from an annotated tag message, using --list --format="%(trailers)". However, the available methods to add a trailer to a tag message (namely -F or --editor) are not as ergonomic. In a previous patch, we moved git-commit's implementation of its --trailer option to the trailer.h API. Let's use that new function to teach git-tag the same --trailer option, emulating as much of git-commit's behavior as much as possible. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/commit: refactor --trailer logic	John Passaro	3	-8/+23
	git-commit adds user trailers to the commit message by passing its `--trailer` arguments to a child process running `git-interpret-trailers --in-place`. This logic is broadly useful, not just for git-commit but for other commands constructing message bodies (e.g. git-tag). Let's move this logic from git-commit to a new function in the trailer API, so that it can be re-used in other commands. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	builtin/commit: use ARGV macro to collect trailers	John Passaro	1	-9/+1
	Replace git-commit's callback for --trailer with the standard OPT_PASSTHRU_ARGV macro. The callback only adds its values to a strvec and sanity-checks that `unset` is always false; both of these are already implemented in the parse-option API. Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: remove `create_symref` and associated dead code	Karthik Nayak	5	-172/+0
	In the previous commits, we converted `refs_create_symref()` to utilize transactions to perform symref updates. Earlier `refs_create_symref()` used `create_symref()` to do the same. We can now remove `create_symref()` and any code associated with it which is no longer used. We remove `create_symref()` code from all the reference backends and also remove it entirely from the `ref_storage_be` struct. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: rename `refs_create_symref()` to `refs_update_symref()`	Karthik Nayak	5	-11/+9
	The `refs_create_symref()` function is used to update/create a symref. But it doesn't check the old target of the symref, if existing. It force updates the symref. In this regard, the name `refs_create_symref()` is a bit misleading. So let's rename it to `refs_update_symref()`. This is akin to how 'git-update-ref(1)' also allows us to create apart from update. While we're here, rename the arguments in the function to clarify what they actually signify and reduce confusion. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: use transaction in `refs_create_symref()`	Karthik Nayak	3	-8/+41
	The `refs_create_symref()` function updates a symref to a given new target. To do this, it uses a ref-backend specific function `create_symref()`. In the previous commits, we introduced symref support in transactions. This means we can now use transactions to perform symref updates and don't have to resort to `create_symref()`. Doing this allows us to remove and cleanup `create_symref()`, which we will do in the following commit. Modify the expected error message for a test in 't/t0610-reftable-basics.sh', since the error is now thrown from 'refs.c'. This is because in transactional updates, F/D conflicts are caught before we're in the reference backend. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: add support for transactional symref updates	Karthik Nayak	4	-40/+197
	The reference backends currently support transactional reference updates. While this is exposed to users via 'git-update-ref' and its '--stdin' mode, it is also used internally within various commands. However, we do not support transactional updates of symrefs. This commit adds support for symrefs in both the 'files' and the 'reftable' backend. Here, we add and use `ref_update_has_null_new_value()`, a helper function which is used to check if there is a new_value in a reference update. The new value could either be a symref target `new_target` or a OID `new_oid`. We also add another common function `ref_update_check_old_target` which will be used to check if the update's old_target corresponds to a reference's current target. Now transactional updates (verify, create, delete, update) can be used for: - regular refs - symbolic refs - conversion of regular to symbolic refs and vice versa This also allows us to expose this to users via new commands in 'git-update-ref' in the future. Note that a dangling symref update does not record a new reflog entry, which is unchanged before and after this commit. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: move `original_update_refname` to 'refs.c'	Karthik Nayak	4	-33/+25
	The files backend and the reftable backend implement `original_update_refname` to obtain the original refname of the update. Move it out to 'refs.c' and only expose it internally to the refs library. This will be used in an upcoming commit to also introduce another common functionality for the two backends. We also rename the function to `ref_update_original_update_refname` to keep it consistent with the upcoming other 'ref_update_*' functions that'll be introduced. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: support symrefs in 'reference-transaction' hook	Karthik Nayak	2	-9/+25
	The 'reference-transaction' hook runs whenever a reference update is made to the system. In a previous commit, we added the `old_target` and `new_target` fields to the `reference_transaction_update()`. In following commits we'll also add the code to handle symref's in the reference backends. Support symrefs also in the 'reference-transaction' hook, by modifying the current format: <old-oid> SP <new-oid> SP <ref-name> LF to be be: <old-value> SP <new-value> SP <ref-name> LF where for regular refs the output would not change and remain the same. But when either 'old-value' or 'new-value' is a symref, we print the ref as 'ref:<ref-target>'. This does break backward compatibility, but the 'reference-transaction' hook's documentation always stated that support for symbolic references may be added in the future. We do not add any tests in this commit since there is no git command which activates this flow, in an upcoming commit, we'll start using transaction based symref updates as the default, we'll add tests there for the hook too. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	files-backend: extract out `create_symref_lock()`	Karthik Nayak	1	-14/+37
	The function `create_symref_locked()` creates a symref by creating a '<symref>.lock' file and then committing the symref lock, which creates the final symref. Extract the early half of `create_symref_locked()` into a new helper function `create_symref_lock()`. Because the name of the new function is too similar to the original, rename the original to `create_and_commit_symref()` to avoid confusion. The new function `create_symref_locked()` can be used to create the symref lock in a separate step from that of committing it. This allows to add transactional support for symrefs, where the lock would be created in the preparation step and the lock would be committed in the finish step. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07	refs: accept symref values in `ref_transaction_update()`	Karthik Nayak	14	-24/+71
	The function `ref_transaction_update()` obtains ref information and flags to create a `ref_update` and add them to the transaction at hand. To extend symref support in transactions, we need to also accept the old and new ref targets and process it. This commit adds the required parameters to the function and modifies all call sites. The two parameters added are `new_target` and `old_target`. The `new_target` is used to denote what the reference should point to when the transaction is applied. Some functions allow this parameter to be NULL, meaning that the reference is not changed. The `old_target` denotes the value the reference must have before the update. Some functions allow this parameter to be NULL, meaning that the old value of the reference is not checked. We also update the internal function `ref_transaction_add_update()` similarly to take the two new parameters. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	repository: stop setting SHA1 as the default object hash	Patrick Steinhardt	1	-20/+0
	During the startup of Git, we call `initialize_the_repository()` to set up `the_repository` as well as `the_index`. Part of this setup is also to set the default object hash of the repository to SHA1. This has the effect that `the_hash_algo` is getting initialized to SHA1, as well. This default hash algorithm eventually gets overridden by most Git commands via `setup_git_directory()`, which also detects the actual hash algorithm used by the repository. There are some commands though that don't access a repository at all, or at a later point only, and thus retain the default hash function for some amount of time. As some of the the preceding commits demonstrate, this can lead to subtle issues when we access `the_hash_algo` when no repository has been set up. Address this issue by dropping the set up of the default hash algorithm completely. The effect of this is that `the_hash_algo` will map to a `NULL` pointer and thus cause Git to crash when something tries to access the hash algorithm without it being properly initialized. It thus forces all Git commands to explicitly set up the hash algorithm in case there is no repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	oss-fuzz/commit-graph: set up hash algorithm	Patrick Steinhardt	1	-0/+1
	Our fuzzing setups don't work in a proper repository, but only use the in-memory configured `the_repository`. Consequently, we never go through the full repository setup procedures and thus do not set up the hash algo used by the repository. The commit-graph fuzzer does rely on a properly initialized hash algo though. Initialize it explicitly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/shortlog: don't set up revisions without repo	Patrick Steinhardt	1	-1/+1
	It is possible to run git-shortlog(1) outside of a repository by passing it output from git-log(1) via standard input. Obviously, as there is no repository in that context, it is thus unsupported to pass any revisions as arguments. Regardless of that we still end up calling `setup_revisions()`. While that works alright, it is somewhat strange. Furthermore, this is about to cause problems when we unset the default object hash. Refactor the code to only call `setup_revisions()` when we have a repository. This is safe to do as we already verify that there are no arguments when running outside of a repository anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/diff: explicitly set hash algo when there is no repo	Patrick Steinhardt	1	-0/+9
	The git-diff(1) command can be used outside repositories to diff two files with each other. But even if there is no repository we will end up hashing the files that we are diffing so that we can print the "index" line: ``` diff --git a/a b/b index 7898192..6178079 100644 --- a/a +++ b/b @@ -1 +1 @@ -a +b ``` We implicitly use SHA1 to calculate the hash here, which is because `the_repository` gets initialized with SHA1 during the startup routine. We are about to stop doing this though such that `the_repository` only ever has a hash function when it was properly initialized via a repo's configuration. To give full control to our users, we would ideally add a new switch to git-diff(1) that allows them to specify the hash function when executed outside of a repository. But for now, we only convert the code to make this explicit such that we can stop setting the default hash algorithm for `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/bundle: abort "verify" early when there is no repository	Patrick Steinhardt	1	-0/+5
	Verifying a bundle requires us to have a repository. This is encoded in `verify_bundle()`, which will return an error if there is no repository. We call `open_bundle()` before we call `verify_bundle()` though, which already performs some verifications even though we may ultimately abort due to a missing repository. This is problematic because `open_bundle()` already reads the bundle header and verifies that it contains a properly formatted hash. When there is no repository we have no clue what hash function to expect though, so we always end up assuming SHA1 here, which may or may not be correct. Furthermore, we are about to stop initializing `the_hash_algo` when there is no repository, which will lead to segfaults. Check early on whether we have a repository to fix this issue. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/blame: don't access potentially unitialized `the_hash_algo`	Patrick Steinhardt	1	-3/+2
	We access `the_hash_algo` in git-blame(1) before we have executed `parse_options_start()`, which may not be properly set up in case we have no repository. This is fine for most of the part because all the call paths that lead to it (git-blame(1), git-annotate(1) as well as git-pick-axe(1)) specify `RUN_SETUP` and thus require a repository. There is one exception though, namely when passing `-h` to print the help. Here we will access `the_hash_algo` even if there is no repo. This works fine right now because `the_hash_algo` gets sets up to point to the SHA1 algorithm via `initialize_repository()`. But we're about to stop doing this, and thus the code would lead to a `NULL` pointer exception. Prepare the code for this and only access `the_hash_algo` after we are sure that there is a proper repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/rev-parse: allow shortening to more than 40 hex characters	Patrick Steinhardt	2	-3/+8
	The `--short=` option for git-rev-parse(1) allows the user to specify to how many characters object IDs should be shortened to. The option is broken though for SHA256 repositories because we set the maximum allowed hash size to `the_hash_algo->hexsz` before we have even set up the repo. Consequently, `the_hash_algo` will always be SHA1 and thus we truncate every hash after at most 40 characters. Fix this by accessing `the_hash_algo` only after we have set up the repo. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	remote-curl: fix parsing of detached SHA256 heads	Patrick Steinhardt	2	-1/+33
	The dumb HTTP transport tries to read the remote HEAD reference by downloading the "HEAD" file and then parsing it via `http_fetch_ref()`. This function will either parse the file as an object ID in case it is exactly `the_hash_algo->hexsz` long, or otherwise it will check whether the reference starts with "ref :" and parse it as a symbolic ref. This is broken when parsing detached HEADs of a remote SHA256 repository because we never update `the_hash_algo` to the discovered remote object hash. Consequently, `the_hash_algo` will always be the fallback SHA1 hash algorithm, which will cause us to fail parsing HEAD altogteher when it contains a SHA256 object ID. Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`. While at it, let's make the expected SHA1 fallback explicit in our code, which also addresses an upcoming issue where we are going to remove the SHA1 fallback for `the_hash_algo`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	attr: fix BUG() when parsing attrs outside of repo	Patrick Steinhardt	2	-0/+21
	If either the `--attr-source` option or the `GIT_ATTR_SOURCE` envvar are set, then `compute_default_attr_source()` will try to look up the value as a treeish. It is possible to hit that function while outside of a Git repository though, for example when using `git grep --no-index`. In that case, Git will hit a bug because we try to look up the main ref store outside of a repository. Handle the case gracefully and detect when we try to look up an attr source without a repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	attr: don't recompute default attribute source	Patrick Steinhardt	1	-9/+16
	The `default_attr_source()` function lazily computes the attr source supposedly once, only. This is done via a static variable `attr_source` that contains the resolved object ID of the attr source's tree. If the variable is the null object ID then we try to look up the attr source, otherwise we skip over it. This approach is flawed though: the variable will never be set to anything else but the null object ID in case there is no attr source. Consequently, we re-compute the information on every call. And in the worst case, when we silently ignore bad trees, this will cause us to try and look up the treeish every single time. Improve this by introducing a separate variable `has_attr_source` to track whether we already computed the attr source and, if so, whether we have an attr source or not. This also allows us to convert the `ignore_bad_attr_tree` to not be static anymore as the code will only be executed once anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	parse-options-cb: only abbreviate hashes when hash algo is known	Patrick Steinhardt	2	-1/+19
	The `OPT__ABBREV()` option can be used to add an option that abbreviates object IDs. When given a length longer than `the_hash_algo->hexsz`, then it will instead set the length to that maximum length. It may not always be guaranteed that we have `the_hash_algo` initialized properly as the hash algorithm can only be set up after we have set up `the_repository`. In that case, the hash would always be truncated to the hex length of SHA1, which may not be what the user desires. In practice it's not a problem as all commands that use `OPT__ABBREV()` also have `RUN_SETUP` set and thus cannot work without a repository. Consequently, both `the_repository` and `the_hash_algo` would be properly set up. Regardless of that, harden the code to not truncate the length when we didn't set up a repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	path: move `validate_headref()` to its only user	Patrick Steinhardt	3	-54/+53
	While `validate_headref()` is only called from `is_git_directory()` in "setup.c", it is currently implemented in "path.c". Move it over such that it becomes clear that it is only really used during setup in order to discover repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	path: harden validation of HEAD with non-standard hashes	Patrick Steinhardt	1	-1/+1
	The `validate_headref()` function takes a path to a supposed "HEAD" file and checks whether its format is something that we understand. It is used as part of our repository discovery to check whether a specific directory is a Git directory or not. Part of the validation is a check for a detached HEAD that contains a plain object ID. To do this validation we use `get_oid_hex()`, which relies on `the_hash_algo`. At this point in time the hash algo cannot yet be initialized though because we didn't yet read the Git config. Consequently, it will always be the SHA1 hash algorithm. In practice this works alright because `get_oid_hex()` only ends up checking whether the prefix of the buffer is a valid object ID. And because SHA1 is shorter than SHA256, the function will successfully parse SHA256 object IDs, as well. It is somewhat fragile though and not really the intent to only check for SHA1. With this in mind, harden the code to use `get_oid_hex_any()` to check whether the "HEAD" file parses as any known hash. One might be hard pressed to tighten the check even further and fully validate the file contents, not only the prefix. In practice though that wouldn't make a lot of sense as it could be that the repository uses a hash function that produces longer hashes than SHA256, but which the current version of Git doesn't understand yet. We'd still want to detect the repository as proper Git repository in that case, and we will fail eventually with a proper error message that the hash isn't understood when trying to set up the repository format. It follows that we could just leave the current code intact, as in practice the code change doesn't have any user visible impact. But it also prepares us for `the_hash_algo` being unset when there is no repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1	Junio C Hamano	41	-455/+435
	* ps/the-index-is-no-more: repository: drop `initialize_the_repository()` repository: drop `the_index` variable builtin/clone: stop using `the_index` repository: initialize index in `repo_init()` builtin: stop using `the_index` t/helper: stop using `the_index`
2024-05-06	Merge branch 'jc/no-default-attr-tree-in-bare' into ↵	Junio C Hamano	3	-10/+10
	ps/undecided-is-not-necessarily-sha1 * jc/no-default-attr-tree-in-bare: stop using HEAD for attributes in bare repository by default
2024-05-06	cmake: let `test-tool` run the unit tests, too	Johannes Schindelin	1	-1/+2
	The `test-tool` recently learned to run the unit tests. To this end, it needs to link with `test-lib.c`, which was done in the `Makefile`, and this patch does it in the CMake definition, too. This is a companion of 44400f58407e (t0080: turn t-basic unit test into a helper, 2024-02-02). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	ci: use test-tool as unit test runner on Windows	Josh Steadmon	1	-1/+1
	Although the previous commit changed t/Makefile to run unit tests alongside shell tests, the Windows CI still needs a separate unit-tests step due to how the test sharding works. We want to avoid using `prove` as a test running on Windows due to performance issues [1], so use the new test-tool runner instead. [1] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	t/Makefile: run unit tests alongside shell tests	Jeff King	3	-3/+19
	Add a wrapper script to allow `prove` to run both shell tests and unit tests from a single invocation. This avoids issues around running prove twice in CI, as discussed in [1]. Additionally, this moves the unit tests into the main dev workflow, so that errors can be spotted more quickly. Accordingly, we remove the separate unit tests step for Linux CI. (We leave the Windows CI unit-test step as-is, because the sharding scheme there involves selecting specific test files rather than running `make test`.) [1] https://lore.kernel.org/git/pull.1613.git.1699894837844.gitgitgadget@gmail.com/ Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	unit tests: add rule for running with test-tool	Josh Steadmon	2	-2/+10
	In the previous commit, we added support in test-tool for running collections of unit tests. Now, add rules in t/Makefile for running in this way. This new rule can be executed from the top-level Makefile via `make DEFAULT_UNIT_TEST_TARGET=unit-tests-test-tool unit-tests`, or by setting DEFAULT_UNIT_TEST_TARGET in config.mak. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	test-tool run-command testsuite: support unit tests	Josh Steadmon	1	-3/+14
	Teach the testsuite runner in `test-tool run-command testsuite` how to run unit tests: if TEST_SHELL_PATH is not set, run the programs directly from CWD, rather than defaulting to "sh" as an interpreter. With this change, you can now use test-tool to run the unit tests: $ make $ cd t/unit-tests/bin $ ../../helper/test-tool run-command testsuite This should be helpful on Windows to allow running tests without requiring Perl (for `prove`), as discussed in [1] and [2]. This again breaks backwards compatibility, as it is now required to set TEST_SHELL_PATH properly for executing shell scripts, but again, as noted in [2], there are no longer any such invocations in our codebase. [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/ [2] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	test-tool run-command testsuite: remove hardcoded filter	Josh Steadmon	1	-3/+1
	`test-tool run-command testsuite` currently assumes that it will only be running the shell test suite, and therefore filters out anything that does not match a hardcoded pattern of "t[0-9][0-9][0-9][0-9]-*.sh". Later in this series, we'll adapt `test-tool run-command testsuite` to also support unit tests, which do not follow the same naming conventions as the shell tests, so this hardcoded pattern is inconvenient. Since `testsuite` also allows specifying patterns on the command-line, let's just remove this pattern. As noted in [1], there are no longer any uses of `testsuite` in our codebase, it should be OK to break backwards compatibility in this case. We also add a new filter to avoid trying to execute "." and "..", so that users who wish to execute every test in a directory can do so without specifying a pattern. [1] https://lore.kernel.org/git/850ea42c-f103-68d5-896b-9120e2628686@gmx.de/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	test-tool run-command testsuite: get shell from env	Josh Steadmon	1	-1/+8
	When running tests through `test-tool run-command testsuite`, we currently hardcode `sh` as the command interpreter. As discussed in [1], this is incorrect, and we should be using the shell set in TEST_SHELL_PATH instead. Add a shell_path field in struct testsuite so that we can pass this to the task runner callback. If this is non-null, we'll use it as the argv[0] of the subprocess. Otherwise, we'll just execute the test program directly. We will use this feature in a later commit to enable running binary executable unit tests. However, for now when setting up the struct testsuite in testsuite(), use the value of TEST_SHELL_PATH if it's set, otherwise keep the original behavior by defaulting to `sh`. [1] https://lore.kernel.org/git/20240123005913.GB835964@coredump.intra.peff.net/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	t0080: turn t-basic unit test into a helper	Josh Steadmon	6	-17/+20
	While t/unit-tests/t-basic.c uses the unit-test framework added in e137fe3b29 (unit tests: add TAP unit test framework, 2023-11-09), it is not a true unit test in that it intentionally fails in order to exercise various codepaths in the unit-test framework. Thus, we intentionally exclude it when running unit tests through the various t/Makefile targets. Instead, it is executed by t0080-unit-test-output.sh, which verifies its output follows the TAP format expected for the various pass, skip, or fail cases. As such, it makes more sense for t-basic to be a helper item for t0080-unit-test-output.sh, so let's move it to t/helper/test-example-tap.c and adjust Makefiles as necessary. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	ci: fix Python dependency on Ubuntu 24.04	Patrick Steinhardt	1	-2/+6
	Newer versions of Ubuntu have dropped Python 2 starting with Ubuntu 23.04. By default though, our CI setups will try to use that Python version on all Ubuntu-based jobs except for the "linux-gcc" one. We didn't notice this issue due to two reasons: - The "ubuntu:latest" tag always points to the latest LTS release. Until a few weeks ago this was Ubuntu 22.04, which still had Python 2. - Our Docker-based CI jobs had their own script to install dependencies until 9cdeb34b96 (ci: merge scripts which install dependencies, 2024-04-12), where we didn't even try to install Python at all for many of them. Since the CI refactorings have originally been implemented, Ubuntu 24.04 was released, and it being an LTS versions means that the "latest" tag now points to that Python-2-less version. Consequently, those jobs that use "ubuntu:latest" broke. Address this by using Python 2 on Ubuntu 20.04, only, whereas we use Python 3 on all other Ubuntu jobs. Eventually, we should think about dropping support for Python 2 completely. Reported-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	Documentation: Mention that refspecs are explained elsewhere	Øystein Walle	1	-1/+2
	The syntax for refspecs are explained in more detail in documention for git-fetch and git-push. Give a hint to the user too look there more fore information Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	format-patch: run range-diff with larger creation-factor	Junio C Hamano	3	-1/+12
	We see too often that a range-diff added to format-patch output shows too many "unmatched" patches. This is because the default value for creation-factor is set to a relatively low value. It may be justified for other uses (like you have a yet-to-be-sent new iteration of your series, and compare it against the 'seen' branch that has an older iteration, probably with the '--left-only' option, to pick out only your patches while ignoring the others) of "range-diff" command, but when the command is run as part of the format-patch, the user _knows_ and expects that the patches in the old and the new iterations roughly correspond to each other, so we can and should use a much higher default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	gitlab-ci: add smoke test for fuzzers	Patrick Steinhardt	1	-0/+9
	Our GitLab CI setup has a test gap where the fuzzers aren't exercised at all. Add a smoke test, similar to the one we have in GitHub Workflows. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: display subcommand help	Patrick Steinhardt	2	-3/+3
	Until now, `git config -h` would have printed help for the old-style syntax. Now that all modes have proper subcommands though it is preferable to instead display the subcommand help. Drop the `NO_INTERNAL_HELP` flag to do so. While at it, drop the help mismatch in t0450 and add the `--get-colorbool` option to the usage such that git-config(1)'s synopsis and `git config -h` match. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "edit" subcommand	Patrick Steinhardt	3	-36/+68
	Introduce a new "edit" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "remove-section" subcommand	Patrick Steinhardt	3	-6/+41
	Introduce a new "remove-section" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "rename-section" subcommand	Patrick Steinhardt	3	-15/+50
	Introduce a new "rename-section" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "unset" subcommand	Patrick Steinhardt	3	-26/+84
	Introduce a new "unset" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "set" subcommand	Patrick Steinhardt	3	-63/+140
	Introduce a new "set" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "get" subcommand	Patrick Steinhardt	3	-99/+194
	Introduce a new "get" subcommand to git-config(1). Please refer to preceding commits regarding the motivation behind this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: introduce "list" subcommand	Patrick Steinhardt	3	-64/+162
	While git-config(1) has several modes, those modes are not exposed with subcommands but instead by specifying action flags like `--unset` or `--list`. This user interface is not really in line with how our more modern commands work, where it is a lot more customary to say e.g. `git remote list`. Furthermore, to add to the confusion, git-config(1) also allows the user to request modes implicitly by just specifying the correct number of arguments. Thus, `git config foo.bar` will retrieve the value of "foo.bar" while `git config foo.bar baz` will set it to "baz". Overall, this makes for a confusing interface that could really use a makeover. It hurts discoverability of what you can do with git-config(1) and is comparatively easy to get wrong. Converting the command to have subcommands instead would go a long way to help address these issues. One concern in this context is backwards compatibility. Luckily, we can introduce subcommands without breaking backwards compatibility at all. This is because all the implicit modes of git-config(1) require that the first argument is a properly formatted config key. And as config keys _must_ have a dot in their name, any value without a dot would have been discarded by git-config(1) previous to this change. Thus, given that none of the subcommands do have a dot, they are unambiguous. Introduce the first such new subcommand, which is "git config list". To retain backwards compatibility we only conditionally use subcommands and will fall back to the old syntax in case no subcommand was detected. This should help to transition to the new-style syntax until we eventually deprecate and remove the old-style syntax. Note that the way we handle this we're duplicating some functionality across old and new syntax. While this isn't pretty, it helps us to ensure that there really is no change in behaviour for the old syntax. Amend tests such that we run them both with old and new style syntax. As tests are now run twice, state from the first run may be still be around in the second run and thus cause tests to fail. Add cleanup logic as required to fix such tests. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: pull out function to handle `--null`	Patrick Steinhardt	1	-6/+9
	Pull out function to handle the `--null` option, which we are about to reuse in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: pull out function to handle config location	Patrick Steinhardt	1	-65/+68
	There's quite a bunch of options to git-config(1) that allow the user to specify which config location to use when reading or writing config options. The logic to handle this is thus by necessity also quite involved. Pull it out into a separate function so that we can reuse it in subsequent commits which introduce proper subcommands. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: use `OPT_CMDMODE()` to specify modes	Patrick Steinhardt	2	-18/+27
	The git-config(1) command has various different modes which are accessible via e.g. `--get-urlmatch` or `--unset-all`. These modes are declared with `OPT_BIT()`, which causes two minor issues: - The respective modes also have a negated form `--no-get-urlmatch`, which is unintended. - We have to manually handle exclusiveness of the modes. Switch these options to instead use `OPT_CMDMODE()`, which is made exactly for this usecase. Remove the now-unneeded check that only a single mode is given, which is now handled by the parse-options interface. While at it, format optional placeholders for arguments to conform to our style guidelines by using `[<placeholder>]`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: move "fixed-value" option to correct group	Patrick Steinhardt	1	-1/+1
	The `--fixed-value` option can be used to alter how the value-pattern parameter is interpreted for the various actions of git-config(1). But while it is an option, it is currently listed as part of the actions group, which is wrong. Move the option to the "Other" group, which hosts the various options known to git-config(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	builtin/config: move option array around	Patrick Steinhardt	1	-48/+48
	Move around the option array. This will help us with a follow-up commit that introduces subcommands to git-config(1). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	config: clarify memory ownership when preparing comment strings	Patrick Steinhardt	3	-16/+13
	The ownership of memory returned when preparing a comment string is quite intricate: when the returned value is different than the passed value, then the caller is responsible to free the memory. This is quite subtle, and it's even easier to miss because the returned value is in fact a `const char *`. Adapt the function to always return either `NULL` or a newly allocated string. The function is called at most once per git-config(1), so it's not like this micro-optimization really matters. Thus, callers are now always responsible for freeing the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	diff: fix --exit-code with external diff	René Scharfe	2	-3/+38
	You can ask the diff machinery to let the exit code indicate whether there are changes, e.g. with --exit-code. It as two ways to calculate that bit: The quick one assumes blobs with different hashes have different content, and the more elaborate way actually compares the contents, possibly applying transformations like ignoring whitespace. Always use the slower path by setting the flag diff_from_contents, because any of the files could have an external diff driver set via an attribute, which might consider binary differences irrelevant, like e.g. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	diff: report unmerged paths as changes in run_diff_cmd()	René Scharfe	2	-0/+9
	You can ask the diff machinery to let the exit code indicate whether there are changes, e.g. with --quiet. It as two ways to calculate that bit: The quick one assumes blobs with different hashes have different content, and the more elaborate way actually compares the contents, possibly applying transformations like ignoring whitespace. The quick way considers an unmerged file to be a change and reports exit code 1, which makes sense. The slower path uses the struct diff_options member found_changes to indicate whether the blobs differ even with the transformations applied. It's not set for unmerged files, though, resulting in exit code 0. Set found_changes in run_diff_cmd() for unmerged files, for a consistent exit code of 1 if there's an unmerged file, regardless of whether whitespace is ignored. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06	refs: return conflict error when checking packed refs	Ivan Tse	2	-1/+19
	The TRANSACTION_NAME_CONFLICT error code refers to a failure to create a ref due to a name conflict with another ref. An example of this is a directory/file conflict such as ref names A/B and A. "git fetch" uses this error code to more accurately describe the error by recommending to the user that they try running "git remote prune" to remove any old refs that are deleted by the remote which would clear up any directory/file conflicts. This helpful error message is not displayed when the conflicted ref is stored in packed refs. This change fixes this by ensuring error return code consistency in `lock_raw_ref`. Signed-off-by: Ivan Tse <ivan.tse1@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>