aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/git-gc.txt
AgeCommit message (Collapse)AuthorFilesLines
2023-10-05builtin/repack.c: implement support for `--max-cruft-size`Taylor Blau1-0/+7
Cruft packs are an alternative mechanism for storing a collection of unreachable objects whose mtimes are recent enough to avoid being pruned out of the repository. When cruft packs were first introduced back in b757353676 (builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and a7d493833f (builtin/pack-objects.c: --cruft with expiration, 2022-05-20), the recommended workflow consisted of: - Repacking periodically, either by packing anything loose in the repository (via `git repack -d`) or producing a geometric sequence of packs (via `git repack --geometric=<d> -d`). - Every so often, splitting the repository into two packs, one cruft to store the unreachable objects, and another non-cruft pack to store the reachable objects. Repositories may (out of band with the above) choose periodically to prune out some unreachable objects which have aged out of the grace period by generating a pack with `--cruft-expiration=<approxidate>`. This allowed repositories to maintain relatively few packs on average, and quarantine unreachable objects together in a cruft pack, avoiding the pitfalls of holding unreachable objects as loose while they age out (for more, see some of the details in 3d89a8c118 (Documentation/technical: add cruft-packs.txt, 2022-05-20)). This all works, but can be costly from an I/O-perspective when frequently repacking a repository that has many unreachable objects. This problem is exacerbated when those unreachable objects are rarely (if every) pruned. Since there is at most one cruft pack in the above scheme, each time we update the cruft pack it must be rewritten from scratch. Because much of the pack is reused, this is a relatively inexpensive operation from a CPU-perspective, but is very costly in terms of I/O since we end up rewriting basically the same pack (plus any new unreachable objects that have entered the repository since the last time a cruft pack was generated). At the time, we decided against implementing more robust support for multiple cruft packs. This patch implements that support which we were lacking. Introduce a new option `--max-cruft-size` which allows repositories to accumulate cruft packs up to a given size, after which point a new generation of cruft packs can accumulate until it reaches the maximum size, and so on. To generate a new cruft pack, the process works like so: - Sort a list of any existing cruft packs in ascending order of pack size. - Starting from the beginning of the list, group cruft packs together while the accumulated size is smaller than the maximum specified pack size. - Combine the objects in these cruft packs together into a new cruft pack, along with any other unreachable objects which have since entered the repository. Once a cruft pack grows beyond the size specified via `--max-cruft-size` the pack is effectively frozen. This limits the I/O churn up to a quadratic function of the value specified by the `--max-cruft-size` option, instead of behaving quadratically in the number of total unreachable objects. When pruning unreachable objects, we bypass the new code paths which combine small cruft packs together, and instead start from scratch, passing in the appropriate `--max-pack-size` down to `pack-objects`, putting it in charge of keeping the resulting set of cruft packs sized correctly. This may seem like further I/O churn, but in practice it isn't so bad. We could prune old cruft packs for whom all or most objects are removed, and then generate a new cruft pack with just the remaining set of objects. But this additional complexity buys us relatively little, because most objects end up being pruned anyway, so the I/O churn is well contained. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18builtin/gc.c: make `gc.cruftPacks` enabled by defaultTaylor Blau1-2/+3
Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects via loose, 2022-05-20), `git gc` learned the `--cruft` option and `gc.cruftPacks` configuration to opt-in to writing cruft packs when collecting or pruning unreachable objects. Cruft packs were introduced with the merge in a50036da1a (Merge branch 'tb/cruft-packs', 2022-06-03). They address the problem of "loose object explosions", where Git will write out many individual loose objects when there is a large number of unreachable objects that have not yet aged past `--prune=<date>`. Instead of keeping track of those unreachable yet recent objects via their loose object file's mtime, cruft packs collect all unreachable objects into a single pack with a corresponding `*.mtimes` file that acts as a table to store the mtimes of all unreachable objects. This prevents the need to store unreachable objects as loose as they age out of the repository, and avoids the problem of loose object explosions. Beyond avoiding loose object explosions, cruft packs also act as a more efficient mechanism to store unreachable objects as they age out of a repository. This is because pairs of similar unreachable objects serve as delta bases for one another. In 5b92477f89, the feature was introduced as experimental. Since then, GitHub has been running these patches in every repository generating hundreds of millions of cruft packs along the way. The feature is battle-tested, and avoids many pathological cases such as above. Users who either run `git gc` manually, or via `git maintenance` can benefit from having cruft packs. As such, enable cruft pack generation to take place by default (by making `gc.cruftPacks` have the default of "true" rather than "false). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18builtin/gc.c: ignore cruft packs with `--keep-largest-pack`Taylor Blau1-3/+4
When cruft packs were implemented, we never adjusted the code for `git gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft packs. This option and configuration option share a common implementation, but including cruft packs is wrong in both cases: - Running `git gc --keep-largest-pack` in a repository where the largest pack is the cruft pack itself will make it impossible for `git gc` to prune objects, since the cruft pack itself is kept. - The same is true for `gc.bigPackThreshold`, if the size of the cruft pack exceeds the limit set by the caller. In the future, it is possible that `gc.bigPackThreshold` could be used to write a separate cruft pack containing any new unreachable objects that entered the repository since the last time a cruft pack was written. There are some complexities to doing so, mainly around handling pruning objects that are in an existing cruft pack that is above the threshold (which would either need to be rewritten, or else delay pruning). Rewriting a substantially similar cruft pack isn't ideal, but it is significantly better than the status-quo. If users have large cruft packs that they don't want to rewrite, they can mark them as `*.keep` packs. But in general, if a repository has a cruft pack that is so large it is slowing down GC's, it should probably be pruned anyway. In the meantime, ignore cruft packs in the common implementation for both of these options, and add a pair of tests to prevent any future regressions here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07docs: add and use include template for config/* includesÆvar Arnfjörð Bjarmason1-2/+1
In b6a8d09f6d8 (gc docs: include the "gc.*" section from "config" in "gc", 2019-04-07) the "git gc" documentation was made to include the config/gc.txt in its "CONFIGURATION" section. We do that in several other places, but "git gc" was the only one with a blurb above the include to orient the reader. We don't want readers to carefully scrutinize "git-config(1)" and "git-gc(1)" looking for discrepancies, instead we should tell them that the latter includes a part of the former. This change formalizes that wording in two new templates to be included, one for the "git gc" case where the entire section is included from "git-config(1)", and another for when the inclusion of "git-config(1)" follows discussion unique to that documentation. In order to use that re-arrange the order of those being discussed in the "git-merge(1)" documentation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21gc: simplify --cruft descriptionRené Scharfe1-2/+1
Remove duplicate "loose objects". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26builtin/gc.c: conditionally avoid pruning objects via looseTaylor Blau1-0/+5
Expose the new `git repack --cruft` mode from `git gc` via a new opt-in flag. When invoked like `git gc --cruft`, `git gc` will avoid exploding unreachable objects as loose ones, and instead create a cruft pack and `.mtimes` file. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10docs: clarify that refs/notes/ do not keep the attached objects aliveMartin von Zweigbergk1-6/+8
`git help gc` contains this snippet: "[...] it will keep [..] objects referenced by the index, remote-tracking branches, notes saved by git notes under refs/notes/" I had interpreted that as saying that the objects that notes were attached to are kept, but that is not the case. Let's clarify the documentation by moving out the part about git notes to a separate sentence. Signed-off-by: Martin von Zweigbergk <martinvonz@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05Recommend git-filter-repo instead of git-filter-branchElijah Newren1-9/+8
filter-branch suffers from a deluge of disguised dangers that disfigure history rewrites (i.e. deviate from the deliberate changes). Many of these problems are unobtrusive and can easily go undiscovered until the new repository is in use. This can result in problems ranging from an even messier history than what led folks to filter-branch in the first place, to data loss or corruption. These issues cannot be backward compatibly fixed, so add a warning to both filter-branch and its manpage recommending that another tool (such as filter-repo) be used instead. Also, update other manpages that referenced filter-branch. Several of these needed updates even if we could continue recommending filter-branch, either due to implying that something was unique to filter-branch when it applied more generally to all history rewriting tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because something about filter-branch was used as an example despite other more commonly known examples now existing. Reword these sections to fix these issues and to avoid recommending filter-branch. Finally, remove the section explaining BFG Repo Cleaner as an alternative to filter-branch. I feel somewhat bad about this, especially since I feel like I learned so much from BFG that I put to good use in filter-repo (which is much more than I can say for filter-branch), but keeping that section presented a few problems: * In order to recommend that people quit using filter-branch, we need to provide them a recomendation for something else to use that can handle all the same types of rewrites. To my knowledge, filter-repo is the only such tool. So it needs to be mentioned. * I don't want to give conflicting recommendations to users * If we recommend two tools, we shouldn't expect users to learn both and pick which one to use; we should explain which problems one can solve that the other can't or when one is much faster than the other. * BFG and filter-repo have similar performance * All filtering types that BFG can do, filter-repo can also do. In fact, filter-repo comes with a reimplementation of BFG named bfg-ish which provides the same user-interface as BFG but with several bugfixes and new features that are hard to implement in BFG due to its technical underpinnings. While I could still mention both tools, it seems like I would need to provide some kind of comparison and I would ultimately just say that filter-repo can do everything BFG can, so ultimately it seems that it is just better to remove that section altogether. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08gc docs: remove incorrect reference to gc.auto=0Ævar Arnfjörð Bjarmason1-2/+1
The chance of a repository being corrupted due to a "gc" has nothing to do with whether or not that "gc" was invoked via "gc --auto", but whether there's other concurrent operations happening. This is already noted earlier in the paragraph, so there's no reason to suggest this here. The user can infer from the rest of the documentation that "gc" will run automatically unless gc.auto=0 is set, and we shouldn't confuse the issue by implying that "gc --auto" is somehow more prone to produce corruption than a normal "gc". Well, it is in the sense that a blocking "gc" would stop you from doing anything else in *that* particular terminal window, but users are likely to have another window, or to be worried about how concurrent "gc" on a server might cause corruption. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08gc docs: clarify that "gc" doesn't throw away referenced objectsÆvar Arnfjörð Bjarmason1-2/+2
Amend the "NOTES" section to fix up wording that's been with us since 3ffb58be0a ("doc/git-gc: add a note about what is collected", 2008-04-23). I can't remember when/where anymore (I think Freenode #Git), but at some point I was having a conversation with someone who was convinced that "gc" would prune things only referenced by e.g. refs/pull/*, and pointed to this section as proof. It turned out that they'd read the "branches and tags" wording here and thought just refs/{heads,tags}/* and refs/remotes/* etc. would be kept, which is what we enumerate explicitly. So let's say "other refs", even though just above we say "objects that are referenced anywhere in your repository". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08gc docs: downplay the usefulness of --aggressiveÆvar Arnfjörð Bjarmason1-2/+27
The existing "gc --aggressive" docs come just short of recommending to users that they run it regularly. I've personally talked to many users who've taken these docs as an advice to use this option, and have, usually it's (mostly) a waste of time. So let's clarify what it really does, and let the user draw their own conclusions. Let's also clarify the "The effects [...] are persistent" to paraphrase a brief version of Jeff King's explanation at [1]. 1. https://public-inbox.org/git/20190318235356.GK29661@sigill.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08gc docs: include the "gc.*" section from "config" in "gc"Ævar Arnfjörð Bjarmason1-79/+7
Rather than duplicating the documentation for the various "gc" options let's include the "gc" docs from git-config. They were mostly better already, and now we don't have the same docs in two places with subtly different wording. In the cases where the git-gc(1) docs were saying something the "gc" docs in git-config(1) didn't cover move the relevant section over to the git-config(1) docs. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24gc docs: clean grammar for "gc.bigPackThreshold"Ævar Arnfjörð Bjarmason1-2/+2
Clean up the grammar in the documentation for "gc.bigPackThreshold". This documentation was added in 9806f5a7bf ("gc --auto: exclude base pack if not enough mem to "repack -ad"", 2018-04-15). Saying "the amount of memory estimated for" flows more smoothly than the previous "the amount of memory is estimated not enough". Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24gc docs: stop noting "repack" flagsÆvar Arnfjörð Bjarmason1-3/+2
Remove the mention of specific flags from the "gc" documentation, and leave it at describing what we'll do instead. As seen in builtin/gc.c we'll use various repack flags depending on what we detect we need to do, so this isn't always accurate. More importantly, a subsequent change is about to remove all this documentation and replace it with an include of the gc.* docs in git-config(1). By first changing this it's easier to reason about that subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24gc docs: modernize the advice for manually running "gc"Ævar Arnfjörð Bjarmason1-11/+10
The docs have been recommending that users need to run this manually, but that hasn't been needed in practice for a long time except in exceptional circumstances. Let's instead have this reflect reality and say that most users don't need to run this manually at all, while briefly describing the sorts sort of cases where "gc" does need to be run manually. Since we're recommending that users run this most of the and usually don't need to tweak it, let's tone down the very prominent example of the gc.auto=0 command. It's sufficient to point to the gc.auto documentation below. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-11Merge branch 'rd/gc-prune-doc-fix'Junio C Hamano1-1/+1
Doxfix. * rd/gc-prune-doc-fix: docs/git-gc: fix typo "--prune=all" to "--prune=now"
2019-03-03docs/git-gc: fix typo "--prune=all" to "--prune=now"Robert P. J. Day1-1/+1
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08git-gc.txt: fix typo about gc.writeCommitGraphDerrick Stolee1-1/+1
Reported-by: Stefan Haller <stefan@haller-berlin.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11gc doc: mention the commit-graph in the introÆvar Arnfjörð Bjarmason1-1/+2
Explicitly mention in the intro that we may be writing supplemental data structures such as the commit-graph during "gc", i.e. to call out the "optimize" part of what this command does, it doesn't just "collect garbage" as the "gc" name might imply. Past changes have updated the intro to reflect new commands, such as mentioning "worktree" in b586a96a39 ("gc.txt: more details about what gc does", 2018-03-15). So let's elaborate on what was added in d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27). See also https://public-inbox.org/git/87tvm3go42.fsf@evledraar.gmail.com/ (follow-up replies) for an on-list discussion about what "gc" does. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-27gc: automatically write commit-graph filesDerrick Stolee1-0/+4
The commit-graph file is a very helpful feature for speeding up git operations. In order to make it more useful, make it possible to write the commit-graph file during standard garbage collection operations. Add a 'gc.commitGraph' config setting that triggers writing a commit-graph file after any non-trivial 'git gc' command. Defaults to false while the commit-graph feature matures. We specifically do not want to have this on by default until the commit-graph feature is fully integrated with history-modifying features like shallow clones. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-23Merge branch 'nd/doc-header'Junio C Hamano1-2/+2
Doc formatting fix. * nd/doc-header: doc: keep first level section header in upper case
2018-05-23Merge branch 'nd/repack-keep-pack'Junio C Hamano1-4/+15
"git gc" in a large repository takes a lot of time as it considers to repack all objects into one pack by default. The command has been taught to pretend as if the largest existing packfile is marked with ".keep" so that it is left untouched while objects in other packs and loose ones are repacked. * nd/repack-keep-pack: pack-objects: show some progress when counting kept objects gc --auto: exclude base pack if not enough mem to "repack -ad" gc: handle a corner case in gc.bigPackThreshold gc: add gc.bigPackThreshold config gc: add --keep-largest-pack option repack: add --keep-pack option t7700: have closing quote of a test at the beginning of line
2018-05-08Merge branch 'sg/doc-gc-quote-mismatch-fix'Junio C Hamano1-1/+1
Doc formatting fix. * sg/doc-gc-quote-mismatch-fix: docs/git-gc: fix minor rendering issue
2018-05-02doc: keep first level section header in upper caseNguyễn Thái Ngọc Duy1-2/+2
When formatted as a man page, 1st section header is always in upper case even if we write it otherwise. Make all 1st section headers uppercase to keep it close to the final output. This does affect html since case is kept there, but I still think it's a good idea to maintain a consistent style for 1st section headers. Some sections perhaps should become second sections instead, where case is kept, and for better organization. I will update if anyone has suggestions about this. While at there I also make some header more consistent (e.g. examples vs example) and fix a couple minor things here and there. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-18docs/git-gc: fix minor rendering issueSZEDER Gábor1-1/+1
An unwanted single quote character in the paragraph documenting the 'gc.aggressiveWindow' config variable prevented the name of that config variable from being rendered correctly, ever since that piece of docs was added in 0d7566a5ba (Add --aggressive option to 'git gc', 2007-05-09). Remove that single quote. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16gc --auto: exclude base pack if not enough mem to "repack -ad"Nguyễn Thái Ngọc Duy1-2/+7
pack-objects could be a big memory hog especially on large repos, everybody knows that. The suggestion to stick a .keep file on the giant base pack to avoid this problem is also known for a long time. Recent patches add an option to do just this, but it has to be either configured or activated manually. This patch lets `git gc --auto` activate this mode automatically when it thinks `repack -ad` will use a lot of memory and start affecting the system due to swapping or flushing OS cache. gc --auto decides to do this based on an estimation of pack-objects memory usage, which is quite accurate at least for the heap part, and whether that fits in half of system memory (the assumption here is for desktop environment where there are many other applications running). This mechanism only kicks in if gc.bigBasePackThreshold is not configured. If it is, it is assumed that the user already knows what they want. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16gc: add gc.bigPackThreshold configNguyễn Thái Ngọc Duy1-2/+4
The --keep-largest-pack option is not very convenient to use because you need to tell gc to do this explicitly (and probably on just a few large repos). Add a config key that enables this mode when packs larger than a limit are found. Note that there's a slight behavior difference compared to --keep-largest-pack: all packs larger than the threshold are kept, not just the largest one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16gc: add --keep-largest-pack optionNguyễn Thái Ngọc Duy1-1/+5
This adds a new repack mode that combines everything into a secondary pack, leaving the largest pack alone. This could help reduce memory pressure. On linux-2.6.git, valgrind massif reports 1.6GB heap in "pack all" case, and 535MB in "pack all except the base pack" case. We save roughly 1GB memory by excluding the base pack. This should also lower I/O because we don't have to rewrite a giant pack every time (e.g. for linux-2.6.git that's a 1.4GB pack file).. PS. The use of string_list here seems overkill, but we'll need it in the next patch... Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15gc.txt: more details about what gc doesNguyễn Thái Ngọc Duy1-9/+19
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-24docs/git-gc: fix default value for `--aggressiveDepth`Patrick Steinhardt1-1/+1
In commit 07e7dbf0d (gc: default aggressive depth to 50, 2016-08-11), the default aggressive depth of git-gc has been changed to 50. While git-config(1) has been updated to represent the new default value, git-gc(1) still mentions the old value. This patch fixes it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-17Merge branch 'mm/gc-safety-doc' into maintJunio C Hamano1-8/+26
Doc update. * mm/gc-safety-doc: git-gc.txt: expand discussion of races with other processes
2016-11-16git-gc.txt: expand discussion of races with other processesMatt McCutchen1-8/+26
In general, "git gc" may delete objects that another concurrent process is using but hasn't created a reference to. Git has some mitigations, but they fall short of a complete solution. Document this in the git-gc(1) man page and add a reference from the documentation of the gc.pruneExpire config variable. Based on a write-up by Jeff King: http://marc.info/?l=git&m=147922960131779&w=2 Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-08doc: change configuration variables formatTom Russello1-8/+8
This change configuration variables that where in italic style to monospace font according to the guideline. It was obtained with grep '[[:alpha:]]*\.[[:alpha:]]*::$' config.txt | \ sed -e 's/::$//' -e 's/\./\\\\./' | \ xargs -iP perl -pi -e "s/\'P\'/\`P\`/g" ./*.txt Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org> Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org> Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-16Merge branch 'jc/doc-gc-prune-now'Junio C Hamano1-2/+5
"git gc" is safe to run anytime only because it has the built-in grace period to protect young objects. In order to run with no grace period, the user must make sure that the repository is quiescent. * jc/doc-gc-prune-now: Documentation/gc: warn against --prune=<now>
2015-10-14Documentation/gc: warn against --prune=<now>Junio C Hamano1-2/+5
"git gc" is safe to run anytime only because it has the built-in grace period to protect objects that are created by other processes that are waiting for ref updates to anchor them to the history. In order to run with no grace period, the user must make sure that the repository is quiescent. Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13*config.txt: stick to camelCase naming conventionNguyễn Thái Ngọc Duy1-6/+6
This should improve readability. Compare "thislongname" and "thisLongName". The following keys are left in unchanged. We can decide what to do with them later. - am.keepcr - core.autocrlf .safecrlf .trustctime - diff.dirstat .noprefix - gitcvs.usecrlfattr - gui.blamehistoryctx .trustmtime - pull.twohead - receive.autogc - sendemail.signedoffbycc .smtpsslcertpath .suppresscc Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31gc --aggressive: make --depth configurableNguyễn Thái Ngọc Duy1-0/+3
When 1c192f3 (gc --aggressive: make it really aggressive - 2007-12-06) made --depth=250 the default value, it didn't really explain the reason behind, especially the pros and cons of --depth=250. An old mail from Linus below explains it at length. Long story short, --depth=250 is a disk saver and a performance killer. Not everybody agrees on that aggressiveness. Let the user configure it. From: Linus Torvalds <torvalds@linux-foundation.org> Subject: Re: [PATCH] gc --aggressive: make it really aggressive Date: Thu, 6 Dec 2007 08:19:24 -0800 (PST) Message-ID: <alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org> Gmane-URL: http://article.gmane.org/gmane.comp.gcc.devel/94637 On Thu, 6 Dec 2007, Harvey Harrison wrote: > > 7:41:25elapsed 86%CPU Heh. And this is why you want to do it exactly *once*, and then just export the end result for others ;) > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46...pack But yeah, especially if you allow longer delta chains, the end result can be much smaller (and what makes the one-time repack more expensive is the window size, not the delta chain - you could make the delta chains longer with no cost overhead at packing time) HOWEVER. The longer delta chains do make it potentially much more expensive to then use old history. So there's a trade-off. And quite frankly, a delta depth of 250 is likely going to cause overflows in the delta cache (which is only 256 entries in size *and* it's a hash, so it's going to start having hash conflicts long before hitting the 250 depth limit). So when I said "--depth=250 --window=250", I chose those numbers more as an example of extremely aggressive packing, and I'm not at all sure that the end result is necessarily wonderfully usable. It's going to save disk space (and network bandwidth - the delta's will be re-used for the network protocol too!), but there are definitely downsides too, and using long delta chains may simply not be worth it in practice. (And some of it might just want to have git tuning, ie if people think that long deltas are worth it, we could easily just expand on the delta hash, at the cost of some more memory used!) That said, the good news is that working with *new* history will not be affected negatively, and if you want to be _really_ sneaky, there are ways to say "create a pack that contains the history up to a version one year ago, and be very aggressive about those old versions that we still want to have around, but do a separate pack for newer stuff using less aggressive parameters" So this is something that can be tweaked, although we don't really have any really nice interfaces for stuff like that (ie the git delta cache size is hardcoded in the sources and cannot be set in the config file, and the "pack old history more aggressively" involves some manual scripting and knowing how "git pack-objects" works rather than any nice simple command line switch). So the thing to take away from this is: - git is certainly flexible as hell - .. but to get the full power you may need to tweak things - .. happily you really only need to have one person to do the tweaking, and the tweaked end results will be available to others that do not need to know/care. And whether the difference between 320MB and 500MB is worth any really involved tweaking (considering the potential downsides), I really don't know. Only testing will tell. Linus Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-04Merge branch 'nd/gc-lock-against-each-other'Junio C Hamano1-1/+5
* nd/gc-lock-against-each-other: gc: reject if another gc is running, unless --force is given
2013-08-09gc: reject if another gc is running, unless --force is givenNguyễn Thái Ngọc Duy1-1/+5
This may happen when `git gc --auto` is run automatically, then the user, to avoid wait time, switches to a new terminal, keeps working and `git gc --auto` is started again because the first gc instance has not clean up the repository. This patch tries to avoid multiple gc running, especially in --auto mode. In the worst case, gc may be delayed 12 hours if a daemon reuses the pid stored in gc.pid. kill(pid, 0) support is added to MinGW port so it should work on Windows too. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-18git-gc.txt, git-reflog.txt: document new expiry optionsMichael Haggerty1-2/+3
Document the new values that can be used for expiry values where their use makes sense: * git reflog expire --expire=[all|never] * git reflog expire --expire-unreachable=[all|never] * git gc --prune=all Other combinations aren't useful and therefore no documentation is added (even though they are allowed): * git gc --prune=never is redundant with "git gc --no-prune" * git prune --expire=all is equivalent to providing no --expire option * git prune --expire=never is a NOP Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26docs: stop using asciidoc no-inline-literalJeff King1-1/+1
In asciidoc 7, backticks like `foo` produced a typographic effect, but did not otherwise affect the syntax. In asciidoc 8, backticks introduce an "inline literal" inside which markup is not interpreted. To keep compatibility with existing documents, asciidoc 8 has a "no-inline-literal" attribute to keep the old behavior. We enabled this so that the documentation could be built on either version. It has been several years now, and asciidoc 7 is no longer in wide use. We can now decide whether or not we want inline literals on their own merits, which are: 1. The source is much easier to read when the literal contains punctuation. You can use `master~1` instead of `master{tilde}1`. 2. They are less error-prone. Because of point (1), we tend to make mistakes and forget the extra layer of quoting. This patch removes the no-inline-literal attribute from the Makefile and converts every use of backticks in the documentation to an inline literal (they must be cleaned up, or the example above would literally show "{tilde}" in the output). Problematic sites were found by grepping for '`.*[{\\]' and examined and fixed manually. The results were then verified by comparing the output of "html2text" on the set of generated html pages. Doing so revealed that in addition to making the source more readable, this patch fixes several formatting bugs: - HTML rendering used the ellipsis character instead of literal "..." in code examples (like "git log A...B") - some code examples used the right-arrow character instead of '->' because they failed to quote - api-config.txt did not quote tilde, and the resulting HTML contained a bogus snippet like: <tt><sub></tt> foo <tt></sub>bar</tt> which caused some parsers to choke and omit whole sections of the page. - git-commit.txt confused ``foo`` (backticks inside a literal) with ``foo'' (matched double-quotes) - mentions of `A U Thor <author@example.com>` used to erroneously auto-generate a mailto footnote for author@example.com - the description of --word-diff=plain incorrectly showed the output as "[-removed-] and {added}", not "{+added+}". - using "prime" notation like: commit `C` and its replacement `C'` confused asciidoc into thinking that everything between the first backtick and the final apostrophe were meant to be inside matched quotes - asciidoc got confused by the escaping of some of our asterisks. In particular, `credential.\*` and `credential.<url>.\*` properly escaped the asterisk in the first case, but literally passed through the backslash in the second case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-06Documentation: use [verse] for SYNOPSIS sectionsMartin von Zweigbergk1-0/+1
The SYNOPSIS sections of most commands that span several lines already use [verse] to retain line breaks. Most commands that don't span several lines seem not to use [verse]. In the HTML output, [verse] does not only preserve line breaks, but also makes the section indented, which causes a slight inconsistency between commands that use [verse] and those that don't. Use [verse] in all SYNOPSIS sections for consistency. Also remove the blank lines from git-fetch.txt and git-rebase.txt to align with the other man pages. In the case of git-rebase.txt, which already uses [verse], the blank line makes the [verse] not apply to the last line, so removing the blank line also makes the formatting within the document more consistent. While at it, add single quotes to 'git cvsimport' for consistency with other commands. Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-11doc: drop author/documentation sections from most pagesJeff King1-4/+0
The point of these sections is generally to: 1. Give credit where it is due. 2. Give the reader an idea of where to ask questions or file bug reports. But they don't do a good job of either case. For (1), they are out of date and incomplete. A much more accurate answer can be gotten through shortlog or blame. For (2), the correct contact point is generally git@vger, and even if you wanted to cc the contact point, the out-of-date and incomplete fields mean you're likely sending to somebody useless. So let's drop the fields entirely from all manpages except git(1) itself. We already point people to the mailing list for bug reports there, and we can update the Authors section to give credit to the major contributors and point to shortlog and blame for more information. Each page has a "This is part of git" footer, so people can follow that to the main git manpage.
2010-12-19Merge branch 'maint'Junio C Hamano1-1/+1
* maint: gitweb: Include links to feeds in HTML header only for '200 OK' response fsck docs: remove outdated and useless diagnostic userdiff: fix typo in ruby and python word regexes trace.c: mark file-local function static Fix typo in git-gc document.
2010-12-17Fix typo in git-gc document.Jiang Xin1-1/+1
The variable gc.packrefs for git-gc can be set to true, false and "notbare", not "nobare". Signed-off-by: Jiang Xin <jiangxin@ossxp.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-03Change remote tracking to remote-tracking in non-trivial placesMatthieu Moy1-3/+3
To complement the straightforward perl application in previous patch, this adds a few manual changes. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-05Merge branch 'maint'Junio C Hamano1-0/+7
* maint: t0006: test timezone parsing rerere.txt: Document forget subcommand Documentation/git-gc.txt: add reference to githooks
2010-07-02Documentation/git-gc.txt: add reference to githooksChris Packham1-0/+7
This advertises the existence of the 'pre-auto-gc' hook and adds a cross reference to where the hook is documented. Signed-off-by: Chris Packham <judge.packham@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-21Merge branch 'jc/maint-no-reflog-expire-unreach-for-head'Junio C Hamano1-0/+10
* jc/maint-no-reflog-expire-unreach-for-head: reflog --expire-unreachable: special case entries in "HEAD" reflog more war on "sleep" in tests Document gc.<pattern>.reflogexpire variables Conflicts: Documentation/config.txt
2010-04-14Document gc.<pattern>.reflogexpire variablesJunio C Hamano1-0/+10
3cb22b8 (Per-ref reflog expiry configuration, 2008-06-15) added support for setting the expiry parameters differently for different reflog, but it was never documented. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-10Documentation: spell 'git cmd' without dash throughoutThomas Rast1-10/+10
The documentation was quite inconsistent when spelling 'git cmd' if it only refers to the program, not to some specific invocation syntax: both 'git-cmd' and 'git cmd' spellings exist. The current trend goes towards dashless forms, and there is precedent in 647ac70 (git-svn.txt: stop using dash-form of commands., 2009-07-07) to actively eliminate the dashed variants. Replace 'git-cmd' with 'git cmd' throughout, except where git-shell, git-cvsserver, git-upload-pack, git-receive-pack, and git-upload-archive are concerned, because those really live in the $PATH.
2010-01-10Documentation: format full commands in typewriter fontThomas Rast1-1/+1
Use `code snippet` style instead of 'emphasis' for `git cmd ...` according to the following rules: * The SYNOPSIS sections are left untouched. * If the intent is that the user type the command exactly as given, it is `code`. If the user is only loosely referred to a command and/or option, it remains 'emphasised'. Signed-off-by: Thomas Rast <trast@student.ethz.ch>
2009-10-20Documentation/git-gc.txt: change "references" to "reference"Matt Kraai1-1/+1
Signed-off-by: Matt Kraai <kraai@ftbfs.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-29Documentation/git-gc.txt: default --aggressive window is 250, not 10Brandon Casey1-1/+1
The default --aggressive window has been 250 since 1c192f34 "gc --aggressive: make it really aggressive", released in git v1.6.3. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2009-08-27Fix overridable written with an extra 'e'Nanako Shiraishi1-1/+1
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-14gc: make --prune useful again by accepting an optional parameterJohannes Schindelin1-1/+9
With this patch, "git gc --no-prune" will not prune any loose (and dangling) object, and "git gc --prune=5.minutes.ago" will prune all loose objects older than 5 minutes. This patch benefitted from suggestions by Thomas Rast and Jan Krï¿œger. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05manpages: italicize git command names (which were in teletype font)Jonathan Nieder1-10/+10
The names of git commands are not meant to be entered at the commandline; they are just names. So we render them in italics, as is usual for command names in manpages. Using doit () { perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }' } for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \ merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt do doit <"$i" >"$i+" && mv "$i+" "$i" done git diff . Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01Documentation formatting and cleanupJonathan Nieder1-5/+5
Following what appears to be the predominant style, format names of commands and commandlines both as `teletype text`. While we're at it, add articles ("a" and "the") in some places, italicize the name of the command in the manual page synopsis line, and add a comma or two where it seems appropriate. Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01Documentation: be consistent about "git-" versus "git "Jonathan Nieder1-3/+3
Since the git-* commands are not installed in $(bindir), using "git-command <parameters>" in examples in the documentation is not a good idea. On the other hand, it is nice to be able to refer to each command using one hyphenated word. (There is no escaping it, anyway: man page names cannot have spaces in them.) This patch retains the dash in naming an operation, command, program, process, or action. Complete command lines that can be entered at a shell (i.e., without options omitted) are made to use the dashless form. The changes consist only of replacing some spaces with hyphens and vice versa. After a "s/ /-/g", the unpatched and patched versions are identical. Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-06documentation: move git(7) to git(1)Christian Couder1-1/+1
As the "git" man page describes the "git" command at the end-user level, it seems better to move it to man section 1. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-28Manual subsection to refer to other pages is SEE ALSOJunio C Hamano1-1/+1
Consistently say so in all caps as it is customary to do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-24doc/git-gc: add a note about what is collectedJeff King1-0/+15
It seems to be a FAQ that people try running git-gc, and then get puzzled about why the size of their .git directory didn't change. This note mentions the reasons why things might unexpectedly get kept. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-19make it easier for people who just want to get rid of 'git gc --auto'Nicolas Pitre1-2/+9
Give a direct hint to those who feel highly annoyed by the auto gc behavior. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-12gc: call "prune --expire 2.weeks.ago" by defaultJohannes Schindelin1-12/+5
The only reason we did not call "prune" in git-gc was that it is an inherently dangerous operation: if there is a commit going on, you will prune loose objects that were just created, and are, in fact, needed by the commit object just about to be created. Since it is dangerous, we told users so. That led to many users not even daring to run it when it was actually safe. Besides, they are users, and should not have to remember such details as when to call git-gc with --prune, or to call git-prune directly. Of course, the consequence was that "git gc --auto" gets triggered much more often than we would like, since unreferenced loose objects (such as left-overs from a rebase or a reset --hard) were never pruned. Alas, git-prune recently learnt the option --expire <minimum-age>, which makes it a much safer operation. This allows us to call prune from git-gc, with a grace period of 2 weeks for the unreferenced loose objects (this value was determined in a discussion on the git list as a safe one). If you want to override this grace period, just set the config variable gc.pruneExpire to a different value; an example would be [gc] pruneExpire = 6.months.ago or even "never", if you feel really paranoid. Note that this new behaviour makes "--prune" be a no-op. While adding a test to t5304-prune.sh (since it really tests the implicit call to "prune"), also the original test for "prune --expire" was moved there from t1410-reflog.sh, where it did not belong. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2008-03-01gc: Add --quiet optionFrank Lichtenheld1-1/+4
Pass -q option to git-repack. Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-09Change git-gc documentation to reflect gc.packrefs implementation.Florian La Roche1-3/+3
56752391a8c0c591853b276e4fa0b45c34ced181 (Make "git gc" pack all refs by default) changed the default of gc.packrefs to true, to pack all refs by default in any repository. IOW, the users need to disable it explicitly if they want to by setting the config variable, since 1.5.3. However, we forgot to update the documentation. This fixes it. Signed-off-by: Florian La Roche <laroche@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-06Documentation: rename gitlink macro to linkgitDan McGee1-7/+7
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock Asciidoc configuration: @@ -149,7 +153,10 @@ # Inline macros. # Backslash prefix required for escape processing. # (?s) re flag for line spanning. -(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])= + +# Explicit so they can be nested. +(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])= + # Anchor: [[[id]]]. Bibliographic anchor. (?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3 # Anchor: [[id,xreflabel]] This default regex now matches explicit values, and unfortunately in this case gitlink was being matched by just 'link', causing the wrong inline macro template to be applied. By renaming the macro, we can avoid being matched by the wrong regex. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-18Documentation/git-gc: improve description of --autoJeff King1-12/+17
This patch tries to make the description of --auto a little more clear for new users, especially those referred by the "git-gc --auto" notification message. It also cleans up some grammatical errors and typos in the original description, as well as rewording for clarity. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-18Documentation/git-gc: explain --auto in descriptionJeff King1-1/+2
Now that git-gc --auto tells the user to look at the man page, it makes sense to mention the auto behavior near the top (since this is likely to be most users' first exposure to git-gc). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-09-17git-gc --auto: run "repack -A -d -l" as necessary.Junio C Hamano1-1/+6
This teaches "git-gc --auto" to consolidate many packs into one without losing unreachable objects in them by using "repack -A" when there are too many packfiles that are not marked with *.keep in the repository. gc.autopacklimit configuration can be used to set the maximum number of packs a repository is allowed to have before this mechanism kicks in. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-17git-gc --auto: add documentation.Junio C Hamano1-1/+10
This documents the auto-packing of loose objects performed by git-gc --auto. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-05-31Fix minor grammatical typos in the git-gc man pageTheodore Ts'o1-3/+3
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10Add --aggressive option to 'git gc'Theodore Tso1-1/+15
This option causes 'git gc' to more aggressively optimize the repository at the cost of taking much more wall clock and CPU time. Today this option causes git-pack-objects to use --no-use-delta option, and it allows the --window parameter to be set via the gc.aggressiveWindow configuration parameter. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-13git-gc: run pack-refs by default unless the repo is bareJohannes Schindelin1-0/+4
The config variable gc.packrefs is tristate now: "true", "false" and "notbare", where "notbare" is the default. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-21git-gc: do not run git-prune by default.Junio C Hamano1-1/+16
git-prune is not safe when run uncontrolled in parallel while other git operations are creating new objects. To avoid mistakes, do not run git-prune by default from git-gc. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-17Documentation: a few spelling fixesRené Scharfe1-1/+1
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-27Create 'git gc' to perform common maintenance operations.Shawn O. Pearce1-0/+64
Junio asked for a 'git gc' utility which users can execute on a regular basis to perform basic repository actions such as: * pack-refs --prune * reflog expire * repack -a -d * prune * rerere gc So here is a command which does exactly that. The parameters fed to reflog's expire subcommand can be chosen by the user by setting configuration options in .git/config (or ~/.gitconfig), as users may want different expiration windows for each repository but shouldn't be bothered to remember what they are all of the time. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>