author | Junio C Hamano <gitster@pobox.com> | 2021-03-02 23:07:49 -0800 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2021-03-02 23:07:49 -0800 |
commit | b66f8a5b45f896b6a49ddc1a1b10518671dbfa60 (patch) | |
tree | 9889c649f0e3f51c138862042531cf77a23d8284 | |
parent | a372d5bb569776befd395d9c66067ecc09f13b96 (diff) | |
download | git-htmldocs-b66f8a5b45f896b6a49ddc1a1b10518671dbfa60.tar.gz |
Autogenerated HTML docs for v2.31.0-rc1
33 files changed, 307 insertions, 71 deletions
diff --git a/RelNotes/2.31.0.txt b/RelNotes/2.31.0.txt
index ef8b0d158..04bd5b70a 100644
--- a/RelNotes/2.31.0.txt
+++ b/RelNotes/2.31.0.txt
@@ -197,6 +197,31 @@ Performance, Internal Implementation, Development Support etc.
  * The code to implement "git merge-base --independent" was poorly
  done and was kept from the very beginning of the feature.

+ * Preliminary changes to fsmonitor integration.
+
+ * Performance optimization work on the rename detection continues.
+
+ * The common code to deal with "chunked file format" that is shared
+ by the multi-pack-index and commit-graph files have been factored
+ out, to help codepaths for both filetypes to become more robust.
+
+ * The approach to "fsck" the incoming objects in "index-pack" is
+ attractive for performance reasons (we have them already in core,
+ inflated and ready to be inspected), but fundamentally cannot be
+ applied fully when we receive more than one pack stream, as a tree
+ object in one pack may refer to a blob object in another pack as
+ ".gitmodules", when we want to inspect blobs that are used as
+ ".gitmodules" file, for example. Teach "index-pack" to emit
+ objects that must be inspected later and check them in the calling
+ "fetch-pack" process.
+
+ * The logic to handle "trailer" related placeholders in the
+ "--format=" mechanisms in the "log" family and "for-each-ref"
+ family is getting unified.
+
+ * Raise the buffer size used when writing the index file out from
+ (obviously too small) 8kB to (clearly sufficiently large) 128kB.
+

 Fixes since v2.30
 -----------------
@@ -318,6 +343,12 @@ Fixes since v2.30
  corrected.
  (merge 20e416409f jc/push-delete-nothing later to maint).

+ * Test script modernization.
+ (merge 488acf15df sv/t7001-modernize later to maint).
+
+ * An under-allocation for the untracked cache data has been corrected.
+ (merge 6347d649bc jh/untracked-cache-fix later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
  (merge e3f5da7e60 sg/t7800-difftool-robustify later to maint).
  (merge 9d336655ba js/doc-proto-v2-response-end later to maint).
diff --git a/SubmittingPatches.html b/SubmittingPatches.html
index 0a4ac313f..feb77597f 100644
--- a/SubmittingPatches.html
+++ b/SubmittingPatches.html
@@ -1393,7 +1393,7 @@ this problem around.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:50 PST
+ 2021-03-02 23:05:12 PST
</div>
</div>
</body>
diff --git a/git-for-each-ref.html b/git-for-each-ref.html
index 728a5fbf3..0dbb45fa5 100644
--- a/git-for-each-ref.html
+++ b/git-for-each-ref.html
@@ -1163,11 +1163,9 @@ contents:lines=N
 </dd>
</dl></div>
<div class="paragraph"><p>Additionally, the trailers as interpreted by <a href="git-interpret-trailers.html">git-interpret-trailers(1)</a>
-are obtained as <code>trailers</code> (or by using the historical alias
-<code>contents:trailers</code>). Non-trailer lines from the trailer block can be omitted
-with <code>trailers:only</code>. Whitespace-continuations can be removed from trailers so
-that each trailer appears on a line by itself with its full content with
-<code>trailers:unfold</code>. Both can be used together as <code>trailers:unfold,only</code>.</p></div>
+are obtained as <code>trailers[:options]</code> (or by using the historical alias
+<code>contents:trailers[:options]</code>). For valid [:option] values see <code>trailers</code>
+section of <a href="git-log.html">git-log(1)</a>.</p></div>
<div class="paragraph"><p>For sorting purposes, fields with numeric values sort in numeric order
(<code>objectsize</code>, <code>authordate</code>, <code>committerdate</code>, <code>creatordate</code>, <code>taggerdate</code>).
All other fields are used to sort in their byte-value order.</p></div>
@@ -1326,7 +1324,7 @@ commits and from none of the <code>--no-merged</code> commits are shown.</p></di
 <div id="footer">
<div id="footer-text">
Last updated
- 2020-09-22 13:11:20 PDT
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
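The text removed in this hunk spelled out two of the `trailers` options, `only` and `unfold`, which the new wording delegates to git-log(1). As a rough illustration of what those two options do to a trailer block, here is a minimal Python sketch. It is not Git's implementation: the function name `format_trailers`, the regular expression used to recognize a trailer key, and the sample block are all assumptions made for this example.

```python
import re

def format_trailers(block, unfold=False, only=False):
    """Approximate the effect of trailers:unfold and trailers:only (sketch)."""
    # Group physical lines into logical items: a line that starts with
    # whitespace is treated as a continuation of the previous item.
    items = []
    for line in block.splitlines():
        if line[:1] in (" ", "\t") and items:
            items[-1].append(line)
        else:
            items.append([line])

    out = []
    for item in items:
        # Assumption: a trailer looks like "Key: value" with a token-like key.
        is_trailer = re.match(r"^[A-Za-z0-9-]+\s*:", item[0]) is not None
        if only and not is_trailer:
            continue  # "only": drop non-trailer lines from the block
        if unfold and is_trailer:
            # "unfold": put the full trailer content on a single line
            out.append(" ".join(part.strip() for part in item))
        else:
            out.extend(item)
    return "\n".join(out)

block = (
    "Signed-off-by: A U Thor <author@example.com>\n"
    "(cherry picked from commit abc123)\n"
    "Reviewed-by: R E Viewer\n"
    "  <reviewer@example.com>"
)
result = format_trailers(block, unfold=True, only=True)
print(result)
```

With both options enabled, the parenthesized non-trailer line is dropped and the folded `Reviewed-by` value is joined onto one line, which mirrors the combined `trailers:unfold,only` form mentioned in the removed text.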
diff --git a/git-for-each-ref.txt b/git-for-each-ref.txt
index 2962f85a5..2ae2478de 100644
--- a/git-for-each-ref.txt
+++ b/git-for-each-ref.txt
@@ -260,11 +260,9 @@ contents:lines=N::
  The first `N` lines of the message.

 Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
-are obtained as `trailers` (or by using the historical alias
-`contents:trailers`). Non-trailer lines from the trailer block can be omitted
-with `trailers:only`. Whitespace-continuations can be removed from trailers so
-that each trailer appears on a line by itself with its full content with
-`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
+are obtained as `trailers[:options]` (or by using the historical alias
+`contents:trailers[:options]`). For valid [:option] values see `trailers`
+section of linkgit:git-log[1].

 For sorting purposes, fields with numeric values sort in numeric order
 (`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).
diff --git a/git-http-fetch.html b/git-http-fetch.html
index e2b53b65b..7e4411ab1 100644
--- a/git-http-fetch.html
+++ b/git-http-fetch.html
@@ -819,11 +819,22 @@ commit-id
 </dt>
<dd>
<p>
- Instead of a commit id on the command line (which is not expected in
+ For internal use only. Instead of a commit id on the command
+ line (which is not expected in
this case), <em>git http-fetch</em> fetches the packfile directly at the given
URL and uses index-pack to generate corresponding .idx and .keep files.
The hash is used to determine the name of the temporary file and is
- arbitrary. The output of index-pack is printed to stdout.
+ arbitrary. The output of index-pack is printed to stdout. Requires
+ --index-pack-args.
+</p>
+</dd>
+<dt class="hdlist1">
+--index-pack-args=<args>
+</dt>
+<dd>
+<p>
+ For internal use only. The command to run on the contents of the
+ downloaded pack. Arguments are URL-encoded separated by spaces.
</p>
</dd>
<dt class="hdlist1">
@@ -849,7 +860,7 @@ commit-id
 <div id="footer">
<div id="footer-text">
Last updated
- 2020-06-25 14:07:29 PDT
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
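The new `--index-pack-args` text says only that the arguments are "URL-encoded separated by spaces". A minimal Python sketch of one way to build and split such a value follows. It assumes ordinary percent-encoding as provided by `urllib.parse`; exactly which characters Git itself escapes is not stated in this diff, and the helper names and sample arguments are hypothetical.

```python
from urllib.parse import quote, unquote

def encode_args(args):
    """Percent-encode each argument, then join the results with spaces."""
    # safe="" makes quote() escape "/" and "=" too, so a literal space can
    # only ever appear as the separator between arguments.
    return " ".join(quote(arg, safe="") for arg in args)

def decode_args(value):
    """Split on the separator and undo the percent-encoding."""
    return [unquote(part) for part in value.split(" ")]

# Hypothetical argument list; the last argument contains spaces on purpose.
args = ["index-pack", "--stdin", "--keep=fetch from example host"]
encoded = encode_args(args)
decoded = decode_args(encoded)
print(encoded)
```

Encoding every argument before joining is what makes a space an unambiguous separator: any space inside an argument has already become `%20`.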
diff --git a/git-http-fetch.txt b/git-http-fetch.txt
index 4deb4893f..9fa17b60e 100644
--- a/git-http-fetch.txt
+++ b/git-http-fetch.txt
@@ -41,11 +41,17 @@ commit-id::
  <commit-id>['\t'<filename-as-in--w>]

 --packfile=<hash>::
- Instead of a commit id on the command line (which is not expected in
+ For internal use only. Instead of a commit id on the command
+ line (which is not expected in
  this case), 'git http-fetch' fetches the packfile directly at the given
  URL and uses index-pack to generate corresponding .idx and .keep files.
  The hash is used to determine the name of the temporary file and is
- arbitrary. The output of index-pack is printed to stdout.
+ arbitrary. The output of index-pack is printed to stdout. Requires
+ --index-pack-args.
+
+--index-pack-args=<args>::
+ For internal use only. The command to run on the contents of the
+ downloaded pack. Arguments are URL-encoded separated by spaces.

 --recover::
  Verify that everything reachable from target is fetched. Used after
diff --git a/git-index-pack.html b/git-index-pack.html
index 4cf02f29a..6169e4ae2 100644
--- a/git-index-pack.html
+++ b/git-index-pack.html
@@ -885,8 +885,12 @@ the objects/pack/ directory of a Git repository.</p></div>
 </dt>
<dd>
<p>
- Die if the pack contains broken objects. For internal use only.
+ For internal use only.
</p>
+<div class="paragraph"><p>Die if the pack contains broken objects. If the pack contains a tree
+pointing to a .gitmodules blob that does not exist, prints the hash of
+that blob (for the caller to check) after the hash that goes into the
+name of the pack/idx file (see "Notes").</p></div>
</dd>
<dt class="hdlist1">
--threads=<n>
@@ -954,7 +958,7 @@ mentioned above.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-12 14:43:56 PST
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
diff --git a/git-index-pack.txt b/git-index-pack.txt
index 69ba904d4..7fa74b9e7 100644
--- a/git-index-pack.txt
+++ b/git-index-pack.txt
@@ -86,7 +86,12 @@ OPTIONS
  Die if the pack contains broken links. For internal use only.

 --fsck-objects::
- Die if the pack contains broken objects. For internal use only.
+ For internal use only.
++
+Die if the pack contains broken objects. If the pack contains a tree
+pointing to a .gitmodules blob that does not exist, prints the hash of
+that blob (for the caller to check) after the hash that goes into the
+name of the pack/idx file (see "Notes").

 --threads=<n>::
  Specifies the number of threads to spawn when resolving
diff --git a/gitdiffcore.html b/gitdiffcore.html
index 7fe36ab8a..bfb02082e 100644
--- a/gitdiffcore.html
+++ b/gitdiffcore.html
@@ -936,6 +936,25 @@ files are "similar enough", and can be customized to use
 a similarity score different from the default of 50% by giving a
number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
8/10 = 80%).</p></div>
+<div class="paragraph"><p>Note that when rename detection is on but both copy and break
+detection are off, rename detection adds a preliminary step that first
+checks if files are moved across directories while keeping their
+filename the same. If there is a file added to a directory whose
+contents is sufficiently similar to a file with the same name that got
+deleted from a different directory, it will mark them as renames and
+exclude them from the later quadratic step (the one that pairwise
+compares all unmatched files to find the "best" matches, determined by
+the highest content similarity). So, for example, if a deleted
+docs/ext.txt and an added docs/config/ext.txt are similar enough, they
+will be marked as a rename and prevent an added docs/ext.md that may
+be even more similar to the deleted docs/ext.txt from being considered
+as the rename destination in the later step. For this reason, the
+preliminary "match same filename" step uses a bit higher threshold to
+mark a file pair as a rename and stop considering other candidates for
+better matches. At most, one comparison is done per file in this
+preliminary pass; so if there are several remaining ext.txt files
+throughout the directory hierarchy after exact rename detection, this
+preliminary step will be skipped for those files.</p></div>
<div class="paragraph"><p>Note. When the "-C" option is used with <code>--find-copies-harder</code>
option, <em>git diff-*</em> commands feed unmodified filepairs to
diffcore mechanism as well as modified ones. This lets the copy
@@ -1089,7 +1108,7 @@ not sorted when diffcore-order is in effect.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:29 PST
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
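The paragraph added to gitdiffcore in this commit describes a preliminary "match same filename" pass that runs before the quadratic comparison. The following Python sketch models that pass under stated assumptions: `difflib.SequenceMatcher` stands in for Git's own similarity score, the value `0.75` is a hypothetical placeholder for the "bit higher threshold" (the text gives no number), and the function and variable names are invented for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
import posixpath

BASENAME_THRESHOLD = 0.75  # hypothetical; the text only says "a bit higher"

def similarity(a, b):
    """Stand-in similarity score in [0, 1]; not Git's actual metric."""
    return SequenceMatcher(None, a, b).ratio()

def match_same_basename(deleted, added, threshold=BASENAME_THRESHOLD):
    """deleted/added map path -> content; return {old_path: new_path}."""
    deleted_by_name = defaultdict(list)
    added_by_name = defaultdict(list)
    for path in deleted:
        deleted_by_name[posixpath.basename(path)].append(path)
    for path in added:
        added_by_name[posixpath.basename(path)].append(path)

    renames = {}
    for name, old_paths in deleted_by_name.items():
        new_paths = added_by_name.get(name, [])
        # At most one comparison per file: skip names that are ambiguous.
        if len(old_paths) != 1 or len(new_paths) != 1:
            continue
        old, new = old_paths[0], new_paths[0]
        if similarity(deleted[old], added[new]) >= threshold:
            renames[old] = new  # excluded from the later quadratic step
    return renames

text = "alpha\nbeta\ngamma\ndelta\n"
deleted = {"docs/ext.txt": text}
added = {
    "docs/config/ext.txt": text + "epsilon\n",  # same basename, similar
    "docs/ext.md": text,                         # identical, other basename
}
renames = match_same_basename(deleted, added)
print(renames)
```

The sample data reproduces the trade-off the text describes: `docs/ext.md` is the closer match by content, but the same-basename pair is accepted first and never reaches the pairwise step.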
diff --git a/gitdiffcore.txt b/gitdiffcore.txt
index 2bd122047..1c7269655 100644
--- a/gitdiffcore.txt
+++ b/gitdiffcore.txt
@@ -169,6 +169,26 @@ a similarity score different from the default of 50% by giving a
 number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
 8/10 = 80%).

+Note that when rename detection is on but both copy and break
+detection are off, rename detection adds a preliminary step that first
+checks if files are moved across directories while keeping their
+filename the same. If there is a file added to a directory whose
+contents is sufficiently similar to a file with the same name that got
+deleted from a different directory, it will mark them as renames and
+exclude them from the later quadratic step (the one that pairwise
+compares all unmatched files to find the "best" matches, determined by
+the highest content similarity). So, for example, if a deleted
+docs/ext.txt and an added docs/config/ext.txt are similar enough, they
+will be marked as a rename and prevent an added docs/ext.md that may
+be even more similar to the deleted docs/ext.txt from being considered
+as the rename destination in the later step. For this reason, the
+preliminary "match same filename" step uses a bit higher threshold to
+mark a file pair as a rename and stop considering other candidates for
+better matches. At most, one comparison is done per file in this
+preliminary pass; so if there are several remaining ext.txt files
+throughout the directory hierarchy after exact rename detection, this
+preliminary step will be skipped for those files.
+
 Note. When the "-C" option is used with `--find-copies-harder`
 option, 'git diff-{asterisk}' commands feed unmodified filepairs to
 diffcore mechanism as well as modified ones. This lets the copy
diff --git a/howto-index.html b/howto-index.html
index 5914844f2..b1b0ef8cf 100644
--- a/howto-index.html
+++ b/howto-index.html
@@ -885,7 +885,7 @@ later validate it.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:38 PST
+ 2021-03-02 23:05:05 PST
</div>
</div>
</body>
diff --git a/howto/keep-canonical-history-correct.html b/howto/keep-canonical-history-correct.html
index 99bc0c2e9..1e3aa8220 100644
--- a/howto/keep-canonical-history-correct.html
+++ b/howto/keep-canonical-history-correct.html
@@ -938,7 +938,7 @@ tip of your <em>master</em> again and redo the two merges:</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:46 PST
+ 2021-03-02 23:05:10 PST
</div>
</div>
</body>
diff --git a/howto/maintain-git.html b/howto/maintain-git.html
index 75a24710f..e90b6aad1 100644
--- a/howto/maintain-git.html
+++ b/howto/maintain-git.html
@@ -1469,7 +1469,7 @@ $ git update-ref -d $mf/ai/topic</code></pre>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:46 PST
+ 2021-03-02 23:05:10 PST
</div>
</div>
</body>
diff --git a/howto/new-command.html b/howto/new-command.html
index 215c063b9..9fed0f734 100644
--- a/howto/new-command.html
+++ b/howto/new-command.html
@@ -863,7 +863,7 @@ letter [PATCH 0/n].
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:40 PST
+ 2021-03-02 23:05:07 PST
</div>
</div>
</body>
diff --git a/howto/rebase-from-internal-branch.html b/howto/rebase-from-internal-branch.html
index bf1e2eff7..9a945d376 100644
--- a/howto/rebase-from-internal-branch.html
+++ b/howto/rebase-from-internal-branch.html
@@ -895,7 +895,7 @@ the #1' commit.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:45 PST
+ 2021-03-02 23:05:10 PST
</div>
</div>
</body>
diff --git a/howto/rebuild-from-update-hook.html b/howto/rebuild-from-update-hook.html
index a85f4f8af..9a139d066 100644
--- a/howto/rebuild-from-update-hook.html
+++ b/howto/rebuild-from-update-hook.html
@@ -847,7 +847,7 @@ This is still crude and does not protect against simultaneous
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:45 PST
+ 2021-03-02 23:05:09 PST
</div>
</div>
</body>
diff --git a/howto/recover-corrupted-blob-object.html b/howto/recover-corrupted-blob-object.html
index add827fdb..f79462520 100644
--- a/howto/recover-corrupted-blob-object.html
+++ b/howto/recover-corrupted-blob-object.html
@@ -880,7 +880,7 @@ thing.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:45 PST
+ 2021-03-02 23:05:09 PST
</div>
</div>
</body>
diff --git a/howto/recover-corrupted-object-harder.html b/howto/recover-corrupted-object-harder.html
index 2257e89ee..f05a619f2 100644
--- a/howto/recover-corrupted-object-harder.html
+++ b/howto/recover-corrupted-object-harder.html
@@ -1189,7 +1189,7 @@ int main(int argc, char **argv)
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:45 PST
+ 2021-03-02 23:05:09 PST
</div>
</div>
</body>
diff --git a/howto/revert-a-faulty-merge.html b/howto/revert-a-faulty-merge.html
index 1ba4e1ab7..b7318594e 100644
--- a/howto/revert-a-faulty-merge.html
+++ b/howto/revert-a-faulty-merge.html
@@ -1025,7 +1025,7 @@ P---o---o---M---x---x---W---x---M2
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:44 PST
+ 2021-03-02 23:05:09 PST
</div>
</div>
</body>
diff --git a/howto/revert-branch-rebase.html b/howto/revert-branch-rebase.html
index 478209f04..541d2b1b8 100644
--- a/howto/revert-branch-rebase.html
+++ b/howto/revert-branch-rebase.html
@@ -907,7 +907,7 @@ Committed merge 7fb9b7262a1d1e0a47bbfdcbbcf50ce0635d3f8f
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:41 PST
+ 2021-03-02 23:05:07 PST
</div>
</div>
</body>
diff --git a/howto/separating-topic-branches.html b/howto/separating-topic-branches.html
index 561d8a1c4..975558e54 100644
--- a/howto/separating-topic-branches.html
+++ b/howto/separating-topic-branches.html
@@ -841,7 +841,7 @@ o---o"master"</code></pre>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:43 PST
+ 2021-03-02 23:05:09 PST
</div>
</div>
</body>
diff --git a/howto/setup-git-server-over-http.html b/howto/setup-git-server-over-http.html
index aa3b9e917..687b12e27 100644
--- a/howto/setup-git-server-over-http.html
+++ b/howto/setup-git-server-over-http.html
@@ -1071,7 +1071,7 @@ help diagnosing the problem, but removes security checks.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:43 PST
+ 2021-03-02 23:05:08 PST
</div>
</div>
</body>
diff --git a/howto/update-hook-example.html b/howto/update-hook-example.html
index f470fe56c..7f5d8b0ec 100644
--- a/howto/update-hook-example.html
+++ b/howto/update-hook-example.html
@@ -930,7 +930,7 @@ that JC can make non-fast-forward pushes on it.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:42 PST
+ 2021-03-02 23:05:08 PST
</div>
</div>
</body>
diff --git a/howto/use-git-daemon.html b/howto/use-git-daemon.html
index 4380a5015..9888c8a49 100644
--- a/howto/use-git-daemon.html
+++ b/howto/use-git-daemon.html
@@ -791,7 +791,7 @@ a good practice to put the paths after a "--" separator.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:42 PST
+ 2021-03-02 23:05:08 PST
</div>
</div>
</body>
diff --git a/howto/using-merge-subtree.html b/howto/using-merge-subtree.html
index fa5ba7ce0..03d24bbe5 100644
--- a/howto/using-merge-subtree.html
+++ b/howto/using-merge-subtree.html
@@ -848,7 +848,7 @@ Please note that if the other project merges from you, then it will
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:41 PST
+ 2021-03-02 23:05:07 PST
</div>
</div>
</body>
diff --git a/howto/using-signed-tag-in-pull-request.html b/howto/using-signed-tag-in-pull-request.html
index d59c08bc3..2180fad9f 100644
--- a/howto/using-signed-tag-in-pull-request.html
+++ b/howto/using-signed-tag-in-pull-request.html
@@ -952,7 +952,7 @@ as part of the merge commit.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:41 PST
+ 2021-03-02 23:05:07 PST
</div>
</div>
</body>
diff --git a/technical/api-index.html b/technical/api-index.html
index bdd0c29f4..35a7a1130 100644
--- a/technical/api-index.html
+++ b/technical/api-index.html
@@ -770,7 +770,7 @@ documents them.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-25 17:29:59 PST
+ 2021-03-02 23:05:16 PST
</div>
</div>
</body>
diff --git a/technical/chunk-format.txt b/technical/chunk-format.txt
new file mode 100644
index 000000000..593614fce
--- /dev/null
+++ b/technical/chunk-format.txt
@@ -0,0 +1,116 @@
+Chunk-based file formats
+========================
+
+Some file formats in Git use a common concept of "chunks" to describe
+sections of the file. This allows structured access to a large file by
+scanning a small "table of contents" for the remaining data. This common
+format is used by the `commit-graph` and `multi-pack-index` files. See
+link:technical/pack-format.html[the `multi-pack-index` format] and
+link:technical/commit-graph-format.html[the `commit-graph` format] for
+how they use the chunks to describe structured data.
+
+A chunk-based file format begins with some header information custom to
+that format. That header should include enough information to identify
+the file type, format version, and number of chunks in the file. From this
+information, that file can determine the start of the chunk-based region.
+
+The chunk-based region starts with a table of contents describing where
+each chunk starts and ends. This consists of (C+1) rows of 12 bytes each,
+where C is the number of chunks. Consider the following table:
+
+  | Chunk ID (4 bytes) | Chunk Offset (8 bytes) |
+  |--------------------|------------------------|
+  | ID[0]              | OFFSET[0]              |
+  | ...                | ...                    |
+  | ID[C]              | OFFSET[C]              |
+  | 0x0000             | OFFSET[C+1]            |
+
+Each row consists of a 4-byte chunk identifier (ID) and an 8-byte offset.
+Each integer is stored in network-byte order.
+
+The chunk identifier `ID[i]` is a label for the data stored within this
+fill from `OFFSET[i]` (inclusive) to `OFFSET[i+1]` (exclusive). Thus, the
+size of the `i`th chunk is equal to the difference between `OFFSET[i+1]`
+and `OFFSET[i]`. This requires that the chunk data appears contiguously
+in the same order as the table of contents.
+
+The final entry in the table of contents must be four zero bytes. This
+confirms that the table of contents is ending and provides the offset for
+the end of the chunk-based data.
+
+Note: The chunk-based format expects that the file contains _at least_ a
+trailing hash after `OFFSET[C+1]`.
+
+Functions for working with chunk-based file formats are declared in
+`chunk-format.h`. Using these methods provide extra checks that assist
+developers when creating new file formats.
+
+Writing chunk-based file formats
+--------------------------------
+
+To write a chunk-based file format, create a `struct chunkfile` by
+calling `init_chunkfile()` and pass a `struct hashfile` pointer. The
+caller is responsible for opening the `hashfile` and writing header
+information so the file format is identifiable before the chunk-based
+format begins.
+
+Then, call `add_chunk()` for each chunk that is intended for write. This
+populates the `chunkfile` with information about the order and size of
+each chunk to write. Provide a `chunk_write_fn` function pointer to
+perform the write of the chunk data upon request.
+
+Call `write_chunkfile()` to write the table of contents to the `hashfile`
+followed by each of the chunks. This will verify that each chunk wrote
+the expected amount of data so the table of contents is correct.
+
+Finally, call `free_chunkfile()` to clear the `struct chunkfile` data. The
+caller is responsible for finalizing the `hashfile` by writing the trailing
+hash and closing the file.
+
+Reading chunk-based file formats
+--------------------------------
+
+To read a chunk-based file format, the file must be opened as a
+memory-mapped region. The chunk-format API expects that the entire file
+is mapped as a contiguous memory region.
+
+Initialize a `struct chunkfile` pointer with `init_chunkfile(NULL)`.
+
+After reading the header information from the beginning of the file,
+including the chunk count, call `read_table_of_contents()` to populate
+the `struct chunkfile` with the list of chunks, their offsets, and their
+sizes.
+
+Extract the data information for each chunk using `pair_chunk()` or
+`read_chunk()`:
+
+* `pair_chunk()` assigns a given pointer with the location inside the
+ memory-mapped file corresponding to that chunk's offset. If the chunk
+ does not exist, then the pointer is not modified.
+
+* `read_chunk()` takes a `chunk_read_fn` function pointer and calls it
+ with the appropriate initial pointer and size information. The function
+ is not called if the chunk does not exist. Use this method to read chunks
+ if you need to perform immediate parsing or if you need to execute logic
+ based on the size of the chunk.
+
+After calling these methods, call `free_chunkfile()` to clear the
+`struct chunkfile` data. This will not close the memory-mapped region.
+Callers are expected to own that data for the timeframe the pointers into
+the region are needed.
+
+Examples
+--------
+
+These file formats use the chunk-format API, and can be used as examples
+for future formats:
+
+* *commit-graph:* see `write_commit_graph_file()` and `parse_commit_graph()`
+ in `commit-graph.c` for how the chunk-format API is used to write and
+ parse the commit-graph file format documented in
+ link:technical/commit-graph-format.html[the commit-graph file format].
+
+* *multi-pack-index:* see `write_midx_internal()` and `load_multi_pack_index()`
+ in `midx.c` for how the chunk-format API is used to write and
+ parse the multi-pack-index file format documented in
+ link:technical/pack-format.html[the multi-pack-index file format].
diff --git a/technical/commit-graph-format.txt b/technical/commit-graph-format.txt
index b6658eff1..87971c27d 100644
--- a/technical/commit-graph-format.txt
+++ b/technical/commit-graph-format.txt
@@ -61,6 +61,9 @@ CHUNK LOOKUP:
  the length using the next chunk position if necessary.) Each
  chunk ID appears at most once.

+ The CHUNK LOOKUP matches the table of contents from
+ link:technical/chunk-format.html[the chunk-based file format].
+
  The remaining data in the body is described one chunk at a time, and
  these chunks may be given in any order. Chunks are required unless
  otherwise specified.
diff --git a/technical/pack-format.html b/technical/pack-format.html
index c8c0e3768..c17d8c21d 100644
--- a/technical/pack-format.html
+++ b/technical/pack-format.html
@@ -1222,6 +1222,11 @@ the number of base MIDX files, hash lengths and types.</p></div>
 </div></div>
<div class="literalblock">
<div class="content">
+<pre><code>The CHUNK LOOKUP matches the table of contents from
+link:technical/chunk-format.html[the chunk-based file format].</code></pre>
+</div></div>
+<div class="literalblock">
+<div class="content">
<pre><code>The remaining data in the body is described one chunk at a time, and
these chunks may be given in any order. Chunks are required unless
otherwise specified.</code></pre>
@@ -1280,7 +1285,7 @@ otherwise specified.</code></pre>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-02-12 14:43:56 PST
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
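The CHUNK LOOKUP hunks in this commit point at the table of contents described in technical/chunk-format.txt: rows of a 4-byte chunk ID plus an 8-byte offset in network byte order, closed by a row whose ID is four zero bytes. The Python sketch below lays out and parses such a table to show how chunk sizes fall out of adjacent offsets. It models only the on-disk layout, not Git's C API; the 8-byte header, the chunk IDs, and the function names are hypothetical.

```python
import struct

ROW = struct.Struct(">4sQ")  # 4-byte ID + 8-byte offset, big-endian: 12 bytes
TERMINATOR = b"\0\0\0\0"

def build_chunk_region(chunks, start):
    """chunks: list of (id, data); start: file offset where the table begins."""
    offset = start + ROW.size * (len(chunks) + 1)  # first chunk follows the table
    rows = []
    for chunk_id, data in chunks:
        rows.append(ROW.pack(chunk_id, offset))
        offset += len(data)
    rows.append(ROW.pack(TERMINATOR, offset))  # end of the chunk-based data
    return b"".join(rows) + b"".join(data for _, data in chunks)

def read_table_of_contents(buf, start, num_chunks):
    """Return {chunk_id: (offset, size)} for a fully loaded file."""
    rows = [ROW.unpack_from(buf, start + i * ROW.size)
            for i in range(num_chunks + 1)]
    if rows[-1][0] != TERMINATOR:
        raise ValueError("table of contents is not terminated by a zero ID")
    toc = {}
    for i, (chunk_id, offset) in enumerate(rows[:-1]):
        toc[chunk_id] = (offset, rows[i + 1][1] - offset)  # size from next row
    return toc

header = b"DEMO\x01\x02\x00\x00"  # hypothetical 8-byte format header
chunks = [(b"AAAA", b"hello"), (b"BBBB", b"world!!")]
data = header + build_chunk_region(chunks, start=len(header))
data += b"\0" * 20  # stand-in for the trailing hash
toc = read_table_of_contents(data, start=len(header), num_chunks=len(chunks))
offset, size = toc[b"AAAA"]
print(toc, data[offset:offset + size])
```

Because each size is derived from the next row's offset, the terminating row is needed even for the last chunk, which is why the table has one more row than there are chunks.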
diff --git a/technical/pack-format.txt b/technical/pack-format.txt
index 8833b71c8..1faa949bf 100644
--- a/technical/pack-format.txt
+++ b/technical/pack-format.txt
@@ -336,6 +336,9 @@ CHUNK LOOKUP:
  (Chunks are provided in file-order, so you can infer
  the length using the next chunk position if necessary.)

+ The CHUNK LOOKUP matches the table of contents from
+ link:technical/chunk-format.html[the chunk-based file format].
+
  The remaining data in the body is described one chunk at a time, and
  these chunks may be given in any order. Chunks are required unless
  otherwise specified.
diff --git a/technical/reftable.html b/technical/reftable.html
index 6f68bf20a..544baf859 100644
--- a/technical/reftable.html
+++ b/technical/reftable.html
@@ -1704,16 +1704,11 @@ sorted descending by update index.</p></div>
 </div>
<div class="sect3">
<h4 id="_layout">Layout</h4>
-<div class="paragraph"><p>A collection of reftable files are stored in the <code>$GIT_DIR/reftable/</code>
-directory:</p></div>
-<div class="literalblock">
-<div class="content">
-<pre><code>00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref</code></pre>
-</div></div>
-<div class="paragraph"><p>where reftable files are named by a unique name such as produced by the
-function <code>${min_update_index}-${max_update_index}.ref</code>.</p></div>
+<div class="paragraph"><p>A collection of reftable files are stored in the <code>$GIT_DIR/reftable/</code> directory.
+Their names should have a random element, such that each filename is globally
+unique; this helps avoid spurious failures on Windows, where open files cannot
+be removed or overwritten. It suggested to use
+<code>${min_update_index}-${max_update_index}-${random}.ref</code> as a naming convention.</p></div>
<div class="paragraph"><p>Log-only files use the <code>.log</code> extension, while ref-only and mixed ref
and log files use <code>.ref</code>. extension.</p></div>
<div class="paragraph"><p>The stack ordering file is <code>$GIT_DIR/reftable/tables.list</code> and lists the
@@ -1722,9 +1717,9 @@ current files, one per line, in order, from oldest (base) to newest
 <div class="literalblock">
<div class="content">
<pre><code>$ cat .git/reftable/tables.list
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref</code></pre>
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref</code></pre>
</div></div>
<div class="paragraph"><p>Readers must read <code>$GIT_DIR/reftable/tables.list</code> to determine which
files are relevant right now, and search through the stack in reverse
@@ -1812,7 +1807,7 @@ Prepare temp reftable <code>tmp_XXXXXX</code>, including log entries.
 </li>
<li>
<p>
-Rename <code>tmp_XXXXXX</code> to <code>${update_index}-${update_index}.ref</code>.
+Rename <code>tmp_XXXXXX</code> to <code>${update_index}-${update_index}-${random}.ref</code>.
</p>
</li>
<li>
@@ -1896,7 +1891,7 @@ the locking protocol.
 <li>
<p>
Rename <code>${min_update_index}-${max_update_index}_XXXXXX</code> to
-<code>${min_update_index}-${max_update_index}.ref</code>.
+<code>${min_update_index}-${max_update_index}-${random}.ref</code>.
</p>
</li>
<li>
@@ -1921,6 +1916,18 @@ readers to backtrack.
 <div class="paragraph"><p>Each reftable (compacted or not) is uniquely identified by its name, so
open reftables can be cached by their name.</p></div>
</div>
+<div class="sect3">
+<h4 id="_windows">Windows</h4>
+<div class="paragraph"><p>On windows, and other systems that do not allow deleting or renaming to open
+files, compaction may succeed, but other readers may prevent obsolete tables
+from being deleted.</p></div>
+<div class="paragraph"><p>On these platforms, the following strategy can be followed: on closing a
+reftable stack, reload <code>tables.list</code>, and delete any tables no longer mentioned
+in <code>tables.list</code>.</p></div>
+<div class="paragraph"><p>Irregular program exit may still leave about unused files. In this case, a
+cleanup operation can read <code>tables.list</code>, note its modification timestamp, and
+delete any unreferenced <code>*.ref</code> files that are older.</p></div>
+</div>
</div>
<div class="sect2">
<h3 id="_alternatives_considered">Alternatives considered</h3>
@@ -2021,7 +2028,7 @@ impossible.</p></div>
 <div id="footer">
<div id="footer-text">
Last updated
- 2021-01-15 16:12:09 PST
+ 2021-03-02 23:05:01 PST
</div>
</div>
</body>
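The reftable changes in this commit add a random element to table names, following the suggested pattern `${min_update_index}-${max_update_index}-${random}.ref`. A minimal Python sketch of generating such a name and choosing the next update index from `tables.list` is shown here. The 8-digit zero padding mirrors the sample listing in the diff; the 8-hex-digit random suffix, the behavior for an empty stack, and the function names are assumptions for illustration only.

```python
import secrets

def reftable_name(min_update_index, max_update_index, ext="ref"):
    """Build a table name with a random element so it is globally unique."""
    rand = secrets.token_hex(4)  # hypothetical choice: 8 random hex digits
    return f"{min_update_index:08d}-{max_update_index:08d}-{rand}.{ext}"

def next_update_index(tables):
    """tables: names from tables.list, ordered oldest (base) to newest."""
    if not tables:
        return 1  # assumption: an empty stack starts at update index 1
    newest = tables[-1]
    max_update_index = int(newest.split("-")[1])
    return max_update_index + 1

tables = [
    "00000001-00000001-RANDOM1.log",
    "00000002-00000002-RANDOM2.ref",
    "00000003-00000003-RANDOM3.ref",
]
update_index = next_update_index(tables)
name = reftable_name(update_index, update_index)
print(update_index, name)
```

Two writers that pick the same update index still end up with different file names, which is the property the diff relies on to avoid overwriting a table that another process has open.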
diff --git a/technical/reftable.txt b/technical/reftable.txt
index 8095ab259..3ef169af2 100644
--- a/technical/reftable.txt
+++ b/technical/reftable.txt
@@ -872,17 +872,11 @@ A repository must set its `$GIT_DIR/config` to configure reftable:
 Layout
 ^^^^^^

-A collection of reftable files are stored in the `$GIT_DIR/reftable/`
-directory:
-
-....
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
-....
-
-where reftable files are named by a unique name such as produced by the
-function `${min_update_index}-${max_update_index}.ref`.
+A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory.
+Their names should have a random element, such that each filename is globally
+unique; this helps avoid spurious failures on Windows, where open files cannot
+be removed or overwritten. It suggested to use
+`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention.

 Log-only files use the `.log` extension, while ref-only and mixed ref
 and log files use `.ref`. extension.
@@ -893,9 +887,9 @@ current files, one per line, in order, from oldest (base) to newest

 ....
 $ cat .git/reftable/tables.list
-00000001-00000001.log
-00000002-00000002.ref
-00000003-00000003.ref
+00000001-00000001-RANDOM1.log
+00000002-00000002-RANDOM2.ref
+00000003-00000003-RANDOM3.ref
 ....

 Readers must read `$GIT_DIR/reftable/tables.list` to determine which
@@ -940,7 +934,7 @@ new reftable and atomically appending it to the stack:
 3. Select `update_index` to be most recent file's
 `max_update_index + 1`.
 4. Prepare temp reftable `tmp_XXXXXX`, including log entries.
-5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`.
+5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`.
 6. Copy `tables.list` to `tables.list.lock`, appending file from (5).
 7. Rename `tables.list.lock` to `tables.list`.

@@ -993,7 +987,7 @@ prevents other processes from trying to compact these files.
 should always be the case, assuming that other processes are adhering to
 the locking protocol.
 7. Rename `${min_update_index}-${max_update_index}_XXXXXX` to
-`${min_update_index}-${max_update_index}.ref`.
+`${min_update_index}-${max_update_index}-${random}.ref`.
 8. Write the new stack to `tables.list.lock`, replacing `B` and `C`
 with the file from (4).
 9. Rename `tables.list.lock` to `tables.list`.
@@ -1005,6 +999,22 @@ This strategy permits compactions to proceed independently of updates.
 Each reftable (compacted or not) is uniquely identified by its name, so
 open reftables can be cached by their name.

+Windows
+^^^^^^^
+
+On windows, and other systems that do not allow deleting or renaming to open
+files, compaction may succeed, but other readers may prevent obsolete tables
+from being deleted.
+
+On these platforms, the following strategy can be followed: on closing a
+reftable stack, reload `tables.list`, and delete any tables no longer mentioned
+in `tables.list`.
+
+Irregular program exit may still leave about unused files. In this case, a
+cleanup operation can read `tables.list`, note its modification timestamp, and
+delete any unreferenced `*.ref` files that are older.
+
+
 Alternatives considered
 ~~~~~~~~~~~~~~~~~~~~~~~
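The new "Windows" section describes a cleanup pass: read `tables.list`, note its modification timestamp, and delete unreferenced `*.ref` files that are older. The Python sketch below follows that description in a temporary directory. The function name, the decision to ignore files that cannot be removed, and the demo file names and timestamps are assumptions made for this example rather than anything the reftable code is known to do.

```python
import os
import tempfile

def cleanup_unreferenced(reftable_dir):
    """Delete *.ref files not in tables.list that are older than the list."""
    list_path = os.path.join(reftable_dir, "tables.list")
    list_mtime = os.stat(list_path).st_mtime
    with open(list_path) as f:
        referenced = {line.strip() for line in f if line.strip()}

    removed = []
    for name in sorted(os.listdir(reftable_dir)):
        if not name.endswith(".ref") or name in referenced:
            continue
        path = os.path.join(reftable_dir, name)
        if os.stat(path).st_mtime < list_mtime:
            try:
                os.remove(path)
                removed.append(name)
            except OSError:
                pass  # assumption: skip files another process still has open
    return removed

with tempfile.TemporaryDirectory() as d:
    def touch(name, mtime, text=""):
        path = os.path.join(d, name)
        with open(path, "w") as f:
            f.write(text)
        os.utime(path, (mtime, mtime))  # set a fixed modification time

    touch("00000001-00000001-aaaa.ref", 1000)  # unreferenced and older
    touch("00000002-00000002-bbbb.ref", 1000)  # still referenced
    touch("00000003-00000003-cccc.ref", 3000)  # unreferenced but newer
    touch("tables.list", 2000, "00000002-00000002-bbbb.ref\n")
    removed = cleanup_unreferenced(d)
    remaining = sorted(os.listdir(d))

print(removed, remaining)
```

Comparing against the timestamp of `tables.list` keeps the pass from deleting a table that a concurrent writer has already renamed into place but not yet added to the list.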