summaryrefslogtreecommitdiffstats
path: root/man1/git-fast-export.1
blob: 2968104c6187cf2b25591db489c597b47b2ddfec (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
'\" t
.\"     Title: git-fast-export
.\"    Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
.\" Generator: DocBook XSL Stylesheets vsnapshot <http://docbook.sf.net/>
.\"      Date: 2024-04-22
.\"    Manual: Git Manual
.\"    Source: Git 2.45.0.rc0.3.g00e10ef10e
.\"  Language: English
.\"
.TH "GIT\-FAST\-EXPORT" "1" "2024\-04\-22" "Git 2\&.45\&.0\&.rc0\&.3\&.g00" "Git Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
git-fast-export \- Git data exporter
.SH "SYNOPSIS"
.sp
.nf
\fIgit fast\-export\fR [<options>] | \fIgit fast\-import\fR
.fi
.sp
.SH "DESCRIPTION"
.sp
This program dumps the given revisions in a form suitable to be piped into \fIgit fast\-import\fR\&.
.sp
You can use it as a human\-readable bundle replacement (see \fBgit-bundle\fR(1)), or as a format that can be edited before being fed to \fIgit fast\-import\fR in order to do history rewrites (an ability relied on by tools like \fIgit filter\-repo\fR)\&.
.SH "OPTIONS"
.PP
\-\-progress=<n>
.RS 4
Insert
\fIprogress\fR
statements every <n> objects, to be shown by
\fIgit fast\-import\fR
during import\&.
.RE
.PP
\-\-signed\-tags=(verbatim|warn|warn\-strip|strip|abort)
.RS 4
Specify how to handle signed tags\&. Since any transformation after the export can change the tag names (which can also happen when excluding revisions) the signatures will not match\&.
.sp
When asking to
\fIabort\fR
(which is the default), this program will die when encountering a signed tag\&. With
\fIstrip\fR, the tags will silently be made unsigned, with
\fIwarn\-strip\fR
they will be made unsigned but a warning will be displayed, with
\fIverbatim\fR, they will be silently exported and with
\fIwarn\fR, they will be exported, but you will see a warning\&.
.RE
.PP
\-\-tag\-of\-filtered\-object=(abort|drop|rewrite)
.RS 4
Specify how to handle tags whose tagged object is filtered out\&. Since revisions and files to export can be limited by path, tagged objects may be filtered completely\&.
.sp
When asking to
\fIabort\fR
(which is the default), this program will die when encountering such a tag\&. With
\fIdrop\fR
it will omit such tags from the output\&. With
\fIrewrite\fR, if the tagged object is a commit, it will rewrite the tag to tag an ancestor commit (via parent rewriting; see
\fBgit-rev-list\fR(1))\&.
.RE
.PP
\-M, \-C
.RS 4
Perform move and/or copy detection, as described in the
\fBgit-diff\fR(1)
manual page, and use it to generate rename and copy commands in the output dump\&.
.sp
Note that earlier versions of this command did not complain and produced incorrect results if you gave these options\&.
.RE
.PP
\-\-export\-marks=<file>
.RS 4
Dumps the internal marks table to <file> when complete\&. Marks are written one per line as
\fB:markid SHA\-1\fR\&. Only marks for revisions are dumped; marks for blobs are ignored\&. Backends can use this file to validate imports after they have been completed, or to save the marks table across incremental runs\&. As <file> is only opened and truncated at completion, the same path can also be safely given to \-\-import\-marks\&. The file will not be written if no new object has been marked/exported\&.
.RE
.PP
\-\-import\-marks=<file>
.RS 4
Before processing any input, load the marks specified in <file>\&. The input file must exist, must be readable, and must use the same format as produced by \-\-export\-marks\&.
.RE
.PP
\-\-mark\-tags
.RS 4
In addition to labelling blobs and commits with mark ids, also label tags\&. This is useful in conjunction with
\fB\-\-export\-marks\fR
and
\fB\-\-import\-marks\fR, and is also useful (and necessary) for exporting of nested tags\&. It does not hurt other cases and would be the default, but many fast\-import frontends are not prepared to accept tags with mark identifiers\&.
.sp
Any commits (or tags) that have already been marked will not be exported again\&. If the backend uses a similar \-\-import\-marks file, this allows for incremental bidirectional exporting of the repository by keeping the marks the same across runs\&.
.RE
.PP
\-\-fake\-missing\-tagger
.RS 4
Some old repositories have tags without a tagger\&. The fast\-import protocol was pretty strict about that, and did not allow that\&. So fake a tagger to be able to fast\-import the output\&.
.RE
.PP
\-\-use\-done\-feature
.RS 4
Start the stream with a
\fIfeature done\fR
stanza, and terminate it with a
\fIdone\fR
command\&.
.RE
.PP
\-\-no\-data
.RS 4
Skip output of blob objects and instead refer to blobs via their original SHA\-1 hash\&. This is useful when rewriting the directory structure or history of a repository without touching the contents of individual files\&. Note that the resulting stream can only be used by a repository which already contains the necessary objects\&.
.RE
.PP
\-\-full\-tree
.RS 4
This option will cause fast\-export to issue a "deleteall" directive for each commit followed by a full list of all files in the commit (as opposed to just listing the files which are different from the commit\(cqs first parent)\&.
.RE
.PP
\-\-anonymize
.RS 4
Anonymize the contents of the repository while still retaining the shape of the history and stored tree\&. See the section on
\fBANONYMIZING\fR
below\&.
.RE
.PP
\-\-anonymize\-map=<from>[:<to>]
.RS 4
Convert token
\fB<from>\fR
to
\fB<to>\fR
in the anonymized output\&. If
\fB<to>\fR
is omitted, map
\fB<from>\fR
to itself (i\&.e\&., do not anonymize it)\&. See the section on
\fBANONYMIZING\fR
below\&.
.RE
.PP
\-\-reference\-excluded\-parents
.RS 4
By default, running a command such as
\fBgit fast\-export master~5\&.\&.master\fR
will not include the commit master~5 and will make master~4 no longer have master~5 as a parent (though both the old master~4 and new master~4 will have all the same files)\&. Use \-\-reference\-excluded\-parents to instead have the stream refer to commits in the excluded range of history by their sha1sum\&. Note that the resulting stream can only be used by a repository which already contains the necessary parent commits\&.
.RE
.PP
\-\-show\-original\-ids
.RS 4
Add an extra directive to the output for commits and blobs,
\fBoriginal\-oid <SHA1SUM>\fR\&. While such directives will likely be ignored by importers such as git\-fast\-import, it may be useful for intermediary filters (e\&.g\&. for rewriting commit messages which refer to older commits, or for stripping blobs by id)\&.
.RE
.PP
\-\-reencode=(yes|no|abort)
.RS 4
Specify how to handle
\fBencoding\fR
header in commit objects\&. When asking to
\fIabort\fR
(which is the default), this program will die when encountering such a commit object\&. With
\fIyes\fR, the commit message will be re\-encoded into UTF\-8\&. With
\fIno\fR, the original encoding will be preserved\&.
.RE
.PP
\-\-refspec
.RS 4
Apply the specified refspec to each ref exported\&. Multiple of them can be specified\&.
.RE
.PP
[<git\-rev\-list\-args>\&...]
.RS 4
A list of arguments, acceptable to
\fIgit rev\-parse\fR
and
\fIgit rev\-list\fR, that specifies the specific objects and references to export\&. For example,
\fBmaster~10\&.\&.master\fR
causes the current master reference to be exported along with all objects added since its 10th ancestor commit and (unless the \-\-reference\-excluded\-parents option is specified) all files common to master~9 and master~10\&.
.RE
.SH "EXAMPLES"
.sp
.if n \{\
.RS 4
.\}
.nf
$ git fast\-export \-\-all | (cd /empty/repository && git fast\-import)
.fi
.if n \{\
.RE
.\}
.sp
.sp
This will export the whole repository and import it into the existing empty repository\&. Except for reencoding commits that are not in UTF\-8, it would be a one\-to\-one mirror\&.
.sp
.if n \{\
.RS 4
.\}
.nf
$ git fast\-export master~5\&.\&.master |
        sed "s|refs/heads/master|refs/heads/other|" |
        git fast\-import
.fi
.if n \{\
.RE
.\}
.sp
.sp
This makes a new branch called \fIother\fR from \fImaster~5\&.\&.master\fR (i\&.e\&. if \fImaster\fR has linear history, it will take the last 5 commits)\&.
.sp
Note that this assumes that none of the blobs and commit messages referenced by that revision range contains the string \fIrefs/heads/master\fR\&.
.SH "ANONYMIZING"
.sp
If the \fB\-\-anonymize\fR option is given, git will attempt to remove all identifying information from the repository while still retaining enough of the original tree and history patterns to reproduce some bugs\&. The goal is that a git bug which is found on a private repository will persist in the anonymized repository, and the latter can be shared with git developers to help solve the bug\&.
.sp
With this option, git will replace all refnames, paths, blob contents, commit and tag messages, names, and email addresses in the output with anonymized data\&. Two instances of the same string will be replaced equivalently (e\&.g\&., two commits with the same author will have the same anonymized author in the output, but bear no resemblance to the original author string)\&. The relationship between commits, branches, and tags is retained, as well as the commit timestamps (but the commit messages and refnames bear no resemblance to the originals)\&. The relative makeup of the tree is retained (e\&.g\&., if you have a root tree with 10 files and 3 trees, so will the output), but their names and the contents of the files will be replaced\&.
.sp
If you think you have found a git bug, you can start by exporting an anonymized stream of the whole repository:
.sp
.if n \{\
.RS 4
.\}
.nf
$ git fast\-export \-\-anonymize \-\-all >anon\-stream
.fi
.if n \{\
.RE
.\}
.sp
.sp
Then confirm that the bug persists in a repository created from that stream (many bugs will not, as they really do depend on the exact repository contents):
.sp
.if n \{\
.RS 4
.\}
.nf
$ git init anon\-repo
$ cd anon\-repo
$ git fast\-import <\&.\&./anon\-stream
$ \&.\&.\&. test your bug \&.\&.\&.
.fi
.if n \{\
.RE
.\}
.sp
.sp
If the anonymized repository shows the bug, it may be worth sharing \fBanon\-stream\fR along with a regular bug report\&. Note that the anonymized stream compresses very well, so gzipping it is encouraged\&. If you want to examine the stream to see that it does not contain any private data, you can peruse it directly before sending\&. You may also want to try:
.sp
.if n \{\
.RS 4
.\}
.nf
$ perl \-pe \*(Aqs/\ed+/X/g\*(Aq <anon\-stream | sort \-u | less
.fi
.if n \{\
.RE
.\}
.sp
.sp
which shows all of the unique lines (with numbers converted to "X", to collapse "User 0", "User 1", etc into "User X")\&. This produces a much smaller output, and it is usually easy to quickly confirm that there is no private data in the stream\&.
.sp
Reproducing some bugs may require referencing particular commits or paths, which becomes challenging after refnames and paths have been anonymized\&. You can ask for a particular token to be left as\-is or mapped to a new value\&. For example, if you have a bug which reproduces with \fBgit rev\-list sensitive \-\- secret\&.c\fR, you can run:
.sp
.if n \{\
.RS 4
.\}
.nf
$ git fast\-export \-\-anonymize \-\-all \e
      \-\-anonymize\-map=sensitive:foo \e
      \-\-anonymize\-map=secret\&.c:bar\&.c \e
      >stream
.fi
.if n \{\
.RE
.\}
.sp
.sp
After importing the stream, you can then run \fBgit rev\-list foo \-\- bar\&.c\fR in the anonymized repository\&.
.sp
Note that paths and refnames are split into tokens at slash boundaries\&. The command above would anonymize \fBsubdir/secret\&.c\fR as something like \fBpath123/bar\&.c\fR; you could then search for \fBbar\&.c\fR in the anonymized repository to determine the final pathname\&.
.sp
To make referencing the final pathname simpler, you can map each path component; so if you also anonymize \fBsubdir\fR to \fBpublicdir\fR, then the final pathname would be \fBpublicdir/bar\&.c\fR\&.
.SH "LIMITATIONS"
.sp
Since \fIgit fast\-import\fR cannot tag trees, you will not be able to export the linux\&.git repository completely, as it contains a tag referencing a tree instead of a commit\&.
.SH "SEE ALSO"
.sp
\fBgit-fast-import\fR(1)
.SH "GIT"
.sp
Part of the \fBgit\fR(1) suite