aboutsummaryrefslogtreecommitdiffstats
path: root/archive-tar.c
AgeCommit message (Collapse)AuthorFilesLines
2024-02-12use xstrncmpz()René Scharfe1-1/+1
Add and apply a semantic patch for calling xstrncmpz() to compare a NUL-terminated string with a buffer of a known length instead of using strncmp() and checking the terminating NUL explicitly. This simplifies callers by reducing code duplication. I had to adjust remote.c manually because Coccinelle inexplicably changed the indent of the else branches. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26archive.h: remove unnecessary includeElijah Newren1-0/+1
The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano1-1/+0
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-06Merge branch 'gc/config-context'Junio C Hamano1-2/+3
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: pass kvi to die_bad_number()Glen Choo1-2/+2
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-1/+2
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren1-1/+1
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11git-zlib: move declarations for git-zlib functions from cache.hElijah Newren1-0/+1
Move functions from cache.h for zlib.c into a new header file. Since adding a "zlib.h" would cause issues with the real zlib, rename zlib.c to git-zlib.c while we are at it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21write-or-die.h: move declarations for write-or-die.c functions from cache.hElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23cache.h: remove dependence on hex.h; make other files include it explicitlyElijah Newren1-0/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren1-1/+2
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-30archive-tar: report filter start error only onceRené Scharfe1-0/+1
A missing tar filter is reported by start_command() using error(), but also by its caller, write_tar_filter_archive(), using die(): $ git -c tar.invalid.command=foo archive --format=invalid HEAD error: cannot run foo: No such file or directory fatal: unable to start 'foo' filter: No such file or directory The second message contains all relevant information and even says that the failed command was intended to be used as a filter. Silence the first one because it's redundant. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-2/+2
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19mark unused read_tree_recursive() callback parametersJeff King1-1/+1
We pass a callback to read_tree_recursive(), but not every callback needs every parameter. Let's mark the unused ones to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19config: mark unused callback parametersJeff King1-1/+2
The callback passed to git_config() must conform to a particular interface. But most callbacks don't actually look at the extra "void *data" parameter. Let's mark the unused parameters to make -Wunused-parameter happy. Note there's one unusual case here in get_remote_default() where we actually ignore the "value" parameter. That's because it's only checking whether the option is found at all, and not parsing its value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15archive-tar: use internal gzip by defaultRené Scharfe1-2/+2
Drop the dependency on gzip(1) and use our internal implementation to create tar.gz and tgz files. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15archive-tar: use OS_CODE 3 (Unix) for internal gzipRené Scharfe1-0/+7
gzip(1) encodes the OS it runs on in the 10th byte of its output. It uses the following OS_CODE values according to its tailor.h [1]: 0 - MS-DOS 3 - UNIX 5 - Atari ST 6 - OS/2 10 - TOPS-20 11 - Windows NT The gzip.exe that comes with Git for Windows uses OS_CODE 3 for some reason, so this value is used on practically all supported platforms when generating tgz archives using gzip(1). Zlib uses a bigger set of values according to its zutil.h [2], aligned with section 4.4.2 of the ZIP specification, APPNOTE.txt [3]: 0 - MS-DOS 1 - Amiga 3 - UNIX 4 - VM/CMS 5 - Atari ST 6 - OS/2 7 - Macintosh 8 - Z-System 10 - Windows NT 11 - MVS (OS/390 - Z/OS) 13 - Acorn Risc 16 - BeOS 18 - OS/400 19 - OS X (Darwin) Thus the internal gzip implementation in archive-tar.c sets different OS_CODE header values on major platforms Windows and macOS. Git for Windows uses its own zlib-based variant since v2.20.1 by default and thus embeds OS_CODE 10 in tgz archives. The tar archive for a commit is generated consistently on all systems (by the same Git version). The OS_CODE in the gzip header does not influence extraction. Avoid leaking OS information and make tgz archives constistent and reproducable (with the same Git and libz versions) by using OS_CODE 3 everywhere. At least on macOS 12.4 this produces the same output as gzip(1) for the examples I tried: # before $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum 3abbffb40b7c63cf9b7d91afc682f11682f80759 - # with this patch $ git -c tar.tgz.command='git archive gzip' archive --format=tgz v2.36.0 | shasum dc6dc6ba9636d522799085d0d77ab6a110bcc141 - $ git archive --format=tar v2.36.0 | gzip -cn | shasum dc6dc6ba9636d522799085d0d77ab6a110bcc141 - [1] https://git.savannah.gnu.org/cgit/gzip.git/tree/tailor.h [2] https://github.com/madler/zlib/blob/master/zutil.h [3] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15archive-tar: add internal gzip implementationRené Scharfe1-1/+44
Git uses zlib for its own object store, but calls gzip when creating tgz archives. Add an option to perform the gzip compression for the latter using zlib, without depending on the external gzip binary. Plug it in by making write_block a function pointer and switching to a compressing variant if the filter command has the magic value "git archive gzip". Does that indirection slow down tar creation? Not really, at least not in this test: $ hyperfine -w3 -L rev HEAD,origin/main -p 'git checkout {rev} && make' \ './git -C ../linux archive --format=tar HEAD # {rev}' Benchmark #1: ./git -C ../linux archive --format=tar HEAD # HEAD Time (mean ± σ): 4.044 s ± 0.007 s [User: 3.901 s, System: 0.137 s] Range (min … max): 4.038 s … 4.059 s 10 runs Benchmark #2: ./git -C ../linux archive --format=tar HEAD # origin/main Time (mean ± σ): 4.047 s ± 0.009 s [User: 3.903 s, System: 0.138 s] Range (min … max): 4.038 s … 4.066 s 10 runs How does tgz creation perform? $ hyperfine -w3 -L command 'gzip -cn','git archive gzip' \ './git -c tar.tgz.command="{command}" -C ../linux archive --format=tgz HEAD' Benchmark #1: ./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD Time (mean ± σ): 20.404 s ± 0.006 s [User: 23.943 s, System: 0.401 s] Range (min … max): 20.395 s … 20.414 s 10 runs Benchmark #2: ./git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD Time (mean ± σ): 23.807 s ± 0.023 s [User: 23.655 s, System: 0.145 s] Range (min … max): 23.782 s … 23.857 s 10 runs Summary './git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD' ran 1.17 ± 0.00 times faster than './git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD' So the internal implementation takes 17% longer on the Linux repo, but uses 2% less CPU time. That's because the external gzip can run in parallel on its own processor, while the internal one works sequentially and avoids the inter-process communication overhead. What are the benefits? Only an internal sequential implementation can offer this eco mode, and it allows avoiding the gzip(1) requirement. This implementation uses the helper functions from our zlib.c instead of the convenient gz* functions from zlib, because the latter doesn't give the control over the generated gzip header that the next patch requires. Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15archive-tar: factor out write_block()René Scharfe1-4/+9
All tar archive writes have the same size and are done to the same file descriptor. Move them to a common function, write_block(), to reduce code duplication and make it easy to change the destination. Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-15archive: rename archiver data field to filter_commandRené Scharfe1-5/+5
The void pointer "data" in struct archiver is only used to store filter commands to pass tar archives to, like gzip. Rename it accordingly and also turn it into a char pointer to document the fact that it's a string reference. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-24archive-*.c: use designated initializers for "struct archiver"Ævar Arnfjörð Bjarmason1-3/+3
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25run-command API users: use strvec_push(), not argv constructionÆvar Arnfjörð Bjarmason1-6/+3
Change a pattern of hardcoding an "argv" array size, populating it and assigning to the "argv" member of "struct child_process" to instead use "strvec_push()" to add data to the "args" member. As noted in the preceding commit this moves us further towards being able to remove the "argv" member in a subsequent commit These callers could have used strvec_pushl(), but moving to strvec_push() makes the diff easier to read, and keeps the arguments aligned as before. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13use CALLOC_ARRAYRené Scharfe1-1/+1
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-09archive: support compression levels beyond 9René Scharfe1-1/+2
Compression programs like zip, gzip, bzip2 and xz allow to adjust the trade-off between CPU cost and size gain with numerical options from -1 for fast compression and -9 for high compression ratio. zip also accepts -0 for storing files verbatim. git archive directly support these single-digit compression levels for ZIP output and passes them to filters like gzip. Zstandard additionally supports compression level options -10 to -19, or up to -22 with --ultra. This *seems* to work with git archive in most cases, e.g. it will produce an archive with -19 without complaining, but since it only supports single-digit compression level options this is the same as -1 -9 and thus -9. Allow git archive to accept multi-digit compression levels to support the full range supported by zstd. Explicitly reject them for the ZIP format, as otherwise deflateInit2() would just fail with a somewhat cryptic "stream consistency error". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-19archive: read short blobs in archive.c::write_archive_entry()René Scharfe1-19/+3
Centralize reading of symlink destinations and the contents of regular files that are too small to be streamed. This reduces code duplication and allows future patches to add support for adding non-tracked files to archives. The backends are expected to stream blobs if buffer is NULL. object_file_to_archive() is only called from archive.c and thus no longer exported. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-10parse_config_key(): return subsection len as size_tJeff King1-2/+2
We return the length to a subset of a string using an "int *" out-parameter. This is fine most of the time, as we'd expect config keys to be relatively short, but it could behave oddly if we had a gigantic config key. A more appropriate type is size_t. Let's switch over, which lets our callers use size_t as appropriate (they are bound by our type because they must pass the out-parameter as a pointer). This is mostly just a cleanup to make it clear this code handles long strings correctly. In practice, our config parser already chokes on long key names (because of a similar int/size_t mixup!). When doing an int/size_t conversion, we have to be careful that nobody was trying to assign a negative value to the variable. I manually confirmed that for each case here. They tend to just feed the result to xmemdupz() or similar; in a few cases I adjusted the parameter types for helper functions to make sure the size_t is preserved. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31streaming: allow open_istream() to handle any repoMatheus Tavares1-3/+3
Some callers of open_istream() at archive-tar.c and archive-zip.c are capable of working on arbitrary repositories but the repo struct is not passed down to open_istream(), which uses the_repository internally. For now, that's not a problem since the said callers are only being called with the_repository. But to be consistent and avoid future problems, let's allow open_istream() to receive a struct repository and use that instead of the_repository. This parameter addition will also be used in a future patch to make sha1-file.c:check_object_signature() be able to work on arbitrary repos. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-09Merge branch 'rs/pax-extended-header-length-fix'Junio C Hamano1-4/+10
"git archive" recorded incorrect length in extended pax header in some corner cases, which has been corrected. * rs/pax-extended-header-length-fix: archive-tar: turn length miscalculation warning into BUG archive-tar: use size_t in strbuf_append_ext_header() archive-tar: fix pax extended header length calculation archive-tar: report wrong pax extended header length
2019-08-19archive-tar: turn length miscalculation warning into BUGRené Scharfe1-3/+3
Now that we're confident our pax extended header calculation is correct, turn the criticality of the assertion up to the maximum, from warning right up to BUG. Simplify the test, as the stderr comparison step would not be reached in case the BUG message is triggered. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19archive-tar: use size_t in strbuf_append_ext_header()René Scharfe1-5/+5
One of its callers already passes in a size_t value. Use it consistently in this function. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19archive-tar: fix pax extended header length calculationRené Scharfe1-1/+1
A pax extended header record starts with a decimal number. Its value is the length of the whole record, including its own length. The calculation of that number in strbuf_append_ext_header() is off by one in case the length of the rest is close to a higher order of magnitude. This affects paths and link targets a bit shorter than 1000, 10000, 100000 etc. characters -- paths with a length of up to 100 fit into the tar header and don't need a pax extended header. The mistake has been present since the function was added by ae64bbc18c ("tar-tree: Introduce write_entry()", 2006-03-25). Account for digits added to len during the loop and keep incrementing until we have enough space for len and the rest. The crucial change is to check against the current value of len before each iteration, instead of against its value before the loop. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19archive-tar: report wrong pax extended header lengthRené Scharfe1-0/+6
Extended header entries contain a length value that is a bit tricky to calculate because it includes its own length (number of decimal digits) as well. We get it wrong in corner cases. Add a check, report wrong results as a warning and add a test for exercising it. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01archive: convert struct archiver_args to object_idbrian m. carlson1-3/+4
Change the commit_sha1 member to be called "commit_oid" and change it to be a pointer to struct object_id. Additionally, update some uses of GIT_SHA1_HEXSZ and hard-coded values to use the_hash_algo instead. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-09Indent code with TABsNguyễn Thái Ngọc Duy1-1/+1
We indent with TABs and sometimes for fine alignment, TABs followed by spaces, but never all spaces (unless the indentation is less than 8 columns). Indenting with spaces slips through in some places. Fix them. Imported code and compat/ are left alone on purpose. The former should remain as close as upstream as possible. The latter pretty much has separate maintainers, it's up to them to decide. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-12Upcast size_t variables to uintmax_t when printingTorsten Bögershausen1-1/+1
When printing variables which contain a size, today "unsigned long" is used at many places. In order to be able to change the type from "unsigned long" into size_t some day in the future, we need to have a way to print 64 bit variables on a system that has "unsigned long" defined to be 32 bit, like Win64. Upcast all those variables into uintmax_t before they are printed. This is to prepare for a bigger change, when "unsigned long" will be converted into size_t for variables which may be > 4Gib. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'nd/no-the-index'Junio C Hamano1-1/+1
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
2018-08-15Merge branch 'nd/i18n'Junio C Hamano1-6/+6
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-13archive-*.c: use the right repositoryNguyễn Thái Ngọc Duy1-1/+1
With 'struct archive_args' gaining new repository pointer, we don't have to assume the_repository in the archive backends anymore. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23archive-tar.c: mark more strings for translationNguyễn Thái Ngọc Duy1-6/+6
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18Merge branch 'sb/object-store-grafts'Junio C Hamano1-0/+1
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-05-30Merge branch 'js/use-bug-macro'Junio C Hamano1-1/+1
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-16object-store: move object access functions to object-store.hStefan Beller1-0/+1
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin1-1/+1
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-26cache.h: add repository argument to oid_object_infoStefan Beller1-1/+1
Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert sha1_object_info* to object_idbrian m. carlson1-1/+1
Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14streaming: convert open_istream to use struct object_idbrian m. carlson1-1/+1
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14archive: convert sha1_file_to_archive to struct object_idbrian m. carlson1-1/+1
Convert this function to take a pointer to struct object_id and rename it object_file_to_archive. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14archive: convert write_archive_entry_fn_t to object_idbrian m. carlson1-14/+14
Convert the write_archive_entry_fn_t type to use a pointer to struct object_id. Convert various static functions in the tar and zip archivers also. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-09archive-tar: fix a sparse 'constant too large' warningRamsay Jones1-2/+2
Commit dddbad728c ("timestamp_t: a new data type for timestamps", 26-04-2017) introduced a new typedef 'timestamp_t', as a synonym for an unsigned long, which was used at the time to represent timestamps in git. A later commit 28f4aee3fb ("use uintmax_t for timestamps", 26-04-2017) changed the typedef to use an 'uintmax_t' for the timestamp representation type. When building on a 32-bit Linux system, sparse complains that a constant (USTAR_MAX_MTIME) used to detect a 'far-future mtime' timestamp, is too large; 'warning: constant 077777777777UL is so big it is unsigned long long' on lines 335 and 338 of archive-tar.c. Note that both gcc and clang only issue a warning if this constant is used in a context that requires an 'unsigned long' (rather than an uintmax_t). (Since TIME_MAX is no longer equal to 0xFFFFFFFF, even on a 32-bit system, the macro USTAR_MAX_MTIME is set to 077777777777UL, which cannot be represented as an 'unsigned long' constant). In order to suppress the warning, change the definition of the macro constant USTAR_MAX_MTIME to use an 'ULL' type suffix. In a similar vein, on systems which use a 64-bit representation of the 'unsigned long' type, the USTAR_MAX_SIZE constant macro is defined with the value 077777777777ULL. Although this does not cause any warning messages to be issued, it would be more appropriate for this constant to use an 'UL' type suffix rather than 'ULL'. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27timestamp_t: a new data type for timestampsJohannes Schindelin1-2/+5
Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-12Merge branch 'jk/big-and-future-archive-tar'Junio C Hamano1-10/+5
A small code clean-up. * jk/big-and-future-archive-tar: archive-tar: make write_extended_header() void
2016-08-06archive-tar: make write_extended_header() voidRené Scharfe1-10/+5
The function write_extended_header() only ever returns 0. Simplify it and its caller by dropping its return value, like we did with write_global_extended_header() earlier. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-15archive-tar: huge offset and future timestamps would not work on 32-bitJunio C Hamano1-0/+5
As we are not yet moving everything to size_t but still using ulong internally when talking about the size of object, platforms with 32-bit long will not be able to produce tar archive with 4GB+ file, and cannot grok 077777777777UL as a constant. Disable the extended header feature and do not test it on them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01archive-tar: drop return valueJeff King1-7/+4
We never do any error checks, and so never return anything but "0". Let's just drop this to simplify the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01archive-tar: write extended headers for far-future mtimeJeff King1-3/+16
The ustar format represents timestamps as seconds since the epoch, but only has room to store 11 octal digits. To express anything larger, we need to use an extended header. This is exactly the same case we fixed for the size field in the previous commit, and the solution here follows the same pattern. This is even mentioned as an issue in f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), but since it only affected things far in the future, it wasn't deemed worth dealing with. But note that my calculations claiming thousands of years were off there; because our xsnprintf produces a NUL byte, we only have until the year 2242 to fix this. Given that this is just around the corner (geologically speaking, anyway), and because it's easy to fix, let's just make it work. Unlike the previous fix for "size", where we had to write an individual extended header for each file, we can write one global header (since we have only one mtime for the whole archive). There's a slight bit of trickiness there. We may already be writing a global header with a "comment" field for the commit sha1. So we need to write our new field into the same header. To do this, we push the decision of whether to write such a header down into write_global_extended_header(), which will now assemble the header as it sees fit, and will return early if we have nothing to write (in practice, we'll only have a large mtime if it comes from a commit, but this makes it also work if you set your system clock ahead such that time() returns a huge value). Note that we don't (and never did) handle negative timestamps (i.e., before 1970). This would probably not be too hard to support in the same way, but since git does not support negative timestamps at all, I didn't bother here. After writing the extended header, we munge the timestamp in the ustar headers to the maximum-allowable size. This is wrong, but it's the least-wrong thing we can provide to a tar implementation that doesn't understand pax headers (it's also what GNU tar does). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01archive-tar: write extended headers for file sizes >= 8GBJeff King1-2/+29
The ustar format has a fixed-length field for the size of each file entry which is supposed to contain up to 11 bytes of octal-formatted data plus a NUL or space terminator. These means that the largest size we can represent is 077777777777, or 1 byte short of 8GB. The correct solution for a larger file, according to POSIX.1-2001, is to add an extended pax header, similar to how we handle long filenames. This patch does that, and writes zero for the size field in the ustar header (the last bit is not mentioned by POSIX, but it matches how GNU tar behaves with --format=pax). This should be a strict improvement over the current behavior, which is to die in xsnprintf with a "BUG". However, there's some interesting history here. Prior to f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24), we silently overflowed the "size" field. The extra bytes ended up in the "mtime" field of the header, which was then immediately written itself, overwriting our extra bytes. What that means depends on how many bytes we wrote. If the size was 64GB or greater, then we actually overflowed digits into the mtime field, meaning our value was effectively right-shifted by those lost octal digits. And this patch is again a strict improvement over that. But if the size was between 8GB and 64GB, then our 12-byte field held all of the actual digits, and only our NUL terminator overflowed. According to POSIX, there should be a NUL or space at the end of the field. However, GNU tar seems to be lenient here, and will correctly parse a size up 64GB (minus one) from the field. So sizes in this range might have just worked, depending on the implementation reading the tarfile. This patch is mostly still an improvement there, as the 8GB limit is specifically mentioned in POSIX as the correct limit. But it's possible that it could be a regression (versus the pre-f2f0267 state) if all of the following are true: 1. You have a file between 8GB and 64GB. 2. Your tar implementation _doesn't_ know about pax extended headers. 3. Your tar implementation _does_ parse 12-byte sizes from the ustar header without a delimiter. It's probably not worth worrying about such an obscure set of conditions, but I'm documenting it here just in case. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-26archive-tar: convert snprintf to xsnprintfJeff King1-1/+1
Commit f2f0267 (archive-tar: use xsnprintf for trivial formatting, 2015-09-24) converted cases of "sprintf" to "xsnprintf", but accidentally left one as just "snprintf". This meant that we could silently truncate the resulting buffer instead of flagging an error. In practice, this is impossible to achieve, as we are formatting a ustar checksum, which can be at most 7 characters. But the point of xsnprintf is to document and check for "should be impossible" conditions; this site was just accidentally mis-converted during f2f0267. Noticed-by: Paul Green <Paul.Green@stratus.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25archive-tar: use xsnprintf for trivial formattingJeff King1-13/+13
When we generate tar headers, we sprintf() values directly into a struct with the fixed-size header values. For the most part this is fine, as we are formatting small values (e.g., the octal format of "mode & 0x7777" is of fixed length). But it's still a good idea to use xsnprintf here. It communicates to readers what our expectation is, and it provides a run-time check that we are not overflowing the buffers. The one exception here is the mtime, which comes from the epoch time of the commit we are archiving. For sane values, this fits into the 12-byte value allocated in the header. But since git can handle 64-bit times, if I claim to be a visitor from the year 10,000 AD, I can overflow the buffer. This turns out to be harmless, as we simply overflow into the chksum field, which is then overwritten. This case is also best as an xsnprintf. It should never come up, short of extremely malformed dates, and in that case we are probably better off dying than silently truncating the date value (and we cannot expand the size of the buffer, since it is dictated by the ustar format). Our friends in the year 5138 (when we legitimately flip to a 12-digit epoch) can deal with that problem then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25convert trivial sprintf / strcpy calls to xsnprintfJeff King1-1/+1
We sometimes sprintf into fixed-size buffers when we know that the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25archive-tar: fix minor indentation violationJeff King1-1/+1
This looks like a simple omission from 8539070 (archive-tar: unindent write_tar_entry by one level, 2012-05-03). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-20Revert "archive: honor tar.umask even for pax headers"Junio C Hamano1-2/+2
This reverts commit 10f343ea814f5c18a0913997904ee11cd9b7da24, whose output is no longer bit-for-bit equivalent from the older versions of Git, which the infrastructure to (pretend to) upload tarballs kernel.org uses depends on.
2014-09-11Merge branch 'rs/child-process-init'Junio C Hamano1-2/+1
Code clean-up. * rs/child-process-init: run-command: inline prepare_run_command_v_opt() run-command: call run_command_v_opt_cd_env() instead of duplicating it run-command: introduce child_process_init() run-command: introduce CHILD_PROCESS_INIT
2014-08-20run-command: introduce CHILD_PROCESS_INITRené Scharfe1-2/+1
Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-04archive: honor tar.umask even for pax headersbrian m. carlson1-2/+2
git archive's tar format uses extended pax headers to encode metadata into the archive. Most tar implementations correctly treat these as metadata, but some that do not understand the pax format extract these as files instead. Apply the tar.umask setting to these entries to prevent tampering by other users. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-23archive-tar: use parse_config_key when parsing configJeff King1-9/+1
This is fewer lines of code, but more importantly, fixes a bogus pointer offset. We are looking for "tar." in the section, but later assume that the dot we found is at offset 9, not 3. This is a holdover from an earlier iteration of 767cf45 which called the section "tarfilter". As a result, we could erroneously reject some filters with dots in their name, as well as read uninitialized memory. Reported by (and test by) René Scharfe. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-10Merge branch 'rs/leave-base-name-in-name-field-of-tar'Junio C Hamano1-0/+2
Improve compatibility with implementations of "tar" that do not like empty name field in header (with the additional prefix field holding everything). * rs/leave-base-name-in-name-field-of-tar: archive-tar: split long paths more carefully
2013-01-05archive-tar: split long paths more carefullyRené Scharfe1-0/+2
The name field of a tar header has a size of 100 characters. This limit was extended long ago in a backward compatible way by providing the additional prefix field, which can hold 155 additional characters. The actual path is constructed at extraction time by concatenating the prefix field, a slash and the name field. get_path_prefix() is used to determine which slash in the path is used as the cutting point and thus which part of it is placed into the field prefix and which into the field name. It tries to cram as much into the prefix field as possible. (And only if we can't fit a path into the provided 255 characters we use a pax extended header to store it.) If a path is longer than 100 but shorter than 156 characters and ends with a slash (i.e. is for a directory) then get_path_prefix() puts the whole path in the prefix field and leaves the name field empty. GNU tar reconstructs the path without complaint, but the tar included with NetBSD 6 does not: It reports the header to be invalid. For compatibility with this version of tar, make sure to never leave the name field empty. In order to do that, trim the trailing slash from the part considered as possible prefix, if it exists -- that way the last path component (or more, but not less) will end up in the name field. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-13archive: ustar header checksum is computed unsignedJunio C Hamano1-3/+3
POSIX.1 (pax) is pretty clear on this: The chksum field shall be the ISO/IEC 646:1991 standard IRV representation of the octal value of the simple sum of all octets in the header logical record. Each octet in the header shall be treated as an unsigned value. These values shall be added to an unsigned integer, initialized to zero, the precision of which is not less than 17 bits. When calculating the checksum, the chksum field is treated as if it were all <space> characters. so is GNU: http://www.gnu.org/software/tar/manual/html_node/Checksumming.html Found by 7zip folks and reported by Rafał Mużyło. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-23Merge branch 'rs/archive-tree-in-tip-simplify'Junio C Hamano1-2/+2
By René Scharfe * rs/archive-tree-in-tip-simplify: archive-tar: keep const in checksum calculation archive: simplify refname handling
2012-05-18archive-tar: keep const in checksum calculationRené Scharfe1-2/+2
For correctness, don't needlessly drop the const qualifier when casting. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03archive-tar: stream large blobs to tar fileNguyễn Thái Ngọc Duy1-5/+51
t5000 verifies output while t1050 makes sure the command always respects core.bigfilethreshold Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03archive: delegate blob reading to backendNguyễn Thái Ngọc Duy1-4/+21
archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03archive-tar: unindent write_tar_entry by one levelNguyễn Thái Ngọc Duy1-31/+25
It's used to be if (!sha1) { ... } else if (!path) { ... } else { ... } Now that the first two blocks are no-op. We can remove the if/else skeleton and put the else block back by one indent level. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03archive-tar: turn write_tar_entry into blob-writing onlyNguyễn Thái Ngọc Duy1-26/+52
Before this patch write_tar_entry() can: - write global header by write_global_extended_header() calling write_tar_entry with with both sha1 and path == NULL - write extended header for symlinks, by write_tar_entry() calling itself with sha1 != NULL and path == NULL - write a normal blob. In this case both sha1 and path are valid. After this patch, the first two call sites are modified to write the header without calling write_tar_entry(). The function is now for writing blobs only. This simplifies handling when write_tar_entry() learns about large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22upload-archive: allow user to turn off filtersJeff King1-1/+10
Some tar filters may be very expensive to run, so sites do not want to expose them via upload-archive. This patch lets users configure tar.<filter>.remote to turn them off. By default, gzip filters are left on, as they are about as expensive as creating zip archives. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive: provide builtin .tar.gz filterJeff King1-0/+2
This works exactly as if the user had configured it via: [tar "tgz"] command = gzip -cn [tar "tar.gz"] command = gzip -cn but since it is so common, it's convenient to have it builtin without the user needing to do anything. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive: implement configurable tar filtersJeff King1-1/+106
It's common to pipe the tar output produce by "git archive" through gzip or some other compressor. Locally, this can easily be done by using a shell pipe. When requesting a remote archive, though, it cannot be done through the upload-archive interface. This patch allows configurable tar filters, so that one could define a "tar.gz" format that automatically pipes tar output through gzip. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive: pass archiver struct to write_archive callbackJeff King1-1/+2
The current archivers are very static; when you are in the write_tar_archive function, you know you are writing a tar. However, to facilitate runtime-configurable archivers that will share a common write function we need to tell the function which archiver was used. As a convenience, we also provide an opaque data pointer in the archiver struct so that individual archivers can put something useful there when they register themselves. Technically they could just use the "name" field to look in an internal map of names to data, but this is much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive: refactor list of archive formatsJeff King1-3/+13
Most of the tar and zip code was nicely split out into two abstracted files which knew only about their specific formats. The entry point to this code was a single "write archive" function. However, as these basic formats grow more complex (e.g., by handling multiple file extensions and format names), a static list of the entry point functions won't be enough. Instead, let's provide a way for the tar and zip code to tell the main archive code what they support by registering archiver names and functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive-tar: don't reload default config optionsJeff King1-1/+1
We load our own tar-specific config, and then chain to git_default_config. This is pointless, as our caller should already have loaded the default config. It also introduces a needless inconsistency with the zip archiver, which does not look at the config files at all (and therefore relies on the caller to have loaded config). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-08archive-tar.c: squelch a type mismatch warningJunio C Hamano1-1/+1
On some systems, giving a value of type time_t to printf "%lo" that expects an unsigned long would give a type mismatch warning. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-12Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializerBrandon Casey1-4/+2
Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-07-19archive: remove unused headersRené Scharfe1-2/+0
Remove obsolete #includes. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15archive: centralize archive entry writingRené Scharfe1-79/+37
Add the exported function write_archive_entries() to archive.c, which uses the new ability of read_tree_recursive() to pass a context pointer to its callback in order to centralize previously duplicated code. The new callback function write_archive_entry() does the work that every archiver backend needs to do: loading file contents, entering subdirectories, handling file attributes, constructing the full path of the entry. All that done, it calls the backend specific write_archive_entry_fn_t function. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15archive: add baselen member to struct archiver_argsRené Scharfe1-5/+3
Calculate the length of base and save it in a new member of struct archiver_args. This way we don't have to compute it in each of the format backends. Note: parse_archive_args() guarantees that ->base won't ever be NULL. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-15add context pointer to read_tree_recursive()René Scharfe1-5/+6
Add a pointer parameter to read_tree_recursive(), which is passed to the callback function. This allows callers of read_tree_recursive() to share data with the callback without resorting to global variables. All current callers pass NULL. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-09Teach new attribute 'export-ignore' to git-archiveRené Scharfe1-0/+2
Paths marked with this attribute are not output to git-archive output. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14Provide git_config with a callback-data parameterJohannes Schindelin1-3/+3
git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-10git-archive: ignore prefix when checking file attributeRené Scharfe1-2/+4
Ulrik Sverdrup noticed that git-archive doesn't correctly apply the attribute export-subst when the option --prefix is given, too. When it checked if a file has the attribute turned on, git-archive would try to look up the full path -- including the prefix -- in .gitattributes. That's wrong, as the prefix doesn't need to have any relation to any existing directories, tracked or not. This patch makes git-archive ignore the prefix when looking up if value of the attribute export-subst for a file. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-11archive-tar.c: guard config parser from value=NULLJunio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-20Fix the expansion pattern of the pseudo-static path buffer.Pierre Habouzit1-3/+2
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-16Now that cache.h needs strbuf.h, remove useless includes.Pierre Habouzit1-1/+0
Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10Strbuf API extensions and fixes.Pierre Habouzit1-2/+2
* Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10Merge branch 'master' into ph/strbufJunio C Hamano1-1/+4
* master: archive - leakfix for format_subst() Make --no-thin the default in git-push to save server resources fix doc for --compression argument to pack-objects git-tag -s must fail if gpg cannot sign the tag. git-svn: understand grafts when doing dcommit git-diff: don't squelch the new SHA1 in submodule diffs Define NO_MEMMEM on Darwin as it lacks the function git-svn: fix "Malformed network data" with svn:// servers (cvs|svn)import: Ask git-tag to overwrite old tags. git-rebase: fix -C option git-rebase: support --whitespace=<option> Documentation / grammer nit archive: rename attribute specfile to export-subst archive: specfile syntax change: "$Format:%PLCHLDR$" instead of just "%PLCHLDR" (take 2) add memmem() Remove unused function convert_sha1_file() archive: specfile support (--pretty=format: in archive files) Export format_commit_message()
2007-09-06Simplify strbuf uses in archive-tar.c using strbuf APIPierre Habouzit1-49/+16
This is just cleaner way to deal with strbufs, using its API rather than reinventing it in the module (e.g. strbuf_append_string is just the plain strbuf_addstr function, and it was used to perform what strbuf_addch does anyways). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-06Rework strbuf API and semantics.Pierre Habouzit1-1/+1
The gory details are explained in strbuf.h. The change of semantics this patch enforces is that the embeded buffer has always a '\0' character after its last byte, to always make it a C-string. The offs-by-one changes are all related to that very change. A strbuf can be used to store byte arrays, or as an extended string library. The `buf' member can be passed to any C legacy string function, because strbuf operations always ensure there is a terminating \0 at the end of the buffer, not accounted in the `len' field of the structure. A strbuf can be used to generate a string/buffer whose final size is not really known, and then "strbuf_detach" can be used to get the built buffer, and keep the wrapping "strbuf" structure usable for further work again. Other interesting feature: strbuf_grow(sb, size) ensure that there is enough allocated space in `sb' to put `size' new octets of data in the buffer. It helps avoiding reallocating data for nothing when the problem the strbuf helps to solve has a known typical size. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-03archive: specfile support (--pretty=format: in archive files)René Scharfe1-1/+4
Add support for a new attribute, specfile. Files marked as being specfiles are expanded by git-archive when they are written to an archive. It has no effect on worktree files. The same placeholders as those for the option --pretty=format: of git-log et al. can be used. The attribute is useful for creating auto-updating specfiles. It is limited by the underlying function format_commit_message(), though. E.g. currently there is no placeholder for git-describe like output, and expanded specfiles can't contain NUL bytes. That can be fixed in format_commit_message() later and will then benefit users of git-log, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-05-21rename dirlink to gitlink.Martin Waitz1-2/+2
Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-18git-archive: convert archive entries like checkouts doRené Scharfe1-7/+9
As noted by Johan Herland, git-archive is a kind of checkout and needs to apply any checkout filters that might be configured. This patch adds the convenience function convert_sha1_file which returns a buffer containing the object's contents, after converting, if necessary (i.e. it's a combination of read_sha1_file and convert_to_working_tree). Direct calls to read_sha1_file in git-archive are then replaced by calls to convert_sha1_file. Since convert_sha1_file expects its path argument to be NUL-terminated -- a convention it inherits from convert_to_working_tree -- the patch also changes the path handling in archive-tar.c to always NUL-terminate the string. It used to solely rely on the len field of struct strbuf before. archive-zip.c already NUL-terminates the path and thus needs no such change. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-12git-archive: don't die when repository uses subprojectsLars Hjemli1-2/+2
Both archive-tar and archive-zip needed to be taught about subprojects. The tar function died when trying to read the subproject commit object, while the zip function reported "unsupported file mode". This fixes both by representing the subproject as an empty directory. Signed-off-by: Lars Hjemli <hjemli@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27convert object type handling from a string to a numberNicolas Pitre1-2/+2
We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06Set default "tar" umask to 002 and owner.group to root.rootRené Scharfe1-4/+3
In order to make the generated tar files more friendly to users who extract them as root using GNU tar and its implied -p option, change the default umask to 002 and change the owner name and group name to root. This ensures that a) the extracted files and directories are not world-writable and b) that they belong to user and group root. Before they would have been assigned to a user and/or group named git if it existed. This also answers the question in the removed comment: uid=0, gid=0, uname=root, gname=root is exactly what we want. Normal users who let tar apply their umask while extracting are only affected if their umask allowed the world to change their files (e.g. a umask of zero). This case is so unlikely and strange that we don't need to support it. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20simplify inclusion of system header files.Junio C Hamano1-1/+0
This is a mechanical clean-up of the way *.c files include system header files. (1) sources under compat/, platform sha-1 implementations, and xdelta code are exempt from the following rules; (2) the first #include must be "git-compat-util.h" or one of our own header file that includes it first (e.g. config.h, builtin.h, pkt-line.h); (3) system headers that are included in "git-compat-util.h" need not be included in individual C source files. (4) "git-compat-util.h" does not have to include subsystem specific header files (e.g. expat.h). Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-24git-tar-tree: Move code for git-archive --format=tar to archive-tar.cRene Scharfe1-0/+325
This patch doesn't change any functionality, it only moves code around. It makes seeing the few remaining lines of git-tar-tree code easier. ;-) Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>