aboutsummaryrefslogtreecommitdiffstats
path: root/trailer.c
AgeCommit message (Collapse)AuthorFilesLines
9 daysbuiltin/commit: refactor --trailer logicJohn Passaro1-0/+12
git-commit adds user trailers to the commit message by passing its `--trailer` arguments to a child process running `git-interpret-trailers --in-place`. This logic is broadly useful, not just for git-commit but for other commands constructing message bodies (e.g. git-tag). Let's move this logic from git-commit to a new function in the trailer API, so that it can be re-used in other commands. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: John Passaro <john.a.passaro@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23Merge branch 'la/format-trailer-info'Junio C Hamano1-52/+29
The code to format trailers have been cleaned up. * la/format-trailer-info: trailer: finish formatting unification trailer: begin formatting unification format_trailer_info(): append newline for non-trailer lines format_trailer_info(): drop redundant unfold_value() format_trailer_info(): use trailer_item objects
2024-04-05Merge branch 'jk/core-comment-string'Junio C Hamano1-3/+3
core.commentChar used to be limited to a single byte, but has been updated to allow an arbitrary multi-byte sequence. * jk/core-comment-string: config: add core.commentString config: allow multi-byte core.commentChar environment: drop comment_line_char compatibility macro wt-status: drop custom comment-char stringification sequencer: handle multi-byte comment characters when writing todo list find multi-byte comment chars in unterminated buffers find multi-byte comment chars in NUL-terminated strings prefer comment_line_str to comment_line_char for printing strbuf: accept a comment string for strbuf_add_commented_lines() strbuf: accept a comment string for strbuf_commented_addf() strbuf: accept a comment string for strbuf_stripspace() environment: store comment_line_char as a string strbuf: avoid shadowing global comment_line_char name commit: refactor base-case of adjust_comment_line_char() strbuf: avoid static variables in strbuf_add_commented_lines() strbuf: simplify comment-handling in add_lines() helper config: forbid newline as core.commentChar
2024-03-15trailer: finish formatting unificationLinus Arver1-4/+4
Rename format_trailer_info() to format_trailers(). Finally, both interpret-trailers and format_trailers_from_commit() can call "format_trailers()"! Update the comment in <trailer.h> to remove the (now obsolete) caveats about format_trailers_from_commit(). Those caveats come from a388b10fc1 (pretty: move trailer formatting to trailer.c, 2017-08-15) where it says: pretty: move trailer formatting to trailer.c The next commit will add many features to the %(trailer) placeholder in pretty.c. We'll need to access some internal functions of trailer.c for that, so our options are either: 1. expose those functions publicly or 2. make an entry point into trailer.c to do the formatting Doing (2) ends up exposing less surface area, though do note that caveats in the docstring of the new function. which suggests format_trailers_from_commit() started out from pretty.c and did not have access to all of the trailer implementation internals, and was never intended to replace (unify) the formatting machinery in trailer.c. The refactors leading up to this commit (as well as additional refactors that will follow) expose additional functions publicly, and is therefore choosing option (1) as described in a388b10fc1. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15trailer: begin formatting unificationLinus Arver1-37/+17
Now that the preparatory refactors are over, we can replace the call to format_trailers() in interpret-trailers with format_trailer_info(). This unifies the trailer formatting machinery In order to avoid breakages in t7502 and t7513, we have to steal the features present in format_trailers(). Namely, we have to teach format_trailer_info() as follows: (1) make it aware of opts->trim_empty, and (2) make it avoid hardcoding ": " as the separator and space (which can result in double-printing these characters). For (2), make it only print the separator and space if we cannot find any recognized separator somewhere in the key (yes, keys may have a trailing separator in it --- we will eventually fix this design but not now). Do so by copying the code out of print_tok_val(), and deleting the same function. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15format_trailer_info(): append newline for non-trailer linesLinus Arver1-2/+3
This wraps up the preparatory refactors to unify the trailer formatters. Two patches ago we made format_trailer_info() use trailer_item objects instead of the "trailers" string array. The strings in the array include trailing newlines, because the string array is split up with trailer_lines = strbuf_split_buf(str + trailer_block_start, end_of_log_message - trailer_block_start, '\n', 0); in trailer_info_get() and strbuf_split_buf() includes the terminator (in this case the newline character '\n') for each split-up substring. And before we made the transition to use trailer_item objects for it, format_trailer_info() called parse_trailer() (which trims newlines) for trailer lines but did _not_ call parse_trailer() for non-trailer lines. So for trailer lines it had to add back the trimmed newline like this if (!opts->separator) strbuf_addch(out, '\n'); But for non-trailer lines it didn't have to add back the newline because it could just reuse same string in the "trailers" string array (which again, already included the trailing newline). Now that format_trailer_info() uses trailer_item objects for all cases, it can't rely on "trailers" string array anymore. And so it must be taught to add a newline back when printing non-trailer lines, just like it already does for trailer lines. Do so now. The test suite can pass again without the need to hide failures with *_failure, so flip the affected test cases back to *_success. Now, format_trailer_info() is in better shape to supersede format_trailers(), which we'll do in the next commit. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15format_trailer_info(): drop redundant unfold_value()Linus Arver1-3/+0
This is another preparatory refactor to unify the trailer formatters. In the last patch we made format_trailer_info() use trailer_item objects instead of the "trailers" string array. This means that the call to unfold_value() here is redundant because the trailer_item objects are already unfolded in parse_trailers() which is a dependency of our caller, format_trailers_from_commit(). Remove the redundant call. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15format_trailer_info(): use trailer_item objectsLinus Arver1-11/+10
This is another preparatory refactor to unify the trailer formatters. Make format_trailer_info() operate on trailer_item objects, not the raw string array. We will continue to make improvements, culminating in the renaming of format_trailer_info() to format_trailers(), at which point the unification of these formatters will be complete. In order to avoid breaking t4205 and t6300, flip *_success to *_failure in the affected test cases. Add a temporary "test_trailer_option_expect_failure" wrapper which we will use along with "test_expect_failure" in the next commit to avoid breaking tests. When the dust settles with the refactors a few more commits later, we will drop the use of *_failure to make the tests truly pass again. When the preparatory refactors are complete, we'll be able to drop the use of *_failure that we introduce here. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14Merge branch 'la/trailer-api'Junio C Hamano1-135/+46
Trailer API updates. Acked-by: Christian Couder <christian.couder@gmail.com> cf. <CAP8UFD1Zd+9q0z1JmfOf60S2vn5-sD3SafDvAJUzRFwHJKcb8A@mail.gmail.com> * la/trailer-api: format_trailers_from_commit(): indirectly call trailer_info_get() format_trailer_info(): move "fast path" to caller format_trailers(): use strbuf instead of FILE trailer_info_get(): reorder parameters trailer: move interpret_trailers() to interpret-trailers.c trailer: reorder format_trailers_from_commit() parameters trailer: rename functions to use 'trailer' shortlog: add test for de-duplicating folded trailers trailer: free trailer_info _after_ all related usage
2024-03-12find multi-byte comment chars in unterminated buffersJeff King1-2/+2
As with the previous patch, we need to swap out single-byte matching for something like starts_with() to match all bytes of a multi-byte comment character. But for cases where the buffer is not NUL-terminated (and we instead have an explicit size or end pointer), it's not safe to use starts_with(), as it might walk off the end of the buffer. Let's introduce a new starts_with_mem() that does the same thing but also accepts the length of the "haystack" str and makes sure not to walk past it. Note that in most cases the existing code did not need a length check at all, since it was written in a way that knew we had at least one byte available (and that was all we checked). So I had to read each one to find the appropriate bounds. The one exception is sequencer.c's add_commented_lines(), where we can actually get rid of the length check. Just like starts_with(), our starts_with_mem() handles an empty haystack variable by not matching (assuming a non-empty prefix). A few notes on the implementation of starts_with_mem(): - it would be equally correct to take an "end" pointer (and indeed, many of the callers have this and have to subtract to come up with the length). I think taking a ptr/size combo is a more usual interface for our codebase, though, and has the added benefit that the function signature makes it harder to mix up the three parameters. - we could obviously build starts_with() on top of this by passing strlen(str) as the length. But it's possible that starts_with() is a relatively hot code path, and it should not pay that penalty (it can generally return an answer proportional to the size of the prefix, not the whole string). - it naively feels like xstrncmpz() should be able to do the same thing, but that's not quite true. If you pass the length of the haystack buffer, then strncmp() finds that a shorter prefix string is "less than" than the haystack, even if the haystack starts with the prefix. If you pass the length of the prefix, then you risk reading past the end of the haystack if it is shorter than the prefix. So I think we really do need a new function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12find multi-byte comment chars in NUL-terminated stringsJeff King1-1/+1
Several parts of the code need to identify lines that begin with the comment character, and do so with a simple byte equality check. As part of the transition to handling multi-byte characters, we need to match all of the bytes. For cases where we are looking in a NUL-terminated string, we can just use starts_with(), which checks all of the characters in comment_line_str. Note that we can drop the "line.len" check in wt-status.c's read_rebase_todolist(). The starts_with() function handles the case of an empty haystack buffer (it will always return false for a non-empty prefix). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01format_trailers_from_commit(): indirectly call trailer_info_get()Linus Arver1-1/+4
This is another preparatory refactor to unify the trailer formatters. For background, note that the "trailers" string array is the `char **trailers` member in `struct trailer_info` and that the trailer_item objects are the elements of the `struct list_head *head` linked list. Currently trailer_info_get() only populates `char **trailers`. And parse_trailers() first calls trailer_info_get() so that it can use the `char **trailers` to populate a list of `struct trailer_item` objects Instead of calling trailer_info_get() directly from format_trailers_from_commit(), make it call parse_trailers() instead because parse_trailers() already calls trailer_info_get(). This change is a NOP because format_trailer_info() (which format_trailers_from_commit() wraps around) only looks at the "trailers" string array, not the trailer_item objects which parse_trailers() populates. For now we do need to create a dummy LIST_HEAD(trailer_objects); because parse_trailers() expects it in its signature. In a future patch, we'll change format_trailer_info() to use the parsed trailer_item objects (trailer_objects) instead of the `char **trailers` array. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01format_trailer_info(): move "fast path" to callerLinus Arver1-11/+9
This is another preparatory refactor to unify the trailer formatters. This allows us to drop the "msg" parameter from format_trailer_info(), so that it take 3 parameters, similar to format_trailers() which also takes 3 parameters: void format_trailers(const struct process_trailer_options *opts, struct list_head *trailers, struct strbuf *out) The short-term goal is to make format_trailer_info() be smart enough to deprecate format_trailers(). And then ultimately we will rename format_trailer_info() to format_trailers(). Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01format_trailers(): use strbuf instead of FILELinus Arver1-6/+7
This is another preparatory refactor to unify the trailer formatters. Make format_trailers() also write to a strbuf, to align with format_trailers_from_commit() which also does the same. Doing this makes format_trailers() behave similar to format_trailer_info() (which will soon help us replace one with the other). Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01trailer_info_get(): reorder parametersLinus Arver1-5/+6
This is another preparatory refactor to unify the trailer formatters. Take const struct process_trailer_options *opts as the first parameter, because these options are required for parsing trailers (e.g., whether to treat "---" as the end of the log message). And take struct trailer_info *info last, because it's an "out parameter" (something that the caller wants to use as the output of this function). Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01trailer: move interpret_trailers() to interpret-trailers.cLinus Arver1-106/+13
The interpret-trailers.c builtin is the only place we need to call interpret_trailers(), so move its definition there (together with a few helper functions called only by it) and remove its external declaration from <trailer.h>. Several helper functions that are called by interpret_trailers() remain in trailer.c because other callers in the same file still call them. Declare them in <trailer.h> so that interpret_trailers() (now in builtin/interpret-trailers.c) can continue calling them as a trailer API user. This enriches <trailer.h> with a more granular API, which can then be unit-tested in the future (because interpret_trailers() by itself does too many things to be able to be easily unit-tested). Take this opportunity to demote some file-handling functions out of the trailer API implementation, as these have nothing to do with trailers. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01trailer: reorder format_trailers_from_commit() parametersLinus Arver1-5/+6
Currently there are two functions for formatting trailers in <trailer.h>: void format_trailers(const struct process_trailer_options *, struct list_head *trailers, FILE *outfile); void format_trailers_from_commit(struct strbuf *out, const char *msg, const struct process_trailer_options *opts); and although they are similar enough (even taking the same process_trailer_options struct pointer) they are used quite differently. One might intuitively think that format_trailers_from_commit() builds on top of format_trailers(), but this is not the case. Instead format_trailers_from_commit() calls format_trailer_info() and format_trailers() is never called in that codepath. This is a preparatory refactor to help us deprecate format_trailers() in favor of format_trailer_info() (at which point we can rename the latter to the former). When the deprecation is complete, both format_trailers_from_commit(), and the interpret-trailers builtin will be able to call into the same helper function (instead of format_trailers() and format_trailer_info(), respectively). Unifying the formatters is desirable because it simplifies the API. Reorder parameters for format_trailers_from_commit() to prefer const struct process_trailer_options *opts as the first parameter, because these options are intimately tied to formatting trailers. And take struct strbuf *out last, because it's an "out parameter" (something that the caller wants to use as the output of this function). Similarly, reorder parameters for format_trailer_info(), because later on we will unify the two together. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01trailer: rename functions to use 'trailer'Linus Arver1-13/+13
Rename process_trailers() to interpret_trailers(), because it matches the name for the builtin command of the same name (git-interpret-trailers), which is the sole user of process_trailers(). In a following commit, we will move "interpret_trailers" from trailer.c to builtin/interpret-trailers.c. That move will necessitate the growth of the trailer.h API, forcing us to expose some additional functions in trailer.h. Rename relevant functions so that they include the term "trailer" in their name, so that clients of the API will be able to easily identify them by their "trailer" moniker, just like all the other functions already exposed by trailer.h. Rename `struct list_head *head` to `struct list_head *trailers` because "head" conveys no additional information beyond the "list_head" type. Reorder parameters for format_trailers_from_commit() to prefer const struct process_trailer_options *opts as the first parameter, because these options are intimately tied to formatting trailers. Parameters like `FILE *outfile` should be last because they are a kind of 'out' parameter, so put such parameters at the end. This will be the pattern going forward in this series. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01shortlog: add test for de-duplicating folded trailersLinus Arver1-0/+1
The shortlog builtin was taught to use the trailer iterator interface in 47beb37bc6 (shortlog: match commit trailers with --group, 2020-09-27). The iterator always unfolds values and this has always been the case since the time the iterator was first introduced in f0939a0eb1 (trailer: add interface for iterating over commit trailers, 2020-09-27). Add a comment line to remind readers of this behavior. The fact that the iterator always unfolds values is important (at least for shortlog) because unfolding allows it to recognize both folded and unfolded versions of the same trailer for de-duplication. Capture the existing behavior in a new test case to guard against regressions in this area. This test case is based off of the existing "shortlog de-duplicates trailers in a single commit" just above it. Now if we were to remove the call to unfold_value(&iter->val); inside the iterator, this new test case will break. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-01trailer: free trailer_info _after_ all related usageLinus Arver1-2/+1
In de7c27a186 (trailer: use offsets for trailer_start/trailer_end, 2023-10-20), we started using trailer block offsets in trailer_info. In particular, we dropped the use of a separate stack variable "size_t trailer_end", in favor of accessing the new "trailer_block_end" member of trailer_info (as "info.trailer_block_end"). At that time, we forgot to also move the trailer_info_release(&info); line to be _after_ this new use of the trailer_info struct. Move it now. Note that even without this patch, we didn't have leaks or any other problems because trailer_info_release() only frees memory allocated on the heap. The "trailer_block_end" member was allocated on the stack back then (as it is now) so it was still safe to use for all this time. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-19Merge branch 'la/trailer-cleanups'Junio C Hamano1-8/+7
Fix to an already-graduated topic. * la/trailer-cleanups: trailer: fix comment/cut-line regression with opts->no_divider
2024-02-19trailer: fix comment/cut-line regression with opts->no_dividerJeff King1-8/+7
Commit 97e9d0b78a (trailer: find the end of the log message, 2023-10-20) combined two code paths for finding the end of the log message. For the "no_divider" case, we used to use find_trailer_end(), and that has now been rolled into find_end_of_log_message(). But there's a regression; that function returns early when no_divider is set, returning the whole string. That's not how find_trailer_end() behaved. Although it did skip the "---" processing (which is what "no_divider" is meant to do), we should still respect ignored_log_message_bytes(), which covers things like comments, "commit -v" cut lines, and so on. The bug is actually in the interpret-trailers command, but the obvious way to experience it is by running "commit -v" with a "--trailer" option. The new trailer will be added at the end of the verbose diff, rather than before it (and consequently will be ignored entirely, since everything after the diff's intro scissors line is thrown away). I've added two tests here: one for interpret-trailers directly, which shows the bug via the parsing routines, and one for "commit -v". The fix itself is pretty simple: instead of returning early, no_divider just skips the "---" handling but still calls ignored_log_message_bytes(). Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02Merge branch 'la/trailer-cleanups'Junio C Hamano1-34/+50
Code clean-up. * la/trailer-cleanups: trailer: use offsets for trailer_start/trailer_end trailer: find the end of the log message commit: ignore_non_trailer computes number of bytes to ignore
2023-12-20trailer: use offsets for trailer_start/trailer_endLinus Arver1-15/+14
Previously these fields in the trailer_info struct were of type "const char *" and pointed to positions in the input string directly (to the start and end positions of the trailer block). Use offsets to make the intended usage less ambiguous. We only need to reference the input string in format_trailer_info(), so update that function to take a pointer to the input. While we're at it, rename trailer_start to trailer_block_start to be more explicit about these offsets (that they are for the entire trailer block including other trailers). Ditto for trailer_end. Reported-by: Glen Choo <glencbz@gmail.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-20trailer: find the end of the log messageLinus Arver1-23/+40
Previously, trailer_info_get() computed the trailer block end position by (1) checking for the opts->no_divider flag and optionally calling find_patch_start() to find the "patch start" location (patch_start), and (2) calling find_trailer_end() to find the end of the trailer block using patch_start as a guide, saving the return value into "trailer_end". The logic in (1) was awkward because the variable "patch_start" is misleading if there is no patch in the input. The logic in (2) was misleading because it could be the case that no trailers are in the input (yet we are setting a "trailer_end" variable before even searching for trailers, which happens later in find_trailer_start()). The name "find_trailer_end" was misleading because that function did not look for any trailer block itself --- instead it just computed the end position of the log message in the input where the end of the trailer block (if it exists) would be (because trailer blocks must always come after the end of the log message). Combine the logic in (1) and (2) together into find_patch_start() by renaming it to find_end_of_log_message(). The end of the log message is the starting point which find_trailer_start() needs to start searching backward to parse individual trailers (if any). Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09trailer: handle NULL value when parsing trailer-specific configJeff King1-0/+6
When parsing the "key", "command", and "cmd" trailer config, we just make a copy of the value string. If we see an implicit bool like: [trailer "foo"] key we'll segfault trying to copy a NULL pointer. We can fix this with the usual config_error_nonbool() check. I split this out from the other vanilla cases, because at first glance it looks like a better fix here would be to move the NULL check out of the switch statement. But it would change the behavior of other keys like trailer.*.ifExists, where an implicit bool is interpreted as EXISTS_DEFAULT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: handle NULL value when parsing non-boolsJeff King1-0/+2
When the config parser sees an "implicit" bool like: [core] someVariable it passes NULL to the config callback. Any callback code which expects a string must check for NULL. This usually happens via helpers like git_config_string(), etc, but some custom code forgets to do so and will segfault. These are all fairly vanilla cases where the solution is just the usual pattern of: if (!value) return config_error_nonbool(var); though note that in a few cases we have to split initializers like: int some_var = initializer(); into: int some_var; if (!value) return config_error_nonbool(var); some_var = initializer(); There are still some broken instances after this patch, which I'll address on their own in individual patches after this one. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-20commit: ignore_non_trailer computes number of bytes to ignoreLinus Arver1-1/+1
ignore_non_trailer() returns the _number of bytes_ that should be ignored from the end of the log message. It does not by itself "ignore" anything. Rename this function to remove the leading "ignore" verb, to sound more like a quantity than an action. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11trailer: split process_command_line_args into separate functionsLinus Arver1-13/+21
Previously, process_command_line_args did two things: (1) parse trailers from the configuration, and (2) parse trailers defined on the command line. Separate (1) outside to a new function, parse_trailers_from_config. Rename the remaining logic to parse_trailers_from_command_line_args. Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11trailer: split process_input_file into separate piecesLinus Arver1-20/+22
Currently, process_input_file does three things: (1) parse the input string for trailers, (2) print text before the trailers, and (3) calculate the position of the input where the trailers end. Rename this function to parse_trailers(), and make it only do (1). The caller of this function, process_trailers, becomes responsible for (2) and (3). These items belong inside process_trailers because they are both concerned with printing the surrounding text around trailers (which is already one of the immediate concerns of process_trailers). Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-11trailer: separate public from internal portion of trailer_iteratorLinus Arver1-5/+5
The fields here are not meant to be used by downstream callers, so put them behind an anonymous struct named as "internal" to warn against their use. This follows the pattern in 576de3d956 (unpack_trees: start splitting internal fields from public API, 2023-02-27). Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano1-1/+0
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan1-1/+0
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo1-0/+2
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: remove cache.h inclusion due to environment.h changesElijah Newren1-1/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21environment: move comment_line_char from cache.hElijah Newren1-0/+1
This is one step towards making strbuf.c not depend upon cache.h. Additional steps will follow in subsequent commits. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21treewide: be explicit about dependence on gettext.hElijah Newren1-0/+1
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23alloc.h: move ALLOC_GROW() functions from cache.hElijah Newren1-0/+1
This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason1-2/+2
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19config: mark unused callback parametersJeff King1-2/+4
The callback passed to git_config() must conform to a particular interface. But most callbacks don't actually look at the extra "void *data" parameter. Let's mark the unused parameters to make -Wunused-parameter happy. Note there's one unusual case here in get_remote_default() where we actually ignore the "value" parameter. That's because it's only checking whether the option is found at all, and not parsing its value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-10Merge branch 'ab/env-array'Junio C Hamano1-1/+1
Rename .env_array member to .env in the child_process structure. * ab/env-array: run-command API users: use "env" not "env_array" in comments & names run-command API: rename "env_array" to "env"
2022-06-02run-command API: rename "env_array" to "env"Ævar Arnfjörð Bjarmason1-1/+1
Start following-up on the rename mentioned in c7c4bdeccf3 (run-command API: remove "env" member, always use "env_array", 2021-11-25) of "env_array" to "env". The "env_array" name was picked in 19a583dc39e (run-command: add env_array, an optional argv_array for env, 2014-10-19) because "env" was taken. Let's not forever keep the oddity of "*_array" for this "struct strvec", but not for its "args" sibling. This commit is almost entirely made with a coccinelle rule[1]. The only manual change here is in run-command.h to rename the struct member itself and to change "env_array" to "env" in the CHILD_PROCESS_INIT initializer. The rest of this is all a result of applying [1]: * make contrib/coccinelle/run_command.cocci.patch * patch -p1 <contrib/coccinelle/run_command.cocci.patch * git add -u 1. cat contrib/coccinelle/run_command.pending.cocci @@ struct child_process E; @@ - E.env_array + E.env @@ struct child_process *E; @@ - E->env_array + E->env I've avoided changing any comments and derived variable names here, that will all be done in the next commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02Merge branch 'ep/maint-equals-null-cocci' for maint-2.35Junio C Hamano1-1/+1
* ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25run-command API: remove "env" member, always use "env_array"Ævar Arnfjörð Bjarmason1-1/+1
Remove the "env" member from "struct child_process" in favor of always using the "env_array". As with the preceding removal of "argv" in favor of "args" this gets rid of current and future oddities around memory management at the API boundary (see the amended API docs). For some of the conversions we can replace patterns like: child.env = env->v; With: strvec_pushv(&child.env_array, env->v); But for others we need to guard the strvec_pushv() with a NULL check, since we're not passing in the "v" member of a "struct strvec", e.g. in the case of tmp_objdir_env()'s return value. Ideally we'd rename the "env_array" member to simply "env" as a follow-up, since it and "args" are now inconsistent in not having an "_array" suffix, and seemingly without any good reason, unless we look at the history of how they came to be. But as we've currently got 122 in-tree hits for a "git grep env_array" let's leave that for now (and possibly forever). Doing that rename would be too disruptive. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04trailer: add new .cmd config optionZheNing Hu1-10/+25
The `trailer.<token>.command` configuration variable specifies a command (run via the shell, so it does not have to be a single name or path to the command, but can be a shell script), and the first occurrence of substring $ARG is replaced with the value given to the `interpret-trailer` command for the token in a '--trailer <token>=<value>' argument. This has three downsides: * The use of $ARG in the mechanism misleads the users that the value is passed in the shell variable, and tempt them to use $ARG more than once, but that would not work, as the second and subsequent $ARG are not replaced. * Because $ARG is textually replaced without regard to the shell language syntax, even '$ARG' (inside a single-quote pair), which a user would expect to stay intact, would be replaced, and worse, if the value had an unmatched single quote (imagine a name like "O'Connor", substituted into NAME='$ARG' to make it NAME='O'Connor'), it would result in a broken command that is not syntactically correct (or worse). * The first occurrence of substring `$ARG` will be replaced with the empty string, in the command when the command is first called to add a trailer with the specified <token>. This is a bad design, the nature of automatic execution causes it to add a trailer that we don't expect. Introduce a new `trailer.<token>.cmd` configuration that takes higher precedence to deprecate and eventually remove `trailer.<token>.command`, which passes the value as an argument to the command. Instead of "$ARG", users can refer to the value as positional argument, $1, in their scripts. At the same time, in order to allow `git interpret-trailers` to better simulate the behavior of `git command -s`, 'trailer.<token>.cmd' will not automatically execute. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19Merge branch 'rs/xcalloc-takes-nelem-first'Junio C Hamano1-3/+3
Code cleanup. * rs/xcalloc-takes-nelem-first: fix xcalloc() argument order
2021-03-13use CALLOC_ARRAYRené Scharfe1-1/+1
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08fix xcalloc() argument orderRené Scharfe1-4/+4
Pass the number of elements first and ther size second, as expected by xcalloc(). Provide a semantic patch, which was actually used to generate the rest of this patch. The semantic patch would generate flip-flop diffs if both arguments are sizeofs. We don't have such a case, and it's hard to imagine the usefulness of such an allocation. If it ever occurs then we could deal with it by duplicating the rule in the semantic patch to make it cancel itself out, or we could change the code to use CALLOC_ARRAY. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09pretty format %(trailers): add a "key_value_separator"Ævar Arnfjörð Bjarmason1-3/+8
Add a "key_value_separator" option to the "%(trailers)" pretty format, to go along with the existing "separator" argument. In combination these two options make it trivial to produce machine-readable (e.g. \0 and \0\0-delimited) format output. As elaborated on in a previous commit which added "keyonly" it was needlessly tedious to extract structured data from "%(trailers)" before the addition of this "key_value_separator" option. As seen by the test being added here extracting this data now becomes trivial. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09pretty format %(trailers): add a "keyonly"Ævar Arnfjörð Bjarmason1-3/+6
Add support for a "keyonly". This allows for easier parsing out of the key and value. Before if you didn't want to make assumptions about how the key was formatted. You'd need to parse it out as e.g.: --pretty=format:'%H%x00%(trailers:separator=%x00%x00)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' And then proceed to deduce keys by looking at those two and subtracting the value plus the hardcoded ": " separator from the non-valueonly %(trailers) line. Now it's possible to simply do: --pretty=format:'%H%x00%(trailers:separator=%x00%x00,keyonly)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' Which at least reduces it to a state machine where you get N keys and correlate them with N values. Even better would be to have a way to change the ": " delimiter to something easily machine-readable (a key might contain ": " too). A follow-up change will add support for that. I don't really have a use-case for just "keyonly" myself. I suppose it would be useful in some cases as "key=*" matches case-insensitively, so a plain "keyonly" will give you the variants of the keys you matched. I'm mainly adding it to fix the inconsistency with "valueonly". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09pretty-format %(trailers): fix broken standalone "valueonly"Ævar Arnfjörð Bjarmason1-1/+2
Fix %(trailers:valueonly) being a noop due to on overly eager optimization in format_trailer_info() which skips custom formatting if no custom options are given. When "valueonly" was added in d9b936db522 (pretty: add support for "valueonly" option in %(trailers), 2019-01-28) we forgot to add it to the list of options that optimization checks for. See e.g. the addition of "key" in 250bea0c165 (pretty: allow showing specific trailers, 2019-01-28) for a similar change where this wasn't missed. Thus the "valueonly" option in "%(trailers:valueonly)" was a noop and the output was equivalent to that of a plain "%(trailers)". This wasn't caught because the tests for it always combined it with other options. Fix the bug by adding !opts->value_only to the list. I initially attempted to make this more future-proof by setting a flag if we got to ":" in "%(trailers:" in format_commit_one() in pretty.c. However, "%(trailers:" is also parsed in trailers_atom_parser() in ref-filter.c. There is an outstanding patch[1] unify those two, and such a fix, or other future-proofing, such as changing "process_trailer_options" flags into a bitfield, would conflict with that effort. Let's instead do the bare minimum here as this aspect of trailers is being actively worked on by another series. Let's also test for a plain "valueonly" without any other options, as well as "separator". All the other existing options on the pretty.c path had tests where they were the only option provided. I'm also keeping a sanity test for "%(trailers:)" being the same as "%(trailers)". There's no reason to suspect it wouldn't be in the current implementation, but let's keep it in the interest of black box testing. 1. https://lore.kernel.org/git/pull.726.git.1599335291.gitgitgadget@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-04Merge branch 'jk/shortlog-group-by-trailer'Junio C Hamano1-0/+36
"git shortlog" has been taught to group commits by the contents of the trailer lines, like "Reviewed-by:", "Coauthored-by:", etc. * jk/shortlog-group-by-trailer: shortlog: allow multiple groups to be specified shortlog: parse trailer idents shortlog: rename parse_stdin_ident() shortlog: de-duplicate trailer values shortlog: match commit trailers with --group trailer: add interface for iterating over commit trailers shortlog: add grouping option shortlog: change "author" variables to "ident"
2020-09-27trailer: add interface for iterating over commit trailersJeff King1-0/+36
The trailer code knows how to parse out the trailers and re-format them, but there's no easy way to iterate over the trailers (you can use trailer_info, but you have to then do a bunch of extra parsing). Let's add an iteration interface that makes this easy to do. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-26run_command: teach API users to use embedded 'args' moreJunio C Hamano1-3/+1
The child_process structure has an embedded strvec for formulating the command line argument list these days, but code that predates the wide use of it prepared a separate char *argv[] array and manually set the child_process.argv pointer point at it. Teach these old-style code to lose the separate argv[] array. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29pretty: add support for separator option in %(trailers)Anders Waldenborg1-2/+13
By default trailer lines are terminated by linebreaks ('\n'). By specifying the new 'separator' option they will instead be separated by user provided string and have separator semantics rather than terminator semantics. The separator string can contain the literal formatting codes %n and %xNN allowing it to be things that are otherwise hard to type such as %x00, or comma and end-parenthesis which would break parsing. E.g: $ git log --pretty='%(trailers:key=Reviewed-by,valueonly,separator=%x00)' Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29pretty: add support for "valueonly" option in %(trailers)Anders Waldenborg1-2/+4
With the new "key=" option to %(trailers) it often makes little sense to show the key, as it by definition already is knows which trailer is printed there. This new "valueonly" option makes it omit the key when printing trailers. E.g.: $ git show -s --pretty='%s%n%(trailers:key=Signed-off-by,valueonly)' aaaa88182 will show: > upload-pack: fix broken if/else chain in config callback > Jeff King <peff@peff.net> > Junio C Hamano <gitster@pobox.com> Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29pretty: allow showing specific trailersAnders Waldenborg1-4/+6
Adds a new "key=X" option to "%(trailers)" which will cause it to only print trailer lines which match any of the specified keys. Signed-off-by: Anders Waldenborg <anders@0x63.nu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23interpret-trailers: allow suppressing "---" dividerJeff King1-1/+5
Even with the newly-tightened "---" parser, it's still possible for a commit message to trigger a false positive if it contains something like "--- foo". If the caller knows that it has only a single commit message, it can now tell us with the "--no-divider" option, eliminating any false positives. If we were designing this from scratch, I'd probably make this the default. But we've advertised the "---" behavior in the documentation since interpret-trailers has existed. Since it's meant to be scripted, breaking that would be a bad idea. Note that the logic is in the underlying trailer.c code, which is used elsewhere. The default there will keep the current behavior, but many callers will benefit from setting this new option. That's left for future patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23interpret-trailers: tighten check for "---" patch boundaryJeff King1-1/+3
The interpret-trailers command accepts not only raw commit messages, but it also can manipulate trailers in format-patch output. That means it must find the "---" boundary separating the commit message from the patch. However, it does so by looking for any line starting with "---", regardless of whether there is further content. This is overly lax compared to the parsing done in mailinfo.c's patchbreak(), and may cause false positives (e.g., t/perf output tables uses dashes; if you cut and paste them into your commit message, it fools the parser). We could try to reuse patchbreak() here, but it actually has several heuristics that are not of interest to us (e.g., matching "diff -" without a three-dash separator or even a CVS "Index:" line). We're not interested in taking in whatever random cruft people may send, but rather handling git-formatted patches. Note that the existing documentation was written in a loose way, so technically we are changing the behavior from what it said. But this should implement the original intent in a more accurate way. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23trailer: pass process_trailer_opts to trailer_info_get()Jeff King1-3/+4
Most of the trailer code has an "opts" struct which is filled in by the caller. We don't pass it down to trailer_info_get(), which does the initial parsing, because there hasn't yet been a need to do so. Let's start passing it down in preparation for adding new options. Note that there's a single caller which doesn't otherwise have such an options struct. Since it's just one caller (that we'd have to modify anyway), let's not bother with any special treatment like accepting a NULL options struct, and just have it allocate one with the defaults. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23trailer: use size_t for iterating trailer listJeff King1-3/+3
We store the length of the trailers list in a size_t. So on a 64-bit system with a 32-bit int, in the unlikely case that we manage to actually allocate a list with 2^31 entries, we'd loop forever trying to iterate over it (our "int" would wrap to negative before exceeding info->trailer_nr). This probably doesn't matter in practice. Each entry is at least a pointer plus a non-empty string, so even without malloc overhead or the memory to hold the original string we're parsing from, you'd need to allocate tens of gigabytes. But it's easy enough to do it right. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23trailer: use size_t for string offsetsJeff King1-19/+20
Many of the string-parsing functions inside trailer.c return integer offsets into the string (e.g., to point to the end of the trailer block). Several of these use an "int" to return or store the offsets. On a system where "size_t" is much larger than "int" (e.g., most 64-bit ones), it's easy to feed a gigantic commit message that results in a negative offset. This can result in us reading memory before the string (if the int is used as an index) or far after (if it's implicitly cast to a size_t by passing to a strbuf function). Let's fix this by using size_t for all string offsets. Note that several of the functions need ssize_t, since they use "-1" as a sentinel value. The interactions here can be pretty subtle. E.g., end_of_title in find_trailer_start() does not itself need to be signed, but it is compared to the result of last_line(), which is. That promotes the latter to unsigned, and the ">=" does not behave as you might expect. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin1-3/+3
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22trailer: rename 'template' variablesBrandon Williams1-5/+5
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22trailer: rename 'new' variablesBrandon Williams1-17/+17
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06tempfile: auto-allocate tempfiles on heapJeff King1-3/+3
The previous commit taught the tempfile code to give up ownership over tempfiles that have been renamed or deleted. That makes it possible to use a stack variable like this: struct tempfile t; create_tempfile(&t, ...); ... if (!err) rename_tempfile(&t, ...); else delete_tempfile(&t); But doing it this way has a high potential for creating memory errors. The tempfile we pass to create_tempfile() ends up on a global linked list, and it's not safe for it to go out of scope until we've called one of those two deactivation functions. Imagine that we add an early return from the function that forgets to call delete_tempfile(). With a static or heap tempfile variable, the worst case is that the tempfile hangs around until the program exits (and some functions like setup_shallow_temporary rely on this intentionally, creating a tempfile and then leaving it for later cleanup). But with a stack variable as above, this is a serious memory error: the variable goes out of scope and may be filled with garbage by the time the tempfile code looks at it. Let's see if we can make it harder to get this wrong. Since many callers need to allocate arbitrary numbers of tempfiles, we can't rely on static storage as a general solution. So we need to turn to the heap. We could just ask all callers to pass us a heap variable, but that puts the burden on them to call free() at the right time. Instead, let's have the tempfile code handle the heap allocation _and_ the deallocation (when the tempfile is deactivated and removed from the list). This changes the return value of all of the creation functions. For the cleanup functions (delete and rename), we'll add one extra bit of safety: instead of taking a tempfile pointer, we'll take a pointer-to-pointer and set it to NULL after freeing the object. This makes it safe to double-call functions like delete_tempfile(), as the second call treats the NULL input as a noop. Several callsites follow this pattern. The resulting patch does have a fair bit of noise, as each caller needs to be converted to handle: 1. Storing a pointer instead of the struct itself. 2. Passing the pointer instead of taking the struct address. 3. Handling a "struct tempfile *" return instead of a file descriptor. We could play games to make this less noisy. For example, by defining the tempfile like this: struct tempfile { struct heap_allocated_part_of_tempfile { int fd; ...etc } *actual_data; } Callers would continue to have a "struct tempfile", and it would be "active" only when the inner pointer was non-NULL. But that just makes things more awkward in the long run. There aren't that many callers, so we can simply bite the bullet and adjust all of them. And the compiler makes it easy for us to find them all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-26Merge branch 'jk/trailers-parse'Junio C Hamano1-16/+98
"git interpret-trailers" has been taught a "--parse" and a few other options to make it easier for scripts to grab existing trailer lines from a commit log message. * jk/trailers-parse: doc/interpret-trailers: fix "the this" typo pretty: support normalization options for %(trailers) t4205: refactor %(trailers) tests pretty: move trailer formatting to trailer.c interpret-trailers: add --parse convenience option interpret-trailers: add an option to unfold values interpret-trailers: add an option to show only existing trailers interpret-trailers: add an option to show only the trailers trailer: put process_trailers() options into a struct
2017-08-15pretty: support normalization options for %(trailers)Jeff King1-2/+30
The interpret-trailers command recently learned some options to make its output easier to parse (for a caller whose only interested in picking out the trailer values). But it's not very efficient for asking for the trailers of many commits in a single invocation. We already have "%(trailers)" to do that, but it doesn't know about unfolding or omitting non-trailers. Let's plumb those options through, so you can have the best of both. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15pretty: move trailer formatting to trailer.cJeff King1-0/+18
The next commit will add many features to the %(trailer) placeholder in pretty.c. We'll need to access some internal functions of trailer.c for that, so our options are either: 1. expose those functions publicly or 2. make an entry point into trailer.c to do the formatting Doing (2) ends up exposing less surface area, though do note that caveats in the docstring of the new function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15interpret-trailers: add an option to unfold valuesJeff King1-0/+29
The point of "--only-trailers" is to give a caller an output that's easy for them to parse. Getting rid of the non-trailer material helps, but we still may see more complicated syntax like whitespace continuation. Let's add an option to unfold any continuation, giving the output as a single "key: value" line per trailer. As a bonus, this could be used even without --only-trailers to clean up unusual formatting in the incoming data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15interpret-trailers: add an option to show only existing trailersJeff King1-4/+5
It can be useful to invoke interpret-trailers for the primary purpose of parsing existing trailers. But in that case, we don't want to apply existing ifMissing or ifExists rules from the config. Let's add a special mode where we avoid applying those rules. Coupled with --only-trailers, this gives us a reasonable parsing tool. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15interpret-trailers: add an option to show only the trailersJeff King1-9/+14
In theory it's easy for any reader who wants to parse trailers to do so. But there are a lot of subtle corner cases around what counts as a trailer, when the trailer block begins and ends, etc. Since interpret-trailers already has our parsing logic, let's let callers ask it to just output the trailers. They still have to parse the "key: value" lines, but at least they can ignore all of the other corner cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-14interpret-trailers: add options for actionsPaolo Bonzini1-6/+28
Allow using non-default values for trailers without having to set them up in .gitconfig first. For example, if you have the following configuration trailer.signed-off-by.where = end you may use "--where before" when a patch author forgets his Signed-off-by and provides it in a separate email. Likewise for --if-exists and --if-missing Reverting to the behavior specified by .gitconfig is done with --no-where, --no-if-exists and --no-if-missing. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-14trailers: introduce struct new_trailer_itemPaolo Bonzini1-8/+11
This will provide a place to store the current state of the --where, --if-exists and --if-missing options. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-10trailer: put process_trailers() options into a structJeff King1-4/+6
We already have two options and are about to add a few more. To avoid having a huge number of boolean arguments, let's convert to an options struct which can be passed in. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-25trailers: export action enums and corresponding lookup functionsPaolo Bonzini1-32/+33
Separate the mechanical changes out of the next patch. The functions are changed to take a pointer to enum, because struct conf_info is not going to be public. Set the default values explicitly in default_conf_info, since they are not anymore close to default_conf_info and it's not obvious which constant has value 0. With the next patches, in fact, the values will not be zero anymore! Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15config: don't include config.h by defaultBrandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-29trailer: have function to describe trailer layoutJonathan Tan1-36/+82
Create a function that, taking a string, describes the position of its trailer block (if available) and the contents thereof, and make trailer use it. This makes it easier for other Git components, in the future, to interpret trailer blocks in the same way as trailer. In a subsequent patch, another component will be made to use this. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-29trailer: avoid unnecessary splitting on linesJonathan Tan1-94/+100
trailer.c currently splits lines while processing a buffer (and also rejoins lines when needing to invoke ignore_non_trailer). Avoid such line splitting, except when generating the strings corresponding to trailers (for ease of use by clients - a subsequent patch will allow other components to obtain the layout of a trailer block in a buffer, including the trailers themselves). The main purpose of this is to make it easy to return pointers into the original buffer (for a subsequent patch), but this also significantly reduces the number of memory allocations required. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-29commit: make ignore_non_trailer take buf/lenJonathan Tan1-1/+1
Make ignore_non_trailer take a buf/len pair instead of struct strbuf. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-29trailer: be stricter in parsing separatorsJonathan Tan1-6/+23
Currently, a line is interpreted to be a trailer line if it contains a separator. Make parsing stricter by requiring the text on the left of the separator, if not the empty string, to be of the "<token><optional whitespace>" form. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'jt/trailer-with-cruft'Junio C Hamano1-273/+347
Update "interpret-trailers" machinery and teaches it that people in real world write all sorts of crufts in the "trailer" that was originally designed to have the neat-o "Mail-Header: like thing" and nothing else. * jt/trailer-with-cruft: trailer: support values folded to multiple lines trailer: forbid leading whitespace in trailers trailer: allow non-trailers in trailer block trailer: clarify failure modes in parse_trailer trailer: make args have their own struct trailer: streamline trailer item create and add trailer: use list.h for doubly-linked list trailer: improve const correctness
2016-10-21trailer: support values folded to multiple linesJonathan Tan1-7/+38
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21trailer: forbid leading whitespace in trailersJonathan Tan1-1/+1
Currently, interpret-trailers allows leading whitespace in trailer lines. This leads to false positives, especially for quoted lines or bullet lists. Forbid leading whitespace in trailers. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21trailer: allow non-trailers in trailer blockJonathan Tan1-13/+76
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\nSigned-off-by: x\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: Signed-off-by: x not trailer c: d Relax the definition of a trailer block to require that the trailers (i) are all trailers, or (ii) contain at least one Git-generated trailer and consists of at least 25% trailers. Signed-off-by: x not trailer c: d (i) is the existing functionality. (ii) allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21trailer: clarify failure modes in parse_trailerJonathan Tan1-22/+53
The parse_trailer function has a few modes of operation, all depending on whether the separator is present in its input, and if yes, the separator's position. Some of these modes are failure modes, and these failure modes are handled differently depending on whether the trailer line was sourced from a file or from a command-line argument. Extract a function to find the separator, allowing the invokers of parse_trailer to determine how to handle the failure modes instead of making parse_trailer do it. In this function, also take in the list of separators, so that we can distinguish between command line arguments (which allow '=' as separator) and file input (which does not allow '=' as separator). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20trailer: make args have their own structJonathan Tan1-50/+85
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20trailer: streamline trailer item create and addJonathan Tan1-70/+60
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20trailer: use list.h for doubly-linked listJonathan Tan1-167/+91
Replace the existing handwritten implementation of a doubly-linked list in trailer.c with the functions and macros from list.h. This significantly simplifies the code. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-14trailer: improve const correctnessJonathan Tan1-7/+7
Change "const char *" to "char *" in struct trailer_item and in the return value of apply_command (since those strings are owned strings). Change "struct conf_info *" to "const struct conf_info *" (since that struct is not modified). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-12cocci: refactor common patterns to use xstrdup_or_null()Junio C Hamano1-6/+3
d64ea0f83b ("git-compat-util: add xstrdup_or_null helper", 2015-01-12) added a handy wrapper that allows us to get a duplicate of a string or NULL if the original is NULL, but a handful of codepath predate its introduction or just weren't aware of it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26die("bug"): report bugs consistentlyJohannes Schindelin1-1/+1
The vast majority of error messages in Git's source code which report a bug use the convention to prefix the message with "BUG:". As part of cleaning up merge-recursive to stop die()ing except in case of detected bugs, let's just make the remainder of the bug reports consistent with the de facto rule. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-29trailer.c: mark strings for translationNguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-14interpret-trailers: add option for in-place editingTobias Klauser1-1/+40
Add a command line option --in-place to support in-place editing akin to sed -i. This allows to write commands like the following: git interpret-trailers --trailer "X: Y" a.txt > b.txt && mv b.txt a.txt in a more concise way: git interpret-trailers --trailer "X: Y" --in-place a.txt Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-14trailer: allow to write to files other than stdoutTobias Klauser1-13/+15
Use fprintf instead of printf in trailer.c in order to allow printing to a file other than stdout. This will be needed to support in-place editing in git interpret-trailers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-02Merge branch 'cc/trailers-corner-case-fix'Junio C Hamano1-4/+11
The "interpret-trailers" helper mistook a multi-paragraph title of a commit log message with a colon in it as the end of the trailer block. * cc/trailers-corner-case-fix: trailer: support multiline title
2015-08-31trailer: support multiline titleChristian Couder1-4/+11
We currently ignore the first line passed to `git interpret-trailers`, when looking for the beginning of the trailers. Unfortunately this does not work well when a commit is created with a line break in the title, using for example the following command: git commit -m 'place of code: change we made' That's why instead of ignoring only the first line, it is better to ignore the first paragraph. Signed-off-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-28Merge branch 'cc/trailers-corner-case-fix'Junio C Hamano1-1/+3
"interpret-trailers" helper mistook a single-liner log message that has a colon as the end of existing trailer. * cc/trailers-corner-case-fix: trailer: retitle a test and correct an in-comment message trailer: ignore first line of message
2015-08-26trailer: retitle a test and correct an in-comment messageChristian Couder1-1/+2
Signed-off-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-21trailer: ignore first line of messageChristian Couder1-1/+2
When looking for the start of the trailers in the message we are passed, we should ignore the first line of the message. The reason is that if we are passed a patch or commit message then the first line should be the patch title. If we are passed only trailers we can expect that they start with an empty line that can be ignored too. This way we can properly process commit messages that have only one line with something that looks like a trailer, for example like "area of code: change we made". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-22trailer: use capture_commandJeff King1-13/+5
When we read from a trailer.*.command sub-program, the current code uses run_command followed by a pipe read, which can result in deadlock (though in practice you would have to have a large trailer for this to be a problem). The current code also leaks the file descriptor for the pipe to the sub-command. Instead, let's use capture_command, which makes this simpler (and we can get rid of our custom helper). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-13do not include the same header twiceДилян Палаузов1-1/+0
A few files include the same header file directly more than once. As all these headers protect themselves against repeated inclusion by the "#ifndef FOO_H / #define FOO_H / ... / #endif" idiom, leave only the first inclusion and remove the later inclusion as a no-op clean-up. Signed-off-by: Дилян Палаузов <git-dpa@aegee.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'cc/interpret-trailers-more'Junio C Hamano1-7/+25
"git interpret-trailers" learned to properly handle the "Conflicts:" block at the end. * cc/interpret-trailers-more: trailer: add test with an old style conflict block trailer: reuse ignore_non_trailer() to ignore conflict lines commit: make ignore_non_trailer() non static merge & sequencer: turn "Conflicts:" hint into a comment builtin/commit.c: extract ignore_non_trailer() helper function merge & sequencer: unify codepaths that write "Conflicts:" hint builtin/merge.c: drop a parameter that is never used
2014-11-19Merge branch 'cc/interpret-trailers'Junio C Hamano1-4/+10
Small fixes to a new experimental command already in 'master'. * cc/interpret-trailers: trailer: display a trailer without its trailing newline trailer: ignore comment lines inside the trailers
2014-11-10trailer: use CHILD_PROCESS_INIT in apply_command()René Scharfe1-2/+1
Initialize the struct child_process variable cp at declaration time. This is shorter, saves a function call and prevents using the variable before initialization by mistake. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-10trailer: reuse ignore_non_trailer() to ignore conflict linesChristian Couder1-7/+25
Make sure we look for trailers before any conflict line by reusing the ignore_non_trailer() function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-10trailer: display a trailer without its trailing newlineChristian Couder1-2/+6
Trailers passed to the parse_trailer() function often have a trailing newline. When erroring out, we should display the invalid trailer properly, that means without any trailing newline. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-10trailer: ignore comment lines inside the trailersChristian Couder1-2/+4
Otherwise trailers that are commented out might be processed. We would also error out if the comment line char is also a separator. This means that comments inside a trailer block will disappear, but that was already the case anyway. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-28use child_process_init() to initialize struct child_process variablesRené Scharfe1-1/+1
Call child_process_init() instead of zeroing the memory of variables of type struct child_process by hand before use because the former is both clearer and shorter. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: execute command from 'trailer.<name>.command'Christian Couder1-1/+84
Let the user specify a command that will give on its standard output the value to use for the specified trailer. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: put all the processing together and printChristian Couder1-0/+69
This patch adds the process_trailers() function that calls all the previously added processing functions and then prints the results on the standard output. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: parse trailers from file or stdinChristian Couder1-0/+123
Read trailers from a file or from stdin, parse the trailers and then put the result into a doubly linked list. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: process command line trailer argumentsChristian Couder1-0/+112
Parse the trailer command line arguments and put the result into an arg_tok doubly linked list. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: read and process config informationChristian Couder1-0/+185
Read the configuration to get trailer information, and then process it and store it in a doubly linked list. The config information is stored in the list whose first item is pointed to by: static struct trailer_item *first_conf_item; Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: process trailers from input message and argumentsChristian Couder1-0/+210
Implement the logic to process trailers from the input message and from arguments. At the beginning trailers from the input message are in their own "in_tok" doubly linked list, and trailers from arguments are in their own "arg_tok" doubly linked list. The lists are traversed and when an "arg_tok" should be "applied", it is removed from its list and inserted into the "in_tok" list. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13trailer: add data structures and basic functionsChristian Couder1-0/+69
We will use a doubly linked list to store all information about trailers and their configuration. This way we can easily remove or add trailers to or from trailer lists while traversing the lists in either direction. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>