Age | Commit message (Collapse) | Author | Files | Lines |
|
The code to parse the From e-mail header has been updated to avoid
recursion.
* jk/mailinfo-iterative-unquote-comment:
mailinfo: avoid recursion when unquoting From headers
t5100: make rfc822 comment test more careful
|
|
OOB read fix.
* jk/mailinfo-oob-read-fix:
mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
|
|
Our unquote_comment() function is recursive; when it sees a comment
within a comment, like:
(this is an (embedded) comment)
it recurses to handle the inner comment. This is fine for practical use,
but it does mean that you can easily run out of stack space with a
malicious header. For example:
perl -e 'print "From: ", "(" x 2**18;' |
git mailinfo /dev/null /dev/null
segfaults on my system. And since mailinfo is likely to be fed untrusted
input from the Internet (if not by human users, who might recognize a
garbage header, but certainly there are automated systems that apply
patches from a list) it may be possible for an attacker to trigger the
problem.
That said, I don't think there's an interesting security vulnerability
here. All an attacker can do is make it impossible to parse their email
and apply their patch, and there are lots of ways to generate bogus
emails. So it's more of an annoyance than anything.
But it's pretty easy to fix it. The recursion is not helping us preserve
any particular state from each level. The only flag in our parsing is
take_next_literally, and we can never recurse when it is set (since the
start of a new comment implies it was not backslash-escaped). So it is
really only useful for finding the end of the matched pair of
parentheses. We can do that easily with a simple depth counter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When processing a header like a "From" line, mailinfo uses
unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized
comments. It takes a NUL-terminated string on input, and loops over the
"in" pointer until it sees the NUL. When it finds the start of an
interesting block, it delegates to helper functions which also increment
"in", and return the updated pointer.
But there's a bug here: the helpers find the NUL with a post-increment
in the loop condition, like:
while ((c = *in++) != 0)
So when they do see a NUL (rather than the correct termination of the
quote or comment section), they return "in" as one _past_ the NUL
terminator. And thus the outer loop in unquote_quoted_pair() does not
realize we hit the NUL, and keeps reading past the end of the buffer.
We should instead make sure to return "in" positioned at the NUL, so
that the caller knows to stop their loop, too. A hacky way to do this is
to return "in - 1" after leaving the inner loop. But a slightly cleaner
solution is to avoid incrementing "in" until we are sure it contained a
non-NUL byte (i.e., doing it inside the loop body).
The two tests here show off the problem. Since we check the output,
they'll _usually_ report a failure in a normal build, but it depends on
what garbage bytes are found after the heap buffer. Building with
SANITIZE=address reliably notices the problem. The outcome (both the
exit code and the exact bytes) are just what Git happens to produce for
these cases today, and shouldn't be taken as an endorsement. It might be
reasonable to abort on an unterminated string, for example. The priority
for this patch is fixing the out-of-bounds memory access.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When the config parser sees an "implicit" bool like:
[core]
someVariable
it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.
These are all fairly vanilla cases where the solution is just the usual
pattern of:
if (!value)
return config_error_nonbool(var);
though note that in a few cases we have to split initializers like:
int some_var = initializer();
into:
int some_var;
if (!value)
return config_error_nonbool(var);
some_var = initializer();
There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into its own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Consistently spell "Message-ID" as such, not "Message-Id".
* jc/spell-id-in-both-caps-in-message-id:
e-mail workflow: Message-ID is spelled with ID in both capital letters
|
|
We used to write "Message-Id:" and "Message-ID:" pretty much
interchangeably, and the header name is defined to be case
insensitive by the RFCs, but the canonical form "Message-ID:" is
used throughout the RFC documents, so let's imitate it ourselves.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
|
|
hex.c contains code for hex-related functions, but for some reason these
functions were declared in the catch-all cache.h. Move the function
declarations into a hex.h header instead.
This also allows us to remove includes of cache.h from a few C files.
For now, we make cache.h include hex.h, so that it is easier to review
the direct changes being made by this patch. In the next patch, we will
remove that, and add the necessary direct '#include "hex.h"' in the
hundreds of C files that need it.
Note that reviewing the header changes in this commit might be
simplified via
git log --no-walk -p --color-moved $COMMIT -- '*.h'`
In particular, it highlights the simple movement of code in .h files
rather nicely.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To remove bracketed strings containing "PATCH" from the subject line
cleanup_subject() scans the subject for the opening bracket using an
offset from the beginning of the line. It then searches for the
closing bracket with strchr(). To calculate the length of the
bracketed string it unfortunately adds rather than subtracts the
offset from the result of strchr(). This leads to an out of bounds
access in memmem() when looking to see if the brackets contain
"PATCH".
We have tests that trigger this bug that were added in ae52d57f0b
(t5100: add some more mailinfo tests, 2017-05-31). The commit message
mentions that they are marked test_expect_failure as they trigger an
assertion in strbuf_splice(). While it is reassuring that
strbuf_splice() detects the problem and dies in retrospect that should
perhaps have warranted a little more investigation. The bug was
introduced by 17635fc900 (mailinfo: -b option keeps [bracketed]
strings that is not a [PATCH] marker, 2009-07-15). I think the reason
it has survived so long is that '-b' is not a popular option and
without it the offset is always zero.
This was found by the address sanitizer while I was cleaning up the
test_todo idea in [1].
[1] https://lore.kernel.org/git/db558292-2783-3270-4824-43757822a389@gmail.com/
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code clean-up.
* ar/mailinfo-memcmp-to-skip-prefix:
mailinfo: use starts_with() when checking scissors
|
|
We historically rejected a very short string as an author name
while accepting a patch e-mail, which has been loosened.
* ef/mailinfo-short-name:
mailinfo: don't discard names under 3 characters
|
|
Existing checks for scissors characters using memcmp(3) never read past
the end of the line, because all substrings we are interested in are two
characters long, and the outer loop guarantees we have at least one
character. So at most we will look at the NUL.
However, this is too subtle and may lead to bugs in code which copies
this behavior without realizing substring length requirement. So use
starts_with() instead, which will stop at NUL regardless of the length
of the prefix. Remove extra pair of parentheses while we are here.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
I sometimes receive patches from people with short mononyms, and in my
cultural environment these are not uncommon. To my dismay, git-am
currently discards their names, and replaces them with their email
addresses.
Link: https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
Signed-off-by: edef <edef@edef.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.
* dd/mailinfo-quoted-cr:
am: learn to process quoted lines that ends with CRLF
mailinfo: allow stripping quoted CR without warning
mailinfo: allow squelching quoted CRLF warning
mailinfo: warn if CRLF found in decoded base64/QP email
mailinfo: stop parsing options manually
mailinfo: load default metainfo_charset lazily
|
|
In previous changes, we've turned on warning for quoted CR in base64 or
quoted-printable email messages. Some projects see those quoted CR a lot,
they know that it happens most of the time, and they find it's desirable
to always strip those CR.
Those projects in question usually fall back to use other tools to handle
patches when receive such patches.
Let's help those projects handle those patches by stripping those
excessive CR.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.
Let's give them an option to turn off the warning completely.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When SMTP servers receive 8-bit email messages, possibly with only
LF as line ending, some of them decide to change said LF to CRLF.
Some mailing list softwares, when receive 8-bit email messages,
decide to encode those messages in base64 or quoted-printable.
If an email is transfered through above mail servers, then distributed
by such mailing list softwares, the recipients will receive an email
contains a patch mungled with CRLF encoded inside another encoding.
Thus, such CR (in CRLF) couldn't be dropped by "mailsplit".
Hence, the mailed patch couldn't be applied cleanly.
Such accidents have been observed in the wild [1].
Instead of silently rejecting those messages, let's give our users
some warnings if such CR (as part of CRLF) is found.
[1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
mailinfo.p_hdr_info/s_hdr_info are null-terminated lists of strbuf's,
with entries pointing either to NULL or an allocated strbuf. Therefore
we need to free those strbuf's (and not just the data they contain)
whenever we're done with a given entry. (See handle_header() where those
new strbufs are malloc'd.)
Once we no longer need the list (and not just its entries) we can switch
over to strbuf_list_free() instead of manually iterating over the list,
which takes care of those additional details for us. We can only do this
in clear_mailinfo() - in handle_commit_message() we are only clearing the
array contents but want to reuse the array itself, hence we can't use
strbuf_list_free() there.
However, strbuf_list_free() cannot handle a NULL input, and the lists we
are freeing might be NULL. Therefore we add a NULL check in
strbuf_list_free() to make it safe to use with a NULL input (which is a
pattern used by some of the other *_free() functions around git).
Leak output from t0023:
Direct leak of 72 byte(s) in 3 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac9f4 in do_xmalloc wrapper.c:41:8
#2 0x9ac9ca in xmalloc wrapper.c:62:9
#3 0x7f6cf7 in handle_header mailinfo.c:205:10
#4 0x7f5abf in check_header mailinfo.c:583:4
#5 0x7f5524 in mailinfo mailinfo.c:1197:3
#6 0x4dcc95 in parse_mail builtin/am.c:1167:6
#7 0x4d9070 in am_run builtin/am.c:1732:12
#8 0x4d5b7a in cmd_am builtin/am.c:2398:3
#9 0x4cd91d in run_builtin git.c:467:11
#10 0x4cb5f3 in handle_builtin git.c:719:3
#11 0x4ccf47 in run_argv git.c:808:4
#12 0x4caf49 in cmd_main git.c:939:19
#13 0x69e43e in main common-main.c:52:11
#14 0x7fc1fadfa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 72 byte(s) leaked in 3 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're passing buffer from strbuf to reencode_string,
which will call strlen(3) on that buffer,
and discard the length of newly created buffer.
Then, we compute the length of the return buffer to attach to strbuf.
During this process, we introduce a discrimination between mail
originally written in utf-8 and other encoding.
* if the email was written in utf-8, we leave it as is. If there is
a NUL character in that line, we complains loudly:
error: a NUL byte in commit log message not allowed.
* if the email was written in other encoding, we truncate the data as
the NUL character in that line, then we used the truncated line for
the metadata.
We can do better by reusing all the available information,
and call the underlying lower level function that will be called
indirectly by reencode_string. By doing this, we will also postpone
the NUL character processing to the commit step, which will
complains about the faulty metadata.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Code cleanup.
* rs/micro-cleanups:
use strpbrk(3) to search for characters from a given set
quote: use isalnum() to check for alphanumeric characters
|
|
We can check if certain characters are present in a string by calling
strchr(3) on each of them, or we can pass them all to a single
strpbrk(3) call. The latter is shorter, less repetitive and slightly
more efficient, so let's do that instead.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We do the same thing for each header: match it, copy it to a strbuf, and
decode it. Let's put that in a helper function to avoid repetition.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
RFC822 and friends allow arbitrary whitespace after the colon of a
header and before the values. I.e.:
Subject:foo
Subject: foo
Subject: foo
all have the subject "foo". But mailinfo requires exactly one space.
This doesn't seem to be bothering anybody, but it is pickier than the
standard specifies. And we can easily just soak up arbitrary whitespace
there in our parser, so let's do so.
Note that the test covers both too little and too much whitespace, but
the "too much" case already works fine (because we later eat leading and
trailing whitespace from the values).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Our code to parse header values first checks to see if a line starts
with a header, and then manually skips past the matched string to find
the value. We can do this all in one step by modeling after
skip_prefix(), which returns a pointer into the string after the
parsing.
This lets us remove some repeated strings, and will also enable us to
parse more flexibly in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We read each header line into a strbuf, which means that we could
in theory handle header values with embedded NUL bytes. But in practice,
the values we parse out are passed to decode_header(), which uses
strstr(), strchr(), etc. And we would not expect such bytes anyway; they
are forbidden by RFC822, etc. and any non-ASCII characters should be
encoded with RFC2047 encoding.
So let's switch to using strbuf_addstr(), which saves us some length
computations (and will enable further cleanups in this code).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
handle_content_type() only cares about the value after "Content-Type: ";
there is no need to insert that string for it.
Suggested-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a function for inserting a C string into a strbuf. Use it
throughout the source to get rid of magic string length constants and
explicit strlen() calls.
Like strbuf_addstr(), implement it as an inline function to avoid the
implicit strlen() calls to cause runtime overhead.
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add best-effort support for patches sent using format=flowed (RFC 3676).
Remove leading spaces ("unstuff"), remove soft line breaks (indicated
by space + newline), but leave the signature separator (dash dash space
newline) alone.
Warn in git am when encountering a format=flowed patch, because any
trailing spaces would most probably be lost, as the sending MUA is
encouraged to remove them when preparing the email.
Provide a test patch formatted by Mozilla Thunderbird 60 using its
default configuration. It reuses the contents of the file mailinfo.c
before and after this patch.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae55
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Corner case bugfix.
* jc/mailinfo-cleanup-fix:
mailinfo: avoid segfault when can't open files
|
|
If <msg> or <patch> files can't be opened, then mailinfo() returns an
error before it even initializes mi->p_hdr_data or mi->s_hdr_data.
When cmd_mailinfo() then calls clear_mailinfo(), we dereference the
NULL pointers trying to free their contents.
Signed-off-by: Juan F. Codagnone <jcodagnone@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git mailinfo" was loose in decoding quoted printable and produced
garbage when the two letters after the equal sign are not
hexadecimal. This has been fixed.
* rs/mailinfo-qp-decode-fix:
mailinfo: don't decode invalid =XY quoted-printable sequences
|
|
Decode =XY in quoted-printable segments only if X and Y are hexadecimal
digits, otherwise just copy them. That's at least better than
interpreting negative results from hexval() as a character.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Gcc 7 adds -Wimplicit-fallthrough, which can warn when a
switch case falls through to the next case. The general idea
is that the compiler can't tell if this was intentional or
not, so you should annotate any intentional fall-throughs as
such, leaving it to complain about any unannotated ones.
There's a GNU __attribute__ which can be used for
annotation, but of course we'd have to #ifdef it away on
non-gcc compilers. Gcc will also recognize
specially-formatted comments, which matches our current
practice. Let's extend that practice to all of the
unannotated sites (which I did look over and verify that
they were behaving as intended).
Ideally in each case we'd actually give some reasons in the
comment about why we're falling through, or what we're
falling through to. And gcc does support that with
-Wimplicit-fallthrough=2, which relaxes the comment pattern
matching to anything that contains "fallthrough" (or a
variety of spelling variants). However, this isn't the
default for -Wimplicit-fallthrough, nor for -Wextra. In the
name of simplicity, it's probably better for us to support
the default level, which requires "fallthrough" to be the
only thing in the comment (modulo some window dressing like
"else" and some punctuation; see the gcc manual for the
complete set of patterns).
This patch suppresses all warnings due to
-Wimplicit-fallthrough. We might eventually want to add that
to the DEVELOPER Makefile knob, but we should probably wait
until gcc 7 is more widely adopted (since earlier versions
will complain about the unknown warning type).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Clean up at the end and jump there instead of returning early.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
|
|
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
|
|
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While POSIX states that it is okay to pass EOF to isspace() (and it seems
to be implied that EOF should *not* be treated as whitespace), and also to
pass EOF to ungetc() (which seems to be intended to fail without buffering
the character), it is much better to handle these cases explicitly. Not
only does it reduce head-scratching (and helps static analysis avoid
reporting false positives), it also lets us handle files containing
nothing but whitespace by erroring out.
Reported via Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If a patch e-mail had its first paragraph after an in-body header
indented (even after a blank line after the in-body header line),
the indented line was mistook as a continuation of the in-body
header. This has been fixed.
* lt/mailinfo-in-body-header-continuation:
mailinfo: fix in-body header continuations
|
|
An empty line should stop any pending in-body headers, and start the
actual body parsing.
This also modifies the original test for the in-body headers to actually
have a real commit body that starts with spaces, and changes the test to
check that the long line matches _exactly_, and doesn't get extra data
from the body.
Fixes:6b4b013f1884 ("mailinfo: handle in-body header continuations")
Cc: Jonathan Tan <jonathantanmy@google.com>
Cc: Jeff King <peff@peff.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix for NDEBUG builds.
* jt/mailinfo-fold-in-body-headers:
mailinfo.c: move side-effects outside of assert
|
|
Since 6b4b013f18 (mailinfo: handle in-body header continuations,
2016-09-20, v2.11.0) mailinfo.c has contained new code with an
assert of the form:
assert(call_a_function(...))
The function in question, check_header, has side effects. This
means that when NDEBUG is defined during a release build the
function call is omitted entirely, the side effects do not
take place and tests (fortunately) start failing.
Since the only time that mi->inbody_header_accum is appended to is
in check_inbody_header, and appending onto a blank
mi->inbody_header_accum always happens when is_inbody_header is
true, this guarantees a prefix that causes check_header to always
return true.
Therefore replace the assert with an if !check_header + DIE
combination to reflect this.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
An author name, that spelled a backslash-quoted double quote in the
human readable part "My \"double quoted\" name", was not unquoted
correctly while applying a patch from a piece of e-mail.
* kd/mailinfo-quoted-string:
mailinfo: unescape quoted-pair in header fields
t5100-mailinfo: replace common path prefix with variable
|
|
rfc2822 has provisions for quoted strings in structured header fields,
but also allows for escaping these with so-called quoted-pairs.
The only thing git currently does is removing exterior quotes, but
quotes within are left alone.
Remove exterior quotes and remove escape characters so that they don't
show up in the author field.
Signed-off-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:
From: author <author@example.com>
Date: Fri, 9 Jun 2006 00:44:16 -0700
Subject: a very long
broken line
Subject: another very long
broken line
interpret the in-body subject to be "another very long broken line"
instead of "another very long".
An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Small code clean-up.
* rs/mailinfo-lib:
mailinfo: recycle strbuf in check_header()
|
|
Small code clean-up.
* rs/mailinfo-lib:
mailinfo: recycle strbuf in check_header()
|
|
handle_message_id() duplicates the contents of the strbuf that is passed
to it. Its only caller proceeds to release the strbuf immediately after
that. Reuse it instead and make that change of object ownership more
obvious by inlining this short function.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
git_mailinfo_config() expects "struct mailinfo *". But in
setup_mailinfo(), "mi" is already "struct mailinfo *". &mi would make
it "struct mailinfo **" and git_mailinfo_config() would damage some
other memory when it assigns some value to mi->use_scissors.
This is caught by t4150.20. git_mailinfo_config() breaks
mi->name.alloc and makes strbuf_release() in clear_mailinfo() attempt
to free strbuf_slopbuf.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The top-level mailinfo() would instead punt when the code in the
deeper part of the callchain detects an unrecoverable error in the
input.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Instead of dying in convert_to_utf8(), just report an error and let
the callers handle it. Between the two callers:
- decode_header() silently punts when it cannot parse a broken
RFC2047 encoded text (e.g. when it sees anything other than B or
Q after it sees "=?<charset>") by jumping to release_return,
returning the string it successfully parsed out so far, to the
caller. A piece of string that convert_to_utf8() cannot handle
can be treated the same way.
- handle_commit_msg() doesn't cope with a malformed line well, so
die there for now. We'll lift this even higher in later changes
in this series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move the bulk of the code from builtin/mailinfo.c to mailinfo.c
so that new callers can start calling mailinfo() directly.
Note that a few calls to exit() and die() need to be cleaned up
for the API to be truly useful, which will come in later steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
[jc: with a bit of constness tightening]
Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
[jc: this is based on Eric's patch but also fixes up the parsed
subject headers].
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
It was pointed out that the current behaviour might mispart a patch comment
so remove this behaviour for now.
[jc: this fixes "From: line in the middle" check in t5100 test.]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
* jc/mailinfo:
mailinfo: skip bogus UNIX From line inside body
|
|
We exited prematurely from header parsing loop when the header
field did not have a space after the colon but we insisted on
it, and we got the check wrong because we forgot that we strip
the trailing whitespace before we do the check.
The space after the colon is not even required by RFC2822, so
stop requiring it. While we are at it, the header line is
specified to be more strict than "anything with a colon in it"
(there must be one or more characters before the colon, and they
must not be controls, SP or non US-ASCII), so implement that
check as well, lest we mistakenly think something like:
Bogus not a header line: this is not.
as a header line.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
- handle_from is fixed to not mangle it's input line.
- Then handle_inbody_header is allowed to look in
the body of a commit message for additional headers
that we haven't already seen.
This allows patches with all of the right information in
unfortunate places to be imported.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Only count lines of the form '^.*: ' and '^From ' as email
header lines.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This prepares for detecting non-email patches that don't have
mail headers. In which case we have already read the first
line so handle_body should not ignore it.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
- Move handle_info into main so it is called once
after everything has been parsed. This allows the removal
of a static variable and removes two duplicate calls.
- Move parsing of inbody headers into handle_commit.
This means we parse the in-body headers after we have decoded
the character set, and it removes code duplication between
handle_multipart_one_part and handle_body.
- Change the flag indicating that we have seen an in body
prefix header into another bit in seen.
This is a little more general and allows the possibility of parsing
in body headers after the body message has begun.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
B and Q decoding is not appropriate for in body headers, so move
it up to where we explicitly know we have a real email header.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Currently we only use the return value from read_one_header line
to tell if the line we have read is a header or not. So make
it a flag. This paves the way for better email detection.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Sometimes people just include the whole format-patch output in
the commit e-mail. Detect it and skip the bogus ">From " line.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Quoted-Printable (RFC 2045) and the "Q" encoding (RFC 2047) are
subtly different; the latter is used on the mail header and an
underscore needs to be decoded to 0x20.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Systems using some uClibc versions do not properly support
iconv stuff. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is mailinfo won't do charset conversion in those
systems.
Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
If the first part uses quoted-printable to protect iso8859-1
name in the commit log, and the second part was plain ascii text
patchfile without even Content-Transfer-Encoding subheader, we
incorrectly tried to decode the patch as quoted printable.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This was a stupid typo that did not follow
http://www.iana.org/assignments/character-sets
Long noticed but neglected by JC, but finally reported by
Marco.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
An isolated developer could have a local-only e-mail, which will
be stripped out by mailinfo because it lacks '@'. Define a
fallback parser to accomodate that.
At the same time, reject authorless patch in git-am.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Added an AIX clause in the Makefile; that clause likely
will be wrong for any AIX pre-5.2, but I can only test
on 5.3. mailinfo.c was missing the compat header file,
and convert-objects.c needs to define a specific
_XOPEN_SOURCE as well as _XOPEN_SOURCE_EXTENDED.
Signed-off-by: E. Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This attempts to clean up the way various compatibility
functions are defined and used.
- A new header file, git-compat-util.h, is introduced. This
looks at various NO_XXX and does necessary function name
replacements, equivalent of -Dstrcasestr=gitstrcasestr in the
Makefile.
- Those function name replacements are removed from the Makefile.
- Common features such as usage(), die(), xmalloc() are moved
from cache.h to git-compat-util.h; cache.h includes
git-compat-util.h itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Specifying the value for a single letter, single dash option
parameter with equal sign looked funny, and more importantly
calling the flag to override encoding from utf-8 to something
else "-u" (obviously abbreviated from "utf-8") did not make any
sense. So spell it out.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This uses i18n.commitencoding configuration item to pick up the
default commit encoding for the repository when converting form
e-mail encoding to commit encoding (the default is utf8).
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
When the message body does not identify what encoding it is in,
-u assumes it is in latin-1 and converts it to utf8, which is
the recommended encoding for git commit log messages.
With -u=<encoding>, the conversion is made into the specified
one, instead of utf8, to allow project-local policies.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Borrow from NO_MMAP patch by Johannes, squelch compiler warnings by
declaring gitstrcasestr() when we use it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Also make platform specific part more isolated. Currently we only
have Darwin defined, but I've taken a look at SunOS specific patch
(which I dropped on the floor for now) as well. Doing things this way
would make adding it easier.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|