The following command prints these lines to the console:
git log --pretty=format:"%h;%ai;%s" --shortstat
ed6e0ab;2014-01-07 16:32:39 +0530;Foo
3 files changed, 14 insertions(+), 13 deletions(-)
cdfbb10;2014-01-07 14:59:48 +0530;Bar
1 file changed, 21 insertions(+)
5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz
772b277;2014-01-06 17:09:42 +0530;Qux
7 files changed, 72 insertions(+), 7 deletions(-)
I would like the output to be displayed like this instead:
ed6e0ab;2014-01-07 16:32:39 +0530;Foo;3;14;13
cdfbb10;2014-01-07 14:59:48 +0530;Bar;1;21;0
5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz;0;0;0
772b277;2014-01-06 17:09:42 +0530;Qux;7;72;7
This will be consumed by a report that can parse semicolon-separated values.
The point is that the text "\n 3 files changed, 14 insertions(+), 13 deletions(-)" (newline included) gets converted to ;3;14;13 (without the newline).
One corner case is a line like "5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz", which has no stat line after it. In that case I want ;0;0;0 appended.
Overall, the goal is to analyze file change stats over a period of time. I read the git log documentation but couldn't find any format option that renders the output this way. The best I came up with was the command above.
So any command or shell script that can generate the expected format would be of great help.
Thanks!
This is, unfortunately, impossible to achieve using only git log. One has to use other scripts to compensate for something most people aren't aware of: some commits don't have stats, even if they are not merges.
I have been working on a project that converts git log to JSON, and to get it done I had to do what you need: get each commit, with stats, in one line. The project is called Gitlogg and you're welcome to tweak it to your needs: https://github.com/dreamyguy/gitlogg
Below is the relevant part of Gitlogg, that will get you close to what you'd like:
git log --all --no-merges --shortstat --reverse --pretty=format:'commits\tcommit_hash\t%H\tcommit_hash_abbreviated\t%h\ttree_hash\t%T\ttree_hash_abbreviated\t%t\tparent_hashes\t%P\tparent_hashes_abbreviated\t%p\tauthor_name\t%an\tauthor_name_mailmap\t%aN\tauthor_email\t%ae\tauthor_email_mailmap\t%aE\tauthor_date\t%ad\tauthor_date_RFC2822\t%aD\tauthor_date_relative\t%ar\tauthor_date_unix_timestamp\t%at\tauthor_date_iso_8601\t%ai\tauthor_date_iso_8601_strict\t%aI\tcommitter_name\t%cn\tcommitter_name_mailmap\t%cN\tcommitter_email\t%ce\tcommitter_email_mailmap\t%cE\tcommitter_date\t%cd\tcommitter_date_RFC2822\t%cD\tcommitter_date_relative\t%cr\tcommitter_date_unix_timestamp\t%ct\tcommitter_date_iso_8601\t%ci\tcommitter_date_iso_8601_strict\t%cI\tref_names\t%d\tref_names_no_wrapping\t%D\tencoding\t%e\tsubject\t%s\tsubject_sanitized\t%f\tcommit_notes\t%N\tstats\t' |
sed '/^[ \t]*$/d' | # remove blank lines, including those that contain only whitespace
tr '\n' 'ò' | # convert newlines/line-breaks to a character, so we can manipulate it without much trouble
tr '\r' 'ò' | # convert carriage returns to a character, so we can manipulate it without much trouble
sed 's/tòcommits/tòòcommits/g' | # because some commits have no stats, we have to create an extra line-break to make `paste -d ' ' - -` consistent
tr 'ò' '\n' | # bring back all line-breaks
sed '{
N
s/[)]\n\ncommits/)\
commits/g
}' | # some rogue mystical line-breaks need to go down to their knees and beg for mercy, which they're not getting
paste -d ' ' - - # collapse lines so that the `shortstat` is merged with the rest of the commit data, on a single line
Note that I've used the tab character (\t) to separate fields, as ; could have been used in the commit message.
Another important part of this script is that each line must begin with a unique string (in this case it's commits). That's because our script needs to know where the line begins. In fact, whatever comes after the git log command is there to compensate for the fact that some commits might not have stats.
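For the semicolon format asked about in the question, the same "fold the stat line into the commit line" idea can be sketched more compactly with awk. This is only a minimal sketch, not part of Gitlogg: the function name flatten_shortstat is made up for illustration, and it assumes each commit line comes first, optionally followed by one --shortstat line, and that the commit subject itself never looks like a stat line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gitlogg): reads `git log --shortstat`
# output on stdin and appends ";files;insertions;deletions" to each commit
# line. Commits without a stat line get ";0;0;0".
flatten_shortstat() {
  awk '
    # Print the pending commit line together with its counters.
    function flush() {
      if (have) printf "%s;%d;%d;%d\n", header, files, ins, del
    }
    # A shortstat line, e.g. " 3 files changed, 14 insertions(+), 13 deletions(-)".
    /^ *[0-9]+ files? changed/ {
      files = $1
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^insertion/) ins = $(i - 1)
        if ($i ~ /^deletion/)  del = $(i - 1)
      }
      next
    }
    # Skip the blank lines git prints between commits.
    /^[ \t]*$/ { next }
    # Anything else is treated as a new commit line: emit the previous one,
    # then start over with zeroed counters.
    {
      flush()
      header = $0; files = 0; ins = 0; del = 0; have = 1
    }
    END { flush() }
  '
}

# Usage (run inside a repository):
#   git log --pretty=format:"%h;%ai;%s" --shortstat | flatten_shortstat
```

Because the counters are reset to zero whenever a new commit line is seen, merges and other commits without stats fall out as ;0;0;0 without any special casing. A subject containing ; would still confuse the consumer, which is the reason a tab separator is used above.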
But it strikes me that what you want to achieve is to have commits neatly output in a format you can reliably consume. Gitlogg is perfect for that! Some of its features are:
- Parses the git log of multiple repositories into one JSON file.
- Adds a repository key/value.
- Adds files changed, insertions and deletions keys/values.
- Adds an impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
- Sanitizes double quotes " by converting them to single quotes ' on all values that allow or are created by user input, like subject.
- Nearly all the pretty=format: placeholders are available.
- Lets you include or exclude keys/values from the JSON by commenting out/uncommenting the available ones.
[Image: "Success, the JSON was parsed and saved."]