Commit 617c2c2

Fixed #1
1 parent 2ae0437 commit 617c2c2

2 files changed: 43 additions and 25 deletions

README.md

Lines changed: 18 additions & 6 deletions
@@ -12,31 +12,43 @@ Solutions that I've found online looked at changes to files irrespective whether
 *Tested with Python version 3.5.3 and Git version 2.20.1*
 
 # How it works
-This lightweight script looks at a range of commits per author. For each commit it bookkeeps the files that were changed along with the LOC for each file. LOC are kept in a sparse structure and changes per LOC are taken into account as the program loops. When a change to the same LOC is detected it updates this separately to bookkeep the true code churn.
-Result is a print with aggregated contribution and churn per author for a given time period.
+This lightweight script looks at commits per author for a given date range on the **current branch**. For each commit it bookkeeps the files that were changed along with the LOC for each file. LOC are kept in a sparse structure and changes per LOC are taken into account as the program loops. When a change to the same LOC is detected it updates this separately to bookkeep the true code churn.
+Result is a print with aggregated contribution and churn per author for a given period in time.
 
 ***Note:*** This includes the `--no-merges` flag as it assumes that merge commits with or without merge conflicts are not indicative of churn.
 
 # Usage
 Positional (required) arguments:
 - **after**        after a certain date, in YYYY[-MM[-DD]] format
 - **before**     before a certain date, in YYYY[-MM[-DD]] format
-- **author**     author string (not committer)
+- **author**     author string (not a committer), leave blank to scope all authors
 - **dir**            include Git repository directory
 
 Optional arguments:
 - **-h, --h, --help**    show this help message and exit
 - **-exdir**                   exclude Git repository subdirectory
 
-## Example
+## Usage Example 1
 ```bash
 python ./gitcodechurn.py after="2018-11-29" before="2019-03-01" author="an author" dir="/Users/myname/myrepo" -exdir="excluded-directory"
 ```
-## Output
+## Output 1
 ```bash
 author: an author
 contribution: 844
 churn: -28
 ```
-Outputs can be used as part of a pipeline that generates bar charts for reports:
+
+## Usage Example 2
+```bash
+python ./gitcodechurn.py after="2018-11-29" before="2019-03-01" author="" dir="/Users/myname/myrepo" -exdir="excluded-directory"
+```
+## Output 2
+```bash
+authors: author1, author2, author3
+contribution: 4423
+churn: -543
+```
+
+Outputs of Usage Example 1 can be used as part of a pipeline that generates bar charts for reports:
 ![contribution vs churn example chart](/chart.png)
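
Note on the pipeline mentioned in the README: the commit does not show how the printed output is consumed. As a minimal sketch, assuming the output keeps the `key: value` shape shown in Output 1, a downstream step could turn it into a dictionary before charting. The function name `parse_churn_output` is hypothetical and not part of the script.

```python
def parse_churn_output(text):
    """Parse 'key: value' lines, as printed in Output 1, into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if not sep:
            continue  # skip lines that do not look like 'key: value'
        key, value = key.strip(), value.strip()
        # contribution and churn are integers; author(s) stay as text
        if key in ('contribution', 'churn'):
            stats[key] = int(value)
        else:
            stats[key] = value
    return stats


# Sample text mirroring the tab-separated prints in gitcodechurn.py
sample = "author: \t an author\ncontribution: \t 844\nchurn: \t\t -28\n"
print(parse_churn_output(sample))
```

The resulting dictionary can then be handed to whatever charting step the report uses.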

gitcodechurn.py

Lines changed: 25 additions & 19 deletions
@@ -8,16 +8,16 @@
 Code churn has several definitions, the one that to me provides the
 most value as a metric is:
 
-"Code churn is when an engineer
-rewrites their own code in a short period of time."
+"Code churn is when an engineer rewrites their own code in a short time period."
 
 Reference: https://blog.gitprime.com/why-code-churn-matters/
 
-This lightweight script looks at a range of commits per author. For each commit
-it book-keeps the files that were changed along with the lines of code (LOC)
-for each file. LOC are kept in a sparse structure and changes per LOC are taken
-into account as the program loops. When a change to the same LOC is detected it
-updates this separately to bookkeep the true code churn.
+This lightweight script looks at commits per author for a given date range on
+the default branch. For each commit it bookkeeps the files that were changed
+along with the lines of code (LOC) for each file. LOC are kept in a sparse
+structure and changes per LOC are taken into account as the program loops. When
+a change to the same LOC is detected it updates this separately to bookkeep the
+true code churn.
 
 Result is a print with aggregated contribution and churn per author for a
 given time period.
@@ -41,30 +41,30 @@ def main():
     parser.add_argument(
         'after',
         type = str,
-        help = 'after a certain date, in YYYY[-MM[-DD]] format'
+        help = 'search after a certain date, in YYYY[-MM[-DD]] format'
     )
     parser.add_argument(
         'before',
         type = str,
-        help = 'before a certain date, in YYYY[-MM[-DD]] format'
+        help = 'search before a certain date, in YYYY[-MM[-DD]] format'
     )
     parser.add_argument(
         'author',
         type = str,
-        help = 'author string (not committer)'
+        help = 'an author (non-committer), leave blank to scope all authors'
     )
     parser.add_argument(
         'dir',
         type = dir_path,
         default = '',
-        help = 'include Git repository directory'
+        help = 'the Git repository root directory to be included'
     )
     parser.add_argument(
         '-exdir',
         metavar='',
         type = str,
         default = '',
-        help = 'exclude Git repository subdirectory'
+        help = 'the Git repository subdirectory to be excluded'
     )
     args = parser.parse_args()
 
@@ -100,7 +100,13 @@ def main():
             exdir
         )
 
-    print('author: \t', author)
+    # if author is empty then print a unique list of authors
+    if len(author.strip()) == 0:
+        authors = set(get_commits(before, after, author, dir, '%an')).__str__()
+        authors = authors.replace('{', '').replace('}', '').replace("'","")
+        print('authors: \t', authors)
+    else:
+        print('author: \t', author)
     print('contribution: \t', contribution)
     print('churn: \t\t', -churn)
     # print files in case more granular results are needed
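
Note on the author list: the hunk above builds the list by stringifying a `set` and stripping braces and quotes, which would also strip an apostrophe that is part of a name and gives no stable ordering. A minimal alternative sketch, assuming the input is a list of names like the one `get_commits(..., '%an')` returns, is to deduplicate and join explicitly. The helper name `format_authors` is hypothetical and not part of the commit.

```python
def format_authors(names):
    """Return unique author names as a sorted, comma-separated string."""
    # Deduplicate with a set comprehension, ignoring blank entries
    unique = {name.strip() for name in names if name.strip()}
    # Sorting makes the printed order reproducible between runs
    return ', '.join(sorted(unique))


names = ["author2", "author1", "author2", "Pat O'Brien"]
print('authors: \t', format_authors(names))
```

Joining the names directly leaves characters such as `'`, `{`, and `}` inside a name untouched.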
@@ -139,8 +145,9 @@ def get_loc(commit, dir, files, contribution, churn, exdir):
                 continue
     return [files, contribution, churn]
 
-# arrives in a format such as -13 +27,5 (no decimals == 1 loc change)
-# returns a dictionary where left are removals and right are additions
+# arrives in a format such as -13 +27,5 (no commas mean 1 loc change)
+# this is the chunk header where '-' is old and '+' is new
+# it returns a dictionary where left are removals and right are additions
 # if the same line got changed we subtract removals from additions
 def get_loc_change(loc_changes):
     # removals
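
Note on the chunk header format: the body of `get_loc_change` is not part of this hunk, so the following is only a sketch of how a header such as `-13 +27,5` can be read, not the script's implementation. In unified diff output, each range is `start[,count]`, and an omitted count means one line. The names `HUNK_RANGES` and `parse_hunk_ranges` are hypothetical.

```python
import re

# '-start[,count] +start[,count]'; a missing count means exactly one line
HUNK_RANGES = re.compile(r'-(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))?')


def parse_hunk_ranges(header):
    """Return (start, count) pairs for the old and new side of a hunk."""
    match = HUNK_RANGES.search(header)
    if match is None:
        raise ValueError('not a hunk header: ' + header)
    old_start, old_count, new_start, new_count = match.groups()
    return {
        'old': (int(old_start), 1 if old_count is None else int(old_count)),
        'new': (int(new_start), 1 if new_count is None else int(new_count)),
    }


print(parse_hunk_ranges('-13 +27,5'))
print(parse_hunk_ranges('@@ -8,16 +8,16 @@'))
```

A count of `0` is also valid and is kept as zero by this sketch.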
@@ -186,12 +193,11 @@ def is_new_file(result, file):
     else:
         return file
 
-def get_commits(before, after, author, dir):
+# use format='%an' to get a list of author names
+def get_commits(before, after, author, dir, format='%h'):
     # note --no-merges flag (usually we coders do not overhaul contrib commits)
     # note --reverse flag to traverse history from past to present
-    before = format_date(before)
-    after = format_date(after)
-    command = 'git log --author="'+author+'" --format="%h" --no-abbrev '
+    command = 'git log --author="'+author+'" --format="'+format+'" --no-abbrev '
     command += '--before="'+before+'" --after="'+after+'" --no-merges --reverse'
     return get_proc_out(command, dir).splitlines()

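
Note on building the command: the hunk above concatenates user input into a quoted command string, and the new `format` parameter shadows Python's built-in `format`. As a sketch only, assuming the same `git log` flags as in the hunk, the command can be built as an argument list so that an author name containing quotes needs no escaping. The names `build_git_log_args`, `run_git_log`, and `fmt` are hypothetical; `get_proc_out` is not shown in the commit, so `subprocess.run` stands in for it here.

```python
import subprocess


def build_git_log_args(before, after, author, fmt='%h'):
    """Build the git log call as a list; no shell quoting is involved."""
    return [
        'git', 'log',
        '--author=' + author,
        '--format=' + fmt,
        '--no-abbrev',
        '--before=' + before,
        '--after=' + after,
        '--no-merges',   # merge commits are not counted as churn
        '--reverse',     # traverse history from past to present
    ]


def run_git_log(args, repo_dir):
    """Run the command inside repo_dir and return its output lines."""
    result = subprocess.run(
        args, cwd=repo_dir, stdout=subprocess.PIPE,
        universal_newlines=True, check=True
    )
    return result.stdout.splitlines()


args = build_git_log_args('2019-03-01', '2018-11-29', 'an author', '%an')
print(args)
```

Calling `run_git_log(args, '/Users/myname/myrepo')` would then return one line per matching commit.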