Iterating Git commit history using Qi

In the Code­less Code, a cre­ative blog on soft­ware engi­neer­ing, there’s this scribe named Qi and he is a total badass. Qi is source con­trol per­son­i­fied; a prag­mat­ic sto­ry­teller that rea­sons about sys­tems as the sum total of all that came before.

I wrote the qi CLI tool (GitHub, PyPi) as a hat-tip to the author—and with his gra­cious per­mis­sion for use of the name. The idea is to bring devel­op­ers lever­age in dis­cus­sions with man­age­ment through a notice­ably less vio­lent vari­ant of Case 89qi is only a Python wrap­per to git that prints revi­sion hash­es at the end of fixed time inter­vals. This enables run­ning an analy­sis at, say, the end of every week over the past two quar­ters to check if a team deci­sion in Jan­u­ary is hav­ing an intend­ed effect in June.

This helps devel­op­ers and man­agers accept the same facts under pres­sure. I once worked in a high­ly active JS shop with over 30 devel­op­ers and over one hun­dred com­mits every day. We had Bam­boo run­ning builds and auto­mat­ed tests every so often to catch prob­lem­at­ic changes, but it was hard to know if our changes this sprint fol­lowed a spe­cif­ic “direc­tion” for sys­tem qual­i­ty, what­ev­er that meant.  After I wrote and ran qi, I dis­cov­ered that the team was, on aver­age, cre­at­ing sev­en pro­duc­tion code files for every one test suite. To make mat­ters worse, rough­ly 20% of our com­bined JS and CSS went com­plete­ly unused.

We all knew that these things hap­pened, but once I had the num­bers to show that mat­ters were get­ting worse due to man­age­ment pres­sure, we re-pri­or­i­tized to allow more time to increase code cov­er­age for crit­i­cal tests. Qi knew that com­mit his­to­ry could not just be read, but lever­aged.

qi is use­ful when one of the fol­low­ing con­di­tions apply to you:

  • You want to gath­er facts and cor­re­la­tions to study the code’s his­to­ry, review recent changes, or to prove a point.
  • You have no way to mon­i­tor the state of a code­base in terms of met­rics or cor­re­la­tions that are impor­tant to you or man­age­ment.

qi is not use­ful under these con­di­tions:

  • Your project(s) are small or neglect­ed because there is rarely a big pic­ture to derive, let alone a desire to find one.
  • Your team has a strong grasp on the sys­tem being main­tained, or already per­forms suf­fi­cient analy­sis.

What’s happening?

Con­sid­er the com­mand qi --every month --past "3 years".

The com­mand does what it says. It will print revi­sion hash­es that mark the end of every month for the past three years, with some caveats:

  • qi starts from HEAD. The query against his­to­ry will search 3 years pri­or to the com­mit date from HEAD, not from the time you ran the com­mand.
  • If HEAD is younger than 3 years, then the out­put will end at the com­mit mark­ing the end of the first month of devel­op­ment.
  • If HEAD is not even a month old, there will be no out­put.

Print­ing hash­es is one thing, but we want to do some­thing with this infor­ma­tion. Since qi is basi­cal­ly a time iter­a­tor, we can tell it to run a com­mand at each com­mit. Note that qi’s short options use the same let­ter that starts their long options.

qi -e month -p "3 years" -c "./analyze"

When you spec­i­fy a com­mand for qi to run, qi will check out each sig­nif­i­cant com­mit (detach­ing HEAD) and run the com­mand against the code­base at that time.
If your analy­sis fin­ish­es or crash­es, qi will attempt to put HEAD back where it start­ed. But if qi crash­es, HEAD will be left detached and you will need to run git checkout to put it back your­self.

qi -c lets you mod­el infor­ma­tion­al met­rics over time, such as the pro­por­tion of dead code in the project and the cur­rent code cov­er­age by tests. You decide what met­rics mat­ter, because qi is not a report­ing tool, it is one kind of iter­a­tor a report­ing tool might use.

Let’s look at a cou­ple of fun exam­ples.

For triv­ia, if you run qi -e year -p "12 years" -c "du -sh ." on Torvald’s Lin­ux source, you will see the disk usage growth of the project over it’s time on GitHub.

[sage linux]$ qi -e year -p "12 years" -c "du -sh ."
2.8G . # 2017 (to date)
2.7G . # 2016
2.6G . # 2015
2.6G . # 2014
2.5G . # 2013
2.5G . # 2012
2.4G . # 2011
2.4G . # 2010
2.4G . # 2009
2.3G . # 2008
2.2G . # 2007
2.2G . # 2006
2.2G . # 2005

Anoth­er exam­ple is if you want React’s com­plex­i­ty report at the end of every year of its devel­op­ment. In this case you could use this set­up:

$ git clone && cd react
$ qi --every year --past "3 years" --command "./"

TIME=$(git show -s --format=%ct HEAD)

# Using
cr -e -f json ./src > $TIME.json


Do NOT follow this link or you will be banned from the site!