The Hive

AI generated Git commit messages

A CLI tool for your pleasure and some analysis of how good GPT-3.5 is at this task

Abi Raja
Dec 30, 2022

Git commit messages aren't particularly fun to write when you're in the midst of coding. Writing one means context switching, and as a result I find myself committing less often (even though committing more often is much better).

AI to the rescue!

Large language models are very good at summarizing text. Unsurprisingly, simply feeding the output of git diff into OpenAI’s Davinci model results in pretty good commit messages. Here’s an example:

  • My commit message: Update README.md

  • AI suggestions:

    • Fix README to reflect ChatGPT API changes

    • Update README to reflect ChatGPT API changes

    • README: Reflect ChatGPT API changes (this is the best one)

    • README: Update for ChatGPT API changes

    • README: Fix for ChatGPT API changes

Another example:

  • My commit message: Copy fix

  • AI suggestions:

    • Update sidebar buttons to reflect new functionality

    • Add surrounding text retrieval to extension

    • Improve explain and diagnose buttons in sidebar

    • Refactor extension to include new explain and diagnose buttons

    • Add error diagnosis functionality to sidebar

Testing on an entire repo

I wrote a Python script that runs through all the commits in a repo, prompts OpenAI (“text-davinci-003”) for commit message suggestions, and then outputs a CSV of the commit hash, the original commit message and 5 suggested commit messages.
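For a rough idea of what such a script could look like, here is a minimal sketch. It is not the actual script: the prompt wording, the function names and the CSV column names are all made up for illustration, and the model call is left as a `suggest` function you pass in, so nothing here depends on a particular version of the OpenAI client.

```python
import csv
import subprocess
from typing import Callable, Iterable, List, Tuple

NUM_SUGGESTIONS = 5  # the post asks for 5 suggestions per commit


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def list_commits() -> List[Tuple[str, str]]:
    """Return (hash, subject) pairs for every commit, oldest first."""
    out = git("log", "--reverse", "--format=%H%x09%s")  # %x09 is a tab
    return [tuple(line.split("\t", 1)) for line in out.splitlines() if line]


def commit_diff(commit_hash: str) -> str:
    """Return only the diff of one commit (the empty --format hides the header)."""
    return git("show", "--format=", commit_hash)


def build_prompt(diff: str) -> str:
    """Hypothetical prompt wording; the post does not show the real prompt."""
    return (
        f"Suggest {NUM_SUGGESTIONS} concise Git commit messages, one per line, "
        f"for the following diff:\n\n{diff}\n\nCommit messages:\n"
    )


def write_report(
    commits: Iterable[Tuple[str, str]],
    get_diff: Callable[[str], str],
    suggest: Callable[[str], List[str]],
    out_file,
) -> None:
    """Write one CSV row per commit: hash, original message, suggestions."""
    writer = csv.writer(out_file)
    header = ["hash", "original"]
    header += [f"suggestion_{i}" for i in range(1, NUM_SUGGESTIONS + 1)]
    writer.writerow(header)
    for commit_hash, original in commits:
        suggestions = suggest(build_prompt(get_diff(commit_hash)))
        # Pad or trim so every row has exactly NUM_SUGGESTIONS columns.
        suggestions = (list(suggestions) + [""] * NUM_SUGGESTIONS)[:NUM_SUGGESTIONS]
        writer.writerow([commit_hash, original, *suggestions])
```

To run it over a real repo, you would call `write_report(list_commits(), commit_diff, suggest, f)` with `f` an open file and `suggest` a function of your own that sends the prompt to the model and returns a list of strings.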

I ran the script over my CodeGPT repo (a beautiful ChatGPT-powered coding assistant in VS Code that’s unfortunately now broken due to ChatGPT cracking down on usage of their unofficial API). The repo has 60ish commits and the run cost about $2 in OpenAI API usage (so, not cheap). Here’s a sample of the results below. You can check out the full results on this Google Sheet.

I wanted a more rigorous sense of how good the AI is at generating commit messages so on that sheet, I marked the commits where I felt that my messages were significantly better than the AI suggestions.

  • Number of commits: 58

  • Number where original is preferred: 20

  • AI success rate: 65.52%
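For reference, the success rate is simply the share of commits where I did not mark my original message as significantly better. The arithmetic, using the numbers above:

```python
total_commits = 58
original_preferred = 20

# Commits where an AI suggestion was at least about as good as the original.
ai_acceptable = total_commits - original_preferred

success_rate = ai_acceptable / total_commits
print(f"{ai_acceptable} of {total_commits} = {success_rate:.2%}")  # 38 of 58 = 65.52%
```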

65% is quite good, and in many of the remaining 35% of cases, the AI suggestion can be modified quickly to get something quite good.

Undoubtedly, this is a huge time saver.

Failure modes

The results are even more impressive when we look at where the AI fails:

  • Diff doesn't convey intention or downstream effects. In one case, the immediate code change was well summarized as “Add allowSyntheticDefaultImports to tsconfig.json”, but what I was really doing in that commit was fixing a TypeScript error I was getting when running the tests. Hence my commit message of “get tests working”. Without seeing the contents of my terminal at that time, the AI could not have arrived at this commit message.

    • There are less subtle cases where a diff is insufficient because it only shows a few lines before and after each change. We could include more surrounding lines around each change to solve this.

  • Multiple changes in one commit. Despite my prompting, the AI doesn't seem to want to cover multiple changes in one commit message. I think this problem could be fixed with some examples in the prompt (“few-shot prompting”).
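Two of these ideas are easy to sketch. Wider context can be requested with git's `--unified=<n>` option, and few-shot prompting just means putting a handful of solved examples ahead of the real diff. The example pairs, the prompt wording and the function names below are invented for illustration; I haven't measured whether they actually fix the problem.

```python
from typing import List, Sequence, Tuple


def diff_args(context_lines: int = 10) -> List[str]:
    """Build a git diff command that shows more lines around each change."""
    return ["git", "diff", f"--unified={context_lines}"]


# Made-up (diff, message) pairs; real ones would come from your own history.
# Each message deliberately covers more than one change.
FEW_SHOT_EXAMPLES: Sequence[Tuple[str, str]] = (
    (
        "--- a/README.md\n+++ b/README.md\n-Run npm start\n+Run yarn start\n"
        "--- a/package.json\n+++ b/package.json\n-  version 0.1.0\n+  version 0.2.0\n",
        "Switch README to yarn and bump version to 0.2.0",
    ),
    (
        "--- a/src/sidebar.ts\n+++ b/src/sidebar.ts\n+addButton('Explain')\n"
        "--- a/src/api.ts\n+++ b/src/api.ts\n-const TIMEOUT = 10\n+const TIMEOUT = 30\n",
        "Add Explain button to sidebar and raise API timeout to 30s",
    ),
)


def build_few_shot_prompt(
    diff: str, examples: Sequence[Tuple[str, str]] = FEW_SHOT_EXAMPLES
) -> str:
    """Put solved examples before the real diff so the model imitates them."""
    parts = ["Write one Git commit message that covers every change in the diff.\n"]
    for example_diff, message in examples:
        parts.append(f"Diff:\n{example_diff}\nCommit message: {message}\n")
    # The prompt ends mid-pattern so the model completes the last message.
    parts.append(f"Diff:\n{diff}\nCommit message:")
    return "\n".join(parts)
```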

We can optimize this further by removing large/irrelevant files from diffs such as yarn.lock. A few of the diffs also exceeded the 4000 token context window of davinci-003. I’m simply truncating until the prompt is under that limit. We could definitely do something smarter here.
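Here is a small sketch of both steps: dropping the sections of a diff that belong to files like yarn.lock, then truncating what is left. The 4-characters-per-token ratio is only a rough rule of thumb that I'm assuming here, not a real tokenizer, and the function names are hypothetical.

```python
from typing import Tuple

IGNORED_FILES: Tuple[str, ...] = ("yarn.lock",)  # the example from the post
MAX_PROMPT_TOKENS = 4000  # the limit quoted in the post
CHARS_PER_TOKEN = 4       # rough assumption, NOT an exact token count


def drop_ignored_files(diff: str, ignored: Tuple[str, ...] = IGNORED_FILES) -> str:
    """Remove the per-file sections of a git diff that touch ignored files."""
    kept = []
    skipping = False
    for line in diff.splitlines(keepends=True):
        if line.startswith("diff --git "):
            # Section header looks like: diff --git a/path b/path
            path = line.rstrip("\n").split(" b/", 1)[-1]
            skipping = path.split("/")[-1] in ignored
        if not skipping:
            kept.append(line)
    return "".join(kept)


def truncate_to_budget(
    text: str,
    max_tokens: int = MAX_PROMPT_TOKENS,
    chars_per_token: int = CHARS_PER_TOKEN,
) -> str:
    """Cut the text so its estimated token count stays under the budget."""
    return text[: max_tokens * chars_per_token]
```

Filtering first matters: a large lock file near the top of a diff would otherwise eat the whole budget before the truncation ever reaches the real changes. Counting with an actual tokenizer would be more accurate than the character estimate.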

CLI

I also built a CLI version of the tool. It generates 5 commit message suggestions for all the changes in the Git repo (in your current working directory). After you pick and edit the commit message you want, it performs the commit.

You can install it on your machine with pip:

pip install aicommit

Then, just run aicommit in any git repo.

On first run, it will prompt you for your OpenAI API key. Sign up for OpenAI if you haven't. Grab your API key by going to the dropdown on the top right, selecting "View API Keys" and creating a new key.

NOTE: once you confirm your commit message, the CLI commits all changes, untracked and unstaged, in your current repo. I don’t use staging at all in my Git workflow so this works for me. But the code’s open source so feel free to change it as needed.
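If you want to adapt that behavior, the "commit everything" step can be sketched as two git commands. This is one way to get the behavior described above, not necessarily how aicommit does it internally; the function names are hypothetical.

```python
import subprocess
from typing import Callable, List


def commit_all_commands(message: str) -> List[List[str]]:
    """Git commands that stage every change (tracked or not) and commit it."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
    ]


def commit_all(message: str, run: Callable = subprocess.run) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in commit_all_commands(message):
        run(command, check=True)
```

If you do use staging, dropping the `git add --all` command would commit only what you have already staged.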

Over the last week or so, this has been saving me a ton of time, and I’m enjoying the much better commit messages the AI produces without needing to context switch. Have fun, and feel free to open an issue on GitHub if you have feedback.
