Git Commit Good Practice

The OpenStack wiki has an excellent guide on how to create good commits.  In a few places it is too specific to OpenStack development practices, but overall it’s one of the best guides I’ve seen for any project using git.

It is basically split into two sections: one on how to decide which code goes into a git commit, and the other on what to include in the commit message to make it useful.

The first part is simpler:

The cardinal rule for creating good commits is to ensure there is only one “logical change” per commit. There are many reasons why this is an important rule:

  • The smaller the amount of code being changed, the quicker & easier it is to review & identify potential flaws.
  • If a change is found to be flawed later, it may be necessary to revert the broken commit. This is much easier to do if there are not other unrelated code changes entangled with the original commit.
  • When troubleshooting problems using Git’s bisect capability, small well defined changes will aid in isolating exactly where the code problem was introduced.
  • When browsing history using Git annotate/blame, small well defined changes also aid in isolating exactly where & why a piece of code came from.

With these things to avoid:

  • Mixing whitespace changes with functional code changes.
  • Mixing two unrelated functional changes.
  • Sending large new features in a single giant commit.
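
As a minimal sketch of the “one logical change per commit” rule, the snippet below builds a throwaway repository in which a functional edit and a whitespace cleanup are sitting in the working tree at the same time, and then commits them separately.  The file names, commit messages, and identity are made up for illustration.

```shell
set -eu

# Throwaway repository in a scratch directory (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'total = 1\n' > calc.txt
printf 'Docs  \n' > README.txt       # note the trailing whitespace
git add . && git commit -q -m "Initial import"

# Two unrelated edits now sit in the working tree together.
printf 'total = 2\n' > calc.txt      # functional change
printf 'Docs\n' > README.txt         # whitespace-only cleanup

# Stage and commit each logical change on its own.
git add calc.txt
git commit -q -m "Fix total calculation"
git add README.txt
git commit -q -m "Strip trailing whitespace from README"

# Each change can now be reviewed, reverted, or bisected independently.
git log --oneline
```

When both kinds of change live in the same file, `git add -p` lets you stage individual hunks interactively instead of whole files.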

The second part is slightly more detailed.  Here’s the information that should be included in the commit message, generally speaking (abbreviated quote):

As important as the content of the change, is the content of the commit message describing it. When writing a commit message there are some important things to remember:

  • Do not assume the reviewer understands what the original problem was.
  • Do not assume the reviewer has access to external web services/site.
  • Do not assume the code is self-evident/self-documenting.
  • Describe why a change is being made.
  • Read the commit message to see if it hints at improved code structure.
  • Ensure sufficient information to decide whether to review.
  • The first commit line is the most important.
  • Describe any limitations of the current code.
  • Do not include patch set-specific comments.

In other words, if you rebase your change please don’t add “Patch set 2: rebased” to your commit message. That isn’t going to be relevant once your change has merged. Please do make a note of that in Gerrit as a comment on your change, however. It helps reviewers know what changed between patch sets. This also applies to comments such as “Added unit tests”, “Fixed localization problems”, or any other such patch set to patch set changes that don’t affect the overall intent of your commit.

Read the whole thing for more details, examples of good and bad practices, and more specific instructions on the spacing, line length, and more.
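
To make the points above more concrete, here is a rough sketch of a commit message that follows them: a short summary line, a blank line, and a wrapped body that explains the original problem, the reason for the change, and a known limitation.  The repository, the file, and the change itself are invented for the example, and the short-summary-plus-wrapped-body layout is the commonly used convention rather than a quote from the guide.

```shell
set -eu

# Throwaway repository (hypothetical file and change).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'timeout = 30\n' > settings.conf
git add settings.conf

# -F - reads the commit message from standard input.
git commit -q -F - <<'EOF'
Raise default request timeout to 30 seconds

Slow upstream services regularly took longer than the previous
10 second limit, which surfaced to users as spurious failures.

Raising the default fixes the common case. Very slow backends can
still time out; making the limit configurable per backend is left
for a follow-up change.
EOF

# The first line is what most tools show, so it has to stand on its own.
git log -1 --format=%s
```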

And if you need more convincing or a different explanation, then Google “git commit best practices” or simply check out some of these resources:

Deploy and Maintain Redmine, the Right Way

Jens Krämer wrote this nice guide to deploying and maintaining Redmine the right way.  This is basically a combination of the official Redmine documentation with a variety of guides on deploying and running a generic Ruby on Rails application.  The solution rightly focuses on git, combining the upstream patches with your own changes.  And given that this is “the right way”, you don’t even have to have any changes of your own.  Just being prepared for some is good.

Once you’ve set up the proper environment, you can further automate the deployment of Redmine with Capistrano.  If you don’t use Capistrano for whatever reason, no worries: the process is easily adaptable to whatever build/deploy tool you are using.

GitFlow considered harmful, and how we do it

I came across this rather strongly opinionated blog post – GitFlow considered harmful, and I have to say that I mostly agree with it.

In our company, we use an approach similar to the Anti-gitflow, but even simpler.  One thing I particularly like about git is that each person, team, or company can pick the workflow that suits them best.

Just to give you a little bit of context, we have a rather small development team (under 10 people), but we do a large number of projects.  All our projects are web-based, varying from small and simple websites (static HTML), through more complex WordPress sites (multilingual, e-commerce, etc), to business applications like CRMs.  Each project is done by several developers at a time and can later on be passed on to other developers, often much later (another iteration after several months).  Each developer is working on a number of projects at a time.  And we do very fast-paced development, often deploying multiple versions per day.  Given the nature of the projects and the development pace, we don’t ever really roll back.  A rollback is just another step (version) forward.  And we don’t have long and complex roadmaps in terms of which features will be released in which version.  It’s more of a constant review of what’s pending, what needs which resources, and what we can do right now.  (It’s far from ideal project management, but it somehow works for us.  If you think you can do better, send me your CV or LinkedIn profile, and we’ll talk.)

In our case, we do the following:

  • We have one eternal branch, and we call it master.
  • The master branch is always stable and deployable, even though we don’t really deploy it directly.
  • Nobody is allowed to commit directly to the master branch.  Initially it was just an agreed convention, but because people make mistakes, we now have this rule enforced by the tooling.  Both BitBucket and GitHub support protected branches.  BitBucket, in my opinion, does it much better.
  • All new features, fixes, and improvements are developed in separate “feature” branches.  Most of these are branched off the master.  For large chunks of work, we can create a feature branch, and then introduce incremental changes to it via sub-feature branches, branched off the feature one.  This allows for easier code reviews – looking at a smaller set of changes, rather than a giant branch when it’s ready to be merged.
  • We do code review on everything.  The strongly suggested rule is that at least two other developers review the code before it is merged.  But sometimes, this is ignored, because either the changes are small and insignificant (CSS tweaks or content typos), or we are really in a hurry (we’ll review the changes later).  But whatever the case is, nobody is allowed to merge their own pull requests.  That is set in stone.  This guarantees that at least one other person looked at the changes before they were merged in.
  • We tag new versions only on the master branch.
  • We use semantic versioning for our tags.
  • We don’t deploy branches.  We deploy tags.  This helps with preventing untested/unexpected changes sneaking in between the review of the branch and the deployment.

The above process suits us pretty well.
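
The branch, merge, tag, and deploy steps listed above can be sketched roughly as follows.  This is a local simulation only: the review and the pull request happen on BitBucket or GitHub, so a plain merge stands in for them here, and the branch name, file names, version numbers, the `--no-ff` merge, and the archive-based “deployment” are all assumptions made for the example.

```shell
set -eu

# Throwaway repository with master as the one eternal branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "site v1" > index.html
git add index.html && git commit -q -m "Initial site"
git tag -a v1.0.0 -m "Release 1.0.0"

# All work happens on a feature branch cut from master.
git checkout -q -b feature/contact-form master
echo "contact form" > contact.html
git add contact.html && git commit -q -m "Add contact form page"

# Stand-in for an approved pull request merged by another developer.
git checkout -q master
git merge -q --no-ff -m "Merge feature/contact-form" feature/contact-form

# Versions are tagged on master only, using semantic versioning.
git tag -a v1.1.0 -m "Release 1.1.0"

# Deploy the tag, not the branch: export exactly what was tagged.
deploy_dir=$(mktemp -d)
git archive v1.1.0 | tar -xf - -C "$deploy_dir"
ls "$deploy_dir"
```

Because the export is taken from the tag, commits that land on master after the tag was created cannot sneak into the deployment.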

composer-git-hooks – manage git hooks in your composer config

composer-git-hooks looks awesome!  From the project page description:

Manage git hooks easily in your composer configuration. This package makes it easy to implement a consistent project-wide usage of git hooks. Specifying hooks in the composer file makes them available for every member of the project team. This provides a consistent environment and behavior for everyone which is great.

GIT quick statistics

Any git repository contains a tonne of information about commits, contributors, and files.  Extracting this information is not always trivial, mostly because of a gadzillion options to a gadzillion git commands – I don’t think there is a single person alive who knows them all.  Probably not even Linus Torvalds himself.

git-quick-stats is a tool that simplifies access to some of that information and makes reports and statistics quick and easy to extract.  It also works across UNIX-like operating systems, Mac OS X, and Windows.
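
To give a feel for the kind of information involved, here is a small sketch using plain git commands.  It does not use git-quick-stats itself and makes no claim about how that tool computes its reports; the repository, authors, and files are made up.

```shell
set -eu

# Throwaway repository with commits from two hypothetical authors.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "one" > a.txt
git add a.txt
git commit -q --author="Alice <alice@example.com>" -m "Add a.txt"
echo "two" > a.txt
git commit -q -a --author="Alice <alice@example.com>" -m "Update a.txt"
echo "one" > b.txt
git add b.txt
git commit -q --author="Bob <bob@example.com>" -m "Add b.txt"

# Total number of commits reachable from HEAD.
git rev-list --count HEAD

# Commits per author, most active first (HEAD is given explicitly so
# that shortlog does not try to read a log from standard input).
git shortlog -sn HEAD

# Files touched most often across the history.
git log --name-only --format= | grep . | sort | uniq -c | sort -rn | head -n 5
```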

How To Use Git to Manage your User Configuration Files

There are probably a gadzillion different ways that you can manage and synchronize your configuration files (aka dotfiles) between different Linux/UNIX boxes: anything from custom symlink scripts, all the way to configuration management tools like Puppet and Ansible.  Here are a few options to look at if you are not doing it already.

Personally, I’m using Ansible and I’m quite happy with it, as it allows me to have multiple playbooks (base configuration, desktop configuration, development setup, etc), and do more things than just manage my configuration files (install packages and tools that I often need, set up correct permissions, and more).

Recently, I came across this tutorial from Digital Ocean on how to manage your configuration files with git.  Again, there are a few options discussed in there, as even with git, there’s more than one way to do it (TMTOWTDI).

The one that I heard about a long time ago but had completely forgotten, and which I think is quite elegant, is the approach of separating the working directory from the git repository:

Now, we do things a bit differently. We will start by specifying a different working directory using the core.worktree git configuration option:

git config core.worktree "../../"

What this does is establish the working directory relative to the path of the .git directory. The first ../ refers to the ~/configs directory, and the second one points us one step beyond that to our home directory.

Basically, we’ve told git “keep the repository here, but the files you are managing are two levels above the repo”.
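
Here is a minimal sketch of that layout, assuming the repository lives in a configs directory directly under the home directory (so that .git is two levels below it).  A scratch directory stands in for the real home directory so the example is safe to run, and the tracked file and identity are made up.

```shell
set -eu

# A scratch directory plays the role of $HOME in this sketch.
fake_home=$(mktemp -d)
printf 'alias ll="ls -l"\n' > "$fake_home/.bashrc"

# The repository lives in <home>/configs, so .git is <home>/configs/.git.
mkdir "$fake_home/configs"
cd "$fake_home/configs"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Relative to the .git directory, ../../ resolves to the (fake) home.
git config core.worktree "../../"

# Files are now addressed relative to the home directory.
git add ../.bashrc
git commit -q -m "Track .bashrc"

# Show what the repository tracks, as paths from the top of the work tree.
git ls-tree -r --name-only --full-tree HEAD
```

With a real home directory as the working tree, git status will list every untracked file in it; setting status.showUntrackedFiles to no in the repository is one way to keep that output manageable.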

I guess, if you stick purely to git, you can offload some of the additional processing, such as permission changes and package installation, into one of the git hooks.  Something like post-checkout or post-merge.
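
As a rough sketch of that idea, the snippet below installs a post-merge hook that tightens permissions on a couple of private files after every merge or pull.  The list of files is purely an assumption for illustration, and the repository is a throwaway one.

```shell
set -eu

# Throwaway repository standing in for the dotfiles repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p .git/hooks

# Hypothetical hook: the files listed here are examples only.
cat > .git/hooks/post-merge <<'EOF'
#!/bin/sh
# Runs after a successful merge (and therefore after git pull).
for f in "$HOME/.ssh/config" "$HOME/.netrc"; do
    if [ -f "$f" ]; then
        chmod 600 "$f"
    fi
done
exit 0
EOF

# Git only runs hooks that are executable.
chmod +x .git/hooks/post-merge
```

Hooks live inside .git and are not cloned along with the repository, so each machine needs the hook installed once, for example by a small setup script kept in the repository.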

GitHub pricing : Business

GitHub has yet another update to their pricing options.  Business plans have been launched with support for SAML single sign-on, 99.95% uptime SLA, 24×5 support with 8 hour response, and more.

Unfortunately it still counts external contributors as users in the account, which makes it too expensive for my organizations, but it’s good to see them trying.

Moving files with commit history from one git repository to another

I’ve searched for this before, and I’m sure I’ll do that again (although the need is not that frequent), so here it goes.  It is possible to move files from one git repository to another, preserving commit history.  The following links provide a few examples of how to do this:

Basically, you need the git filter-branch command, usually with the --subdirectory-filter parameter.

An example of where it is useful would be the extraction of some code from a project you have into a shared library or a simple plugin.
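
A minimal sketch of the whole round trip, assuming a hypothetical source repository whose lib directory should end up in a separate target repository.  All names are made up, and the final fetch-and-merge step is just one way of bringing the rewritten history in.

```shell
set -eu
work=$(mktemp -d)

# Hypothetical source repository: an application with a lib/ directory.
git init -q "$work/source"
cd "$work/source"
git config user.name "Example Dev"
git config user.email "dev@example.com"
mkdir lib
echo "v1" > lib/util.txt
echo "app" > app.txt
git add . && git commit -q -m "Add app and lib"
echo "v2" > lib/util.txt
git commit -q -a -m "Update lib util"

# filter-branch rewrites history, so operate on a throwaway clone.
git clone -q "$work/source" "$work/extracted"
cd "$work/extracted"
# Keep only the commits touching lib/ and make lib/ the new root.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --subdirectory-filter lib HEAD

# Hypothetical target repository that receives the files.
git init -q "$work/target"
cd "$work/target"
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "shared library" > README
git add README && git commit -q -m "Initial commit"

# Bring in the rewritten history; the two histories share no ancestor.
git fetch -q "$work/extracted" HEAD
git merge -q --allow-unrelated-histories --no-edit FETCH_HEAD

# The file arrives together with its original commits.
git log --oneline -- util.txt
```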

Use vimdiff as git mergetool

Ruslan Osipov has a very handy tutorial on how to set up the Vim text editor as a git merge tool for resolving git conflicts.

Basically, run the following commands to tell git to use Vim as a merge tool (don’t forget the --global flag if you want it for all your projects, not just the current one):

git config merge.tool vimdiff
git config merge.conflictstyle diff3
git config mergetool.prompt false

With that, running “git mergetool” after a conflict is reported will result in something like this:

The three-way split window will show the local version (--ours) on the left, the remote version (--theirs) on the right, and the base version with the conflict in the middle.  You can then get changes from one window into another using the following Vim diffget commands:

:diffg RE  " get from REMOTE
:diffg BA  " get from BASE
:diffg LO  " get from LOCAL


Check a few of Ruslan’s other vim-related articles.