Git Commit Good Practice

Open Stack wiki has an excellent guide on how to create good commits.  In a few places it is too specific to Open Stack development practices, but overall it’s one of the best guides I’ve seen for any project using git.

It is basically split into two sections.  One on how to decide which code goes into the git commit, and the other is what to include in the git commit message to make it useful.

The first part is simpler:

The cardinal rule for creating good commits is to ensure there is only one “logical change” per commit. There are many reasons why this is an important rule:

  • The smaller the amount of code being changed, the quicker & easier it is to review & identify potential flaws.
  • If a change is found to be flawed later, it may be necessary to revert the broken commit. This is much easier to do if there are not other unrelated code changes entangled with the original commit.
  • When troubleshooting problems using Git’s bisect capability, small well defined changes will aid in isolating exactly where the code problem was introduced.
  • When browsing history using Git annotate/blame, small well defined changes also aid in isolating exactly where & why a piece of code came from.

With these things to avoid:

  • Mixing whitespace changes with functional code changes.
  • Mixing two unrelated functional changes.
  • Sending large new features in a single giant commit.

The second part is slightly more detailed.  Here’s the information that should be included in the commit message, generally speaking (abbreviated quote):

As important as the content of the change, is the content of the commit message describing it. When writing a commit message there are some important things to remember

  • Do not assume the reviewer understands what the original problem was.
  • Do not assume the reviewer has access to external web services/site.
  • Do not assume the code is self-evident/self-documenting.
  • Describe why a change is being made.
  • Read the commit message to see if it hints at improved code structure.
  • Ensure sufficient information to decide whether to review.
  • The first commit line is the most important.
  • Describe any limitations of the current code.
  • Do not include patch set-specific comments.

In other words, if you rebase your change please don’t add “Patch set 2: rebased” to your commit message. That isn’t going to be relevant once your change has merged. Please do make a note of that in Gerrit as a comment on your change, however. It helps reviewers know what changed between patch sets. This also applies to comments such as “Added unit tests”, “Fixed localization problems”, or any other such patch set to patch set changes that don’t affect the overall intent of your commit.

Read the whole thing for more details, examples of good and bad practices, and more specific instructions on the spacing, line length, and more.

And if you need more convincing or a different explanation, then Google “git commit best practices” or simply check out some of these resources:

Deploy and Maintain Redmine, the Right Way

Jens Krämer wrote this nice guide to deploying and maintaining Redmine the right way.  This is basically a combination of the official Redmine documentation with a variety of guides on deploying and running a generic Ruby on Rails application.  The solution is rightfully focusing on git, combining the upstream patches with your own changes.  And given that this is “the right way”, you don’t even have to have any of your own changes.  Just being prepared for some is good.

Once you’ve setup the proper environment, you can further automate the deployment of Redmine with Capistrano.  If you don’t use Capistrano for whatever reason – no worries, the process is easily adoptable to whatever build/deploy tool you are using.

GitFlow considered harmful, and how we do it

I came across this rather strongly opinionated blog post – GitFlow considered harmful, and I have to say that I mostly agree with it.

In our company, we use a similar approach to the Anti-gitflow, but with even more simplicity.  This is one particular thing I like so much about git is that each person, team, or company can pick the workflow that suits them best.

Just to give you a little bit of context, we have a rather small development team (under 10 people), but we do a large number of projects.  All our projects are web-based, varying from small and simple websites (static HTML), through more complex WordPress sites (multilingual, e-commerce, etc), to business applications like CRMs.  Each project is done by several developers at a time and can later on be passed on to other developers, often much later (another iteration after several month).  Each developer is working on a number of projects at a time.  And we do very fast-paced development, often deploying multiple versions per day.  Given the nature of the projects and the development pace, we don’t ever really rollback.  Rollback is just another step (version) forward.  And we don’t have long and complex roadmaps in terms of which features will be released in which version.  It’s more of a constant review of what’s pending, what needs which resources, and what we can do right now.  (It’s far from ideal project management, but it somehow works for us.  If you think you can do better, send me your CV or LinkedIn profile, and we’ll talk.)

In our case, we do the following:

  • We have one eternal branch, and we call it master.
  • The master branch is always stable and deployable.  Even though we don’t really deploy it directly.
  • Nobody is allowed to commit directly to the master branch.  Initially it was just an agreed convention, but because people make mistakes, we now have this rule enforced with the technology.  Both BitBucket and GitHub support protected branches.  BitBucket, in my opinion, does it much better.
  • All new features, fixes, and improvements are developed in separate “feature” branches.  Most of these are branched off the master.  For large chunks of work, we can create a feature branch, and then introduce incremental changes to it via sub-feature branches, branched off the feature one.  This allows for easier code reviews – looking at a smaller set of changes, rather than a giant branch when it’s ready to be merged.
  • We do code review on everything.  The strongly suggested rule is that at least two other developers review the code before it is merged.  But sometimes, this is ignored, because either the changes are small and insignificant (CSS tweaks or content typos), or we are really in a hurry (we’ll review the changes later).  But whatever the case is, nobody is allowed to merge their own pull requests.  That is set in stone.  This guarantees that at least one other person looked at the changes before they were merged in.
  • We tag new versions only on the master branch.
  • We use semantic versioning for our tags.
  • We don’t deploy branches.  We deploy tags.  This helps with preventing untested/unexpected changes sneaking in between the review of the branch and the deployment.

The above process suits us pretty well.

GIT quick statistics

Any git repository contains a tonne of information about commits, contributors, and files.  Extracting this information is not always trivial, mostly because of a gadzillion options to a gadzillion git commands – I don’t think there is a single person alive who knows them all.  Probably not even Linus Torvalds himself.

git-quick-stats is a tool that simplifies access to some of that information and makes reports and statistics quick and easy to extract.  It also works across UNIX-like operating systems, Mac OS X, and Windows.

How To Use Git to Manage your User Configuration Files

There is probably a gadzillion different ways that you can manage and synchronize you configuration files (aka dotfiles) between different Linux/UNIX boxes – anything from custom symlink scripts, all the way to configuration management tools like Puppet and Ansible.  Here are a few options to look at if you are not doing it already.

Personally, I’m using Ansible and I’m quite happy with it, as it allows me to have multiple playbooks (base configuration, desktop configuration, development setup, etc), and do more things than just manage my configuration files (install packages and tools that I often need, setup correct permissions, and more).

Recently, I came across this tutorial from Digital Ocean on how to manage your configuration files with git.  Again, there are a few options discussed in there, as even with git, there’s more than one way to do it (TMTOWTDI).

The one that I’ve heard about a long time ago, but completely forgot, and which I think is quite elegant is the approach of separating the working directory from the git repository:

Now, we do things a bit differently. We will start by specifying a different working directory using the core.worktree git configuration option:

git config core.worktree "../../"

What this does is establish the working directory relative to the path of the .git directory. The first ../refers to the ~/configs directory, and the second one points us one step beyond that to our home directory.

Basically, we’ve told git “keep the repository here, but the files you are managing are two levels above the repo”.

I guess, if you stick purely to git, you can offload some of the additional processing, such as permission changes and package installation, into one of the git hooks.  Something like post-checkout or post-merge.

GitHub pricing : Business

GitHub has yet another update to their pricing options.  Business plans have been launched with support for SAML single sign-on, 99.95% uptime SLA, 24×5 support with 8 hour response, and more.

Unfortunately it still counts external contributors as users in the account, which makes it too expensive for my organizations, but it’s good to see them trying.

Moving files with commit history from one git repository to another

I’ve searched for this before, and I’m sure I’ll do that again (although the need is not that frequent), so here it goes.  It is possible to move files from one git repository to another, preserving commit history.  The following links provide a few examples of how to do this:

Basically, you need git filter-branch command, usually with the –subdirectory-filter parameter.

An example of where it is useful would be the extraction of some code from a project you have into a shared library or a simple plugin.

Use vimdiff as git mergetool

Ruslan Osipov has a very handy tutorial on how to setup Vim text editor as git merge tool, for resolving git conflicts.

Basically, run the following commands to tell git to use Vim as a merge tool (don’t forget the –global flag if you want it for all your projects, not just the current one):

git config merge.tool vimdiff
git config merge.conflictstyle diff3
git config mergetool.prompt false

With that, running “git mergetool” after a conflict was reported, will result in something like this:

The three way split window will show local version (–ours) on the left, the remote version (–theirs) on the right, and the base version with the conflict in the middle.  You can then get changes from one window into another using the following Vim diffget commands:

:diffg RE  " get from REMOTE
:diffg BA  " get from BASE
:diffg LO  " get from LOCAL

Awesomeness!

Check a few of Ruslan’s other vim-related articles.

GitLab horror story : backup / restore failure

As I am reading this story – GitLab.com melts down after wrong directory deleted, backups fail and these details – every single hair I have, moves … I don’t (and didn’t) have any data on GitLab, so I haven’t lost anything.  But as somebody who worked as a system administrator (and backup administrator) for years, I can imagine the physical and psychological state of the team all too well.

Sure, things could have been done better.  But it’s easier said than done.  Modern technology is very complex.  And it changes fast.  And businesses want to move fast too.  And the proper resources (time, money, people) are not always allocated for mission critical tasks.  One thing is for sure, the responsibility lies on a whole bunch of people for a whole bunch of decisions.  But the hardest job is right now upon the tech people to bring back whatever they can.  There’s no sleep.  Probably no food.  No fun.  And a tremendous pressure all around.

I wish the guys and gals at GitLab a super good luck.  Hopefully they will find a snapshot to restore from and this whole thing will calm down and sort itself out.  Stay strong!

And I guess I’ll be doing test restores all night today, making sure that all my things are covered…

Update: you can now read the full post-mortem as well.

composer-patches – Simple patches plugin for Composer

composer-patches is a plugin for Composer which helps with applying patches to the installed dependencies.  It supports patches from URLs, local files, and from other dependencies.

I think this is absolutely brilliant!

It’s quite often that one finds bugs and issues in external dependencies.  Once the bug (or even the pull request with the fix) is submitted to the vendor, it can take anywhere from a few hours to a few weeks to be resolved and a new version to be released.

If you have a fix for the problem and need it in your project right away, and can’t wait until the vendor releases the new version, your best choice is to fork the dependency, fix the problem, and use your repository instead of the vendor’s package.  This works, but it’s messy.

With the patches plugin to composer, you can still use the vendor’s package and just apply a patch with composer, until the new version is available.  Clean and simple.

This also helps with testing things and working with different changes by different people, if you want to try things out – no need to choose between multiple repositories.  Just select the patches that you want and apply them at the environment you need.

Given that most development work is happening on GitHub these days, this composer plugin is even more useful than what I might think at first.  You see, GitHub provides patch and diff URL for each commit – all you need to do is add the extension to the URL.  For example, take this recent commit to my dotfiles repository.

Commit screen

If you add a “.patch” extension to this URL, you’ll get a patch output, which is useful for git am, and other commands (more on using git with email):

Patch screen

 

If you add a “.diff” extension to this URL, you’ll get a unified diff output, which you can either apply with diff and patch utils, or use with the composer-patches plugin.

Unified diff screen

So, this gives you a way of applying any commit on GitHub (and other repositories) via composer to any of your dependencies.  This is mind blowing!