Revision control for writers

We all know about the more common ways of dealing with backup/revisions. It’s along these lines:

  1. Copy file
  2. Paste file
  3. Rename file with a date
  4. Copy folder with all the revisions to a spare hard drive, etc.

Now, I’m fairly tech-savvy (which is probably evident from the rest of the articles), but since revisions are such a large part of writing, I hope that I will be able to explain in an accessible way how to use revision control.

If there is something in particular that is making you stumble, feel free to contact me through mail or through Twitter. As a note, I mainly work in Linux, so my Windows skills are lacking, but I can at least help to troubleshoot.

If you prefer to have a more visual interface, here’s a good list. However, the article will assume you are using the terminal.

Any command written out in the text is meant to be copied and pasted into the terminal of your chosen operating system. The Windows Git installation comes with its own terminal (link below under installation). To keep the text simple, it will look something like this: “run some command”, which is shorthand for:

  1. Copy the line
  2. Paste the line in the terminal, at the root of your working folder/directory
  3. Press Enter

Installation

There are several different kinds of revision control systems. To simplify, this will focus on the one I use the most, Git.

Install it on …

First steps of revision control

Getting started with Git

The first thing you’ll need to do is to let Git know who you are. You can configure Git using the command git config, but the two specifics needed here are name and e-mail.

  1. Run git config --global user.name "YOUR NAME"
  2. Run git config --global user.email "YOUR E-MAIL ADDRESS"

Setting up the vault for your novel

The word you will hear is repository. I went with the synonym vault as more recognizable. First, a metaphor for what we’re going to do:

Imagine that you have an assistant that keeps track of every change you do to your novel. They don’t actually have the novel itself, but should you lose your copy, they have the information to recreate it as long as you have given it to them. That is the vault / bare repository we will set up. In fact, you can have several of these assistants, if you are of the more paranoid persuasion. If you are, suggestions on where to store things will be detailed further down.

  1. Decide where you want to store the backup / vault / assistant and create a folder for it. A recommendation is to give it a name ending with .git, e.g. mynovel.git. Make a note of where this is; you will need it
  2. In the terminal, go to the folder. If you’re using Windows you can right-click when you are in the right folder and select the Git Bash option
  3. Run git --bare init to create the repository

Duplicate (clone) the repository

Now into where you’re actually writing.

  1. Create the folder that you want to work in
  2. Go to the folder itself, as per step 2 above
  3. Run git clone file:///path/to/repo.git

At this point you have one backup repository and one working repository.
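
To make the two sets of steps above concrete, here is a minimal sketch that sets up a vault and a working copy inside a throwaway folder. The folder names (mynovel.git, mynovel) are made up for the example. The -c init.defaultBranch=master part is my addition: it only makes sure the first branch is called master, as in the rest of this article, since newer Git versions may pick another name.

```shell
# Throwaway sandbox so the sketch cannot touch a real manuscript.
sandbox="$(mktemp -d)"
vault="$sandbox/mynovel.git"

# Create the vault (bare repository).
# "git init --bare" has the same effect as "git --bare init".
mkdir "$vault"
cd "$vault"
git -c init.defaultBranch=master init --bare

# Clone the vault into a working folder. Git warns that the
# repository is empty, which is expected at this point.
cd "$sandbox"
git -c init.defaultBranch=master clone "file://$vault" mynovel

# The working copy remembers the vault under the name "origin".
cd mynovel
git remote -v
```

When you do this for real, replace the sandbox paths with the folders you picked for the vault and for your writing.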

Start writing

A note before we continue with the different commands: revision control works for any kind of document. However, if you really want to get the best mileage out of it, you should avoid the latest versions of Word. This is because of how Word stores your file internally: Git can still save every version, but it cannot show you readable differences between them. It’s not a big thing, but if you needed a push to move away from writing in Word, here it is!

Saving (committing) changes

While writing, you should save as normal. However, when you are at a good point to check things in (what those points are is up to you), go into the terminal and the folder you have the project in. Run git status -sb.

It will show you something like the following:

## master
 M README.md
 D LICENCE-MIT
A  fluff
?? fluff2
UU fluff3

The first line (## master) is the headline. It states which branch (more on branches later) you’re on.

Space before symbol
The file is not marked (staged) to be saved
M before filename
The file has been changed since you last saved
D before filename
The file has been deleted since you last saved
A before filename
The file has been added and is staged to be saved
?? before filename
The file has been added, but is not staged to be saved
UU before filename
There are “conflicts” in the file (unlikely to show up in the first few steps)
  1. To mark all files (even the deleted ones) as to-be-saved, run git add . --all in the folder where everything is saved
  2. Run git commit -m "Something to remind you what you changed", which will do the actual save
  3. Finally, run git push origin master. This reports the changes to the bare repository we created above
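
The whole save cycle can be tried out safely with a sketch like the following. It builds a throwaway vault and working copy first, so everything before the printf line is just scaffolding. The file name, the sample sentence and the "Example Writer" identity are placeholders of mine.

```shell
# Scaffolding: a throwaway vault plus a working copy cloned from it.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init --bare "$sandbox/mynovel.git"
git -c init.defaultBranch=master clone "file://$sandbox/mynovel.git" "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity

# Write something, then look at what Git sees.
printf 'It was a dark and stormy night.\n' > chapter1.txt
git status -sb      # chapter1.txt is listed with ?? (new, not staged)

# Stage, commit, and report the change to the vault.
git add . --all
git status -sb      # chapter1.txt is now listed with A (staged)
git commit -m "First draft of chapter 1"
git push origin master
```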

My computer crashed, and now I need to recover the backup!

  1. Create the folder that you want to work in
  2. In the terminal, go to the folder
  3. Run git clone file:///path/to/repo.git

Yes, those are the same steps as above. I repeated them here for ease of use.
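
If you want to convince yourself that recovery works before you need it, here is a rough rehearsal in a throwaway folder. The crash is simulated by deleting the working folder. All names in it (mynovel, recovered, chapter1.txt, "Example Writer") are made up for the example.

```shell
# Scaffolding: a vault, a working copy, and one pushed commit.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init --bare "$sandbox/mynovel.git"
git -c init.defaultBranch=master clone "file://$sandbox/mynovel.git" "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'It was a dark and stormy night.\n' > chapter1.txt
git add . --all
git commit -m "First draft"
git push origin master

# Simulate the crash: the working folder is gone, only the vault survives.
cd "$sandbox"
rm -rf "$sandbox/mynovel"

# Recover by cloning the vault into a fresh folder.
git clone "file://$sandbox/mynovel.git" recovered
cat recovered/chapter1.txt
```

Note that only what you pushed to the vault comes back, which is a good reason to push often.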

Summary of the first steps

This is a very simple flow. You can do far more advanced things, but if you just want an easy way to back up and have some kind of check on what a scene looked like three days ago, this will do the trick. If you want to whet your appetite a bit more, continue on.

Digging deeper into the rabbit hole

Branches

You may want to experiment a bit with a text. The old way, you’d copy the manuscript file and work on the copy. The Git equivalent is to create a branch off of the main manuscript tree. This branch can be treated in whichever way you want. If you completely mess things up, you can just discard it and go back to your default / untouched master branch. Branch names should be single words, or underscore-separated words, such as my_branch.

Commands

Create a new branch as you move to (checkout) it
Run `git checkout -b <name of branch>`
Move to a branch
Run `git checkout <name of branch>`
Create a new branch without moving to it
Run `git branch <name of branch>`
Merge another branch into the one you are on
Run `git merge <name of branch>`

Examples

Assume that you’re wanting to edit chapter 1. The flow will go about like this, assuming you name the branch chapter1_edit:

  1. Run git checkout -b chapter1_edit. Run git status -sb to confirm that the heading has the name chapter1_edit.
  2. Do changes. Add (git add --all) and commit (git commit -m "my changes") them. Repeat until done.
  3. Run git checkout master to get back to the master branch. If you have changes that are not committed, they will follow along to the branch, and if those changes clash with the current state, you cannot change branch. That’s why you’re recommended to always commit your changes, even if you decide that you don’t like them and will not keep them.
  4. If you want to keep the changes, run git merge <name of branch>. You may get a merge conflict, at which point you are best off googling “resolving conflict in git”. (It’s not that difficult, but this is a brief run-through.)

To delete a branch, run git branch -d <name of branch>. If it is a branch that hasn’t been merged into the branch you are at (maybe you decided to not keep those edits), the command is git branch -D <name of branch>, that is a capital D rather than lowercase.
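
Here is the chapter 1 example as one runnable sketch, from creating the branch to deleting it after the merge. No vault is involved, so a plain git init in a throwaway folder is enough. The sample text and the "Example Writer" identity are placeholders.

```shell
# Scaffolding: a throwaway repository with one commit on master.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'Chapter one, first draft.\n' > chapter1.txt
git add --all
git commit -m "First draft"

# Step 1: create the branch and move to it.
git checkout -b chapter1_edit
git status -sb                  # heading reads: ## chapter1_edit

# Step 2: make changes and commit them on the branch.
printf 'Chapter one, edited.\n' > chapter1.txt
git add --all
git commit -m "my changes"

# Steps 3 and 4: go back to master and keep the edits.
git checkout master
git merge chapter1_edit

# The branch has been merged, so lowercase -d is enough.
git branch -d chapter1_edit
```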

Finding prior versions

The earlier examples have been dealing with Git as (more or less) a backup tool. However, the point of a revision control system is to be able to find a prior version.

View change log

Let’s start with the easiest: Run git log. This will give you a list of commits, along these lines:

commit d24ef819ff969c27c5b25d75e1d9059b7129564d
Author: Marie Hogebrandt <iam@mariehogebrandt.se>
Date:   Mon Jun 16 08:57:17 2014 +0200

    Removed all filter words

The first row (commit d24ef819ff969c27c5b25d75e1d9059b7129564d) is known as the commit hash and is used to reference this particular commit. You can shorten it to about 7 characters for simplicity, so in this exact case d24ef81.

Everything below the Date: line is the commit message. It’s whatever you wrote after -m when you saved it. If you need more than just a single line, you don’t actually need to use the format git commit -m "message", but can use git commit at which point it will open up an editor in the terminal. I will write a brief explanation on Vim (which I understand is the Windows one it opens) further down.

Use the arrow keys to navigate the log, and once you’re done, press q.

To just see a quick list of changes, run git log --oneline, which will give you the commit hash and message.

View differences between two commits

So, you want to see the difference between two points in time. Let’s say that in chapter2.rtf you removed the bartender, but now you need to use her in chapter 5, and you don’t remember what you named her, or what that important scar looks like, and thus you want to read the original one.

Definition: HEAD (note: capital letters). HEAD is the commit you are currently standing on, which is normally your latest save point. Think of it as a name for the commit hash that means “this is where I am”. If you want to compare HEAD to a commit one or two steps back, the easiest way is to use the ^ operator.

Assume we have four commits: the one you are at now and the three before it. To reference them without knowing the commit hash, it’s as follows:

Current place
`HEAD`
Commit prior to where you are now
`HEAD^`
Two commits prior to where you are now
`HEAD^^`
Three commits prior to where you are now
`HEAD^^^`

However, if you are looking at more diverse commits (say the HEAD^^^ and the one three commits before that), you’re better off using git log to figure out the commit hash.
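
If the ^ notation feels abstract, a small experiment may help. The sketch below makes four commits in a throwaway repository and then asks Git which commit hash each name stands for, so you can compare them with the log. The commit messages and the "Example Writer" identity are placeholders.

```shell
# Scaffolding: a throwaway repository.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity

# Four commits, one per draft.
for n in 1 2 3 4; do
  printf 'Draft %s\n' "$n" > chapter1.txt
  git add --all
  git commit -m "Draft $n"
done

# The log, newest first (--no-pager prints it straight to the terminal).
git --no-pager log --oneline

# The hashes behind HEAD, HEAD^, HEAD^^ and HEAD^^^, in that order.
git rev-parse --short HEAD HEAD^ HEAD^^ HEAD^^^
```

The four hashes printed by the last command should match the four lines of the log, top to bottom.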

To see the difference between two commits on a specific file (in this case HEAD, HEAD^^, and on chapter2.rtf) run git diff HEAD^^ HEAD chapter2.rtf. That is, more generalised:

  1. Pick the earliest point. That’s the first argument
  2. Pick the end-point. That’s the second argument
  3. Pick the file you want to look at. If you want to see all differences, you can leave off the third argument
  4. Run it: git diff <earliest> <latest> <path/to/file>
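
As a sketch of the bartender scenario: the repository below gets three commits to chapter2.rtf, and the diff then compares the version from two commits back with the current one. The bartender’s name and scar are invented for the example, and the file holds plain text rather than real RTF markup to keep the output readable.

```shell
# Scaffolding: a throwaway repository.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity

# Three versions of the scene.
printf 'The bartender, Mara, had a crescent scar over her left eyebrow.\n' > chapter2.rtf
git add --all
git commit -m "Introduce the bartender"
printf 'The bar was empty.\n' > chapter2.rtf
git add --all
git commit -m "Remove the bartender"
printf 'The bar was empty and cold.\n' > chapter2.rtf
git add --all
git commit -m "Polish the bar scene"

# Earliest point first, end-point second, file last.
git --no-pager diff HEAD^^ HEAD chapter2.rtf
```

Lines starting with - exist only in the earlier version, and lines starting with + exist only in the later one.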

Pick up an earlier revision

Depending on exactly what you want to do, you can go about this in a few different ways. In all of these cases, we’ll assume that the cat managed to get into your document and delete most of it.

Unstage

You accidentally marked the file your kitty created as to be saved, which you don’t want. Run git reset HEAD <file/kitty/created>, and the file is back to being unmarked for saving.

Go back to the latest commit

You saved the document, but you haven’t committed it yet, so what you want is to undo all the changes to it. Run git checkout -- <file/kitty/changed>, which will fetch the latest committed version of that file.
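
The two rescues above (unstaging, then going back to the latest commit) can be rehearsed together. This is a sketch in a throwaway repository, where the cat’s contribution and the "Example Writer" identity are placeholders.

```shell
# Scaffolding: a throwaway repository with one good commit.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'Once upon a time.\n' > chapter1.txt
git add --all
git commit -m "Good version"

# The cat strikes, and the damaged file gets staged by accident.
printf 'asdfghjkl\n' > chapter1.txt
git add chapter1.txt
git status -sb                 # M in the first column: staged

# Unstage: the damaged file stays on disk but is no longer marked.
git reset HEAD chapter1.txt
git status -sb                 # M in the second column: changed, not staged

# Go back to the latest commit for this file.
git checkout -- chapter1.txt
cat chapter1.txt
```

Be aware that the last step throws away the uncommitted changes to that file for good.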

Whoops, I accidentally committed it!

Now the versatility of the git checkout command is going to come into play. That’s the one we use to change branches, and now it can be used on specific files as well?

Yes. You can even use it to check out specific commits, but let’s not get ahead of ourselves.

`git checkout <branch>`
Moves you to that specific branch
`git checkout -b <branch>`
Creates the branch and then moves you to it
`git checkout -- <filename>`
Replaces the current changes in *filename* with the content of the file in the latest save
`git checkout <branch or commit hash> <filename>`
Replaces the current changes in *filename* with the content of the file in the specified branch or commit hash. For instance: `git checkout master chapter2.rtf` will fetch the content of the file as it is in the master-branch. Similarly `git checkout df531a9 chapter2.rtf` will fetch it as it was in the commit `df531a9`
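
For the accidentally committed case, a minimal sketch could look like this. It uses HEAD^ to point at the commit before the accident, but a hash from git log works the same way. File contents, commit messages and the "Example Writer" identity are placeholders.

```shell
# Scaffolding: a throwaway repository with a good and a bad commit.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'The good version.\n' > chapter2.rtf
git add --all
git commit -m "Good version"
printf 'asdf\n' > chapter2.rtf
git add --all
git commit -m "Oops, the cat's version"

# Fetch the file as it was one commit ago.
git checkout HEAD^ chapter2.rtf
git status -sb        # M in the first column: the restored file is already staged

# Save the rescue as a new commit; the accident stays in the history.
git commit -m "Restore chapter 2 from before the cat"
cat chapter2.rtf
```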

Everything I worked on needs to be reset to my latest save!

The path . is “here and subfolders”, so you can run git checkout -- . to undo all the changes since the latest commit. This actually also works with a prior commit, but you may lose everything, see below for details.

I want to explore an entirely different revision of this

  1. Commit any changes you want to keep in the current branch, or you risk losing them
  2. Figure out which commit you want to go to (ex df531a9)
  3. Do you want to check this out in the current branch or in another/new branch?

Current branch

  1. To check it out in the current branch (maybe you want to get some quick info and can’t remember where it was), run git checkout <commit hash>. Git will warn you about a “detached HEAD”, which means that you are no longer on any branch, so nothing you do there will affect your branches
  2. To get back to the branch you were in, run git checkout <name of branch>

New branch

  1. To check out a new branch from a commit, run git checkout -b <new branch> <commit hash>
  2. Since this is a new branch, it acts in all ways as the branches mentioned above
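
Both routes can be tried in one go. The sketch below looks at an old commit directly (the detached HEAD route), returns to master, and then starts a new branch from that same commit. The branch name old_opening, the drafts and the "Example Writer" identity are made up for the example.

```shell
# Scaffolding: a throwaway repository with three commits.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
for n in 1 2 3; do
  printf 'Draft %s\n' "$n" > chapter1.txt
  git add --all
  git commit -m "Draft $n"
done

# Pick the commit to visit; normally you would copy its hash from git log.
old="$(git rev-parse --short HEAD^^)"

# Current branch route: look around, then go back.
git checkout "$old"           # Git prints its detached HEAD notice here
cat chapter1.txt              # the file as it was in that commit
git checkout master

# New branch route: start a branch from that commit.
git checkout -b old_opening "$old"
git status -sb                # heading reads: ## old_opening
```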

More than one backup

We writers are occasionally careful to the verge of paranoia when it comes to our work, so let’s look at how to add more points of backup.

Add more vaults!

  1. Using the guide from above, create a new bare repository
  2. Decide on a memorable name. For instance, usb or 2nd. Avoid using spaces in the name
  3. Add this using the following command: git remote add <name of vault> <path to vault>
  4. Run git push -u <name of vault> master
  5. Whenever you update the prior vault with git push, also run git push <name of vault>
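
A sketch of adding a second vault named usb follows. The "usb" folder here is just a subfolder of the throwaway sandbox standing in for a real USB stick, and the other names are placeholders as well.

```shell
# Scaffolding: first vault, working copy, one pushed commit.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init --bare "$sandbox/mynovel.git"
git -c init.defaultBranch=master clone "file://$sandbox/mynovel.git" "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'It was a dark and stormy night.\n' > chapter1.txt
git add . --all
git commit -m "First draft"
git push origin master

# Step 1: a second bare repository (pretend this folder is the USB stick).
mkdir -p "$sandbox/usb"
git -c init.defaultBranch=master init --bare "$sandbox/usb/mynovel.git"

# Steps 2 to 4: name it, add it, and push to it.
git remote add usb "file://$sandbox/usb/mynovel.git"
git push -u usb master

# Both vaults are now listed.
git remote -v
```

One thing to watch for: after git push -u usb master, a plain git push on this branch goes to usb. Spelling out the vault each time (git push origin master, git push usb master) avoids surprises.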

Use some kind of cloud vault

There are two major cloud services for hosting repositories: Bitbucket, which allows free private repositories, and GitHub, where you need to pay for any private repositories.

I personally like GitHub and have no issues with the $7/month that it costs me to have up to 5 private repositories; however, that’s at least partly habit rather than a choice against Bitbucket. Both of them have good tutorials (and a fairly easy-to-understand interface) for how to create an empty repository, and also for how to import an existing repository.

E-mail entire repository

Yes, you can e-mail the repository, or e-mail only certain changes. It uses a command called git bundle.

The entire repository

  1. Decide which branches you want. If you want all of them, the third argument is --all, if only one or more, they should be specified like this: master edit test
  2. Pick the path where you want to put the bundle, ex ../mynovel.bundle, which will put it one step above the root. You don’t need to name it .bundle, but it’s a good idea
  3. Run git bundle create <name of bundle> <branch(es)>
  4. E-mail the file to wherever you want it
  5. You can now clone from it, ex git clone <name of bundle> -b master <name of repo> or if you just want to update from the bundle, run git pull <name of bundle>
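
Here is a sketch of bundling a whole repository and cloning from the bundle. The e-mail step is skipped, since the bundle is an ordinary file. The names (mynovel.bundle, mynovel_restored) are placeholders, and I have put -b master before the bundle name, which is the order the Git documentation uses.

```shell
# Scaffolding: a throwaway repository with one commit.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
printf 'It was a dark and stormy night.\n' > chapter1.txt
git add --all
git commit -m "First draft"

# Bundle every branch into one file, one step above the root.
git bundle create ../mynovel.bundle --all

# On the receiving end: clone from the bundle like from any vault.
cd "$sandbox"
git clone -b master mynovel.bundle mynovel_restored
cat mynovel_restored/chapter1.txt
```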

Updates

Say you don’t actually want to e-mail the entire repository each time. Instead, you e-mail the complete repository once, and after that only the updates.

Run git log --oneline to see which commits you have. As an example, view the following log:

71b84da last commit - second repo
c99cf5b fourth commit - second repo
7011d3d third commit - second repo
9a466c5 second commit
b1ec324 first commit

In this example, you want to bundle everything newer than second commit.

  1. The hash you’re wanting is the one of the commit right before the first one you want to bundle, so in the example 9a466c5. You could also go with just adding the commits that are associated with a specific branch, say edit
  2. git bundle create <name of bundle> master ^<hash or branch name>
  3. E-mail it
  4. Check that the bundle is valid for the repository you are trying to apply it to by running git bundle verify <name of bundle>. It will object if you are missing one of the ancestor commits, or if there’s something else wrong with it
  5. To update the receiving repository (it needs to be a working copy, not a bare vault), run git pull <path to bundle> master. The branch name is needed because this bundle only contains master
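
The update flow is easier to follow with both ends on the same machine. In this sketch, "backup" is an ordinary clone standing in for the copy at the other end of the e-mail, and it already has the first two commits. All names are placeholders, and the pull names the master branch explicitly, because a bundle created this way only carries that branch.

```shell
# Scaffolding: a repository with two commits, and a copy that has both.
sandbox="$(mktemp -d)"
git -c init.defaultBranch=master init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
for n in first second; do
  printf '%s\n' "$n" >> chapter1.txt
  git add --all
  git commit -m "$n commit"
done
git clone "$sandbox/mynovel" "$sandbox/backup"

# Remember the last commit the other end already has, then keep writing.
base="$(git rev-parse --short HEAD)"
for n in third fourth; do
  printf '%s\n' "$n" >> chapter1.txt
  git add --all
  git commit -m "$n commit"
done

# Bundle only what is newer than that commit.
git bundle create ../update.bundle master "^$base"

# On the receiving end: verify first, then pull the updates in.
cd "$sandbox/backup"
git bundle verify ../update.bundle
git pull ../update.bundle master
git --no-pager log --oneline
```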

But I already have most of my novel in a particular folder, how does that work?

  1. Follow the steps to create the bare repository as above
  2. Go to your novel’s folder and run the following commands:
    1. git init
    2. git add .
    3. git commit -m "Initial commit"
    4. git remote add origin file:///path/to/repo.git
    5. git push -u origin master
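
Put together, and run in a throwaway folder, the steps look roughly like this. The mynovel folder with its single chapter file stands in for your existing novel, and all names plus the "Example Writer" identity are placeholders.

```shell
# Scaffolding: a folder that already contains writing, but no Git yet.
sandbox="$(mktemp -d)"
mkdir "$sandbox/mynovel"
printf 'It was a dark and stormy night.\n' > "$sandbox/mynovel/chapter1.txt"

# Step 1: create the bare repository (the vault).
git -c init.defaultBranch=master init --bare "$sandbox/mynovel.git"

# Step 2: turn the existing folder into a repository and connect it.
cd "$sandbox/mynovel"
git -c init.defaultBranch=master init
git config user.name "Example Writer"        # placeholder identity
git config user.email "writer@example.com"   # placeholder identity
git add .
git commit -m "Initial commit"
git remote add origin "file://$sandbox/mynovel.git"
git push -u origin master

# The heading now shows that master is linked to the vault.
git status -sb
```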

You probably recognize git remote add ... from how to add more vaults. You don’t need to name the vault origin, but it’s the standard name that git gives that position when you clone it.

You also probably recognize git push ..., so let’s see what these arguments do, in specific.

`-u`
This sets up a tracking relationship. It allows you to run `git push` to update that particular vault or `git pull` to update from it
`origin`
The name of the vault
`master`
The name of the branch, which must match between repositories.

Notes: master and origin are both default, which is why these commands are equivalent:

git push, git push origin, git push origin master.

However, if you are using a different name for the remote, you need to detail that, so the remote usb can only be pushed to using git push usb or git push usb master.

You can also track other branches than master, but this will not go into that.

Finishing thoughts

This is a work in progress. If you run into a hurdle, don’t hesitate to contact me via my contact form or on Twitter. The same goes if you have ideas on what else you’d like to do as a writer using revision control, or just want to chat!
