Revision control for writers

We all know the more common way of dealing with backups and revisions. It goes along these lines:

  1. Copy file
  2. Paste file
  3. Rename file with a date
  4. Copy the folder with all the revisions to a spare hard drive, etc.

Now, I’m fairly tech-savvy (which is probably evident from the rest of the articles), but since revisions are such a large part of writing, I hope that I will be able to explain in an accessible way how to use revision control.

If there is some­thing in par­tic­u­lar that is mak­ing you stum­ble, feel free to con­tact me through mail or through Twit­ter. As a note, I main­ly work in Lin­ux, so my Win­dows skills are lack­ing, but I can at least help to trou­bleshoot.

If you pre­fer to have a more visu­al inter­face, here’s a good list. How­ev­er, the arti­cle will assume you are using the ter­mi­nal.

Any line that looks like this is meant to be copied and pasted into the terminal of your chosen operating system. The Windows Git installation comes with its own terminal (link below, under Installation). To simplify the text, it will say something like “run some command”, which is shorthand for:

  1. Copy the line
  2. Paste the line in the ter­mi­nal, at the root of your work­ing folder/directory
  3. Press Enter

Installation

There are sev­er­al dif­fer­ent kinds of revi­sion con­trol sys­tems. To sim­pli­fy, this will focus on the one I use the most, Git.

Install it on …

First steps of revision control

Getting started with Git

The first thing you’ll need to do is let Git know who you are. You can configure Git using the command git config, but the two specifics needed here are name and e-mail.

  1. Run git config --global user.name "YOUR NAME"
  2. Run git config --global user.email "YOUR E-MAIL ADDRESS"

Setting up the vault for your novel

The word you will hear is repos­i­to­ry. I went with the syn­onym vault as more rec­og­niz­able. First, a metaphor for what we’re going to do:

Imagine that you have an assistant who keeps track of every change you make to your novel. They don’t actually have the novel itself, but should you lose your copy, they have the information to recreate it, as long as you have given it to them. That is the vault / bare repository we will set up. In fact, you can have several of these assistants, if you are of the more paranoid persuasion. If you are, suggestions on where to store things are detailed further down.

  1. Decide where you want to store the backup / vault / assistant and create a folder for it. A recommendation is to give it a name ending in .git, e.g. mynovel.git. Make a note of where this is, you will need it
  2. In the terminal, go to the folder. If you’re using Windows, you can right-click when you are in the right folder and select the Git Bash option
  3. Run git init --bare to create the repository

Duplicate (clone) the repository

Now into where you’re actu­al­ly writ­ing.

  1. Cre­ate the fold­er that you want to work in
  2. Go to the fold­er itself, as per step 2 above
  3. Run git clone file:///path/to/repo.git (this creates a subfolder named after the repository, here repo)

At this point you have one back­up repos­i­to­ry and one work­ing repos­i­to­ry.
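
To make the two sets of steps above a bit more concrete, here is a minimal sketch that does everything inside a throwaway folder, so it is safe to paste as-is. The folder names (mynovel.git, writing) and the sandbox variable are only examples I picked for illustration; in real use you would put the vault wherever you decided to keep your backup.

```shell
# Work inside a throwaway folder so nothing real is touched.
sandbox="$(mktemp -d)"

# 1. Create the vault (bare repository).
mkdir "$sandbox/mynovel.git"
cd "$sandbox/mynovel.git"
git init --bare

# 2. Create the folder you write in and clone the vault into it.
mkdir "$sandbox/writing"
cd "$sandbox/writing"
git clone "file://$sandbox/mynovel.git"

# The clone lands in a subfolder named after the vault, minus ".git".
cd mynovel
git remote -v   # lists the vault under the name "origin"
```

Git will warn that you cloned an empty repository. That is expected at this point, since nothing has been saved to the vault yet.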

Start writing

As a note before we continue with the different commands: revision control works for any kind of document. However, if you really want to get the best mileage out of it, you should avoid the latest versions of Word. This is because of how it internally deals with your file, but it’s not a big thing. However, if you needed the push to move away from writing in Word, here it is!

Saving (committing) changes

While writ­ing, you should save as nor­mal. How­ev­er, when you are at a good point to check things in (what those points are is up to you), go into the ter­mi­nal and the fold­er you have the project in. Run git status -sb.

It will show you some­thing like the fol­low­ing:

## master
 M README.md
 D LICENCE-MIT
A  fluff
?? fluff2
UU fluff3

The first line (## master) is the headline. It states which branch (more on branches later) you’re on.

Space before symbol: the file is not marked (staged) to be saved
M before filename: the file has been changed since you last saved
D before filename: the file has been deleted since you last saved
A before filename: the file has been added and is staged to be saved
?? before filename: the file has been added, but is not staged to be saved
UU before filename: there are “conflicts” in the file (unlikely to show up in the first few steps)

  1. To mark all files (even the delet­ed ones) as to-be-saved, run git add . --all in the fold­er where every­thing is saved
  2. Run git commit -m "Something to remind you what you changed", which will do the actu­al save
  3. Final­ly, run git push origin master. This reports the changes to the bare repos­i­to­ry we cre­at­ed above
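
Put together, one full save cycle could look like the sketch below. It builds its own vault and working folder in a throwaway location, so the paths, the file name chapter1.txt and the "Example Writer" identity are all made up for the example. The git symbolic-ref line is only there because some newer Git setups name the first branch main rather than master, and this article assumes master.

```shell
# Throwaway vault and working folder.
sandbox="$(mktemp -d)"
git init --bare "$sandbox/mynovel.git"
git clone "file://$sandbox/mynovel.git" "$sandbox/mynovel"
cd "$sandbox/mynovel"

# Identity for this sandbox only (not needed if you ran the --global commands).
git config user.name "Example Writer"
git config user.email "writer@example.com"

# Make sure the branch is called master, as the article assumes.
git symbolic-ref HEAD refs/heads/master

# Write something, then look at the status.
echo "It was a dark and stormy night." > chapter1.txt
git status -sb          # chapter1.txt is listed with ?? in front

# Stage, commit and report to the vault.
git add --all
git status -sb          # chapter1.txt is now listed with A in front
git commit -m "Started chapter 1"
git push origin master
```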

My computer crashed, and now I need to recover the backup!

  1. Cre­ate the fold­er that you want to work in
  2. In the ter­mi­nal, go to the fold­er
  3. Run git clone file:///path/to/repo.git

Yes, those are the same steps as above. I repeat­ed them here for ease of use.

Summary of the first steps

This is a very simplistic flow. You can do far more advanced things, but if you just want an easy way to back up and have some kind of check on what a scene looked like three days ago, this will do the trick. If you want to whet your appetite a bit more, continue on.

Digging deeper into the rabbit hole

Branches

You may want to experiment a bit with a text. Using the old way, you’d be copying the manuscript file and working on that. A similar approach is to create a branch off of the main manuscript tree. This branch can be treated in whichever way you want. If you completely mess things up, you can just discard it and go back to your default / untouched master branch. Branch names should be single words, or maybe underscore-separated words, such as my_branch.

Commands

Create a new branch as you move to (checkout) it: run git checkout -b <name of branch>
Move to a branch: run git checkout <name of branch>
Create a new branch without moving to it: run git branch <name of branch>
Merge another branch into the one you are on: run git merge <name of branch>

Examples

Assume that you want to edit chapter 1. The flow will go roughly like this, assuming you name the branch chapter1_edit:

  1. Run git checkout -b chapter1_edit. Run git status -sb to con­firm that the head­ing has the name chapter1_edit.
  2. Make changes. Add (git add --all) and commit (git commit -m "my changes") them. Repeat until done.
  3. Run git checkout master to get back to the mas­ter branch. If you have changes that are not com­mit­ted, they will fol­low along to the branch, and if those changes clash with the cur­rent state, you can­not change branch. That’s why you’re rec­om­mend­ed to always com­mit your changes, even if you decide that you don’t like them and will not keep them.
  4. If you want to keep the changes, run git merge <name of branch>. You may get a merge conflict, at which point you are best off googling “resolving conflict in git”. (It’s not that difficult, but this is a brief run-through.)

To delete a branch, run git branch -d <name of branch>. If it is a branch that hasn’t been merged into the branch you are at (maybe you decid­ed to not keep those edits), the com­mand is git branch -D <name of branch>, that is a cap­i­tal D rather than low­er­case.
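
As a rough sketch, the whole branch round trip (create, edit, merge, delete) might look like this in a throwaway repository. The file name, the commit messages and the "Example Writer" identity are invented for the example; only chapter1_edit comes from the steps above.

```shell
# Throwaway repository with one commit on master.
sandbox="$(mktemp -d)"
git init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master   # newer Git setups may default to "main"

echo "First draft of chapter 1." > chapter1.txt
git add --all
git commit -m "First draft"

# 1. Create the branch and move to it.
git checkout -b chapter1_edit
git status -sb          # the heading now names chapter1_edit

# 2. Make changes, add and commit them.
echo "Edited draft of chapter 1." > chapter1.txt
git add --all
git commit -m "Tightened the opening"

# 3. Go back to master, where the file still has the first draft.
git checkout master

# 4. Keep the edits by merging them in, then tidy up the branch.
git merge chapter1_edit
git branch -d chapter1_edit
```

Because master did not change while you were editing, this merge cannot conflict. Conflicts only become possible when both branches have touched the same part of a file.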

Finding prior versions

The ear­li­er exam­ples have been deal­ing with Git as (more or less) a back­up tool. How­ev­er, the point of a revi­sion con­trol sys­tem is to be able to find a pri­or ver­sion.

View change log

Let’s start with the eas­i­est: Run git log. This will give you a list of com­mits, along these lines:

commit d24ef819ff969c27c5b25d75e1d9059b7129564d
Author: Marie Hogebrandt <iam@mariehogebrandt.se>
Date:   Mon Jun 16 08:57:17 2014 +0200

    Removed all filter words

The first row (commit d24ef819ff969c27c5b25d75e1d9059b7129564d) is known as the com­mit hash and is used to ref­er­ence this par­tic­u­lar com­mit. You can short­en it to about 7 char­ac­ters for sim­plic­i­ty, so in this exact case d24ef81.

Every­thing below the Date: line is the com­mit mes­sage. It’s what­ev­er you wrote after -m when you saved it. If you need more than just a sin­gle line, you don’t actu­al­ly need to use the for­mat git commit -m "message", but can use git commit at which point it will open up an edi­tor in the ter­mi­nal. I will write a brief expla­na­tion on Vim (which I under­stand is the Win­dows one it opens) fur­ther down.

Use the arrow keys to nav­i­gate the log, and once you’re done, press q.

To just see a quick list of changes, run git log --oneline, which will give you the com­mit hash and mes­sage.

View differences between two commits

So, you want to see the dif­fer­ence between two points in time. Let’s say that in chapter2.rtf you removed the bar­tender, but now you need to use her in chap­ter 5, and you don’t remem­ber what you named her, or what that impor­tant scar looks like, and thus you want to read the orig­i­nal one.

Definition: HEAD (note: capital letters). HEAD is the point you are at after every save point. It’s the unique commit hash (to use that term) for “this is where I am”. If you want to compare HEAD to one or two commits back, the easiest is to use the ^ operator.

Assume we have four commits: the current place (the last time you committed) and the three times before that. To reference them without knowing the commit hash, it’s as follows:

Current place: HEAD
Commit prior to where you are now: HEAD^
Two commits prior to where you are now: HEAD^^
Three commits prior to where you are now: HEAD^^^

However, if you are looking at commits that are further apart (say HEAD^^^ and the one three commits before that), you’re better off using git log to figure out the commit hashes.

To see the dif­fer­ence between two com­mits on a spe­cif­ic file (in this case HEAD, HEAD^^, and on chapter2.rtf) run git diff HEAD^^ HEAD chapter2.rtf. That is, more gen­er­alised:

  1. Pick the ear­li­est point. That’s the first argu­ment
  2. Pick the end-point. That’s the sec­ond argu­ment
  3. Pick the file you want to look at. If you want to see all dif­fer­ences, you can leave off the third argu­ment
  4. Run it: git diff <earliest> <latest> <path/to/file>
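
Here is a small sketch of the bartender scenario, again in a throwaway repository. The file is plain text (chapter2.txt) rather than .rtf to keep the output readable, and the bartender’s name, the sentences and the "Example Writer" identity are all invented for the example.

```shell
# Throwaway repository.
sandbox="$(mktemp -d)"
git init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"

# Three commits: the bartender appears, the scene grows, the bartender is cut.
echo "Mara the bartender had a scar across her left eyebrow." > chapter2.txt
git add --all
git commit -m "Introduced the bartender"

echo "The bar was quiet." >> chapter2.txt
git add --all
git commit -m "Added atmosphere"

echo "The bar was empty." > chapter2.txt
git add --all
git commit -m "Removed the bartender"

# Earliest point first, end-point second, file last.
git diff HEAD^^ HEAD chapter2.txt
```

Lines starting with - existed at the earliest point but are gone at the end-point, so the bartender’s description shows up with a - in front. Lines starting with + are the ones that were added.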

Pick up an earlier revision

Depending on exactly what you want to do, you can go about this in a few different ways. In all of these cases, we’ll assume that the cat managed to get into your document and delete most of it.

Unstage

You acci­den­tal­ly marked the file your kit­ty cre­at­ed as to be saved, which you don’t want. Run git reset HEAD <file/kitty/created>, and the file is back to being unmarked for sav­ing.

Go back to the latest commit

You saved the doc­u­ment, but you haven’t com­mit­ted it yet, so what you want is to undo all the changes to it. Run git checkout -- <file/kitty/changed>, which will fetch the lat­est com­mit­ted ver­sion of that file.
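
The two recoveries so far (unstaging, and going back to the latest commit) can be tried out safely with a sketch like this one. It uses a throwaway repository, and the file name, contents and "Example Writer" identity are made up for the example.

```shell
# Throwaway repository with one good commit.
sandbox="$(mktemp -d)"
git init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"

echo "A carefully written scene." > chapter1.txt
git add --all
git commit -m "Scene one"

# The cat walks across the keyboard, and you stage the result by mistake.
echo "asdfghjkl;" > chapter1.txt
git add --all
git status -sb          # M in the first column: staged

# Unstage: the file keeps the cat's changes but is no longer marked for saving.
git reset HEAD chapter1.txt
git status -sb          # M in the second column: changed, not staged

# Go back to the latest commit: the cat's changes are thrown away.
git checkout -- chapter1.txt
cat chapter1.txt
```

Note that the last step cannot be undone. Whatever was in the file and not committed is gone, which is exactly what you want when the cat wrote it.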

Whoops, I accidentally committed it!

Now the versatility of the git checkout command is going to come into play. That’s the one we use to change branches, and now it can be used on specific files as well?

Yes. You can even use it to check out spe­cif­ic com­mits, but let’s not get ahead of our­selves.

git checkout <name of branch>: moves you to that specific branch
git checkout -b <name of branch>: creates the branch and then moves you to it
git checkout -- <filename>: replaces the current changes in <filename> with the content of the file in the latest save
git checkout <branch or commit hash> <filename>: replaces the current changes in <filename> with the content of the file in the specified branch or commit hash. For instance, git checkout master chapter2.rtf will fetch the content of the file as it is in the master branch. Similarly, git checkout df531a9 chapter2.rtf will fetch it as it was in the commit df531a9

Everything I worked on needs to be reset to my latest save!

The path . means “here and subfolders”, so you can run git checkout -- . to undo all the changes since the latest commit. This actually also works with a prior commit, but you may lose everything; see below for details.

I want to explore an entirely different revision of this

  1. Com­mit any changes you want to keep in the cur­rent branch, or you risk los­ing them
  2. Figure out which commit you want to go to (e.g. df531a9)
  3. Do you want to check this out in the cur­rent branch or in another/new branch?

Current branch

  1. To check it out in the current branch (maybe you want to get some quick info and can’t remember where it was), run git checkout <commit hash>. Git will mention a “detached HEAD”, which means that nothing you do there will affect your branches
  2. To get back to the branch you were in, run git checkout <name of branch>

New branch

  1. To check out a new branch from a commit, run git checkout -b <new branch> <commit hash>
  2. Since this is a new branch, it acts in all ways as the branch­es men­tioned above
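
Both variants can be sketched like this, in a throwaway repository. I use HEAD^ as a stand-in for a commit hash you would normally copy from git log, and I write the branch command in the order Git’s own documentation uses (new branch name first, then the commit). The branch name old_opening, the file contents and the "Example Writer" identity are made up for the example.

```shell
# Throwaway repository with two commits on master.
sandbox="$(mktemp -d)"
git init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master   # newer Git setups may default to "main"

echo "Opening, version one." > chapter1.txt
git add --all
git commit -m "First opening"

echo "Opening, version two." > chapter1.txt
git add --all
git commit -m "Second opening"

# Current branch: take a quick look at the older commit, then go back.
git checkout HEAD^
cat chapter1.txt        # the file as it was in the first commit
git checkout master

# New branch: start a branch from the older commit and stay there.
git checkout -b old_opening HEAD^
git status -sb          # the heading now names old_opening
```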

More than one backup

We writers are occasionally careful to the verge of paranoid when it comes to our work, so let’s look at how to add more points of backup.

Add more vaults!

  1. Using the guide from above, cre­ate a new bare repos­i­to­ry
  2. Decide on a mem­o­rable name. For instance, usb or 2nd. Avoid using spaces in the name
  3. Add this using the fol­low­ing com­mand: git remote add <name of vault> <path to vault>
  4. Run git push -u <name of vault> master
  5. When­ev­er you update the pri­or vault with git push, also run git push <name of vault>
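
A minimal sketch of a second vault, assuming a made-up folder called usb inside a throwaway location standing in for a real USB drive. The first vault, the file and the "Example Writer" identity are likewise only there to make the example run on its own.

```shell
# Throwaway first vault plus working folder with one commit.
sandbox="$(mktemp -d)"
git init --bare "$sandbox/mynovel.git"
git clone "file://$sandbox/mynovel.git" "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master   # newer Git setups may default to "main"

echo "Chapter one." > chapter1.txt
git add --all
git commit -m "Chapter one"
git push origin master

# 1. Create the second vault (pretend this folder is the USB drive).
mkdir -p "$sandbox/usb"
git init --bare "$sandbox/usb/mynovel.git"

# 2-3. Give it a memorable name and add it.
git remote add usb "$sandbox/usb/mynovel.git"

# 4. Send everything there for the first time.
git push -u usb master

git remote -v           # lists both origin and usb
```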

Use some kind of cloud vault

There are two major cloud services for hosting repositories. Bitbucket allows free private repositories, while GitHub requires you to pay for any private repositories.

I personally like GitHub and have no issues with the $7/month that it costs me to have up to 5 private repositories; however, that’s at least partly habit rather than a choice against Bitbucket. Both of them have good tutorials (and a fairly easy-to-understand interface) for how to create an empty repository, and also for how to import an existing repository.

E-mail entire repository

Yes, you can e-mail the repository, or e-mail only certain changes. It uses a command called git bundle.

The entire repository

  1. Decide which branches you want. If you want all of them, the third argument is --all; if only one or more, they should be specified like this: master edit test
  2. Pick the path where you want to put the bundle, e.g. ../mynovel.bundle, which will put it one step above the root. You don’t need to name it .bundle, but it’s a good idea
  3. Run git bundle create <name of bundle> <branch(es)>
  4. E-mail the file to wherever you want it
  5. You can now clone from it, e.g. git clone <name of bundle> -b master <name of repo>, or if you just want to update from the bundle, run git pull <name of bundle>
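
As a sketch, bundling a whole repository and cloning it back out again could look like this. Everything happens in a throwaway folder, the e-mail step is skipped, and the names mynovel.bundle and restored as well as the "Example Writer" identity are just examples.

```shell
# Throwaway repository with one commit on master.
sandbox="$(mktemp -d)"
git init "$sandbox/mynovel"
cd "$sandbox/mynovel"
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master   # newer Git setups may default to "main"

echo "Chapter one." > chapter1.txt
git add --all
git commit -m "Chapter one"

# Bundle all branches into a single file one step above the root.
git bundle create ../mynovel.bundle --all

# (This is where you would e-mail mynovel.bundle to yourself.)

# On the receiving end: clone from the bundle as if it were a vault.
cd "$sandbox"
git clone mynovel.bundle -b master restored
ls restored             # chapter1.txt is back
```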

Updates

You decide you don’t actually want to e-mail the entire repository each time. Rather, having e-mailed the complete repository once, you now just want to send the updates.

Run git log --oneline to see which com­mits you have. As an exam­ple, view the fol­low­ing log:

71b84da last commit - second repo
c99cf5b fourth commit - second repo
7011d3d third commit - second repo
9a466c5 second commit
b1ec324 first commit

In this exam­ple, you want to bun­dle every­thing new­er than second commit.

  1. The hash you want is the one of the commit right before the first one you want to bundle, so in the example 9a466c5. You could also go with just adding the commits that are associated with a specific branch, say edit
  2. Run git bundle create <name of bundle> master ^<hash or branch name>
  3. E-mail it
  4. Check that the bundle is valid for the repository you are trying to apply it to by running git bundle verify <name of bundle>. It will object if either you are missing one of the ancestor commits, or there’s something else wrong with it
  5. To update your backup repository, run git pull <path to bundle> master
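
Here is a sketch of the update flow from start to finish, with a few assumptions of my own. The backup is a regular clone rather than a bare vault (git pull needs a working folder), I name the branch explicitly when pulling since a bundle made this way only contains master, and git rev-parse stands in for reading the hash off git log --oneline by eye. All folder and file names, and the "Example Writer" identity, are invented for the example.

```shell
# Throwaway working repository with two commits.
sandbox="$(mktemp -d)"
git init "$sandbox/work"
cd "$sandbox/work"
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master   # newer Git setups may default to "main"

echo "one" > notes.txt
git add --all
git commit -m "first commit"
echo "two" >> notes.txt
git add --all
git commit -m "second commit"

# The backup already has everything up to "second commit".
git clone "$sandbox/work" "$sandbox/backup"

# Remember the hash of the last commit the backup has.
base="$(git rev-parse --short HEAD)"

# Two more commits that the backup has not seen yet.
echo "three" >> notes.txt
git add --all
git commit -m "third commit"
echo "four" >> notes.txt
git add --all
git commit -m "fourth commit"

# Bundle only what is newer than that commit.
git bundle create "$sandbox/update.bundle" master "^$base"

# On the backup side: verify, then pull the updates in.
cd "$sandbox/backup"
git bundle verify "$sandbox/update.bundle"
git pull "$sandbox/update.bundle" master
git log --oneline       # now lists all four commits
```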

But I already have most of my novel in a particular folder, how does that work?

  1. Fol­low the steps to cre­ate the bare repos­i­to­ry as above
  2. Go to your novel’s fold­er and run the fol­low­ing com­mands:
    1. git init
    2. git add .
    3. git commit -m "Initial commit"
    4. git remote add origin file:///path/to/repo.git
    5. git push -u origin master
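
The same commands, run against a throwaway folder that pretends to be your existing novel, might look like the sketch below. The folder and file names and the "Example Writer" identity are made up. The git symbolic-ref line is my own addition, because some newer Git setups name the first branch main rather than master.

```shell
# Throwaway vault, plus a folder that already contains some writing.
sandbox="$(mktemp -d)"
git init --bare "$sandbox/mynovel.git"
mkdir "$sandbox/mynovel"
cd "$sandbox/mynovel"
echo "Chapter one, written long before Git." > chapter1.txt

# Turn the existing folder into a repository.
git init
git config user.name "Example Writer"
git config user.email "writer@example.com"
git symbolic-ref HEAD refs/heads/master

# Save everything that is already there and connect it to the vault.
git add .
git commit -m "Initial commit"
git remote add origin "file://$sandbox/mynovel.git"
git push -u origin master

git status -sb          # the heading now mentions origin/master as well
```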

You prob­a­bly rec­og­nize git remote add ... from how to add more vaults. You don’t need to name the vault ori­gin, but it’s the stan­dard name that git gives that posi­tion when you clone it.

You also probably recognize git push ..., so let’s see what these arguments do, specifically.

-u: this sets up a tracking relationship. It allows you to run git push <name of vault> to update that particular vault or git pull <name of vault> to update from it
origin: the name of the vault
master: the name of the branch, which must match between repositories

Notes: mas­ter and ori­gin are both default, which is why these com­mands are equiv­a­lent:

git push, git push origin, git push origin master.

How­ev­er, if you are using a dif­fer­ent name for the remote, you need to detail that, so the remote usb can only be pushed to using git push usb or git push usb master.

You can also track other branches than master, but this article will not go into that.

Finishing thoughts

This is a work in progress. If you run into a hurdle, don’t hesitate to contact me via my contact form or on Twitter. The same goes if you have ideas on what else you’d like to do as a writer using revision control, or just want to chat!

