Using Git

Originally posted at 15:35 on Wed 6 February 2008, last modified at 12:45 on Sat 9 February 2008.

File under: 3rd party tools git phd programming scm software

Source Code Management (SCM) is one of those things that computer scientists and geeks love to talk about, and I think I've just worked out why! When you get a new SCM tool working, and the coin drops - it's awesome! It's like the most impressive Hello World ever. Previously I've used Concurrent Versions System (CVS), on the only software project I've actively collaborated on, and when I started my PhD I used Subversion. I'm a follower of fashion when it comes to SCMs, and as you've probably seen, Git has been spotted all over the internet. So I thought I'd investigate, and this is what I've discovered after using it for around 3 months.

So what's it all about?

Git is a distributed version control system. It is the brainchild of Linus Torvalds and was created to manage the Linux kernel. Consequently, it's not really Windows friendly, so if you need to work on Windows, you should check out Mercurial. A distributed version control system (VCS) is one where every user (or, if you prefer, every working copy) holds the whole repository. Any changes you make go into your local repository, which means that git is very fast and that you can commit without an internet connection to a central repository. Additionally, git tracks changes in file content, not the files themselves, which is why a git repository doesn't take up that much more space than the equivalent Subversion working copy.

What about sharing with others then? In general a software project will have a public repository, which users clone to create their own local repositories. When a user is ready, they push their changes to the public repository and pull everyone else's changes from it. This architecture is shown in the figure below. Note that the public repository is just a computer somewhere, and the same machine can hold a local, private repository to develop in too.

Typical setup of software project using git with 3 users
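
To make the picture concrete, here is a minimal sketch of that setup on a single machine, with a bare repository standing in for the public one. The directory names (public.git, alice, bob) and the identity are made up for illustration, and it assumes a git newer than the 1.5.4 used in this post (for git init --bare and git rev-list --count).

```shell
set -e
work=$(mktemp -d)                       # scratch area; every path below is hypothetical
git init -q --bare "$work/public.git"   # stands in for the shared public repository

# First user: clone, commit locally (no network involved), then publish.
# git warns that the cloned repository is empty; that is expected here.
git clone -q "$work/public.git" "$work/alice"
cd "$work/alice"
git config user.name "Alice"
git config user.email alice@example.com
echo "hello" > README
git add README
git commit -q -m "Add README"
git push -q origin HEAD                 # publish the current branch

# Second user: a fresh clone carries the whole history with it.
git clone -q "$work/public.git" "$work/bob"
cd "$work/bob"
commits=$(git rev-list --count HEAD)
echo "bob sees $commits commit(s)"
```

The commit happens entirely inside alice; only the push touches the shared repository.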

Getting started

Have a look at the git webpage for download links. Installation is fairly straightforward:

Tobias:~ dt05r$ cd Downloads/git-1.5.4
Tobias:git-1.5.4 dt05r$ ./configure --prefix=/usr/local
...
Tobias:git-1.5.4 dt05r$ make all doc
Tobias:git-1.5.4 dt05r$ sudo make install

If you already have git, then you can use it to update itself:

Tobias:~ dt05r$ cd Downloads/
Tobias:Downloads dt05r$ git clone http://www.kernel.org/pub/scm/git/git.git
...
Tobias:Downloads dt05r$ cd git/
Tobias:git dt05r$ make configure
Tobias:git dt05r$ ./configure --prefix=/usr/local
...
Tobias:git dt05r$ make all
Tobias:git dt05r$ sudo make install

If you're on OS X, as I am, then just don't bother installing the man pages, because they depend on asciidoc, which depends on a bunch of other stuff, and so on and so on. It's not worth it.

At this point, you can use git to fetch public repositories to play with, which is sort of useful I guess. You can even play around with unimportant public repositories at http://repo.or.cz/. From here onwards, however, I'm going to assume that you want to use git to manage your own software project, either on your own across multiple computers or between multiple users. It amounts to the same thing.

Creating a bare repository

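The first thing to do is to create the repository that is going to be the public repository; we can then clone it for each user/machine. (Strictly speaking, plain git init creates an ordinary repository with a working tree and a .git directory inside it. A truly bare repository, which is what you would normally push to, is created with the --bare option and has no .git subdirectory, so the clone paths used below would then lose their trailing /.git.)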

Tobias:~ dt05r$ cd /path/to/my/public/repository
Tobias:repository dt05r$ git init
Initialized empty Git repository in .git/
Tobias:repository dt05r$
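
For comparison, here is a small sketch that creates a bare repository next to an ordinary one and asks git which is which. The directory names are hypothetical, and it assumes a git recent enough to accept git init --bare and the -C option, both of which are newer than the 1.5.4 used in this post.

```shell
set -e
work=$(mktemp -d)                       # hypothetical scratch location
git init -q --bare "$work/public.git"   # bare: history only, no checked-out files
git init -q "$work/normal"              # ordinary: working tree plus a .git directory

bare=$(git -C "$work/public.git" rev-parse --is-bare-repository)
normal=$(git -C "$work/normal" rev-parse --is-bare-repository)
echo "public.git bare? $bare"
echo "normal bare?     $normal"
```

The bare one is the safer target for git push, because there is no checked-out working tree for a push to get out of step with.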

At this point there isn't anything in the repository; we have to add some files to it. However, I think it's wise to clone this repository to your local user space on the same machine, and then add files to the local repository:

Tobias:~ dt05r$ cd ~/Documents
Tobias:Documents dt05r$ git clone /path/to/my/public/repository/.git MyRepository
Initialized empty Git repository in /Users/dt05r/Documents/MyRepository/.git/
Tobias:Documents dt05r$

Before you add any files, it's probably a wise idea to set up some configuration options, such as the following:

$ git config --global user.name "Your Name"
$ git config --global user.email name@domain.com
Tobias:Documents dt05r$
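
If you want to check what actually got stored, git config can read the values back. This sketch points HOME at a throwaway directory (my assumption, purely so the experiment cannot touch a real ~/.gitconfig); on your own machine you would skip the first two lines.

```shell
set -e
HOME=$(mktemp -d); export HOME          # throwaway home directory, for the experiment only
unset XDG_CONFIG_HOME                   # make sure git writes to $HOME/.gitconfig

git config --global user.name "Your Name"
git config --global user.email name@domain.com

name=$(git config --global --get user.name)
email=$(git config --global --get user.email)
echo "$name <$email>"
```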

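You will probably also want to create an ignore file. You can keep a global one at ~/.gitignore, although git only reads it once you have pointed the core.excludesfile option at it (git config --global core.excludesfile ~/.gitignore). You can also include repository-specific .gitignore files in the root directory of a repository; see the gitignore man page for more details. The format is simply one pattern per line, like: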

*.DS_Store
*.aux
*.bbl
*.blg
*.log
*.out
*.toc
*.consolelog
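
A quick way to see the patterns at work, sketched under a few assumptions: HOME is redirected to a throwaway directory so a real configuration is left alone, the file names are invented, and only two of the patterns above are used.

```shell
set -e
HOME=$(mktemp -d); export HOME          # throwaway home directory, for the experiment only
unset XDG_CONFIG_HOME
printf '%s\n' '*.aux' '*.log' > "$HOME/.gitignore"
git config --global core.excludesfile "$HOME/.gitignore"   # tell git where the global ignore file is

repo=$(mktemp -d)                       # hypothetical scratch repository
cd "$repo"
git init -q
touch thesis.tex thesis.aux thesis.log

untracked=$(git status --porcelain)     # lists only what git does not ignore
echo "$untracked"
```

Only thesis.tex shows up as untracked; the .aux and .log files are silently skipped.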

Once you've set up your global configuration options (there are lots more), you can add your content:

Tobias:~ dt05r$ cd ~/Documents/MyRepository/
Tobias:Documents/MyRepository dt05r$ git add some/files/I/want/to/add/*

If you then call git status in your repository, it will show you the content that is to be added (and any other modifications). To actually add it to the repository, we now issue

Tobias:Documents/MyRepository dt05r$ git commit -a

This is essentially the basic work cycle: add files, make changes, commit the changes. However, we can do some more interesting stuff...
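
Before moving on, here is the whole cycle in one go, as a sketch in a scratch repository. The file name, identity, and commit messages are placeholders of mine, and git rev-list --count needs a git newer than 1.5.4.

```shell
set -e
repo=$(mktemp -d)                       # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Your Name"        # set locally so the sketch does not rely on global config
git config user.email name@domain.com

echo "first draft" > notes.txt
git add notes.txt                       # stage a new file
git commit -q -m "Add notes"            # record it

echo "second draft" >> notes.txt        # change a file git already tracks
git status --short                      # shows notes.txt as modified
git commit -q -a -m "Revise notes"      # -a stages every tracked file that changed

count=$(git rev-list --count HEAD)
git log --oneline
echo "$count commits so far"
```

Note that -a only picks up changes to files git already knows about; brand new files still need a git add first.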

Connecting to a remote repository

To check the repository out to another machine we have some options: git can be served over http, ssh and its own git protocol. I'm going to focus on ssh, because it's the easiest and most straightforward. Assuming you are able to log into the remote machine using ssh, getting your git repository is a doddle:

$ git clone ssh://username@server.address.com/path/to/my/public/repository/.git MyRepository

Git will then create a repository which is a copy of the public one. You can then bring in other people's changes, and send your own commits to the public repository, using the pull and push commands:

~ $ cd MyRepository
MyRepository $ git pull
....
MyRepository $ git commit -a -m "These are my local changes, and this is a rubbish commit message"
MyRepository $ git push

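The important thing to remember is that pull and push, run without arguments, fall back on defaults: by default pull fetches from the remote you cloned from (origin) and merges into the branch you are on, and which branches push sends depends on your git version and configuration. If you want to pull and push a specific branch, then read on...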
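
In the meantime, naming the remote and the branch explicitly takes the guesswork out. This is a sketch on one machine with made-up directory names (public.git, laptop, desktop); it looks up the branch name rather than assuming master, and uses options (--bare on init, --ff-only, rev-list --count) that need a git newer than 1.5.4.

```shell
set -e
work=$(mktemp -d)                       # scratch area; every path below is hypothetical
git init -q --bare "$work/public.git"

git clone -q "$work/public.git" "$work/laptop"    # warns about an empty repository; expected
cd "$work/laptop"
git config user.name "Your Name"
git config user.email name@domain.com
echo "one" > file.txt
git add file.txt
git commit -q -m "First"
branch=$(git symbolic-ref --short HEAD) # master in this post's era, possibly main today
git push -q origin "$branch"            # remote first, then branch

git clone -q "$work/public.git" "$work/desktop"

echo "two" >> file.txt                  # more work on the laptop
git commit -q -a -m "Second"
git push -q origin "$branch"

cd "$work/desktop"
git pull -q --ff-only origin "$branch"  # fetch that branch and fast-forward to it
seen=$(git rev-list --count HEAD)
echo "desktop now has $seen commits"
```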

Git is all about branches

I never, ever, used branches in Subversion. It really seemed like a lot of work, keeping track of all the different revision points. Luckily, git was designed with the intention of making branching and merging easy - and it is! To create a local branch:

MyRepository $ git branch Foo
MyRepository $ git checkout Foo

The branch subcommand can list all the branches (use the -a option, as in the screenshot at the top), create a local branch from the one you are on (master, as above), and, as below, set up a local branch that tracks a remote one:

MyRepository $ git branch --track Bar origin/Bar
MyRepository $ git checkout Bar

We use git checkout to actually change to the branch. In order to push a local branch to the public server, it's important to use the git push command correctly:

MyRepository $ git checkout Foo
MyRepository $ git add some/files/*
MyRepository $ git commit -a
MyRepository $ git push origin Foo
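
The argument order for push is the remote name first and the branch second. Here is a sketch of the full round trip, publishing Foo from one clone and tracking it from another. The directory names (public.git, one, two) are invented, and it again assumes a git newer than 1.5.4.

```shell
set -e
work=$(mktemp -d)                       # scratch area; every path below is hypothetical
git init -q --bare "$work/public.git"

git clone -q "$work/public.git" "$work/one"       # warns about an empty repository; expected
cd "$work/one"
git config user.name "Your Name"
git config user.email name@domain.com
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
git push -q origin HEAD                 # publish the starting branch

git branch Foo                          # create the branch
git checkout -q Foo                     # switch to it
echo "feature" >> file.txt
git commit -q -a -m "Work on Foo"
git push -q origin Foo                  # remote name, then branch name

git clone -q "$work/public.git" "$work/two"
cd "$work/two"
git branch --track Foo origin/Foo       # local Foo that follows the published one
git checkout -q Foo
tip=$(git log -1 --format=%s)
echo "tip of Foo in the second clone: $tip"
```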

One last thing: if you've screwed up your branch, or even the master, and haven't committed the mess, then you can throw the changes away and go back to the HEAD state using

MyRepository $ git checkout -f master

which has saved me a couple of times.
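
A small sketch of that rescue in a scratch repository. The file name and contents are placeholders, and the branch name is looked up rather than assumed to be master.

```shell
set -e
repo=$(mktemp -d)                       # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email name@domain.com

echo "good" > file.txt
git add file.txt
git commit -q -m "Known good state"
branch=$(git symbolic-ref --short HEAD) # master in this post's era, possibly main today

echo "broken" > file.txt                # an uncommitted mess
git checkout -q -f "$branch"            # force tracked files back to the last commit
content=$(cat file.txt)
echo "$content"
```

Be warned that this throws away uncommitted edits to tracked files for good. Untracked files are left where they are.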

So, yeah, I think that's about it. I might post again about tags and serving over http etc if I ever get round to it.
