Changing History

Overview

Teaching: 30 min
Exercises: 20 min
Questions
  • I’ve just made a mistake, how can I undo it?

Objectives
  • Be able to manipulate the content of the worktree, staging area and commit

  • Be able to change the last commit in a branch

  • Understand HEAD and the latest commit

  • Be able to move to historical states of the repository

In the last lesson we learnt about commits and how they chain together to form a sequence. In this lesson we’ll start to learn how to manipulate that sequence of commits, in the context of the three trees. We’ll explore this with a simple repository.

Exploring the repository

$ cd ~/git-demystified/example-blacksheep

Let’s take a look at the files

$ cat blacksheep.txt

This contains a few lines from the well-known nursery rhyme.

What about the others?

$ cat README.txt

This contains some simple README text. It is usually a good idea to have one of these in your repository. And the final file:

$ cat commit-number.txt

This contains a number that is incremented with each commit. Let’s look at the commit history:

$ git log --oneline

It seems we’ve added parts of the famous nursery rhyme “Baa, baa, black sheep” incrementally. Since this is a very small repository, let’s take a look at the contents of every commit. We can do this by asking git log for patches:

$ git log -p

This command would likely be too much information for all but a simple repository, but in this case it’s exactly what we need.

Reset

The reset command is used to manipulate the state of the three trees (working directory, staging area and current commit).

Reset soft

Let’s say that we decide the commits we saw previously are too granular. We created the last three commits independently while we were writing the file but, since they all add to the same line, it now makes sense to group them into a single commit. Let’s do this with the reset command

$ git reset --soft HEAD~3

The git reset command, with --soft specified, is the simplest variation of the reset command. It simply moves the tip of the current branch to point to a specified commit. In this case, it moves the branch back by three commits, so the later commits are no longer part of its history. Let’s take a look

$ git status

Since the staging area and working directory are not touched by git reset --soft, we still have the same changes ready to commit. Let’s see what they are:

$ git diff --staged

We see that the change to commit, i.e. the difference between the files in the staging area and the files in the last commit of the current branch, is indeed all of the changes made in the last three commits. Let’s now commit these as one change to get the change history we wanted:

$ git commit -m 'Added: Yes, sir, yes, sir, three bags full!'

Let’s take a look at our log to check what has happened

$ git log -p

There are now only two commits, and each one adds a complete line to the file. Note that we created a whole new commit.
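To see the whole sequence in one place, here is a minimal sketch that repeats the soft reset in a throwaway repository. The file name rhyme.txt, the commit messages and the temporary directory are assumptions made for illustration; they are not part of the example-blacksheep repository.

```shell
# Throwaway repository; the path, file name and messages are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# One base commit, then three small commits that each extend the same file.
echo "Baa, baa, black sheep" > rhyme.txt
git add rhyme.txt
git commit -q -m "First line"
for part in "Yes, sir," "yes, sir," "three bags full"; do
    echo "$part" >> rhyme.txt
    git commit -q -a -m "Added: $part"
done
before=$(git rev-list --count HEAD)

# Move the branch tip back three commits; index and working tree are untouched.
git reset --soft HEAD~3
staged=$(git diff --staged --name-only)

# Record the three changes as one commit.
git commit -q -m "Added: Yes, sir, yes, sir, three bags full"
after=$(git rev-list --count HEAD)

echo "commits before: $before, after: $after, staged in between: $staged"
```

The file itself never changes during the reset; only the branch tip moves, which is why the combined change is already staged when we commit.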

Undoing the change

The old commits are still there if we want them. We can change the repository back to the way it was before by resetting the current branch to the commit which we first downloaded. Let’s do that now, to get the exercise back to the point it was at when we started.

$ git reset --soft origin/master

The reference origin/master is created by git clone, and is a reference to the master branch on the remote repository. We’ll explain this in more detail later.

Let’s look at the repository history again, with

$ git log -p

Reset HEAD and the staging area

In this case, we had all of our content in the staging area ready to commit again. This is useful when we want to re-create a commit with the same content, but often we would rather choose again which changes to add. That lets us rewind some changes and re-commit them as a different set of commits.

When we leave out the --soft option the default in that case is to perform a “mixed” reset. We could use --mixed to indicate this, but since it’s the default we’ll leave this out. This works by moving HEAD, and then copying the current contents of HEAD into the staging area. Effectively, this un-adds files.

Let’s see how it works.

$ git reset HEAD~2

What has this done?

$ git status

This time the differences appear in the working directory, unlike with the --soft option where they were staged, and the staging area is the same as the current HEAD.

We can confirm that HEAD has moved back by two commits with:

$ git log --oneline

And see the differences with:

$ git diff

The changes from the last two commits are all waiting to be staged. This is because the contents of the commit that HEAD has been moved to have also been copied into the staging area.
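The same behaviour can be checked in a scratch repository. This is a sketch only: the file notes.txt and the commit messages are made up for the example.

```shell
# Throwaway repository; notes.txt and the messages are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each adding one line to the same file.
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Commit $n"
done

# Mixed reset (the default): move the branch back and copy HEAD into the index.
git reset -q HEAD~2

staged=$(git diff --staged --name-only)   # empty: the index matches the new HEAD
unstaged=$(git diff --name-only)          # notes.txt: the changes are only in the working directory
commits=$(git rev-list --count HEAD)      # one commit left on the branch

echo "staged: '$staged', unstaged: '$unstaged', commits: $commits"
```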

We can reset our repository to the way it was before with

$ git reset origin/master

And check this with

$ git status

and

$ git log

Reset everything: treat with care

The final type of reset we can do is called a “hard” reset. Hard is the “next level up” from mixed.

Let’s open the file blacksheep.txt and add the line:

One for the faster and one of same.

We check what has changed in the working directory:

$ git diff

Let’s add this file to the staging area (maybe we hadn’t spotted the typos yet) and create a commit

$ git add blacksheep.txt
$ git commit -m 'Added another line'

And check that it behaved as we expected

$ git log -p

Ahhh….but now we spot our mistake!

We could use a mixed reset to step back by one commit, and then change the files - but maybe we would rather just go back to before we made the changes in the first place and reset the files as well. This is where a hard reset comes in. It does everything a mixed reset does, but also drops all the changes in the working directory. Let’s try it

$ git reset --hard HEAD~

We expect no changes in the working directory, let’s check that

$ git status

We also expect the file to be back as it was before we made the changes

$ cat blacksheep.txt

And finally, we expect the final commit to have disappeared, let us check the log

$ git log

Be VERY, VERY careful with a hard reset. If there are changes in your working directory that haven’t been committed yet, you can very easily lose them. It is one of the very few commands in git that will allow you to delete some of your work. If you use this command, it is very likely that you are trying to lose changes - make sure this is what you want.
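To see just how thorough a hard reset is, here is a small sketch in a throwaway repository. The files rhyme.txt and draft.txt are invented for the example.

```shell
# Throwaway repository; file names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Baa, baa, black sheep" > rhyme.txt
git add rhyme.txt
git commit -q -m "First line"

# Work that has never been committed: an edit and a newly staged file.
echo "One for the faster" >> rhyme.txt
echo "scratch" > draft.txt
git add draft.txt

# Hard reset to the current commit: index and working directory are overwritten.
git reset -q --hard HEAD

lines=$(wc -l < rhyme.txt | tr -d ' ')
status=$(git status --porcelain)
echo "lines in rhyme.txt: $lines, status: '$status'"
```

Both the unstaged edit and the staged new file are gone afterwards, and neither was ever stored in a commit.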

What about the commits?

Before attempting this challenge, reset your repository with git reset --hard origin/master

Use git reset to throw away the last commit, but keep the changes in the index:

$ git reset --soft HEAD~

Check that this has worked using a git log command. Recreate the commit with the same commit message:

$ git commit -m ', three bags full'

What do you notice that is different about the commit?

You can use

$ git cat-file -p <some-commit-id>

to take a closer look, where <some-commit-id> is the ID of a commit. This shows all the information that git knows about that commit (the command works for many other objects too). Can you guess by running this command why the commit ID might be different? Can you guess what might happen if you had already shared this commit with someone else and they had work based on it?

Solution

Recreating a commit changes the commit ID. You should not do this if this is a commit that you have already shared with others, as git will see these as two independent commits. If you push this to a repository, other people may not be able to integrate it with their work.
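The following sketch shows one way to observe this. It pins the commit dates with the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables so that the difference is reproducible; the dates, the file name rhyme.txt and the throwaway repository are assumptions for the example.

```shell
# Throwaway repository; file name and dates are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

export GIT_AUTHOR_DATE="2020-01-01 12:00:00 +0000"
export GIT_COMMITTER_DATE="2020-01-01 12:00:00 +0000"
echo "Yes, sir, yes, sir" > rhyme.txt
git add rhyme.txt
git commit -q -m "Yes, sir, yes, sir"
echo "three bags full" >> rhyme.txt
git commit -q -a -m ", three bags full"
old_id=$(git rev-parse HEAD)
old_tree=$(git rev-parse 'HEAD^{tree}')

# Throw away the commit but keep its changes staged, then recreate it later.
git reset -q --soft HEAD~
export GIT_AUTHOR_DATE="2020-01-01 13:00:00 +0000"
export GIT_COMMITTER_DATE="2020-01-01 13:00:00 +0000"
git commit -q -m ", three bags full"
new_id=$(git rev-parse HEAD)
new_tree=$(git rev-parse 'HEAD^{tree}')

# Same tree, parent and message; only the recorded dates differ.
git cat-file -p "$old_id"
git cat-file -p "$new_id"
```

The two commits record exactly the same files, yet their IDs differ because the ID is computed from everything in the commit object, including the author and committer timestamps.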

Back in time

Before attempting this challenge, reset your repository with git reset --hard origin/master

Copying a file from the current commit into the staging area is the opposite action to adding changes to it. You can restrict the action of reset to a file with:

$ git reset -- filename

Make some changes to a file, add that file to the staging area, and use git reset to undo the action of git add.

Solution

Add changes to a file with

$ git add <file>

then reset the files with

$ git reset -- <file>

or

$ git reset HEAD -- <file>

Note that if we leave out HEAD, git will assume we want to copy from the HEAD reference by default.

Gone with the wind

Before attempting this challenge, reset your repository with git reset --hard origin/master

git reset --hard is most useful to throw away all the changes in the current working directory (and the staging area) and start again from the files in the last commit (HEAD). Make some changes in your repository, without adding them to the staging area, check them with git status, then blow away the changes by doing a hard reset to HEAD. Use git status to check that the changes have gone.

Do the same thing again, but this time try adding changes to the staging area before doing a hard reset.

Solution

Make some changes to any files in the current directory. Verify that changes have been made with

$ git status

then reset the files with

$ git reset --hard HEAD

you will lose your changes in this way. Check that the changes have gone with

$ git status

The files in both the working directory and the index will be reset.

Without a HEAD

What happens if we do a hard reset, but leave out the place to copy files from, like this

$ git reset --hard

Can you work out where the files come from? Hint: it may help to make some changes to the files in the current directory first.

Solution

If the origin of the files is not specified, it is assumed to be HEAD by default.

Checkout on files

The checkout command from earlier has an important variant when passed files as arguments. In this case it behaves very differently. Let’s reset our repository to the way it is on the remote server to begin with.

$ git reset --hard origin/master

Let us take a look at the content of the commit-number.txt

$ cat commit-number.txt

Now, let’s perform a checkout, this time specifying the commit-number.txt file.

$ git checkout HEAD~3 -- commit-number.txt

What happened? Previously checkout would have moved HEAD.

$ git log --oneline origin/master

In fact, we’re still on the same commit, HEAD hasn’t moved at all this time. It doesn’t make sense to move HEAD for some files and keep it in the same place for others, that would get confusing very quickly. Only the file copy operations have been performed. Let’s see what effect this has had.

$ git status

The file commit-number.txt has been copied from the earlier commit HEAD~3 into both our working directory and the staging area. We can verify the changes with

$ git diff --staged

The file commit-number.txt has changed and nothing else has. In this case git checkout with a file behaves very much like we would expect git reset --hard to behave with files. It overwrites the file in the staging area and working directory and resets any changes. For this reason, git does not accept a hard reset with file paths:

$ git reset --hard HEAD~3 -- commit-number.txt

This is not a valid command, since it would perform the same operation as the git checkout command.
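Here is a minimal sketch of checkout with a file, run in a throwaway repository that only imitates the commit-number.txt file from the lesson (the four commits and their contents are assumptions for the example).

```shell
# Throwaway repository that imitates commit-number.txt; contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3 4; do
    echo "$n" > commit-number.txt
    git add commit-number.txt
    git commit -q -m "Commit $n"
done
head_before=$(git rev-parse HEAD)

# Copy one file from three commits ago into the index and the working directory.
git checkout HEAD~3 -- commit-number.txt

head_after=$(git rev-parse HEAD)
content=$(cat commit-number.txt)
staged=$(git diff --staged --name-only)   # the old version is already staged
unstaged=$(git diff --name-only)          # empty: working directory matches the index

echo "HEAD moved: $([ "$head_before" = "$head_after" ] && echo no || echo yes)"
echo "content: $content, staged: '$staged', unstaged: '$unstaged'"
```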

Reset with files

Using git reset with files allows us to copy specific files from a commit into the index, leaving the working directory unchanged. Let’s reset our repository to the way it was at the beginning of this lesson

$ git reset --hard origin/master

Let’s make some changes to README.txt

A sample repository containing the nursery rhyme "Baa, baa, black sheep".
This repository is used to demonstrate reset and checkout for git.

and copy them to the staging area.

$ git add README.txt
$ git status

We can use git reset to copy the version in the last commit back into the staging area, effectively undoing the add:

$ git reset HEAD -- README.txt

Shorthand Reset

If we leave out the commit (HEAD in this case), git will default to HEAD, so we could have achieved the same result with git reset README.txt. Let’s add the file again and try it.

Just one reset

Note, only the mixed (default) version of reset makes sense with files. Changing only the position of the branch label (i.e. --soft) doesn’t make sense with files. The --hard variant would make sense, but is equivalent to checkout with files, so doesn’t exist.
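As a quick check of reset with files, the sketch below stages an edit and then unstages it again in a throwaway repository. The README text is shortened and the repository is not the lesson repository.

```shell
# Throwaway repository; README contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "A sample repository" > README.txt
git add README.txt
git commit -q -m "Add README"

# Edit the file and stage the edit.
echo "Used to demonstrate reset and checkout" >> README.txt
git add README.txt
staged_before=$(git diff --staged --name-only)

# Copy README.txt from HEAD into the index only; the edit stays on disk.
git reset -q HEAD -- README.txt

staged_after=$(git diff --staged --name-only)
lines=$(wc -l < README.txt | tr -d ' ')
echo "staged before: '$staged_before', after: '$staged_after', lines on disk: $lines"
```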

a-HEAD or not on a-HEAD

What happens when we use reset while we’re in a detached HEAD state? How does it differ from when we use reset when a branch is checked out? Run the following two sets of commands - what do they do differently? What is different in the output of the final log command? Why?

$ git checkout master
$ git checkout HEAD~1
$ git reset --hard <commit-id>
$ git log --oneline origin/master

$ git checkout master
$ git reset --hard <commit-id>
$ git log --oneline origin/master

Solution

When reset is run and we are not on any branch (e.g. in a detached HEAD state), then the reset command cannot change the current branch. It is an operation that doesn’t make sense when we have no current branch. In this case, reset will simply move the pointer HEAD, leaving the branch where it is.
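The difference can be sketched in a throwaway repository as follows. The branch name is looked up rather than assumed, and the three numbered commits are invented for the example.

```shell
# Throwaway repository; the commits are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "$n" > commit-number.txt
    git add commit-number.txt
    git commit -q -m "Commit $n"
done
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
first=$(git rev-parse HEAD~2)
tip=$(git rev-parse HEAD)

# Detached HEAD: reset moves HEAD only, the branch stays at its tip.
git checkout -q --detach HEAD~1
git reset -q --hard "$first"
branch_after_detached=$(git rev-parse "$branch")

# Branch checked out: reset moves the branch (and HEAD with it).
git checkout -q "$branch"
git reset -q --hard "$first"
branch_after_attached=$(git rev-parse "$branch")

echo "tip:            $tip"
echo "after detached: $branch_after_detached"
echo "after attached: $branch_after_attached"
```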

The disappearing command

What happens if you run git reset --soft HEAD -- <filename> or git reset --hard HEAD -- <filename> with a file in the working directory? Can you guess why this is the behaviour?

Solution

Neither of these two commands exists. git reset --soft with files makes no sense, since --soft operates on the current commit only. git reset --hard with files could make sense, but would be exactly the same as checkout with files, therefore only one of the two is implemented.
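A short sketch confirms that git refuses both forms and leaves the file alone. The repository and the README.txt contents are made up for the example, and git's error messages are discarded so that only the outcome is recorded.

```shell
# Throwaway repository; README contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "text" > README.txt
git add README.txt
git commit -q -m "Add README"
echo "more" >> README.txt   # an uncommitted edit

# Both commands exit with an error; record whether git accepted them.
if git reset --soft HEAD -- README.txt 2>/dev/null; then
    soft_result=accepted
else
    soft_result=rejected
fi
if git reset --hard HEAD -- README.txt 2>/dev/null; then
    hard_result=accepted
else
    hard_result=rejected
fi

lines=$(wc -l < README.txt | tr -d ' ')   # the edit is still there
echo "soft: $soft_result, hard: $hard_result, lines on disk: $lines"
```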

The dangers of checkout

What happens if you make some modification to README.mdown, add these changes to the staging area with

$ git add README.mdown

and then try to checkout the file with

$ git checkout HEAD -- README.mdown

Can you guess what will happen? Is this potentially dangerous to do?

Solution

The command

$ git checkout HEAD -- <filename>

will overwrite the file filename, even if there are changes. Be careful as you can lose your changes in this way. This command is a useful way to undo any changes you may have made to the files in your working directory.

The way things were

Can you use the checkout command to create a commit which contains the file README.mdown as it was 4 commits ago? Hint: you can refer to four commits ago with HEAD~4

Solution

The file can be brought into the current directory with

$ git checkout HEAD~4 -- README.mdown

All that remains is to create a new commit, with a command such as

$ git commit -m 'README.mdown as it was in HEAD~4'

Without a HEAD

Can you work out what the following command does

$ git checkout -- README.mdown

Hint: try making some changes to README.mdown and running the command.

Solution

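This command reverts the file README.mdown in the working directory to the version in the staging area. If no changes to the file have been staged, this has the same effect as running

$ git checkout HEAD -- README.mdown

If the commit is not specified, git copies the file from the index rather than from a commit, so any staged changes are kept.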

The Reference Log

Let’s change directory back to the gitflow repository from earlier

$ cd ~/example-gitflow

Let’s check we’re on the master branch, with

$ git checkout master

And we’ll create a file in our repository

$ touch this-file-will-be-lost.txt

And add it

$ git add this-file-will-be-lost.txt

But we won’t commit it. Let’s check the state of the repository

$ git status

And let’s take a look at the history

$ git log --oneline --graph -10

OK, I don’t like this, I’m going to delete it all and go back to the way it was before we started this morning.

$ git reset --hard origin/master

Bam! We appear to have lost all the hard work we put in this morning. Is there a way to get it back? The branch master has changed, so it’s no good looking at that. Remember, a commit doesn’t know its children, so it’s no good starting from the current commit to look for later commits. Let’s use the following command

$ git log -g

This looks just like the output of git log, but in fact shows all the actions which have changed the reference HEAD (i.e. the current commit). If we wanted all the actions which have changed the branch master, we could run

$ git log -g master

Once we identify the commit we want, we can refer to it either with the commit ID or with its position in the appropriate reflog. Let’s take a look at the commit before we move to it

$ git show HEAD@{1}

That looks like the commit we want, let’s reset the current branch to that point

$ git reset --hard HEAD@{1}

This means “go back to the state HEAD was in one move ago”. We could also have used a reference such as master@{1} here, if we wanted the master branch. We can verify that the current branch has changed in the way we wanted with

$ git log --oneline --graph --all

We can also use references anywhere we would normally use a commit, for example

$ git log master@{yesterday}

or

$ git show master@{2.hours.ago}
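The recovery can be rehearsed safely in a throwaway repository. In this sketch the three numbered commits stand in for the morning's work; they are assumptions for the example, not the contents of the gitflow repository.

```shell
# Throwaway repository; the commits are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "$n" > commit-number.txt
    git add commit-number.txt
    git commit -q -m "Commit $n"
done
tip=$(git rev-parse HEAD)

# "Lose" the last two commits.
git reset -q --hard HEAD~2
remaining=$(git rev-list --count HEAD)

# The reflog still records where HEAD used to be.
git --no-pager log -g --oneline
previous=$(git rev-parse 'HEAD@{1}')

# Move the branch back to where HEAD was one move ago.
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)

echo "remaining after loss: $remaining, recovered tip: $recovered"
```

Only committed work can be recovered this way; changes that were never committed do not appear in the reflog.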

Referring to Commits

We’ve seen a bunch of different types of arguments passed to commands such as git checkout. For example, references to HEAD

$ git log -1 HEAD

or to a local branch

$ git log -1 master

or a remote branch

$ git log -1 origin/master

or a tagged commit

$ git log -1 0.4.1

or a reference log entry

$ git log -1 HEAD@{5}

or a reference as it was some time ago

$ git log -1 master@{1.hour.10.minutes.ago}

or an abbreviated commit ID

$ git log -1 1ffb

or a stash reference

$ git log stash@{1}

or an ancestry reference

$ git log HEAD~3

Git is clever about allowing you to use whichever way is most convenient, and will ultimately (in most cases) translate them all to a commit ID behind the scenes.
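One way to watch this translation happen is git rev-parse, which prints the commit ID that a reference resolves to. The sketch below uses a throwaway repository; the tag name 0.4.1 is borrowed from the example above and created here only for illustration.

```shell
# Throwaway repository; the commits and the tag are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "$n" > commit-number.txt
    git add commit-number.txt
    git commit -q -m "Commit $n"
done
git tag 0.4.1
branch=$(git symbolic-ref --short HEAD)
full=$(git rev-parse HEAD)
short=$(git rev-parse --short HEAD)

# Resolve several different spellings of the same commit.
ids=""
for ref in HEAD "$branch" 0.4.1 'HEAD@{0}' "$short" 'HEAD~0'; do
    id=$(git rev-parse --verify "$ref^{commit}")
    echo "$ref -> $id"
    ids="$ids$id
"
done
distinct=$(printf '%s' "$ids" | sort -u | wc -l | tr -d ' ')
echo "distinct commit IDs: $distinct"
```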

Key Points

  • Learnt to change the state of the index and working tree using git reset

  • Learnt to change the commit which this branch points to with git reset

  • Learnt to selectively pick up historical versions of files with git checkout

  • Understand HEAD, master and the latest commit