Mercurial Clones without a Working Copy

Mercurial repository clones can have two parts:

  1. An .hg/ subdirectory, where all the repository metadata is stored
  2. A “working copy” area, where checked out files may live

The .hg/ subdirectory stores the metadata of the specific clone: the history of all the changesets it contains, clone-specific hooks and scripts, information about local tags and bookmarks, and so on. It is the only part that a functional Mercurial repository actually needs.

The “working copy” area is everything under the clone that is not under its top-level .hg/ subdirectory. The working copy of a Mercurial repository may contain a snapshot of the files stored in the repository: either a clean snapshot, checked out from one of the changesets stored in the repository itself, or a locally modified version of a changeset.

One important detail that may not be apparent from the descriptions above is that:

Even if you have already checked out a particular version, you can delete everything except the .hg/ subdirectory and the Mercurial repository will still function normally.

Clones Without a Working Copy

An example is a good way to demonstrate how a clone still functions as a Mercurial repository without a working copy. Let’s assume that you have a tiny repository at /tmp/hgdemo that contains revisions of just a small hello.c program:

% pwd
/tmp/hgdemo
% hg root
/tmp/hgdemo
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

% hg manifest tip
hello.c
%

You can check out the latest revision of hello.c with the “hg checkout” command:

% hg checkout --clean tip
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% cat -n hello.c
     1  #include <stdio.h>
     2  #include <stdlib.h>
     3  
     4  int
     5  main(void)
     6  {
     7      printf("Hello world\n");
     8      return EXIT_SUCCESS;
     9  }
%

The repository does not need a checkout to function though. The fact that your working copy has been updated to a particular revision is independent of the way the repository machinery under .hg/ works. So you can remove the source of hello.c and still use the repository to browse the history of the project:

% rm -f hello.c
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

%

With a clone like this it is still possible to use any Mercurial command that does not require a working copy, e.g. “hg diff” to look at the differences between two arbitrary revisions:

% hg diff -r 0:1
diff -r 041227edc91b -r c48ee3a9fd78 hello.c
--- a/hello.c   Mon Jan 11 08:32:59 2010 +0200
+++ b/hello.c   Mon Jan 11 08:33:28 2010 +0200
@@ -1,8 +1,9 @@
 #include <stdio.h>
+#include <stdlib.h>
 
 int
 main(void)
 {
     printf("Hello world\n");
-    return 0;
+    return EXIT_SUCCESS;
 }
%
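
Other read-only commands work the same way. As a small sketch (the helper name show_file_at is hypothetical, not something Mercurial ships), “hg cat” can print a file as it was recorded in any revision, reading it from the history under .hg/ without checking anything out:

```shell
# Print file $3 as recorded in revision $2 of the clone at $1.
# The content is read from the history under .hg/, so no checked-out
# copy of the file is needed.
show_file_at() {
    # hg cat resolves file names relative to the current directory,
    # so run it from inside the clone.
    (cd "$1" && hg cat --rev "$2" "$3")
}
```

For the demo repository above, “show_file_at /tmp/hgdemo 0 hello.c” should print the first version of hello.c, the one that still returns a hard-coded zero.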

You can even check out the “null” revision (a reserved revision name that Mercurial treats as the empty parent of the first revision, i.e. “not any revision stored in this repository”):

% hg checkout --clean null
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
% hg identify --id --branch
000000000000 default

When a Mercurial clone has checked out the null revision, all the tracked files of the working copy are removed. If the clone does not contain untracked files, such as build-time artifacts, you should see only the .hg/ subdirectory when you look at the clone:

% find . -maxdepth 2 -exec /bin/ls -1 -dF {} +
./
./.hg/
./.hg/00changelog.i
./.hg/branch
./.hg/dirstate
./.hg/last-message.txt
./.hg/requires
./.hg/store/
./.hg/tags.cache
./.hg/undo.branch
./.hg/undo.dirstate
%
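
The two steps above (check out null, then confirm the result) can be wrapped in a small shell function. This is only a sketch: the name strip_working_copy is hypothetical, and it calls “hg update”, the command that “hg checkout” is an alias for. Like the manual steps, it removes tracked files only; untracked files such as build artifacts stay where they are.

```shell
# Remove the tracked files of the clone at $1 (default: current directory)
# by updating it to the null revision, then verify the result.
strip_working_copy() {
    repo=${1:-.}
    hg --repository "$repo" update --clean null || return 1
    # "hg identify --id" prints twelve zeros when the parent of the
    # working copy is the null revision.
    [ "$(hg --repository "$repo" identify --id)" = "000000000000" ]
}
```

The function returns success only when the clone really ends up at the null revision, so it can be used in a script as “strip_working_copy /tmp/hgdemo || echo failed”.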

The disk space such a clone requires is determined only by the size of the history metadata.

Why Would You Want Such a Clone

For a small repository like the one shown in this example, it seems pretty useless to be able to have a Mercurial clone without a working copy. You don’t really gain much by deleting the source of a small 9-line C program. The space savings of doing that are quite insignificant.

If, however, you are hosting clones of large repositories on a web server somewhere, stripping the working copy of those clones can be very handy: it may save a large part of the disk space you would otherwise need to keep working copies around. By “large repository” I mean something like a single clone with hundreds or thousands of files, or a clone whose working copy requires tens or hundreds of megabytes of data.

The OpenSolaris onnv-gate repository is one of the large repositories that use Mercurial. My own Mercurial-based mirror of the FreeBSD head branch is another example for which I readily have size data. Size information for these two repositories is shown in the table below:

                          FreeBSD head/ branch   OpenSolaris
                          since 2008-01-01       onnv-gate repository
  Tracked files           41,807                 44,784
  Changesets              15,513                 11,462
  Size of .hg store       238 MB                 292 MB
  Size of working copy    385 MB                 543 MB

Both of these Mercurial repositories have a moderately large number of files. It’s also important that the size of the working copy exceeds the size of the .hg/ repository store in both cases. In the onnv-gate repository of OpenSolaris the working copy needs almost twice as much space as the entire history of the project. That’s a lot of disk space to carry around in all your local clones of onnv-gate!
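
Figures like the ones in the table can be approximated with du. The sketch below is one possible way to do it, not necessarily how the numbers above were measured. The function name clone_sizes is hypothetical, and the working copy figure is simply the total size of the clone minus the size of .hg/.

```shell
# Report, in KiB, the size of .hg/ and of everything else in the clone at $1.
clone_sizes() {
    clone=${1:-.}
    store_kb=$(du -sk "$clone/.hg" | awk '{ print $1 }')
    total_kb=$(du -sk "$clone" | awk '{ print $1 }')
    # Whatever is not under .hg/ counts as working copy (this includes
    # untracked files and the clone's top-level directory entry).
    echo "store_kb=$store_kb working_copy_kb=$((total_kb - store_kb))"
}
```

Running “clone_sizes /path/to/clone” before and after checking out the null revision gives a rough idea of how much space the working copy takes.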

If all you are looking for is a local mirror of the project sources — so that you can look at the history of a project, browse the diffs committed over time, search for interesting commit information (e.g. “when was bug 6801336 fixed in OpenSolaris?”) — carrying around a full working copy is probably a waste of space. Updating the files of the working copy after every pull operation from the upstream master-repository is a waste of time too.
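
A mirror like that can be kept up to date without ever creating a working copy. The sketch below assumes a hypothetical mirror_sync helper that takes a source URL or path and a destination directory: it clones with --noupdate (the long form of -U) the first time, and afterwards uses a plain “hg pull”, which fetches new changesets but does not touch the working copy unless you ask for that with -u.

```shell
# Create or refresh a history-only mirror.
#   $1: source repository URL or path (placeholder, supply your own)
#   $2: local mirror directory (placeholder, supply your own)
mirror_sync() {
    src=$1
    dest=$2
    if [ -d "$dest/.hg" ]; then
        # Plain "hg pull" (no -u/--update) fetches new changesets
        # but leaves the working copy alone.
        hg --repository "$dest" pull "$src"
    else
        # --noupdate clones the history without checking out any files.
        hg clone --noupdate "$src" "$dest"
    fi
}
```

The same function can be called from a cron job or a script after every upstream change; history commands such as “hg log” and “hg diff -r” keep working on the mirror as shown earlier.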

5 thoughts on “Mercurial Clones without a Working Copy”

  2. Hari

    While cloning, you can specify the “-U” or “--noupdate” option to clone without creating a working copy (that way you don’t end up with a working copy to begin with).

  3. Éric Araujo

    > the “null” revision (a magic revision name which Mercurial treats as
    > “not any revision stored in this repository”)

    Actually it’s more “the parent of the first revision”, with a slight pitfall: The first revision has the local number 0, so the null rev is -1. I always use “null” when I want to remove my working copy; 00000000 would also work.
