Mercurial hooks for ‘personal’ repositories

By default, Mercurial checks out files and creates repository metadata files in repo/.hg directories with a fairly permissive set of file attributes. With the usual umask of 022 on UNIX, all directories have read, write and execute permission for their “owner” and read and execute permission for “group” and “other”, and all files have read-write permission for their “owner” and read permission for “group” and “other”.
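
The exact modes depend on the umask in effect when the files are created. As a rough illustration, the sketch below creates a directory and a file under a umask of 022 (assumed here as a common default) and prints their modes. Mercurial is not involved, and the scratch directory and the names "d" and "f" are made up for the example.

```shell
# Sketch: new files and directories get their mode from the process umask.
# The scratch directory and the names "d" and "f" are illustrative only.
scratch=$(mktemp -d)

(
    umask 022              # assumed default; change it to see other results
    mkdir "$scratch/d"     # directories start from 777 and are masked to 755
    : > "$scratch/f"       # files start from 666 and are masked to 644
)

# Keep only the ten-character mode string from the ls output.
dir_mode=$(ls -ld "$scratch/d" | cut -c1-10)
file_mode=$(ls -ld "$scratch/f" | cut -c1-10)
echo "$dir_mode d"
echo "$file_mode f"

rm -rf "$scratch"
```

Replacing 022 with 077 in the subshell should leave both entries accessible to their owner only.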

Now that’s fine for workspaces whose data is ok to share with everyone, but it’s not very convenient for a workspace that holds data files which are “confidential” in any way: the files and directories holding this confidential data are immediately visible to everyone.

Even if umask is carefully set to 077 before you start working in a workspace with this confidential set of data, there is always the possibility of leaking some data to people who shouldn’t have access to it.

Thankfully, by using both umask and a small set of hook commands, Mercurial can be configured in a way which minimizes (but unfortunately does not completely eliminate) the window of time available to anyone trying to read the files of a workspace they shouldn’t have access to.

The hooks

The hooks which are useful for this sort of Mercurial setup are:

[hooks]
changegroup = find . | xargs chmod go-rwxo
commit = find . | xargs chmod go-rwxo
update = find . | xargs chmod go-rwxo

With this set of hooks, and umask set to 077, the directories and files of a Mercurial workspace only have read, write, or execute permission for their owner.
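
One caveat with the find/xargs pipeline: xargs splits its input on whitespace, so a path that contains a space is handed to chmod in pieces. A possible variant, sketched below on a throwaway directory, lets find run chmod itself via "-exec ... {} +". It uses the mode string go-rwx, and the scratch directory and file name are made up for the demonstration. Treat it as an alternative to consider, not as the hook used in the rest of this post.

```shell
# Sketch: tighten permissions without piping path names through xargs.
# The scratch directory and "my notes.txt" are illustrative only.
scratch=$(mktemp -d)

# Create a world-readable file whose name contains a space.
(umask 022; echo secret > "$scratch/my notes.txt")

# find hands each path to chmod as a single argument, spaces included.
(cd "$scratch" && find . -exec chmod go-rwx {} +)

notes_mode=$(ls -l "$scratch/my notes.txt" | cut -c1-10)
echo "$notes_mode my notes.txt"

rm -rf "$scratch"
```

As a hook, the same command would presumably be written as "commit = find . -exec chmod go-rwx {} +" in the [hooks] section.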

Setting up a new workspace

A new Mercurial workspace can be set up to use these hooks as shown below:

$ cd /tmp
$ mkdir hgtest
$ cd hgtest
$ hg init repo
$ cd repo
$ cat > .hg/hgrc << EOF
> [hooks]
> changegroup = find . | xargs chmod go-rwxo
> commit = find . | xargs chmod go-rwxo
> update = find . | xargs chmod go-rwxo
> EOF
$ umask 077
$ find .| xargs chmod go-rwxo
$ find . | xargs ls -ld
drwx------  3 keramida  wheel  512 May 10 00:26 .
drwx------  3 keramida  wheel  512 May 10 00:26 ./.hg
-rw-------  1 keramida  wheel   57 May 10 00:26 ./.hg/00changelog.i
-rw-------  1 keramida  wheel  127 May 10 00:26 ./.hg/hgrc
-rw-------  1 keramida  wheel   15 May 10 00:26 ./.hg/requires
drwx------  2 keramida  wheel  512 May 10 00:26 ./.hg/store
$

Now all the workspace files and directories are readable and writable by their owner, but inaccessible to everyone else.
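
To double-check a workspace without reading through the ls output by eye, something like the following sketch can list any path that still grants a permission bit to group or other. The function name leaky_paths is hypothetical, and the scratch files exist only to exercise it.

```shell
# leaky_paths DIR: print every path under DIR that has at least one
# group or other permission bit set (hypothetical helper, not part of hg).
leaky_paths() {
    find "$1" \( -perm -040 -o -perm -020 -o -perm -010 \
              -o -perm -004 -o -perm -002 -o -perm -001 \) -print
}

scratch=$(mktemp -d)        # mktemp -d creates the directory with mode 700
: > "$scratch/private.txt"; chmod 600 "$scratch/private.txt"
: > "$scratch/public.txt";  chmod 644 "$scratch/public.txt"

leaky=$(leaky_paths "$scratch")
echo "$leaky"               # only public.txt should be listed

rm -rf "$scratch"
```

Run as "leaky_paths ." from the top of a workspace, empty output would mean that nothing under it is accessible to group or other.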

Committing changesets

The initial setup of a personal workspace which is not readable by other people is fine, but committing changesets to the workspace should still leave the files in their “secure” state.

Note: The term “secure” shouldn’t be considered a guarantee that there is complete, fully functional, unbreakable security over the file sets of the workspace. If this sort of security is required, then it’s easier and probably much more secure to set things up with filesystem ACLs or some other form of kernel-enforced protection over the workspace area.

The “commit” and “update” hooks work in unison to make this happen. Let’s see this in action, by committing some changesets to our test workspace/repository:

$ echo 'The quick brown fox jumper over the lazy dog.' > demo.txt
$ cat -n demo.txt
     1  The quick brown fox jumper over the lazy dog.
$ ls -l demo*
-rw-------  1 keramida  wheel  46 May 10 00:39 demo.txt
$ hg add demo.txt
$ hg ci -u test -m 'Add demo text file' demo.txt
$ ls -l demo*
-rw-------  1 keramida  wheel  46 May 10 00:39 demo.txt
$ find .hg | xargs ls -ld
drwx------  3 keramida  wheel  512 May 10 00:40 .hg
-rw-------  1 keramida  wheel   57 May 10 00:26 .hg/00changelog.i
-rw-------  1 keramida  wheel   65 May 10 00:40 .hg/dirstate
-rw-------  1 keramida  wheel  127 May 10 00:26 .hg/hgrc
-rw-------  1 keramida  wheel   15 May 10 00:26 .hg/requires
drwx------  3 keramida  wheel  512 May 10 00:40 .hg/store
-rw-------  1 keramida  wheel  153 May 10 00:40 .hg/store/00changelog.i
-rw-------  1 keramida  wheel  115 May 10 00:40 .hg/store/00manifest.i
drwx------  2 keramida  wheel  512 May 10 00:40 .hg/store/data
-rw-------  1 keramida  wheel  111 May 10 00:40 .hg/store/data/demo.txt.i
-rw-------  1 keramida  wheel   49 May 10 00:40 .hg/store/undo
-rw-------  1 keramida  wheel   65 May 10 00:40 .hg/undo.dirstate
$

The workspace copy of the demo.txt file and the repository storage area have the correct set of permissions, both before the commit (by virtue of the 077 value of the umask) and after the commit (because of the “commit” hook).

We can see this in action, by using the --debug option of Mercurial while committing another changeset:

$ echo 'The quick brown fox jumper over the lazy dog.' >> demo.txt
$ cat -n demo.txt
     1  The quick brown fox jumper over the lazy dog.
     2  The quick brown fox jumper over the lazy dog.
$ hg diff --git .
diff --git a/demo.txt b/demo.txt
--- a/demo.txt
+++ b/demo.txt
@@ -1,1 +1,2 @@ The quick brown fox jumper over the lazy
 The quick brown fox jumper over the lazy dog.
+The quick brown fox jumper over the lazy dog.
$ hg --debug ci -u test -m 'Add some more demo text' demo.txt
demo.txt
running hook commit: find . | xargs chmod go-rwxo
$

Checking out a working copy

Just as the “commit” hook takes care of fixing the permissions at changeset commit time, the “update” hook takes care of fixing the permissions of the workspace when files are checked out.

Every time an “hg update” command runs to check out a working copy of the files, the “update” hook runs too, and tightens the permissions of any files that Mercurial checked out with the default permission set:

$ rm -f demo.txt
$ hg --debug update -C tip
resolving manifests
 overwrite True partial False
 ancestor d51df820d6ad local d51df820d6ad+ remote d51df820d6ad
 demo.txt: recreating -> g
getting demo.txt
running hook update: find . | xargs chmod go-rwxo
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ ls -ld demo*
-rw-------  1 keramida  wheel  92 May 10 00:50 demo.txt
$ cat -n demo.txt
     1  The quick brown fox jumper over the lazy dog.
     2  The quick brown fox jumper over the lazy dog.
$

Bugs

I haven’t really tried using these hooks in huge repositories, with many hundreds or even thousands of files, taking up gigabytes of disk space. It may be impractical to use these hooks in such large repositories, since traversing the entire working copy and the repository metadata area on every commit/update will probably slow things down a great deal.
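
One way to trim part of that cost, offered here as a sketch without any measurements behind it, is to have find select only the paths that still carry group or other bits, so that chmod is not invoked for entries that are already private. The tree still has to be walked in full. The function name tighten is made up for this example, as are the scratch directory and file names.

```shell
# tighten DIR: remove group/other permissions, but only from paths that
# still have some (hypothetical helper; the whole tree is still traversed).
tighten() {
    find "$1" \( -perm -040 -o -perm -020 -o -perm -010 \
              -o -perm -004 -o -perm -002 -o -perm -001 \) \
        -exec chmod go-rwx {} +
}

scratch=$(mktemp -d)
mkdir "$scratch/sub";     chmod 755 "$scratch/sub"
: > "$scratch/sub/a.txt"; chmod 644 "$scratch/sub/a.txt"

tighten "$scratch"

sub_mode=$(ls -ld "$scratch/sub" | cut -c1-10)
a_mode=$(ls -ld "$scratch/sub/a.txt" | cut -c1-10)
echo "$sub_mode sub"
echo "$a_mode sub/a.txt"

rm -rf "$scratch"
```

Whether this actually helps on a large repository would have to be measured; after the first run most paths are already private, so the expectation is that later runs pass few or no arguments to chmod.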


2 thoughts on “Mercurial hooks for ‘personal’ repositories”

  1. Martin

    This seems very complicated. Why don’t you just change the permissions on the directory containing your repositories so that it’s only readable by you?

  2. keramida Post author

    Because Mercurial uses predictable names for the “metadata” files in workspace/.hg/** files, and a determined person will be able to read those files instead of the workspace copies…
