Monthly Archives: April 2013

Speeding Up Emacs and Parsing Emacs Lisp from Emacs Lisp

I recently spent a bit of time to clean up all the cruft that my ~/.emacs file and my ~/elisp directory had accumulated. I have been using a multi-file setup to configure my Emacs sessions, since at least 2008. This turned out to be a royal mess after 5+ years of patching stuff without a very clear plan or structure. The total line-count of both my ~/.emacs and all the *.el files I had imported into my ~/elisp directory was almost 20,000 lines of code:

$ wc -l BACKUP/.emacs $( find BACKUP/elisp -name '*.el')
   119 BACKUP/.emacs
    84 BACKUP/elisp/keramida-w3m.el
    90 BACKUP/elisp/keramida-keys.el
   156 BACKUP/elisp/keramida-irc.el
  5449 BACKUP/elisp/erlang.el
   892 BACKUP/elisp/fill-column-indicator.el
   344 BACKUP/elisp/keramida-erc.el
    87 BACKUP/elisp/keramida-chrome.el
    89 BACKUP/elisp/keramida-autoload.el
   141 BACKUP/elisp/keramida-ui.el
    42 BACKUP/elisp/keramida-slime.el
  1082 BACKUP/elisp/ace-jump-mode.el
     2 BACKUP/elisp/scala-mode2/scala-mode2-pkg.el
   907 BACKUP/elisp/scala-mode2/scala-mode2-indent.el
    26 BACKUP/elisp/scala-mode2/scala-mode2-lib.el
   502 BACKUP/elisp/scala-mode2/scala-mode2-fontlock.el
    37 BACKUP/elisp/scala-mode2/scala-mode2-map.el
   808 BACKUP/elisp/scala-mode2/scala-mode2-syntax.el
   111 BACKUP/elisp/scala-mode2/scala-mode2.el
   121 BACKUP/elisp/scala-mode2/scala-mode2-paragraph.el
  1103 BACKUP/elisp/php-mode.el
   142 BACKUP/elisp/themes/cobalt-theme.el
   665 BACKUP/elisp/themes/zenburn-theme.el
   142 BACKUP/elisp/themes/sublime-themes/cobalt-theme.el
    80 BACKUP/elisp/themes/tomorrow-night-blue-theme.el
    80 BACKUP/elisp/themes/tomorrow-night-eighties-theme.el
   115 BACKUP/elisp/themes/tomorrow-theme.el
    80 BACKUP/elisp/themes/tomorrow-night-bright-theme.el
   339 BACKUP/elisp/cmake-mode.el
    95 BACKUP/elisp/keramida-cc-extra.el
  1341 BACKUP/elisp/lua-mode.el
  2324 BACKUP/elisp/markdown-mode.el
   184 BACKUP/elisp/rcirc-notify.el
   167 BACKUP/elisp/keramida-defaults.el
   203 BACKUP/elisp/keramida-hooks.el
    43 BACKUP/elisp/keramida-lang.el
   435 BACKUP/elisp/edit-server.el
   709 BACKUP/elisp/slang-mode.el
    66 BACKUP/elisp/keramida-eshell.el
 19402 total

20,000 lines of code is far too much bloat. It’s obvious that this was getting out of hand, especially if you consider that I had full configuration files for at least two different IRC clients (rcirc and erc) in this ever growing blob of complexity.

What I did was make a backup copy of everything in ~/BACKUP and start over. This time I decided to go a different route from 2008 though. All my configuration lives in a single file, in ~/.emacs, and I threw away any library from my old ~/elisp tree which I haven’t actively used in the past few weeks. I imported the rest of them into the standard user-emacs-directory of modern Emacsen: at ~/.emacs.d/. I also started using eval-after-load pretty extensively, to speed up the startup of Emacs, and only configure extras after the related packages are loaded. This means I could trim down the list of preloaded packages even more.

The result, as I tweeted yesterday was an impressive speedup of the entire startup process of Emacs. Now it can start, load everything and print a message in approximately 0.028 seconds, which is more than 53 times faster than the ~1.5 seconds it required before the cleanup!

I suspected that the main contributor to this speedup was the increased use of eval-after-load forms, but what percentage of the entire file used them?

So I wrote a tiny bit of Emacs Lisp to count how many times each top-level forms appears in my new ~/.emacs file:

(defun file-forms-list (file-name)
  (let ((file-forms nil))
    ;; Keep reading Lisp expressions, until we hit EOF,
    ;; and just add one entry for each toplevel form
    ;; to `file-forms'.
    (condition-case err
        (with-temp-buffer
          (insert-file file-name)
          (goto-char (point-min))
          (while (< (point) (point-max))
            (let* ((expr (read (current-buffer)))
                   (form (first expr)))
              (setq file-forms (cons form file-forms)))))
      (end-of-file nil))
    (reverse file-forms)))

(defun file-forms-alist (file-name)
  (let ((forms-table (make-hash-table :test #'equal)))
    ;; Build a hash that maps form-name => count for all the
    ;; top-level forms of the `file-name' file.
    (dolist (form (file-forms-list file-name))
      (let ((form-name (format "%s" form)))
        (puthash form-name (1+ (gethash form-name forms-table 0))
                 forms-table)))
    ;; Convert the hash table to an alist of the form:
    ;;    ((form-name . count) (form-name-2 . count-2) ...)
    (let ((forms-alist nil))
      (maphash (lambda (form-name form-count)
                 (setq forms-alist (cons (cons form-name form-count)
                                         forms-alist)))
               forms-table)
      forms-alist)))

(progn
  (insert "\n")
  (insert (format "%7s %s\n" "COUNT" "FORM-NAME"))
  (let ((total-forms 0))
    (dolist (fc (sort (file-forms-alist "~/.emacs")
                      (lambda (left right)
                        (> (cdr left) (cdr right)))))
      (insert (format "%7d %s\n" (cdr fc) (car fc)))
      (setq total-forms (+ total-forms (cdr fc))))
    (insert (format "%7d %s\n" total-forms "TOTAL"))))

Evaluating this in a scratch buffer shows output like this:

COUNT FORM-NAME
   32 setq-default
   24 eval-after-load
   14 set-face-attribute
   14 global-set-key
    5 autoload
    4 require
    4 setq
    4 put
    3 defun
    2 when
    1 add-hook
    1 let
    1 set-display-table-slot
    1 fset
    1 tool-bar-mode
    1 scroll-bar-mode
    1 menu-bar-mode
    1 ido-mode
    1 global-hl-line-mode
    1 show-paren-mode
    1 iswitchb-mode
    1 global-font-lock-mode
    1 cua-mode
    1 column-number-mode
    1 add-to-list
    1 prefer-coding-system
  122 TOTAL

This showed that I’m still using a lot of setq-default forms: 26.23% of the top-level forms are of this type. Some of these may still be candidates for lazy initialization, since I can see that many of them are indeed mode-specific, like these two:

(setq-default diff-switches "-u")
(setq-default ps-font-size '(8 . 10))

But eval-after-load is a close second, with 19.67% of all the top-level forms. That seems to agree with the original idea of speeding up the startup of everything by delaying package-loading and configuration until it’s actually needed.

10 of the remaining forms are one-off mode setting calls, like (tool-bar-mode -1), so 8.2% of the total calls is probably going to stay this way for a long time. That’s probably ok though, since the list includes several features I find really useful, very very often.

Saving Space in Rsnapshot Archives with Hardlinks

The hardlink package is quite handy on Linux systems. It appears to do a nice job saving disk space in my local rsnapshot archives (25 GB reported as “saved” after today’s run on a week of daily snapshots). The stress put on my external USB backup disk wasn’t too extreme either, as reported by Munin’s IO/sec graphs for /dev/sdc:

IO/sec graph for a USB disk, while hardlink was running on a collection of rsnapshot daily archives.

IO/sec graph for a USB disk, while hardlink was running on a collection of rsnapshot daily archives.

That’s a lot of read operations, and then a percentage of them converted back to writes, when hardlink(1) discovers duplicate files and replaces them with hard links. Which is exactly the sort of behavior we’d expect from this sort of thing.

The hardlink invocation that triggered these I/O operation was quite mundane:

# time hardlink -f -m -v *home.*
[output stats ellided]
        1380.243 real   17.525 user     47.975 sys

Having run for a little over 23m it managed to hardlink enough files in my daily laptop backup snapshots to save 25 GB of disk space. I think it was worth the time vs. space trade-off :-)