Emacs Lisp lexical binding gotchas and related best practices

Part of the series Living with Emacs Lisp

1. motivation

Let’s say Alice used to use dynamic scope for her Emacs Lisp code, and she found that dynamic scope comes with serious gotchas (for example, you cannot use closures), so she decided to switch to lexical scope (better known as lexical binding in the Emacs manuals). Now she wants to know what the gotchas of lexical scope are and how to avoid them.

This article is long and the sections after - - - can be skipped.

2. non-local variables and closures

Before diving into the gotchas of lexical binding in Emacs Lisp, let’s go through an example that requires lexical binding. The goal is to explain what non-local variables and closures are. Those who know what they are should skip to the next section. For how to make an Emacs Lisp file use lexical scope, see how to enable Unicode, lexical binding, CL functions in an elisp file.

Example: suppose Alice wrote a lexically scoped Emacs Lisp file alice-test.el with these contents:

;;; -*- lexical-binding: t; -*-

(require 'bob-functions) ; for bob-repeat

(defun alice-insert-stuff ()
  (interactive)
  ;; inserts "1111111111\n2222222222\n3333333333" to current buffer
  (dolist (i (list "1" "2" "3"))
    (bob-repeat 10
               (lambda ()
                 (insert i)))
    (insert "\n")))

That relies on Bob’s library bob-functions.el with these contents:

(defun bob-repeat (n func)
  "Calls FUNC repeatedly, N times."
  (dotimes (i n)
    (funcall func)))

(provide 'bob-functions)

The anonymous function that Alice wrote uses i. That i is a non-local variable (also called a free variable) with respect to the anonymous function, i.e., i is non-local to the anonymous function, but it is still a local variable in alice-insert-stuff.

In the following code, a1, a2, a3 are non-local to the inner anonymous function, while b1, b2, b3 are local variables established within the anonymous function.

(defun my-hello (a1 a2)
  ...
  (dolist (a3 (list 10 10))
    ...
    (lambda (b1 b2)
      (print b1)
      (dolist (b3 (list 10 10 10))
        (print (list b1 b2 b3))
        (print (list a1 a2 a3))))
    ...
    ...)
  ...
  ...)

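The use of i in alice-insert-stuff and the use of the same name i in bob-repeat are independent as long as alice-test.el is lexically scoped. If Alice removed the first line of alice-test.el to make it dynamically scoped instead, the function alice-insert-stuff would not behave the way Alice intended: while her anonymous function runs inside bob-repeat, i would refer to the loop counter that dotimes binds there (bob-functions.el as shown is dynamically scoped too), not to her string.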

Evaluating the lambda form results in a function object. A function object that keeps local bindings of non-local variables is called a closure. Only lexically scoped code can generate closures.
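
To make the definition concrete, here is a minimal sketch of a closure keeping a binding alive. The name my-make-counter is hypothetical and not part of Alice’s or Bob’s code, and the sketch assumes it is loaded from a file whose first line enables lexical binding.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my-make-counter ()
  "Return a closure that counts how many times it has been called."
  (let ((count 0))
    ;; COUNT is non-local to the lambda; the closure keeps this binding.
    (lambda ()
      (setq count (1+ count)))))

(let ((a (my-make-counter))
      (b (my-make-counter)))
  (list (funcall a)    ; 1
        (funcall a)    ; 2
        (funcall b)))  ; 1, because B closed over its own binding of COUNT
```

Each call to my-make-counter enters the let form anew, so each closure gets a separate binding of count.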

3. gotchas and best practices

3.1. code as data

3.1.1. quoted lisp expression is disconnected

(let ((bark 1010))
  (print (symbol-value 'bark)))

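That code, if in lexical scope, will not print 1010: symbol-value looks only at the symbol’s global (dynamic) value, never at a lexical binding, so it signals a void-variable error unless a global variable named bark happens to exist. Maybe the following code is less mysterious: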

(let (bark
      expression)
  (setq bark 999)
  (setq expression '(1+ bark))
  (eval expression t))

That code too will fail. The reason eval and symbol-value don’t play nice with lexical scope is that quoted things like

'bark

or

'(1+ bark)

are disconnected from their surrounding code. The variable bark is non-local to the quoted expression (1+ bark). Referring to non-local variables from quoted expressions should be discouraged, except when they are global special variables. The same discouragement applies to add-to-list.

The following is fine because load-path is a global special variable.

(add-to-list 'load-path "~/.emacs.d/lisp/")

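The following is bad because animals is a local variable. add-to-list is an ordinary function that receives only the symbol animals, so in lexical scope it cannot reach the local binding.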

(let ((animals nil))
  (add-to-list 'animals "cat")
  (add-to-list 'animals "dog")
  (add-to-list 'animals "dog")
  animals)

The following is the right way. This one is fine because animals is not quoted. cl-adjoin is just a function that takes a list and returns a list.

(let ((animals nil))
  (setq animals (cl-adjoin "cat" animals :test #'equal))
  (setq animals (cl-adjoin "dog" animals :test #'equal))
  (setq animals (cl-adjoin "dog" animals :test #'equal))
  animals)

The following achieves the same. cl-pushnew is a Lisp macro that in this case simply expands to something like above.

(let ((animals nil))
  (cl-pushnew "cat" animals :test #'equal)
  (cl-pushnew "dog" animals :test #'equal)
  (cl-pushnew "dog" animals :test #'equal)
  animals)

To add strings to animals unconditionally, see double pointers and Lisp lists.

Best practices:

  • instead of defining or using a function that takes a quoted variable as argument like in (my-blah 'x), define a function which you can use with the pattern (setq x (my-blah-blah x)).
  • if (setq x (my-blah-blah x)) is too long to type, define a Lisp macro using cl-callf or cl-callf2.
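
The second bullet can be sketched as follows. The names my-adjoin-string and my-adjoinf are hypothetical, chosen only for illustration; the sketch assumes cl-lib is available and that the file enables lexical binding.

```elisp
;;; -*- lexical-binding: t; -*-

(require 'cl-lib)

;; Hypothetical helper: takes a list and returns a list.
;; It never needs to know which variable holds the list.
(defun my-adjoin-string (list string)
  "Return LIST with STRING added, unless an equal string is already there."
  (cl-adjoin string list :test #'equal))

;; Hypothetical shorthand for (setq PLACE (my-adjoin-string PLACE STRING)).
;; cl-callf expands to exactly that kind of assignment.
(defmacro my-adjoinf (place string)
  "Set PLACE to the result of adding STRING to the list stored in PLACE."
  `(cl-callf my-adjoin-string ,place ,string))

(let ((animals nil))
  (my-adjoinf animals "cat")
  (my-adjoinf animals "dog")
  (my-adjoinf animals "dog")
  animals) ; ("dog" "cat")
```

Because my-adjoinf is a macro, animals appears unquoted in the expansion, so the local binding is the one that gets updated.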

Let’s get back to:

(let (bark
      expression)
  (setq bark 999)
  (setq expression '(1+ bark))
  (eval expression t))

What are workarounds for that kind of code? One workaround is to use a closure instead of a quoted expression. You can create closures, pass them around, and call them any time. Another way is to make the variable in question a global special variable, which may not be applicable in some cases. Sometimes backquote syntax can be a workaround, for example:

;; good.
(defun my-nah-1 (str)
  (rx-to-string `(+ ,(substring str 1))))
(my-nah-1 "@abc")

;; bad.
(defun my-nah-2 (str)
  (rx-to-string '(+ (eval (substring str 1)))))
(my-nah-2 "@abc")

One might say “isn’t backquote sort of like quote too? why is backquote in my-nah-1 OK while quote in my-nah-2 is not OK?” Use of str in the backquote syntax in my-nah-1 is not inside any quoted part. Remember that a comma unquotes an expression within a backquote form.

Best practices:

  • avoid eval if possible
  • avoid using quoted expressions with intention to refer to outside local variables
  • pass closures around instead of (quoted) expressions
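
As a sketch of the last bullet, the failing bark example can be rewritten with a closure in place of the quoted expression. The function name my-bark-plus-one is hypothetical, and lexical binding is assumed to be enabled for the file.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my-bark-plus-one ()
  "Compute (1+ bark) through a closure instead of a quoted expression."
  (let (bark thunk)
    (setq bark 999)
    ;; The lambda is code, not quoted data, so it stays connected to
    ;; the local binding of BARK.
    (setq thunk (lambda () (1+ bark)))
    ;; THUNK can be stored, passed around, and called at any later time.
    (funcall thunk)))

(my-bark-plus-one) ; 1000
```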

3.1.2. is quoted lisp expression lexically scoped?

Just because you wrote a quoted expression in a lexically scoped file doesn’t mean that the expression is going to run as lexically scoped code.

;;; -*- lexical-binding: t; -*-

(require 'bob-functions)

(defun my-do-something-after-3-seconds ()
  (bob-delayed-eval
   3
   '(progn
      (blah)
      (blah)
      ...)))

A progn form is quoted and then passed to a utility function provided by the bob-functions package. Whether the progn form will be evaluated in lexical scope depends on how bob-delayed-eval handles it. For details on how it depends, see how Emacs determines lexical binding on a variable.

When Emacs is evaluating the file containing the above code, Emacs doesn’t make a promise to itself that it will evaluate the progn form in lexical scope. That is because Emacs thinks that the progn form is not even code. It is data, not code. Emacs thinks this is just like the following code, where even humans would agree with Emacs that (1 2 3 4) is just data:

(dolist (i '(1 2 3 4))
  (print i))

What can Bob do? He can provide bob-delayed-call instead which takes a function as argument. Then the user of bob-delayed-call can use it like this without fear of losing lexical scope:

(bob-delayed-call
 3
 (lambda ()
   (blah)
   (blah)
   ...))

Bob can even provide a Lisp macro bob-delayed which is a simple wrapper around bob-delayed-call.

(bob-delayed 3
  (blah)
  (blah)
  ...)
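
The article does not show how Bob would implement these. The following is only a minimal sketch of one possibility, assuming Bob builds on the standard run-with-timer function; the bodies of bob-delayed-call and bob-delayed here are guesses, not Bob’s actual library.

```elisp
;;; -*- lexical-binding: t; -*-

;; Assumed implementation: schedule FUNC once, SECS seconds from now.
(defun bob-delayed-call (secs func)
  "Call FUNC with no arguments after SECS seconds.  Return the timer."
  (run-with-timer secs nil func))

;; Assumed implementation: wrap BODY in a lambda at the call site, so
;; BODY is compiled and scoped as part of the caller's file.
(defmacro bob-delayed (secs &rest body)
  "Evaluate BODY after SECS seconds."
  (declare (indent 1))
  `(bob-delayed-call ,secs (lambda () ,@body)))
```

Since the macro expands in the user’s file, the lambda it produces is lexically scoped whenever that file is, regardless of how bob-functions.el itself is scoped.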

3.2. invasion of special variable

3.3. - - -

3.4. mixing iteration, closures, and asynchronous programming together

This section is probably only useful for JavaScript programmers and those who want to write code with lots of asynchronous calls. Others should skip this section.

There can be trouble when the following conditions are met:

  • you produce anonymous functions in a loop
  • the functions are called after the loop finishes
  • the functions use non-local variables that are loop variables, i.e., the functions close over loop variables.

This code produces two buttons with labels egg and chicken.

(cl-loop for word in (list "egg" "chicken")
         do
         (insert-button word
                        'action
                        (lambda (arg)
                          (message "la %s" word)))
         (insert " "))

Clicking on the egg button results in the message “la chicken”, not “la egg”. That is because cl-loop is implemented in a way that the loop establishes a local binding of word just once. It makes just one binding, not several. What I mean is that the loop does something like the following:

;; Example P

(let (word)
  (setq word "egg")
  (insert-button word
                 'action
                 (lambda (arg)
                   (message "la %s" word)))
  (insert " ")

  (setq word "chicken")
  (insert-button word
                 'action
                 (lambda (arg)
                   (message "la %s" word)))
  (insert " "))

Like that, just one binding of word is established. By the time a button is clicked, word in that one binding refers to “chicken”, so both buttons echo “la chicken”. On the other hand, the following code establishes two bindings of word:

;;; Example Q

(let ((word "egg"))
  (insert-button word
                 'action
                 (lambda (arg)
                   (message "la %s" word)))
  (insert " "))

(let ((word "chicken"))
  (insert-button word
                 'action
                 (lambda (arg)
                   (message "la %s" word)))
  (insert " "))

With the above code, the egg button will echo “la egg”, and the chicken button will echo “la chicken”. In this case, by the time any button is clicked, there are two bindings of word, and the name word refers to a different string in each binding. A single name, in this case word, referring to different things in different bindings at the same time is not some unique strange side of Emacs Lisp. If you have written recursive functions or worked with threads or closures in other languages, you are already familiar with the phenomenon of one name referring to many things at the same time.

Non-programmers are familiar with this phenomenon as well: the name Bob refers to different persons in different bindings. The name “President” refers to different persons in different bindings. As of this writing (2013), the name President refers to Barack Obama in the American binding, while the same name refers to Park Geun-Hye in the South Korean binding. Let me relate this analogy to Example P and Example Q above.

Example P is like this: an American was born when George Bush was the President of the United States, and then another American was born while Barack Obama is the President. Now if we ask them “who’s president now?”, they will both answer Barack Obama. The two American babies are sharing one binding of President.

Example Q is like this: a Korean was born while Park is the President of South Korea, and then an American was born while Obama is the US president. Now if we ask them “who’s president now?”, one will answer “Park”, and another “Obama”. The two babies are using different bindings of President.

With the following code, the two buttons echo different messages as intended. The difference from the first cl-loop example is that the non-local variable, which is msg this time, is not a loop variable, and the loop establishes two bindings of msg simply because it enters the let form twice.

(cl-loop for word in (list "egg" "chicken")
         do
         (let ((msg (format "la %s" word)))
           (insert-button word
                          'action
                          (lambda (arg)
                            (message "%s" msg)))
           (insert " ")))

The following code works as intended too; it is simply a variation of the above:

(cl-loop for word in (list "egg" "chicken")
         do
         (let ((word word))
           (insert-button word
                          'action
                          (lambda (arg)
                            (message "la %s" word)))
           (insert " ")))

That code actually creates three bindings of word: one outer binding and two inner bindings. The two buttons use two inner bindings.
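
The same rebinding idiom can be checked without buttons or a buffer. The following is a minimal sketch, assuming lexical binding and cl-lib; the function name my-collect-word-thunks is hypothetical. It collects one closure per word inside the loop and calls them only after the loop has finished.

```elisp
;;; -*- lexical-binding: t; -*-

(require 'cl-lib)

(defun my-collect-word-thunks (words)
  "Return a list of closures, one per element of WORDS."
  (cl-loop for word in words
           collect
           ;; Rebinding WORD gives every closure its own inner binding,
           ;; independent of the loop variable.
           (let ((word word))
             (lambda () (format "la %s" word)))))

;; The closures are called after the loop has finished.
(mapcar #'funcall (my-collect-word-thunks (list "egg" "chicken")))
;; ("la egg" "la chicken")
```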

That was the most general workaround. There are other workarounds like:

  • use mapc, which takes a function, instead of a loop macro
  • use another loop macro that creates several bindings of loop variable (and is documented so)
  • attach information (such as word or msg) to buttons and let the button functions extract the information
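
Two of these workarounds can be sketched as follows, assuming lexical binding. All function names and the button property my-word are hypothetical, picked only for this illustration.

```elisp
;;; -*- lexical-binding: t; -*-

;; Workaround with mapc: the outer lambda is called once per element,
;; and every call binds WORD afresh, so each button's action closes
;; over its own binding.
(defun my-insert-word-buttons-mapc (words)
  "Insert one button per element of WORDS at point."
  (mapc (lambda (word)
          (insert-button word
                         'action
                         (lambda (_button)
                           (message "la %s" word)))
          (insert " "))
        words))

;; Workaround with button properties: no closure is needed, because the
;; word is stored on the button under the hypothetical property my-word.
(defun my-word-button-action (button)
  "Echo the word stored on BUTTON."
  (message "la %s" (button-get button 'my-word)))

(defun my-insert-word-buttons-prop (words)
  "Insert one button per element of WORDS at point."
  (dolist (word words)
    (insert-button word
                   'action #'my-word-button-action
                   'my-word word)
    (insert " ")))
```

Calling (my-insert-word-buttons-mapc (list "egg" "chicken")) in a scratch buffer and clicking each button is a quick way to compare with the first cl-loop example.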

4. testing tools

Here are two Lisp macros I used for testing things out while I was preparing this article.

4.1. my-lexical-bound-p

(defmacro my-lexical-bound-p (var)
  "Returns t if VAR is going to work as a lexically bound variable. Nil otherwise."
  `(let ((,var nil)
         (f (let ((,var t)) (lambda () ,var))))
     (funcall f)))

You can use the macro like this:

(my-lexical-bound-p sodjosijf)
(my-lexical-bound-p case-fold-search)

Run that in lexical scope and then in dynamic scope. You can also use it to test interaction between eval-after-load and lexical scope:

(provide 'my-10)
(eval-after-load 'my-10
  '(print
    (list
     'my-10
     (my-lexical-bound-p osijosf))))
(eval-after-load 'my-20
  '(print
    (list
     'my-20
     (my-lexical-bound-p osijosf))))
(provide 'my-20)

4.2. my-print-safe

(defmacro my-print-safe (object)
  "Prints OBJECT. If error, prints error."
  `(progn
     (condition-case err
         (progn
           (print ,object))
       (error
        (terpri)
        (princ (format "Eval of %S resulted in error %S"
                       ',object err))
        (terpri)))))

Using the macro, I ran the following code in lexical scope and in dynamic scope to check my assumptions.

(let ((bark 0)
      (f (let ((bark 100))
           (lambda ()
             (my-print-safe bark) ; 100 in lexical scope, otherwise 0.
             (my-print-safe
              `(:ha ,bark
                    ,(1+ bark)
                    ,(list bark `(,bark ,bark))))
             (my-print-safe (symbol-value 'bark))
             (my-print-safe (eval '(1+ bark) lexical-binding))
             (my-print-safe `,bark)))))
  (funcall f)
  t)