Table of Contents
- Invasion of Body Snatchers
- yellow naming convention for special variables
- green naming convention for (lexical) local variables
- what about casual global variables?
- - -
- does byte compilation eliminate collision?
- detailed examples of collision
- Common Lisp note
This post is part of Living with Emacs Lisp. This post is very long, and you probably don’t need to read the sections after the - - - part.
1. Invasion of Body Snatchers
Invasion of special variables is something that might happen when the following two conditions are met:
- an emacs lisp file uses defvar or defconst to declare a global special variable
- another emacs lisp file happens to use a (lexically scoped) local variable of the same name as the global special variable from the other emacs lisp file.
What happens is that the behavior of the latter file’s code may become unreliable, although not always. It is as if the latter file is being invaded by the global special variable. The other way around is invasion of local variables.
This happens because of a property of defvar that Erann Gat described as “pervasiveness of defvar”.[1] The code (defvar abc nil) declares the name abc special, which makes abc dynamically bound rather than lexically bound in all emacs lisp files, including files which just happen to name a local variable abc, even the files that are loaded before execution of (defvar abc nil), even the ones that are written by other teams, and sometimes even the ones that are byte-compiled. (In Emacs Lisp, a defvar without an initial value, as in (defvar abc), only declares the name special within the current file or scope.)
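The pervasiveness is easy to observe with interpreted code. The following is only a minimal sketch: every `demo-` name is hypothetical, and the function is defined through `eval` with a non-nil second argument purely so that it is evaluated with lexical binding no matter where the snippet is run.

```elisp
;; Hypothetical helper: reports the dynamic (global) value of `demo-abc',
;; or the symbol `unbound' if there is none.
(defun demo-read-abc ()
  (if (boundp 'demo-abc) (symbol-value 'demo-abc) 'unbound))

;; Defined with lexical binding, BEFORE `demo-abc' is declared special.
(eval '(defun demo-bind-abc ()
         (let ((demo-abc "local"))
           (ignore demo-abc)
           (demo-read-abc)))
      t)

;; While `demo-abc' is an ordinary name, the `let' binding is lexical
;; and therefore invisible to `demo-read-abc'.
(defvar demo-result-before (demo-bind-abc))

;; Some other file declares the same name special.
(defvar demo-abc "global")

;; The very same function now binds `demo-abc' dynamically, so the
;; "local" value leaks into `demo-read-abc'.
(defvar demo-result-after (demo-bind-abc))
```

Here `demo-result-before` ends up as the symbol `unbound` and `demo-result-after` as the string "local", even though `demo-bind-abc` itself was never redefined.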
For example, one day, Alice installs emacs-html-server.el for running an html server on Emacs. (Packages mentioned in examples here are all made up and hypothetical unless stated otherwise.) Let’s suppose emacs-html-server.el defines the command ehs-start-server which happens to use a local variable named python-path. Then another day, Alice installs python-mode.el. Let’s suppose python-mode.el contains this line:
(defvar python-path "python" "Path to Python executable.")
After she installs it, she finds that the ehs-start-server command breaks as soon as she opens a python mode buffer.
Another example. One day, Alice installs print-to-pdf.el for printing buffers to pdf files. Let’s suppose print-to-pdf.el defines the command ptp-print-and-open which has an optional parameter print-newline. Then another day, a new version of Emacs is released. Let’s suppose this version happens to introduce print-newline as a global special variable that controls the behavior of some built-in function. After upgrading, Alice finds that ptp-print-and-open is not working as before.
We will come to more detailed examples later to see the precise conditions in which invasion causes serious collision. There are measures you can take to reduce collision.
First, let me tell you a story about Bob. Bob was making something for dinner. The dinner was rice burgers with chicken, tomatoes, lettuce, etc. His cat was observing his cooking and found something funny about the way Bob was cooking: Bob was using two cutting boards. He was using a yellow board for cutting raw chicken, and a green board for cutting tomatoes and lettuce. His cat asked “why two cutting boards?” He answered “Using two boards is a good habit to have because it reduces the chance of cross contamination”.[2]
Yellow board for (global) special variables. Green board for (lexically scoped) local variables.
2. yellow naming convention for special variables
Make sure that special variables have names with a hyphen in them. For example, if you use defvar to create a global variable in your emacs init file, you should name the variable with a common prefix (with a hyphen) such as jh- if jh are your initials.
;; in emacs init file
(defvar my-abc nil "blah.")
(defvar my-xyz nil "blah.")
(defconst my-in-ms-windows (eq system-type 'windows-nt)
  "Non-nil if this is on MS Windows.")
If you are a package author, you would have a common prefix for the names of all special variables and functions in your package. Even special variables for internal use should be named with that common prefix. For example, python.el uses python- as its common prefix.
;; in python.el
(defvar python-mode-map ...)
(defcustom python-indent-offset ...)
(defvar python--timer ...) ; for internal use
3. green naming convention for (lexical) local variables
Choose one of the following three rules to subscribe to:
Rule L1: In any emacs lisp file with lexical-binding set to t, all variables should be declared with hyphenless names, except when defining global special variables. For example, names such as path are OK, but names like python-path are not.
Rule L2: In any emacs lisp file with lexical-binding set to t, all variables mentioned within an anonymous function body (or a local function body if you use cl-flet), except for global special variables defined in that file or in other required files, should have hyphenless names. In other words, unlike Rule L1, you can declare a (lexical) local variable with a hyphenated name, as long as you don’t use that variable within the body of an anonymous function.
Rule L3: In any emacs lisp file with lexical-binding set to t, all non-local variables mentioned within an anonymous function body or a local function body, except for global special variables, should have hyphenless names.
Rule L1 is the least permissive of names with hyphens, and Rule L3 the most permissive. In that regard, Rule L3 is best. Sticking to Rule L1 requires the least amount of brain power, and Rule L3 the most. In that regard, Rule L1 is best. I choose L2.
Whichever rule you choose, don’t forget that function parameters are local variables too and therefore you must follow that rule when you are choosing names for function parameters.
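To make the difference between the three rules concrete, here is a small sketch. The function names are made up for illustration, and the file is assumed to have lexical-binding set to t; the comments say which rules each definition would satisfy as I read them.

```elisp
;; Assume this file has lexical-binding set to t.

;; Violates L1 (hyphenated parameter and local variable), but satisfies
;; L2 and L3 because no anonymous function mentions those names.
(defun demo-rules-sum (number-list)
  (let ((running-total 0))
    (while number-list
      (setq running-total (+ running-total (car number-list)))
      (setq number-list (cdr number-list)))
    running-total))

;; Violates L1 and L2 (a hyphenated name is mentioned inside a lambda),
;; but satisfies L3 because `each-number' is local to that lambda.
(defun demo-rules-double-all (numbers)
  (mapcar (lambda (each-number) (* 2 each-number)) numbers))

;; Violates all three rules: the lambda mentions `scale-factor', a
;; hyphenated variable that is nonlocal to the lambda.
(defun demo-rules-scale-all (numbers scale-factor)
  (mapcar (lambda (n) (* scale-factor n)) numbers))

;; Satisfies all three rules: every local name is hyphenless.
(defun demo-rules-scale-plain (numbers factor)
  (mapcar (lambda (n) (* factor n)) numbers))
```

All four functions behave the same as long as nobody declares `number-list`, `running-total`, `each-number` or `scale-factor` special; the rules only differ in how much of that risk they remove.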
There are some built-in special variables with hyphenless names; you should avoid those names as well.
(Update: Also, while any of these rules is enough to defend against invasion of special variables, none of them is enough to prevent invasion of local variables. See Invasion of local variables in Emacs Lisp. One of the approaches from that article can be named Rule L0.
Rule L0: In any emacs lisp file, all local variables should be declared with hyphenless names.)
If you are going with Rule L2 or L3, you should know that it is very easy to end up introducing anonymous functions, sometimes without noticing it. For example, you are very likely to use anonymous functions if you use mapcar a lot. You might also be using some fancy looping macro from some library which is implemented using closures; that is, the macro might be writing anonymous functions for you without you noticing. Macros such as with-process-shell-command from Nic Ferrier, for example, also make a closure from the code you pass.
4. what about casual global variables?
Many articles on Emacs Lisp use setq (rather than defvar) to introduce global variables with simple names like y in short code examples. No hyphens there. For example, the following code is from an official article on marks and uses setq to create a global variable:
(setq m (mark-marker))
(set-marker m 100)
(mark-marker)
The global variable m is a casual global variable: it is only being used for tutorial purposes or for testing things out; it is not being used as part of code for packages or your init file.
Global variables created by setq are not special variables and therefore do not cause invasion of special variables (true as of Emacs 24). This means that the yellow naming convention does not apply to globals from setq. But this behavior of setq is never explicitly mentioned in the manuals, as far as I know. So it is possible that future emacs may change the behavior of setq so that it creates special variables. In that case, you should start naming all global variables (whether created by setq or defvar) with hyphens in names.
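The difference between the two kinds of globals can be checked with `special-variable-p`. This is a minimal sketch with hypothetical names (`democasual`, `demo-casual-defvar`, and so on); the `let` form goes through `eval` with a non-nil second argument only to make sure it is evaluated with lexical binding.

```elisp
;; A casual global created by setq, and a global created by defvar.
(setq democasual "global")
(defvar demo-casual-defvar "global")

;; Hypothetical helper that reads the global value of `democasual'.
(defun demo-casual-value ()
  (symbol-value 'democasual))

;; Only the defvar'ed name is declared special.
(defvar demo-setq-special (special-variable-p 'democasual))
(defvar demo-defvar-special (special-variable-p 'demo-casual-defvar))

;; A lexical `let' of the setq'ed name stays lexical, so the helper
;; still sees the global value.
(defvar demo-casual-seen
  (eval '(let ((democasual "local"))
           (ignore democasual)
           (demo-casual-value))
        t))
```

After running this, `demo-setq-special` is nil, `demo-defvar-special` is non-nil, and `demo-casual-seen` is "global".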
- - -
6. does byte compilation eliminate collision?
Byte compilation does somewhat help reduce the chance of special variable invasion, as it tends to remove mentions of local variable names, but one should still rely on variable naming conventions.
We cannot rely on byte compilation alone because, for example, even if the author of emacs-html-server.el only distributes byte compiled files, he still needs to run emacs-html-server.el without byte compilation during his interactive development of the package. Also, when Alice installs python-mode.el from a package archive, and then opens a python mode buffer, and then installs emacs-html-server.el from a package archive and byte compiles it, she may end up with byte compiled code for ehs-start-server which still mentions the name python-path.
Also, what if using defadvice or debugging something somehow uncompiles some compiled functions? Does that happen? I don’t know.
Another thing is, people rarely compile their emacs init files.
Yet another thing, I am not sure that the elimination of local variable names is a documented behavior of byte compilation.
7. detailed examples of collision
All emacs lisp file names or all emacs lisp package names in these examples are hypothetical unless stated otherwise.
7.1. passing an asynchronous callback function (which involves a nonlocal variable)
Suppose Alice is a user of example.el and some.el. example.el has a function that makes xhr requests and gets responses using the xhr-get function which is defined in xhr.el:
;;; example.el --- example stuff -*- lexical-binding: t -*-
(require 'xhr)
...
(defun example-something ()
  (dolist (some-query (list "x" "y" "z"))
    (xhr-get (format "http://www.example.com/%s" some-query)
             (lambda (response status)
               (message "%s => %s" some-query response)))))
...
some-query is the name of a variable which is a nonlocal variable to the anonymous callback function. This violates Rule L3.
And here are the contents of some.el:
;;; some.el --- some stuff
...
(defvar some-query "query" "some.el query command")
(defvar some-version 1.2 "some.el version")
...
Notice that this file declares some-query special. When example.el and some.el are loaded within the same Emacs session, the function example-something will not work as intended because the name some-query is resolved by dynamic binding even within example.el and therefore won’t refer to intended values like “x”, “y” or “z” at the time the anonymous callback function is called, which is after the dolist form has finished executing.
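The effect can be reproduced without any network code. The sketch below is a simulation under stated assumptions: `demo-xhr-get` and `demo-xhr-flush` are hypothetical stand-ins for xhr.el that queue callbacks and run them later, and the lexically scoped function is defined through `eval` with a non-nil second argument so the snippet behaves the same wherever it is run. It uses `let` inside `mapc` instead of dolist, only to keep the binding form explicit.

```elisp
;; Hypothetical stand-in for xhr.el: requests are queued and their
;; callbacks run later, after the caller has returned.
(defvar demo-xhr-queue nil "Pending (URL . CALLBACK) pairs, newest first.")
(defvar demo-xhr-log nil "Messages produced by callbacks, newest first.")

(defun demo-xhr-get (url callback)
  (push (cons url callback) demo-xhr-queue))

(defun demo-xhr-flush ()
  "Run all queued callbacks and return the logged messages in order."
  (setq demo-xhr-log nil)
  (let ((pending (reverse demo-xhr-queue)))
    (setq demo-xhr-queue nil)
    (while pending
      (funcall (cdr (car pending))
               (concat "reply to " (car (car pending)))
               200)
      (setq pending (cdr pending))))
  (reverse demo-xhr-log))

;; Plays the role of example.el (lexical binding).
(eval '(defun demo-example-something ()
         (mapc (lambda (q)
                 (let ((demo-some-query q))
                   (demo-xhr-get
                    (concat "/" demo-some-query)
                    (lambda (response _status)
                      (push (format "%s => %s" demo-some-query response)
                            demo-xhr-log)))))
               (list "x" "y")))
      t)

;; Before the collision: each callback remembers its own query.
(demo-example-something)
(defvar demo-xhr-before (demo-xhr-flush))

;; Plays the role of some.el declaring the same name special.
(defvar demo-some-query "query")

;; After the collision: callbacks see the global value instead.
(demo-example-something)
(defvar demo-xhr-after (demo-xhr-flush))
```

`demo-xhr-before` is ("x => reply to /x" "y => reply to /y"), while `demo-xhr-after` is ("query => reply to /x" "query => reply to /y"): the URL is still built correctly inside the `let`, but the callback runs after the dynamic binding is gone.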
7.2. passing an asynchronous callback code
This example is a slight modification of the previous example. Suppose that xhr.el provides a Lisp macro xhr-with-get which is just like xhr-get except it takes callback code instead of a callback function, and that the macro is implemented using xhr-get in the obvious way. Suppose example.el defines example-something-2 which uses that macro:
;;; example.el --- example stuff -*- lexical-binding: t -*-
(require 'xhr)
...
(defun example-something-2 ()
  (dolist (some-query (list "x" "y" "z"))
    (xhr-with-get (format "http://www.example.com/%s" some-query)
        (response status)
      (message "%s => %s" some-query response))))
...
That still sort of violates Rule L3. When example.el and some.el are loaded within the same Emacs session, the function example-something-2 will not work as intended.
xhr-with-get is written like this:
(defmacro xhr-with-get (url vars &rest body)
  "Note: use this macro only in lexical bound Emacs Lisp files"
  (declare (indent 2))
  `(xhr-get ,url (lambda ,vars ,@body)))
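One way to notice that a macro is writing an anonymous function for you is to look at its expansion. A minimal sketch, using a hypothetical copy of the macro named `demo-xhr-with-get` that expands into a call to an equally hypothetical `demo-xhr-get` (which does not need to be defined just to inspect the expansion):

```elisp
;; Hypothetical copy of the macro described in the text.
(defmacro demo-xhr-with-get (url vars &rest body)
  "Expand into a call to `demo-xhr-get' with BODY wrapped in a lambda."
  (declare (indent 2))
  `(demo-xhr-get ,url (lambda ,vars ,@body)))

;; Expand a sample call without evaluating it.
(defvar demo-xhr-expansion
  (macroexpand
   '(demo-xhr-with-get "http://www.example.com/x" (response status)
      (message "%s" response))))

;; demo-xhr-expansion is now the list
;; (demo-xhr-get "http://www.example.com/x"
;;               (lambda (response status) (message "%s" response)))
```

The body that looked like plain code at the call site ends up inside a lambda, so any nonlocal variable it mentions is subject to Rule L2 or L3.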
7.3. passing a synchronous callback function
Suppose Alice is a user of hello.el which defines the hello-insert-stuff command which in turn relies on the fp-repeat function defined in fp.el (which we suppose is a library providing lots of functions for functional programming written by Bob), and suppose fp-repeat is for repeatedly calling a function many times.
;;; hello.el --- hello stuff -*- lexical-binding: t; -*-
(require 'fp)
(defun hello-insert-stuff ()
  (interactive)
  ;; inserts "1111111111\n2222222222\n3333333333" to current buffer
  (dolist (i (list "1" "2" "3"))
    (fp-repeat 10 (lambda () (insert i)))
    (insert "\n")))
...
fp-repeat is implemented like this:
;;; fp.el --- fp stuff -*- lexical-binding: t; -*-
...
(defun fp-repeat (n func)
  "Calls FUNC repeatedly, N times."
  (dotimes (i n)
    (funcall func)))
...
Now suppose Alice has a habit of using defvar with hyphenless names for her casual global variables, thereby violating the yellow naming convention for special variables. Today Alice got curious about arithmetic operators and ran the following code in the scratch buffer to see how they work with more than two arguments.
(defvar i 100)
(defvar j 200)
(defvar k 300)
(print (+ i j k))
(print (* i j k))
(print (- i j k))
(print (/ i j k))
Running that code has a nasty side effect of breaking the intended behavior of hello-insert-stuff. That is because it declares i special, and when Alice runs hello-insert-stuff later, it will insert[3] “0123456789\n0123456789\n0123456789” rather than the intended “1111111111\n2222222222\n3333333333”, because the dynamic binding of i established from within the fp-repeat calls will shadow other bindings.
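The shadowing can be reproduced in a small self-contained sketch. All names here are hypothetical stand-ins: `demo-fp-repeat` plays fp-repeat, `demo-hello-stuff` plays hello-insert-stuff, and the hyphenless name `demoidx` plays the role of i so that running the sketch does not make a common name like i special in your session. Values are collected in a list instead of being inserted into a buffer. Both functions go through `eval` with a non-nil second argument so that they are evaluated with lexical binding.

```elisp
(defvar demo-hello-seen nil "Values seen by the callback, newest first.")

;; Plays the role of fp.el.
(eval '(defun demo-fp-repeat (n func)
         "Call FUNC repeatedly, N times."
         (dotimes (demoidx n)
           (funcall func)))
      t)

;; Plays the role of hello.el.
(eval '(defun demo-hello-stuff ()
         (setq demo-hello-seen nil)
         (let ((demoidx "1"))
           (demo-fp-repeat 3 (lambda () (push demoidx demo-hello-seen))))
         (reverse demo-hello-seen))
      t)

;; Before the casual defvar: the callback sees its own binding.
(defvar demo-hello-before (demo-hello-stuff))

;; A casual hyphenless defvar, as in the arithmetic experiment.
(defvar demoidx 100)

;; After: the loop counter of `demo-fp-repeat' shadows the caller's value.
(defvar demo-hello-after (demo-hello-stuff))
```

`demo-hello-before` is ("1" "1" "1") and `demo-hello-after` is (0 1 2), which mirrors how the loop counter inside fp-repeat takes over in the example above.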
7.4. passing a synchronous callback code
This example is a slight modification of the previous example. Suppose hello.el defines hello-insert-stuff-2 which is like hello-insert-stuff except it uses fp-repeat-do. fp-repeat-do is a Lisp macro defined in fp.el and is simply a macro version of fp-repeat.
(defun hello-insert-stuff-2 ()
  (interactive)
  ;; inserts "1111111111\n2222222222\n3333333333" to current buffer
  (dolist (i (list "1" "2" "3"))
    (fp-repeat-do 10 (insert i))
    (insert "\n")))
We also suppose that fp-repeat-do is implemented using fp-repeat:
(defmacro fp-repeat-do (n &rest body)
  "Note: use this macro only in lexical bound Emacs Lisp files"
  (declare (indent 1))
  `(fp-repeat ,n (lambda () ,@body)))
Alice running her code for testing arithmetic operations has the effect of also breaking the intended behavior of hello-insert-stuff-2.
7.5. mid wrap-up
Someone deviating from naming conventions mentioned in this post causes collision which then breaks the intended behavior of someone else’s code that involves passing callbacks.
7.6. returning an anonymous function (which is a closure)
Suppose fp.el defines fp-counter, which returns a sort of generator of numbers that start from a certain value and increase by a certain step. For example, (fp-counter 100 2) returns a function that returns 100, 102, 104, … on repeated calls.
(defun fp-counter (count-start count-step)
  (let ((current-number count-start))
    (lambda ()
      (prog1 current-number
        (setq current-number (+ current-number count-step))))))
Notice that the author of fp.el violated Rule L3 twice in that code: the anonymous function mentions two nonlocal variables with hyphenated names, current-number and count-step.
Now suppose Alice installs current.el which we suppose is an emacs lisp package for reporting the amount of electric current going through Emacs. What does that mean? I don’t know. Anyway, suppose current.el happens to include this line:
(defvar current-number nil
  "The amount of electric current in teaspoons. This number is updated every `current-update-interval' seconds")
When Alice turns on reporting of electric current, everything that relies on fp-counter will break.
7.7. local function example
I cannot come up with a good example combining violation of naming conventions and use of local functions leading to trouble.
8. Common Lisp note
Common Lisp programmers use a different way of eliminating invasion: the earmuffs convention. They put asterisks around special variable names, and those asterisks are called earmuffs. There is a guy[4] who has not been using the earmuffs convention for years though. I don’t know about Common Lisp, but with Emacs Lisp, if you don’t stick to the naming conventions, your code will invade others’ code and vice versa.
setq-ing an undefined variable name at top level may do different things depending on the Common Lisp implementation, including:
- creates a lexical global, that is, declares a global variable (or in Lisp speak, sets the global value of the variable) without making it special (or in Common Lisp speak, without proclaiming it special).
- creates a global special variable.
[1] He used that phrase in The Idiot’s Guide to Special Variables and Lexical Closures.
[2] Bacteria may move from raw chicken to the cutting board or knife and then from there to lettuce which goes to the rice burgers. See http://cooking.stackexchange.com/questions/2209/how-do-you-properly-clean-a-cutting-board-and-knife-to-prevent-cross-contaminati
[3] Actually not true. It will insert the character corresponding to ASCII code 0, then that of 1, and so on.
[4] Doug Hoyte, who wrote Let Over Lambda.