flood zones in Seoul

While looking for a place to live in Seoul, it came to my mind that I should check out which parts of Seoul tend to get flooded and here is what I found. Let’s hope that this post becomes obsolete in future as the local government is taking some measures.

Several news article in 2014 reported that the Seoul government named five special regions to take care of against floods and these are regions near:

  • 강남역
  • 사당역
  • 광화문
  • 도림천
  • 한강로

도림천 (a small river) and 한강로 (a road) can be located as red lines in Naver Map. Locating the rest (강남역, 사당역, 광화문) should be obvious.

A news article in 2013 named top 4 regions that get flooded again and again and these are regions near:

  • 강남역
  • 광화문
  • 안양천
  • 사당역

There is also an interactive map showing the often flooded regions in Seoul, but it requires you to select a district first: 풍수해 정보지도. It can show affected areas in each of the years 2010, 2011, 2012, 2013.

Posted in Uncategorized | Tagged , , | Leave a comment

Putting a bar or a tilde over a letter in LaTeX

As you are aware, there are commands to put a bar or a tilde over a symbol in math mode in LaTeX. Sometimes, the output doesn’t come out the way some of us might expect or want.

Fortunately, there are alternative commands that do the same task differently that we can try and there are also other ways of using the same commands.

To put a tilde over a letter, we can use either \tilde or \widetilde. As for which one to use in which situation, compiling a document with the following as its part can help comparison.

$\tilde{A}$ vs $\widetilde{A}$

$\tilde{\mathcal A}$ vs $\widetilde{\mathcal A}$

$\tilde{ABC}$ vs $\widetilde{ABC}$

To put a bar over a letter, we can use either \bar or \overline. It seems that \bar is to \overline what \tilde is to \widetilde. There don’t seem to be \overtilde or \widebar.

$\bar{A}$ vs $\overline{A}$

$\bar{\mathcal A}$ vs $\overline{\mathcal A}$

$\bar{ABC}$ vs $\overline{ABC}$

Over a symbol with a subscript or superscript index, one can imagine four ways of putting a bar or tilde over it. Among the four ways, one can say that one or two of them looks right but that may depend. I can’t comment on which ones look good. You just have to try each and decide with your colleagues.

$\tilde{A_2}$ vs $\widetilde{A_2}$ vs $\tilde{A}_2$ vs $\widetilde{A}_2$

$\bar{A_2}$ vs $\overline{A_2}$ vs $\bar{A}_2$ vs $\overline{A}_2$
Posted in Uncategorized | Tagged | Leave a comment

why learn elisp (Emacs Lisp)

This post is the intro module of Living with Emacs Lisp.

The goal of this post is to hopefully convince you why you should learn elisp. Readers are assumed to be beginning users of Emacs. (Of course he who does not use Emacs does not need to learn elisp)

So you are a beginning user of Emacs and let me raise a relevant question: suppose someone asked you “Why did you choose Emacs?” The longer question would be “What is the point of using Emacs in particular when there are other great text editors out there that are also cross-platform, customizable and extensible?”

You probably have your answer to the question. If you want to hear my answer, I’d say “I choose Emacs because Emacs is customizable and extensible to an extreme degree. I believe that is the defining character of Emacs. That may not mean that everybody should use Emacs but that is certainly why I will continue to use Emacs.”

I think my customization of Emacs is only to a mild degree and I may never actually customize it to an extreme degree. However, because I know that Emacs allows extreme customization, I can be sure that Emacs allows any kind of mild customization I might want later. It’s like how some columnist would feel sure that organizations like ACLU will come defend his right to say his mildly controversial speeches if needed after he has seen ACLU defend those with extreme opinions.

The extent to which Emacs can be extended to suit your needs. You can do a lot with Emacs without learning elisp: you can define keyboard macros, you can install and use Emacs packages for your needs, you can use the Customize facility to customize your Emacs and packages to some extent. That is, without learning elisp, you can already experience Emacs to be one of those editors that are cross-platform, customizable and extensible. As you learn elisp and as you start to use hooks in interesting ways, or make your own small minor modes, or even make your own Emacs package, you start to experience Emacs to be an editor that is customizable and extensible to an extreme level. Think of that as a kind of premium feature of Emacs which you unlock by paying, and you pay not with dollars, but with time: time devoted to learning elisp.

On the other hand, there is an old saying: “Time is money”. Of course we can’t and shouldn’t devote all day to learning elisp. And we wouldn’t expect that one hour of learning elisp would unlock all the power. It’s a gradual thing. A learning curve is there, but not as steep as you might expect. Do you have something in mind that you always wanted to have in your text editor? Your mission, should you choose to accept it, is to learn enough elisp to make that something happen.

Posted in Emacs | Tagged | Leave a comment

Type backslash easily in Emacs AUCTeX

Writing LaTeX documents seem to require many presses of backslashes and the backslash key is located at awkward place and hard to type in some types of keyboards. For users of Emacs AUCTeX, there are some ways to ease this pain.

1. type slash to type backslash

With this elisp code added to your init file, whenever you type slash in a LaTeX buffer, backslash is inserted instead. The slash key is easier to reach.

(defun my-make-slash-backslash ()
  (local-set-key (kbd "/") "\\"))

(add-hook 'TeX-mode-hook 'my-make-slash-backslash)

When you have to enter slash itself, which is not often, use C-q. When you have to enter many slashes, for example you could be typing a url, in this case I believe copy and paste is the simplest workaround. It is possible to extend the code so that typing backslash results in insertion of slash as well, so that is another workaround, but I find that disorienting, so I don’t use that workaround, but you may want to use that.

If you want to be able to turn on and off “slash turns into backslash” on the fly, you can write a minor mode for that.

2. use the two commands

Another way to enter backslashes with ease is to use C-c C-m to enter a TeX macro (things like \section{...} and \frac{..}{..}) and C-c C-e to enter a LaTeX environment (things like \begin{..} .. \end{..}) in LaTeX buffers.

Posted in Emacs | Tagged | Leave a comment

Doob-Dynkin Lemma for probability space

Let f, g be measurable functions from a probability space (\Omega, \mathcal F, \mathbb P) to measurable spaces (X, \mathcal A) and (Y, \mathcal B) respectively. Consider the following three conditions:
(1) \sigma(f) \supset \sigma(g) \bmod \mathbb P (in other words, each element in \sigma(g) is equivalent (up to null sets) to some element in \sigma(f). The notation \bmod \mathbb P is used when an author wants to make explicit the practice of treating sub-sigma algebras ignoring null sets.)
(2) There is a measurable H: (X, \mathcal A) \to (Y, \mathcal B) with g = H \circ f a.e.
(3) There is a measure preserving map H: (X, \mathcal A, \mu) \to (Y, \mathcal B, \nu) with g = H \circ f a.e. where \mu, \nu are the pushforward measures obtained from \mathbb P.

(2) implies (3) trivially. (3) implies (2) trivially (even if the measure preserving map is only a.e. defined.)

(2) implies (1) trivially. The Doob-Dynkin lemma for probability space is that (1) implies (2) if (Y, \mathcal B) is a standard Borel space. In this post, we present some proofs of this lemma and some corollaries and others. Compare with the Doob-Dynkin lemma for measurable spaces (the previous article). As we’ll see, Doob-Dynkin lemma for probability space is a special case of that for measurable spaces. The one for probability spaces suffices for most applications in probability theory and ergodic theory.

1. Proof 1

In this proof, we simply show that the Doob-Dynkin lemma for probability space follows from that for measurable spaces. It is enough to show that the condition \sigma(f) \supset \sigma(g) \bmod \mathbb P can be improved to \sigma(f) \supset \sigma(g) after discarding all points in some null set from \Omega.

For each B \in \mathcal B, there is A_B \in \mathcal A such that the symmetric difference E_B := g^{-1}B \cap f^{-1}A_B is a \mathbb P-null set. It would be nice to be able to discard all points in the union \bigcup_{B \in \mathcal B} E_B but is this a null set? Perhaps not.

Exercise 10. Show that it is enough to discard all points in the countable union \bigcup_{n} E_{B_n} where (B_n)_n is a sequence that generates \mathcal B (Existence of such a sequence is guaranteed by the assumption that (Y, \mathcal B) is a standard Borel space).

2. Proof 2

In this proof, we adapt an argument in the previous article to our setting of ignoring null sets. We divide into two cases.

Case I: when the image of g countable

In this case, there is A_i \in \mathcal A such that g^{-1}y_i = f^{-1}A_i a.e. for each i.

Notice that f^{-1}A_i form a countable partition modulo \mathbb P, in other words, the intersection of f^{-1}A_i and f^{-1}A_j is a null set when i \ne j and their union over all i has measure 1, or equivalently that for a.e. \omega, there exists unique i for which \omega \in f^{-1}A_i.

Exercise 20. Show that A_i form a countable partition mod \mu. (the general idea is that the measure algebra homomorphism A \mapsto f^{-1}A behaves like an inclusion map)

Let H be any (everywhere-defined) measurable function from X to Y such that for \mu-a.e. x and for each i we have x \in A_i \implies H(x) = y_i. It is easy to define such an H.

Now it only remains to show g = H \circ f a.e. For a.e. \omega \in \Omega, we have:
there is i for which f(\omega) is in A_i;
for that i, H(f(\omega)) = y_i (by property of H);
but that y_i = g(\omega) (by how A_i chosen);
so H\circ f = g on this \omega.

Case II: when the image of g is not countable.

WLOG assume (Y, \mathcal B) is the unit interval.

Let g_n := \lfloor 2^n g \rfloor / 2^n. There is a measurable H_n: (X, \mathcal A) \to (Y, \mathcal B) with g_n = H_n \circ f a.e..

Notice that g_n \nearrow g holds everywhere. It would be nice to take H = \lim H_n but we only know H_n converges at least on f(\Omega_0) for some subset \Omega_0 of measure 1, and we don’t know if this image is measurable.

A work around is that we let H be any (everywhere-defined) measurable such that H(x) = \lim H_n(x) for all x for which the limit exists. Such H exists and with this H we can proceed to show g = H \circ f a.e..

Another work around is to notice that the subset [H_n \nearrow] (alternatively also [\exists \lim H_n]) is measurable and so one can show that its measure is 1. Then we set H to be a \mu-a.e. limit of H_n. Then H \circ f is a P-a.e. limit of H_n \circ f. So we can proceed with this H too.

3. Proof 3

In this proof, we show that (1) implies (3) using the fact that (Y, \mathcal B, \nu) is a Lebesgue space (which follows because (Y, \mathcal B) is a standard Borel space) and the fact that a Lebesgue space can be approximated by a sequence of partitions. This proof is probably equivalent to the previous one, but let’s do it for illustration.

The idea in this proof is that points in X (resp. Y) more or less should correspond to appropriate nested sequence of elements in \mathcal A (resp. \mathcal B), so in order to build H, we only need establish an appropriate procedure to transform an appropriate nested sequence of elements in \mathcal A to that of \mathcal B, but that is done for free by observing that with abuse of notation we may safely pretend that \mathcal B \subset \mathcal A \subset \mathcal F \bmod \mathbb P (if we identify \sigma(f), \sigma(g) with \mathcal A, \mathcal B). We’ll proceed without abuse of notation.

Notice that WLOG we may replace (X, \mathcal A, \mu) and (Y, \mathcal B, \nu) with any other almost-isomorphic probability spaces at the minor cost of losing everywhere definedness of f, g.

WLOG there is a non-decreasing sequence of countable partitions \beta_n \subset B such that for any non-increasing choice B_n \in \beta_n the intersection \cap_n B_n is a single point and that (\beta_n)_n generates \mathcal B. This is possible because we may assume that the probability space Y is a disjoint union of the Cantor space and atoms, which follows from the result that any Lebesgue space is almost-isomorphic to a disjoint union of an interval and atoms).

Now we want to pick a good sequence \alpha_n such that for all n we have g^{-1}(\beta_n) = f^{-1}(\alpha_n) \bmod \mathbb P. We can assume that for each n all elements of \beta_n (and hence also those of g^{-1}(\beta_n) too) have positive measure. So we can ensure that all elements of \alpha_n have positive measure. For each n, \alpha_n is a countable partition modulo \mu and the sequence (\alpha_n)_n is non-decreasing modulo \mu. After discarding only countably many \mu-null sets from X, we can ensure that \alpha_n form a non-decreasing sequence of countable partitions. The reason we ensure elements of \alpha_n and \beta_n only have elements of positive measure is because we want to ensure the natural bijection between \alpha_n and \beta_n.

For each x \in X, let \alpha_n(x) be the unique element in \alpha_n containing x, and let \beta_n(x) be the unique element in \beta_n which corresponds to \alpha_n(x) in the sense that g^{-1}(\beta_n(x)) = f^{-1}(\alpha_n(x)) \mod \mathbb P. Define H by \{H(x)\} = \cap_n \beta_n(x) which is a well defined map from X to Y (which is because \beta_n(x) is non-decreasing (for a fixed x) because its counterpart in \Omega is non-decreasing up to null sets because its counterpart in Y is.).

(Measurability of H) For each B_n \in \beta_n, it is easy to show that H^{-1}(B_n) is measurable, moreover, equal to A_n (where A_n \in \alpha_n is the element corresponding to B_n).

(On g = H \circ f a.e.) Notice that H is measurable and everywhere defined, so g' := H\circ f is an almost-everywhere defined measurable function. So we are comparing two a.e. defined measurable functions from (\Omega, \mathcal F, \mathbb P) to (Y, \mathcal B). Inverse images of B_n \in \beta_n under the two functions are equal up to null sets because of H^{-1}(B_n) = A_n. Now notice that the set [g \neq g'] is a subset of \bigcup_{n, B, B': B, B' \in \beta_n, B \neq B'} (g^{-1}B \cap g'^{-1}B') where the terms in the latter are null sets because of the previous sentence. We have shown g = g' a.e..

(On measure preserving) It is easy to show that H is measure preserving from the fact that g = g' a.e.. Alternatively, it also follows from H^{-1}(B_n) = A_n.

4. Uniqueness of H

If H and H’ satisfy (2) , then H = H' holds \mu-a.e.. This follows from the following two observations:

  • H \circ f = H' \circ f holds \mathbb P-a.e.
  • [H = H'] is measurable (because (Y, \mathcal B) is a standard Borel space).

5. applications

Exercise 25. Let f, g be measurable functions from a probability space (\Omega, \mathcal F, \mathbb P) to measurable spaces (X, \mathcal A) and (Y, \mathcal B) respectively. Show that \sigma(f) = \sigma(g) \bmod \mathbb P if and only if there are X_0 \in \mathcal A with \mu(X_0) = 1 and Y_0 \in \mathcal B with \nu(Y_0) = 1 and a bi-measurable bijection H: X_0 \to Y_0 such that g = H \circ f a.e..

Exercise 30. Given two measure-theoretic dynamical systems (X, \mathcal A, \mu, T) and (Y, \mathcal B, \nu, S) (measure preserving transformations on Lebesgue spaces), show that the latter is a factor of the former if and only if there is a joining \lambda (a T\times S-invariant probability measure on (X \times Y)) such that \mathcal B \subset \mathcal A mod \lambda (with abuse of notation).

Posted in Mathematics | Tagged , | Leave a comment

Proofs of Doob-Dynkin Lemma

Let f, g be functions from a non-empty set \Omega to measurable spaces (X, \mathcal A) and (Y, \mathcal B) respectively. Consider the following two conditions:
(1) \sigma(f) \supset \sigma(g)
(2) There is a measurable function H: (X, \mathcal A) \to (Y, \mathcal B) with g = H \circ f

By \sigma(f), I mean the sigma-algebra generated by f, so \sigma(f) = f^{-1}(\mathcal A).

The Doob-Dynkin Lemma is that the two conditions (1) and (2) are equivalent under the assumption that the measurable spaces are reasonable ones. The intuition for the lemma is this: when \mathcal C and \mathcal D are two sigma-algebras on a common underlying set such that \mathcal C \supset \mathcal D, then intuitively it means that the information specified by \mathcal D is contained in the information specified by \mathcal C. So the condition (1) should mean that the information as to what is the value of f(\omega) (given some unknown point \omega \in \Omega) contains the information as to what is the value of g(\omega), in other words, f(\omega) determines g(\omega), that is, g is a function of f. The condition (2) says g is a measurable function of f.

Applications and corollaries of this lemma include

  • characterization of the property of a function being measurable with respect to a sub-sigma-algebra of a given sigma-algebra
  • writing a conditional expectation of a random variable given another random variable as a Borel-measurable function of the latter random variable
  • characterization of the property of two measure theoretical dynamical systems being isomorphic by existence of some special kind of joining of the two.

Notice that (2) implies (1) trivially. We’ll prove that (1) implies (2) if (Y, \mathcal B) is a standard Borel space. But before that, let’s investigate how unique H is.

1. uniqueness of H

If H and H' are two measurable functions satisfying (2), then H=H' on f(\Omega). But notice that we don’t know if the image f(\Omega) is X. We don’t even know if the image is measurable.

If (Y, \mathcal B) is a standard Borel space, then it is known that the set [H=H'] is measurable. In that case, we can say that H=H' holds almost everywhere in the following sense. For the push forward measure \mu (on \mathcal A) of any probability measure on \sigma(f) under f, we have that H=H' holds \mu-a.e. because \mu[H = H'] = 1 because f^{-1}[H=H'] = \Omega.

2. counterexamples

If both \mathcal A and \mathcal B are trivial sigma-algebras, i.e., if \mathcal A = \{\emptyset, X\} and \mathcal B = \{\emptyset, Y\}, then it is easy to find an example where (1) holds but (2) does not. (Exercise 10. find such an example)

With sigma-algebras that separate points, I don’t know any counter-examples but we can at least say that in this case (1) implies g being a (possible non-measurable) function of f.

Exercise 20. Under the assumption that \mathcal B separates points of Y (in the sense that for any two distinct points y,y' there is B \in \mathcal B such that y \in B and y' \not\in B), show first that f(x) = f(x') \implies g(x) = g(x') and then that there is a function H: X \to Y such that g = H \circ f.

3. proof when g has a countable range, with nice Y

Let’s first prove the Doob-Dynkin lemma for the case when the image g(\Omega) is a countable subset in a standard Borel space (Y, \mathcal B).

Suppose y_1, y_2, \dots (finitely many or countably infinite) are all points in the image g(\Omega). For each i, the singleton \{y_i\} is measurable. This follows from the possibly overkill assumption that (Y, \mathcal B) is a standard Borel space and this happens to be the only time we use that assumption in this proof.

Since \{y_i\} is measurable, (1) then implies that there is A_i \in \mathcal A such that g^{-1}y_i = f^{-1}A_i

Notice that f^{-1}A_i form a countable partition. We do not know if A_i form a partition of \Omega. We do not know if f(\Omega) is X or if it is measurable. But at least we can conclude that A_i form a countable partition modulo f(\Omega), by which I mean that A_i \cap f(\Omega) form a partition of f(\Omega). In other words, for each x in the image f(\Omega), there exists unique i such that x \in A_i. (I haven’t checked if we can assume WLOG that f is onto.)

Let H be a measurable function from X to Y such that for each x \in f(\Omega) and for each i we have x \in A_i \implies H(x) = y_i. It is easy to define such H. (Exercise 30. Define such H. Beware that f(\Omega) may not be measurable.)

Now it only remains to show g = H \circ f which we can check point-wise. For each \omega \in \Omega, we have:
there is i for which f(\omega) is in A_i;
for that i, H(f(\omega)) = y_i (by the property we specified for H);
but that y_i = g(\omega) (by how A_i is chosen);
so H\circ f = g on this \omega.

4. proof when Y is the Cantor space

There is a slick proof of the fact that (1) implies (2) when (Y, \mathcal B) = (\{0,1\}^{\mathbb N}, \mathcal B) is the Cantor space along with its Borel sigma-algebra, or equivalently the countably infinite product of the discrete measurable space (\{0,1\}, \mathcal D) (where \mathcal D is just the discrete sigma-algebra for \{0,1\}.)

Let p_i: \{0,1\}^{\mathbb N} \to \{0,1\} be the projection to the i’th coordinate. This is a measurable function. It is easy to see that (1) implies that p_i \circ g = H_i \circ f for some measurable function H_i from X to \{0,1\}. Now define H(x) = (H_1(x), H_2(x), \dots), then H is a measurable function and we have g = H \circ f.

5. proof when Y is the unit interval

Now let’s show that (1) implies (2) when (Y, \mathcal B) is the unit interval along with its Borel sigma-algebra. You may know that the unit interval is isomorphic to the Cantor space (as measurable spaces) (and also to the real line), so showing this is redundant, but let’s do this anyway for illustration purposes.

Let g_n := \lfloor 2^n g \rfloor / 2^n. These are approximations to g. Since we have already proved Doob-Dynkin’s lemma for countable range cases, there is a measurable function H_n: (X, \mathcal A) \to (Y, \mathcal B) with g_n = H_n \circ f.

Notice that g_n \nearrow g. You might want to define H to be \lim H_n, but there is a little problem that we don’t know if the limit exists everywhere. At least we know that H_n converges on f(\Omega). You might now want to define H to take the limit value just on the image f(\Omega), and to take 0 as its value outside the image, but here another problem is we don’t know if this H is measurable: The image f(\Omega) may not be measurable.

There is a way to work around that problem. Just let H be any measurable function from X to Y such that H(x) = \lim H_n(x) holds for all x for which the limit exists. There are at least two ways of taking such an H:

  1. Let H be \lim H_n where the limit exits, and 0 where the limit does not exist; or
  2. Let H be \limsup H_n everywhere.

Now it only remains to show g = H \circ f. For each \omega \in \Omega, we have:
g(\omega) = \lim g_n (\omega) = \lim H_n \circ f (\omega) = H \circ f (\omega).

(Another work around: Define a measurable H such that H = \lim H_n holds on f(\Omega) by letting H = \sup_n H_n everywhere. Then proceed with this H.)

6. proof for standard Borel space case

If (Y, \mathcal B) is a standard Borel space, then it is known that this measurable space is either discrete or isomorphic to the unit interval, but we have proved Doob-Dynkin’s lemma for both cases already.

Posted in Mathematics | Tagged , | Leave a comment

Lisp lists and destructive functions

Goal of this post is to be some small introduction to lists in Lisp. Since functions and macros for manipulating lists can be easily found by searching the Internet, instead of listing all those helpful functions, this post will focus more on making you prepared for terminology and gotchas surrounding Lisp lists. For beginners to be able to use those functions without falling to common mistakes, it is crucial to know the meaning of phases like “modify a list”, “this list shares structure with that list”, “destructive function”, “literal list”.

All Lisp snippets in this post are in elisp, unless stated otherwise. Nevertheless, everything that is said on this post also applies to Common Lisp, not just elisp, unless stated otherwise. To translate some snippets into Common Lisp, see the Common Lisp note at the end. This post assumes the reader have read the previous article about uniform reference semantics and its implications (for container objects and function parameter passing).

Since this post is part of the Living with Emacs Lisp series, I am mainly speaking to elisp beginners. Nevertheless, if you see something that’s elisp-only that’s not explicitly stated as being elisp-only, feel free to point out.

Credits: Chapter 12 of Practical Common Lisp was of huge help for me to eliminate all the confusion I had on this subject some time ago. Large part of this post draws from that chapter.

1. nil and non-nil

The only falsy value in Lisp is the symbol nil. Also, the value of the variable nil is always nil, in other words, the symbol nil evaluates to itself. In case this is confusing, consider running the following code:

(setq aa (+ 1 1))
(setq bb 'ha)

(print aa)
(print bb)

The value of the variable aa is now the number 2, and the value of the variable bb is now the symbol ha. Since the mapping is really from the symbols aa and bb to other objects as mentioned the previous article, this means that the symbol aa now evaluates to the number 2, and the symbol bb now evaluates to the symbol ha. It is in this context that the symbol nil always evaluates to itself. If you try to change that with this code:

(setq nil (+ 1 1))

you get an error message like this in Emacs:

Attempt to set a constant symbol: nil

or in case of a Common Lisp interpreter:

NIL is a constant and thus can't be set.

Since nil is the only false object and everything else is truthy, you will encounter words like non-nil or nil more often than words like true or false or truthy or falsy in articles about Lisp. For example, the elisp docstring for if says:

if is a special form in `eval.c'.

(if COND THEN ELSE...)

If COND yields non-nil, do THEN, else do ELSE...

and that is another way of saying “If COND is true, do THEN, else do ELSE…”

2. cons cell

A cons cell is a Lisp object with two slots; each slot holds, or refers to1, some Lisp object. One slot is known as the car, and the other is known as the cdr.

The following code creates a cons cell that holds 11 and 22, and then illustrates how to access its car and its cdr.

(setq cc (cons (+ 10 1)
               (+ 20 2)))

(print cc) ; ⇒ (11 . 22)

;; the car of cc is 11
(print (car cc)) ; ⇒ 11

;; the cdr of cc is 22
(print (cdr cc)) ; ⇒ 22

The following code then shows how to modify the cons cell.

;; set the car of cc to 1111
(setf (car cc) 1111)

(print cc) ; ⇒ (1111 . 22)

A cons cell is sometimes just called a cons (plural: conses).

3. cons cells in diagrams

Let’s start with this code:

(setq aa (cons (+ 1 1) (+ 2 2)))
(print aa)

(setq bb aa)

(setq the-car (car aa))
(setq the-cdr (cdr aa))

The first line creates a cons holding two numbers 2 and 4. The second line prints it. It is printed as (2 . 4). The last two lines show how to access the components of the cons. The following ASCII art diagram shows what happens:

. aa: -----+        
.          |        
.          |
.          v         
.                    
.        +-------------+
.        |    car: cdr:|
.        |     |    |  |
.        |     |    |  |
.        +-----|----|--+
.              |    |
.          ^   |    |
.          |   v    v
. bb: -----+               
.              2    4      
.                          
.              ^    ^      
.              |    |      
.              |    |      
.        the-car:  the-cdr:

Now we modify the cons by adding 10 to its car and cdr:

(setf (car aa)
      (+ (car aa) 10))

;; another way
(require 'cl-lib)
(cl-incf (cdr aa) 10)

(print aa) ; ⇒ (12 . 14)
(print (car aa)) ; ⇒ 12
(print the-car) ; ⇒ 2

The aftermath in diagram:

. aa: -----+        
.          |        
.          |
.          v         
.                    
.        +-------------+
.        |    car: cdr:|
.        |     |    |  |
.        |     |    |  |
.        +-----|----|--+
.              v    v
.          ^         
.          |   12   14 
. bb: -----+               
.              2    4      
.                          
.              ^    ^      
.              |    |      
.              |    |      
.        the-car:  the-cdr:

Before we get to more complex diagrams, let’s see how we can simply diagrams of cons cells. Above diagram can be simplified as:

. aa: -----+        
.          |        
.          |
.          v         
.                    
.         +------+-----+
.         |      |     |
.         |    | |  |  |
.         +----|-+--|--+
.              v    v
.          ^         
.          |   12   14 
. bb: -----+               
.              2    4      
.                          
.              ^    ^      
.              |    |      
.              |    |      
.        the-car:  the-cdr:

And now let’s draw numbers next to names for further simplification:

. aa: -----+        
.          |        
.          |
.          v         
.                    
.         +------+-----+
.         |      |     |
.         |  12  |  14 |
.         |      |     |
.         +------+-----+
.                     
.          ^          
.          |           
. bb: -----+               
.                          
.                          
.           2         4    
.        the-car:  the-cdr:

A cons cell can hold any type of object. For example, if we further run the following code, then the cons cell aa holds one number and one other cons cell.

(setf (cdr aa) (cons 0 0))

So the diagram is then:

. aa: -----+        
.          |        
.          |
.          v         
.                    
.         +------+-----+     +-----+------+
.         |      |     |     |     |      |
.         |  12  |   ----->  |  0  |  0   |
.         |      |     |     |     |      |
.         +------+-----+     +-----+------+
.                                      
.          ^                           
.          |                           
. bb: -----+

4. lists

The following code makes a list of three elements and assigns it to the variable aa

(setq aa (list 10 11 12))

The above code is actually equivalent to the following code.

(setq aa (cons 10 (cons 11 (cons 12 nil))))

The elisp reference says “Lists in Lisp are not a primitive data type; they are built up from cons cells”. This just means, to quote from Practical Common Lisp, “The key to understanding lists is to understand that they’re largely an illusion built on top of objects that are instances of a more primitive data type. Those simpler objects are pairs of values called cons cells”. (Vectors are primitive data type on the other hand. This is why I kept using vectors in the previous article.)

A list in Lisp is really a (singly) linked list made of cons cells. When we say aa is a list of three elements 10, 11, 12, this actually means the following diagram:

       +-----+-----+   +-----+-----+   +-----+-----+  
       |     |     |   |     |     |   |     |     |  
aa --> |     |   ----> |     |   ----> |     |   ----> nil
       |  |  |     |   |  |  |     |   |  |  |     |  
       +--|--+-----+   +--|--+-----+   +--|--+-----+  
          |               |               |           
          v               v               v           

          10              11              12

When we say the structure of the list aa or list structure of aa, we mean the three cons cells in the above diagram.

Now run the following code.

(setq aa1 (cdr aa))
(setq aa2 (cdr aa1))

aa1 refers to the second cons cell in the diagram. aa2 is the last cons cell. On the other hand, aa1 and aa2 are lists too: aa1 is a list of two elements, and aa2 is a list of one element. What is a list of zero elements? Run the following code.

(setq aa3 (cdr aa2))

aa3 is nil, but also an empty list. nil is how Lisp represents an empty list.

                           aa1             aa2              

                             |               |         aa3
                             v               v          
                                                        |
       +-----+-----+   +-----+-----+   +-----+-----+    v
       |     |     |   |     |     |   |     |     |        
aa --> |     |   ----> |     |   ----> |     |   ----> nil     
       |  |  |     |   |  |  |     |   |  |  |     |  
       +--|--+-----+   +--|--+-----+   +--|--+-----+  
          |               |               |           
          v               v               v           

          10              11              12
(print aa)  ; ⇒ (10 11 12)
(print aa1) ; ⇒ (11 12)
(print aa2) ; ⇒ (12)
(print aa3) ; ⇒ nil

This is how you access elements of a list:

(car (cdr (cdr aa))) ; ⇒ 12


;; alternative way
(elt aa 2)   ; ⇒ 12

;; all elements
(elt aa 0)   ; ⇒ 10
(elt aa 1)   ; ⇒ 11
(elt aa 2)   ; ⇒ 12

5. improper lists and proper lists

The elisp reference says “By convention, the cdr of the last cons cell in a list is nil. We call such a nil-terminated structure a true list” It sounds like there are some kind of improper lists. There are two kinds of improper lists. The first kind, dotted lists, are like usual lists except that the cdr of their last cons cells are not nil.

;; This is a dotted list.
(setq dotted (cons 0 (cons 1 (cons 2 100))))

;; Let's see how it is printed. (This is why it is called a dotted list)
(print dotted) ; ⇒ (0 1 2 . 100)

dotted is NOT a list of four elements 0, 1, 2, 100. dotted is just an improper list.

The second kind of improper lists are circular lists which are like usual lists, except they don’t have their last cons cells because they are circular in the sense illustrated by the following code.

;; this is a proper list for now
(setq circ (list 0 11 22 33 44))

;; this is its last cons cell
(setq last-cons (last circ))

;; this is some cons cell in the middle
(setq mid-cons (cddr circ))

;; we modify the structure of circ to make it a circular list
(setf (cdr last-cons) mid-cons)

Lisp functions that take a list as its argument usually expect you to pass a proper list, but some functions can take improper lists as well. It is good to be aware of improper lists because docstrings of functions may mention them.

6. check for empty list, non-empty list, list

The function listp is used for checking if the argument is a list or not.

(listp (list 1 2 3))    ; ⇒ t
(listp nil)             ; ⇒ t
(listp 100)             ; ⇒ nil
(listp (vector 1 2 3))  ; ⇒ nil
(listp "hello")         ; ⇒ nil

In fact, listp simply check if the argument is either a cons cell or nil. If it is a cons cell, then one can argue that it is a non-empty list (which may be an improper list), and if it is nil, then it is an empty list.

The function consp is used for checking if the argument is a cons cell, and one can argue that this is the same as checking if it is a non-empty list.

(consp (cons 100 200)) ; ⇒ t
(consp 100)            ; ⇒ nil
(consp (list 1 2 3 4)) ; ⇒ t
(consp nil)            ; ⇒ nil

To check if a variable holds an empty list, you simply check if it is nil.

7. sequence

Sometimes the docstring of a function may say it takes a sequence as one of its arguments. A sequence is a supertype of vector, list, and maybe some other types. So if the docstring of a function says it takes a sequence, you can pass a vector or list to it. Examples of such functions include length, elt, copy-sequence.

(length (list 0 1 2))   ; ⇒ 3
(length (vector 0 1 2)) ; ⇒ 3

(elt (list 0 11 22) 1)    ; ⇒ 11
(elt (vector 0 11 22) 1)  ; ⇒ 11

Also, a string is a sequence.

(length "Hello")  ; ⇒ 5
(elt "Hello" 2)   ; ⇒ 108 (is the character l in elisp)
(copy-sequence "Hello")  ; ⇒ "Hello"

8. literal list

What is a literal object? What is a literal list? For that, consider the following code

(list 0 (+ 10 1) (+ 20 2)) ; ⇒ (0 11 22)

;; This is a literal list.
'(0 11 22)   ; ⇒ (0 11 22)

(equal (list 0 (+ 10 1) (+ 20 2))
       '(0 11 22))
;; ⇒ t


;; A literal list should be quoted (with a single quote) in general in code.
(length '(0 11 22))   ; ⇒ 3

;; If you don't quote it, this happens.
(length (0 11 22))    ; ERROR: 0 is not a function.
;; Expecting the above line to evaluate to 3 is like
;; expecting the below line to evaluate to 11.
(length Hello world)  ; ERROR

(length "Hello world") ; ⇒ 11


;; This is a literal cons cell.
'(11 . 22)  ; ⇒ (11 . 22)

(equal (cons 11 22)
       '(11 . 22))
;; ⇒ t

;; These are quoted forms. Quoted forms give you literal objects.
'(11 . 22)
'(0 11 22)
'((0 1) (2 3) (4 5))

;; two ways of writing nested lists
(equal (list (list 0 1) (list 2 3) (list 4 5))
       '((0 1) (2 3) (4 5)))
;; ⇒ t

There are two kinds of literal objects. The first kind comes from writing quoted forms. The second kind comes from writing self-evaluating forms. What are self-evaluating forms?

;; These are self-evaluating forms. They evaluate to themselves.
"Hello world"  ; ⇒ "Hello world"
1234           ; ⇒ 1234


(setq aa 1234)

;; This is NOT a self-evaluating form. This is just a mod form.
(mod 25 10) ; ⇒ 5

;; This is NOT a self-evaluating form. This is just the variable aa.
aa        ; ⇒ 1234

The rule “One should not modify a literal object” means that you should not change any part of a literal object.

;; Assign a literal object to the variable aa.
(setq aa '((0 11) (22 33) (44 55)))

;; Let bb refer to the last element of aa.
(setq bb (elt aa 2)) ; ⇒ (44 55)

;; See what the first element of bb is.
(elt bb 0)  ; ⇒ 44

;; This modifies our literal object. This is NOT ok.
(setf (car bb) 4400)

;; This also modifies our literal object. This is NOT ok.
(setf (cdr bb) nil)

;; the aftermath
aa ⇒ ((0 11) (22 33) (4400))

Some code for illustration

(setq aa '(0 11 22))
(setq bb (list 0 11 22))

;; This modifies a literal object. This is NOT ok.
(setf (elt aa 2) 2200)

;; This is ok.
(setf (elt bb 2) 2200)


(setq aa "foo")
(setq bb (copy-sequence "foo"))

;; This modifies a literal string. This is NOT ok.
(setf (elt aa 0) (elt "m" 0))

;; This is ok.
(setf (elt bb 0) (elt "m" 0))
bb ; ⇒ "moo"

More detail on literal objects will be the subject of another post which I will link as soon I write it, for now it’s enough to know two facts:

  • You should not modify literal data.
  • A literal list is also referred to as “constant list” or “quoted list” (for obvious reasons).

9. shared structure

In the following diagram, we say that the list x and the list y share structure, because they share some tail. Recall that the list structure of x is cons cells making up the list x. The list structure of x and that of y share two cons cells, and that is why we say that the two lists share structure.

      +-----+-----+   +-----+-----+
      |     |     |   |     |     |
x --> |     |   ----> |     |     |  
      |     |     |   |     |  |  |  
      +-----+-----+   +-----+--|--+  
                               |     +-----+-----+   +-----+-----+
                               +---> |     |     |   |     |     |
                                     |     |   ----> |     | nil |
                               +---> |     |     |   |     |     |
                               |     +-----+-----+   +-----+-----+ 
      +-----+-----+   +-----+--|--+                                
      |     |     |   |     |  |  |                                
y --> |     |   ----> |     |     |                                
      |     |     |   |     |     |                                
      +-----+-----+   +-----+-----+

If we modify the list x by doing (setf (elt x 2) something), we change the element of x at index 2 (in the sense of new designation), that is, we change the car of the last-to-second cons cell. And that affects the list y too because the last-to-second cons cells for the list x and that for the list y are the same cons cell.

If we modify the list x by doing2 (setf (cdr (cddr x)) something), we change the cdr of the last-to-second cons cell, and depending on what something is, this can make x a longer list or a shorter list or an improper list. And that affects the list y too.

So doing something to the list x can affect the list y. This is why we sometimes need to watch out for lists that share structure with other lists, and there are many functions that return a list that may share structure with one of its arguments. Notice that (eq x y) being false (i.e. is nil) does NOT guarantee that the two list do not share structure. In the appendix is a function for checking if two lists share structure.

In the following diagram, the list z and the list w do NOT share structure, but they DO share elements. This will remind you of the shallow copy diagram in the previous article. We can say that the list z is a shallow copy of the list w. Often we just say that z is a copy of the list w.

      +-----+-----+   +-----+-----+   +-----+-----+   +-----+-----+
      |     |     |   |     |     |   |     |     |   |     |     |
z --> |     |   ----> |     |   ----> |     |   ----> |     | nil |
      |  |  |     |   |  |  |     |   |  |  |     |   |  |  |     |
      +--|--+-----+   +--|--+-----+   +--|--+-----+   +--|--+-----+
         |               |               |               |         
         v               v               v               v         

         10              11              12              13        

         ^               ^               ^               ^         
         |               |               |               |         
      +--|--+-----+   +--|--+-----+   +--|--+-----+   +--|--+-----+
      |  |  |     |   |  |  |     |   |  |  |     |   |  |  |     |
w --> |     |   ----> |     |   ----> |     |   ----> |     | nil |
      |     |     |   |     |     |   |     |     |   |     |     |
      +-----+-----+   +-----+-----+   +-----+-----+   +-----+-----+

Doing (setf (elt z 2) something) or (setf (cdr (cddr z)) something) affects only z and not w. This is unlike the case with x and y. Nevertheless, the two lists share elements, so if some of the elements were mutable objects, and if you mutate one of those mutable elements, then the change would been seen via both z and w. But you don’t have to worry about that often, because you usually expect and want that change to be seen both lists.

The following code first makes two lists that share structure in the way that matches our diagram for x and y, and then modifies the first list to see what happens.

(setq tail (list 22 33))

(setq x (cons 0 (cons 1 tail)))
(setq y (cons 2 (cons 3 tail)))

x  ; ⇒ (0 1 22 33)
y  ; ⇒ (2 3 22 33)

(setf (elt x 2) 2200)

x  ; ⇒ (0 1 2200 33)
y  ; ⇒ (2 3 2200 33)


(setf (cdr (cddr x)) nil)

x  ; ⇒ (0 1 2200)
y  ; ⇒ (2 3 2200)

The following code first makes the list z and w in the way that matches our diagram for z and w, and then modifies the first list to see what happens

(setq z (list 10 11 12 13))
(setq w (copy-sequence z))

z  ; ⇒ (10 11 12 13)
w  ; ⇒ (10 11 12 13)

(setf (elt z 2) 1200)

z  ; ⇒ (10 11 1200 13)
w  ; ⇒ (10 11 12 13)

(setf (cdr (cddr z)) nil)

z  ; ⇒ (10 11 1200)
w  ; ⇒ (10 11 12 13)

Since vectors are primitive data type unlike lists, a vector cannot share structure only partially with another vector. Two vectors are either the same object or completely distinct: there is no in-between. The need to watch out for shared structure between lists is a consequence of lists being an illusion. Another consequence is the need to assign back to the variable holding the list sometimes, which is the subject of the next two sections.

In the appendix is some tips for figuring out the correct diagram given a snippet where you are not sure which list share structure with which.

10. what it means to modify a list

The previous section gave examples of modifying a list and its effects on another list that happens to share structure with the former list. Here is an example of NOT modifying a list.

(setq aa (list 0 11 22))

aa  ; ⇒ (0 11 22)

;; Insert 10000 to aa.
(push 10000 aa)

aa  ; ⇒ (10000 0 11 22)

;; Insert 20000 to aa.
(push 20000 aa)

aa  ; ⇒ (20000 10000 0 11 22)

;; Pop out the first element from aa.
(pop aa) ; ⇒ 20000

aa  ; ⇒ (10000 0 11 22)

push and pop are treating aa like a stack structure. Nevertheless, no list is being modified in the above code. In fact, what is happening is analogous to the use of incf in the previous article. push and pop are macros and not functions, and the code (push 10000 aa) just expands to the code (setq aa (cons 10000 aa)), and the code (pop aa) expands to something like:

(let ((val (car aa)))
  (setq aa (cdr aa))
  val)

In the same sense that incf does NOT modify any numbers (which are immutable anyway), the following code does NOT modify any list.

;; Let's start with a literal list this time.
(setq aa '(0 11 22))

aa  ; ⇒ (0 11 22)

(setq aa (cons 10000 aa))

aa  ; ⇒ (10000 0 11 22)

(setq aa (cons 20000 aa))

aa  ; ⇒ (20000 10000 0 11 22)

(setq aa (cdr aa))

aa  ; ⇒ (10000 0 11 22)

Nevertheless, we say that we are adding 10000 to the list stored in aa by running (setq aa (cons 10000 aa)). Here is the docstring of push:

push is a Lisp macro in `subr.el'.

(push NEWELT PLACE)

Add NEWELT to the list stored in the generalized variable PLACE.
This is morally equivalent to (setf PLACE (cons NEWELT PLACE)),

Notice the phrases “generalized variable”, “place” being used in the docstring. Use of such phrases implies that the push macro, just like incf and setf, can be used with not just variables (like aa) but also with places like so:

(setq vv (vector (list 0 0 0)
                 (list 1 1 1)))

;; Push 9 into the PLACE (elt vv 0).
(push 9 (elt vv 0))

vv   ; ⇒ [(9 0 0 0) (1 1 1)]

The code (push 9 (elt vv 0)) just expands to an appropriate code. I mention this because this is why a push form CAN expand to some code that modifies a list, and the same is true with pop. For example, the following code shows that a pop form can expand to some code that modifies a list.

(setq aa (list 0 11 22))

(pop (cdr aa))  ; ⇒ 11

aa   ; ⇒ (0 22)

(pop (cdr aa)) expands to some code that mutates the first cons cell in an appropriate way. This is why push and pop are convenient: that pop form just magically expands to the right code and you don’t have to write the right code yourself. On the other hand, this also means that just because you are using pop and push does not mean that no list is being modified.

(setq aa (list 0 11 22))
(setq bb aa)

;; The following line will NOT affect bb, as it does not modify any list.
(push 10000 aa)

aa  ; ⇒ (10000 0 11 22)
bb  ; ⇒ (0 11 22)

;; Remember that the list aa and the list bb still share structure.
;; That is why the following line WILL affect bb.
(setf (elt aa 2) 9)

aa  ; ⇒ (10000 0 9 22)
bb  ; ⇒ (0 9 22)

Now you how to add an element to a list using push and what that actually means. Just in case you are still not sure what’s going on, here is the diagram representing the state right BEFORE running the line (push 10000 aa) in the above code:

                      +-----+-----+   +-----+-----+   +-----+-----+
                      |     |     |   |     |     |   |     |     |
aa  -------------->   |  0  |   ----> | 11  |   ----> | 22  | nil |
                      |     |     |   |     |     |   |     |     |
                      +-----+-----+   +-----+-----+   +-----+-----+

                            ^
                            |
bb -------------------------+

And here is the diagram for the state right after running that line:

      +-----+-----+   +-----+-----+   +-----+-----+   +-----+-----+
      |     |     |   |     |     |   |     |     |   |     |     |
aa -> |10000|   ----> |  0  |   ----> | 11  |   ----> | 22  | nil |
      |     |     |   |     |     |   |     |     |   |     |     |
      +-----+-----+   +-----+-----+   +-----+-----+   +-----+-----+

                            ^
                            |
bb -------------------------+

Emacs users might raise a question: what is the difference between add-to-list and push and how can the function (not a macro) add-to-list do what it does? That will be the subject of another post which I will link as soon as I write it.

11. destructive functions and non-destructive functions

In this section, I will explain the notion of destructive functions, how to use them and when to use them.

Many functions for manipulating lists come in pairs of destructive function and non-destructive function. For example, the functions reverse and nreverse are such a pair, and they are both for reversing a list. reverse is the non-destructive one and nreverse is the destructive one. The following table shows many of such pairs. Functions in the last two rows don’t come in pairs but I included them in the table as a note that those functions are destructive.

non-destructive destructive what the pair is for
reverse nreverse reverse a list
append nconc concatenate two lists or more
remove delete remove elements from a list
cl-substitute cl-nsubstitute replace elements in a list
N/A sort sort a list
N/A cl-merge merge two lists

I believe the pair remove and delete is the most general example, hence the best example to illustrate my points, so let’s start with that pair.

Example use of remove (obtaining a list with the number 5 removed):

(setq arg (list 5 1 2 5))
arg  ; ⇒ (5 1 2 5)
(setq result (remove 5 arg))
result ; ⇒ (1 2)
arg    ; ⇒ (5 1 2 5)

Example use of delete:

(setq arg (list 5 1 2 5))
arg  ; ⇒ (5 1 2 5)
(setq result (delete 5 arg))
result ; ⇒ (1 2)

That is how these two functions are supposed to be used. None of the two functions make it so that the variable arg refers to the result you want. Instead, the return value is the result. Let’s call it Property P1.

(Property P1) Its return value is the result list.

Property P1 is shared by both functions. The following property on the other hand is only guaranteed for the non-destructive function remove and not for the destructive function delete:

(Property P2) The original list that is passed to the function (as its argument) is NOT modified.

That is why remove is called a non-destructive function: it preserves the list structure. No list gets modified when you use remove.

The function delete on the other hand may modify the original list. If you use this function, any other list that happens to share structure with the original list may get modified as well. Hence the term “destructive”. Sometimes we say “the function delete is allowed to destroy the structure of the original list.” and that just means that it may modify the list, and the reason we can say the strong word “destroy” instead of “modify” in that sentence is that the original list as we knew it before is no more.

So the destructive function delete is a dangerous function in some sense, but it’s more efficient than the non-destructive function remove. In general, it’s best to use remove in your code, and if profiling shows you need to optimize, that is when you replace the use of remove with the destructive delete but only if you are sure that the argument list isn’t referenced from anywhere else.

Finally, there is something to watch out for even in the case of the non-destructive remove: the following property is NOT guaranteed for remove or delete.

(Property P3) The result list (the return value) does NOT share structure with the list passed to the function.

In other words, the return value of remove may share structure with its argument.

We’ve mentioned that the following property does NOT hold for remove or delete.

(Property P4) It makes arg become the result.

A common mistake is to assume that P4 holds for the destructive delete or to assume that P3 holds for the non-destructive remove.

Now that delete and remove are explained, I’d add that there are also functions cl-remove-if, cl-remove-if-not, cl-delete-if, cl-delete-if-not.

Here are some basic things to know about the functions mentioned in our table:

  • Every function in the table has Property P1, regardless of whether it is destructive or not.
  • Every function in the first column has Property P2. None of the functions in the second column has that property. A common mistake is to assume that the function sort is non-destructive and has Property P2.
  • Most functions in the table (even the ones in the first column) do NOT guarantee Property P3. The only exception I know is reverse.
  • It is impossible to write a version of nconc, delete, or cl-merge in a way that guarantees P4 (for the same reason that incf cannot be a function.)
  • For most functions in the second column, what happens to the argument is not specified, so you shouldn’t rely on its effects on the argument. (Practical Common Lisp points out two exceptions: nconc and nsubstitute)

Some care need to be taken when you use an operation that result in a list sharing structure and another operation that is destructive. For example

(setq aa (list 5 3 1))
(setq bb (list 4 2 0))

;; Concatenate.
(setq cc (append aa bb))

;; Sort cc.
(setq cc (sort cc '<))

cc  ; ⇒ (0 1 2 3 4 5)

;; Let's see if there was any nasty side effect.
aa  ; ⇒ (5 3 1)
bb  ; ⇒ (4 5)  <----- YMMV

To see why bb got destroyed, recall:

  • The function append is on our table. Its return value may share structure with some of its arguments (in particular, its last argument).3
  • The function sort is on the second column. It is destructive.

If you don’t want bb to be destroyed, you should use sort with copy-sequence like this:

(setq cc (sort (copy-sequence cc) '<))

Think of the idiom (sort (copy-sequence ...) ...) as the non-destructive counterpart to sort. For example, if you want to print all paths in the elisp global variable load-path in a sorted way into the messages buffer, you run:

(dolist (path (sort (copy-sequence load-path) 'string<))
  (message "%s" path))

But if you use sort directly without copy-sequence, the load-path will get messed up.

Some popular examples of safe use of destructive functions in the second column is included in the appendix.

Actually the phrase “this operator is destructive” can have broader meaning than this post touched upon. In fact, from the Common Lisp HyperSpect glossary:
“destructive adj. (of an operator) capable of modifying some program-visible aspect of one or more objects that are either explicit arguments to the operator or that can be obtained directly or indirectly from the global environment by the operator. ”

So setq is a destructive operator in that sense. For this reason, functions on the second column are sometimes called recycling functions (because they are allowed to recycle cons cells of the given list).4

12. Do this post apply to alists and trees as well?

There are other useful illusion data types other than lists. Two of them are:

  • alist (association list)
  • tree

And they are also made of cons cells.

Here is an example of an alist. It is just a list of cons cells, to be interpreted as storing information about which names are male names and which names are not (0 indicates female name, and 1 indicates male name.).

(setq my-alist (list (cons "Alice" 0)
                     (cons "Bob" 1)
                     (cons "Dan" 1)))

my-alist ; ⇒ (("Alice" . 0) ("Bob" . 1) ("Dan" . 1))

When we say the alist structure of my-alist, we mean the six cons cells making up that alist. But when we say the list structure of my-list, we mean the three cons cells making up the list structure. So when we say the alist my-alist share structure with some other alist, we mean that some of the six cons cells are shared with the other alist. But when we say that the list my-alist share structure with some other list, we mean the list structure (made of three cons cells) being shared.

Let’s see the docstring of copy-alist

(copy-alist ALIST)

Return a copy of ALIST.
This is an alist which represents the same mapping from objects to objects,
but does not share the alist structure with ALIST.
The objects mapped (cars and cdrs of elements of the alist)
are shared, however.

So (copy-alist my-alist) will return an alist composed of six new cons cells, while (copy-sequence my-alist) or (cl-copy-list my-alist) will produce only three new cons cells, while the other three cons cells are shared because they are considered as elements of my-alist as as a list. Elements of my-alist as a list are the three cons cells being used for associations, but elements of my-alist as an alist are the three strings and the associated numbers. So “copy of an alist” and “copy of a list” end up meaning different things.

We’ve only covered functions for manipulating lists, but there are also functions for manipulating alists, and some of them come in pairs. Now that you know the meaning of “alist structure”, “modifying an alist”, “alist structure being shared”, you will be able to use those functions safely as well.

Finally, when we say “tree structure”, we mean all cons cells reachable from following car or cdr. So when you have a list of two elements and if the first element is another list with three numbers, and the second element is yet another list with ten strings, then you have a nested list, and the tree structure consists of 2 + 3 + 10 cons cells. You will be able to guess what the function copy-tree does.

13. Common Lisp note

To translate elisp snippets in this post into Common Lisp snippets, change some function names: cl-foo into foo, and copy-sequence into copy-seq. You might also want to wrap the snipets with appropriate let forms (so that aa and bb and so on are local variables), or use defparameter and the earmuff convention.

14. Appendix

One example. This example from Practical Common Lisp uses nreverse in a safe way.

(defun upto (max)
  (let ((result nil))
    (dotimes (i max)
      (push i result))
    (nreverse result)))

(upto 10)   ; ==> (0 1 2 3 4 5 6 7 8 9)

Here is a function for testing shared structure between lists.

(defun my-list-share-structure-p (list1 list2)
  "Return non-nil if the two lists share structure, otherwise nil. Assume non-cyclic."
  (and (consp list1)
       (consp list2)
       (eq (last list1)
           (last list2))))

Test:

(let ((lst (list (list 3 1) (list 4 1) (list 5 9))))
  (list (my-list-share-structure-p lst lst)
        (my-list-share-structure-p lst (cdr lst))
        (my-list-share-structure-p lst (copy-sequence lst))
        (my-list-share-structure-p lst nil)
        (my-list-share-structure-p lst (cons 99 lst))
        (my-list-share-structure-p lst (append (list 99 99) lst))))

Given a snippet, if you are not sure what diagram it should produce, you can use eq to test whether two given cons cells are the same cons cells. If you want to see the diagram at once, use the variable print-circle (for Common Lisp it’s *PRINT-CIRCLE*):

(setq aa (list 0 11 22))
(setq bb (list 1 2))
(setq cc (append aa bb))

(let ((print-circle t)
      (print-level nil)
      (print-length nil))
  (print (vector aa bb cc)))

It would be good to have something like Online Python Tutor – Visualize program execution for Lisp.

Footnotes:

1

Recall from the previous article that container objects refer to other objects

2

Yes this is equivalent to (setf (cdr (cdr (cdr x))) something) and also to (setf (cl-cdddr x) something)

3

As long as the last argument is not nil, sharing structure with the last argument is guaranteed. And only the last argument.

4

Started by Practical Common Lisp

Posted in Emacs, Lisp | Tagged , | Leave a comment