Here's a plan for allowing Emacs Lisp and R4RS Scheme to share data structures in Guile with pretty good transparency.

First, some background:

Guile will support both Emacs Lisp and Scheme, and allow code written in one language to exchange values and share data structures with the other. Because Emacs Lisp and Scheme have very similar type systems, this sharing can be almost transparent.

However, there are differences. Emacs Lisp, following lisp tradition, uses the symbol `nil' to represent the empty list and Boolean false. On the other hand, Scheme has distinct objects for all three uses: the symbol `nil' has no special significance, the empty list is written '(), and the special object #f is false. (It's the only false object, in fact; all other objects are true in Scheme, including the empty list.)

We would very much like to find an arrangement that allows all Emacs Lisp code to run unchanged, allows all R4RS code to run unchanged, and allows the two to communicate mostly transparently.

None of the obvious compromises work nicely. Modifying either language to match the other breaks existing code in ways that are impossible to fix mechanically. Making Emacs Lisp's nil identical to either of Scheme's #f or '() will cause misinterpretations when Scheme receives either Boolean values or lists from Emacs Lisp.

It's worth noting that this arises as a crucial issue only because Emacs Lisp and Scheme values are so tantalizingly similar. When mixing languages with more blatantly different value worlds --- C and Scheme, for example --- programmers should expect to be responsible for making the necessary conversions. But because the Scheme/Emacs Lisp boundary is so transparent already, it's worthwhile to clear up the few remaining discrepancies as much as possible.

In the following paragraphs, I'll try to avoid confusion by using this notation:


I'll write #f for the value assigned by Scheme to the expression #f.
I'll write '() for the value assigned by Scheme to the expression '().
I'll write nil for the value assigned by Emacs Lisp to the expressions nil, '(), and ().

Guile will support Emacs Lisp by translating it into Scheme, and that translation need not map Emacs Lisp forms onto their most obvious Scheme counterparts; this gives us some room to maneuver. For example, Emacs Lisp's `if' form could be translated into a Scheme form `lisp-if', which could handle Boolean values differently from the normal Scheme `if'.

We want list-valued Scheme and Emacs Lisp functions to interoperate. So Scheme must be able to return '() to Emacs Lisp, and have Emacs Lisp recognize it as the empty list. Emacs Lisp must be able to return nil to Scheme, and have Scheme recognize it as the empty list.

We want Boolean-valued Scheme and Emacs Lisp functions to interoperate. So Scheme must be able to return #f to Emacs Lisp, and have Emacs Lisp recognize it as false. And Emacs Lisp must be able to return nil to Scheme, and have Scheme recognize it as false.

Since it is standard in Emacs Lisp to test for the empty list with `if' directly, Emacs Lisp must also recognize '() as false.

I don't see any harm in telling Emacs Lisp that #f is an empty list. That merely equates two objects that Emacs Lisp never distinguished anyway.

So, according to the above paragraphs:

Because Emacs Lisp code frequently uses the `eq' function to examine list structure, Emacs Lisp's `eq' should treat nil, '(), and #f as identical. Scheme's eq? must be able to distinguish the three.

When not mixing Scheme and Emacs Lisp, this preserves both languages intact. When mixing the two, this supports the interactions discussed above.

Where does this screw up?

I'm told some people like to write Scheme functions that return either a (possibly empty) list, or #f to indicate some sort of 'other' case. If Emacs Lisp calls such a function, it won't be able to tell whether it got an empty list or the 'other' case. Here's another symptom of the same clash: if Scheme places #f in the cdr of a pair, Scheme will consider it an improper list, while Emacs Lisp will consider it a proper list.

I don't see anything we can do to resolve this conflict, without simply changing some of the user's code; such Scheme code simply provides an unfortunate interface for Emacs Lisp callers. We think this is rare enough that it will suffice to tell programmers, "the Emacs Lisp/Scheme boundary is almost transparent, but this is one glitch you'll have to cope with."

Note that this approach does require Guile to distinguish between nil and '(); this is necessary to allow Scheme callers to accept values from Emacs Lisp Boolean-valued functions.

Can this be implemented efficiently?

I think so. `eq' can use fast, approximate tests to cover most common cases, and then fall back to a slower, correct function for the rest. Something like this might be useful in the Emacs C code:

#define EQ(x, y) ((x) == (y) \
	          || ((((x) ^ (y)) & MASK) == 0) && lisp_eq ((x), (y)))

where MASK has very few zero bits, and the elements of the equivalence classes have representations differing in only those bits. The '==' clause will catch most cases when EQ is true; the '^' clause will catch most cases when EQ is false; and the lisp_eq function will be always correct, but relatively slow.

I think this arrangement meets the needs of most of the Guile audience.


Appendix: Why not just fix all the Emacs Lisp code?

Various people have asked why Emacs Lisp compatibility is so important. Some others have suggested adapting the Emacs Lisp code to work with the Scheme boolean system. I'd like to clarify why I don't think that's practical.

One major reason the FSF is funding Guile work is that it plans to replace Emacs's lisp interpreter with Guile. Emacs 19.33 comes with 334,000 lines of lisp code; there is also a good amount we don't distribute which is in widespread use anyway. The assumption that the empty list is false is ubiquitous in that code. There is no reliable, mechanical way to detect code which needs to be changed. Even if there were, such adaptations would often require changed interfaces, so code far from the site of a nil/false test would also be affected.

In other words, by differentiating nil and false in Emacs Lisp, we would create thousands of bugs spread throughout a third of a million lines of source code, a good proportion of which will be non-trivial to find and fix.

It seems clear that Scheme's semantics cannot be unified with Emacs Lisp's semantics with perfect transparency. I believe it can be done with pretty good transparency. However, Guile will *not* differentiate the empty list and the false value in Emacs Lisp.


Subject: elisp needs to distinguish '() and #f, specially
From: thomas@gnu.ai.mit.edu (Thomas Bushnell, n/BSG)
To: jimb@red-bean.com
Subject: Re: Guile status report
Date: Tue, 10 Sep 1996 15:52:57 -0400
X-Name-Change: My name used to be `Michael'; now it is `Thomas'.
X-Tom-Swiftie: "I'm sorry I broke your window," Tom said painfully.

Suppose I write a scheme function that wants to distinguish returning '() and #f.
New Emacs lisp needs to be able to call such functions and know what's going on. I think the best way is to export Scheme's eq? to emacs as something like `scheme-eq' so that cognizant Lisp programs can do the right thing.
Thomas