What Makes LISP Unique?
Exploring the homoiconicity feature of LISP and how it makes LISP more of a language for languages
The LISP programming language and its derivatives, such as Scheme and Clojure, have a feature that has long fascinated me, referred to as homoiconicity.
It implies that code is represented as data in LISP, and thus can be manipulated and modified in the same manner as data. To truly appreciate the profound implications of such a simple statement requires reading a book and write your own LISP code. You probably don't have time for that, so let me instead give you an inkling or intuition through this short article.
Understanding LISP always begins with understanding the core data structure LISP is built around, a linked list. It may seem peculiar that understanding a language relies on understanding a data structure, but bear with me.
The following LISP code defines a linked list containing an integer, a string, a boolean value, a floating-point number and a character. Notice how the LISP expression begins and ends with parenthesis. LISP is infamous for the abundance of parenthesis.
(list 43 "hello" true 2.5 'c')
Please note that to make LISP code examples easier to read, I have taken the liberty of using C-like syntax for booleans and characters. Standard LISP will use different syntax.
A linked list consist of nodes, where each node can point to another node. It also has a cell to store data. But this can also be a pointer to another list. Hence, we can have lists of lists, These lists can also be lists of lists.
(list 34 (list "hello" true) (list 2.5 'c'))
Thus, one can express complex data structures and tree structures this way. If you know a little bit about how a compiler works, you already know that when it parses code, it produces an abstract syntax tree. In a mainstream programming language, you may have an expression such as y = 4*(2 + x)
. It could be represented as the syntax tree you see below.
When writing LISP code, you are basically creating this abstract syntax tree directly. In principle, you can rearrange and transform such a tree of code just like any other data structure.
Let us take an example to get across the universality of the LISP approach to code representation. Consider a Hello World program written in C. The abstract syntax tree (AST), the data structure representing this program, is not immediately apparent.
#include <stdio.h>
int main () {
printf("hello, world\n");
return 0;
}
However, there are LISP flavored versions of C, which lets you write this Hello World program in LISP syntax instead (what we call s-expressions). Notice how the syntactic structure of the code becomes more explicit.
(import cstdio)
(def main (fn extern-c int (void)
(printf "hello, world\n")))
Let us look another C code snippet to help clarify. The following code defines a new composite type called Point
with two integer fields x
and y
.
struct Point {
int x;
int y;
};
We can express the same type definition using s-expressions, as shown below.
(def Point (struct intern (
(x int)
(y int)
)))
Where am I going with this? Why write code with such seemingly ugly and awkward syntax? The benefit of using s-expressions is that the code becomes regularized. For instance, in this type definition one can clearly see that the code is defined as a list, where the first two elements are def
and Point
. The third element is another list which starts with the elements struct
and intern
. The inner list has a third element, which is a list of all the fields in the Point
composite type.
Okay, so what? What does this new syntax give me? Let me give you an example: If the code is placed in a file, other LISP code could load the code as data and transform it. The code can iterate over the Point
definition code like any other linked list. Elements can be replaced, inserted or moved.
Here is the kicker: In LISP, you don't need to place the code in a separate file to achieve this. Code can be placed as data directly within other LISP code using quoting. Allow me to clarify with a simple code example. The mathematical expression 4 + x
is written in LISP as:
(+ 4 x)
In LISP if I want to define a variable y
with the value 10 I write:
(defvar y 10)
Later, I can modify the y
variable using the setf
command:
(setf y 5)
The third element in the list doesn't need to be a number. We can put anything in there, even a list of code. But how do we avoid that the code gets evaluated? The following code would first evaluate 3 + 4
and store the resulting value 7 in the y
variable.
(setf y (+ 3 4))
The solution is to quote the list instead. Quoting turns an expression into regular list data. Instead of storing the value 7 in the y
variable, we are now storing the code (+ 3 4)
in the y
variable.
(setf y '(+ 3 4))
Using a LISP interactive REPL environment (read-evaluate-print-loop) such as SBCL (Steel Bank Common LISP) we can inspect this variable and evaluate it:
sbcl> (setf y '(+ 3 4))
(+ 3 4)
sbcl> y
=> (+ 3 4)
sbcl> (eval y)
=> 7
We can piece together such a list in multiple steps using the cons
function. The cons
function add a node to the beginning of a list. In the following code we prepend the value 3 to the list (5 8)
.
sbcl> (cons 3 '(5 8))
=> (3 5 8)
You can use this approach to combine fragments of code found within a list. For instance in the following REPL session we pick out the first element from the code stored in variable y
using the first
function. The first element is the +
operator. Next, we prepend that operator to a list of the numbers 4 and 5 using the cons
function.
sbcl> (cons (first y) '(4 5))
=> (+ 4 5)
As you can see, these operations produce the expression 4 + 5
written as (+ 4 5)
in LISP. Next we can evaluate the expression we have just created using the eval
function and produce the value 9.
sbcl> (eval (cons (first y) '(4 5)))
=> 9
There is a lot more to LISP homoiconicity than I have shown in this story. We have just scratched the surface. But the point is to show you that by putting any code inside s-expressions you can easily transform and manipulate that code.
This approach allows you to place LISP flavored C code, shown earlier, inside regular LISP code and perform transformations on that code. The resulting code can then be feed a program such as C-Mera which will turn the LISP style C into normal C code which can be compiled. Here is an actual example implementing the Unix wc shell command using s-expressions.
(include <stdio.h>)
(function main () -> int
(decl ((int c)
(int nl = 0))
(while (!= (set c (getchar)) EOF)
(if (== c #\newline)
++nl))
(printf "%d\\n" nl)
(return 0)))
The C-Mera program will transform this into C code. We use the cm
command to transform the s-expression source code found in the file wc-l.lisp
into C code.
$ ls
wc-l.lisp
$ cm c wc-l.lisp
#include <stdio.h>
int main(void)
{
int c;
int nl = 0;
while ((c = getchar()) != EOF) {
if (c == '\n')
++nl;
}
printf("%d\n", nl);
}
One might wonder what the benefit of doing this detour is. One of the advantages is that one can utilize the power of LISP to modify and generate code to reduce boilerplate code.
This ability is one of the reasons why LISP is often not thought of as merely a language, but more like a whole system to create your own languages. One real-world example of this was the GOAL (Game Oriented Assembly LISP) language used to make games for Playstation by video game maker company Naughty Dog. For instance, the PlayStation 2 Jak and Daxtergames were made using GOAL. Instead of outputting C code, the LISP code for the games was transformed into Playstation assembly code. The beauty of this approach is that they could create the entire assembler using an existing commercial LISP environment, such as Allegro Common Lisp. Manipulating GOAL code becomes easy with a LISP development environment. It already has the s-expressions LISP is designed to work with. All the tools and editors designed for Allegro could easily be used with GOAL, as they both use the same s-expressions.
Editors designed for s-expressions can be made very powerful. Normal editors are made to work with lines and columns, but when working with s-expressions you can instead think of a tree structure of parents, children, and siblings. Hotkeys are used to jump between siblings, to parents or alternatively selecting or deleting children or siblings.
Usage of LISP in the Real World
A big part of the power of LISP is really about its somewhat alien syntax. It is very easy to implement a LISP parser and a LISP language, which is why you find that a lot of early software used LISP as a script language. You may already know that the EMACS editor famously use LISP as its script language. AutoCAD, a computer-aided design (CAD) tool from Autodesk, famously used LISP as a script language.
LISP also works great as a language for parsing other languages. A clever strategy to use for programming language designers is to first transform your language syntax to a LISP syntax and then let a LISP language take over the rest of the parsing and code generation. A modern high-performance JIT compiled dynamic language called Julia uses this strategy. Under the hood a minimal LISP implementation called Femtolisp is used to process Julia code.
The underlying LISP heritage of Julia is visible in various ways. Julia has a Meta.show_sexpr
function, which allows you to see what s-expression a Julia expression is transformed into. There is even a Julia package called LispSyntax which allows you to write Julia code in LISP syntax in the Julia interactive REPL environment. It adds a new REPL mode named jλ
, which accepts LISP syntax. Here is an example of implementing the fibonacci function in Julia using LISP syntax in a REPL session.
jλ> (defn fib [a]
(if (< a 2)
a
(+ (fib (- a 1)) (fib (- a 2)))))
jλ> (fib 10)
55
I want to just end with saying that LISP is probably not a language you end up using professionally, but it is kind of like a language every serious about computer science should learn once in their lifetime. LISP is like foundational knowledge. You cannot truly know our field without knowing LISP.
Resources
If you are more curious about exploring LISP and LISP like languages such as Scheme, I have added some links to stuff I have found useful over the years.
Land of Lisp: Learn to Program in Lisp, One Game at a Time – A LISP book by Conrad Barski, which offers an entertaining way of learning LISP programming. Seriously, I had a lot of fun reading this book.
Wizard Book – Actually named Structure and Interpretation of Computer Programs is a classic. It is the kind of book everybody serious about programming should read once in their life. It is the kind of book that blows your mind. Furthermore, it covers a functional variant of LISP called Scheme.
Racket - If you are serious about functional LISP programming, then Racket is one of the most pleasant and user-friendly Scheme languages you can find today. It comes with a great graphical IDE called DrRacket which is very beginner friend.
Making Crash Bandicoot – Andy Gavin writes about how he and Jason Rubin of Naughty Dog developed Crash Bandicoot and Jax and Daxter using their own LISP languages GOOL and GOAL.