concrete-syntax-tree
2025-06-22
Library for parsing Common Lisp code into a concrete syntax tree.
1 Introduction
This library addresses the issue of attaching source information to
s-expressions, and in particular Common Lisp source code. Source
information refers to the place an s-expression came from, such as the
line and a range of columns in a file or editor buffer. The issue
arises because when Common Lisp code is processed, for example by
macros, the code is represented as Common Lisp objects which do not
provide a way of attaching source information (and as a consequence,
the objects returned by functions like cl:read cannot contain any
source information).
The solution offered by this library is based on a Concrete Syntax Tree (or CST for short) data structure, which is an alternative representation of a Common Lisp expression such that source information can be associated with each object in the expression. Since clients have different requirements and capabilities, this library does not provide or specify the nature of the source information that is attached to CST nodes.
Besides the core protocols and data structures for CSTs, this library provides:
- A CST "reconstruction" function that helps with macroexpansion. Given an input CST, macroexpansion requires taking the raw expression out of the input CST, applying the macro expander to the raw expression, and turning the expansion from a raw expression back into a CST. The "reconstruction" aspect comes from the fact that the function reuses parts of the input CST in the resulting CST, when possible, so that identity and source information are preserved.
- For Common Lisp code that is represented as a CST, this library provides utilities for canonicalizing declarations, parsing lambda lists, separating declarations, documentation strings and code bodies, checking whether a form is a proper list, etc. All these utilities accept CSTs as their arguments and produce CSTs as their results. Whenever possible, these utilities propagate any source information from their arguments to their results.
This document only gives a very brief overview and highlights some features. Proper documentation can be found in the documentation directory.
2 Usage Overview
2.1 CST from Source Code
Possibly the most common way of producing CSTs with source
information is reading Common Lisp code (or arbitrary
s-expressions). This library does not provide any functionality for
reading the character-based representation of s-expressions into
CSTs. Instead, the Eclector library, and in particular the
eclector-concrete-syntax-tree system within that library, can be
used.
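As a rough illustration of that route, the sketch below reads one form from a string into a CST. It rests on several assumptions that are not stated in this document: that Quicklisp is installed, that the reader entry point is the function read in the eclector.concrete-syntax-tree package, and that cst is a nickname of the concrete-syntax-tree package. The representation of the source information is chosen by the Eclector client, so no particular value is shown here.

```lisp
;; Assumption: Quicklisp is available for loading the systems.
(ql:quickload "eclector-concrete-syntax-tree" :silent t)

(defun read-cst-from-string (string)
  "Hypothetical helper: read the first form in STRING as a CST.
Assumes ECLECTOR.CONCRETE-SYNTAX-TREE:READ accepts an input stream."
  (with-input-from-string (stream string)
    (eclector.concrete-syntax-tree:read stream)))

(let ((cst (read-cst-from-string "(when test a)")))
  ;; RAW gives the ordinary s-expression, SOURCE whatever source
  ;; information the reader attached to the top-level node.
  (values (cst:raw cst) (cst:source cst)))
```

The helper name read-cst-from-string is made up for this example. Consult the Eclector documentation for the actual reader protocol and for how to customize the source information.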
2.2 CST from Expression
It is sometimes useful to produce CSTs from expressions that are
Common Lisp objects (as opposed to Common Lisp source code). For
such cases, this library provides the function
concrete-syntax-tree:cst-from-expression:
(let* ((expression '(1 #\a))
       (cst (concrete-syntax-tree:cst-from-expression expression))
       (raw (concrete-syntax-tree:raw cst))
       (source (concrete-syntax-tree:source cst)))
  (values cst raw source))
#<CONCRETE-SYNTAX-TREE:CONS-CST raw: (1 #\a) {1006C7AD53}>
(1 #\a)
NIL
Note how the resulting CST does not have any source information attached to it (unless the client explicitly provides source information).
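For comparison, here is a minimal sketch in which the client does provide source information through the :source keyword argument, the same argument used in the reconstruction example later in this document. The string "whole-form" is an arbitrary placeholder, since the library does not prescribe what source information looks like. Loading through Quicklisp is an assumption, and cst is used as the package nickname.

```lisp
;; Assumption: Quicklisp is available for loading the library.
(ql:quickload "concrete-syntax-tree" :silent t)

(let* ((cst (cst:cst-from-expression '(f x) :source "whole-form"))
       (operator-cst (cst:first cst)))
  (values (cst:raw cst)             ; the ordinary s-expression (F X)
          (cst:source cst)          ; the client-supplied source
          (cst:raw operator-cst)))  ; raw expression of the first child
```

The accessors cst:first and cst:second walk a CST in the same way first and second walk an ordinary list, while raw returns the underlying expression at any node.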
2.3 CST Reconstruction after Macro Expansion
Expanding macros is a typical activity in programs which process Common Lisp source code. When the source code is represented as CSTs, macro expansion should accept a CST (with source information) and produce a CST (with source information where possible). However, a complication arises from the fact that macro expanders are functions which accept and produce s-expressions, not CSTs.
Given a CST and a macro expander, the only solution is to
1. take the "raw" s-expression from the CST before macro expansion
2. apply the macro expander to the s-expression
3. somehow turn the expansion (again an s-expression) into a new CST so that source information from the original CST carries over where possible
The function concrete-syntax-tree:reconstruct performs step 3.
The following example illustrates the whole process, starting from an "input" CST and with an "output" CST as the result:
(let* ((input-when-cst (concrete-syntax-tree:cst-from-expression
                        'when :source "when-source"))
       (input-test-cst (concrete-syntax-tree:cst-from-expression
                        '(test x 1 #\y) :source "test-source"))
       (input-then-cst (concrete-syntax-tree:cst-from-expression
                        'a :source "a-source"))
       (input-cst (concrete-syntax-tree:list
                   input-when-cst input-test-cst input-then-cst))
       ;; Step 1: take the raw s-expression out of the input CST.
       (input-raw (concrete-syntax-tree:raw input-cst))
       ;; Step 2: apply the macro expander to the raw s-expression.
       (expansion (macroexpand-1 input-raw))
       ;; Step 3: turn the expansion back into a CST, reusing parts
       ;; of the input CST where possible.
       (output-cst (concrete-syntax-tree:reconstruct
                    nil expansion input-cst))
       (output-if-cst (cst:first output-cst))
       (output-test-cst (cst:second output-cst))
       (output-then-cst (cst:third output-cst)))
  (let ((*print-pretty* nil))
    (format t "~A -> ~A~2%" input-raw expansion)
    ;; Print each output CST next to its source information.
    (flet ((show (cst)
             (format t "~52A -> ~A~%" cst (cst:source cst))))
      (show output-cst)
      (show output-if-cst)
      (show output-test-cst)
      (show output-then-cst))))
(WHEN (TEST X 1 y) A) -> (IF (TEST X 1 y) A)
#<CONS-CST raw: (IF (TEST X 1 #\y) A) {1004C75DA3}> -> NIL
#<ATOM-CST raw: IF {1004C75F03}> -> NIL
#<CONS-CST raw: (TEST X 1 #\y) {1004C757D3}> -> test-source
#<ATOM-CST raw: A {1004C75A63}> -> a-source
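The three steps can also be packaged as a small function. The following is only a sketch: the name macroexpand-1-cst is hypothetical and not part of the library, the argument order of reconstruct is copied from the example above (including passing nil as its first argument), and loading through Quicklisp is an assumption.

```lisp
;; Assumption: Quicklisp is available for loading the library.
(ql:quickload "concrete-syntax-tree" :silent t)

(defun macroexpand-1-cst (cst &optional environment)
  "Hypothetical helper: expand the form represented by CST once and
return the expansion as a CST that reuses parts of CST where possible."
  (let* ((raw (cst:raw cst))                          ; step 1
         (expansion (macroexpand-1 raw environment))) ; step 2
    ;; Step 3: same argument order as in the example above.
    (cst:reconstruct nil expansion cst)))

(let* ((input-cst (cst:cst-from-expression '(when (test x) a)))
       (output-cst (macroexpand-1-cst input-cst)))
  (cst:raw output-cst))
```

In the output shown above, the test and then forms keep their source information because they appear unchanged in the expansion, whereas the new IF operator and the new outer form have none.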