concrete-syntax-tree

2025-06-22

Library for parsing Common Lisp code into a concrete syntax tree.

Upstream URL

github.com/robert-strandh/Concrete-Syntax-Tree

Author

Robert Strandh <robert.strandh@gmail.com>

Maintainer

Jan Moringen <jan.moringen@posteo.de>

License

BSD
README
Concrete Syntax Tree README

1Introduction

This library addresses the issue of attaching source information to s-expressions, and in particular Common Lisp source code. Source information refers the place an s-expression came from such as the line and a range of columns in a file or editor buffer. The issue arises because when Common Lisp code is processed, for example by macros, the code is represented as Common Lisp objects which do a provide a way of attaching source information (and as a consequence, the objects returned by functions like cl:read cannot contain any source information).

The solution offered by this library is based on a Concrete Syntax Tree (or CST for short) data structure which is an alternative representation of a Common Lisp expression such that source information can be associated with object in the expression. Since clients have different requirements and capabilities, this library does not provide or specify the nature of the source information that is attached to CST nodes.

Besides the core protocols and data structures for CSTs, this library provides:

  • A CST "reconstruction" function helps with macroexpansion which,given an input CST, requires taking the raw expression out of theinput CST, apply the macro expander to the raw expression, turningthe expansion from a raw expression back into a CST. The"reconstruction" aspect comes from the fact that the functionreuses parts of the input CST in the resulting CST, when possible,so that identity and source information are preserved.
  • For Common Lisp code that is represented a CST, this libraryprovides utilities for canonicalizing declarations, parsing lambdalists, separating declarations and documentation strings and codebodies, checking whether a form is a proper list, etc. All theseutilities accept CSTs as their arguments and produce CSTs as theirresults. Whenever possible, these utilities propagate any sourceinformation from their arguments to their results.

This document only gives a very brief overview and highlights some features. Proper documentation can be found in the file:documentation directory.

2Usage Overview

2.1CST from Source Code

Possibly the most common way of producing CSTs with source information is read ing Common Lisp code (or arbitrary s-expressions). This library does not provide any functionality for reading the character-based representation of s-expressions into CSTs. Instead, the Eclector library and in particular the eclector-concrete-syntax-tree system within that library can be used.

2.2CST from Expression

It is sometimes useful to produce CSTs from expressions that are Common Lisp objects (as opposed to Common Lisp source code). For such cases, this library provides the function concrete-syntax-tree:cst-from-expression:

     (let* ((expression '(1 #\a))
            (cst (concrete-syntax-tree:cst-from-expression expression))
            (raw (concrete-syntax-tree:raw cst))
            (source (concrete-syntax-tree:source cst)))
       (values cst raw source))
     #<CONCRETE-SYNTAX-TREE:CONS-CST raw: (1 #\a) {1006C7AD53}>
     (1 #\a)
     NIL

Note how the resulting CST does not have any source information attached to it (unless the client explicitly provides source information).

2.3CST Reconstruction after Macro Expansion

Expanding macros is a typical activity in programs which process Common Lisp source code. When the source code is represented as CSTs, macro expansion should accept a CST (with source information) and produce a CST (with source information where possible). However, a complication arises from the fact that macro expanders are functions which accept and produce s-expressions, not CSTs.

Given a CST and a macro expander, the only solution is a to

  1. take the "raw" s-expression from the CST before macro expansion
  2. apply the macro expander to the s-expression
  3. somehow turn the expansion (again an s-expression) into a newCST so that source information from the original CST carriesover where possible

The function concrete-syntax-tree:reconstruct performs step 3.

The following example illustrates the whole process, starting from an "input" CST and with an "output" CST as the result:

     (let* ((input-when-cst (concrete-syntax-tree:cst-from-expression
                       'when :source "when-source"))
            (input-test-cst (concrete-syntax-tree:cst-from-expression
                       '(test x 1 #\y) :source "test-source"))
            (input-then-cst (concrete-syntax-tree:cst-from-expression
                       'a :source "a-source"))
            (input-cst (concrete-syntax-tree:list
                        input-when-cst input-test-cst input-then-cst))
            (input-raw (concrete-syntax-tree:raw input-cst))
            (expansion (macroexpand-1 input-raw))
            (output-cst (concrete-syntax-tree:reconstruct
                         nil expansion input-cst))
            (output-if-cst (cst:first output-cst))
            (output-test-cst (cst:second output-cst))
            (output-then-cst (cst:third output-cst)))
       (let ((*print-pretty* nil))
         (format t "~A -> ~A~2%" input-raw expansion)
         (flet ((show (cst)
                  (format t "~52A -> ~A~%" cst (cst:source cst))))
           (show output-cst)
           (show output-if-cst)
           (show output-test-cst)
           (show output-then-cst))))
     (WHEN (TEST X 1 y) A) -> (IF (TEST X 1 y) A)

     #<CONS-CST raw: (IF (TEST X 1 #\y) A) {1004C75DA3}>  -> NIL
     #<ATOM-CST raw: IF {1004C75F03}>                     -> NIL
     #<CONS-CST raw: (TEST X 1 #\y) {1004C757D3}>         -> test-source
     #<ATOM-CST raw: A {1004C75A63}>                      -> a-source

Dependencies (2)

  • acclimation
  • fiveam

Dependents (2)

  • GitHub
  • Quicklisp