sycamore
2026-01-01
Fast, purely functional data structures
SYCAMORE
========
Fast, purely functional data structures in Common Lisp.
API Documentation: http://ndantam.github.io/sycamore
Comprehensive Set Benchmarks: https://ndantam.github.io/sycamore/bench.html
Features
========
* Fast, purely functional Hash Array Mapped Tries (HAMT) [1,2].
- https://en.wikipedia.org/wiki/Hash_array_mapped_trie
- One simple-vector per layer (bitmap+children) reduces indirection.
- SBCL VOPs (when available) enable single machine instructions for
the POPCNT and CTZ operations commonly performed by HAMTs.
- Generally higher-performance than WB-trees.
- Requires hashing and test functions (defaults are CL standard
SXHASH and EQL), but does not need the total ordering function
required by trees.
* Fast, purely functional weight-balanced binary trees (WB-trees)
[3,4,5].
- http://en.wikipedia.org/wiki/Weight-balanced_tree
- Leaf nodes are simple-vectors, greatly reducing tree height.
* Interfaces for tree Sets and Maps (dictionaries).
* Ropes [6]
- http://en.wikipedia.org/wiki/Rope_(data_structure)
* Purely functional pairing heaps
- http://en.wikipedia.org/wiki/Pairing_heap
* Purely functional amortized queue
Installation
============
* Sycamore uses ASDF (https://common-lisp.net/project/asdf/)
* See `INSTALL` file for details
Examples
========
See also `./example.lisp`
Hash Sets
---------
Create a set with default EQL test:
CL-USER> (sycamore:make-hash-set)
#<SYCAMORE:HASH-SET {}>
Create a set with default EQL test and initial elements from a list:
CL-USER> (sycamore:list-hash-set '(1 2 -10 40))
#<SYCAMORE:HASH-SET {40 1 -10 2}>
Create a set with EQUALP test:
CL-USER> (sycamore:list-hash-set '(1 1.0 2.0 2) :test #'equalp)
#<SYCAMORE:HASH-SET {1 2.0}>
Insertion:
CL-USER> (sycamore:hash-set-insert (sycamore:list-hash-set '(1 2))
0)
#<SYCAMORE:HASH-SET {1 2 0}>
Lookup:
CL-USER> (sycamore:hash-set-find (sycamore:list-hash-set '(0 1 2))
0
T
Removal:
CL-USER> (sycamore:hash-set-remove (sycamore:list-hash-set '(1 2 0))
0)
#<SYCAMORE:HASH-SET {1 2}>
Union:
CL-USER> (sycamore:hash-set-union (sycamore:list-hash-set '(1 2 0))
(sycamore:list-hash-set '(1 0 3)))
#<SYCAMORE:HASH-SET {3 1 2 0}>
Intersection:
CL-USER> (sycamore:hash-set-intersection (sycamore:list-hash-set '(1 2 0))
(sycamore:list-hash-set '(1 0 3)))
#<SYCAMORE:HASH-SET {1 0}>
Difference:
CL-USER> (sycamore:hash-set-difference (sycamore:list-hash-set '(1 2 0))
(sycamore:list-hash-set '(1 0 3)))
#<SYCAMORE:HASH-SET {2}>
Map set:
CL-USER> (sycamore:map-hash-set 'list #'1+
(sycamore:list-hash-set '(1 2 0)))
(2 3 1)
Fold set:
CL-USER> (sycamore:fold-hash-set (lambda (list item) (cons item list))
nil
(sycamore:list-hash-set '(1 2 0)))
(1 2 0)
Tree Sets
---------
Define an ordering function:
CL-USER> (defun compare (a b)
(cond ((< a b) -1)
((> a b) 1)
(t 0)))
COMPARE
Create a set for integers:
CL-USER> (sycamore:tree-set #'compare 1 2 -10 40)
#<TREE-SET {-10 1 2 40}>
Insertion:
CL-USER> (sycamore:tree-set-insert (sycamore:tree-set #'compare 1 2)
0)
#<TREE-SET {0 1 2}>
Lookup:
CL-USER> (sycamore:tree-set-find (sycamore:tree-set #'compare 1 2 0)
0)
0
T
Removal:
CL-USER> (sycamore:tree-set-remove (sycamore:tree-set #'compare 1 2 0)
0)
#<TREE-SET {1 2}>
Union operation:
CL-USER> (sycamore:tree-set-union (sycamore:tree-set #'compare 1 2)
(sycamore:tree-set #'compare 1 0 3))
#<TREE-SET {0 1 2 3}>
Intersection operation:
CL-USER> (sycamore:tree-set-intersection (sycamore:tree-set #'compare 1 2)
(sycamore:tree-set #'compare 1 0 3))
#<TREE-SET {1}>
Difference operation:
CL-USER> (sycamore:tree-set-difference (sycamore:tree-set #'compare 1 2)
(sycamore:tree-set #'compare 1 0 3))
#<TREE-SET {2}>
Map set:
CL-USER> (sycamore:map-tree-set 'list #'1+
(sycamore:tree-set #'compare 1 0 10 2))
(1 2 3 11)
Fold set:
CL-USER> (sycamore:fold-tree-set (lambda (list item) (cons item list))
nil
(sycamore:tree-set #'compare 1 0 10 2))
(10 2 1 0)
Ropes
-----
Create a Rope:
CL-USER> (sycamore:rope "Hello" #\Space 'World!)
#<ROPE "Hello WORLD!">
Also works on lists:
CL-USER> (sycamore:rope (list "Hello" #\Space 'World!))
#<ROPE "Hello WORLD!">
And arrays:
CL-USER> (sycamore:rope (vector "Hello" #\Space 'World!))
#<ROPE "Hello WORLD!">
Rope to string:
CL-USER> (sycamore:rope-string (sycamore:rope "Hello" #\Space 'World!))
"Hello WORLD!"
Print a rope:
CL-USER> (sycamore:rope-write (sycamore:rope "Hello" #\Space 'World!)
:escape nil :stream *standard-output*)
Hello WORLD!
What collection data structure should I use?
============================================
Concept of Operations
---------------------
- Two classes of set and map collections: hashes and trees.
- Hashes require a hash function and an equality function. ANSI CL
provides SXHASH and EQ/EQL/EQUAL/EQUALP/=.
- Trees require a total order (for any two elements A and B: does A
come before B, does B come before A, or are A and B equal?)
- Hashes offer better asymptotic (big-O) and constant factor running
times (O(1)) than trees (O(log(n))).
- In some less-typical cases (e.g., adversarial key selection, very
large collections on 32 bit machines), hash collisions could reduce
hash performance, while balanced trees guarantee consistent
performance in such cases.
Q/A
---
- Q: Is your collection tiny or used outside a performance-critical
path?
- Yes: Use a list. Or whatever you care to. For collections that
are small and/or not performance-critical, it probably won't much
matter.
- No: Let's discuss!
- Q: Are you very certain you will only ever have one version of your
structure that you need? (no sharing between modules, no
backtracking, maybe used only within a single, small function)?
- Yes: A mutable structure such as ANSI CL hash-tables may offer the
best performance.
- No: Let's discuss!
- Q: Are your elements/keys hashable and equality-comparable (lisp
primitives and lists are hashable/equality-comparable), and are keys
not adversarially selected?
- Yes: Hash Array Mapped Tries (HAMTs) could offer the best
performance.
- No, I either don't have a hashing function or my keys could be
adversarially selected.
- Q: Are your keys adversarially selected but able to be
cryptographically hashed?
- Yes: In principle, HAMTs using a cryptographic hashing function
would be robust to adversarially selected keys. However, strong
cryptographic hashes require >64 bits, while the current HAMT
implementation in Sycamore assumes that hash-codes are FIXNUMs
(aligning with ANSI CL SXHASH). This limitation may be addressed
in the future by adding support for larger (e.g., bit-vector)
hash-codes, but for now WB-trees may be best for adversarial
cases.
- Q: Are your keys not hashable but totally ordered (for any two
elements A and B: you can decide if A and B are equal or if A comes
before B, or vice versa)?
- Yes: Weight-balanced trees could offer good performance. They may
be slighter slower than HAMTs, but require comparison rather than
hashing functions.
Alternatives
============
There are many other Common Lisp data structure libraries. Here are a
few alternatives and their trade-offs relative to Sycamore.
ANSI CL Hash Tables
-------------------
Mutable hash tables are very fast. If you don't need persistence or
set operations (union, intersection, difference, subset), then a
mutable hash table could be a good choice. Included benchmarks show
that SBCL's hash tables are around twice as fast as Sycamore HAMTs for
construction and lookup.
FSet
----
https://gitlab.common-lisp.net/fset/fset
FSet implements finite sets using (1) weight-balanced binary trees
similar to (and which in part inspired) Sycamore's WB-tree
implementation and (2) a HAMT variation based on Steindorfer's
Compressed Hash-Array Mapped Prefix-tree (CHAMP) [5]. FSet trees
support partial orders, whereas Sycamore trees assume a total order.
HAMTs don't require any order but do require a hashing function. FSet
trees uses a single, global ordering for elements via a generic
function, while Sycamore orders elements with a per-collection COMPARE
parameter. FSet HAMTs do not memoize hash codes, while Sycamore HAMTs
do.
[Benchmarks](https://ndantam.github.io/sycamore/bench.html) indicate
that Sycamore and FSet HAMTs are similar in performance for insert,
find, and remove operations, while Sycamore HAMTs are 1.3-4.5x faster
for new HAMT construction, union, intersection, and difference on the
tested sets. Sycamore WB-trees are 1.2-3.8x faster than FSet,
depending on the operation and size, for the tested sets.
CL-HAMT
-------
https://github.com/danshapero/cl-hamt
CL-HAMT is an implementation of hash array mapped tries. Included
benchmarks show that Sycamore HAMTs are 10x faster than CL-HAMT.
CL-Containers
-------------
https://github.com/gwkkwg/cl-containers
CL-Containers is stateful/mutable/imperative, while Sycamore is
purely-functional/persistent.
Lisp Interface Library (LIL)
----------------------------
https://github.com/fare/lisp-interface-library
Lisp Interface Library (LIL) provides abstracted data structures using
Interface Passing Style, while Sycamore provides a few concrete data
structures. LIL's Interface Passing Style presumably improves
flexibility at the cost of runtime overhead and API complexity.
Sycamore's explicit data structures have low-overhead, optimized
implementations and a simple, documented API.
References
==========
[1] Bagwell, Phil. Ideal Hash Trees. EPFL Technical Report, 2001.
[2] Michael J. Steindorfer and Jurgen J. Vinju. 2015. Optimizing
hash-array mapped tries for fast and lean immutable JVM
collections. SIGPLAN Not. 50, 10 (October 2015),
783–800. https://doi.org/10.1145/2858965.2814312
[3] Okasaki, Chris. "Purely Functional Data Structures." Cambridge
University Press. June 1999. ISBN: 978-0521663502.
[4] Adams, Stephen., 1992. Implementing sets efficiently in a functional
language. University of Southampton. Tech Report CSTR 92-10.
[5] OCaml Sets. [Documentation](https://ocaml.org/docs/sets).
[Code](https://github.com/ocaml/ocaml/blob/trunk/stdlib/set.ml)
[6] Boehm, H.J., Atkinson, R. and Plass, M., 1995. Ropes: an
alternative to strings. Software: Practice and Experience, 25(12),
pp.1315-1330.
Name
====
http://en.wikipedia.org/wiki/Platanus_occidentalis
Oh, the moonlight's fair tonight along the Wabash,
From the fields there comes the breath of newmown hay.
Through the sycamores the candle lights are gleaming,
On the banks of the Wabash, far away.
- Paul Dresser
"On the Banks of the Wabash, Far Away"
1897