hashtrie
2024-10-12
Persistent/Immutable Hash Trie datastructure based upon Clojure
hashtrie
A fast Hash trie implementation based upon Clojure's.
A Hash Trie works like a Hash Set, except that it has been optimised for immutability and thread-safety.
By default, hashtrie's are persistent and immutable, but this implementation also supports transients for building sets significantly faster.
Usage
Given that hashtrie collides with several important function definitions in the :common-lisp
namespace it is recommended that this library is used with a local nickname. For example, like this:
(defpackage my-package (:use #:cl) (:local-nicknames (#:htr #:hashtrie)))
Constructor:
(htr:make-hashtrie nil "foo" 1 "bar") ;; {nil "foo", 1 "bar"}
Construct with transience:
(htr:with-transient (trans (htr:make-hashtrie)) (htr:add trans 1 "bar") (htr:add trans nil "foo")) ;; {nil "foo", 1 "bar"}
Adding:
(htr:add (htr:make-hashtrie 1 "foo") 1 "bar") ;; {1 "bar"}
Removing:
(htr:remove (htr:make-hashtrie 1 1 2 2) 1) ;; {2 2}
Finding values:
(htr:value (htr:make-hashtrie 1 "foo" 2 "bar") 1) ;; "foo"
Testing keys:
(htr:has-key (htr:make-hashtrie 1 1 2 2) 1) ;; T (htr:has-key (htr:make-hashtrie 1 1 2 2) 100) ;; nil
Length/Count:
(htr:length (htr:make-hashtrie 1 "foo" 2 "bar")) ;; 2
Mapping:
(htr:map (htr:make-hashtrie 1 100 2 200 3 300) (lambda (key val) (+ key val)) ;; (101 202 303)
Reduce:
(htr:reduce (htr:make-hashtrie 1 0 2 0 3 0) (lambda (start key val) (+ start key val)) 0) ;; 6
Thread Safety
In theory the persistent Hash Trie is completely thread safe. This has been tested casually, but never in a production system.
Other important info
This library currently uses sxhash
and equal
for comparison. Alternative hashing/comparison functions are not supported.
Supported Lisps
In theory should work on all Common Lisp implementations.
Has only been tested on SBCL, and CLisp.
Benchmarking
Running SBCL, for comparison between using hashtrie and SBCL's own implementation of hashset, you can see that building SBCL's hashset is a bit over 10x faster. This is to be expected because it isn't immutable.
hashtrie
CL-USER> (time (loop for i from 0 to 1000000 for map = (htr:make-hashtrie i i) then (htr:add map i i) finally (return map))) ;Evaluation took: ; 1.158 seconds of real time ; 1.163023 seconds of total run time (1.002477 user, 0.160546 system) ; [ Run times consist of 0.486 seconds GC time, and 0.678 seconds non-GC time. ] ; 100.43% CPU ; 2,306,805,783 processor cycles ; 1,279,064,720 bytes consed
(time (htr:with-transient (trans (htr:make-hashtrie)) (dotimes (i 1000000) (htr:add trans i i)))) ;Evaluation took: ; 0.640 seconds of real time ; 0.640942 seconds of total run time (0.557056 user, 0.083886 system) ; [ Run times consist of 0.297 seconds GC time, and 0.344 seconds non-GC time. ] ; 100.16% CPU ; 1,275,757,029 processor cycles ; 190,686,512 bytes consed
hashset
(time (let ((m (make-hash-table :test 'equal))) (dotimes (i 1000000) (setf (gethash i m) i)))) ; Evaluation took: ; 0.170 seconds of real time ; 0.171739 seconds of total run time (0.152880 user, 0.018859 system) ; [ Run times consist of 0.024 seconds GC time, and 0.148 seconds non-GC time. ] ; 101.18% CPU ; 339,892,350 processor cycles ; 132,035,232 bytes consed
Also comparing the performance to clojure. Building a hash trie in sbcl is still slightly slower than doing the same in clojure, but not by much.
;Clojure 1.10.2 (defn persistent-build-map [set n] (if (> n 0) (recur (assoc set n n) (dec n)) set)) (defn transient-build-map [n] (loop [i 0 v (transient {})] (if (< i n) (recur (inc i) (assoc! v i i)) (persistent! v)))) (time (count (persistent-build-map {} 1000000))) ;"Elapsed time: 606.96371 msecs" (time (count (transient-build-map 1000000))) "Elapsed time: 502.676784 msecs"