type-r

2019-12-27

Type-R

https://travis-ci.org/guicho271828/type-r.svg?branch=master

This library provides a parser interface for the built-in compound types defined in Common Lisp.

NEWS (2019/04/14) : Pattern compilation especially for the large subtypes (e.g. real-subtype) are made faster by around x10 by reducing the duplicates in the resulting pattern expansion. (Example: 1.155 sec -> 0.175 sec on Thinkpad X240 (Celeron 1.6GHz) for the code below)

TYPE-R> (time (compile nil '(lambda (x) (trivia:match x ((real-subtype low high) (print low) (print high))))))

NEWS (2019/10/11) : Now it is much faster due to the basic improvement in Trivia's compiler. (0.043 sec)

Introduction

Consider when we have an unknown type specifier. It is easy to determine if this type is a subtype of string --- just use subtypep. Ok, so, is it also easy to parse the declared length of this string?

Not quite trivial as expected. This is because CL allows a bunch of alternative representations for a string subtypes. We have to consider all of (simple-base-string 50) , (string) or string. This is the target of this library.

This was initially a re-implementation of Bike/compiler-macro/type-utils.lisp . While it maintains the accessor for each type manually with lots of cond and case statements, I automated the definition of accessors with macros and pattern matcher Trivia, and provide more complete API for Common Lisp.

Types covered by this library are shown in the table below: which means, every compound types in CL. For the complete list of accessors, see quickdocs reference.

c.f. http://www.lispworks.com/documentation/HyperSpec/Body/04_bc.htm

and long-float simple-base-string
array member simple-bit-vector
base-string mod simple-string
bit-vector not simple-vector
complex or single-float
cons rational string
double-float real unsigned-byte
eql satisfies values
float short-float vector
function signed-byte bit
integer simple-array sequence
fixnum
bignum
ratio
null
list

Type Accessors

We have several accessor functions which returns the corresponding metadata of a given type specifier. The name of a function that obtains the Y portion of a given type X is named X-type-Y, e.g., for size portion of string, we have string-type-size.

(string-type-size '(string 50)) ; --> 50
(string-type-size '(string))    ; --> '*
(string-type-size 'string)      ; --> '*
(string-type-size 'base-string) ; |- error
(string-type-size '(base-string 50)) ; |- error

At this moment, the subtype relationship is not considered. For example, string-type-size does not match base-string. We have string-subtype-size and similar variants that matches all such subtypes.

(string-subtype-size '(string 50)) ; --> 50
(string-subtype-size '(string))    ; --> '*
(string-subtype-size 'string)      ; --> '*
(string-subtype-size 'base-string) ; --> '*
(string-subtype-size '(base-string 50)) ; --> 50

List of general pattern names related to arrays

  • base-string-subtype
  • simple-string-subtype
  • string-subtype
  • bit-vector-subtype
  • vector-subtype
  • simple-array-subtype
  • array-subtype
  • sequence-subtype

For type specifiers that are "Compound Type Specifier Kind: Specializing." or "Compound Type Specifier Kind: Abbreviating." in CLHS, there may be more metadata that do not explicitly appear in the type specifier but is still useful. For example, string implies that it is an array of character subtypes .

Given that array's element-type is accessible through array-type-element-type, we also provide string-type-element-type and so on.

(array-type-element-type 'array)               ; --> '*
(array-type-element-type '(array * 20))        ; --> '*
(array-type-element-type '(array character *)) ; --> 'character
(string-type-element-type 'string)             ; --> 'character
(array-subtype-element-type 'string)           ; --> 'character
(array-type-element-type 'string)              ; |- error (use *-subtype-* instead)

Number types

The similar set of functions are provided for numbers. The signature follows that of the type specifier signature.

(mod-type-n '(mod 5))                        ; --> 5
(bit-type-p 'bit)                            ; --> t
(bit-type-p '(bit))                          ; --> t
(unsigned-byte-type-bits '(unsigned-byte 5)) ; --> 5
(unsigned-byte-type-bits '(unsigned-byte))   ; --> '*
(unsigned-byte-type-bits 'unsigned-byte)     ; --> '*
(signed-byte-type-bits 'signed-byte)         ; --> '*
(byte-subtype-bits '(signed-byte 5))         ; --> 5
(byte-subtype-bits '(unsigned-byte 5))       ; --> 5
(fixnum-type-p 'fixnum)                      ; --> t
(bignum-type-p 'bignum)                      ; --> t
(integer-type-p 'bignum)                     ; --> nil
(integer-subtype-p 'bignum)                  ; --> t
(integer-subtype-low 'bignum)                ; --> '*
(integer-subtype-high 'bignum)               ; --> '*
(integer-subtype-high '(unsigned-byte 5))    ; --> 32
(integer-subtype-high 'fixnum)               ; --> most-positive-fixnum
(integer-subtype-high '(mod 5))              ; --> 4

(float-type-low 'float)                         ; --> '*
(single-float-type-low 'single-float)           ; --> '*
(single-float-type-low '(single-float -1.0 2.0) ; --> 1.0
(float-subtype-low '(single-float -1.0 2.0)     ; --> 1.0
(ratio-type-p 'ratio)                           ; --> t

(real-subtype-low 'float)                  ; --> '*
(real-subtype-low 'single-float)           ; --> '*
(real-subtype-low '(single-float -1.0 2.0) ; --> 1.0
(real-subtype-low 'ratio)                  ; --> '*
(real-subtype-low '(mod 5))                ; --> 0
(real-subtype-low '(signed-byte 5))        ; --> -32
(real-subtype-low '(unsigned-byte 5))      ; --> 0
(real-subtype-low 'fixnum)                 ; --> most-negative-fixnum
(real-subtype-low 'bignum)                 ; --> '*
(real-subtype-low '(integer -3 5))         ; --> -3

(complex-type-element-type '(complex (integer -3 5)) ; --> '(integer -3 5)
(number-subtype-p '(complex (integer -3 5))          ; --> t
(number-subtype-p '(integer -3 5)                    ; --> t

List of general pattern names related to numbers

  • byte-subtype --- covers both (unsigned-byte ...) and (signed-byte ...) variants.
  • integer-subtype --- covers all of mod,bit,unsigned-byte,signed-byte,bignum,fixnum,integer type specifiers.
  • fixnum-subtype --- Same as integer-subtype but it matches only when low,high are within the fixnum range.
  • float-subtype
  • rational-subtype
  • real-subtype
  • number-subtype

Other types

We also support function, values, member, or, and, cons, member, eql.

  • function has function-type-return-type / function-type-args-type accessors.
  • member has member-type-members-.
  • or / and has or/and-type-types.
  • cons has cons-type-car/cdr-type.
  • eql has eql-type-object.

Pattern Matcher Integration

Type-R is implemented with a pattern matcher Trivia. This allows further integration of type specifiers with pattern matchers.

For a given type specifier X, there is a Trivia pattern X-type, e.g., for string, we have a pattern named string-type .

(match '(string 50)
  ((string-type size) size)) ; --> 50

(match '(string)
  ((string-type size) size)) ; --> '*

(match 'string
  ((string-type size) size)) ; --> '*
(ematch '(simple-array * 3)
  ((array-type _ rank) rank)) ; --> match error!

(ematch '(simple-array * 3)
  ((array-subtype _ rank) rank)) ; --> 3

(ematch '(simple-array * (3 2))
  ((array-subtype _ (list _ column)) column)) ; --> 2
(ematch '(simple-string 5)
  ((simple-string-type size) size)) ; --> 5

(ematch '(simple-string 5)
  ((simple-string-type _ type) type)) ; --> 'character

(ematch '(base-string 5)
  ((base-string-type _ type) type)) ; --> 'base-char

For number types, we have patterns like (float-type low high). Similarly to the array types, we have optional values that are bounded by default, e.g.,


(match 'fixnum
  ((integer-subtype low _) low)) ; --> [MOST-NEGATIVE-FIXNUM] (implementation dependent)

Example

When writing a numerical manipulation library, it is sometimes necessary to convert a set of several type specifiers under REAL, e.g. RATIO, INTEGERS, FLOATS, to the least specific FLOAT type when any one of them are not integers. The rule for writing this could be cumbersome without this library.

Let's start with an incomplete function that converts two given types into a long-float type when one of them is a long-float type specifier:

(defun upgrade-to-long-float (type1 type2)
  (ematch* (type1 type2)
    (((long-float-type l1 h1) (long-float-type l2 h2))

     `(long-float ,(min* l1 l2)
                  ,(max* h1 h2)))

    (((real-subtype l1 h1) (long-float-type l2 h2))

     `(long-float ,(min* l1 l2)
                  ,(max* h1 h2)))

    (((long-float-type l2 h2) (real-subtype l1 h1))

     `(long-float ,(min* l1 l2)
                  ,(max* h1 h2)))))

This function takes two types, then performs a pattern match on them. We used ematch* which we can feed multiple objects (unlike ematch), and ematch* signals an error when none of the patterns are matched, unlike match* which just returns a nil when that happens.

The code dispatches to the corresponding branch and decomposes the type specifier into its low and high component.

Note that we cannot use the standard min or max because low and high could be a symbol ='*=. min* and max* are exactly those variants as defined below.

Also note that we use real-subtype pattern instead of real-type pattern because we want to match all type specifiers that is a subtype of real.

(defun min* (a b)
  (declare ((or (eql *) real) a b))
  (ematch* (a b)
    (('* _) '*)
    ((_ '*) '*)
    ((_ _) (min a b))))

(defun max* (a b)
  (declare ((or (eql *) real) a b))
  (ematch* (a b)
    (('* _) '*)
    ((_ '*) '*)
    ((_ _) (max a b))))

Now what we finally need to do is to cover all subtypes of float. Note that the match is performed in a top-down manner therefore we don't have to worry short-float matched before long-float etc.

(defun upgrade-to-float-type (&rest typespecs)
  (reduce (lambda (prev now)
            (ematch* (prev now)
              (((long-float-type l1 h1) (long-float-type l2 h2))
               `(long-float ,(min* l1 l2)
                            ,(max* h1 h2)))

              (((real-subtype l1 h1) (long-float-type l2 h2))
               `(long-float ,(min* l1 l2)
                            ,(max* h1 h2)))

              (((long-float-type l2 h2) (real-subtype l1 h1))
               `(long-float ,(min* l1 l2)
                            ,(max* h1 h2)))


              (((double-float-type l1 h1) (double-float-type l2 h2))
               `(double-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((real-subtype l1 h1) (double-float-type l2 h2))
               `(double-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((double-float-type l2 h2) (real-subtype l1 h1))
               `(double-float ,(min* l1 l2)
                              ,(max* h1 h2)))


              (((single-float-type l1 h1) (single-float-type l2 h2))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((real-subtype l1 h1) (single-float-type l2 h2))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((single-float-type l2 h2) (real-subtype l1 h1))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))


              (((short-float-type l1 h1) (short-float-type l2 h2))
               `(short-float ,(min* l1 l2)
                             ,(max* h1 h2)))

              (((real-subtype l1 h1) (short-float-type l2 h2))
               `(short-float ,(min* l1 l2)
                             ,(max* h1 h2)))

              (((short-float-type l2 h2) (real-subtype l1 h1))
               `(short-float ,(min* l1 l2)
                             ,(max* h1 h2)))

              ;; the specific flaot type is unspecified.
              (((float-type l1 h1) (float-type l2 h2))
               `(float ,(min* l1 l2)
                       ,(max* h1 h2)))

              (((real-subtype l1 h1) (float-type l2 h2))
               `(float ,(min* l1 l2)
                       ,(max* h1 h2)))

              (((float-type l2 h2) (real-subtype l1 h1))
               `(float ,(min* l1 l2)
                       ,(max* h1 h2)))

              ;; Now both are rationals = (or integer ratio) = (or (or bignum fixnum) ratio).
              ;; Ratios are converted into single-floats.

              (((ratio-type l1 h1) (ratio-type l2 h2))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((real-subtype l1 h1) (ratio-type l2 h2))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              (((ratio-type l2 h2) (real-subtype l1 h1))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))

              ;; Now both are integers.
              ;; we first match the case with two fixnums:
              (((fixnum-subtype l1 h1) (fixnum-subtype l2 h2))
               ;; note that this also includes the bignum type specifiers
               ;; with a sufficiently small limit.
               `(integer ,(min l1 l2) ,(max h1 h2)))

              ;; the last case is the integers beyond the fixnum range.
              (((real-subtype l1 h1) (real-subtype l2 h2))
               `(single-float ,(min* l1 l2)
                              ,(max* h1 h2)))))
          typespecs))

Dependencies

This library is at least tested on implementation listed below:

  • SBCL 1.2.8 on X86-64 Linux 3.13.0-46-generic (author's environment)

Also, it depends on the following libraries:

Trivia by Masataro Asai

NON-Optimized Pattern Matching Library

alexandria by

Alexandria is a collection of portable public domain utilities.

iterate by

Jonathan Amsterdam's iterator/gatherer/accumulator facility

Author

  • Masataro Asai (guicho2.71828@gmail.com)

Copyright

Copyright (c) 2015 Masataro Asai (guicho2.71828@gmail.com)

License

Licensed under the LLGPL License.

Author
Masataro Asai
License
LLGPL
Categories
convenience library, pattern matching