fresnel

2023-06-18

Bidirectional translation with lenses

Upstream URL

gitlab.com/GrammaTech/Mnemosyne/fresnel

Author

GrammaTech

License

MIT
README

Fresnel

Fresnel implements lenses: lossless, composable bidirectional translations. (Not to be confused with “lenses” as functional references.)

Currently Fresnel is being used to make program synthesis practical for real-world programming languages. Given the AST of a programming language we can use Fresnel to convert the AST into a simplified DSL acceptable to a program synthesis engine, then convert the result back into the AST of a real programming language.

There are few references on lenses outside of formal treatments. The following attempts to suggest an intuition before moving on. A preview: writing bidirectional transformations using lenses feels very much like writing a parser, except instead of imposing a tree structure you’re starting with one, and you get the reverse direction “for free”.

Lens tutorial

Setup

(The following examples can be evaluated in the fresnel-user package, as long as the fresnel/lens system is loaded.)

For practical purposes, you may want to use fresnel/readtable:lens-readtable instead. This readtable also supports quasiquotation for pattern matching (with fare-quasiquote) and C-style string escapes, such as #?"\n" for a newline (with cl-interpol).
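
If lens-readtable is registered with named-readtables (an assumption here, suggested by the use of in-readtable in the examples note below), switching to it would look like:

;; Assumption: lens-readtable is a named-readtables readtable.
(named-readtables:in-readtable fresnel/readtable:lens-readtable)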

Lenses are defined in the fresnel/lens package. Many functions share names with existing functions or macros (as bidirectional generalizations). It is not recommended to :USE this package. Use a package-local nickname instead:

(:local-nicknames (:l :fresnel/lens))
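
For example, in a defpackage form (the package name here is hypothetical):

(defpackage :my-lenses
  (:use :cl)
  (:local-nicknames (:l :fresnel/lens)))
(in-package :my-lenses)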

Summary

Writing lenses will usually involve the following steps:

  1. Define bidirectional lenses using pairs of functions.
  2. Define structure lenses using bidirectional pattern matching (the restructuring-bind macro).
  3. Lift structure lenses into container lenses (lists, tables) using combinators such as l:mapcar and l:alist-mapping.
  4. Compose lenses using combinators, like l:compose or l:or.
  5. Decorate lenses with before and after steps.

Step 1: Value lenses

The simplest possible case of a bidirectional translation is a bijection. This is a one-to-one function in both directions. For example, negation. We can always get the original back from a negation:

(- 10)  => -10
(- -10) => 10

We can turn this into a lens with make-lens. (This example, and following examples, assume l as a package-local nickname for fresnel/lens.)

(let ((lens (l:make-lens #'-)))
  (l:forward  lens 10)   ; => -10
  (l:backward lens -10)) ; => 10

For bijective lenses, forward applies the transformation and backward undoes it.

Slightly less simple is a bijection that uses a different function in each direction. Take the case of a character:

(char-upcase #\a)   => #\A
(char-downcase #\A) => #\a

We can also turn this into a lens. Now we specify two functions, one for the forward direction, one for the backward direction:

(let ((lens (l:make-lens #'char-upcase #'char-downcase)))
  (l:forward  lens #\a)  ; => #\A
  (l:backward lens #\A)) ; => #\a

Lenses like these, defined as a single bijective function or as a pair of functions, are what we call value lenses.

Step 2: Structure lenses

Lenses that are not value lenses are structure lenses. This is where lenses get interesting. Structure lenses operate on things that have a fixed internal structure. Structure lenses can be bijective, but only if both the original and the transformation have exactly the same slots.

What if they don’t have the same slots? Then we need to parameterize the backward direction. In particular, we need two things:

  • The lens needs to be able to take the original as an extra argument, so it can be used to “fill in” slots that were present in the original but aren’t present in the transformed version.
  • The lens needs to be able to specify defaults for the missing slots, if no original is available.

What does that look like? Consider a simple example: translating between points in two and three dimensions. This relationship is not bijective, since 3-D points have an independent z-axis.

Start by defining the points themselves:

;; Define classes to represent points.
(defclass 2d-point ()
  ((x :initarg :x)
   (y :initarg :y)))

(defclass 3d-point ()
  ((x :initarg :x)
   (y :initarg :y)
   (z :initarg :z)))

;; Define print-object methods for easier presentation.
(defmethod print-object ((self 2d-point) stream)
  (print-unreadable-object (self stream :type t)
    (with-slots (x y) self
      (format stream "~a,~a" x y))))

(defmethod print-object ((self 3d-point) stream)
  (print-unreadable-object (self stream :type t)
    (with-slots (x y z) self
      (format stream "~a,~a,~a" x y z))))

The actual lens definition is much simpler than the definitions of the points themselves:

(defun flatland-lens ()
  (l:match-constructors (x
                         y
                         (z (l:identity) 0))
    (make '3d-point :x x :y y :z z)
    (make '2d-point :x x :y y)))

(By convention, lenses are defined as functions, to avoid problems with ordering.)

The match-constructors macro builds on Trivia. It takes two patterns and matches up the components of each pattern according to their names. The inherent syntactic ambiguity of pattern matching allows each clause to define how to destructure an instance of one type, and restructure its components into an instance of another type.

(You don’t have to use match-constructors; you can also use the ternary form of make-lens, passing a third function of two arguments to handle constructing an instance from a translation and an original.)
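
For the flatland example, the ternary form might look roughly like the following sketch (the function name is hypothetical, and the argument order of the third function, translation then original, is an assumption based on the description above):

(defun flatland-lens/functions ()
  (l:make-lens
   ;; forward: 3d -> 2d
   (lambda (p3)
     (with-slots (x y) p3
       (make '2d-point :x x :y y)))
   ;; backward with no original: 2d -> 3d, defaulting z to 0
   (lambda (p2)
     (with-slots (x y) p2
       (make '3d-point :x x :y y :z 0)))
   ;; backward with an original: reuse the original's z slot
   (lambda (p2 original)
     (with-slots (x y) p2
       (make '3d-point :x x :y y :z (slot-value original 'z))))))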

How the components are translated is defined by the bindings. Each binding specifies the lens used to translate that component:

  • A bare variable, like x and y above, is equivalent to (x (l:identity)).
  • The optional third element of a binding is the default to be used if no original is supplied.

With the lens defined, we can try it out:

(def 3d (make '3d-point :x 1 :y 2 :z 3))

;; Taking a 3d point to 2d drops the z-axis.
(l:forward (flatland-lens) 3d)
=> #<2D-POINT 1,2>

;; We can take the same point back to 3d, passing the original to recover z.
(l:backward (flatland-lens) * 3d)
=> #<3D-POINT 1,2,3>

;; Without an original, taking a 2d point to 3d uses the default.
(l:backward (flatland-lens) **)
=> #<3D-POINT 1,2,0>

Step 3: Container lenses

Container lenses lift structure lenses (and value lenses) over containers such as lists and tables, using combinators like l:mapcar and l:alist-mapping. The address book example in the Usage section below shows a container lens in action; the Walkthrough there explains the details.
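
For instance, a value lens can be lifted over a list (a sketch, assuming l:mapcar takes a lens and returns a lens over lists, as the summary above suggests; the annotated results are likewise assumptions):

(let ((lens (l:mapcar (l:make-lens #'char-upcase #'char-downcase))))
  (l:forward  lens '(#\a #\b))  ; => (#\A #\B)
  (l:backward lens '(#\A #\B))) ; => (#\a #\b)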

Step 4: Lens combinators

Composition

Lenses can be composed with l:compose:

(l:compose (python-dsl) (python-parser))

In this case the value is parsed from a string into an AST using the lens returned by python-parser, then the AST is parsed into a DSL using the lens returned by python-dsl. (Or the other way around, of course: a DSL is unparsed into Python ASTs, and the ASTs are unparsed into a string.)
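
The same pattern with value lenses (a sketch, assuming composition applies the rightmost lens first in the forward direction, as described above):

(let ((lens (l:compose (l:make-lens #'char-upcase #'char-downcase) ; applied second
                       (l:make-lens #'code-char #'char-code))))    ; applied first
  (l:forward  lens 97)   ; 97 -> #\a -> #\A
  (l:backward lens #\A)) ; #\A -> #\a -> 97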

Alternation

Lenses can be combined with l:or in such a way that they are tried in sequence; the first sub-lens to succeed is the result of the entire lens. Alternatively, lenses can be combined with l:or!; in that case a runtime error is raised if more than one sub-lens matches.
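
A sketch of alternation, assuming l:or takes lenses as arguments and that l:check (used in the FPCore example below) builds a discerning identity lens from a predicate:

(let ((lens (l:or (l:check #'keywordp)
                  (l:check #'stringp))))
  (l:forward lens :foo)   ; the first sub-lens succeeds
  (l:forward lens "foo")) ; the first fails, so the second is tried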

Step 5: Before and after functions (“quotienting”)

Lenses can be decorated with before and after steps; as the note below explains, these correspond to the left and right quotienting discussed under “Advanced: quotient lenses” later in this README.

A note on the bidirectional transformation literature

Lenses are formally defined as total. This is not very practical. Fresnel allows for “discerning lenses”, which can be used to do error checking while transforming in either direction.

The terms “value lens,” “structure lens,” and “container lens” do not occur in the literature.

The literature distinguishes three operations instead of two: forward is called “get”, backward with one argument is called “create”, and backward with two arguments is called “put”.

In the literature, before functions are called “left quotienting”, and after functions are called “right quotienting.”

Afterword: debugging

Since lenses are constructed, then called, they can be difficult to debug. You can use l:trace (and l:untrace) to print information about successful matches. This doesn’t help when matches are total failures, but is very helpful when the lens isn’t returning what you expect.

A note on examples

Examples in this file are evaluated in the gt-user package from gt and use curry-compose-reader-macros. To reproduce:

(ql:quickload :gt/full)
(in-package :gt-user)
(in-readtable :curry-compose-reader-macros)

Usage

Discerning lenses

"Ill-formed" values are signaled using multiple values: the first value returned by one of a lens's functions is considered ill-formed if the last value returned is nil.

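As a sketch, a discerning forward function for a negation lens over integers might look like this (the convention, not the exact API, is what is being illustrated):

(lambda (x)
  (if (integerp x)
      (values (- x) t)    ; well-formed: last value is T
      (values nil nil)))  ; ill-formed: last value is NIL
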
Example: address book

Consider an address book stored as nested alists:

(def address-book
  '(("Pat"
     ("Phone" . "555-4444")
     ("URL" . "http://pat.example.com"))
    ("Chris"
     ("Phone" . "555-9999")
     ("URL" . "http://chris.example.org"))))

Assume an abstracted view of the same address book with just names and numbers:

(def abstract-original
  '(("Chris" . "555-9999")
    ("Pat" . "555-4444")))

We want to change the abstract view and have that change reflected in the original. We drop Chris and add Jo, and change the phone number for Pat.

(def abstract2
  '(("Pat" . "555-4321")
    ("Jo" . "555-6666")))

How can we get back to a concrete address book? The solution is a lens:

(def lens
  (l:alist-mapping
   (l:match ((phone (l:identity))
             (url (l:identity) "http://google.com"))
     (list
      (cons "Phone" phone)
      (cons "URL" url))
     phone)
   :test #'equal))

This lens can reproduce the original abstract view:

(l:get lens address-book)
=> '(("Chris" . "555-9999") ("Pat" . "555-4444"))

And in the opposite direction:

(l:put lens abstract2 address-book)
=> '(("Jo"
      ("Phone" . "555-6666")
      ("URL" . "http://google.com"))
     ("Pat"
      ("Phone" . "555-4321")
      ("URL" . "http://pat.example.com")))

Walkthrough

What’s happening here? The crux is the l:match macro, which is used to translate the values of the alist. (alist-mapping is a combinator that applies a lens to all the values of a map.) In the forward direction, l:match extracts the phone number from each entry; in the backward direction it rebuilds the “Phone”/“URL” pair, reusing the URL from the original entry when one is available and falling back to the default otherwise.

Advanced: quotient lenses

Lenses were designed to address the view-update problem for sets of records (as in database records). This history shows in their assumption of an “original” that can be meaningfully used in the put direction to recover a concrete value.

When a small, simple concrete AST results in the synthesis of a large, complex abstract AST in the DSL of a synthesizer, there is usually no meaningful “original”.

Fortunately, lenses can be further generalized into quotient lenses. Quotient lenses generalize lenses so that the lens laws are respected up to some equivalence relation.

Quotient lenses are derived from lenses by quotienting – by being composed with a special type of lens, a canonizer. Lenses that have been quotiented behave the same way from the outside as basic (unquotiented) lenses and as such can be further quotiented.

The utility of quotient lenses is that they allow a style where canonization, instead of being done in one step at the edges, is interleaved as part of the overall translation.

Lenses can be quotiented on the left, concrete side (with lquot) or the right, abstract side (with rquot).

Advanced: canonizers and representativeness

Lenses translate along an axis of abstraction. A lens translates from a set of concrete elements to a set of abstract elements.

A canonizer, on the other hand, translates along an axis of representativeness. A canonizer translates from a set of elements (we will call them constituent elements) to a set of representative elements.

A canonizer, like a bijective lens, has only two functions. The two functions are called canonize and choose.

The canonize function of a canonizer takes a constituent and returns a representative for it. (It is a surjection.)

The choose function of a canonizer takes a representative and returns some constituent element. (It is an injection.)

(Why “constituent”? By analogy with representative democracy, where there is a canonizer, election, that takes a set of constituents and returns a single, canonical representative. In the case of election, the representative is also a constituent, so choose is the identity function. But it is not a requirement of canonizers that the representative belong to the set of constituents – this is the limit of the analogy.)
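
For concreteness, here is a minimal canonizer sketch using the l:canonizer constructor from the examples below (the canonizer itself is hypothetical): many strings map to one lowercase representative, and choose hands back the representative unchanged.

(defun downcase-canonizer ()
  (l:canonizer
   (lambda (c)                ; canonize: constituent -> representative
     (when (stringp c)
       (values (string-downcase c) t)))
   #'identity))               ; choose: representative -> constituent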

The axis of representativeness is orthogonal to the axis of abstraction:

When we compose a canonizer with the left (concrete) side of a lens – which is called “left quotienting” – canonize returns a more abstract representative, and choose returns a more concrete constituent. (This is a more limited form of lens composition.)

But when we compose a canonizer with the right (abstract) side of a lens – which is called “right quotienting” – then canonize returns a more concrete representative, and choose returns a more abstract constituent.

(That said, lenses and canonizers are structurally the same, and any lens can be converted into a canonizer by equating “abstract” and “representative”: get becomes canonize and create becomes choose.)

Canonizers need not be total

The formal definition of canonizers requires them to be total. We work around this by treating canonizers as discerning lenses with the behavior that, if the lens fails – if the last value returned by canonize or choose is nil – then the argument is returned unchanged.

Use left-quotienting for simplification and fix-ups

Left-quotienting coarsens the concrete domain.

For canonize, this typically means we want to ignore some aspect of the original (like whitespace).

Here is a simple canonizer that strips (usually redundant) parenthesized expression nodes from an SEL tree-sitter AST:

(defun strip-parens ()
  "Canonizer that strips parentheses."
  (l:canonizer (lambda (c)
                 (match c
                   ((parenthesized-expression-ast
                     :children `(,child))
                    child)))
               #'identity))

Except for the discerning behavior, left-quotienting behaves the same as composition. (If strip-parens were a lens, it would fail for any input that wasn’t a parenthesized expression, but as a canonizer it just returns the input unchanged.)
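
Using it would look something like this (a sketch, assuming l:lquot takes the canonizer first and the lens second, mirroring the l:rquot example later in this README; python-dsl is the hypothetical lens from the composition example above):

(defun python-dsl/no-parens ()
  (l:lquot (strip-parens)
           (python-dsl)))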

For choose, this typically means we want to “fix up” the concrete domain to respect some invariant (like operator precedence, or indentation, or interleaved text).
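
A sketch of a choose-side fix-up (the canonizer is hypothetical): canonize leaves the concrete value alone, while choose restores an invariant of the concrete domain, here a trailing newline on strings.

(defun trailing-newline-canonizer ()
  (l:canonizer
   #'identity
   (lambda (r)
     (when (stringp r)
       (values (if (and (plusp (length r))
                        (eql (char r (1- (length r))) #\Newline))
                   r
                   (concatenate 'string r (string #\Newline)))
               t)))))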

Use right-quotienting for further abstraction

Right-quotienting “coarsens the abstract domain.” This calls for illustration.

Right-quotienting for canonizing human input

One potential use for right-quotienting is handling human input. Suppose you have a lens that translates a brittle, verbose format into a simpler one that a human can easily edit. You might want to ignore incidental aspects of the human-editable format (whitespace, indentation, case).

Right quotienting provides a principled way to do this input canonization. We define a canonizer with a canonize function that takes the messy user input and returns a cleaned-up version as its ideal representative. (Optionally, we could also define a choose function that pretty-prints that ideal representation.)

(defun input-canonizer ()
  (l:canonizer
   (lambda (c)
     (when (stringp c)
       (string-downcase
        (collapse-whitespace
         (remove #\Newline c)))))
   #'identity))

(defun my-lens* ()
  (l:rquot
   (input-canonizer)
   (my-lens)))

Right-quotienting for handling identifiers

Another potential use for right quotienting is handling identifiers. Assuming we have already performed a structural translation (e.g. into S-expressions), we might want to further translate only the identifiers.

A real example: A lens implementation of FPCore translates between a generic S-expression based representation and an FPCore-specific one.

The generic S-expression representation only uses keywords. The FPCore lens, however, uses symbols interned in two specific packages: one constant package that contains only the FPCore builtins, and another “working” package that is used to intern symbols for parameters. (This distinction is important so that we can analyze FPCore expressions to find what parameters they expect.)

The goal is a trivial lens that recognizes Lisp keywords and quotients them with a canonizer.

(defun fpcore-symbol-lens ()
  "Converts between keywords and FPCore symbols."
  (l:rquot
   (fpcore-symbol-canonizer)
   (l:check #'keywordp)))

How to define the canonizer? In this case the representative value is a keyword, while the constituent value is a symbol interned in one of the FPCore packages.

We need a canonize function that gets the representative value by converting any symbol it receives into a keyword:

(lambda (c)
  (and c
       (symbolp c)
       (values (make-keyword (string c)) t)))

And we need a choose function that gets a constituent value by interning the keyword's string in the appropriate package:

(lambda (r)
  ;; Force the last value to T, since intern may return a second value
  ;; of NIL for a newly created symbol.
  (values (intern-fpcore-symbol (string r))
          t))

These can then be combined into the canonizer we want:

(defun fpcore-symbol-canonizer ()
  "Translate between keywords and FPCore symbols, when possible."
  (l:canonizer
   (lambda (c)
     (and c
          (symbolp c)
          (values (make-keyword (string c)) t)))
   (lambda (r)
     ;; Force the last value to T, since intern may return a second
     ;; value of NIL for a newly created symbol.
     (values (intern-fpcore-symbol (string r))
             t))))

Copyright and Acknowledgments

Copyright (C) 2020 GrammaTech, Inc.

This code is licensed under the MIT license. See the LICENSE file in the project root for license terms.

This material is based upon work supported by the US Air Force, AFRL/RIKE and DARPA under Contract No. FA8750-20-C-0208. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US Air Force, AFRL/RIKE or DARPA.

Dependencies (5)

  • cl-interpol
  • cl-utils
  • fare-quasiquote
  • stefil-
  • trivial-package-local-nicknames

Dependents (0)
