fresnel

2021-06-30

Bidirectional translation with lenses

Upstream URL

gitlab.com/GrammaTech/Mnemosyne/fresnel

Author

GrammaTech

License

MIT

README.md

Fresnel

The Fresnel library allows bidirectional translation using lenses. Bidirectional translations allow a concrete value to be translated into an abstract value, manipulated in its abstract form, and then translated back into a concrete form without losing information. (But Fresnel is also useful for naive bidirectional translation, without retaining information.)

To be concrete, currently Fresnel is being used for program synthesis. Given the AST of a programming language we can use Fresnel to convert the AST into a DSL acceptable to a program synthesis engine, then convert the result back into the AST of a real programming language. However Fresnel could be useful for many other purposes.

A note on examples

Examples in this file are evaluated in the gt-user package from gt and use curry-compose-reader-macros. To reproduce:

(ql:quickload :gt/full)
(in-package :gt-user)
(in-readtable :curry-compose-reader-macros)

For practical purposes, you may want to use fresnel/readtable:lens-readtable instead. This also supports use quasiquotation for pattern matching (with fare-quasiquote) and C-style string escapes, such as #?”\n” for a newline (with cl-interpol).

Usage

Lenses are defined in the fresnel/lens package. Many functions share names with existing functions or macros (as bidirectional generalizations). It is not recommended to :USE this package. Use a package-local nickname instead:

(:local-nicknames (:l :fresnel/lens))

Writing lenses will usually involve the following steps:

  1. Define primitive translations using l:check and l:bij.
  2. Define structural translations using bidirectional pattern matching with l:match.
  3. Lift structural translations to work on containers (lists, tables) using combinators (such as l:mapcar and l:alist-mapping).
  4. Combine possible translations using l:or.

Caveats:

  • Lenses defined at each step can invoke lenses defined later, recursively.
  • The steps may be interleaved with canonization steps, using quotienting.

Lenses

Bijections

The simplest form of bidirectional translation is bijection: exhaustive, one-to-one correspondence. Bijections can be defined using l:bij.

;; A bijection between lists and vectors.
(def list-to-vector (l:bij {coerce _ 'list} {coerce _ 'vector}))

All bidirectional transformations defined by this library distinguish one domain as “concrete” and the other as “abstract.” In the above example, lists are the concrete domain and vectors are the abstract domain.

The function from concrete to abstract is invoked with l:get:

(l:get list-to-vector '(1 2 3)) => #(1 2 3)

The function from abstract to concrete is invoked with l:create:

(l:create list-to-vector #(1 2 3) => '(1 2 3))

For a bijection, the relationship between concrete and abstract can be reversed using the opp combinator:

(def vector-to-list (l:opp list-to-vector))

(l:get vector-to-list #(1 2 3)) -> '(1 2 3)
(l:create vector-to-list '(1 2 3)) -> #(1 2 3)

Basic lenses

Lenses generalize bijections by allowing the transformation in the forward direction (get) to be non-injective.

This means that, to “put back” an abstract element into the concrete domain, instead of using l:create, we use l:put with the original concrete element as a second argument.

To be a lens, a bidirectional transformation must obey three laws:

  • GetPut: (put l (get l c) c) = c for all c from C
  • PutGet: (get l (put l a c)) = a for all (a, c) from A × C
  • CreateGet: (get l (create l a)) = a for all a from A

Or more informally:

  • GetPut: If A is not changed, neither is C.
  • PutGet: Any update to A is represented faithfully in C.
  • CreateGet: Create doesn’t lose any information.

Fresnel does not enforce these laws, but it does provide the function l:check-laws for testing.

The advantage of the lens laws is that they guarantee that transformations are composable. The downside is that they require all lenses to be total – but we work around the totality requirement by establishing a convention for discerning lenses.

Discerning lenses

Discerning lenses are quotient lenses that have one special equivalence class of "ill-formed values" on both sides, which can be used to do error checking while transforming in either direction.

"Ill-formed" values are defined in terms of multiple values: the first value returned from a function of a lens is considered ill-formed if the last value returned is nil.

Example: address book

Consider an address book stored as nested alists:

(def address-book
  '(("Pat"
     ("Phone" . "555-4444")
     ("URL" . "http://pat.example.com"))
    ("Chris"
     ("Phone" . "555-9999")
     ("URL" . "http://chris.example.org"))))

Assume an abstracted view of the same address book with just names and numbers:

(def abstract-original
  '(("Chris" . "555-9999")
    ("Pat" . "555-4444")))

We want to change the abstract view and have that changed reflected in the original. We drop Chris and add Jo, and change the phone number for Pat.

(def abstract2
  '(("Pat" . "555-4321")
    ("Jo" . "555-6666")))

How can we get back to a concrete address book? The solution is a lens:

(def lens
  (l:alist-mapping
   (l:match ((phone (l:identity))
             (url (l:identity) "http://google.com"))
     (list
      (cons "Phone" phone)
      (cons "URL" url))
     phone)
   :test #'equal))

This lens can reproduce the original abstract view:

(l:get lens address-book)
=> '(("Chris" . "555-9999") ("Pat" . "555-4444"))

And in the opposite direction:

(l:put lens abstract2 address-book)
=> '(("Jo"
      ("Phone" . "555-6666")
      ("URL" . "http://google.com"))
     ("Pat"
      ("Phone" . "555-4321")
      ("URL" . "http://pat.example.com")))
Walkthrough

What’s happening here? The crux is the l:match macro, which is used to translate the values of the alist. (alist-mapping is a combinator that applies a lens to all the values of a map.)

The l:match macro uses Trivia pattern matching to conveniently define bidirectional transformations. The same syntax is used to deconstruct a value that is used to construct it; we can use one form to describe how to analyze and how to build a value.

Consider a simplified concrete value:

(def c
  '(("Phone" . "555-4444")
    ("URL" . "http://pat.example.com")))

(def l
  (l:match
      ;; The components.
      ((phone (l:identity))
       (url (l:identity) "http://google.com"))
    ;; The concrete form.
    (list
     (cons "Phone" phone)
     (cons "URL" url))
    ;; The abstract form.
    phone))

What happens in the get direction?

(l:get l c)
=> "555-4444"
  1. The concrete form is matched again a Trivia pattern. If the pattern fails, the lens as a whole fails. Otherwise, Trivia binds phone and url.
  2. Each of the components is rebound to the result of calling get on the relevant lens and binding, e.g.:
    (l:get (l:identity) phone)
    
    If any component fails, the lens as a whole fails.
  3. Evaluate the abstract form. In this case, it simply returns the value of phone.

What happens in the create direction?

(l:create l "555-4444")
=> '(("Phone" . "555-4444")
     ("URL" . "http://google.com"))
  1. The abstract form is matched, binding phone.
  2. Call l:create on each component’s lens and binding:
    (l:create (l:identity) phone)
    
  3. Since url is not bound by Trivia and there is no concrete value to derive it from, the provided default is used.
  4. Evaluate the concrete form as an expression.

How is this different from what happens in the put direction, when a concrete value is provided to “fill in the blanks”?

(l:put l "555-4444" c)
=> '(("Phone" . "555-4444")
     ("URL" . "http://pat.example.com"))
  1. As before, the abstract form is matched as a Trivia pattern.
  2. Invoke the lens for each component, but using l:put instead, with the concrete value for that component.
    (l:put (l:identity) phone c)
  3. Match the concrete value against the concrete pattern , and use the values derived to fill in any components that are missing in the abstract value. (If the concrete value does not match, just use the defaults.)
  4. Evaluate the concrete form as an expression.

Lens combinators

Lenses can be composed with l:compose:

(l:compose (python-dsl) (python-parser))

In this case the value is parsed from a string into an AST using the lens returned by python-parser, then the AST is parsed into a DSL using the lens returned by python-dsl. (Or the other way around, of course: a DSL is unparsed into Python ASTs, and the ASTs are unparsed into a string.)

Lenses can also be combined, using l:or, in such a way that they are tried in sequence; the first sub-lens to succeed is the result of the entire lens.

Quotient lenses

Lenses were designed to address the view-update problem for sets of records (as in database records). This history shows in their assumption of an “original” that can be meaningfully used in the put direction to recover a concrete value.

In the case where a small, simple concrete AST may result in the synthesis of a large, complex abstract AST in the DSL of a synthesizer, there is usually no meaningful “original”.

Fortunately, lenses can be further generalized into quotient lenses. Quotient lenses generalize lenses so that the lens laws are respected up to some equivalence relation.

Quotient lenses are derived from lenses by quotienting – by being composed with a special type of lens, a canonizer. Lenses that have been quotiented behave the same way from the outside as basic (unquotiented) lenses and as such can be further quotiented.

The utility of quotient lenses is that they allow a style where canonization, instead of being done in one step at the edges, is interleaved as part of the overall translation.

Lenses can be quotiented on the left, concrete side (with lquot) or the right, abstract side (with rquot).

Canonizers and representativeness

Lenses translate along an axis of abstraction. A lens translates from a set of concrete elements to a set of abstract elements.

A canonizer, on the other hand, translates along an axis of representativeness. A canonizer translates from a set of elements (we will call them constituent elements) to a set of representative elements.

A canonizer, like a bijective lens, has only two functions. The two functions are called canonize and choose.

The canonize function of a canonizer takes a constituent and returns a representative for it. (It is a surjection.)

The choose function of a canonizer takes a representative and returns some constituent element. (It is an injection.)

(Why “constituent”? By analogy with representative democracy, where there is a canonizer, election, that takes a set of constituents and returns a single, canonical representative. In the case of election, the representative is also a constituent, so choose is the identity function. But it is not a requirement of canonizers that the representative belong to the set of constituents – this is the limit of the analogy.)

The axis of representativeness is orthogonal to the axis of abstraction:

When we compose a canonizer with the left (concrete) side of a lens – which is called “left quotienting” – canonize returns a more abstract representative, and choose returns a more concrete constituent. (This is a more limited form of lens composition.)

But when we compose a canonizer with the right (abstract) side of a lens – which is called “right quotienting” – then canonize returns a more concrete representative, and choose returns a more abstract constituent.

(That said, lenses and canonizers are structurally the same, and any lens can be converted into a canonizer by equating “abstract” and “representative”: get becomes canonize and create becomes choose.)

Canonizers need not be total

The formal definition of canonizers requires them to be total. We work around this by treating canonizers as discerning lenses with the behavior that, if the lens fails – if the last value returned by canonize or choose is nil – then the argument is returned unchanged.

Use left-quotienting for simplification and fix-ups

Left-quotienting coarsens the concrete domain.

For canonize, this typically means we want to ignore some aspect of the original (like whitespace).

Here is a simple canonizer that strips (usually redundant) parenthesized expression nodes from an SEL tree-sitter AST:

(defun strip-parens ()
  "Canonizer that strips parentheses."
  (l:canonizer (lambda (c)
                 (match c
                   ((parenthesized-expression-ast
                     :children `(,child))
                    child)))
               #'identity))

Except for the discerning behavior, left-quotienting behaves the same as composition. (If strip-parens were a lens, it would fail for any input that wasn’t a parenthesized expression, but as a canonizer it just returns the input unchanged.)

For choose, this typically means we want to “fix up” the concrete domain to respect some invariant (like operator precedence, or indentation, or interleaved text).

Use right-quotienting for further abstraction

Right-quotienting “coarsens the abstract domain.” This calls for illustration.

Right-quotienting for canonizing human input

One potential use for right-quotienting is handling human input. Suppose you have a lens that translates between a brittle, verbose format into a simpler one that a human can easily edit. You might want to ignore incidental aspects of the human-editable format (whitespace, indentation, case).

Right quotienting provides a principled way to do this input canonization. We define a canonizer with a canonize function that takes the messy user input and returns a cleaned-up version as its ideal representative. (Optionally, we could also define a choose function that pretty-prints that ideal representation.)

(defun input-canonizer ()
  (l:canonizer
   (lambda (c)
     (when (stringp c)
       (string-downcase
        (collapse-whitespace
         (remove #\Newline c)))))
   #'identity))

(defun my-lens* ()
  (l:rquot
   (input-canonizer)
   (my-lens)))
Right-quotienting for handling identifiers

Another potential use for right quotienting is handling identifiers. Assuming we have already performed a structural translation (e.g. into S-expressions) we might want to further translate only the identifers.

A real example: A lens implementation of FPCore translates between a generic S-expression based representation and an FPCore-specific one.

The generic S-expression representation only uses keywords. The FPCore lens, however, uses symbols interned in two specific packages: one constant package that contains only the FPCore builtins, and another “working” package that is used to intern symbols for parameters. (This distinction is important so that we can analyze FPCore expressions to find what parameters they expect.)

The goal is a trivial lens that recognizes Lisp keywords and quotients them with a canonizer.

(defun fpcore-symbol-lens ()
  "Converts between keywords and FPCore symbols."
  (l:rquot
   (fpcore-symbol-canonizer)
    (l:check #'keywordp)))

How to define the canonizer? In this case the representative value is a keyword, while the constituent value is a symbol interned in one of the FPCore packages.

We need a canonize function that gets the representative value by converting any symbol it receives into a keyword:

(lambda (c)
  (and c
       (symbolp c)
       (values (make-keyword (string c)) t)))

And we need a choose function that gets a constituent value by interning the keyword's string in the appropriate package:

(lambda (r)
  ;; Second value as intern may return a second value of nil.
  (values (intern-fpcore-symbol (string r))
          t))

These can then be combined into the canonizer we want:

(defun fpcore-symbol-canonizer ()
  "Translate between keywords and FPCore symbols, when possible."
  (l:canonizer
   (lambda (r)
     (and r
          (symbolp r)
          (values (make-keyword (string r)) t)))
   (lambda (c)
     ;; Second value as intern may return a second value of nil.
     (values (intern-fpcore-symbol (string c))
             t))))

Debugging

Since lenses are constructed, then called, they can be difficult to debug. You can use l:trace (and l:untrace) to print information about successful matches. This doesn’t help when matches are total failures, but is very helpful when the lens isn’t returning what you expect.

Copyright and Acknowledgments

Copyright (C) 2020 GrammaTech, Inc.

This code is licensed under the MIT license. See the LICENSE file in the project root for license terms.

This material is based upon work supported by the US Air Force, AFRL/RIKE and DARPA under Contract No. FA8750-20-C-0208. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US Air Force, AFRL/RIKE or DARPA.

Dependencies (5)

  • cl-interpol
  • cl-utils
  • fare-quasiquote
  • stefil-
  • trivial-package-local-nicknames

Dependents (0)

    • GitHub
    • Quicklisp
    • Sponsor