2018-01-31

edit-distance

Using

Computes the edit distance (aka Levenshtein distance) between two sequences. This is a common distance measure.

For example, to compute the distance between two sequences:

``(edit-distance:distance '(1 2 3 4 5) '(2 3 4 5 6))``

To compute the sequence of edit operations needed to convert sequence 1 into sequence two, use the `diff` function

``(edit-distance:diff '(1 2 3 4 5) '(2 3 4 5 6))``

That will return two values a structure, as follows, and the distance.

``````((:DELETION 1 NIL) (:MATCH 2 2) (:MATCH 3 3) (:MATCH 4 4) (:MATCH 5 5) (:INSERTION NIL 6))
2``````

That structure can be printed more readibly with the `FORMAT-DIFF` function

``(edit-distance:format-diff *path*)``

Or, you can compute the diff and print it readably together by calling `PRINT-DIFF`:

``(edit-distance:print-diff '(1 2 3 4 5) '(2 3 4 5 6))``

which will print a result like this:

``````seq1: 1   2 3 4 5 *** []
seq2: *** 2 3 4 5 6   []``````

Several options are available to the `FORMAT-DIFF` and `PRINT-DIFF` to print a prefix and suffix for each line. Note that displaying substitutions relys on captialization and so substitutions are not visible for non-alphabetic sequence elements.

Additionally, other equality functions can be used, so this evaluates to a distance of zero:

`````` (edit-distance:distance '("ONE" "TWO" "THREE") '("one" "two"
"three") :test 'string-equal)``````

Any type of sequence may be used, but for speed the input sequences are copied into new arrays. If speed is a major concern make sure to provide simple vectors as your input sequences.

Testing

To test with sbcl, run:

``sbcl --eval "(asdf:load-system 'edit-distance-test)" --eval "(unless (lisp-unit:run-tests :all :edit-distance-tests) (uiop:quit 1))"`` 