proc-parse
2019-08-13
Procedural vector parser
Proc-Parse
Question: Are these parser macros for speed or just to make your application look cool?
Answer: Both.
This is a string/octets parser library for Common Lisp with speed and readability in mind. Unlike other libraries, the code is not a pattern-matching-like, but a char-by-char procedural parser.
Although the design is good for speed, the code could look ugly with tagbody
and go
. Proc-Parse wraps the code with sexy macros.
I believe we don't have to give up speed for the readability while we use Common Lisp.
Usage
(defun parse-url-scheme (data &key (start 0) end) "Return a URL scheme of DATA as a string." (declare (optimize (speed 3) (safety 0) (debug 0))) (block nil (with-vector-parsing (data :start start :end end) (match-i-case ("http:" (return "http")) ("https:" (return "https")) (otherwise (unless (standard-alpha-char-p (current)) (return nil)) (bind (scheme (skip* (not #\:))) (return scheme)))))))
API
with-vector-parsing
- can parse both string and octets.
(with-vector-parsing ("It's Tuesday!" :start 5 :end 12) (bind (str (skip-until (lambda (c) (declare (ignore c)) (eofp)))) (print str))) ; "Tuesday" (with-vector-parsing ((babel:string-to-octets "It's Tuesday!") :start 5 :end 12) (bind (str (skip-until (lambda (c) (declare (ignore c)) (eofp)))) (print str))) ; "Tuesday"
with-string-parsing
- can parse string.
(with-string-parsing ("It's Tuesday!" :start 5 :end 12) (bind (str (skip-until (lambda (c) (declare (ignore c)) (eofp)))) (print str))) ; "Tuesday"
with-octets-parsing
- can parse octets.
(with-octets-parsing ((babel:string-to-octets "It's Tuesday!") :start 5 :end 12) (bind (str (skip-until (lambda (c) (declare (ignore c)) (eofp)))) (print str))) ; "Tuesday"
eofp
- can return EOF or not.
(with-vector-parsing ("hello") (print (eofp)) ; NIL (match "hello") (print (eofp))) ; T
current
- can return the character of the current position.
(with-vector-parsing ("hello") (print (current)) ; #\h (skip #\h) (print (current))) ; #\e
peek
- can peek next character from the current position
(with-vector-parsing ("hello") (print (current)) ; #\h (print (peek)) ; #\e (print (current))) ; #\h
- and you can specify the eof-value
(with-vector-parsing ("hello") (match "hell") (print (pos)) ; #\4 (print (peek :eof-value 'yes))) ; YES
pos
- can return the current position.
(with-vector-parsing ("hello") (print (pos)) ; 0 (skip #\h) (print (pos))) ; 1
advance
- can put the current postion forward.
- can cease parsing with EOF.
(with-vector-parsing ("hello") (print (current)) ; #\h (advance) (print (current)) ; #\e (match "ello") (print (current)) ; #\o (advance) (print "Hi")) ; "Hi" won't displayed.
advance*
- can put the current postion forward.
- just returns NIL with EOF.
(with-vector-parsing ("hello") (print (current)) ; #\h (advance*) (print (current)) ; #\e (match "ello") (print (current)) ; #\o (advance*) (print (current)) ; #\o (print "Hi")) ; "Hi"
skip
- can skip the specified character.
- can raise MATCH-FAILED error with unmatched characters.
(with-vector-parsing ("hello") (print (current)) ; #\h (skip #\h) (print (current)) ; #\e (skip (not #\h)) (print (current)) ; #\l (skip #\f)) ;; => Condition MATCH-FAILED was signalled.
skip*
- can skip some straignt specified characters.
- just returns NIL with unmatched characters.
(with-vector-parsing ("hello") (skip* #\h) (print (current)) ; #\e (skip* (not #\l)) (print (current)) ; #\l (skip* #\l) (print (current)) ; #\o (skip* #\f)) ; MATCH-FAILED won't be raised.
skip+
- can skip some straignt specified characters.
- can raise MATCH-FAILED error with unmatched characters.
(with-vector-parsing ("hello") (skip+ #\h) (print (current)) ; #\e (skip* (not #\l)) (print (current)) ; #\l (skip+ #\l) (print (current)) ; #\o (skip+ #\f)) ;; => Condition MATCH-FAILED was signalled.
skip?
- can skip the specified character.
- just returns NIL with unmatched characters.
(with-vector-parsing ("hello") (print (current)) ; #\h (skip? #\h) (print (current)) ; #\e (skip? (not #\h)) (print (current)) ; #\l (skip? #\f)) ; MATCH-FAILED won't be raised.
skip-until
- can skip until form returned T or parsing reached EOF.
(with-vector-parsing ("hello") (skip-until (lambda (char) (char= char #\o))) (print (current)) ; #\o (print (eofp)) ; NIL (skip-until (lambda (char) (char= char #\f))) (print (eofp))) ; T
skip-while
- can skip while form returns T and parsing doesn't reach EOF.
(with-vector-parsing ("hello") (skip-while (lambda (char) (char/= char #\o))) (print (current)) ; #\o (print (eofp)) ; NIL (skip-while (lambda (char) (char/= char #\f))) (print (eofp))) ; T
bind
- can bind subseqed string.
(with-vector-parsing ("hello") (bind (str1 (skip-until (lambda (c) (char= c #\l)))) (print str1)) ; "he" (bind (str2 (skip* (not #\f))) (print str2))) ; "llo"
match
- can skip matched one of the specified strings.
- can raise MATCH-FAILED error with unmatched characters.
(with-vector-parsing ("hello") (match "he") (print (current)) ; #\l (match "l" "ll") (print (current)) ; #\o (match "f")) ;; => Condition MATCH-FAILED was signalled.
match-i
- can skip case-insensitively matched one of the specified strings.
- can raise MATCH-FAILED error with case-insensitively unmatched characters.
(with-vector-parsing ("hello") (match-i "He") (print (current)) ; #\l (match-i "L" "LL") (print (current)) ; #\o (match-i "F")) ;; => Condition MATCH-FAILED was signalled.
match?
- can skip matched one of the specified strings.
- just returns NIL with unmatched characters.
(with-vector-parsing ("hello") (match? "he") (print (current)) ; #\l (match? "l" "ll") (print (current)) ; #\o (match? "f")) ; MATCH-FAILED won't be raised.
match-case
- can dispatch to the matched case.
- aborts parsing when reaching EOF.
(with-vector-parsing ("hello") (match-case ("he" (print 0)) ("ll" (print 1)) (otherwise (print 2))) ; 0 (print (current)) ; #\l (match-case ("he" (print 0)) ("ll" (print 1)) (otherwise (print 2))) ; 1 (print (current)) ; #\o (match-case ("he" (print 0)) ("ll" (print 1)) (otherwise (print 2))) ; 2 (print (current)) ; #\o (match-case ("he" (print 0)) ("ll" (print 1)))) ;; => Condition MATCH-FAILED was signalled. (with-vector-parsing ("hello") (print (match-case ("hello" 0))) ;; Nothing will be printed. (print "It shold not be printed.")) ;; Nothing will be printed. ;; => NIL
match-i-case
- can dispatch to the case-insensitively matched case.
- aborts parsing when reaching EOF.
(with-vector-parsing ("hello") (match-i-case ("He" (print 0)) ("LL" (print 1)) (otherwise (print 2))) ; 0 (print (current)) ; #\l (match-i-case ("He" (print 0)) ("LL" (print 1)) (otherwise (print 2))) ; 1 (print (current)) ; #\o (match-i-case ("He" (print 0)) ("LL" (print 1)) (otherwise (print 2))) ; 2 (print (current)) ; #\o (match-i-case ("He" (print 0)) ("LL" (print 1)))) ;; => Condition MATCH-FAILED was signalled. (with-vector-parsing ("hello") (print (match-i-case ("Hello" 0))) ;; Nothing will be printed. (print "It shold not be printed.")) ;; Nothing will be printed. ;; => NIL
match-failed
- is the condition representing failure of matching.
(with-vector-parsing ("hello") (print (current)) ; #\h (skip #\f)) ;; => Condition MATCH-FAILED was signalled.
Author
- Eitaro Fukamachi
- Rudolph Miller
Copyright
Copyright (c) 2015 Eitaro Fukamachi & Rudolph Miller
License
Licensed under the BSD 2-Clause License.