cl-html-parse
2023-10-21
HTML Parser
Upstream URL
Author
Gary Warren King <gwking@metabang.com>
Maintainer
Gary Warren King <gwking@metabang.com>
License
MIT Style license for the packaging, AllegroServe license for the code.
NOTE
This is unmaintained code. Gary King is no longer an active Lisper. Good luck out there. If you'd like to take it over, let me know.
Introduction
CL-HTML-PARSE is a slash-and-burn port of Franz's phtml HTML parser.
I made a few small changes to the source, mainly involving the call to excl:intern* in collect-tag. In particular, all tags are now string-upcased before being interned. There is probably a better solution, but I'm not very Allegro-savvy.
Any questions or concerns should be directed to me, Gary Warren King at gwking@metabang.com.
If the feature :use-acl-compat is present in *features*, cl-html-parse depends on acl-compat; otherwise, it uses its own package and if* definitions.
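As a rough sketch of how the feature could be set, the form below pushes :use-acl-compat onto *features* before the system is loaded. It assumes ASDF is available, that the system is named cl-html-parse, and that the dependency meant above is the acl-compat library; none of this is spelled out in the text above, so treat it as an illustration rather than the documented procedure.

```lisp
;; Assumption: ASDF can locate both cl-html-parse and acl-compat.
;; The feature has to be registered before the system definition is read.
(pushnew :use-acl-compat *features*)

;; Load the parser; with the feature set it should pick up acl-compat
;; instead of its own package and if* definitions.
(asdf:load-system :cl-html-parse)
```

If the system was already compiled without the feature, a forced recompile is probably needed for the change to take effect.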
Example #1
(html-parse:parse-html
"<html>
<head><title>Parsing HTML is Phun</title></head>
<body>
<h1>Why is it phun?</h1>
<p>Parsing HTML is phun because angle brackets are better than parentheses.</p>
</body>
</html>")
==>
((:HTML (:HEAD (:TITLE "Parsing HTML is Phun"))
  (:BODY (:H1 "Why is it phun?")
   (:P
    "Parsing HTML is phun because angle brackets are better than parentheses."))))
Example #2
(html-parse:parse-html <file>)
==>
((:HTML ...))
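Since the parser returns plain nested lists, the result can be processed with ordinary list functions. The sketch below walks the tree printed in Example #1 and gathers its text. The tree is quoted as literal data so the code runs without the library loaded, and collect-text is a hypothetical helper, not part of cl-html-parse.

```lisp
;; The parse tree from Example #1, written out as literal data.
(defparameter *example-tree*
  '((:html (:head (:title "Parsing HTML is Phun"))
     (:body (:h1 "Why is it phun?")
      (:p "Parsing HTML is phun because angle brackets are better than parentheses.")))))

;; COLLECT-TEXT is a hypothetical helper for illustration only.
(defun collect-text (node)
  "Return a fresh list of every string found in NODE, in document order."
  (cond ((stringp node) (list node))                     ; text content
        ((consp node) (mapcan #'collect-text node))      ; descend into sublists
        (t nil)))                                        ; tag keywords and other atoms

(collect-text *example-tree*)
```

The helper collects every string it meets, so on input where tags carry attributes it would pick up attribute values as well; a real application would likely want to skip those.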