HenryS1/pears

Pears

A combinator parsing library for Common Lisp

Motivation

User input and structured data need to be converted into your domain model before they can be used. Regular expressions are popular, but they have serious limitations for interpreting structured input. A simple example is that a regular expression can’t match balanced parentheses. Parsers can consume complicated user input, but handwritten parsers can themselves be complicated, and concerns such as consuming input and backtracking obscure the parsing logic.

Combinator parsers address the complexity of parsing by abstracting backtracking, buffering and input processing. A good example of a combinator parser is Megaparsec from Haskell. This library provides a similar parsing facility for Common Lisp.

High level overview

The two main operators for creating parsers are orp and sequential. The orp form tries each of the provided parsers in turn until one matches, backtracking on failure. The sequential form creates a parser which applies a list of parsers to the input in order, binding each parsed result to a variable. The final form in sequential is the value returned by the parser. If any parser in the list supplied to sequential fails, then the whole parser fails.
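To make the idea of alternation with backtracking and of sequencing concrete, here is a minimal sketch in plain Common Lisp. It is a toy model, not the pears implementation: the names toy-one, toy-or and toy-then are hypothetical, and the representation of a parser as a function of (input position) returning (value . next-position) or NIL is an assumption made only for this illustration.

```lisp
;; Toy model (NOT the pears implementation). A parser is a function of
;; (input position) that returns (value . next-position) on success
;; and NIL on failure.

(defun toy-one (predicate)
  "Match a single character satisfying PREDICATE."
  (lambda (input pos)
    (when (and (< pos (length input))
               (funcall predicate (char input pos)))
      (cons (char input pos) (1+ pos)))))

(defun toy-or (&rest parsers)
  "Try each parser from the same position; the first success wins."
  (lambda (input pos)
    (loop for parser in parsers
          for result = (funcall parser input pos)
          when result return result)))

(defun toy-then (first-parser second-parser combine)
  "Run FIRST-PARSER, then SECOND-PARSER from where the first stopped.
Fails if either fails; otherwise combines both values with COMBINE."
  (lambda (input pos)
    (let ((r1 (funcall first-parser input pos)))
      (when r1
        (let ((r2 (funcall second-parser input (cdr r1))))
          (when r2
            (cons (funcall combine (car r1) (car r2)) (cdr r2))))))))

;; An alphabetical character or an asterisk.
(defparameter *letter-or-star*
  (toy-or (toy-one #'alpha-char-p)
          (toy-one (lambda (c) (char= c #\*)))))

;; An alphabetical character followed by a digit, returned as a list.
(defparameter *letter-then-digit*
  (toy-then (toy-one #'alpha-char-p)
            (toy-one #'digit-char-p)
            #'list))
```

Because the position is passed explicitly, a failed alternative leaves nothing to undo: the next alternative simply starts again from the same position. That is the sense in which backtracking is abstracted away in this sketch.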

Creating a parser

A parser is most easily created by combining the built-in combinators.

one

(stream-element -> bool) -> parser

Create a parser which matches exactly one stream element satisfying the provided predicate.

(one (lambda (c) (char= c #\a))) ;; matches one 'a' character

many

(stream-element -> bool) -> parser

Create a parser which matches zero or more stream elements satisfying the provided predicate.

(many #'digit-char-p) ;; matches zero or more digits

many1

(stream-element -> bool) -> parser

Like many but requires at least one match to succeed. Returns a sequence of matching elements from the input stream.

(many1 #'alpha-char-p) ;; matches a non-empty sequence of alphabetical characters

manyn

(stream-element -> bool, integer) -> parser

Create a parser which matches n successive stream elements satisfying the provided predicate.

(manyn #'alpha-char-p 4) ;; matches four alphabetical characters
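The difference between many, many1 and manyn can be sketched with a toy model in plain Common Lisp. This is not how pears is implemented: the names toy-many, toy-many1, toy-manyn and toy-span-end are hypothetical, and a parser is assumed here to be a function of (input position) returning (value . next-position) on success or NIL on failure.

```lisp
;; Toy model (NOT the pears implementation) of predicate-based matching.

(defun toy-span-end (predicate input start)
  "Index of the first element at or after START that fails PREDICATE,
or the length of INPUT if every remaining element satisfies it."
  (or (position-if (complement predicate) input :start start)
      (length input)))

(defun toy-many (predicate)
  "Zero or more matching elements; always succeeds."
  (lambda (input pos)
    (let ((stop (toy-span-end predicate input pos)))
      (cons (subseq input pos stop) stop))))

(defun toy-many1 (predicate)
  "One or more matching elements; fails on an empty match."
  (lambda (input pos)
    (let ((stop (toy-span-end predicate input pos)))
      (when (> stop pos)
        (cons (subseq input pos stop) stop)))))

(defun toy-manyn (predicate n)
  "The next N elements, all of which must satisfy PREDICATE."
  (lambda (input pos)
    (let ((stop (+ pos n)))
      (when (and (<= stop (length input))
                 (every predicate (subseq input pos stop)))
        (cons (subseq input pos stop) stop)))))
```

In this sketch toy-many succeeds with an empty sequence when nothing matches, whereas toy-many1 fails in the same situation. Whether pears' manyn rejects input with more than n matching elements is not stated above; the toy version simply takes the first n and leaves the rest unconsumed.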

sequential

Syntactic sugar for a form which applies the listed parsers to the input, binding the parsed results to variables. Returns the value of the last provided form as the parsed result. Works similarly to a let binding.

(sequential (a (one #'alpha-char-p))
            (b (one #'digit-char-p))
            (list a b))
;; parses an alphabetical character and then a digit and returns a list
;; containing them

orp

parser* -> parser

Syntactic sugar for a form which tries to match each of the provided parsers. If one fails, orp backtracks and tries the next.

(orp (one #'alpha-char-p)
     (one (lambda (c) (char= c #\*))))
;; parses an alphabetical character or an asterisk

repeated

parser -> parser

Repeatedly apply a parser, returning a list of zero or more matches.

(repeated (sequential (a (one #'alpha-char-p))
                      (b (one #'digit-char-p))
                      (list a b)))
;; repeatedly parses an alphabetical character followed by a digit,
;; returning a list of parsed results

repeated1

parser -> parser

Repeatedly apply a parser, returning a list of one or more matches. Similar to repeated, but fails if there isn’t at least one match for the parser.

sep-by

(parser, parser) -> parser

Use the first parser to parse values and the second parser to parse a separator. Collect the values into a list.

(sep-by (many1 #'alpha-char-p) (char1 #\,))
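The way values and separators interleave can be sketched with a toy model in plain Common Lisp. This is not the pears implementation: toy-word, toy-comma and toy-sep-by are hypothetical names, a parser is assumed to be a function of (input position) returning (value . next-position) or NIL, and treating an empty input as a successful parse of zero values is an assumption of this sketch rather than documented pears behaviour.

```lisp
;; Toy model (NOT the pears implementation) of separated values.

(defun toy-word (input pos)
  "Parse one or more alphabetical characters as a string."
  (let ((stop (or (position-if (complement #'alpha-char-p) input :start pos)
                  (length input))))
    (when (> stop pos)
      (cons (subseq input pos stop) stop))))

(defun toy-comma (input pos)
  "Parse a single comma."
  (when (and (< pos (length input))
             (char= (char input pos) #\,))
    (cons #\, (1+ pos))))

(defun toy-sep-by (value-parser separator-parser)
  "Collect values parsed by VALUE-PARSER, separated by SEPARATOR-PARSER."
  (lambda (input pos)
    (let ((head (funcall value-parser input pos)))
      (if (null head)
          ;; Assumption: no values at all is a success with an empty list.
          (cons '() pos)
          (loop with items = (list (car head))
                with stop = (cdr head)
                for sep = (funcall separator-parser input stop)
                for next = (and sep (funcall value-parser input (cdr sep)))
                ;; A separator only counts if a value follows it.
                while next
                do (push (car next) items)
                   (setf stop (cdr next))
                finally (return (cons (nreverse items) stop)))))))
```

Note that a trailing separator with no value after it is left unconsumed in this sketch, since the position only advances once both the separator and the following value have matched.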

discard

(stream-element -> bool) -> parser

Create a parser that discards stream-elements matching the provided predicate.

(discard #'digit-char-p)

ignore whitespace

parser

A parser that discards whitespace.

seq

stream-element* -> parser

Parse the provided sequence of stream elements.

(seq "true")

optional

parser -> parser

Applies the provided parser zero or one times to the input.

(optional (char1 #\-))
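As an illustration of where an optional parser is useful, the following toy model in plain Common Lisp parses an integer with an optional leading minus sign. It is not the pears implementation: toy-optional, toy-minus and toy-signed-integer are hypothetical names, a parser is assumed to be a function of (input position) returning (value . next-position) or NIL, and returning NIL as the value when the optional parser does not match is an assumption of this sketch.

```lisp
;; Toy model (NOT the pears implementation) of an optional parser.

(defun toy-optional (parser)
  "Succeed with PARSER's result, or with a NIL value and no input consumed."
  (lambda (input pos)
    (or (funcall parser input pos)
        (cons nil pos))))

(defun toy-minus (input pos)
  "Parse a single minus sign."
  (when (and (< pos (length input))
             (char= (char input pos) #\-))
    (cons #\- (1+ pos))))

(defun toy-signed-integer (input pos)
  "Parse an optional minus sign followed by one or more digits."
  (let* ((sign (funcall (toy-optional #'toy-minus) input pos))
         (start (cdr sign))
         (stop (or (position-if (complement #'digit-char-p) input :start start)
                   (length input))))
    ;; At least one digit is required after the optional sign.
    (when (> stop start)
      (let ((magnitude (parse-integer input :start start :end stop)))
        (cons (if (car sign) (- magnitude) magnitude) stop)))))
```

The optional step never fails in this sketch, so the caller only has to check whether a sign was actually found before negating the result.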
