tur/parsec

stdlib/parsec.tur
── Backtracking monad (inlined from backtrack.tur) ──────────────────────────

defn

mzero

(mzero :int)

the empty backtracking result list.

0, the null pointer representing no results.

  (mzero)  ; => 0
defn

mreturn

(mreturn [x] :int)

lift a single value into the backtracking monad.

x: the int64_t value to wrap.
A newly allocated Cell containing x with a null next pointer.

  (mreturn 42)  ; => Cell{value=42, next=NULL}
defn

mplus

(mplus [xs ys] :int)

concatenate two backtracking result lists.

xs: the first Cell list (copied into a fresh list).
ys: the second Cell list (appended by pointer, not copied).
A new Cell list containing all results from xs followed by ys.

  (mplus (mreturn 1) (mreturn 2))  ; => Cell{1} -> Cell{2}
defn

mbind

(mbind [ma fn] :int)

monadic bind for the backtracking monad.

ma: a Cell list of values to thread through fn.
fn: a fat closure of type int64_t -> Cell list.
The concatenated results of applying fn to every value in ma,
  preserving original order.

  (mbind (mreturn 1) (fn [x] (mreturn (* x 2))))  ; => Cell{2}
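To make the list semantics of mreturn, mplus, and mbind concrete, here is a minimal Python model. It is a sketch under assumptions, not the tur or C implementation: the Cell class and the to_list helper are hypothetical names introduced for illustration, and Python's None stands in for the null pointer 0.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Cell:
    value: int
    next: Optional["Cell"] = None


def mzero() -> Optional[Cell]:
    return None  # models the 0 / null pointer


def mreturn(x: int) -> Cell:
    return Cell(x)


def mplus(xs: Optional[Cell], ys: Optional[Cell]) -> Optional[Cell]:
    # Copy xs cell by cell, then share ys by reference at the tail.
    if xs is None:
        return ys
    return Cell(xs.value, mplus(xs.next, ys))


def mbind(ma: Optional[Cell], fn: Callable[[int], Optional[Cell]]) -> Optional[Cell]:
    # Apply fn to each value in order and concatenate the resulting lists.
    if ma is None:
        return None
    return mplus(fn(ma.value), mbind(ma.next, fn))


def to_list(xs: Optional[Cell]) -> list:
    # Hypothetical helper: flatten a Cell list for printing.
    out = []
    while xs is not None:
        out.append(xs.value)
        xs = xs.next
    return out


pair = mplus(mreturn(1), mreturn(2))
doubled = mbind(pair, lambda x: mplus(mreturn(x), mreturn(x * 2)))
print(to_list(doubled))  # [1, 2, 2, 4]
```

In this model the second argument of mplus is never copied, so `mplus(mreturn(0), pair).next is pair` holds, mirroring the "appended by pointer" note on ys.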
defn

bt-nil

(bt-nil :int)

return the empty backtracking list.

0 (null pointer).

defn

bt-cons

(bt-cons [v nxt] :int)

prepend a value onto a backtracking Cell list.

v: the int64_t value to prepend.
nxt: the existing Cell list to prepend onto.
A new Cell with value v pointing to nxt.

  (bt-cons 3 (bt-nil))  ; => Cell{3, NULL}
defn

bt-length

(bt-length [xs] :int)

count elements in a backtracking Cell list.

xs: the Cell list to measure.
The number of cells in the list as an int64_t.

  (bt-length (bt-cons 1 (bt-cons 2 (bt-nil))))  ; => 2
── Pair helpers ─────────────────────────────────────────────────────────────

defn

pair-new

(pair-new [a b] :int)

allocate a Pair holding two values.

a: the first field value.
b: the second field value.
Pointer (as int64_t) to a heap-allocated Pair{first=a, second=b}.

  (pair-new 1 2)  ; => Pair{first=1, second=2}
defn

pair-first

(pair-first [p] :int)

extract the first field from a Pair.

p: pointer (as int64_t) to a Pair.
The int64_t value stored in the first field.

  (pair-first (pair-new 7 8))  ; => 7
defn

pair-second

(pair-second [p] :int)

extract the second field from a Pair.

p: pointer (as int64_t) to a Pair.
The int64_t value stored in the second field.

  (pair-second (pair-new 7 8))  ; => 8
── Input ────────────────────────────────────────────────────────────────────

defn

input-new

(input-new [s :cstr pos] :int)

create an Input over a string at a given position.

s: the source string (:cstr).
pos: the initial character position (0-based).
Pointer (as int64_t) to a heap-allocated Input{str, pos}.

  (input-new "hello" 0)  ; => Input at position 0
defn

input-at-end

(input-at-end [inp] :bool)

test whether an Input is at the end of its string.

inp: pointer (as int64_t) to an Input.
true if the current character is '\0', false otherwise.

  (input-at-end (input-new "" 0))  ; => true
defn

input-current-char

(input-current-char [inp] :int)

return the character at the current position.

inp: pointer (as int64_t) to an Input.
The current character's byte value (its ASCII code, read as an unsigned byte) as an int64_t.

  (input-current-char (input-new "abc" 0))  ; => 97  ; 'a'
defn

input-advance

(input-advance [inp] :int)

return a new Input advanced by one character.

inp: pointer (as int64_t) to the current Input.
A new heap-allocated Input with pos incremented by 1.

  (input-pos (input-advance (input-new "ab" 0)))  ; => 1
defn

input-pos

(input-pos [inp] :int)

return the current position index of an Input.

inp: pointer (as int64_t) to an Input.
The int64_t character offset into the source string.

  (input-pos (input-new "abc" 2))  ; => 2
defn

input-remaining

(input-remaining [inp] :int)

count bytes remaining in an Input.

inp: pointer (as int64_t) to an Input.
The number of bytes from the current position to the terminating '\0'.

  (input-remaining (input-new "abc" 1))  ; => 2
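The Input helpers above treat an Input as a string plus an offset, and input-advance returns a new Input instead of mutating the old one. A small Python model of that behaviour follows. It is only a sketch: the NamedTuple layout and the field name data are assumptions, and a bytes value stands in for the NUL-terminated :cstr.

```python
from typing import NamedTuple


class Input(NamedTuple):
    data: bytes  # models the :cstr contents, without the terminator
    pos: int     # 0-based offset into data


def input_at_end(inp: Input) -> bool:
    return inp.pos >= len(inp.data)


def input_current_char(inp: Input) -> int:
    # Reading at the end yields 0, like reading the '\0' terminator in C.
    return 0 if input_at_end(inp) else inp.data[inp.pos]


def input_advance(inp: Input) -> Input:
    # A new value is returned; the original Input is left untouched.
    return Input(inp.data, inp.pos + 1)


def input_remaining(inp: Input) -> int:
    return max(0, len(inp.data) - inp.pos)


start = Input(b"abc", 0)
after = input_advance(start)
print(start.pos, after.pos)       # 0 1
print(input_current_char(after))  # 98
print(input_remaining(after))     # 2
```

Because start is unchanged after the call, two alternative parsers can both be run from the same Input, which is what backtracking relies on.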
── Fat-closure and parser application ───────────────────────────────────────

defn

apply-fat

(apply-fat [f arg] :int)

call a fat closure with a single argument.

f: pointer (as int64_t) to a fat closure.
arg: the single int64_t argument to pass.
The int64_t return value of the closure.

  (apply-fat my-fn 42)  ; => result of calling my-fn with 42
defn

apply-parser

(apply-parser [p inp] :int)

run a parser fat closure against an Input.

p: pointer (as int64_t) to a parser fat closure.
inp: pointer (as int64_t) to the Input to parse.
A Cell list of Pair(value, remaining-Input) results.

  (apply-parser (item) (input-new "a" 0))  ; => Cell{Pair(97, Input@1)}
── Core parsers ─────────────────────────────────────────────────────────────

defn

pfail

(pfail :ptr<void>)

parser that always fails.

A parser that returns mzero for any input.

  (run-parser (pfail) "abc")  ; => 0  (no results)
defn

pure-impl

(pure-impl [_v inp] :int)

internal implementation for pure.

_v: the value to return.
inp: the current Input (passed through unchanged).
defn

pure

(pure [v] :ptr<void>)

parser that always succeeds with a fixed value, consuming no input.

v: the value to embed in the parse result.
A parser that returns Pair(v, inp) without advancing the input.

  (run-parser (pure 99) "")  ; => Cell{Pair(99, Input@0)}
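The pairing of pure with pure-impl suggests that a parser is a function of the Input with some values captured ahead of time. The Python sketch below models that reading with functools.partial. How tur actually lays out a fat closure is not described here, so the capture mechanism and the tuple standing in for Input are assumptions made for illustration.

```python
from functools import partial


def pure_impl(v, inp):
    # Succeed once with v and hand back the Input unchanged.
    return [(v, inp)]


def pure(v):
    # Capture v now; the resulting callable only needs the Input later.
    return partial(pure_impl, v)


def apply_parser(p, inp):
    # Running a parser means calling it on an Input.
    return p(inp)


inp = ("", 0)  # hypothetical stand-in for Input{str, pos}
print(apply_parser(pure(99), inp))  # [(99, ('', 0))]
```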
defn

item-impl

(item-impl [inp] :int)

internal implementation for item.

inp: the current Input.
defn

item

(item :ptr<void>)

parser that consumes exactly one character.

A parser yielding the ASCII code of the next character, or
  failure if at end of input.

  (parse-value (run-parser (item) "A"))  ; => 65  ; 'A'
defn

pchar-impl

(pchar-impl [_c inp] :int)

internal implementation for pchar.

_c: the expected ASCII character code.
inp: the current Input.
defn

pchar

(pchar [c] :ptr<void>)

parser that matches a single specific character.

c: the ASCII integer code of the character to match.
A parser that succeeds with c if the next character equals c,
  or fails otherwise.

  (parse-value (run-parser (pchar 65) "ABC"))  ; => 65  ; 'A'
defn

pstring-c-impl

(pstring-c-impl [s_raw inp] :int)

internal C implementation for pstring.

s_raw: the pattern string as a raw int64_t pointer.
inp: the current Input.
defn

cstr->int

(cstr->int [s :cstr] :int)

reinterpret a :cstr pointer as an int64_t without copying.

s: the C string to reinterpret.

The raw pointer value of s as an int64_t.

defn

pstring

(pstring [s :cstr] :ptr<void>)

parser that matches a literal string prefix.

s: the expected string literal (:cstr).
A parser that succeeds with s (as int64_t pointer) and advances
  past it, or fails if the input does not start with s.

  (run-parser (pstring "hi") "hi there")  ; => Cell{Pair("hi", Input@2)}
── Combinators ──────────────────────────────────────────────────────────────

defn

or-parser-impl

(or-parser-impl [lp lq inp] :int)

internal implementation for or-parser.

lp: the first parser (int64_t fat-closure pointer).
lq: the second parser (int64_t fat-closure pointer).
inp: the current Input.
defn

or-parser

(or-parser [p q] :ptr<void>)

try two parsers and return all successes from either.

p: the first parser to try.
q: the second parser to try.
A parser that returns the concatenation of results from p and q
  applied to the same input (full backtracking).

  (run-parser (or-parser (pchar 65) (pchar 66)) "B")  ; => Cell{Pair(66, ...)}
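The "all successes from either" behaviour can be modelled in a few lines of Python. This is a sketch, not the library's code: parsers are represented here as functions from (string, position) to a list of (value, new position) pairs, which is an assumed stand-in for the Cell list of Pairs.

```python
def pchar(c):
    def parse(s, pos):
        # Succeed with c when the next character has code c.
        if pos < len(s) and ord(s[pos]) == c:
            return [(c, pos + 1)]
        return []
    return parse


def or_parser(p, q):
    def parse(s, pos):
        # Both parsers start from the same position; results are concatenated.
        return p(s, pos) + q(s, pos)
    return parse


a_or_b = or_parser(pchar(65), pchar(66))
print(a_or_b("B", 0))                           # [(66, 1)]
print(or_parser(pchar(65), pchar(65))("A", 0))  # [(65, 1), (65, 1)]
```

The second call shows that results are not deduplicated in this model: when both branches succeed, both results are kept.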
defn

bind-parser-inner

(bind-parser-inner [lf pair] :int)

apply continuation f to a single parse Pair.

lf: the continuation fat closure (value -> Parser).
pair: a Pair(value, remaining-Input) from the first parser.
defn

bind-parser-impl

(bind-parser-impl [lp lf inp] :int)

internal implementation for bind-parser.

lp: the first parser.
lf: the continuation fat closure.
inp: the current Input.
defn

bind-parser

(bind-parser [p f] :ptr<void>)

sequence two parsers, threading the first result into the second.

p: the first parser.
f: a fat closure of type int64_t -> Parser; receives the parsed value
  and returns the next parser to run.
A parser that runs p, then for each result calls (f value) and runs
  the resulting parser on the remaining input.

  (bind-parser (item) (fn [c] (pure c)))  ; identity parser
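As an illustration of how bind-parser threads each result into the continuation, here is a Python model using the same assumed representation (a parser is a function from string and position to a list of (value, new position) pairs). Unlike the tur API, item is written here directly as a parser function, not as a call that returns one.

```python
def item(s, pos):
    # Consume one character, or fail at end of input.
    return [(ord(s[pos]), pos + 1)] if pos < len(s) else []


def pchar(c):
    def parse(s, pos):
        if pos < len(s) and ord(s[pos]) == c:
            return [(c, pos + 1)]
        return []
    return parse


def bind_parser(p, f):
    def parse(s, pos):
        results = []
        for value, rest in p(s, pos):
            # f picks the next parser from the value; it runs on what is left.
            results.extend(f(value)(s, rest))
        return results
    return parse


# Read any character, then require the same character again.
same_twice = bind_parser(item, lambda c: pchar(c))
print(same_twice("aa", 0))  # [(97, 2)]
print(same_twice("ab", 0))  # []
```

The second parser depends on the value produced by the first, which is the case that needs bind-parser and cannot be expressed with then-parser alone.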
defn

then-parser-impl

(then-parser-impl [lp lq inp] :int)

internal implementation for then-parser.

lp: the first parser (result discarded).
lq: the second parser (result kept).
inp: the current Input.
defn

then-parser

(then-parser [p q] :ptr<void>)

sequence two parsers, discarding the first result.

p: parser whose result is discarded.
q: parser whose result is kept.
A parser that runs p, then runs q on the remaining input,
  returning only q's results.

  (run-parser (then-parser (pchar 40) (item)) "(x")
  ; => Cell{Pair(120, Input@2)}  ; 'x' only
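The then-parser example above can be reproduced in a small Python model. As before, this is a sketch under an assumed representation (parser = function from string and position to a list of (value, new position) pairs), not the actual implementation.

```python
def item(s, pos):
    return [(ord(s[pos]), pos + 1)] if pos < len(s) else []


def pchar(c):
    def parse(s, pos):
        if pos < len(s) and ord(s[pos]) == c:
            return [(c, pos + 1)]
        return []
    return parse


def then_parser(p, q):
    def parse(s, pos):
        results = []
        for _ignored, rest in p(s, pos):
            # Only the position after p matters; p's value is dropped.
            results.extend(q(s, rest))
        return results
    return parse


paren_then_any = then_parser(pchar(40), item)
print(paren_then_any("(x", 0))  # [(120, 2)]
print(paren_then_any("x(", 0))  # []
```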
defn

many-c-impl

(many-c-impl [p_raw inp] :int)

internal C implementation of greedy zero-or-more repetition.

p_raw: parser fat closure as int64_t.
inp: the current Input.
defn

many

(many [p] :ptr<void>)

greedily match a parser zero or more times.

p: the parser to repeat.
A parser that always succeeds, yielding a Cell list of all
  matched values and the remaining Input after the last match.
  Returns an empty list when p does not match at all.

  (run-parser (many (pchar 97)) "aab")
  ; => Cell{Pair(Cell{97,97}, Input@2)}
defn

many1-impl

(many1-impl [_p inp] :int)

internal implementation for many1.

_p: the parser to repeat.
inp: the current Input.
defn

many1

(many1 [p] :ptr<void>)

greedily match a parser one or more times.

p: the parser to repeat.
A parser that succeeds only when p matches at least once, then
  greedily repeats, returning results identical in structure to many.

  (run-parser (many1 (pchar 97)) "aab")
  ; => Cell{Pair(Cell{97,97}, Input@2)}
  (run-parser (many1 (pchar 97)) "bbb")  ; => 0  (fails)
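The difference between many and many1 is easiest to see side by side. The Python model below is a sketch with two labelled assumptions: at each step it follows only the first result of p, and it stops if p succeeds without consuming input. Neither detail is stated in this reference.

```python
def pchar(c):
    def parse(s, pos):
        if pos < len(s) and ord(s[pos]) == c:
            return [(c, pos + 1)]
        return []
    return parse


def many(p):
    def parse(s, pos):
        values = []
        while True:
            results = p(s, pos)
            if not results:
                break
            value, new_pos = results[0]  # assumption: follow the first result only
            if new_pos == pos:           # assumption: stop on a non-consuming match
                break
            values.append(value)
            pos = new_pos
        # Always exactly one result, possibly with an empty list of values.
        return [(values, pos)]
    return parse


def many1(p):
    def parse(s, pos):
        values, rest = many(p)(s, pos)[0]
        # Same shape as many, but zero matches counts as failure.
        return [(values, rest)] if values else []
    return parse


print(many(pchar(97))("aab", 0))   # [([97, 97], 2)]
print(many(pchar(97))("bbb", 0))   # [([], 0)]
print(many1(pchar(97))("bbb", 0))  # []
```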
defn

optional-impl

(optional-impl [_p inp] :int)

internal implementation for optional.

_p: the parser to attempt.
inp: the current Input.
defn

optional

(optional [p] :ptr<void>)

match a parser zero or one times.

p: the parser to attempt.
A parser that returns two results when p succeeds (the match and
  a fallback with value 0), or one result with value 0 when p fails.
  Check pair-first: 0 means absent, non-zero means the parsed value.

  (run-parser (optional (pchar 65)) "ABC")  ; => two results
  (run-parser (optional (pchar 65)) "XYZ")  ; => Cell{Pair(0, Input@0)}
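The two-result behaviour of optional can be modelled as "results of p, followed by a fallback that consumes nothing". This Python sketch assumes that ordering (match first, fallback second) and the usual stand-in representation of parsers as functions returning a list of (value, new position) pairs.

```python
def pchar(c):
    def parse(s, pos):
        if pos < len(s) and ord(s[pos]) == c:
            return [(c, pos + 1)]
        return []
    return parse


def optional(p):
    def parse(s, pos):
        # Keep whatever p found, then add the "absent" result with value 0.
        return p(s, pos) + [(0, pos)]
    return parse


maybe_a = optional(pchar(65))
print(maybe_a("ABC", 0))  # [(65, 1), (0, 0)]
print(maybe_a("XYZ", 0))  # [(0, 0)]
```

Since 0 marks the absent case, a wrapped parser that can legitimately produce the value 0 would be indistinguishable from absence by value alone; comparing the remaining position is one way to tell them apart in this model.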
── Runners ──────────────────────────────────────────────────────────────────

defn

run-parser

(run-parser [p s :cstr] :int)

run a parser against an input string.

p: the parser to run.
s: the input string (:cstr).
A Cell list of Pair(value, remaining-Input) for all successful
  parses, including partial ones.

  (run-parser (pchar 65) "ABC")  ; => Cell{Pair(65, Input@1)}
defn

run-parser-full-c

(run-parser-full-c [results] :int)

internal C helper to extract the first full-input parse.

results: Cell list returned by run-parser.
defn

run-parser-full

(run-parser-full [p s :cstr] :int)

run a parser and return only the first complete parse.

p: the parser to run.
s: the input string (:cstr).
A Cell wrapping the parsed value if the entire input was consumed,
  or 0 (mzero) if no full parse exists.

  (run-parser-full (many (pchar 97)) "aaa")
  ; => Cell{value=Cell{97,97,97}}
  (run-parser-full (pchar 97) "ab")  ; => 0  (leftover input)
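To show how run-parser-full differs from run-parser when several parses exist, here is a Python model. It is a sketch: parsers are assumed to be functions returning a list of (value, new position) pairs, a one-element list models the wrapping Cell, and an empty list models 0 (mzero).

```python
def pstring(lit):
    def parse(s, pos):
        # Succeed with lit when the input continues with lit at pos.
        return [(lit, pos + len(lit))] if s.startswith(lit, pos) else []
    return parse


def or_parser(p, q):
    def parse(s, pos):
        return p(s, pos) + q(s, pos)
    return parse


def run_parser(p, s):
    # Every successful parse from position 0, partial ones included.
    return p(s, 0)


def run_parser_full(p, s):
    # First result that consumed the whole input, or nothing at all.
    for value, pos in run_parser(p, s):
        if pos == len(s):
            return [value]
    return []


short_or_long = or_parser(pstring("a"), pstring("ab"))
print(run_parser(short_or_long, "ab"))       # [('a', 1), ('ab', 2)]
print(run_parser_full(short_or_long, "ab"))  # ['ab']
print(run_parser_full(pstring("a"), "ab"))   # []
```

The partial parse 'a' comes first in the result list, but it is skipped because it leaves input behind.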
── Result helpers ───────────────────────────────────────────────────────────

defn

parse-value

(parse-value [results] :int)

extract the value of the first parse result.

results: a Cell list returned by run-parser or run-parser-full.
The int64_t value from pair-first of the first Cell, or 0 if empty.

  (parse-value (run-parser (pchar 65) "A"))  ; => 65
defn

parse-result-count

(parse-result-count [results] :int)

count the number of parse results in a result list.

results: a Cell list returned by run-parser.
The number of successful parse results as an int64_t.

  (parse-result-count (run-parser (or-parser (pchar 65) (pchar 65)) "A"))
  ; => 2  (both branches succeed)
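Because parse-value returns 0 both for an empty result list and for a successful parse whose value is 0, it can help to check parse-result-count as well. The Python model below illustrates that point. It is a sketch that covers only run-parser style results, represented as a list of (value, new position) pairs.

```python
def parse_value(results):
    # Value of the first result, or 0 when there are no results.
    return results[0][0] if results else 0


def parse_result_count(results):
    return len(results)


hit = [(65, 1)]    # one successful parse of 'A'
absent = [(0, 0)]  # a successful parse whose value happens to be 0
nothing = []       # no parse at all

print(parse_value(hit), parse_result_count(hit))          # 65 1
print(parse_value(absent), parse_result_count(absent))    # 0 1
print(parse_value(nothing), parse_result_count(nothing))  # 0 0
```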
defn

char-list-length

(char-list-length [lst] :int)

count elements in a bt-cons character list.

lst: a bt-cons Cell list.
The number of elements as an int64_t.

  (char-list-length (bt-cons 1 (bt-cons 2 (bt-nil))))  ; => 2