Suggestions: New features for version 0.2.0? #1

Evelyn-H · 2019-04-26T19:24:40Z

Hey,
Just putting this issue here to list suggestions / ideas for the next version.

port everything to use nom 5.0 (alpha)
...

zypeh · 2019-04-29T05:44:14Z

I think it would be nice to add a parsing guard feature in the next version.

eg: (just my 2 cents)

html =
"<" <tag1: string> ">"
value
"<" "/" <tag1: string> ">"

Basically, it adds a runtime constraint so that the closing tag must match the opening tag's name. It could also be used as a combinator.

Evelyn-H · 2019-04-29T17:32:54Z

@zypeh
Hm, I'm not sure I entirely get what you mean. Could you provide a bit more in-depth example?

Edit: Do you mean a way to ensure that the two strings/tags parsed are equal, and make the parser fail otherwise?

This would definitely be useful for parsing non-regular grammars

zypeh · 2019-04-30T02:53:22Z

@Evelyn-H

> Do you mean a way to ensure that the two strings/tags parsed are equal, and make the parser fail otherwise?

Yes. 😃 The example I gave wasn't very precise, my bad. But I would like to know how to implement this feature.

Evelyn-H · 2019-04-30T03:39:50Z

Hm, I'd probably make it a bit more general and optionally allow the code block at the end to return a Result.

Maybe something like this:

html =
"<" <left_tag: string> ">" 
value
"<" "/" <right_tag: string> ">"
=> ?{
    if left_tag == right_tag { Ok(result) } else { Err(error) }
}

Note the ? before the code block to signify that it returns a Result. (Just an idea, definitely not the final syntax)
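To make the intended semantics of that guard concrete, here is a minimal hand-written sketch in plain Rust (standard library only, not nom-peg output and not a proposed implementation). The names `ParseError`, `literal`, `name`, and `html` are hypothetical, and `value` is assumed to be "everything up to the next `<`" purely to keep the example short.

```rust
// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ParseError {
    Expected(&'static str),
    TagMismatch { open: String, close: String },
}

// Consume a literal prefix and return the remaining input.
fn literal<'a>(input: &'a str, lit: &'static str) -> Result<&'a str, ParseError> {
    input.strip_prefix(lit).ok_or(ParseError::Expected(lit))
}

// Consume one or more ASCII letters/digits; returns (rest, name).
fn name(input: &str) -> Result<(&str, &str), ParseError> {
    let end = input
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(ParseError::Expected("tag name"));
    }
    Ok((&input[end..], &input[..end]))
}

// html = "<" <left_tag> ">" value "<" "/" <right_tag> ">" => ?{ guard }
// Returns (rest, (tag, value)) on success.
fn html(input: &str) -> Result<(&str, (&str, &str)), ParseError> {
    let rest = literal(input, "<")?;
    let (rest, left_tag) = name(rest)?;
    let rest = literal(rest, ">")?;
    // Assumption: `value` is everything up to the next '<'.
    let end = rest.find('<').unwrap_or(rest.len());
    let (value, rest) = rest.split_at(end);
    let rest = literal(rest, "<")?;
    let rest = literal(rest, "/")?;
    let (rest, right_tag) = name(rest)?;
    let rest = literal(rest, ">")?;
    // The guard: the whole rule fails unless both tag names are equal.
    if left_tag == right_tag {
        Ok((rest, (left_tag, value)))
    } else {
        Err(ParseError::TagMismatch {
            open: left_tag.to_string(),
            close: right_tag.to_string(),
        })
    }
}
```

With this sketch, `html("<b>hi</b>")` succeeds, while `html("<b>hi</i>")` returns the `TagMismatch` error, which is the behaviour a `Result`-returning code block would let a grammar author express.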

jgall · 2019-09-02T16:45:05Z

It might be nice to be able to specify ranges of characters. I'm not entirely sure what the best way to do this would be, but something along the following lines would be really nice.

textdata = (' '-'!'|'#'-'+'|'-'-'~')

as an alternative to manually writing the nom parser for this, which would look like the following:

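use nom::{AsChar, IResult, InputTakeAtPosition};

/// Takes the longest prefix made of TEXTDATA characters.
pub fn textdata<T>(input: T) -> IResult<T, T>
where
    T: InputTakeAtPosition,
    <T as InputTakeAtPosition>::Item: AsChar,
{
    // split_at_position splits where the predicate first returns true,
    // so the predicate must flag the first character that is NOT TEXTDATA.
    input.split_at_position(|item| !is_textdata(item.as_char()))
}

/// TEXTDATA as seen here: https://tools.ietf.org/html/rfc4180#section-2
fn is_textdata(input: char) -> bool {
    (' ' <= input && input <= '!')
        || ('#' <= input && input <= '+')
        || ('-' <= input && input <= '~')
}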

I'm also not sure whether it would be better to use characters here, or strings, or the number values (i.e. 0x20-0x21). Another alternative to this could be to use the re_capture! macro in Nom.
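On the characters-versus-numbers question, one data point is that Rust's own range patterns already take character literals, so a character-based syntax would look familiar. The following is a small standard-library-only sketch (no nom involved); `is_textdata` and `textdata` here are illustrative stand-ins, not part of nom-peg.

```rust
// RFC 4180: TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
// Inclusive char range patterns express the three ranges directly.
fn is_textdata(c: char) -> bool {
    matches!(c, ' '..='!' | '#'..='+' | '-'..='~')
}

// Split the input into (longest TEXTDATA prefix, rest).
fn textdata(input: &str) -> (&str, &str) {
    let end = input
        .find(|c: char| !is_textdata(c))
        .unwrap_or(input.len());
    input.split_at(end)
}
```

For example, `textdata("abc,def")` yields `("abc", ",def")`, because the comma (0x2C) falls in the gap between the second and third ranges.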

Evelyn-H · 2019-09-02T18:49:09Z

Yeah, I've been thinking about this too, it's definitely one of the next features I wanna add.

Having a specific syntax for nom-peg would significantly increase the complexity of the procedural macro parsing code though, which is already quite complex.
So I had the same idea of maybe outsourcing it to the regex macros/functions in nom, but I haven't looked into it in detail yet.

One potential problem with that approach is that, if I remember correctly, the semantics of regexes differ slightly from those of PEGs. Interspersing the two could therefore become confusing and lead to unintuitive results.
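One well-known difference of this kind is ordered choice: a PEG commits to the first alternative that matches and never revisits it, whereas a regex such as `(a|ab)c` will match "abc" by trying the second alternative. A minimal standard-library sketch of the PEG side, with hypothetical helper names `lit`, `a_or_ab`, and `peg_rule`:

```rust
// Match a literal prefix; return the remaining input on success.
fn lit<'a>(input: &'a str, s: &str) -> Option<&'a str> {
    input.strip_prefix(s)
}

// PEG ordered choice  "a" / "ab":
// the second alternative is only tried if the first one fails here.
fn a_or_ab(input: &str) -> Option<&str> {
    lit(input, "a").or_else(|| lit(input, "ab"))
}

// PEG sequence  ("a" / "ab") "c":
// once the choice has succeeded with "a", it is never re-tried with "ab".
fn peg_rule(input: &str) -> Option<&str> {
    let rest = a_or_ab(input)?;
    lit(rest, "c")
}
```

Here `peg_rule("ac")` succeeds but `peg_rule("abc")` fails, because the choice consumes only "a" and the sequence then sees "bc". Mixing regex fragments into a PEG would put both behaviours side by side in one grammar.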

jgall · 2019-09-02T19:43:24Z

Normally, if I were just writing a nom parser and had repeated occurrences of this "all chars in x range" pattern, I'd write a function like the following:

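use nom::{AsChar, IResult, InputTakeAtPosition};

/// Returns a parser that takes the longest prefix of chars in `a..=b`.
pub fn in_range<T>(a: char, b: char) -> impl Fn(T) -> IResult<T, T>
where
    T: InputTakeAtPosition,
    <T as InputTakeAtPosition>::Item: AsChar,
{
    // Split at the first character that falls outside the range.
    move |input| input.split_at_position(|item| !between(item.as_char(), a, b))
}

fn between(input: char, start: char, end: char) -> bool {
    start <= input && input <= end
}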

However, within the pattern declaration portion of the grammar! macro, I am unable to call Rust functions.
The code below does not compile:

textdata: &'input str = (in_range(' ', '!')|"#")

Maybe allowing some syntax for pure Rust blocks that return nom parsers within the macro would be an alternative way to implement this.

Evelyn-H · 2019-09-03T14:14:51Z

Adding support for calling arbitrary nom parser functions in the grammar! macro is definitely planned too. This way you could write a regular nom function and include it as a nonterminal in the grammar.
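As a rough illustration of why this composes well, here is a standard-library-only sketch in which every parser is just a function from input to the number of bytes matched, so a hand-written one can sit next to a literal in a choice. Everything here (`in_range`, `tag`, `either`, `textdata`) is a hypothetical stand-in for the idea, not nom's or nom-peg's API.

```rust
// A parser reports how many bytes it matched, or None on failure.

// Hand-written parser: one or more chars in a..=b.
fn in_range(a: char, b: char) -> impl Fn(&str) -> Option<usize> {
    move |input: &str| {
        let end = input
            .find(|c: char| !(a <= c && c <= b))
            .unwrap_or(input.len());
        if end == 0 { None } else { Some(end) }
    }
}

// Literal parser, standing in for a quoted terminal such as "#".
fn tag(t: &'static str) -> impl Fn(&str) -> Option<usize> {
    move |input: &str| if input.starts_with(t) { Some(t.len()) } else { None }
}

// Ordered choice between two parsers of the same shape.
fn either(
    p: impl Fn(&str) -> Option<usize>,
    q: impl Fn(&str) -> Option<usize>,
) -> impl Fn(&str) -> Option<usize> {
    move |input: &str| p(input).or_else(|| q(input))
}

// Mirrors the rule  (in_range(' ', '!') | "#")  from the comment above.
// Returns (matched, rest) on success.
fn textdata(input: &str) -> Option<(&str, &str)> {
    let rule = either(in_range(' ', '!'), tag("#"));
    rule(input).map(|n| input.split_at(n))
}
```

Because `in_range` and `tag` share one signature, `either` does not need to know which of them was written by hand, which is roughly what treating a regular function as a nonterminal would rely on.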
