Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSON parser example #391

Merged
merged 19 commits into from
Oct 3, 2024
Merged

Add JSON parser example #391

merged 19 commits into from
Oct 3, 2024

Conversation

jimbxb
Copy link
Collaborator

@jimbxb jimbxb commented Jun 2, 2023

Ideally I'd use a resource to handle the parse string, but I found a few compile errors along the way regarding uninitialised temporary variables. I didn't get a change to debug where these were, though, as they didn't have any location attached to them.

I believe one benefit for using a resource is that, for a test proc, if the proc fails there is already a mechanism to rollback the initial state for the resource. This would clean up the char and digit procs. Of course, these procs should work without the hack I built in, but this is a known issue which is to be documented on GitHub.

Overall, though, I think the use of higher order in both parsing list-like structures and in printing is a treat to work with!

Copy link
Owner

@pschachte pschachte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really lovely! Very readable code, and a nice sales pitch for Wybe. I've made a few style nitpicks here, and also, a bit of documentation would be good for this to be a Wybe code sample.

I agree about using a resource. It'd make a nice example use of resources, too.

samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
@jimbxb
Copy link
Collaborator Author

jimbxb commented Jun 10, 2023

Thanks for the feedback! I've been working on a version with resources that has some different names and some little tweaks to make it ergonomic, but I should add a commit with these changes before tackling the resource version.

Unfortunately, the resource version has uncovered a bad bug that I introduced with HO resources, specifically in the code generation of tests in unbranching and resource transformation. The bug is that, in a test proc, where i save the resources to be reset upon the call failing, I fail to add the tmp variables to the vardicts of nested conditionals -- I do add them to the outer most condional that's generated though. This then results in some tmp variables being "uninitialised" when it comes time to unbranch because they're not known in the vardicts.

The fix should be to just add them, but I think the Haskell code becomes very clunky then, and I'm sure I could do better. The "better" though, is probably a rework of unbranching as a whole, calculating the vardicts when unbranching. I think this is a better design as, if we ever add in more code that generates more tmp variables, then we don't have to worry about updating the vardicts after mode checking. I realise this was done in mode checking because it's convenient, but I think that in unbranching is more appropriate as that's where the data is used.

I'm thinking also that it might be a good idea to join unbranching and clause generation together. The bug where state isn't rolled back appropriately when something fails is probably caused by unbranching. Instead of modifying unbranching to save the state before each condition and rollback in the else branch, clause generation could simply use the previous variable numbering in the else branch. This would save in the number of tmp variables introduced.

@jimbxb
Copy link
Collaborator Author

jimbxb commented Sep 28, 2024

I finally got back into things...

The parser now "works", but only if I turn off multi-specz. WIth multi-specz on I get a segfault!

EDIT: now it works!

@jimbxb jimbxb requested a review from pschachte September 28, 2024 14:34
Comment on lines -104 to +103
wybe.array.[|]<0>[785a827a1b](?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn
wybe.array.[|]<0>(?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unfortunate side-effect, but we shouldnt just be mutating a global variable like this! :(

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could add a flag "noalias" to the load here, if we know they're safe... Let me do that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but this load isn't generated by the compiler. Bummer!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we can add a NonAliasedParamCond for globals. That way, if we know the global isnt aliased, a load of the global wouldn't load an aliased value, and we can optimise accordingly.

This is tricky, as if we ever store into that same global, the alias information changes to whatever the stored value had.

Recognising that a global is unaliased feels tricky too, but I assume you can just feed in the aliasing information of whatever you store into the global.

Food for thought, but probably not something I'm game to implement.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, very tricky. We need a global analysis to work out that we never either store a potentially aliased value into, and we never alias a value we load from, this global. Alternatively (or additionally), we need a way to declare that a resource should never be aliased, and always generate code that duplicates the value after loading if it can't be shown that the stored value will not be used again, and before storing if it can't be shown that the stored value is unaliased. Something like that. This is a whole other can of worms, certainly not related to a json parser.

@jimbxb
Copy link
Collaborator Author

jimbxb commented Sep 30, 2024

Next up, Wybe quines?

samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
samples/json.wybe Outdated Show resolved Hide resolved
Copy link
Owner

@pschachte pschachte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This looks really nice, just the way I hoped a Wybe parser could look.
And thanks for fixing many issues in the compiler.

Just one request: could you add a header comment to the top of json.wybe saying that it's an example of how one might write a recursive-descent parser for JSON documents in Wybe?

Comment on lines -104 to +103
wybe.array.[|]<0>[785a827a1b](?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn
wybe.array.[|]<0>(?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, very tricky. We need a global analysis to work out that we never either store a potentially aliased value into, and we never alias a value we load from, this global. Alternatively (or additionally), we need a way to declare that a resource should never be aliased, and always generate code that duplicates the value after loading if it can't be shown that the stored value will not be used again, and before storing if it can't be shown that the stored value is unaliased. Something like that. This is a whole other can of worms, certainly not related to a json parser.

@jimbxb jimbxb requested a review from pschachte October 2, 2024 13:27
Copy link
Owner

@pschachte pschachte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great; thanks for fixing the Brainfuck doc, too!

@pschachte pschachte merged commit f3ed014 into pschachte:master Oct 3, 2024
1 check passed
@jimbxb jimbxb deleted the sample/json branch October 3, 2024 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants