-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JSON parser example #391
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really lovely! Very readable code, and a nice sales pitch for Wybe. I've made a few style nitpicks here, and also, a bit of documentation would be good for this to be a Wybe code sample.
I agree about using a resource. It'd make a nice example use of resources, too.
Thanks for the feedback! I've been working on a version with resources that has some different names and some little tweaks to make it ergonomic, but I should add a commit with these changes before tackling the resource version. Unfortunately, the resource version has uncovered a bad bug that I introduced with HO resources, specifically in the code generation of tests in unbranching and resource transformation. The bug is that, in a test proc, where i save the resources to be reset upon the call failing, I fail to add the tmp variables to the vardicts of nested conditionals -- I do add them to the outer most condional that's generated though. This then results in some tmp variables being "uninitialised" when it comes time to unbranch because they're not known in the vardicts. The fix should be to just add them, but I think the Haskell code becomes very clunky then, and I'm sure I could do better. The "better" though, is probably a rework of unbranching as a whole, calculating the vardicts when unbranching. I think this is a better design as, if we ever add in more code that generates more tmp variables, then we don't have to worry about updating the vardicts after mode checking. I realise this was done in mode checking because it's convenient, but I think that in unbranching is more appropriate as that's where the data is used. I'm thinking also that it might be a good idea to join unbranching and clause generation together. The bug where state isn't rolled back appropriately when something fails is probably caused by unbranching. Instead of modifying unbranching to save the state before each condition and rollback in the else branch, clause generation could simply use the previous variable numbering in the else branch. This would save in the number of tmp variables introduced. |
I finally got back into things... The parser now "works", but only if I turn off multi-specz. WIth multi-specz on I get a segfault! EDIT: now it works! |
wybe.array.[|]<0>[785a827a1b](?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn | ||
wybe.array.[|]<0>(?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unfortunate side-effect, but we shouldnt just be mutating a global variable like this! :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we could add a flag "noalias"
to the load here, if we know they're safe... Let me do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but this load isn't generated by the compiler. Bummer!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we can add a NonAliasedParamCond for globals. That way, if we know the global isnt aliased, a load of the global wouldn't load an aliased value, and we can optimise accordingly.
This is tricky, as if we ever store into that same global, the alias information changes to whatever the stored value had.
Recognising that a global is unaliased feels tricky too, but I assume you can just feed in the aliasing information of whatever you store into the global.
Food for thought, but probably not something I'm game to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, very tricky. We need a global analysis to work out that we never either store a potentially aliased value into, and we never alias a value we load from, this global. Alternatively (or additionally), we need a way to declare that a resource should never be aliased, and always generate code that duplicates the value after loading if it can't be shown that the stored value will not be used again, and before storing if it can't be shown that the stored value is unaliased. Something like that. This is a whole other can of worms, certainly not related to a json parser.
Next up, Wybe quines? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! This looks really nice, just the way I hoped a Wybe parser could look.
And thanks for fixing many issues in the compiler.
Just one request: could you add a header comment to the top of json.wybe
saying that it's an example of how one might write a recursive-descent parser for JSON documents in Wybe?
wybe.array.[|]<0>[785a827a1b](?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn | ||
wybe.array.[|]<0>(?command##1:wybe.c_string, ?arguments##2:wybe.array(wybe.c_string), ~arguments##0:wybe.array(wybe.c_string), ?tmp#8##0:wybe.bool) #0 @command_line:nn:nn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, very tricky. We need a global analysis to work out that we never either store a potentially aliased value into, and we never alias a value we load from, this global. Alternatively (or additionally), we need a way to declare that a resource should never be aliased, and always generate code that duplicates the value after loading if it can't be shown that the stored value will not be used again, and before storing if it can't be shown that the stored value is unaliased. Something like that. This is a whole other can of worms, certainly not related to a json parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great; thanks for fixing the Brainfuck doc, too!
Ideally I'd use a resource to handle the parse string, but I found a few compile errors along the way regarding uninitialised temporary variables. I didn't get a change to debug where these were, though, as they didn't have any location attached to them.
I believe one benefit for using a resource is that, for a test proc, if the proc fails there is already a mechanism to rollback the initial state for the resource. This would clean up the
char
anddigit
procs. Of course, these procs should work without the hack I built in, but this is a known issue which is to be documented on GitHub.Overall, though, I think the use of higher order in both parsing list-like structures and in printing is a treat to work with!