-
Notifications
You must be signed in to change notification settings - Fork 660
Don't crash when memory is exhausted #389
base: master
Are you sure you want to change the base?
Conversation
Previously, returning NULL from gumbo_parser_allocate was usually unchecked. Instead, bail out with longjmp() and return NULL from gumbo_parse_with_options().
|
||
if (SETJMP(parser._oom_buf)) { | ||
if (parser._parser_state) | ||
parser_state_destroy(&parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually deallocate all the used memory? What about all the allocations for nodes and errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn’t, even if I add a call to gumbo_destroy_output
at the end whenever there would be something non-NULL
to pass to it.
One option that is guaranteed to work is to place all allocations in a linked list and on error destroy the list. On the other hand, the user can always do that themselves, since Gumbo can call user-provided callbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering using the arena pull request to solve this problem for lua-gumbo. Unfortunately, the state of that pull request as it stands still contains dozens of unchecked calls to malloc
and has a few other problems I can't recall off the top of my head.
Gumbo is essentially an abandoned project at this point though. It seems unlikely that any of the newer bug reports will be fixed in this repo.
@@ -1954,6 +1954,32 @@ TEST_F(GumboParserTest, FragmentWithNamespace) { | |||
EXPECT_EQ(0, GetChildCount(div)); | |||
} | |||
|
|||
TEST_F(GumboParserTest, OutOfMemory) { | |||
GumboOptions options; | |||
memcpy((void*)&options, (void*)&kGumboDefaultOptions, sizeof options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for memcpy()
here. You can just use:
GumboOptions options = kGumboDefaultOptions;
FWIW ... It would be nice if some maintainer would truly fork this project to create a maintenance version of gumbo where at least all outstanding bug reports with patches were included into a new master and also maybe an updated arena allocator based branch as well (although thread safety may be an issue in the arena version depending on the underlying malloc used for the arena slab allocator).
We at Sigil have a version of gumbo that has most if not all bug/security fixes in place but we add support for xhtml parsing in our version as well as adding code to allow editing the returned gumbo tree which most might find useful.
Any takers?
…Sent from my iPad
On Oct 8, 2017, at 9:25 PM, Craig Barnes ***@***.***> wrote:
@craigbarnes commented on this pull request.
In src/parser.c:
> parser._options = options;
+
+ if (SETJMP(parser._oom_buf)) {
+ if (parser._parser_state)
+ parser_state_destroy(&parser);
I was considering using the arena pull request to solve this problem for lua-gumbo. Unfortunately, the state of that pull request as it stands still contains dozens of unchecked calls to malloc and has a few other problems I can't recall off the top of my head.
Gumbo is essentially an abandoned project at this point though... It seems unlikely that any of the newer bug reports will be fixed in this repo.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Correction:
"which most might find useful" -> "which most might NOT find useful"
… On Oct 8, 2017, at 9:25 PM, Craig Barnes ***@***.***> wrote:
@craigbarnes commented on this pull request.
In src/parser.c:
> parser._options = options;
+
+ if (SETJMP(parser._oom_buf)) {
+ if (parser._parser_state)
+ parser_state_destroy(&parser);
I was considering using the arena pull request to solve this problem for lua-gumbo. Unfortunately, the state of that pull request as it stands still contains dozens of unchecked calls to malloc and has a few other problems I can't recall off the top of my head.
Gumbo is essentially an abandoned project at this point though... It seems unlikely that any of the newer bug reports will be fixed in this repo.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I'm maintaining a fork of libgumbo within the lua-gumbo repo. If you want to set up a GitHub org, I'd be glad to contribute some of the changes I've made back and help with maintenance. I've noticed a few other forks around GitHub that may be interested in merging efforts too. There's also a Python html5-parser library that seems to be based on a fork of gumbo. |
Right now the only actively maintained HTML parser that is meant for
standalone use is Servo's html5ever. It has a LOT going for it (for one,
it is immune to most code execution exploits by virtue of being written in
safe Rust; it is also faster). However, it isn't exactly easy to bind from
anything other than Rust.
On Oct 8, 2017 9:25 PM, "Craig Barnes" <[email protected]> wrote:
*@craigbarnes* commented on this pull request.
------------------------------
In src/parser.c
<#389 (comment)>:
parser._options = options;
+
+ if (SETJMP(parser._oom_buf)) {
+ if (parser._parser_state)
+ parser_state_destroy(&parser);
I was considering using the arena
<#309> pull request to solve
this problem for lua-gumbo. Unfortunately, the state of that pull request
as it stands still contains dozens of unchecked calls to malloc and has a
few other problems I can't recall off the top of my head.
Gumbo is essentially an abandoned project at this point though... It seems
unlikely that any of the newer bug reports will be fixed in this repo.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#389 (comment)>,
or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AGGWB7qnoT_pJgdqv6RiVUdA2SRX441dks5sqXX7gaJpZM4Ogiay>
.
|
In the case of this particular bug, we could get the same behaviour as in Rust by just calling |
Previously, returning NULL from
gumbo_parser_allocate()
was usuallyunchecked. Instead, bail out with
longjmp()
and return NULL fromgumbo_parse_with_options()
.Fixes #388.