Segfault on parse using 1.4.10 #50
Comments
I could reproduce this segfault on my MacBook:
nokogumbo (1.4.9) => OK
nokogumbo (1.4.10) => Segfault
I think I'm seeing the same crash - backtrace looks like:
and here's a test case:
require "nokogumbo"
HTML=<<__END__
Test
>><b>Test</b><<
Test
__END__
Nokogiri::HTML5(HTML)
It's the
I'm also getting a segmentation fault with 1.4.10, but the parsing works with 1.4.9.
@stevecheckoway looks like we have a segfault introduced when we add the parse errors to the doc: https://github.com/rubys/nokogumbo/pull/46/files#diff-605b9dcccd84567c49656f6590ff3830R230
That's no good. I'll try to look into it soon.
I tracked down the error, only to find that @DmitryBochkarev had already found it. It's a bug in gumbo-parser (google/gumbo-parser#371), but I think that patch is wrong. When the error happens on a new line, it tries to append a string that is
I created a pull request, #51, to work around this bug, but I think a better fix is to fix gumbo.
👍 I have the issue too with nokogumbo 1.4.10. Are there any fixes for this version?
Version 1.4.10 (current one) has a segfault when parsing certain html documents. This was causing crashes of the sidekiq process leaving feeds in an inconsistent state. Pinning for now until there is a fix, see: rubys/nokogumbo#50
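A minimal sketch of the kind of version constraint the pinning above refers to, assuming the goal is simply to exclude the affected 1.4.10 release. The constant `NOT_AFFECTED` and the helper `safe_nokogumbo?` are hypothetical names used for illustration; the thread does not say which exact constraint was used.

```ruby
require "rubygems"

# Hypothetical constraint: accept any nokogumbo release except 1.4.10,
# the version reported in this thread as segfaulting.
NOT_AFFECTED = Gem::Requirement.new("!= 1.4.10")

# Returns true when the given version string satisfies the constraint.
def safe_nokogumbo?(version)
  NOT_AFFECTED.satisfied_by?(Gem::Version.new(version))
end
```

In a Gemfile the same idea could be written as `gem "nokogumbo", "!= 1.4.10"`, or as an exact pin such as `gem "nokogumbo", "1.4.9"` until a fixed release is available.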
I think this is bad enough that a rollback of the commits that introduced the issue until a fix is ready would be warranted.
The workaround in #51 was merged already. So presumably, it'll be in the next version.
@rubys would you mind releasing a new version on RubyGems?
Can I get somebody to test drive the following? https://intertwingly.net/tmp/nokogumbo-1.4.11.gem
@rubys Gave it a roll locally. Gem is packaged fine, installed fine, passes the html-proofer test suite, and I couldn't tickle the segfault.
@jeremy thanks! pushed!
Looks like this is solved. Finally got a chance to loop back and test the latest (now 1.4.13), and we're not seeing any segfaults across Mac and Linux. Thanks!
I noticed that right after we upgraded to the latest (1.4.10), we started getting segfaults when parsing certain HTML files as part of our Rails asset precompile. For some reason, I can't recreate the segfault on my Mac, but it happens consistently on our CI box running RedHat 6. I can dig deeper to figure out what the actual HTML input is, but I wanted to check if you had some idea of what would cause the segfault first.
In case it helps, the library that's calling nokogumbo is something we maintain, so we can modify it if necessary: https://github.com/uniite/web-components-rails/blob/master/lib/web_components_rails/html_import_processor.rb#L49
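One way to narrow down which HTML input triggers the crash is to run each parse in a child process, so a segfault kills only the child. This is a rough sketch, not something taken from the library: `survives?` is a hypothetical helper, and it relies on `fork`, so it assumes a Unix-like platform such as the Mac and Linux machines mentioned here.

```ruby
# Runs the block in a forked child so a crash there cannot take down the caller.
# Returns true if the child exited normally with status 0, and false otherwise
# (for example when it was killed by a signal or raised an exception).
def survives?
  pid = fork do
    yield
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  _pid, status = Process.wait2(pid)
  status.success? == true # success? is nil when the child died from a signal
end

# Hypothetical usage (the glob path is an assumption, adjust to your project):
# require "nokogumbo"
# Dir.glob("app/assets/**/*.html").each do |path|
#   puts "crashes on #{path}" unless survives? { Nokogiri::HTML5(File.read(path)) }
# end
```

Forking per file is slow, but it keeps the run going after a crash and identifies every offending file in a single pass.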
Here's the output from the segfault: