Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting defaults for SnakeYAML limits #647

Merged
merged 2 commits into from
Sep 18, 2023

Conversation

headius
Copy link
Contributor

@headius headius commented Sep 10, 2023

This adds class-level accessors for the following SnakeYAML-specific parser settings:

  • max_aliases_for_collections
  • allow_duplicate_keys
  • allow_recursive_keys
  • code_point_limit

The initial values are based on SnakeYAML Engine's defaults. This PR does not modify those default values.

Using these accessors, it should be possible for users to globally change them for all future Psych parser instances without resorting to monkey patches as described in #613 (comment).

@headius headius mentioned this pull request Sep 10, 2023
@headius
Copy link
Contributor Author

headius commented Sep 10, 2023

@oliverbarnes Please verify if you get a chance!

@oliverbarnes
Copy link

Having a look 🙏

@oliverbarnes
Copy link

oliverbarnes commented Sep 12, 2023

So, I've declared the psych gem with your fork and branch:

gem "psych", github: "headius/psych", branch: "jruby_default_limits"

And I'm getting the following error when bundling:

Fetching gem metadata from https://rubygems.org/.......
Resolving dependencies...
--- ERROR REPORT TEMPLATE -------------------------------------------------------

TypeError: no implicit conversion of nil into String
  org/jruby/RubyFile.java:1047:in `join'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/installer.rb:212:in `do_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/installer.rb:170:in `vendor_jars'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/jars/post_install_hook.rb:28:in `block in <main>'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:43:in `block in run_hooks'
  org/jruby/RubyArray.java:1989:in `each'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:42:in `run_hooks'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path/installer.rb:34:in `post_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/path.rb:244:in `generate_bin'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/source/git.rb:194:in `install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/gem_installer.rb:54:in `install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/gem_installer.rb:16:in `install_from_spec'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/parallel_installer.rb:156:in `do_install'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/installer/parallel_installer.rb:147:in `block in worker_pool'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:62:in `apply_func'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:57:in `block in process_queue'
  org/jruby/RubyKernel.java:1601:in `loop'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:54:in `process_queue'
  ~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bundler-2.4.18/lib/bundler/worker.rb:90:in `block in create_threads'

The canonical psych gem bundles fine.

I'm still investigating if there's something on our end that might cause this somehow, but thought it was worth letting you know.

@headius
Copy link
Contributor Author

headius commented Sep 12, 2023

So what you're saying I should actually test this, eh? 😝

I'll make some fixes.

@headius
Copy link
Contributor Author

headius commented Sep 12, 2023

Oh actually I see the problem... JRuby doesn't support installing from Github because we don't build the extension at install time, we build it at build time. So if you build the gem from the fork and install it you should be able to activate it and try it out.

@oliverbarnes
Copy link

oliverbarnes commented Sep 12, 2023

Ok, cloned the fork and installed the gem from the PR branch:

rake compile:psych
rake install

and configured my app's gemfile to use it.

gem "psych", "5.1.0", platform: :jruby

On the initializer for psych I declared

# I might be misreading PsychParser.java, though
Psych::Parser.code_point_limit = 20_000_000

And when running a spec, I get the following error now:

➜  myapp git:(spike-jruby) ✗ be rspec spec/my_spec.rb
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:471: warning: previous definition of strptime was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:490: warning: previous definition of parse was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/stdlib/date.rb:737: warning: previous definition of parse was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/concurrent-ruby-1.1.10/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_int
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/concurrent-ruby-1.1.10/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_f
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/nokogiri-1.15.3-java/lib/nokogiri/xml/node.rb:1007: warning: method redefined; discarding old attr
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/bindata-2.4.15/lib/bindata/base.rb:80: warning: previous definition of initialize was here
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/ruby-debug-0.11.0/cli/ruby-debug/commands/list.rb:36: warning: `-' after local variable or literal is interpreted as binary operator
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/ruby-debug-0.11.0/cli/ruby-debug/commands/list.rb:36: warning: even though it seems like unary operator
~/.rbenv/versions/jruby-9.4.3.0/lib/ruby/gems/shared/gems/zeitwerk-2.6.11/lib/zeitwerk/kernel.rb:38: warning: parentheses after method name is interpreted as an argument list, not a decomposed argument

An error occurred while loading spec/my_spec.rb. - Did you mean?
                    <snip>

Failure/Error: require File.expand_path("../config/environment", __dir__)

LoadError:
  load error: ~/myapp/config/environment -- java.lang.ClassCastException: class java.lang.Integer cannot be cast to class org.jruby.runtime.builtin.IRubyObject (java.lang.Integer is in module java.base of loader 'bootstrap'; org.jruby.runtime.builtin.IRubyObject is in module org.jruby.dist of loader 'app')
# ./spec/rails_helper.rb:6:in `<main>'
# ./spec/support/requests_helper.rb:3:in `<main>'
# ./spec/my_spec.rb:3:in `<main>'
# ------------------
# --- Caused by: ---
# Java::JavaLang::ClassCastException:
#   class java.lang.Integer cannot be cast to class org.jruby.runtime.builtin.IRubyObject (java.lang.Integer is in module java.base of loader 'bootstrap'; org.jruby.runtime.builtin.IRubyObject is in module org.jruby.dist of loader 'app')
#   org.jruby.ext.psych.PsychParser.<init>(PsychParser.java:149)

This adds class-level accessors for the following SnakeYAML-
specific parser settings:

* max_aliases_for_collections
* allow_duplicate_keys
* allow_recursive_keys
* code_point_limit

The initial values are based on SnakeYAML Engine's defaults. This
PR does not modify those default values.

Using these accessors, it should be possible for users to globally
change them for all future Psych parser instances without
resorting to monkey patches as described in
ruby#613 (comment).
@headius headius force-pushed the jruby_default_limits branch from c37655a to fb97d89 Compare September 12, 2023 20:54
@headius
Copy link
Contributor Author

headius commented Sep 12, 2023

@oliverbarnes Ooops yup that was on me. I originally planned on storing the non-Ruby object version of these values, and then changed my mind. Unfortunately I didn't change all of the code.

Try it with latest version (force pushed)!

@oliverbarnes
Copy link

oliverbarnes commented Sep 12, 2023

:) seems to be working now. No environment load error, nor any code limit error. I'm now hitting that error from my use case:

VCR::Errors::UnhandledHTTPRequestError: 

described on #648

That's for code_point_limit of course - I haven't checked the other config options:

  • max_aliases_for_collections
  • allow_duplicate_keys
  • allow_recursive_keys

So what you're saying I should actually test this, eh? 😝

not a bad idea 😄 a unit test would be nice, at least

Only supported on JRuby currently.
@headius
Copy link
Contributor Author

headius commented Sep 18, 2023

I added a test for code_point_limit.

I tried to add a test for allow_duplicate_keys but I couldn't get it to fail. 🤔

Still looking for good examples for recursive keys and alias limits.

@headius
Copy link
Contributor Author

headius commented Sep 18, 2023

@asomov Can you point us toward some simple tests for the following settings? I tried to find something in the SnakeYAML Engine repo but have not found anything.

  • max_aliases_for_collections
  • allow_duplicate_keys
  • allow_recursive_keys

I also tried parsing a simple duplicate key script and it did not error, no matter what setting I provided.

foo: bar
foo: bar

Should this be rejected?

@oliverbarnes
Copy link

I added a test for code_point_limit.

👍 thanks, test looks good to me

I tried to add a test for allow_duplicate_keys but I couldn't get it to fail. 🤔

Still looking for good examples for recursive keys and alias limits.

Maybe worth creating separate PRs?

@headius
Copy link
Contributor Author

headius commented Sep 18, 2023

Maybe worth creating separate PRs?

Yeah let's go with that idea.

@headius headius merged commit 48cdab3 into ruby:master Sep 18, 2023
@headius headius deleted the jruby_default_limits branch September 18, 2023 19:28
headius added a commit to headius/psych that referenced this pull request Oct 5, 2023
These settings are not enforced at the parser level in SnakeYAML
Engine, instead being enforced during the construction of the YAML
node graph. Because the JRuby Psych wrapper around SnakeYAML only
uses the parser directly, the settings have no effect. This commit
removes the ineffective settings until we can decide what to do
about them.

See ruby#613, ruby#647, and ruby#649.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants