Project

General

Profile

Actions

Bug #17866

closed

Incompatible changes with Psych 4.0.0

Added by hsbt (Hiroshi SHIBATA) over 3 years ago. Updated almost 3 years ago.


Description

Psych-4.0.0 changes Psych.safe_load by the default.

https://github.com/ruby/psych/pull/487

It breaks the several code like:

I and @mame (Yusuke Endoh) investigate them. We found 2 issues.

  1. Symbol is still ignored Pysch.load. It break many of code like configuration store. https://github.com/ruby/psych/blob/master/lib/psych.rb#L368 passes Symbol used by permitted_classes. But It's not working now. see https://github.com/ruby/psych/issues/490
  2. Pysch.load restrict Gem::Specification or RDoc::Options by the default. Should we add them withpermitted_classes to Psych.load or Psych.load_file? I'm not sure the right way about them.

@tenderlovemaking (Aaron Patterson) Do you have any ideas about the above concerns?

Actions #1

Updated by hsbt (Hiroshi SHIBATA) over 3 years ago

  • Description updated (diff)

Updated by k0kubun (Takashi Kokubun) over 3 years ago

I noticed Psych.unsafe_load uses aliases: false by default. This means that default config/database.yml of Rails will stop working if you use Psych 4.0.0.

https://github.com/rails/rails/blob/v6.1.3.2/railties/lib/rails/generators/rails/app/templates/config/databases/sqlite3.yml.tt#L7
https://github.com/rails/rails/blob/v6.1.3.2/activesupport/lib/active_support/configuration_file.rb#L22
https://github.com/rails/rails/blob/v6.1.3.2/railties/lib/rails/application/configuration.rb#L275

How about using aliases: true by default in Psych.load? I guess the main purpose of the breaking change was to disallow deserialization of arbitrary objects, and disabling aliases wasn't really important.

Actions #3

Updated by byroot (Jean Boussier) over 3 years ago

and disabling aliases wasn't really important.

It kinda is, as aliases allow for circular references which can cause some programs to end up in infinite loop.

How about using aliases: true by default in Psych.load?

Maybe we should have YAML.load_file use a more relaxed default, as we can assume that most of the time this is for loading configs etc rather than user submitted data.

Updated by Eregon (Benoit Daloze) over 3 years ago

byroot (Jean Boussier) wrote in #note-3:

It kinda is, as aliases allow for circular references which can cause some programs to end up in infinite loop.

Doesn't the YAML implementation handle/avoid such recursive/circular references?

At least:

default: &default
  adapter: sqlite3
  foo: bar
  nested:
    <<: *default

gives:

$ ruby -ryaml -e 'pp YAML.load_file "circular.yml"'
{"default"=>
  {"adapter"=>"sqlite3",
   "foo"=>"bar",
   "nested"=>{"adapter"=>"sqlite3", "foo"=>"bar"}}}

so it doesn't loop infinitely.

Updated by k0kubun (Takashi Kokubun) over 3 years ago

It kinda is, as aliases allow for circular references which can cause some programs to end up in infinite loop.

I thought of the same thing and I tested what Eregon wrote before writing the comment. So, unless you know of a YAML that crashes Psych.safe_load(..., aliases: true), I assume it's safe.

Maybe we should have YAML.load_file use a more relaxed default, as we can assume that most of the time this is for loading configs etc rather than user submitted data.

Just a reminder from my previous comment, ActiveSupport::ConfigurationFile.parse uses not YAML.load_file but YAML.load. Therefore you have to fix YAML.load as well if you want to keep existing Rails apps working.

Updated by byroot (Jean Boussier) over 3 years ago

So indeed I was wrong about the circular reference.

As for Rails I fixed it this morning: https://github.com/rails/rails/commit/179d0a1f474ada02e0030ac3bd062fc653765dbe, but yes, if load would allow aliases by default it would make things much easier for everyone.

Updated by tenderlovemaking (Aaron Patterson) over 3 years ago

Eregon (Benoit Daloze) wrote in #note-4:

byroot (Jean Boussier) wrote in #note-3:

It kinda is, as aliases allow for circular references which can cause some programs to end up in infinite loop.

Doesn't the YAML implementation handle/avoid such recursive/circular references?

No, you can definitely make recursive data structures:

irb(main):006:0> x = []
=> []
irb(main):007:0> x << x
=> [[...]]
irb(main):008:0> puts Psych.dump x
--- &1
- *1
=> nil
irb(main):009:0> y = Psych.load Psych.dump x
=> [[...]] 

So, unless you know of a YAML that crashes Psych.safe_load(..., aliases: true), I assume it's safe.

No circular aliases will cause Psych.safe_load to crash. The problem is any code that processes the return value of Psych.load. Sometime like this:

require "psych"

def process thing
  thing.each do |item|
    case item
    when Array
      process item
    when Hash
      # ...
    when String
      # ...
    end
  end
end

user_input = DATA.read

process Psych.load(user_input)

__END__
--- &1
- *1

if load would allow aliases by default it would make things much easier for everyone.

Ya, I agree. If we can allow aliases but not recursive aliases, I think that would be great. I'm not 100% sure how to do it though.

I'm ok with enabling aliases by default. DoS attacks using recursive structures have only been theoretical. However, I'm very tired of dealing with YAML related security issues, so I'm not 100% sure. 😅

Updated by byroot (Jean Boussier) over 3 years ago

I'm ok with enabling aliases by default.

Just to clarify: load would have them on by default, but safe_load wouldn't, right?

Updated by tenderlovemaking (Aaron Patterson) over 3 years ago

byroot (Jean Boussier) wrote in #note-8:

I'm ok with enabling aliases by default.

Just to clarify: load would have them on by default, but safe_load wouldn't, right?

Ya, that's what I was considering. safe_load shouldn't change behavior whatsoever.

Updated by naruse (Yui NARUSE) over 3 years ago

Since I want Ruby 3.1 to keep as much as compatibility for Ruby 3.0 to ensure application/library developer adopt Ruby 3, I’m negative to go Ruby 3.1 with Psych 4.0.0.

Updated by cabo (Carsten Bormann) over 3 years ago

naruse (Yui NARUSE) wrote in #note-10:

Since I want Ruby 3.1 to keep as much as compatibility for Ruby 3.0 to ensure application/library developer adopt Ruby 3, I’m negative to go Ruby 3.1 with Psych 4.0.0.

Unfortunately, users will pick up this incompatibility with their next gem update.

So it won't make much of a difference whether you put it in Ruby 3.1 or not.

(The updated major gem version also does not protect against this update as psych historically has not been a gem, so few Gemfiles have a ~> 3.0 or some such.)

Updated by yahonda (Yasuo Honda) almost 3 years ago

Can I get a final decision if Psych 4 will be shipped with Ruby 3.1 as a standard library?

I think this is one of the biggest incompatibilities which will be introduced with Ruby 3.1.
As far as I understand, Psych is a standard library of Ruby, so it is not recommended to use Psych 3 with Ruby 3.1.

Now I'm not against releasing Ruby 3.1 with Psych 4 because it may be too late to revert changes to ship Ruby 3.1 with Psych 3. I wish Ruby 3.1 release note mention this incompatibility introduced in Psych 4.

Updated by hsbt (Hiroshi SHIBATA) almost 3 years ago

  • Status changed from Assigned to Closed

I discussed this with @naruse (Yui NARUSE). We decided to bundle Psych 4 with Ruby 3.1. And we should note this incompatibility and migration way with Psych 3.3.2 to NEWS.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0