Backport #5564

1.9.3 regression with encoding conversion

Added by Jon Leighton over 3 years ago. Updated over 3 years ago.

[ruby-core:40715]
Status:Closed
Priority:Normal
Assignee:Yuki Sonoda

Description

Hello,

We are experiencing a bug in the Rails test suite against ruby 1.9.3.

The bug looks like this:

1) Error:
test_default_external_works(TestERBTemplate):
ActionView::Template::Error: code converter not found (ISO-8859-1 to UTF-8)
/home/turnip/Code/rails/actionpack/lib/action_view/template/handlers/erb.rb:83:in encode!'
/home/turnip/Code/rails/actionpack/lib/action_view/template/handlers/erb.rb:83:in
call'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:252:in compile'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:189:in
compile!'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:142:in block in render'
/home/turnip/Code/rails/activesupport/lib/active_support/notifications.rb:55:in
instrument'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:141:in render'
test/template/template_test.rb:53:in
render'
test/template/template_test.rb:137:in block in test_default_external_works'
test/template/template_test.rb:176:in
with_external_encoding'
test/template/template_test.rb:135:in test_default_external_works'
/home/turnip/.rvm/gems/ruby-1.9.3-p0/gems/mocha-0.10.0/lib/mocha/integration/mini_test/version_230_to_251.rb:28:in
run'

A simplified view of what that code is doing is the following:

s = "hello \u{fc}mlat"
s.force_encoding "ISO-8859-1"
s.encode! 'UTF-8'

However, I have been unable to create any simplified test case for this bug.

What I have been able to find out is the following:

  1. Via git-bisect, I tracked down the changed which introduced the bug (it was not present in 1.9.3-preview1). That change was https://github.com/ruby/ruby/commit/05c2cc90895750255f64254f2491ff0ad77cc13a (r32791)
  2. If I do some sort of conversion at load time in the file action_view/template/handlers/erb.rb, the bug does not occur. For example, if I insert "Encoding::Converter.new('ISO-8859-1', 'UTF-8')" (or the 3 lines above) at the top of the file, I do not experience the bug.

Let me know if I can provide any more information. Also any information about what exactly is triggering this would be very helpful so that we can put a workaround in Rails (if possible).

Thanks


Related issues

Duplicates Ruby trunk - Bug #5279: $SAFEが3以上の時にString#encodeがSecurityErrorを発生させるケースがある Closed 09/06/2011

Associated revisions

Revision 34465
Added by Yui NARUSE over 3 years ago

merge revision(s) 33201,33249,33328: [Backport #5564]

* encoding.c (load_encoding): predefined encoding names are safe.
   [Bug #5279]

* transcode.c (load_transcoder_entry): ditto.

* encoding.c (require_enc): reject only loading from untrusted
  load paths.   [Bug #5279]

* transcode.c (load_transcoder_entry): ditto.

History

#1 Updated by Martin Dürst over 3 years ago

I can't explain why there's a bug, the three lines of code shouldn't produce one.

But if the code in question is really
s = "hello \u{fc}mlat"
s.force_encoding "ISO-8859-1"
s.encode! 'UTF-8'
then I suggest you either explain why you'd want to do that, or fix it.
"\u{fc}mlaut" is an escape for "ümlaut", and the \u escape makes sure the string is encoded in UTF-8.
Using force-encoding to label it as ISO-8859-1 then mislabels it.
The last line is then converting it to UTF-8 again, resulting is what's called double-encoding.
What you may have wanted is "hello \x{fc}mlaut"; then the next two lines would make much more sense.

#2 Updated by Yui NARUSE over 3 years ago

  • Project changed from Ruby trunk to Backport193
  • Status changed from Open to Assigned
  • Assignee set to Yuki Sonoda
  • Tracker changed from Bug to Backport

This is the same problem as Bug #5279 and fixed in r33328.
"it can't load encoding library on $SAFE >= 3"

The workaround is just as you say load the library before set $SAFE.

#3 Updated by Jon Leighton over 3 years ago

Hello,

Thanks Martin and Yui for your responses.

Martin:

I'm sorry, I made a mistake in my 'simplified' example. It should be the following:

s = "hello \xFCmlat"
s.force_encoding "ISO-8859-1"
s.encode! 'UTF-8'

In other words: taking a valid ISO-8859-1 string and converting it to UTF-8.

Yui:

I am not sure this is the same bug, because we are not setting $SAFE in Rails. So the $SAFE level is 0 throughout. Am I missing something?

Many thanks,
Jon

#4 Updated by Jon Leighton over 3 years ago

I have just tried applying r33328 to the ruby_1_9_3 branch, and after doing so I still experience this problem. However, I may have done it incorrectly (I don't really know what I am doing).

Regarding the work-around, it is problematic because we don't know until runtime what conversions we want to perform (so just doing "Encoding::Converter.new('ISO-8859-1', 'UTF-8')" won't help if we need to convert Shift_JIS to UTF-8 at runtime, for example). Is there a way around that problem?

Thank you.

#5 Updated by Jon Leighton over 3 years ago

I have just tried to build trunk and still experience this bug:

$ ruby -v
ruby 2.0.0dev (2011-11-06 trunk 33644) [x86_64-linux]
$ ruby -Itest test/template/template_test.rb
Run options:

Running tests:

.E..............

Finished tests in 0.052595s, 304.2143 tests/s, 399.2812 assertions/s.

1) Error:
test_default_external_works(TestERBTemplate):
ActionView::Template::Error: code converter not found (ISO-8859-1 to UTF-8)
/home/turnip/Code/rails/actionpack/lib/action_view/template/handlers/erb.rb:83:in encode!'
/home/turnip/Code/rails/actionpack/lib/action_view/template/handlers/erb.rb:83:in
call'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:252:in compile'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:189:in
compile!'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:142:in block in render'
/home/turnip/Code/rails/activesupport/lib/active_support/notifications.rb:55:in
instrument'
/home/turnip/Code/rails/actionpack/lib/action_view/template.rb:141:in render'
test/template/template_test.rb:53:in
render'
test/template/template_test.rb:135:in block in test_default_external_works'
test/template/template_test.rb:174:in
with_external_encoding'
test/template/template_test.rb:133:in test_default_external_works'
/usr/local/lib/ruby/gems/1.9.1/gems/mocha-0.10.0/lib/mocha/integration/mini_test/version_230_to_251.rb:28:in
run'

16 tests, 21 assertions, 0 failures, 1 errors, 0 skips

ruby -v: ruby 2.0.0dev (2011-11-06 trunk 33644) [x86_64-linux]

So I do not think it's just a case of backporting an existing fix.

#6 Updated by Martin Dürst over 3 years ago

Hello Jon,

What I would do now is to try and get a C-level backtrace that shows what happens exactly between the call to encode! and the actual errro.

As for "we don't know what converter we might need at runtime", one solution would be to include all converters. When we designed the whole transcoding stuff, we were careful to save runtime memory by dynamically loading just the stuff that's really needed, but there may be some cases where it's easier to load everything in advance. On a clever OS, even if you run many instances of your application, all the data for the transcodings should be shared because it's all static. (I'm not saying that's the solution to your bug, but it's something that may come in handy in some situations.)

Another solution might be to introduce a way to 'bless' certain libraries in advance so that they can be loaded later under $SAFE >= 3 if they are needed. That would leave a security risk of a library being changed between being blessed and being loaded, but there should be enough ways to prevent that.

#7 Updated by Jon Leighton over 3 years ago

Hi Martin,

Thanks for your reply. I don't know how to get a C-level backtrace. Where can I find documentation about how to do this?

Regarding $SAFE level - just to reiterate, I don't think that's the issue here, because in this code $SAFE == 0 throughout.

Thanks

#8 Updated by Martin Dürst over 3 years ago

To get a C-level backtrace:

1) Make sure you compiled with -g
2) Run from gdb, or attach gdb to the process once it's running
3) Set a breakpoint just before the code that produces the exception (message)
4) Run. gdb will stop at the breakpoint
5) Get a backtrace from gdb.

It's not that I do this every day, but the above should work.

#9 Updated by Jon Leighton over 3 years ago

Ok, here you go: https://gist.github.com/1342809

Is that what you need?

Thanks

#10 Updated by Martin Dürst over 3 years ago

https://gist.github.com/1342809 is indeed what I was asking for, but it doesn't seem to be enough.

Rather different question: Is this a problem just with your installation, or can this problem be replicated widely (different OSs,...)?

#11 Updated by Jon Leighton over 3 years ago

My installation is Fedora 15. We have also reproduced on the Rails CI server (Ubuntu Lucid), and I just asked some people to test on a Mac, and they also can reproduce. So I do not think it's platform specific.

You should be able to reproduce on your own computer by setting up the Rails test suite. (This is a bit of a pain I know, I'm sorry I haven't found a simpler test case.)

  1. Clone git://github.com/rails/rails.git
  2. bundle install
  3. cd actionpack/
  4. Open lib/action_view/template.rb and comment the conditional block starting with if defined?(RUBY_ENGINE) (this disables the workaround for the bug) [only necessary if you are running exactly 1.9.3p0, but this bug can be repro'd on trunk also]
  5. ruby -Itest test/template/template_test.rb -n test_default_external_works

#12 Updated by Yui NARUSE over 3 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r34465.
Jon, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 33201,33249,33328: [Backport #5564]

* encoding.c (load_encoding): predefined encoding names are safe.
   [Bug #5279]

* transcode.c (load_transcoder_entry): ditto.

* encoding.c (require_enc): reject only loading from untrusted
  load paths.   [Bug #5279]

* transcode.c (load_transcoder_entry): ditto.

Also available in: Atom PDF