Bug #1449

[REXML] detected encoding isn't used correctly

Added by kou (Kouhei Sutou) over 15 years ago. Updated over 13 years ago.

Status: Closed
Target version:
ruby -v: ruby 1.9.2dev (2009-05-09 trunk 23374) [x86_64-linux]
Backport:
[ruby-core:23404]

Description

=begin
REXML::Source can detect the source encoding from the XML declaration. REXML::IOSource can also detect it, but the detected encoding isn't used correctly.

REXML::IOSource uses the detected encoding to convert data read from @source. If the detected encoding is UTF-8, the read data isn't converted (see rexml/encodings/UTF-8.rb). This can cause a problem when the detected encoding is UTF-8 but @source.external_encoding isn't UTF-8.

If @source.external_encoding is ASCII-8BIT and @source contains only ASCII data, there is no problem. If @source.external_encoding is ASCII-8BIT and @source contains non-ASCII data, there is a problem: in that case, "@buffer << read_data_from_source" raises an Encoding::CompatibilityError, which prevents the XML from being parsed correctly.
=end
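
The failure described above can be reproduced without REXML. The following is a minimal sketch, not the attached patch: `append_chunk` is a hypothetical helper standing in for "@buffer << read_data_from_source", and the buffer is assumed to already hold non-ASCII UTF-8 text. The last call shows one possible workaround (retagging the chunk with the detected encoding); whether the actual fix does this is an assumption.

```ruby
# Hypothetical helper: mimics appending freshly read data to a buffer
# and reports whether Ruby accepted the append.
def append_chunk(buffer, chunk)
  buffer << chunk
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

# Chunks tagged ASCII-8BIT, as if read from an IO whose
# external_encoding is ASCII-8BIT.
ascii_chunk     = "</a>".b
non_ascii_chunk = "\u3044</a>".b

# Buffers are UTF-8 and already contain a non-ASCII character.
result_ascii     = append_chunk("<a>\u3042".dup, ascii_chunk)
result_non_ascii = append_chunk("<a>\u3042".dup, non_ascii_chunk)

# Possible workaround (an assumption, not necessarily the patch):
# retag the bytes with the detected encoding before appending.
retagged        = non_ascii_chunk.dup.force_encoding(Encoding::UTF_8)
result_retagged = append_chunk("<a>\u3042".dup, retagged)

puts result_ascii      # ASCII-only binary data appends cleanly
puts result_non_ascii  # non-ASCII binary data is rejected
puts result_retagged   # retagged data appends cleanly
```

Retagging with `force_encoding` changes only the encoding label, not the bytes, so it is appropriate only when the bytes really are in the detected encoding.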


Files

ruby19-rexml-encoding-mismatch.diff (2.89 KB): a test case for the problem and a patch to fix the problem. kou (Kouhei Sutou), 05/09/2009 01:38 PM