Bug #4758
closed
yaml file not human readable when saving utf-8
Description
On a fresh ruby installation, I've stored some data within a yaml file.
The data arrives there as "\x9B\xA6\xA1\xA0\xA3\xE3", so I'm not able to edit anything there.
I file this as a "Bug", because yaml is meant to be human-readable.
=== Workaround ===
In some discussions, the following workaround came up:
require "psych" # require before yaml
require "yaml"
But this is not always achievable, e.g. when yaml is used by a library etc.
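Where the require order cannot be controlled, the engine could instead be selected explicitly at program start, before any YAML is dumped. The following is only a sketch: it assumes `YAML::ENGINE.yamler=` accepts "psych" in the same way it accepts "syck", and that psych is available in the installation. The `defined?` guard is there because Ruby versions that ship a single YAML backend have no `YAML::ENGINE`. The variable name is made up for illustration.

```ruby
# encoding: utf-8
require 'yaml'

# YAML::ENGINE exists only where more than one YAML backend is shipped
# (the 1.9 series). On other versions this guard is skipped.
if defined?(YAML::ENGINE)
  # Assumption: "psych" is a valid engine name and libyaml is installed.
  YAML::ENGINE.yamler = 'psych'
end

# From here on, YAML.dump goes through the selected engine, including
# calls made from inside libraries that only do `require "yaml"`.
engine_switched_yaml = YAML.dump('greeting' => 'こんにちは')
puts engine_switched_yaml
```

This still has to run before the library in question dumps anything, so it belongs as early as possible in the application's start-up code.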
=== Insider Context ===
Backward compatibility can be achieved easily by:
YAML::ENGINE.yamler = "syck"
=== Newcomer Context ===
Ruby should work correctly "out of the box" with utf-8 data, and thus "psych" should become the default.
As said, if you view this issue strictly, it's a defect/bug.
(I've personally lost some hours with this issue)
Updated by naruse (Yui NARUSE) over 13 years ago
- Status changed from Open to Assigned
- Assignee set to tenderlovemaking (Aaron Patterson)
Updated by drbrain (Eric Hodel) over 13 years ago
This is the YAML spec, it is not a bug of ruby. See: http://www.yaml.org/spec/1.2/spec.html
Updated by tenderlovemaking (Aaron Patterson) over 13 years ago
- ruby -v changed from ruby 1.9.2p180 (2011-02-18) [i386-mingw32] to -
On Tue, May 24, 2011 at 02:51:16AM +0900, Eric Hodel wrote:
Issue #4758 has been updated by Eric Hodel.
This is the YAML spec, it is not a bug of ruby. See: http://www.yaml.org/spec/1.2/spec.html
Yes, it is YAML spec. However, if it's a valid UTF-8 string, I think it
should be output as that UTF-8 string.
For example:
# encoding: utf-8
require 'yaml'
require 'psych'
p Psych.dump({ :hello => 'こんにちは!'})
p YAML.dump({ :hello => 'こんにちは!'})
The results are:
"---\n:hello: こんにちは!\n"
"--- \n:hello: "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB\xE3\x81\xA1\xE3\x81\xAF\xEF\xBC\x81"\n"
Which seems like unexpected behavior of syck to me.
To fix this, I will make Psych default for 1.9.3.
--
Aaron Patterson
http://tenderlovemaking.com/
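A quick way to check which behavior a given installation shows is to dump a UTF-8 string and look for escape sequences in the result. This is a minimal sketch, assuming a Ruby where `require 'yaml'` loads Psych; the variable names are made up for illustration, and string keys are used so that a plain `YAML.load` round-trips without extra options.

```ruby
# encoding: utf-8
require 'yaml'

data = { 'hello' => 'こんにちは' }

# Dump the hash and keep the YAML text for inspection.
yaml_text = YAML.dump(data)

# Single quotes: this searches for a literal backslash followed by "x",
# which is what the escaped output contains.
escaped = yaml_text.include?('\x')

# Loading the text back should give the original data.
restored = YAML.load(yaml_text)

puts yaml_text
puts(escaped ? 'escaped bytes found' : 'human-readable UTF-8')
```

If the second line printed reports escaped bytes, the installation is most likely still dumping through syck.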
Updated by lazaridis.com (Lazaridis Ilias) over 13 years ago
Aaron Patterson wrote:
[...]
Yes, it is YAML spec. However, if it's a valid UTF-8 string, I think it
should be output as that UTF-8 string.
[...]
Yes, you're right, it should:
The YAML specs have "easily readable by humans" as the top priority design goal:
1.1. Goals
The design goals for YAML are, in decreasing priority:
1. YAML is easily readable by humans.
http://www.yaml.org/spec/1.2/spec.html
Updated by tenderlovemaking (Aaron Patterson) over 13 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
I've fixed this in r31715.
Updated by lazaridis.com (Lazaridis Ilias) over 13 years ago
link: r31715.