Encoding GBK needs update
When GBK was released in 1995, it included 95 characters that were not included in Unicode 1.1. Even now (as of Windows 7), these characters are still assigned Unicode PUA code points in CP936.
GBK isn't an official standard, so I don't think it will be updated anymore. But GB18030 is official, and its subset consisting of the one-byte and two-byte characters is sometimes also referred to as GBK. In GB18030-2005, 81 of the characters that had been assigned to the PUA are now mapped to characters defined in Unicode.
Actually, the remaining 14 characters are now defined in Unicode, too. Please take a look at gbk_fe05.gif (the light grey and light yellow ones).
All 95 of these characters are defined in Unicode now (see gbk_mod.htm), so I think we should add them to gbk-tbl.rb. It won't cause any compatibility issues, at least on the Ruby side.
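For anyone who wants to check how a particular Ruby build currently treats these byte pairs, here is a minimal sketch. The helper names `pua?` and `classify_gbk` are made up for illustration and are not part of Ruby or of gbk-tbl.rb; the sketch only assumes the standard `String#encode` transcoder and the BMP Private Use Area range U+E000..U+F8FF.

```ruby
# Basic Multilingual Plane Private Use Area, U+E000..U+F8FF.
PUA_RANGE = (0xE000..0xF8FF)

# Hypothetical helper: true if the string is non-empty and every
# code point in it lies in the BMP Private Use Area.
def pua?(str)
  !str.empty? && str.codepoints.all? { |cp| PUA_RANGE.cover?(cp) }
end

# Hypothetical helper: report how the local Ruby transcodes a GBK byte
# sequence to UTF-8.
#   :defined  -> converts to ordinary (non-PUA) code points
#   :pua      -> converts, but only to PUA code points
#   :unmapped -> the transcoder has no mapping or rejects the bytes
def classify_gbk(bytes)
  utf8 = bytes.dup.force_encoding(Encoding::GBK).encode(Encoding::UTF_8)
  pua?(utf8) ? :pua : :defined
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  :unmapped
end

# 0xD6D0 is an ordinary GBK character, so it should be :defined.
p classify_gbk("\xD6\xD0".b)
```

Passing one of the byte pairs listed in gbk_mod.htm to `classify_gbk` should show whether the mapping table in that Ruby version leaves it unmapped, sends it to the PUA, or maps it to a defined code point. What it returns depends on the Ruby version, so no result is claimed here.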
Updated by oCameLo (oCameLo oTnTh) almost 10 years ago
I could find only one mailing list thread about this problem: http://sources.redhat.com/ml/libc-alpha/2000-09/msg00394.html
For compatibility, we should accept this patch. But from a standards point of view, we should let it go.
Both ways are acceptable.
Updated by naruse (Yui NARUSE) almost 10 years ago
(2010/11/19 20:12), oCameLo oTnTh wrote:
> I just can find out only one mailing list thread about this problem
> For compatibility, we should accept this patch. But from the angle of
> standard, let it go.
Ruby's mapping table should follow de facto or de jure standards.
In the current situation, I have to say the expectation of compatibility is
wrong (as you may know, the Euro sign is also incompatible).
So until other implementations such as converters, editors, or web browsers
support such a table, Ruby won't support it.
NARUSE, Yui firstname.lastname@example.org