Bug #20929
Open
TestTime has an assertion that differs from the current implementation.
Description
test/ruby/test_time.rb has the following assertion function.
def assert_zone_encoding(time)
  zone = time.zone
  assert_predicate(zone, :valid_encoding?)
  if zone.ascii_only?
    assert_equal(Encoding::US_ASCII, zone.encoding)
  else
    enc = Encoding.default_internal || Encoding.find('locale')
    assert_equal(enc, zone.encoding)
  end
end
In the current implementation, Time#zone is returned in US_ASCII or the locale encoding, which does not seem to take default_internal into account.
C:\>ruby -e "puts Time.now.zone"
東京 (標準時)
C:\>ruby -e "puts Time.now.zone.encoding"
Windows-31J
C:\>ruby -EWindows-31J:UTF-8 -e "puts Time.now.zone"
東京 (標準時)
C:\>ruby -EWindows-31J:UTF-8 -e "puts Time.now.zone.encoding"
Windows-31J
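For illustration, the behavior described above (US_ASCII for ASCII-only names, otherwise the locale encoding, with default_internal ignored) can be written as a plain predicate. This is only a sketch: the method name zone_encoding_as_described? is hypothetical, and it is not the change that was later committed.

```ruby
# Hypothetical helper: checks a zone name against the behavior described
# in this report. Encoding.default_internal is deliberately not consulted.
def zone_encoding_as_described?(zone, locale = Encoding.find("locale"))
  return false unless zone.valid_encoding?

  # ASCII-only names are expected in US-ASCII, everything else in the locale encoding.
  expected = zone.ascii_only? ? Encoding::US_ASCII : locale
  zone.encoding == expected
end

# Report the encodings involved on the current platform.
zone = Time.now.zone
puts "zone encoding:    #{zone.encoding}"
puts "locale encoding:  #{Encoding.find('locale')}"
puts "default_internal: #{Encoding.default_internal.inspect}"
puts "matches:          #{zone_encoding_as_described?(zone)}"
```

The locale is passed as a parameter so that the check can be exercised without changing the active code page.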
Updated by nobu (Nobuyoshi Nakada) about 2 months ago · Edited
- Description updated (diff)
Indeed, that assertion is incorrect.
But the locale is not always the correct/expected encoding on Windows.
For instance, in the Japanese edition, tm_zone is always CP932.
> chcp.com 932
現在のコード ページ: 932
> ruby -e "p Encoding.find('locale'), (z = Time.now.zone), z.encoding"
#<Encoding:Windows-31J>
"\x{938C}\x{8B9E} (\x{9557}\x{8F80}\x{8E9E})"
#<Encoding:Windows-31J>
This holds even when the locale (active code page) is changed.
> chcp.com 437
Active code page: 437
> ruby -e "p Encoding.find('locale'), (z = Time.now.zone), z.encoding"
#<Encoding:IBM437>
"\x93\x8C\x8B\x9E (\x95W\x8F\x80\x8E\x9E)"
#<Encoding:IBM437>
And of course this is regardless of the internal encoding.
> ruby -Ecp932 -e "p Encoding.find('locale'), (z = Time.now.zone), z.encoding"
#<Encoding:IBM437>
"\x93\x8C\x8B\x9E (\x95W\x8F\x80\x8E\x9E)"
#<Encoding:IBM437>
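The bytes in the output above are labeled IBM437 but are still CP932 data. A minimal sketch that relabels the same bytes as Windows-31J (without transcoding them) and then converts to UTF-8 recovers the name shown in the original report; the variable names are for illustration only.

```ruby
# Bytes copied from the output above, where they were labeled IBM437.
raw = "\x93\x8C\x8B\x9E (\x95W\x8F\x80\x8E\x9E)".b
mislabeled = raw.dup.force_encoding(Encoding::IBM437)

# Relabel (not transcode) the same bytes as Windows-31J, then convert to UTF-8.
relabeled = raw.dup.force_encoding(Encoding::Windows_31J)
decoded = relabeled.encode(Encoding::UTF_8)

puts mislabeled.encoding        # the label reported when the code page is 437
puts relabeled.valid_encoding?  # whether the bytes form valid Windows-31J
puts decoded
```

force_encoding only changes the label attached to the bytes, which is why it fits here, while encode actually rewrites the bytes for the target encoding.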
I don't think there is an API to find out which code page it is encoded in.
Maybe we should use the W API and encode it in UTF-8 rather than the locale.
@usa (Usaku NAKAMURA), what do you think?
Updated by usa (Usaku NAKAMURA) about 1 month ago
Maybe we should use the W API and encode it in UTF-8 rather than the locale.
Agreed.
Updated by nobu (Nobuyoshi Nakada) about 1 month ago
- Status changed from Open to Closed
Applied in changeset git|78762b52185aa80ee55c0d49b495aceed863dce2.
[Bug #20929] Fix assert_zone_encoding
The default internal encoding is not taken into account to encode
timezone name.
Updated by YO4 (Yoshinao Muramatsu) about 1 month ago
Thank you for your response.
Regarding Time#zone encoding, I am experimenting with it in my branch https://github.com/YO4/ruby/tree/tzname_utf8.
I found this issue while researching that.
At present, changing it to UTF-8 causes the following error.
>ruby -e 'puts "タイムゾーン:#{Time.now.zone}"'
-e:1:in '<main>': incompatible character encodings: Windows-31J and UTF-8 (Encoding::CompatibilityError)
To resolve this, other strings must also be in UTF-8 encoding.
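The error can be reproduced without Windows by combining a non-ASCII Windows-31J string with a non-ASCII UTF-8 string. In this sketch, label stands in for a script literal under a Windows-31J source encoding and zone stands in for a UTF-8 Time#zone value; both are assumptions made for illustration, as is the helper concat_error.

```ruby
# Stand-in for a literal in a script whose source encoding is Windows-31J.
label = "タイムゾーン:".encode(Encoding::Windows_31J)
# Stand-in for a Time#zone value returned in UTF-8 (the experimental behavior).
zone = "東京 (標準時)"

# Returns the exception raised by concatenation, or nil if it succeeds.
def concat_error(a, b)
  a + b
  nil
rescue Encoding::CompatibilityError => e
  e
end

error = concat_error(label, zone)
puts error.class
puts error.message

# Once both sides are UTF-8, concatenation works.
unified = label.encode(Encoding::UTF_8) + zone
puts unified
```

ASCII-only strings do not trigger the error, so the problem only shows up once both operands contain non-ASCII characters.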
I think it would be preferable for strings that can contain the full Unicode range to be UTF-8 encoded as well. OS-derived strings, excluding I/O content, seem to fit that description.
Of course, this matter should be discussed in another issue.
Thanks.
Updated by YO4 (Yoshinao Muramatsu) 29 days ago
In 78762b5 we changed the Time#zone encoding test so that it no longer considers the internal encoding,
but the documentation of Encoding.default_internal explicitly states that Time#zone is affected by it.
Do we need to reconsider?
GitHub PR #12409 changes Time#zone to respect Encoding.default_internal.
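As a rough idea of what "respecting Encoding.default_internal" could mean at the Ruby level, the sketch below transcodes a zone name only when an internal encoding is set. The helper zone_in_default_internal is hypothetical and is not taken from the PR, which may work differently.

```ruby
# Hypothetical helper: transcode a zone name to the internal encoding when
# one is configured. ASCII-only names are returned unchanged, in line with
# the US-ASCII branch of assert_zone_encoding.
def zone_in_default_internal(zone, internal = Encoding.default_internal)
  return zone if internal.nil? || zone.ascii_only?

  zone.encode(internal)
end

# Stand-in for a zone name as returned on a Japanese Windows system.
sjis_zone = "東京 (標準時)".encode(Encoding::Windows_31J)

puts zone_in_default_internal(sjis_zone, Encoding::UTF_8).encoding
puts zone_in_default_internal(sjis_zone, nil).encoding
```

Taking the internal encoding as a parameter avoids mutating the global Encoding.default_internal just to try the helper out.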
Updated by nobu (Nobuyoshi Nakada) 25 days ago
We are deferring this change until after the 3.4 release.
https://github.com/ruby/ruby/pull/12448
Updated by hsbt (Hiroshi SHIBATA) 11 days ago
- Status changed from Closed to Open