Bug #1943

closed

unexpected behavior of tr with unicode strings

Added by orban (Tuples Arefun) over 15 years ago. Updated over 13 years ago.

Status:
Rejected
Assignee:
-
ruby -v:
ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-darwin9.7.0]
[ruby-core:24937]

Description

=begin
Unicode code point 8221 should be replaced by 34, but it is replaced by 43 instead:

wide = [12288, 65288, 65289, 65291, 12540, 8221]
ascii = [32, 40, 41, 43, 45, 34]
foo = [8221, 19997, 37329, 22825, 20351, 8221]
bar = foo.pack('U*').tr(wide.pack('U*'), ascii.pack('U*'))
bar.unpack('U*')

=> [43, 19997, 37329, 22825, 20351, 43]

It works correctly in this example:

[8221].pack('U*').tr([8221].pack('U*'), [34].pack('U*')).unpack('U*')
=> [34]

Why, I don't know.
=end

Actions #1

Updated by mame (Yusuke Endoh) over 15 years ago

  • Status changed from Open to Rejected

=begin
Not a bug.
By design, String#tr treats some characters as meta characters, including '-' (ASCII 45), which denotes a range.
What you are doing is similar to:

"F@@@@F".tr("ABCDEF", "abcd-a") #=> "d@@@@d"
"F".tr("F", "a") #=> "a"

You should escape it with a backslash:

"F@@@@F".tr("ABCDEF", "abcd\\-a") #=> "a@@@@a"

BTW, in a discussion with nurse, we noticed that an empty range (such as "d-a")
seems to cause unexpected behavior. I'll register a separate ticket.

--
Yusuke ENDOH
=end
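
For reference, the escape suggested above can be applied to the code points from the original report. This is only a minimal sketch: it assumes the simplest fix is to insert code point 92 (backslash) before 45 ('-') in the replacement set, so that String#tr reads the hyphen literally instead of as the range 43..34. The variable name `result` is introduced here for illustration.

```ruby
# Source set: full-width characters plus U+201D (8221); none of them is a tr meta character.
wide  = [12288, 65288, 65289, 65291, 12540, 8221]

# Replacement set with 92 (backslash) inserted before 45 ('-'),
# so tr sees the string ' ()+\-"' and treats '-' as a plain character.
ascii = [32, 40, 41, 43, 92, 45, 34]

foo = [8221, 19997, 37329, 22825, 20351, 8221]

result = foo.pack('U*').tr(wide.pack('U*'), ascii.pack('U*')).unpack('U*')
p result
# => [34, 19997, 37329, 22825, 20351, 34]
```

With the backslash in place, both sets contain six entries after tr parses them, so 8221 lines up with 34 as the reporter intended.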
