Bug #16927

closed

String#tr won't return the expected result for some sign with diacritics

Added by psychoslave (mathieu lovato stumpf guntz) about 6 years ago. Updated over 2 years ago.

Status:

Rejected

Assignee:

Target version:

ruby -v:

ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-linux]

Backport:

2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN

[ruby-core:98600]

Description

Context¶

This is not essential to the bug, but I always appreciate being given more context, so here it is. As part of a larger project, I needed a way to utter every number from 0 to 255 as a single syllable, written as consonant-vowel-consonant (CVC) in IPA. To avoid ambiguity, the nomenclature should not collide with existing number words such as "six" and "ten" in any language for which I could find documentation. As that was not nerdy enough, I came up with the idea of using diacritics to mark primes and congruence with 2, 8, 12, and 16 (optionally, and without any intended phonological alteration). If you are curious about it, you can look at the algorithm I used to build the nomenclature matching the specification.

Code to reproduce the bug¶

#!/usr/bin/env ruby
translated = 'aeiou'.tr('aeiou', 'ą̂ę̂į̂ǫ̂ų̂')
substitued = 'aeiou'.sub(/aeiou/, 'ą̂ę̂į̂ǫ̂ų̂')
puts `ruby -v`, translated == substitued, translated, substitued

Actual result¶

On my box, this outputs:

ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-linux]
false
ą̂ę̂į
ą̂ę̂į̂ǫ̂ų̂

Expected result¶

String#tr should behave consistently: either it should fail for every sign with similar diacritics, or (preferably) it should return the specified Unicode glyphs. That is, in the code above, translated == substitued should be true.

Remarks¶

I am not a Unicode guru: maybe the missing signs that cause the difference come from the way they are encoded. I am aware that some glyphs exist in two forms: as a single precomposed code point and as a sequence of combining code points. However, I cannot tell whether the code above uses a mixture of both.
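One way to check how the signs are encoded is to list the code points behind each user-perceived character. The sketch below is only an illustration: it assumes each target sign is a precomposed letter with ogonek followed by U+0302 COMBINING CIRCUMFLEX ACCENT, which is consistent with the output shown above but is not confirmed by the report. The string is written with escapes so that the code points are unambiguous.

```ruby
# Assumption: each sign is "letter with ogonek" + U+0302 (combining circumflex).
targets = "\u0105\u0302\u0119\u0302\u012F\u0302\u01EB\u0302\u0173\u0302"

puts targets.length                    # code points          # => 10
puts targets.grapheme_clusters.length  # perceived characters # => 5

# Print every grapheme cluster with the code points it is made of.
targets.each_grapheme_cluster do |cluster|
  hex = cluster.codepoints.map { |cp| format('U+%04X', cp) }.join(' ')
  puts "#{cluster}  #{hex}"
end

# NFC leaves the string unchanged: Unicode has no single code point
# for these letter + ogonek + circumflex combinations.
puts targets.unicode_normalize(:nfc) == targets  # => true
```

Under that assumption every sign takes two code points, so normalization alone cannot make the replacement string line up one-to-one with 'aeiou'.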

Updated by nobu (Nobuyoshi Nakada) about 6 years ago
#1 [ruby-core:98603]

Subject changed from String.tr won't return the expected result for some sign with diacritics to String#tr won't return the expected result for some sign with diacritics

This is because String#tr translates each codepoint, not each grapheme cluster, for now.
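To illustrate the codepoint-wise pairing, here is a small sketch that rebuilds the mapping by hand and compares it with what String#tr returns. It assumes the replacement signs are encoded as a precomposed letter with ogonek followed by U+0302; the variable names are illustrative only.

```ruby
from = 'aeiou'
# Assumption: "letter with ogonek" + U+0302 for each of the five signs.
to = "\u0105\u0302\u0119\u0302\u012F\u0302\u01EB\u0302\u0173\u0302"

# Pair the n-th code point of `from` with the n-th code point of `to`.
# Only the first five code points of `to` are ever used.
pairs = from.chars.zip(to.chars.first(from.length))
pairs.each { |src, dst| puts format('%s -> U+%04X', src, dst.ord) }

# Applying that table by hand gives the same string as String#tr.
table = pairs.to_h
by_codepoint = from.chars.map { |c| table.fetch(c) }.join
puts by_codepoint == from.tr(from, to)  # => true
puts by_codepoint.length                # => 5
```

With this pairing, 'e' and 'o' are both mapped to the bare combining circumflex, which is why the translated string shows only three visible signs.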

Updated by sawa (Tsuyoshi Sawada) about 6 years ago
#2

Description updated (diff)

Updated by naruse (Yui NARUSE) over 2 years ago
#3 [ruby-core:115010]

As nobu says, String#tr works on codepoints, so this proposal is rejected.
A dedicated method for this use case might be reasonable, but the current use case is not enough to design one.
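In the meantime, a grapheme-aware translation can be written in user code. The following is a minimal sketch, not a proposed API: `tr_graphemes` is a hypothetical helper name, and it deliberately ignores the range (`a-z`) and negation (`^`) syntax that String#tr supports. The replacement string again assumes precomposed letters with ogonek followed by U+0302.

```ruby
# Hypothetical helper (not part of Ruby): translate grapheme cluster by
# grapheme cluster instead of code point by code point.
def tr_graphemes(str, from, to)
  sources = from.grapheme_clusters
  targets = to.grapheme_clusters
  # If `to` has fewer clusters than `from`, reuse its last cluster.
  table = sources.each_with_index.to_h { |src, i| [src, targets[i] || targets.last] }
  # Clusters that are not in the table are copied unchanged.
  str.each_grapheme_cluster.map { |g| table.fetch(g, g) }.join
end

# Assumption: "letter with ogonek" + U+0302 for each of the five signs.
to = "\u0105\u0302\u0119\u0302\u012F\u0302\u01EB\u0302\u0173\u0302"

translated = tr_graphemes('aeiou', 'aeiou', to)
substituted = 'aeiou'.sub(/aeiou/, to)
puts translated == substituted  # => true
```

Because the table is keyed by whole grapheme clusters, each vowel receives its complete sign, including the combining accent.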

Updated by naruse (Yui NARUSE) over 2 years ago
#4

Status changed from Open to Rejected
