Bug #2379

String#[] returns invalid values for short multibyte strings

Added by raorn (Alexey Froloff) almost 11 years ago. Updated over 9 years ago.

Target version:
ruby -v:
ruby 1.9.1p333 (2009-11-02)


In a UTF-8 locale, the command
ruby -e 'print "й"[0,30]' | od -t x1

prints trailing NUL bytes after the expected d0 b9:
0000000 d0 b9 00 00
for ruby 1.9.1p333 (2009-11-02) [i586-linux-gnu]

0000000 d0 b9 00 00 00 00 00 00
for ruby 1.9.1p333 (2009-11-02) [x86_64-linux-gnu]

Minimum "len" to reproduce is 9 for i586 and 17 for x86_64.
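
One way to find that minimum is to scan "len" values and compare bytes. The following is only a rough sketch: first_bad_len is a hypothetical helper name (not part of Ruby), and the upper bound of 64 is an arbitrary assumption. On a build that contains the fix it should find nothing and return nil.

```ruby
# Hypothetical helper: returns the smallest len for which str[0, len]
# differs byte-for-byte from str, or nil if no such len is found.
# Every len >= str.length should yield the whole string unchanged.
def first_bad_len(str, max_len = 64)
  (str.length..max_len).find do |len|
    str[0, len].bytes.to_a != str.bytes.to_a
  end
end

sample = "\u0439"        # the letter from the report, bytes d0 b9
p sample.bytes.to_a      # bytes a correct substring must have
p first_bad_len(sample)  # nil when no len misbehaves
```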

Related issues

Has duplicate Backport191 - Backport #3633: String accessor [Fixnum, Fixnum] produces wrong result in 1.9.1 - Closed - yugui (Yuki Sonoda)
Has duplicate Backport191 - Backport #4028: substring selection and utf8 encoding problem - Closed - yugui (Yuki Sonoda)

Updated by raorn (Alexey Froloff) almost 11 years ago

"\u{444}" is a better test string.


Updated by naruse (Yui NARUSE) almost 11 years ago

  • Category set to M17N
  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)
  • Target version set to 1.9.2




Updated by raorn (Alexey Froloff) almost 11 years ago

This happens for all UTF-8 strings shorter than sizeof(VALUE) bytes when len is greater than sizeof(VALUE)*2. The problem lies somewhere in the str_utf8_nth() function.
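
That rule matches the minimums reported for both platforms. A small sketch of the arithmetic, assuming sizeof(VALUE) is 4 on i586 and 8 on x86_64; min_failing_len is a hypothetical name used only for illustration.

```ruby
# Smallest len strictly greater than sizeof(VALUE) * 2.
def min_failing_len(sizeof_value)
  sizeof_value * 2 + 1
end

p min_failing_len(4)  # i586   => 9
p min_failing_len(8)  # x86_64 => 17
```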


Updated by nobu (Nobuyoshi Nakada) almost 11 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r25830.
Alexey, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

