Feature #18563
Status: Closed
Add "graphemes" and "each_grapheme" aliases
Description
https://bugs.ruby-lang.org/issues/13780#note-10
grapheme sounds like an element in the grapheme cluster. How about each_grapheme_cluster?
If everyone gets used to the grapheme as an alias of grapheme cluster, we'd love to add an alias each_grapheme.
Matz.
Languages that have added grapheme cluster support seem to be almost exclusively opting for the shorter "graphemes" alias as a part that stands for the whole.
- JavaScript/TypeScript grapheme-splitter library: splitGraphemes
- PHP: grapheme_extract
- Zig ziglyph library: GraphemeIterator
- Golang uniseg library: NewGraphemes
- Matlab: splitGraphemes
- Python grapheme library: graphemes
- Elixir: graphemes
- Crystal uni_text_seg library: graphemes
- Nim nim-graphemes library: graphemes
- Rust unicode-segmentation library: graphemes
Now that some time has passed and the "graphemes" alias for "grapheme clusters" has been fairly widely adopted by languages and libraries, I'd like to go ahead and propose a graphemes alias for grapheme_clusters and an each_grapheme alias for each_grapheme_cluster.
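For illustration only, the proposed aliases can be sketched in user code by reopening String. This is not part of Ruby core (the request was closed without adding them), and reopening a core class here is just an assumption about how someone might try out the names locally:

```ruby
# Hypothetical user-level sketch: Ruby core does NOT define these aliases.
class String
  # Proposed short names for the existing core methods.
  alias_method :graphemes, :grapheme_clusters
  alias_method :each_grapheme, :each_grapheme_cluster
end

# "cafe" followed by U+0301 (combining acute accent): 5 code points,
# but the final "e" and the accent form a single grapheme cluster.
word = "cafe\u0301"

code_points = word.chars.length       # counts code points
clusters    = word.graphemes.length   # counts grapheme clusters

# each_grapheme yields one grapheme cluster per iteration.
collected = []
word.each_grapheme { |g| collected << g }
```

With these definitions, graphemes and each_grapheme behave exactly like grapheme_clusters and each_grapheme_cluster, since alias_method only adds a second name for the same method.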
Updated by mame (Yusuke Endoh) almost 3 years ago
- Related to Feature #13780: String#each_grapheme added
Updated by mame (Yusuke Endoh) almost 3 years ago
- Description updated (diff)
(I have added to the description a URL to matz's original statement)
Updated by znz (Kazuhiro NISHIYAMA) over 2 years ago
- Subject changed from Add "graphemes" and "each_grapheme aliases to Add "graphemes" and "each_grapheme" aliases
Updated by nobu (Nobuyoshi Nakada) over 2 years ago
How about letters and each_letter?
Updated by matz (Yukihiro Matsumoto) over 2 years ago
- Status changed from Open to Closed
For the record, "Grapheme" and "Grapheme cluster" are different concepts. If we call them "grapheme", it's kind of like calling "Wikipedia" "Wiki".
Until the Unicode Consortium defines a shorter name for them, or the convention of calling them "grapheme" becomes common sense, we won't provide such aliases. So my opinion has not changed since.
Short answer: "not yet".
Matz.
Updated by Dan0042 (Daniel DeLorme) over 2 years ago
nobu (Nobuyoshi Nakada) wrote in #note-4:
How about letters and each_letter?
I like the general idea, but to me "letters" means \p{L}
[retracted part]
Or how about characters and each_character?
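As a small illustration of the concern about "letters" suggesting \p{L}, the sketch below compares a \p{L} scan with grapheme_clusters on a short string. The sample string is an assumption chosen for the example; only core String methods are used:

```ruby
# Sample text (chosen for illustration): "e" + U+0301 combining acute
# accent, then a space, then a digit.
text = "e\u0301 1"

# \p{L} matches only code points in the Unicode Letter category.
# The combining accent is a Mark (\p{M}), so it is not matched,
# and neither are the space or the digit.
letters = text.scan(/\p{L}/)

# grapheme_clusters splits the whole string into user-perceived
# characters, whether or not they are letters.
clusters = text.grapheme_clusters
```

Here letters contains only the bare "e", while clusters contains three elements: the accented "e" as one cluster, the space, and the digit.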