Feature #2061
closed
Named Unicode Character Escapes
Description
=begin
I suggest the addition of a \N{name} escape where name is the name of a Unicode character. It would resolve to the corresponding codepoint. 'N' is chosen because it's used by both Perl and Python for the same purpose.
This promotes more readable code compared to \u{} escapes because \N{WHITE SMILING FACE} is self-documenting whereas \u263A isn't. It can even be useful when the source encoding is UTF-8 because the meaning of unfamiliar glyphs is often clearer when they are named.
They should:
- Normalise the name by converting to uppercase and replacing underscores with spaces.
- Force the string's encoding to UTF-8 in the same fashion as \u{}.
- Optionally support Perl's aliases for names containing parentheses as detailed in http://perldoc.perl.org/charnames.html .
- Work inside regexp literals.
I'd hoped to write this patch myself, but was unable. I'm happy to update tool/enc-unicode.rb and RubySpec, if that would help.
=end
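The behaviour requested above can be approximated at run time to make the proposal concrete. The following is only a sketch under stated assumptions, not the patch being asked for: `UNICODE_NAMES`, `normalise_name` and `expand_named_escapes` are hypothetical names, the two-entry table stands in for data that would really be generated from the Unicode Character Database, and the actual feature would live in the lexer rather than in a helper method. Perl's aliases and regexp literals are not covered.

```ruby
# Hypothetical, hand-written name table for illustration only. A real
# implementation would generate this from the Unicode Character Database.
UNICODE_NAMES = {
  "WHITE SMILING FACE"   => 0x263A,
  "LATIN SMALL LETTER A" => 0x0061,
}.freeze

# Normalise a name as the proposal describes: convert to uppercase and
# replace underscores with spaces.
def normalise_name(name)
  name.upcase.tr("_", " ")
end

# Expand every \N{name} sequence in +source+ to the named character.
# Each replacement is built as a UTF-8 character, mirroring how \u{}
# yields UTF-8. An unknown name raises KeyError (via Hash#fetch).
def expand_named_escapes(source)
  source.gsub(/\\N\{([^}]+)\}/) do
    codepoint = UNICODE_NAMES.fetch(normalise_name(Regexp.last_match(1)))
    codepoint.chr(Encoding::UTF_8)
  end
end

# Single quotes keep the backslash, so the helper sees the raw escape.
puts expand_named_escapes('\N{white_smiling_face}')
```

A plain Hash is used here only because it is the simplest lookup to read; it says nothing about which data structure would suit the interpreter itself.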
Updated by naruse (Yui NARUSE) over 14 years ago
- Status changed from Open to Assigned
- Assignee set to naruse (Yui NARUSE)
=begin
I agree with this.
If you can't change regexp or other core things, I can do it.
You may already know these, but for others, the related documents are:
http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Name
http://www.unicode.org/Public/5.1.0/ucd/NamesList.html
http://www.unicode.org/reports/tr18/#Name_Properties
=end
Updated by runpaint (Run Paint Run Run) over 14 years ago
=begin
> If you can't change regexp or other core things, I can do it.
Thank you. I made a couple of attempts but made no progress. :-(
=end
Updated by naruse (Yui NARUSE) over 14 years ago
=begin
> - Normalise the name by converting to uppercase and replacing underscores with spaces.
I have done this for properties in r24836.
=end
Updated by naruse (Yui NARUSE) over 14 years ago
- Status changed from Assigned to Closed
=begin
Reopen this when you have done it.
=end
Updated by runpaint (Run Paint Run Run) over 14 years ago
=begin
I must have been unclear: I am not able to implement this feature. It requires changes across multiple source files and a familiarity with the lexer. In addition, it is not clear to me what the ideal data structure is.
=end