Feature #12272

Accepting HTML entity name in string literal

Added by sawa (Tsuyoshi Sawada) about 4 years ago. Updated about 4 years ago.


String literals allow the escape sequence \u to describe a character by its Unicode code point, like this:

"\u201c" # left double quote
"\u2191" # up arrow
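As a small sketch of what the escape denotes: the hex digits after \u are a Unicode code point, while UTF-8 is only the byte encoding of the resulting string. The variable names below are made up for illustration.

```ruby
# The digits after \u name a Unicode code point, not UTF-8 bytes.
quote = "\u201c"                       # left double quote
code_point_hex = quote.ord.to_s(16)    # code point as a hex string
utf8_bytes = quote.bytes               # the UTF-8 bytes that encode it

arrow = "\u2191"                       # up arrow
arrow_bytes = arrow.bytes
```

Here `code_point_hex` is "201c", whereas `utf8_bytes` holds the three bytes 0xE2, 0x80, 0x9C.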

This is useful for typing characters that are not easy to input from the keyboard. However, most people do not know the Unicode code points by heart.

HTML entity names are a reasonable compromise, I think (although they do not cover all of Unicode). I would like string literals to be extended to accept HTML entity names and interpret them as the corresponding characters. I do not have a definite idea for the syntax, but one candidate is an escape sequence of the form \& ... ;, so that we can type:

"\&ldquo;" # left double quote
"\&uarr;"  # up arrow

Currently, "\&" is interpreted as "&", so this would be a compatibility-breaking change. If that is not desirable, a different syntax could be considered.
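To illustrate the current behavior and the intended result, here is a minimal run-time sketch. It assumes a hand-written lookup table: `ENTITY_TABLE` and `expand_entities` are hypothetical names, not part of Ruby, and the table covers only the two characters used in the examples above. It approximates the proposal with a method call rather than with literal syntax.

```ruby
# Hypothetical helper, not part of Ruby: maps a few HTML entity names
# to their characters. A real table would need the full HTML entity list.
ENTITY_TABLE = {
  "ldquo" => "\u201c", # left double quote
  "uarr"  => "\u2191", # up arrow
}.freeze

# Replaces each known &name; with its character.
# Unknown names are left untouched.
def expand_entities(text)
  text.gsub(/&(\w+);/) { ENTITY_TABLE.fetch($1, $&) }
end

# Today "\&" is read as "&", so the proposed literal is plain text.
current = "\&ldquo;"                  # same string as "&ldquo;"
expanded = expand_entities(current)   # what the proposal would yield
```

Because `current` is already a valid string today, any program that relies on "\&" meaning "&" would change behavior under the proposed syntax.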
