Feature #19908
open
- Related to Bug #10416: Create mechanism for updating of Unicode data files downstreams when we want added
- Target version deleted (3.3)
@nobu (Nobuyoshi Nakada):
We have Grapheme_Cluster_Break=..., so I think '=' may be appropriate. But Grapheme_Cluster_Break=... uses a long, explicit name. So shouldn't it be Indic_Conjunct_Break=..., not just InCB=...?
- Related to Bug #20150: Memory leak in grapheme clusters added
Isn't this the updated regular expression?
ccs-base := [\p{L}\p{N}\p{P}\p{S}\p{Zs}]
ccs-extend := [\p{M}\p{Join_Control}]
extended_base := ccs-base
| hangul-syllable
-crlf := CR LF
+crlf := CR LF | CR | LF
legacy-core := hangul-syllable
| ri-sequence
| xpicto-sequence
legacy-postcore := [Extend ZWJ]
core := hangul-syllable
| ri-sequence
| xpicto-sequence
+| conjunctCluster
| [^Control CR LF]
postcore := [Extend ZWJ SpacingMark]
precore := Prepend
hangul-syllable := L* (V+ | LV V* | LVT) T*
| L+
| T+
xpicto-sequence := \p{Extended_Pictographic} (Extend* ZWJ \p{Extended_Pictographic})*
+conjunctCluster := \p{InCB=Consonant} ([\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Linker} [\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Consonant})+
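To make the added conjunctCluster rule easier to check by hand, here is a rough sketch in Python. It is not how Ruby or Onigmo implements it. It assumes a tiny hand-written table of Indic_Conjunct_Break (InCB) values covering only a few Devanagari code points, and the names INCB, incb_class, and conjunct_clusters are made up for this illustration. Each character is mapped to a one-letter class, and the rule is then matched against that class string.

```python
import re

# Hypothetical, hand-picked subset of InCB values (Devanagari only).
# A real implementation would read these from the Unicode data files.
INCB = {
    "\u094D": "L",  # DEVANAGARI SIGN VIRAMA: InCB=Linker
    "\u093C": "E",  # DEVANAGARI SIGN NUKTA: InCB=Extend
    "\u200D": "E",  # ZERO WIDTH JOINER: InCB=Extend
}


def incb_class(ch):
    """Return 'C', 'L', 'E', or 'x' (no InCB value in this toy table)."""
    if ch in INCB:
        return INCB[ch]
    if "\u0915" <= ch <= "\u0939":  # DEVANAGARI LETTER KA..HA: InCB=Consonant
        return "C"
    return "x"


# conjunctCluster := Consonant ([Extend Linker]* Linker [Extend Linker]* Consonant)+
CONJUNCT = re.compile(r"C(?:[EL]*L[EL]*C)+")


def conjunct_clusters(text):
    """Return the substrings of text that match the conjunctCluster rule."""
    classes = "".join(incb_class(ch) for ch in text)
    # One class letter per code point, so match offsets carry over to text.
    return [text[m.start():m.end()] for m in CONJUNCT.finditer(classes)]


# KA + VIRAMA + SSA forms one conjunct; KA + SSA without a linker does not.
print(conjunct_clusters("\u0915\u094D\u0937"))
print(conjunct_clusters("\u0915\u0937"))
```

Note that at least one Linker is required between two consonants, so a consonant followed only by Extend characters and another consonant does not form a conjunct under this rule.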
@janosch-x (Janosch Müller) You are correct, thanks! I noticed it a few days ago, but hadn't yet gotten around to writing about it here. You beat me to it!
hsbt (Hiroshi SHIBATA) wrote in #note-8:
Unicode 16.0 has been released.
Should we move this instead of 15.1?
I think it's more prudent to do 15.1 first, then 16.0. I hope to be able to work on this soon. I created a separate issue for 16.0.
I think it's more prudent to do 15.1 first, then 16.0.
Agreed, thanks!
- Has duplicate Feature #19171: Update Unicode data to Unicode Version 15.1 added