My [keywords] section is as follows: `primary=Cheese Käse Сыр Сыp Cыp \u0421ыр Déjà Уже Already Bereits HНOО`
I have a UTF-8 encoded file colored like this: **Cheese Käse** Сыр Сыp **Cыp** **Déjà** Уже **Already Bereits HНOО**
Words in **bold** are treated as keywords, but any word with an initial Cyrillic letter is ignored.
What filetype?
lexer_filetype=C
The way highlighting works is that a language-specific lexer (C in your case) analyses the input according to the rules of its language. So "words" (keywords, identifiers, etc., depending on the language) are first identified by the lexer, then compared to one or more of the keyword lists.
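As a rough illustration of that two-step process (this is not Lexilla's actual code), the sketch below first splits the text into words and then looks each word up in the keyword lists. The names `KEYWORD_LISTS`, `style_for`, and `highlight`, and the list contents, are all hypothetical.

```python
import re

# Hypothetical keyword lists, analogous to the entries of a [keywords] section.
KEYWORD_LISTS = {
    "primary": {"if", "else", "return", "Cheese"},
    "secondary": {"size_t", "FILE"},
}

# Step 1: the "lexer" identifies words using its language's rules
# (assumed here: ASCII letter or underscore, then ASCII letters, digits, underscores).
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def style_for(word):
    """Step 2: compare an identified word against each keyword list."""
    for list_name, words in KEYWORD_LISTS.items():
        if word in words:
            return list_name
    return "identifier"


def highlight(text):
    """Return (word, style) pairs for every word the lexer identified."""
    return [(m.group(), style_for(m.group())) for m in WORD.finditer(text)]
```

The point of the sketch is the ordering: a word that step 1 never identifies can never match in step 2, whatever the keyword list contains.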
C keywords and identifiers had to begin with an ASCII letter or underscore until recently, when the ability to use Unicode escape sequences was introduced; see [cppreference](https://en.cppreference.com/w/c/language/identifier).
Note that allowing unescaped Unicode in identifiers is implementation-defined, not standard C. It appears that the lexer is lenient about trailing characters, but has not been updated to allow escape sequences or Unicode as leading characters.
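The behaviour described here can be reproduced with a small sketch. It assumes, purely for illustration, that a word must start with an ASCII letter or underscore but may continue with any non-ASCII character. The helpers `is_word_start`, `is_word_char`, and `words` are hypothetical and are not taken from the Lexilla source. The sample words are written with `\u` escapes so that the Latin and Cyrillic look-alikes cannot be confused.

```python
def is_word_start(ch):
    # Assumed rule: only an ASCII letter or underscore may begin a word.
    return ch == "_" or (ch.isascii() and ch.isalpha())


def is_word_char(ch):
    # Assumed lenient rule: ASCII letters, digits, underscore,
    # or any non-ASCII character may continue a word.
    return ch == "_" or (ch.isascii() and ch.isalnum()) or not ch.isascii()


def words(text):
    """Yield words under the assumed rules. A run of word characters that
    does not begin with a valid start character is skipped entirely."""
    i, n = 0, len(text)
    while i < n:
        if is_word_char(text[i]):
            j = i
            while j < n and is_word_char(text[j]):
                j += 1
            if is_word_start(text[i]):
                yield text[i:j]
            i = j
        else:
            i += 1


# Sample words: Cheese, K(a-umlaut)se, all-Cyrillic "syr",
# Latin C + Cyrillic yeru + Latin p, and all-Cyrillic "uzhe".
KEYWORDS = {"Cheese", "K\u00e4se", "\u0421\u044b\u0440", "C\u044bp", "\u0423\u0436\u0435"}
TEXT = "Cheese K\u00e4se \u0421\u044b\u0440 C\u044bp \u0423\u0436\u0435"

matched = [w for w in words(TEXT) if w in KEYWORDS]
```

Under these assumptions `matched` contains only the three words that begin with a Latin letter. The two words starting with a Cyrillic letter are never produced by `words`, so they cannot match even though they are in `KEYWORDS`, which is consistent with the colouring reported above.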
Lexers come from the [Lexilla](https://github.com/ScintillaOrg/lexilla) project, so patches should be provided there first.
Looks like the patch is provided [here](https://github.com/ScintillaOrg/lexilla/commit/c1a8d798e2cad76aae9d4425819be...). The topic is [here](https://github.com/ScintillaOrg/lexilla/issues/130).
Thanks for upstreaming; it will be imported when Geany is updated to Lexilla 5.2.2.
Thank you!