Actually I didn't mean where the code is using it, but rather what are the user-visible symptoms of the code not doing it correctly :)

Well, the functions above aren't just where the code is using GEANY_WORDCHARS, but rather "where the code is using it incorrectly", so the user-visible symptoms follow from them. To test all the situations we would need a ctags parser that supports all the features and has some extra identifier character such as $, -, etc. I've been too lazy to search for such a parser, so I simulated this on C by removing _ from GEANY_WORDCHARS and treating it as the "missing identifier character" (standing in for $ or - in other languages):

editor_start_auto_complete() - check the screenshot below - the word boundary isn't determined correctly, so autocompletion is offered only for the sequence following the last _. I'm pretty sure you'll run into the same thing with Verilog, which allows $ inside identifiers.
Screenshot.2024-11-17.at.22.57.38.png

autocomplete_scope() - scope autocompletion works for variables without _ but not for variables with _.
Screenshot.2024-11-17.at.22.59.28.png

editor_show_calltip() - no calltip for functions containing _ because of incorrect word boundaries
Screenshot.2024-11-17.at.23.00.26.png

symbols_goto_tag() - again, because of incorrect word boundaries the goto attempt is made for an incorrect word. For a Verilog example see #4037 (comment)
Screenshot.2024-11-17.at.22.56.12.png

editor_get_word_at_pos() - I have no example but clearly this function propagates this error to plugins using it.
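
To make the common cause of the symptoms above concrete, here is a minimal standalone sketch of a word-boundary scan. It is not Geany's actual code: word_at(), is_word_char() and the two macros are hypothetical names, and the default character set is my assumption of what GEANY_WORDCHARS contains. It only shows how a missing identifier character truncates the word that everything else then works with.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for GEANY_WORDCHARS; check the Geany sources for the real value. */
#define DEFAULT_WORDCHARS "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
/* Same set with '_' removed, simulating a language whose extra identifier
 * character ($, -, ...) is missing from the set. */
#define NO_UNDERSCORE_WORDCHARS "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Non-ASCII bytes always count as word characters (roughly what a
 * !IS_ASCII() check achieves); ASCII bytes must be listed in wordchars. */
static int is_word_char(unsigned char c, const char *wordchars)
{
    if (c >= 0x80)
        return 1;
    return c != '\0' && strchr(wordchars, c) != NULL;
}

/* Copy the word touching byte offset pos of line into out (NUL-terminated,
 * truncated to fit out_size). */
static void word_at(const char *line, size_t pos, const char *wordchars,
                    char *out, size_t out_size)
{
    size_t len = strlen(line);
    size_t start = pos > len ? len : pos;
    size_t end = start;
    size_t n;

    assert(out_size > 0);

    /* Extend left and right while the neighbouring byte is a word character. */
    while (start > 0 && is_word_char((unsigned char)line[start - 1], wordchars))
        start--;
    while (end < len && is_word_char((unsigned char)line[end], wordchars))
        end++;

    n = end - start;
    if (n >= out_size)
        n = out_size - 1;
    memcpy(out, line + start, n);
    out[n] = '\0';
}
```

With the cursor right before the ( in "my_function(", the default set yields my_function, while the set without _ yields only function, which is the word autocompletion, calltips and goto would then look up.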

In any case, read_current_word() is the source of all these problems. If it were modified to use wordchars, users could adjust this behavior to the specific needs of the languages they use (or better, we'd provide the right wordchars in the default configuration). We could then avoid the hacks in editor_start_auto_complete(), which handle CSS and LaTeX but nothing else. We could also make sure that wordchars always contains at least GEANY_WORDCHARS to harden against bad user configurations.
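
The hardening idea could look roughly like the sketch below. This is only an illustration in plain C, not a patch: merge_wordchars(), append_unique() and REQUIRED_WORDCHARS are hypothetical names, and the required set is my assumption of what GEANY_WORDCHARS contains. Real code would presumably use GLib strings and the existing filetype configuration instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for GEANY_WORDCHARS. */
#define REQUIRED_WORDCHARS "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Append every character of src that is not yet present in out.
 * Returns 0 on success, -1 if out (of capacity out_size) is too small. */
static int append_unique(char *out, size_t out_size, const char *src)
{
    size_t n = strlen(out);

    for (; *src != '\0'; src++) {
        if (strchr(out, *src) != NULL)
            continue;               /* already in the set */
        if (n + 1 >= out_size)
            return -1;              /* no room for the char plus the NUL */
        out[n++] = *src;
        out[n] = '\0';
    }
    return 0;
}

/* Build the effective set: the user's wordchars first, then whatever is
 * missing from the required defaults. */
static int merge_wordchars(const char *user, char *out, size_t out_size)
{
    assert(out_size > 0);
    out[0] = '\0';
    if (append_unique(out, out_size, user) != 0)
        return -1;
    return append_unique(out, out_size, REQUIRED_WORDCHARS);
}
```

A user configuration containing only "$" would then still end up with letters, digits and _ in the effective set, so a bad wordchars setting could add identifier characters but never remove the basic ones.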

One good thing I was wrong about is Unicode characters - those !IS_ASCII() checks seem to do the right thing in read_current_word(), and all the simple Unicode cases with Czech I tried worked fine for me.
