On 20/08/2011 20:02, Dimitar Zhekov wrote:
On Sat, 20 Aug 2011 19:45:36 +0200 Colomban Wendling lists.ban@herbesfolles.org wrote:
Scintilla does it... oh, wait. Neither Scintilla nor SciTE can find the word in ‘боза’, or even ‘boza’ (grep does, for 8-bit text).
Not sure I get it?
For Scintilla/Geany, "boza" or "боза", enclosed in non-ASCII quotes, is not a word any more.
Ah OK, got it. Yeah, it doesn't treat the quotes as "blank chars", so the text doesn't fit "...consists of sequences of non-blank characters separated by blanks".
We could maybe add the non-ASCII quotes to the default blank chars (assuming non-ASCII chars work there), though there may be far too many "blank chars" to handle.
Maybe Scintilla could use a more complex algorithm to find the start/end of a Unicode word (Pango does it, but it seems to be a hard piece of code), though it would only work on Unicode data. It would also need a clever way of integrating the wordchars/blankchars, maybe simply ((unicode_is_word() || is_wordchar()) && ! is_blankchar())...
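To make the combined check concrete, here is a minimal sketch in Python (chosen for brevity, not because Scintilla uses it). The names WORDCHARS, BLANKCHARS, unicode_is_word() and word_at() are hypothetical, and the category test is only a rough approximation of Unicode word detection, far simpler than what Pango does.

```python
import unicodedata

# Hypothetical stand-ins for the editor's configurable character sets.
WORDCHARS = set("_")
# Assumption: the typographic quotes have been added to the blank chars.
BLANKCHARS = set(" \t\r\n\u2018\u2019\u201c\u201d")


def unicode_is_word(ch):
    # Rough approximation: letters (L*), marks (M*) and numbers (N*) are word chars.
    return unicodedata.category(ch)[0] in "LMN"


def is_word_char(ch):
    # (unicode_is_word() || is_wordchar()) && !is_blankchar()
    return (unicode_is_word(ch) or ch in WORDCHARS) and ch not in BLANKCHARS


def word_at(text, pos):
    """Return the word covering index pos, or "" if pos is not inside a word."""
    if not (0 <= pos < len(text)) or not is_word_char(text[pos]):
        return ""
    start = pos
    while start > 0 and is_word_char(text[start - 1]):
        start -= 1
    end = pos + 1
    while end < len(text) and is_word_char(text[end]):
        end += 1
    return text[start:end]
```

With these assumptions, word_at("‘боза’", 2) gives "боза", and the blank-chars override still wins over the Unicode test for any character listed in BLANKCHARS.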
Well, since using "Find previous/next selection" on a non-selected "g_new0(gchar" still finds the next/previous g_new0 or gchar [...]
OTOH, finding "regcomp(" in "(regcomp(&" still doesn't succeed. Guess I can't have it both ways.
Hum, sorry? How could finding "regcomp(" in a document containing "(regcomp (&" fail?
Word-finding "regcomp(" will not find it in "(regcomp(&", despite being enclosed by punctuation characters.
Ah, yeah, word search. Again, it doesn't fit "...consists of a sequence of non-blank characters separated by blanks" :(
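For illustration, a literal reading of that definition can be sketched as below. This is not Scintilla's actual search code: find_whole_word() and BLANKS are hypothetical names, and the blank set is assumed to be plain ASCII whitespace.

```python
# Assumption: only ASCII whitespace counts as "blank" by default.
BLANKS = set(" \t\r\n")


def find_whole_word(text, needle, blanks=BLANKS):
    """Return the index of the first occurrence of needle that is delimited
    by blanks (or the text edges) on both sides, or -1 if there is none."""
    if not needle:
        return -1
    start = text.find(needle)
    while start != -1:
        end = start + len(needle)
        left_ok = start == 0 or text[start - 1] in blanks
        right_ok = end == len(text) or text[end] in blanks
        if left_ok and right_ok:
            return start
        # Not a whole word here; keep looking further along.
        start = text.find(needle, start + 1)
    return -1
```

Under this model "regcomp(" is rejected in "(regcomp(&" because neither "(" nor "&" is a blank, and "boza" is rejected inside typographic quotes for the same reason. Passing a larger set, e.g. blanks=BLANKS | set("‘’"), makes the quoted case match, which is the "add the quotes to the blank chars" idea from earlier in the thread.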