[Geany-devel] Use of Scintilla word boundaries for word searches

Colomban Wendling lists.ban at xxxxx
Sat Aug 20 18:19:56 UTC 2011


On 20/08/2011 20:02, Dimitar Zhekov wrote:
> On Sat, 20 Aug 2011 19:45:36 +0200
> Colomban Wendling <lists.ban at herbesfolles.org> wrote:
> 
>>> Scintilla does it... oh, wait. Neither scintilla nor scite can find
>>> the word in ‘боза’, or even ‘boza’ (grep does, for 8-bit text).
>>
>> Not sure I get it?
> 
> For Scintilla/Geany, "boza" or "боза", enclosed in non-ascii quotes, is
> not a word any more.

Ah OK, got it.  Yeah, Scintilla doesn't treat those quotes as "blank
chars", so the quoted word doesn't fit "...consists of sequences of
non-blank characters separated by blanks".
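
To make that concrete, here is a rough Python sketch of the quoted
definition.  It is only a model, not Scintilla's code: the names
DEFAULT_BLANKS, is_whole_word and find_word are made up for this example,
and the contents of the blank set are an assumption.

```python
# Hypothetical model of "non-blank characters separated by blanks".
# Not Scintilla's implementation; the blank set below is an assumption.
DEFAULT_BLANKS = set(" \t\r\n\"'")


def is_whole_word(text, start, end, blanks=DEFAULT_BLANKS):
    """True if text[start:end] is bounded by blanks or by the text edges."""
    before_ok = start == 0 or text[start - 1] in blanks
    after_ok = end == len(text) or text[end] in blanks
    return before_ok and after_ok


def find_word(text, word, blanks=DEFAULT_BLANKS):
    """Return the index of the first whole-word match, or -1 if none."""
    pos = text.find(word)
    while pos != -1:
        if is_whole_word(text, pos, pos + len(word), blanks):
            return pos
        pos = text.find(word, pos + 1)
    return -1
```

With the ASCII-only blank set, find_word("a ‘боза’ b", "боза") finds
nothing, because ‘ and ’ count as ordinary non-blank characters.  Passing
blanks=DEFAULT_BLANKS | {"‘", "’"} makes the same search succeed.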

We could maybe add the non-ASCII quotes to the default blank chars
(assuming non-ASCII chars work there) -- though there may be way too many
such "blank chars" to cover.

Maybe Scintilla could use a more complex algorithm to find the start/end
of a Unicode word (Pango does it, but it seems to be a hard piece of
code), though it'd only work on Unicode data.  It would also need a
clever way of integrating the wordchars/blankchars, maybe simply
((unicode_is_word() || is_wordchar()) && ! is_blankchars())...
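
A minimal Python sketch of that combined test, just to show the shape of
it.  Everything here is an assumption for illustration: unicode_is_word()
is approximated with general categories from the stdlib unicodedata
module (real Unicode word breaking, as Pango does it, is far more
involved), and make_is_word() is a made-up helper standing in for the
wordchars/blankchars settings.

```python
import unicodedata


def unicode_is_word(ch):
    """Crude stand-in: letters, marks, numbers and connector punctuation."""
    cat = unicodedata.category(ch)
    return cat[0] in ("L", "M", "N") or cat == "Pc"


def make_is_word(wordchars="", blankchars=""):
    """Build a predicate from hypothetical user-configured overrides."""
    def is_word(ch):
        # (unicode_is_word() || is_wordchar()) && !is_blankchars()
        return (unicode_is_word(ch) or ch in wordchars) and ch not in blankchars
    return is_word


# Example: "$" forced to be a word char, "_" forced to be a blank.
is_word = make_is_word(wordchars="$", blankchars="_")
```

Here the blank chars win over everything else, so a user could still
exclude a character that Unicode considers part of a word, while the
wordchars can only add characters.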

>>> Well, since using "Find previous/next selection" on a non-selected
>>> "g_new0(gchar" still finds the next/previous g_new0 or gchar [...]
>>>
>>> OTOH, finding "regcomp(" in "(regcomp(&" still doesn't succeed.
>>> Guess I can't have it both ways.
>>
>> Hum, sorry? How finding "regcomp(" in a document containing "(regcomp
>> (&" could fail?
> 
> Word-finding "regcomp(" will not find it in "(regcomp(&", despite being
> enclosed by punctuation characters.

Ah, yeah, word search.  Again, it doesn't fit "...consists of a sequence
of non-blank characters separated by blanks" :(


