[Geany-devel] Use of Scintilla word boundaries for word searches

Lex Trotman elextr at xxxxx
Fri Aug 19 23:33:29 UTC 2011

On 20 August 2011 03:40, Dimitar Zhekov <dimitar.zhekov at gmail.com> wrote:
> On Fri, 19 Aug 2011 18:10:42 +0200
> Colomban Wendling <lists.ban at herbesfolles.org> wrote:
>> Hi,
>> I'm trying to address bug 3386129 [1], and I'd like comments & reviews
>> about my fix, because the whole thing doesn't look obvious at all...
>> We already have 2 ways of determining what a "word" is: a manual one
>> using GEANY_WORDCHARS or a caller-given list of wordchars, and one that
>> uses Scintilla's word boundaries.
> 3? Shouldn't we have symbolchars for the current programming language
> ([A-Za-z_] if unknown), and wordchars that match the current
> locale? They don't have much in common.

By wordchars we mean symbolchars; this confusion has existed since the
beginnings of C at least, and we ain't gonna change it now. :-)

Locale/human language word boundaries are not as simple as sets of
characters, so let's not go there; we would need something like ICU to
do that.

>> The former seems to make more sense when the caller code knows the kind
>> of characters it wants (e.g. tags lookups), but the latter is better
>> when getting the word to search for.

Shouldn't the tags be using the same definition of word chars as
Scintilla's highlighting?  I don't trust "knowing" stuff in two
places, they will never match :-) I understand that it might be a bit
of work to hack tagmanager into line though.
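
To make the "manual" approach concrete, here is a minimal sketch of
picking the word around a position from a caller-given list of
wordchars. The names EXAMPLE_WORDCHARS, is_wordchar() and word_at_pos()
are made up for illustration and are not Geany's actual code; the
character set is an assumption too.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical default set, for illustration only. */
#define EXAMPLE_WORDCHARS \
    "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Non-zero if c is one of the characters listed in wordchars. */
static int is_wordchar(char c, const char *wordchars)
{
    return c != '\0' && strchr(wordchars, c) != NULL;
}

/* Copy the word surrounding byte offset pos of line into out
 * (at most out_size - 1 bytes plus the terminator).
 * Returns the number of bytes copied; 0 if pos is not on a word. */
static size_t word_at_pos(const char *line, size_t pos,
                          const char *wordchars,
                          char *out, size_t out_size)
{
    size_t len = strlen(line);
    size_t start, end, n;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    if (pos > len)
        pos = len;

    /* Walk left while the previous character is a word character. */
    start = pos;
    while (start > 0 && is_wordchar(line[start - 1], wordchars))
        start--;

    /* Walk right while the current character is a word character. */
    end = pos;
    while (end < len && is_wordchar(line[end], wordchars))
        end++;

    n = end - start;
    if (n >= out_size)
        n = out_size - 1;
    memcpy(out, line + start, n);
    out[n] = '\0';
    return n;
}
```

Calling this on the same line and position with two different
wordchars lists (say, with and without '_') gives two different
"words", which is the mismatch I mean by knowing stuff in two places.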

> There is always a SCI_SETWORDCHARS... Hmmm, we even use it to set the
> sci wordchars to the filetype wordchars if we don't know the exact
> lexer or something? Well, I guess it's really non-trivial.

We should always be setting Scintilla's wordchars from the filetype
file, although IIUC a few lexers think they know better and ignore

>> So in the attached patch, I added an alternative way to get the
>> current word (that uses the same algorithm as the word selection) and
>> tries to use it whenever the word was fetched for a search.
> Makes sense to me. Though I'm not sure about that SCI_SETWORDCHARS we
> use in highlighting:styleset_common().

Required, to make highlighting match word definitions (assuming lexer

> [...] Nothing else suspicious, at least
> from a first sight.


Maybe everything should use the filetype wordchars definition, with
GEANY_WORDCHARS moved to filetypes.common as the default.
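
Roughly what I have in mind, as a sketch only: every caller asks one
function for the wordchars, which returns the filetype's own setting
if it has one and the common default otherwise. The struct, the
function name and the default value below are hypothetical, not
existing Geany API.

```c
#include <stddef.h>

/* Stand-in for the default that would live in filetypes.common
 * (hypothetical value, for illustration only). */
#define EXAMPLE_DEFAULT_WORDCHARS \
    "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Hypothetical, simplified filetype record. */
struct example_filetype {
    const char *name;
    const char *wordchars;  /* NULL or "" when the filetype sets none */
};

/* Single place that decides which wordchars apply. */
static const char *effective_wordchars(const struct example_filetype *ft)
{
    if (ft != NULL && ft->wordchars != NULL && ft->wordchars[0] != '\0')
        return ft->wordchars;
    return EXAMPLE_DEFAULT_WORDCHARS;
}
```

Search, tag lookup and the value handed to Scintilla would then all go
through the same lookup, so there is only one definition to keep right.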

