[Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

Fri May 29 23:45:12 UTC 2015

On 2015-05-29 02:03 PM, Jiří Techet wrote:
> On Fri, May 29, 2015 at 1:25 AM, Matthew Brush <mbrush at codebrainz.ca> wrote:
>
>>
>> Ideally you could improve the underlying implementation of an existing one
>> if your way is better[0] and they perform the same function. It's really
>> confusing for users to figure out what is the "right" plugin when there's
>> too many doing the same thing. The same thing happens with GeanyGDB,
>> Debugger, and Scope right now.
>>
>> That being said, showing occurrences of the word is such a common and
>> fairly useful feature for an IDE, I'd personally rather see the 3-4
>> existing plugins obsoleted by a good implementation in core Geany[1].
>>
>> Cheers,
>> Matthew Brush
>>
>>
> +1 on having it directly in Geany.
>
> And IMO, the simplest possible implementation should be used - i.e. using
> just strstr() for finding the names and highlighting the whole editor and
> not just the visible part and redoing this when scrolling.
>
> KMP is quite an overkill in this case - it would be useful only if
>
> 1. The text to locate would be long (which isn't the case because
> function/variable names are quite short)
> 2. The searched text would contain many prefixes from the text to locate
> (again not the case - variables/functions can have common prefix but
> typically there will be at most one per line and not like every second
> character). Most of the time strstr() will find different characters at the
> first position and advance to the next character.
>
> If you consider what we are doing when the document changes - i.e. parsing
> the document twice, once by scintilla lexer, once by ctags parser - and
> this happens on the main thread and nobody notices it, then the search part
> in the highlighting will be almost for free.
>

I might try and improve this feature this weekend if I get some time.

One thing I think would make it much better would be if it was based on 
semantics (ex. the variable "i" would only be highlighted in the current 
loop, since that's where it is scoped), but I think that's not really 
possible at present.

I was thinking something like this for implementation:

- Have a preference to enable the feature (since it would now be 
automatic). Have the preference turned off by default. Put the 
preference in "Preferences->Editor->Display" as a checkbox called 
"Highlight current word" or similar. Or would it be better under 
"Preferences->Editor->Features"?

- Use a different indicator number than the current "Mark All" feature, 
so it won't clash with the one used for the Search dialog and can have 
different styling.

- Remove the "Mark All" keybinding. Also make these new indicator types 
not cleared by the "Document->Remove Markers" menu item.

- Upon Scintilla notification of position changed, check if there's a 
current word at the cursor.

- If not, clear the indicators used for this feature (but not the "Mark 
All" ones activated from the Search Dialog).

- If there is a current word but it's the same as last time, do nothing.

- If there is a current word and it's different from the last one, clear 
the indicators and then have Scintilla close its gap buffer by getting 
the character pointer to it.

- Use strstr() starting at the character pointer to the buffer start, 
comparing the bytes with the bytes of the current word, working its way 
through the whole document.

- For each non-NULL return of strstr(), set an indicator at position 
`foundPtr - docStartPtr` with the indicator length as the number of 
bytes in the current word, and then use strstr() again starting at 
`foundPtr + currentWordLength`, repeat to end.

Does that sound fairly reasonable?

The only thing I'm not 100% sure about is handling of multi-byte 
characters in the UTF-8 of the wordchars or the buffer, but it seems 
like it should "just work" since it's just comparing raw bytes, and at 
worst, setting a indicator position in Scintilla that is between two 
bytes of the same multi-byte char (but not moving the caret there, so no 
weird editing bugs).

Cheers,
Matthew Brush