pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins


marius buzea

27 May 2015

2:25 p.m.

Hello,

I would like to add GeanyHighlightSelectedWords, to Geany Plugins. Would it be okay that I do a git pull-request for doing this? I am the mgnt user on sourceforge, and last week there was a ticket added to GeanyHighlightSelectedWords, ticket #2, and in this ticket was a question: why not make a pull-request to Geany Plugins.

I know there would be some things I would need to do before, like replace the Makefile with Makefile.am, the autotools, automake way, and write a README file using restructured text content so that that README file can be converted to html. Maybe there is more that should be done.

I hope my question is okay.

Have a great day, Marius Ioan Buzea


Thomas Martitz

27 May

3:10 p.m.

On 27.05.2015 at 14:25, marius buzea wrote:

[…]

What does this plugin do exactly? Addons can already mark all instances of the selected word (same as ctrl+shift+m if a word is already selected).

marius buzea

5:21 p.m.

Hello,

It highlights all occurrences of the selected word in the visible text. Processing complexity is O(n+m) time and space (the KMP algorithm is used to achieve this). I estimated it should introduce little latency when processing is triggered (< 200 milliseconds). It works very well (I use it daily, and have seen no slowdown).

I think that it differs from mark all instances in the following way. I guess that mark all instances searches the whole file, which may be huge. The plugin only searches the text that you see in the Scintilla widget. When you scroll, for example, the plugin is triggered and again searches the visible text for all occurrences of the selected word, if any. So you don't have to wait when you edit or read text with Geany.

You may use the plugin when you read some code: click on some variable, and then you see all places where it is used, or click on return and you see all places where the return statement is used in some function.

You may also use the plugin to select any kind of text from any huge text file. For example, you may be reading this file http://www.gutenberg.org/cache/epub/8117/pg8117.txt (well, a local copy of it) and then you may see all visible occurrences of some word.

[KMP] http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
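For readers unfamiliar with KMP, the following is a minimal sketch of a Knuth-Morris-Pratt search over a byte buffer, such as the visible text described above. It is not the plugin's actual code; the names `kmp_build_table` and `kmp_find_all` and the fixed-size `hits` array are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build the KMP failure table: fail[i] is the length of the longest proper
 * prefix of pat[0..i] that is also a suffix of pat[0..i]. */
static void kmp_build_table(const char *pat, size_t m, size_t *fail)
{
    size_t k = 0;
    fail[0] = 0;
    for (size_t i = 1; i < m; i++) {
        while (k > 0 && pat[i] != pat[k])
            k = fail[k - 1];
        if (pat[i] == pat[k])
            k++;
        fail[i] = k;
    }
}

/* Hypothetical helper: scan text[0..n) for pat, store up to max_hits byte
 * offsets in hits, and return the total number of occurrences (overlapping
 * ones included). Returns 0 for an empty pattern or on allocation failure. */
size_t kmp_find_all(const char *text, size_t n, const char *pat,
                    size_t *hits, size_t max_hits)
{
    assert(text != NULL && pat != NULL);
    size_t m = strlen(pat);
    if (m == 0 || m > n)
        return 0;

    size_t *fail = malloc(m * sizeof *fail);
    if (fail == NULL)
        return 0;
    kmp_build_table(pat, m, fail);

    size_t count = 0, k = 0;
    for (size_t i = 0; i < n; i++) {
        /* On a mismatch, fall back in the pattern; text index i never moves back. */
        while (k > 0 && text[i] != pat[k])
            k = fail[k - 1];
        if (text[i] == pat[k])
            k++;
        if (k == m) {
            if (count < max_hits)
                hits[count] = i + 1 - m;
            count++;
            k = fail[k - 1];
        }
    }
    free(fail);
    return count;
}
```

The table costs O(m) time and space and the scan O(n) time, which is where the O(n+m) figure comes from. A caller would pass the start and length of the visible range instead of the whole document.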

Have a great day, Marius Ioan Buzea

_______________________________________________
Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel

Pavel Roschin

5:38 p.m.

Please check:

https://github.com/geany/geany-plugins/tree/master/automark

Feel free to modify this plugin.

I also cannot find your PR on GitHub.


-- Best regards, Pavel Roschin aka RPG

-- Best regards, Pavel Roschin +7(985)976-17-51 roschin@scriptumplus.ru www.scriptumplus.ru

Steven Blatnick

5:29 p.m.

Just as a side note, my plugin also does something similar /but in addition to/ the "Addons" plugin: instead of highlighting on a keystroke or double click as Addons does, it uses separate highlighting for whatever string the cursor last selected. It also has a convenient incremental search requiring fewer keystrokes and less UI real estate:

https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic...

Thanks,

Steve


Colomban Wendling

11:21 p.m.

Hi!

On 27/05/2015 14:25, marius buzea wrote:

...

Hello,

I would like to add GeanyHighlightSelectedWords, to Geany Plugins. Would it be okay that I do a git pull-request for doing this? […]

Sure. I see some other people suggested already included plugins might achieve something similar, so I'll let you check whether you can combine your efforts or not, but we generally are happy including any plugin :) (well, given it's of decent quality, but we can even help with that if we think something might be problematic)

...

I know there would be some things I would need to do before, like replace the Makefile with Makefile.am, the autotools, automake way, and write a README file using restructured text content so that that README file can be converted to html. Maybe there is more that should be done.

It's pretty much it. Build system integration, and internationalization integration. But we can help with both, you can read HACKING, and internationalization is quite easy.

Regards, Colomban

marius buzea

28 May

11:03 p.m.

New subject: pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

Hello,

I have read the source code of automark [ https://github.com/geany/geany-plugins/blob/master/automark/src/automark.c ]. Functionally, automark and GeanyHighlightSelectedWords are alike, I think, but they differ in how they are implemented.

Automark is concise: it sends SCI_FINDTEXT messages to Scintilla to find occurrences of the selected text. Automark uses a timeout callback for matching text.

GeanyHighlightSelectedWords is more 'low-level': it does not use Scintilla's SCI_FINDTEXT functionality; instead it uses the KMP algorithm to find all occurrences of the selected text in the visible text. GeanyHighlightSelectedWords does not use a timeout callback for matching text.

While the functionality may be the same, there were different decisions in the design of these two plugins. Let's have both automark and GeanyHighlightSelectedWords included in Geany Plugins.

If it is okay, I would continue with build system integration, and internationalization for GeanyHighlightSelectedWords. This would be something I would code and test locally, and when done I would return to ask for support on how to do the git pull request step.

Have a great day, Marius Ioan Buzea


Matthew Brush

29 May

1:25 a.m.

On 2015-05-28 02:03 PM, marius buzea wrote:

[…]

Hi,

Ideally you could improve the underlying implementation of an existing one if your way is better[0] and they perform the same function. It's really confusing for users to figure out what is the "right" plugin when there are too many doing the same thing. The same thing happens with GeanyGDB, Debugger, and Scope right now.

That being said, showing occurrences of the word is such a common and fairly useful feature for an IDE, I'd personally rather see the 3-4 existing plugins obsoleted by a good implementation in core Geany[1].

Cheers, Matthew Brush

[0]: though if it's just for performance, I doubt it will matter at all for any of them except for massive data files or something.

[1]: since it's relatively simple and such a common editor feature that we already do have, it just doesn't work quite like people want.

Lex Trotman

2:38 a.m.

On 29 May 2015 at 09:25, Matthew Brush mbrush@codebrainz.ca wrote:

[…]

That being said, showing occurrences of the word is such a common and fairly useful feature for an IDE, I'd personally rather see the 3-4 existing plugins obsoleted by a good implementation in core Geany[1].

Gotta agree, this is so common (even browsers do it) that it should be an easily found built-in feature, but maybe that just means moving the Menu->Search->more->mark all menu item?

Improving the implementation is also good of course.


Colomban Wendling

30 May

3:09 a.m.

On 29/05/2015 02:38, Lex Trotman wrote:

...

[…]

...
That being said, showing occurrences of the word is such a common and fairly useful feature for an IDE, I'd personally rather see the 3-4 existing plugins obsoleted by a good implementation in core Geany[1].

+1

Gotta agree, this is so common (even browsers do it) that it should be an easily found built-in feature, but maybe that just means moving the Menu->Search->more->mark all menu item?

Apparently it's not the search results highlighting they are talking about, but dynamic highlight of the current word (which to me is the same as search result highlighting searching for the word under cursor, but who knows if there's a subtler thing behind it) -- and I don't know any browser doing that (nor, for that matter, any application).

...

Improving the implementation is also good of course.

It depends on what "improving" means. I have to go with Jiří on that one, this should not be over-engineered unless there is a problem to solve, and the added complexity does solve it. This probably can be something like a 30-60 line feature (in C!), so please don't make it a 2k line one unless it makes me coffee in the morning :]

Cheers, Colomban

Matthew Brush

3:24 a.m.

On 2015-05-29 06:09 PM, Colomban Wendling wrote:

[…]

Apparently it's not the search results highlighting they are talking about, but dynamic highlight of the current word (which to me is the same as search result highlighting searching for the word under cursor, but who knows if there's a subtler thing behind it) -- and I don't know any browser doing that (nor, for that matter, any application).

I think it's confusing because Geany's Search dialog has a "Mark All" button which does a similar but slightly different thing (marking based on search pattern rather than caret position or current selection), which is basically the thing I think Lex is talking about. I haven't investigated, but I would think all these are sharing the same underlying implementation at present, and possibly (but hopefully not) sharing the same code for how they are cleared :)

Cheers, Matthew Brush

Lex Trotman

3:44 a.m.

On 30 May 2015 at 11:24, Matthew Brush mbrush@codebrainz.ca wrote:

[…]

I think it's confusing because Geany's Search dialog has a "Mark All" button which does a similar but slightly different thing (marking based on search pattern rather than caret position or current selection), which is basically the thing I think Lex is talking about. I haven't investigated, but I would think all these are sharing the same underlying implementation at present, and possibly (but hopefully not) sharing the same code for how they are cleared :)

Ok, thanks for the clarification, I'm not so enthused about this idea then. But so long as it has an option to turn it off I don't care :)

Cheers Lex


Frank Lanitz

29 May

8:51 a.m.

On 29.05.2015 at 01:25, Matthew Brush wrote:

[…]

Ideally you could improve the underlying implementation of an existing one if your way is better[0] and they perform the same function. It's really confusing for users to figure out what is the "right" plugin when there's too many doing the same thing. The same thing happens with GeanyGDB, Debugger, and Scope right now.

That being said, showing occurrences of the word is such a common and fairly useful feature for an IDE, I'd personally rather see the 3-4 existing plugins obsoleted by a good implementation in core Geany[1].

--verbose: (It's a general view, not related to this plugin) I wouldn't merge a new plugin like this unless there are really good reasons. Sad for the effort the author put in, but having three plugins doing the very same thing in just a different way, while increasing the effort needed to maintain the whole bundle, just doesn't fit in my eyes. As mentioned by Matthew, we had this with all the gdb plugins as well as some features of the project-extending plugins. Of course some of the plugins are lacking active development, maybe because they are feature complete, maybe because the author doesn't have the time/mood/* anymore. But even in such cases adding a new plugin doesn't guarantee a change, as it might also become unsupported next month. So I prefer to adopt a plugin and improve it in such a case.

And having it inside core seems to be a logical step based upon reasons mentioned. ;)

Cheers, Frank

Jiří Techet

11:03 p.m.

On Fri, May 29, 2015 at 1:25 AM, Matthew Brush mbrush@codebrainz.ca wrote:

...

Ideally you could improve the underlying implementation of an existing one if your way is better[0] and they perform the same function. It's really confusing for users to figure out what is the "right" plugin when there's too many doing the same thing. The same thing happens with GeanyGDB, Debugger, and Scope right now.

That being said, showing occurrences of the word is such a common and fairly useful feature for an IDE, I'd personally rather see the 3-4 existing plugins obsoleted by a good implementation in core Geany[1].

Cheers, Matthew Brush

+1 on having it directly in Geany.

And IMO, the simplest possible implementation should be used - i.e. using just strstr() for finding the names and highlighting the whole editor and not just the visible part and redoing this when scrolling.

KMP is quite an overkill in this case - it would be useful only if

1. The text to locate would be long (which isn't the case because function/variable names are quite short)

2. The searched text would contain many prefixes from the text to locate (again not the case - variables/functions can have common prefix but typically there will be at most one per line and not like every second character). Most of the time strstr() will find different characters at the first position and advance to the next character.

If you consider what we are doing when the document changes - i.e. parsing the document twice, once by scintilla lexer, once by ctags parser - and this happens on the main thread and nobody notices it, then the search part in the highlighting will be almost for free.
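To make the argument about early mismatches concrete, here is a small sketch of a naive scan that counts how many byte comparisons it performs. The function name `naive_scan` is hypothetical, and a real strstr() implementation may be smarter than this loop; the point is only to show that on typical source text most candidate positions are rejected after one comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive substring scan that counts its work (illustrative only).
 * Returns the number of matches of word in text; *comparisons receives the
 * total number of byte-to-byte comparisons performed. */
size_t naive_scan(const char *text, const char *word, size_t *comparisons)
{
    assert(text != NULL && word != NULL && comparisons != NULL);
    size_t n = strlen(text), m = strlen(word);
    size_t matches = 0;

    *comparisons = 0;
    if (m == 0 || m > n)
        return 0;

    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        while (j < m) {
            (*comparisons)++;
            if (text[i + j] != word[j])
                break;          /* usually happens at j == 0 */
            j++;
        }
        if (j == m)
            matches++;
    }
    return matches;
}
```

Searching `"int count = cost + count;"` for `"count"` finds two matches, and only the three positions that start with `c` need more than a single comparison, so the total stays close to one comparison per byte of text.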

Cheers,

Jiri

Steven Blatnick

30 May

12:06 a.m.

+1 on having it directly in Geany from me as well. My quick-search plugin highlights all instances the moment your cursor selects any text (on any change to the selection). That may be a potential way to have it work.

As another enhancement that could/should be in core geany, I suggest having the find-in-files via grep dialog allow you to navigate to the matches without having to double click (just simple selection) and keeping the focus on the search results. Support for ack-grep and/or silversearcher-ag and/or moving to the side panel would be nice too.

My quick-find plugin already behaves this way, but having this functionality in core may be preferred since searching is already integrated in geany. (quick-find also keys off of the current path in treebrowser as a directory base for the search, or uses the project if it can't find treebrowser. I used similar functionality in a python plugin for gedit2, so that's why I created quick-find.)

While nobody else seems to be using my plugins at present, I've been using them full time for work, and they are pretty stable. I hope nobody is annoyed that I keep mentioning them. Any advice on how I might get them ready for geany-plugins is appreciated. I just haven't gotten around to it.

plugins: https://github.com/sblatnick/geany-plugins

Thanks,

Steve


Matthew Brush

1:45 a.m.

On 2015-05-29 02:03 PM, Jiří Techet wrote:

[…]

I might try and improve this feature this weekend if I get some time.

One thing I think would make it much better would be if it was based on semantics (ex. the variable "i" would only be highlighted in the current loop, since that's where it is scoped), but I think that's not really possible at present.

I was thinking something like this for implementation:

- Have a preference to enable the feature (since it would now be automatic). Have the preference turned off by default. Put the preference in "Preferences->Editor->Display" as a checkbox called "Highlight current word" or similar. Or would it be better under "Preferences->Editor->Features"?

- Use a different indicator number than the current "Mark All" feature, so it won't clash with the one used for the Search dialog and can have different styling.

- Remove the "Mark All" keybinding. Also make these new indicator types not cleared by the "Document->Remove Markers" menu item.

- Upon Scintilla notification of position changed, check if there's a current word at the cursor.

- If not, clear the indicators used for this feature (but not the "Mark All" ones activated from the Search Dialog).

- If there is a current word but it's the same as last time, do nothing.

- If there is a current word and it's different from the last one, clear the indicators and then have Scintilla close its gap buffer by getting the character pointer to it.

- Use strstr() starting at the character pointer to the buffer start, comparing the bytes with the bytes of the current word, working its way through the whole document.

- For each non-NULL return of strstr(), set an indicator at position `foundPtr - docStartPtr` with the indicator length as the number of bytes in the current word, and then use strstr() again starting at `foundPtr + currentWordLength`, repeat to end.
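The last two steps can be sketched as follows. This is only an illustration under stated assumptions: the document is treated as a NUL-terminated buffer, and the `mark_fn` callback is a hypothetical stand-in for setting a Scintilla indicator at an offset and length; `mark_all_occurrences`, `struct offsets` and `record_offset` are likewise made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback standing in for "set an indicator at this range". */
typedef void (*mark_fn)(size_t start, size_t length, void *user_data);

/* Walk the whole buffer with strstr(), reporting each hit as an offset
 * (found - doc_start) plus the word length in bytes, then resume the search
 * right after the hit. Returns the number of hits. */
size_t mark_all_occurrences(const char *doc_start, const char *word,
                            mark_fn mark, void *user_data)
{
    assert(doc_start != NULL && word != NULL);
    size_t word_len = strlen(word);
    size_t count = 0;

    if (word_len == 0)
        return 0;

    const char *found = strstr(doc_start, word);
    while (found != NULL) {
        if (mark != NULL)
            mark((size_t)(found - doc_start), word_len, user_data);
        count++;
        found = strstr(found + word_len, word);
    }
    return count;
}

/* Example callback: record start offsets in a small fixed array. */
struct offsets { size_t pos[16]; size_t n; };

static void record_offset(size_t start, size_t length, void *user_data)
{
    struct offsets *o = user_data;
    (void)length;
    if (o->n < 16)
        o->pos[o->n++] = start;
}
```

Note that a plain strstr() loop matches substrings, so searching for `i` also hits the `i` in `idx`; a whole-word check on the bytes around each hit would be an extra step that the list above does not cover.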

Does that sound fairly reasonable?

The only thing I'm not 100% sure about is handling of multi-byte characters in the UTF-8 of the wordchars or the buffer, but it seems like it should "just work" since it's just comparing raw bytes, and at worst, setting an indicator position in Scintilla that is between two bytes of the same multi-byte char (but not moving the caret there, so no weird editing bugs).
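A quick way to check the multi-byte concern is sketched below, assuming both the buffer and the word are valid UTF-8. In UTF-8, lead bytes and continuation bytes (10xxxxxx) use disjoint ranges, so a complete word should only ever match starting and ending on character boundaries. The helper names `utf8_is_boundary` and `count_aligned_matches` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if offset off in s is not a UTF-8 continuation byte (10xxxxxx),
 * i.e. it is the start of a character or the terminating NUL. */
bool utf8_is_boundary(const char *s, size_t off)
{
    return (((unsigned char)s[off]) & 0xC0) != 0x80;
}

/* Count raw byte matches of word in doc (stored in *total) and return how
 * many of them start and end on character boundaries. */
size_t count_aligned_matches(const char *doc, const char *word, size_t *total)
{
    assert(doc != NULL && word != NULL && total != NULL);
    size_t len = strlen(word);
    size_t aligned = 0;

    *total = 0;
    if (len == 0)
        return 0;

    for (const char *p = strstr(doc, word); p != NULL;
         p = strstr(p + len, word)) {
        size_t start = (size_t)(p - doc);
        (*total)++;
        if (utf8_is_boundary(doc, start) && utf8_is_boundary(doc, start + len))
            aligned++;
    }
    return aligned;
}
```

If the two counts ever differ, either the buffer or the word was not valid UTF-8, or the word was cut in the middle of a character before the search.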

Cheers, Matthew Brush

Matthew Brush

2:23 a.m.

On 2015-05-29 04:45 PM, Matthew Brush wrote:

...

[snip]

I was thinking something like this for implementation:

[snip]

- Upon Scintilla notification of position changed, check if there's a current word at the cursor.

Question: Is it important to make it mark the selection if there is one, otherwise the current word? I've never seen or used that in other editors where I've seen this feature (though I don't doubt it exists) but I think it was mentioned in another message in this thread (and is how the current manual Mark All feature works, apparently).

Cheers, Matthew Brush

Colomban Wendling

3:09 a.m.

Hey,

On 30/05/2015 01:45, Matthew Brush wrote:

...

[…]

I was thinking something like this for implementation:

- Have a preference to enable the feature (since it would now be automatic). Have the preference turned off by default. Put the preference in "Preferences->Editor->Display" as a checkbox called "Highlight current word" or similar. Or would it be better under "Preferences->Editor->Features"?

- Use a different indicator number than the current "Mark All" feature, so it won't clash with the one used for the Search dialog and can have different styling.

- Remove the "Mark All" keybinding. Also make these new indicator types not cleared by the "Document->Remove Markers" menu item.

As said on IRC, I probably would rather combine the two features (current "mark all" [shift-ctrl+m] and this dynamic version of it).

E.g, have a setting in the preferences "Dynamically mark the current word" that decides whether mark all is dynamic or not, and have shift+ctrl+m toggle the marking, whether it's dynamic or not.

  on_toggle_mark()
  {
    /* clear if dynamic marking is on, or if the static marks are showing */
    if ((dynamic && active) || (!dynamic && current_char_is_marked()))
      clear_markers();
    else
      mark_all();
    active = dynamic && !active;
  }

  on_caret_moved()
  {
    /* if enabled and current word is not already marked */
    if (dynamic && active && !current_char_is_marked())
    {
      clear_markers();
      mark_all();
    }
  }

...

[…]

If there is a current word and it's different from the last one, clear the indicators and then have Scintilla close its gap buffer by getting the character pointer to it.

Use strstr() starting at the character pointer to the buffer start, comparing the bytes with the bytes of the current word, working its way through the whole document.

For each non-NULL return of strstr(), set an indicator at position `foundPtr - docStartPtr` with the indicator length as the number of bytes in the current word, and then use strstr() again starting at `foundPtr + currentWordLength`, repeat to end.
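The loop described above can be sketched in plain C as follows. This is only an illustration under assumptions: `doc` stands for the contiguous, NUL-terminated buffer Scintilla would hand back once its gap is closed, and `mark_all_strstr()` and the `match_fn` callback are hypothetical names, with the callback standing in for "set an indicator at this position". Note that strstr() stops at the first NUL byte, so a document containing embedded NULs would not be scanned completely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: stands in for "set an indicator at pos, len". */
typedef void (*match_fn)(size_t pos, size_t len, void *user_data);

/* Scan a contiguous, NUL-terminated buffer for every non-overlapping
 * occurrence of word and report each byte offset (foundPtr - docStartPtr).
 * Returns the number of matches. */
size_t mark_all_strstr(const char *doc, const char *word,
                       match_fn on_match, void *user_data)
{
    size_t word_len, count = 0;
    const char *found;

    assert(doc != NULL && word != NULL);
    word_len = strlen(word);
    if (word_len == 0)
        return 0; /* strstr() matches "" everywhere; nothing to mark */

    for (found = strstr(doc, word); found != NULL;
         found = strstr(found + word_len, word)) {
        if (on_match != NULL)
            on_match((size_t)(found - doc), word_len, user_data);
        count++;
    }
    return count;
}
```

Restarting at `found + word_len` is what makes the matches non-overlapping, which is what you want for indicators that should not be stacked on top of each other.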

Why not use the basic Scintilla search features? It should be fast and do just what you want just as easily -- and it looks like the expected way to do it, which may not even need closing the gap or something.

...

Does that sound fairly reasonable?

The only thing I'm not 100% sure about is handling of multi-byte characters in the UTF-8 of the wordchars or the buffer, but it seems like it should "just work" since it's just comparing raw bytes, and at worst, setting an indicator position in Scintilla that is between two bytes of the same multi-byte char (but not moving the caret there, so no weird editing bugs).

You shouldn't have to worry about that. We already have a way to get the word under the cursor, so just use that and don't worry about how it's done (we can always fix it if it doesn't get it right, but it seems to be good enough as nobody complained).

Cheers, Colomban

Matthew Brush

3:19 a.m.

On 2015-05-29 06:09 PM, Colomban Wendling wrote:

...

Hey,

Le 30/05/2015 01:45, Matthew Brush a écrit :

...
[…]

[snip]

Remove the "Mark All" keybinding. Also make these new indicator types not cleared by the "Document->Remove Markers" menu item.

As said on IRC, I probably would rather combine the two features (current "mark all" [shift+ctrl+m] and this dynamic version of it).

E.g, have a setting in the preferences "Dynamically mark the current word" that decides whether mark all is dynamic or not, and have shift+ctrl+m toggle the marking, whether it's dynamic or not.

[snip]

...

...
[…]

If there is a current word and it's different from the last one, clear the indicators and then have Scintilla close its gap buffer by getting the character pointer to it.

Use strstr() starting at the character pointer to the buffer start, comparing the bytes with the bytes of the current word, working its way through the whole document.

For each non-NULL return of strstr(), set an indicator at position `foundPtr - docStartPtr` with the indicator length as the number of bytes in the current word, and then use strstr() again starting at `foundPtr + currentWordLength`, repeat to end.

Why not use the basic Scintilla search features? It should be fast and do just what you want just as easily -- and it looks like the expected way to do it, which may not even need closing the gap or something.

Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, but if both manual and automatic modes are supported, it would make sense to share the existing code, and that beats the advantage of having a redundant (yet simpler/faster) routine to do the same, IMO. +1 (if it's not too much hassle to refactor "Mark All").

...

...
Does that sound fairly reasonable?

The only thing I'm not 100% sure about is handling of multi-byte characters in the UTF-8 of the wordchars or the buffer, but it seems like it should "just work" since it's just comparing raw bytes, and at worst, setting an indicator position in Scintilla that is between two bytes of the same multi-byte char (but not moving the caret there, so no weird editing bugs).

You shouldn't have to worry about that. We already have a way to get the word under the cursor, so just use that and don't worry about how it's done (we can always fix it if it doesn't get it right, but it seems to be good enough as nobody complained).

Ok, good.

Cheers, Matthew Brush

Thomas Martitz

31 May

12:05 a.m.

Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...

Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, but if both manual and automatic modes are supported, it would make sense to share the existing code, and that beats the advantage of having a redundant (yet simpler/faster) routine to do the same, IMO. +1 (if it's not too much hassle to refactor "Mark All").

What makes you think naive strstr() based search is faster/more efficient than whatever Scintilla does?

I haven't looked actually, but I'd think it does the same (probably using C++ strings), or something smarter like KMP mentioned in this thread or whatever the C++ template library provides. But I don't think it does anything slower than the most trivial strstr() method.

Best regards.

Lex Trotman

3:46 a.m.

On 31 May 2015 at 08:05, Thomas Martitz kugel@rockbox.org wrote:

...

Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...
Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, but if both manual and automatic modes are supported, it would make sense to share the existing code, and that beats the advantage of having a redundant (yet simpler/faster) routine to do the same, IMO. +1 (if it's not too much hassle to refactor "Mark All").

What makes you think naive strstr() based search is faster/more efficient than whatever Scintilla does?

I haven't looked actually, but I'd think it does the same (probably using C++ strings), or something smarter like KMP mentioned in this thread or whatever the C++ template library provides. But I don't think it does anything slower than the most trivial strstr() method.

Scintilla uses its own home-rolled gap buffer, not C++ strings (for the text and styles). For plain case-sensitive "find text" it uses the naive algorithm, but each char from the buffer is accessed by Document::charat(), which will probably inline as it's in the header and it's only a call to CellBuffer::charat(), which probably won't inline since it's in the cxx file, not the header, and which calls SplitVector::charat(), which is in the header and may or may not inline due to its size (15 lines).

So it's almost certainly slower than strstr().

But to be able to use strstr() (or any other algorithm that needs contiguous text) the gap needs to be closed, so a large block of text and styles needs copying, which is done by memmove(), which I think uses intrinsics if available, so it probably runs at memory speed, so it's fast. But it's something else that complicates comparing speeds :)
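To make the trade-off concrete, here is a toy gap buffer in C. It is not Scintilla's code: `GapBuf`, `gap_char_at()` and `gap_close()` are hypothetical names, and the layout (text before the gap, then the gap, then the rest of the text) is just the textbook one. It shows the two access styles being compared: per-character lookup that has to step around the gap, versus one memmove() that closes the gap so contiguous-text routines like strstr() can run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal gap buffer: the text is data[0, gap_start)
 * followed by data[gap_start + gap_len, capacity). */
typedef struct {
    char  *data;
    size_t capacity;
    size_t gap_start;
    size_t gap_len;
} GapBuf;

/* Number of text bytes stored (capacity minus the gap). */
size_t gap_length(const GapBuf *b)
{
    return b->capacity - b->gap_len;
}

/* Per-character access: every lookup must decide which side of the gap
 * the position falls on. */
char gap_char_at(const GapBuf *b, size_t pos)
{
    assert(pos < gap_length(b));
    return pos < b->gap_start ? b->data[pos] : b->data[pos + b->gap_len];
}

/* Close the gap: slide the tail down with one memmove() so the text is
 * contiguous, leave the gap at the end, and NUL-terminate inside it. */
const char *gap_close(GapBuf *b)
{
    size_t tail_start = b->gap_start + b->gap_len;
    size_t tail_len = b->capacity - tail_start;

    assert(b->gap_len >= 1); /* need one spare byte for the terminator */
    memmove(b->data + b->gap_start, b->data + tail_start, tail_len);
    b->gap_start += tail_len; /* the gap now sits after all the text */
    b->data[b->gap_start] = '\0';
    return b->data;
}
```

The cost of `gap_close()` is proportional to the amount of text after the gap, so it is cheapest when the last edit was near the end of the document and most expensive when it was near the start.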

Essentially the only way to tell is to try benchmarks for a (hopefully) representative set of cases.

Cheers Lex

...

Best regards.

Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel

Lex Trotman

7:41 a.m.

On 31 May 2015 at 11:46, Lex Trotman elextr@gmail.com wrote:

...

On 31 May 2015 at 08:05, Thomas Martitz kugel@rockbox.org wrote:

...
Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...
Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, but if both manual and automatic modes are supported, it would make sense to share the existing code, and that beats the advantage of having a redundant (yet simpler/faster) routine to do the same, IMO. +1 (if it's not too much hassle to refactor "Mark All").

What makes you think naive strstr() based search is faster/more efficient than whatever Scintilla does?

I haven't looked actually, but I'd think it does the same (probably using C++ strings), or something smarter like KMP mentioned in this thread or whatever the C++ template library provides. But I don't think it does anything slower than the most trivial strstr() method.

Scintilla uses its own home-rolled gap buffer, not C++ strings (for the text and styles). For plain case-sensitive "find text" it uses the naive algorithm, but each char from the buffer is accessed by Document::charat(), which will probably inline as it's in the header and it's only a call to CellBuffer::charat(), which probably won't inline since it's in the cxx file, not the header, and which calls SplitVector::charat(), which is in the header and may or may not inline due to its size (15 lines).

So it's almost certainly slower than strstr().

And on my system strstr() is a builtin that can use any hardware support available.

...

But to be able to use strstr() (or any other algorithm that needs contiguous text) the gap needs to be closed, so a large block of text and styles needs copying, which is done by memmove(), which I think uses intrinsics if available, so it probably runs at memory speed, so it's fast. But it's something else that complicates comparing speeds :)

Essentially the only way to tell is to try benchmarks for a (hopefully) representative set of cases.

Cheers Lex

...
Best regards.


Colomban Wendling

10:57 a.m.

Le 31/05/2015 07:41, Lex Trotman a écrit :

...

On 31 May 2015 at 11:46, Lex Trotman elextr@gmail.com wrote:

...
On 31 May 2015 at 08:05, Thomas Martitz kugel@rockbox.org wrote:

...
Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...
Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, […]

So it's almost certainly slower than strstr().

And on my system strstr() is a builtin that can use any hardware support available.

One thing that will make strstr() sound a lot less sexy is that you probably actually want to find *words* rather than substrings. Meaning that if the word under the cursor is "i", you probably don't want to highlight all "i"s in e.g. an identifier "highlighting", but only whole words. And while Scintilla search has the logic for this (SCFIND_WHOLEWORD), it'd probably be annoying/redundant to re-do with the same logic.

Apart from that, yes, strstr() from an optimized libc like glibc will be hard to beat without also using very smart optimization combined with use of specialized CPU instruction sets.
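For illustration, this is roughly what re-doing the whole-word check on top of strstr() would involve. It is a sketch under assumptions, not what SCFIND_WHOLEWORD does internally: `WORDCHARS` is a made-up ASCII-only set (Geany's word characters are configurable), `count_whole_words()` is a hypothetical helper, and every non-ASCII byte is treated as a non-word byte, which would be wrong for identifiers containing multi-byte UTF-8 characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed word characters (ASCII only); the real set is configurable. */
static const char WORDCHARS[] =
    "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

bool is_word_byte(char c)
{
    /* strchr() would also "find" the terminating NUL, so exclude it. */
    return c != '\0' && strchr(WORDCHARS, c) != NULL;
}

/* Count occurrences of word in doc that are not glued to a word
 * character on either side. */
size_t count_whole_words(const char *doc, const char *word)
{
    size_t word_len, count = 0;
    const char *p;

    assert(doc != NULL && word != NULL);
    word_len = strlen(word);
    if (word_len == 0)
        return 0;

    for (p = strstr(doc, word); p != NULL; p = strstr(p + 1, word)) {
        bool left_ok = (p == doc) || !is_word_byte(p[-1]);
        bool right_ok = !is_word_byte(p[word_len]); /* may be the NUL */
        if (left_ok && right_ok)
            count++;
    }
    return count;
}
```

With the word "i" and the text `i = highlighting + i;` this counts only the two standalone occurrences and skips the three inside "highlighting".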

Cheers, Colomban

Lex Trotman

11:29 a.m.

On 31 May 2015 at 18:57, Colomban Wendling lists.ban@herbesfolles.org wrote:

...

Le 31/05/2015 07:41, Lex Trotman a écrit :

...
On 31 May 2015 at 11:46, Lex Trotman elextr@gmail.com wrote:

...
On 31 May 2015 at 08:05, Thomas Martitz kugel@rockbox.org wrote:

...
Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...
Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, […]

So it's almost certainly slower than strstr().

And on my system strstr() is a builtin that can use any hardware support available.

One thing that will make strstr() sound a lot less sexy is that you probably actually want to find *words* rather than substrings.

Sure you can, just do the strstr() thing then check those for wordiness :)

Meaning that if the word under the cursor is "i", you probably don't want to highlight all "i"s in e.g. an identifier "highlighting", but only whole words. And while Scintilla search has the logic for this (SCFIND_WHOLEWORD), it'd probably be annoying/redundant to re-do with the same logic.

But yes, as much fun as imagining various premature optimising is, using the existing code first to get it working then optimising *if needed* is the real way to go.

Cheers Lex

...

Apart from that, yes, strstr() from an optimized libc like glibc will be hard to beat without also using very smart optimization combined with use of specialized CPU instruction sets.

Cheers, Colomban

Jiří Techet

1 Jun

6:23 p.m.

On Sun, May 31, 2015 at 10:57 AM, Colomban Wendling < lists.ban@herbesfolles.org> wrote:

...

Le 31/05/2015 07:41, Lex Trotman a écrit :

...
On 31 May 2015 at 11:46, Lex Trotman elextr@gmail.com wrote:

...
On 31 May 2015 at 08:05, Thomas Martitz kugel@rockbox.org wrote:

...
Am 30.05.2015 um 03:19 schrieb Matthew Brush:

...
Just because it's such a trivial search algorithm, using strstr() is much simpler and probably more efficient than using Scintilla's API to find text, […]

So it's almost certainly slower than strstr().

And on my system strstr() is a builtin that can use any hardware support available.

One thing that will make strstr() sound a lot less sexy is that you probably actually want to find *words* rather than substrings. Meaning that if the word under the cursor is "i", you probably don't want to highlight all "i"s in e.g. an identifier "highlighting", but only whole words. And while Scintilla search has the logic for this (SCFIND_WHOLEWORD), it'd probably be annoying/redundant to re-do with the same logic.

Apart from that, yes, strstr() from an optimized libc like glibc will be hard to beat without also using very smart optimization combined with use of specialized CPU instruction sets.

Cheers, Colomban

Just to clarify, when I mentioned strstr(), I meant it as an example of using some existing implementation (instead of creating something new) rather than suggesting strstr() is the "best" one. If there's something in Scintilla which would make it easier to implement this feature, just go for it. If there's some performance problem, it can always be improved afterwards (but I don't think there will be any).

Regards,

Jiri

Steven Blatnick

3:41 p.m.

I kind of like the idea of selection highlighting being separate from the search highlighting. That allows you to have multiple groups highlighted differently, which has come in handy when using plugin versions of these features. Alternatively, perhaps we could add multiple search groups, but that may be more complicated or less intuitive.

On 05/29/2015 07:09 PM, Colomban Wendling wrote:

...

E.g, have a setting in the preferences "Dynamically mark the current word" that decides whether mark all is dynamic or not, and have shift+ctrl+m toggle the marking, whether it's dynamic or not.

Matthew Brush

30 May

3:11 a.m.

On 2015-05-29 04:45 PM, Matthew Brush wrote:

...

[snip]

I was thinking something like this for implementation:

[snip]

Remove the "Mark All" keybinding. Also make these new indicator types not cleared by the "Document->Remove Markers" menu item.

Actually, removing the "Mark All" keybinding probably isn't so good (I thought it would be pointless at first), as the same "mark all" code might be able to be re-factored for both manual and automatic highlighting, and if the user has it set to manual (the default), it will still be a useful feature to trigger by a keybinding.

I will have to see how hard it is to re-factor the existing "Mark All" feature to use a separate indicator number and clearing code.

Cheers, Matthew Brush

marius buzea

29 May

12:10 a.m.

New subject: pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

Hello,

I had a look at

https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

This plugin does something similar to GeanyHighlightSelectedWord.

The quick-search.c calls Geany's search_find_text several times in one processing pass, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

GeanyHighlightSelectedWord implements search using KMP.

Designs differ, but functionality is alike. I would keep both of these plugins, and not try to merge them.

Have a great day, Marius Ioan Buzea

-------------------------------------------- On Thu, 5/28/15, Colomban Wendling lists.ban@herbesfolles.org wrote:

Hi!

Le 27/05/2015 14:25, marius buzea a écrit :

...

Hello,

I would like to add GeanyHighlightSelectedWords, to Geany Plugins. Would it be okay that I do a git pull-request for doing this? […]

(well, given it's of decent quality, but we can even help with that if we think something might be problematic)

...

I know there would be some things I would need to do before, like replace the Makefile with Makefile.am, the autotools, automake way, and write a README file using restructured text content so that that README file can be converted to html. Maybe there is more that should be done.

It's pretty much it. Build system integration, and internationalization integration. But we can help with both, you can read HACKING, and internationalization is quite easy.

Regards, Colomban

Thomas Martitz

9:03 a.m.

Am 29.05.2015 um 00:10 schrieb marius buzea:

...

Hello,

I had a look at

https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

This plugin does something similar to GeanyHighlightSelectedWord.

The quick-search.c calls Geany's search_find_text several times in one processing pass, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

GeanyHighlightSelectedWord implements search using KMP.

Designs differ, but functionality is alike. I would keep both of these plugins, and not try to merge them.

Can you describe this KMP algorithm, and why it should be superior?

Anyway, the existing plugin (automark) should look into adopting it (if it is indeed an improvement) instead of having multiple plugins with the same functionality.

PS: I also agree with providing it by the core.

Best regards

marius buzea

12:44 p.m.

New subject: pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

Hello,

With KMP it is possible to find all occurrences of a string of length m in a string of length n using O(m+n) machine operations. This page describes the algorithm: http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

KMP works well with the UTF-8 encoding of Unicode. One property of UTF-8 is that the encoding of one Unicode symbol is never a substring of the encoding of another symbol. This property makes it possible to take the UTF-8 encoding of the string you wish to search for and find it directly in the UTF-8 encoding of the text. Geany uses Scintilla, and Scintilla uses UTF-8 to encode the document it displays, and Scintilla has a command that gives the raw UTF-8 byte array for a [start, end) range. So KMP gives great speed for searching all occurrences, and may be used with the underlying text representation of Scintilla used by Geany. The UTF-8 encoding of a Unicode string of length n is at most 4n bytes, since each symbol encodes to at most 4 bytes (6 bytes under the original UTF-8 definition).
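A compact C sketch of the search described above follows. It is not the code from GeanyHighlightSelectedWord; `kmp_build_border()` and `kmp_find_all()` are hypothetical names, and the table uses one common formulation with m+1 entries where entry 0 is -1. It works on raw bytes, so it can be fed UTF-8 text and a UTF-8 pattern directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* border[k] = length of the widest proper border of pat[0, k), for
 * k = 1..m; border[0] = -1 is a sentinel. The table has m + 1 entries. */
void kmp_build_border(const char *pat, size_t m, ptrdiff_t *border)
{
    ptrdiff_t i = 0, j = -1;

    border[0] = -1;
    while ((size_t)i < m) {
        while (j >= 0 && pat[i] != pat[j])
            j = border[j];
        i++;
        j++;
        border[i] = j;
    }
}

/* Find all (possibly overlapping) occurrences of pat[0, m) in text[0, n).
 * Stores up to max_out byte offsets in out (out may be NULL) and returns
 * the total number of matches. Returns 0 if the table cannot be
 * allocated, which a real implementation would report separately. */
size_t kmp_find_all(const char *text, size_t n, const char *pat, size_t m,
                    size_t *out, size_t max_out)
{
    ptrdiff_t *border;
    ptrdiff_t j = 0; /* number of pattern bytes currently matched */
    size_t i, count = 0;

    assert(text != NULL && pat != NULL);
    if (m == 0 || m > n)
        return 0;
    border = malloc((m + 1) * sizeof *border);
    if (border == NULL)
        return 0;
    kmp_build_border(pat, m, border);

    for (i = 0; i < n; i++) {
        while (j >= 0 && text[i] != pat[j])
            j = border[j]; /* fall back without re-reading text */
        j++;
        if ((size_t)j == m) {
            if (out != NULL && count < max_out)
                out[count] = i + 1 - m; /* match ends at i */
            count++;
            j = border[j];
        }
    }
    free(border);
    return count;
}
```

Each text byte is read once and `j` only falls back as far as it previously advanced, which is where the O(m+n) bound comes from.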

I also think that including this functionality/feature into Geany core would be a good choice. It would be a small tradeoff between keeping the core small, and adding this new functionality, but this is your choice.

If you wish to extend automark, then this is a good choice too. If you wish, and if it helps, please reuse any part of the implementation provided here: http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea... If needed, I would help.

What should I do next? Should I not do the pull request for GeanyHighlightSelectedWord? It is okay with me. GeanyHighlightSelectedWord would then be still available at sourceforge until Geany provides this functionality from its core, or from automark.

Have a great day, Marius Ioan Buzea

-------------------------------------------- On Fri, 5/29/15, Thomas Martitz kugel@rockbox.org wrote:

Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins To: devel@lists.geany.org Date: Friday, May 29, 2015, 10:03 AM

Am 29.05.2015 um 00:10 schrieb marius buzea:

...

Hello,

I had a look at

https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

...

This plugin does something similar to GeanyHighlightSelectedWord.

The quick-search.c calls Geany's search_find_text several times in one processing pass, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

GeanyHighlightSelectedWord implements search using KMP.

Designs differ, but functionality is alike. I would keep both of these plugins, and not try to merge them.

Can you describe this KMP algorithm, and why it should be superior?

Anyway, the existing plugin (automark) should look into adopting it (if it is indeed an improvement) instead of having multiple plugins with the same functionality.

PS: I also agree with providing it by the core.

Best regards

Thomas Martitz

12:49 p.m.

Am 29.05.2015 um 12:44 schrieb marius buzea:

...

Hello,

With KMP it is possible to find all occurrences of a string of length m in a string of length n using O(m+n) machine operations. This page describes the algorithm: http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

KMP works well with the UTF-8 encoding of Unicode. One property of UTF-8 is that the encoding of one Unicode symbol is never a substring of the encoding of another symbol. This property makes it possible to take the UTF-8 encoding of the string you wish to search for and find it directly in the UTF-8 encoding of the text. Geany uses Scintilla, and Scintilla uses UTF-8 to encode the document it displays, and Scintilla has a command that gives the raw UTF-8 byte array for a [start, end) range. So KMP gives great speed for searching all occurrences, and may be used with the underlying text representation of Scintilla used by Geany. The UTF-8 encoding of a Unicode string of length n is at most 4n bytes, since each symbol encodes to at most 4 bytes (6 bytes under the original UTF-8 definition).

I also think that including this functionality/feature into Geany core would be a good choice. It would be a small tradeoff between keeping the core small, and adding this new functionality, but this is your choice.

If you wish to extend automark, then this is a good choice too. If you wish, and if it helps, please reuse any part of the implementation provided here: http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea... If needed, I would help.

What should I do next? Should I not do the pull request for GeanyHighlightSelectedWord? It is okay with me. GeanyHighlightSelectedWord would then be still available at sourceforge until Geany provides this functionality from its core, or from automark.

I wonder if this algorithm should be applied to all searches, and thus be integrated into Scintilla. Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

Best regards

Lex Trotman

1:29 p.m.

KMP calculates a table of how many characters it can skip if some of the search string matches but not all of it. It's more useful for relatively large search strings since it can potentially skip a lot of compares. Since this use is always going to search the whole file, to mark all occurrences rather than stop at the first/next match, it's probably worthwhile for any reasonable size search string despite the preparation needed.

But I would really like to see good benchmarks, since with modern CPUs and caches I would also suspect many of the comparisons in the naive algorithm will be hidden by the memory access time, and its simple order of access might make better use of prefetches.

And of course there are the SIMD instructions in modern processors just waiting to be exploited by the adventurous :)

Cheers Lex

On 29 May 2015 at 20:49, Thomas Martitz kugel@rockbox.org wrote:

...

Am 29.05.2015 um 12:44 schrieb marius buzea:

...
Hello,

With KMP it is possible to find all occurrences of a string of length m in a string of length n using O(m+n) machine operations. This page describes the algorithm: http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

KMP works well with the UTF-8 encoding of Unicode. One property of UTF-8 is that the encoding of one Unicode symbol is never a substring of the encoding of another symbol. This property makes it possible to take the UTF-8 encoding of the string you wish to search for and find it directly in the UTF-8 encoding of the text. Geany uses Scintilla, and Scintilla uses UTF-8 to encode the document it displays, and Scintilla has a command that gives the raw UTF-8 byte array for a [start, end) range. So KMP gives great speed for searching all occurrences, and may be used with the underlying text representation of Scintilla used by Geany. The UTF-8 encoding of a Unicode string of length n is at most 4n bytes, since each symbol encodes to at most 4 bytes (6 bytes under the original UTF-8 definition).

I also think that including this functionality/feature into Geany core would be a good choice. It would be a small tradeoff between keeping the core small, and adding this new functionality, but this is your choice.

If you wish to extend automark, then this is a good choice too. If you wish, and if it helps, please reuse any part of the implementation provided here:

http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea... If needed, I would help.

What should I do next? Should I not do the pull request for GeanyHighlightSelectedWord? It is okay with me. GeanyHighlightSelectedWord would then be still available at sourceforge until Geany provides this functionality from its core, or from automark.

I wonder if this algorithm should be applied to all searches, and thus be integrated into Scintilla. Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

Best regards

marius buzea

1:40 p.m.

New subject: pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

...

I wonder if this algorithm should be applied to all searches, and thus be integrated into scintilla.

KMP may be used to find all occurrences of a string P in a string T. Say P is "hello"; then KMP may be used to find all occurrences of "hello" in another string T, where T may be the document Scintilla is displaying. You could also use KMP to do a case-insensitive search if you first change P to be all upper case, and then, whenever you read T[i], change T[i] to upper case as well. These are two use cases that KMP can handle. I do not know all the possibilities Scintilla defines for searching. This may be analyzed.

KMP may also be used to find occurrences iteratively: once you compute the prefix table, you can record the state of the KMP search in a structure, instead of writing the KMP in one function. I did this in GeanyHighlightSelectedWord.
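The two ideas above (upper-casing both sides, and keeping the search state in a structure) might look like this in C. This is a sketch, not the GeanyHighlightSelectedWord source: `KmpState`, `kmp_init()`, `kmp_next()` and `kmp_free()` are hypothetical names. The case folding uses toupper() byte by byte, so it is only meaningful for ASCII letters; non-ASCII UTF-8 text would need real Unicode case folding.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    char      *pat;    /* private copy, upper-cased when fold is set */
    ptrdiff_t *border; /* prefix table, m + 1 entries */
    size_t     m;      /* pattern length */
    ptrdiff_t  j;      /* pattern bytes currently matched */
    size_t     next;   /* next text index to examine */
    bool       fold;   /* compare case-insensitively (ASCII only) */
} KmpState;

static char fold_byte(char c, bool fold)
{
    return fold ? (char)toupper((unsigned char)c) : c;
}

/* Copy the pattern, build the prefix table, and reset the search state. */
bool kmp_init(KmpState *s, const char *pat, size_t m, bool fold_case)
{
    ptrdiff_t i = 0, j = -1;

    assert(s != NULL && pat != NULL && m > 0);
    s->pat = malloc(m);
    s->border = malloc((m + 1) * sizeof *s->border);
    if (s->pat == NULL || s->border == NULL) {
        free(s->pat);
        free(s->border);
        return false;
    }
    for (size_t k = 0; k < m; k++)
        s->pat[k] = fold_byte(pat[k], fold_case);
    s->border[0] = -1;
    while ((size_t)i < m) {
        while (j >= 0 && s->pat[i] != s->pat[j])
            j = s->border[j];
        i++;
        j++;
        s->border[i] = j;
    }
    s->m = m;
    s->j = 0;
    s->next = 0;
    s->fold = fold_case;
    return true;
}

/* Resume scanning text[0, n) where the previous call stopped. Stores the
 * next match offset in *pos and returns true, or returns false once the
 * text is exhausted. */
bool kmp_next(KmpState *s, const char *text, size_t n, size_t *pos)
{
    while (s->next < n) {
        char c = fold_byte(text[s->next], s->fold);

        while (s->j >= 0 && c != s->pat[s->j])
            s->j = s->border[s->j];
        s->j++;
        s->next++;
        if ((size_t)s->j == s->m) {
            *pos = s->next - s->m;
            s->j = s->border[s->j];
            return true;
        }
    }
    return false;
}

void kmp_free(KmpState *s)
{
    free(s->pat);
    free(s->border);
    s->pat = NULL;
    s->border = NULL;
}
```

Because the state lives in the structure, a caller can fetch one match per call and do other work (for example, setting an indicator) in between, without restarting the scan.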

...

Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

I think KMP does not have drawbacks. The prefix table is of length m+1, where m is the length of the string P for which you wish to find all occurrences in some string T. For example, if P is "hello", then m == 5, and you build a prefix table of length m+1 == 6. T may then be arbitrarily long; you do not need any additional memory for it. You only need the extra m+1 size_t locations for the prefix table.

Have a great day, Marius Ioan Buzea

-------------------------------------------- On Fri, 5/29/15, Thomas Martitz kugel@rockbox.org wrote:

Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins To: devel@lists.geany.org Date: Friday, May 29, 2015, 1:49 PM

Am 29.05.2015 um 12:44 schrieb marius buzea:

...

Hello,

With KMP it is possible to find all occurrences of a string of length m in a string of length n using O(m+n) machine operations. This page describes the algorithm: http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

KMP works well with the UTF-8 encoding of Unicode. One property of UTF-8 is that the encoding of one Unicode symbol is never a substring of the encoding of another symbol. This property makes it possible to take the UTF-8 encoding of the string you wish to search for and find it directly in the UTF-8 encoding of the text. Geany uses Scintilla, and Scintilla uses UTF-8 to encode the document it displays, and Scintilla has a command that gives the raw UTF-8 byte array for a [start, end) range. So KMP gives great speed for searching all occurrences, and may be used with the underlying text representation of Scintilla used by Geany. The UTF-8 encoding of a Unicode string of length n is at most 4n bytes, since each symbol encodes to at most 4 bytes (6 bytes under the original UTF-8 definition).

I also think that including this functionality/feature into Geany core would be a good choice. It would be a small tradeoff between keeping the core small, and adding this new functionality, but this is your choice.

If you wish to extend automark, then this is a good choice too. If you wish, and if it helps, please reuse any part of the implementation provided here: http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea... If needed, I would help.

What should I do next? Should I not do the pull request for GeanyHighlightSelectedWord? It is okay with me. GeanyHighlightSelectedWord would then be still available at sourceforge until Geany provides this functionality from its core, or from automark.

...

Best regards

Lex Trotman

3:22 p.m.

...

...
Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

I think KMP does not have drawbacks. The prefix table is of length m+1, where m is the length of the string P for which you wish to find all occurrences in some string T. For example, if P is "hello", then m == 5, and you build a prefix table of length m+1 == 6. T may then be arbitrarily long; you do not need any additional memory for it. You only need the extra m+1 size_t locations for the prefix table.

Its drawback is that it is more complicated, and as I said you have to benchmark to see if it's worth it; things that were worthwhile back when KMP was invented may not be worthwhile today.

...

Have a great day, Marius Ioan Buzea

On Fri, 5/29/15, Thomas Martitz kugel@rockbox.org wrote:

Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins To: devel@lists.geany.org Date: Friday, May 29, 2015, 1:49 PM

On 29.05.2015 at 12:44, marius buzea wrote:

Hello,

With KMP it is possible to find all occurrences of a string of length m in a string of length n using O(m+n) machine operations. The following page describes the algorithm:

     http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

KMP works well with the UTF-8 encoding of Unicode. One property of UTF-8 is that the encoding of one Unicode symbol is never a substring of the encoding of another symbol. This property makes it possible to take the UTF-8 encoding of the string you wish to search for and find it directly in the UTF-8 encoding of the text. Geany uses Scintilla, Scintilla uses UTF-8 to encode the document it displays, and Scintilla has a command that returns the raw UTF-8 byte array for a [start, end) range. So KMP gives great speed for finding all occurrences, and may be used with the underlying text representation of Scintilla used by Geany. The UTF-8 encoding of a Unicode string of length n is at most 6n bytes, since each symbol is encoded in at most 6 bytes (at most 4 under the current UTF-8 definition, RFC 3629).
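The UTF-8 property described above can be illustrated with a small sketch in C. It assumes that both the text and the search string are valid, complete UTF-8; the helper names utf8_is_char_start and utf8_find_bytes are hypothetical and are not part of Geany or Scintilla. A plain byte-wise search (strstr here, but KMP would do equally well) can then only match at a character boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A byte starts a UTF-8 character unless it is a continuation byte,
 * i.e. unless it has the bit pattern 10xxxxxx. */
static bool utf8_is_char_start(unsigned char byte)
{
    return (byte & 0xC0) != 0x80;
}

/* Byte-wise search for needle in text, starting at byte offset from.
 * Returns the byte offset of the first match, or (size_t)-1 if none.
 * Assumption: text and needle are NUL-terminated, valid UTF-8. */
static size_t utf8_find_bytes(const char *text, const char *needle, size_t from)
{
    const char *hit = strstr(text + from, needle);

    return hit != NULL ? (size_t)(hit - text) : (size_t)-1;
}
```

For instance, in the byte string "na\xC3\xAFve caf\xC3\xA9" a search for "\xC3\xA9" (U+00E9) matches at byte offset 10, which is a lead byte and therefore a character boundary; it does not match inside the encoding of U+00EF at offset 2.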

I also think that including this functionality/feature in Geany core would be a good choice. It would be a small tradeoff between keeping the core small and adding this new functionality, but this is your choice.

If you wish to extend automark, then this is a good choice too. If you wish, and if it helps, please reuse any part of the implementation provided here:

http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea...

If needed, I would help.

What should I do next? Should I not do the pull request for GeanyHighlightSelectedWord? It is okay with me.

GeanyHighlightSelectedWord would then still be available at SourceForge until Geany provides this functionality from its core, or from automark.

I wonder if this algorithm should be applied to all searches, and thus be integrated into Scintilla. Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

Best regards

Lex Trotman

3:46 p.m.

On 29 May 2015 at 23:22, Lex Trotman elextr@gmail.com wrote:

Does it have any major drawbacks? I read it has to build some kind of "prefix table" prior to running the search, but I guess that's negligible for all reasonable search terms?

I think KMP does not have drawbacks. The prefix table has length m+1, where m is the length of the string P whose occurrences you want to find in some string T. For example, if P is "hello", then m == 5 and you build a prefix table of length m+1 == 6. T may be arbitrarily long; searching it needs no additional memory. You only need the extra m+1 size_t locations for the prefix table.

Its drawback is that it is more complicated, and as I said you have to benchmark to see if it's worth it. Things that were worthwhile back when KMP was invented may not be worthwhile today.

I might add that the search time may be irrelevant compared to the time to display unless you have a huge buffer and a very small window.


Colomban Wendling

30 May

3:21 a.m.

On 29/05/2015 at 00:10, marius buzea wrote:

...

https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

[…]

The quick-search.c calls Geany's search_find_text several times in one processing, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

search_find_text() doesn't do a regex search when the flags don't ask for it; it only uses SCI_FINDTEXT.

=====

BTW, @Steven: search_find_text() is *NOT* part of the Geany plugin API and never has been. The fact that you can use it is an issue with the way the Geany API was exported, and it is fixed in the dev version (meaning it won't work anymore). Note also that this never worked on Windows.

If you need the function, tell us and we can probably add it. Though here all you need is SCI_FINDTEXT, which is already available through scintilla_send_message(sci, SCI_FINDTEXT, flags, ttf).

Cheers, Colomban

Matthew Brush

4:29 a.m.

On 2015-05-29 06:21 PM, Colomban Wendling wrote:

...

On 29/05/2015 at 00:10, marius buzea wrote:

...
https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

[…]

The quick-search.c calls Geany's search_find_text several times in one processing, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

search_find_text() doesn't do regex search when the flags don't ask for it, it only uses SCI_FINDTEXT().

=====

BTW, @Steven: search_find_text() is *NOT* part of the Geany plugin API and never has been. The fact that you can use it is an issue with the way the Geany API was exported, and it is fixed in the dev version (meaning it won't work anymore). Note also that this never worked on Windows.

[snip]

Mea culpa :)

Lessons learned:

- Messing with build system flags can affect API (and ABI for that matter) without ever touching the code itself.
- Never use any function that isn't explicitly listed in Geany's Doxygen documentation. Even if a function has no or incomplete documentation, if it shows up in the API reference docs (i.e. has a /** or similar Doxygen comment), it's safe to use; otherwise it probably should be, or it's a bug.

Cheers, Matthew Brush

Steven Blatnick

1 Jun

3:51 p.m.

Thanks Matthew. I was wondering how to tell what was API. (Colomban can disregard that question in the other email).


Steven Blatnick

3:48 p.m.

Responses:

On 05/29/2015 07:21 PM, Colomban Wendling wrote:

...

On 29/05/2015 at 00:10, marius buzea wrote:

...
https://github.com/sblatnick/geany-plugins/blob/master/quick-search/src/quic....

[…]

The quick-search.c calls Geany's search_find_text several times in one processing, and each time a regex would be recompiled in search_find_text. This is, I guess, a small cost when the regex is just a string.

Odd, I don't see this reply from Marius in my inbox. Was this in private separately?

...

search_find_text() doesn't do regex search when the flags don't ask for it, it only uses SCI_FINDTEXT().

=====

BTW, @Steven: search_find_text() is *NOT* part of the Geany plugin API and never has been. The fact that you can use it is an issue with the way the Geany API was exported, and it is fixed in the dev version (meaning it won't work anymore). Note also that this never worked on Windows.

Thanks for the information. I wonder if any of my other plugins include non-API calls. Is there an easy way to tell what is allowed and what shouldn't be? Is there a reason we don't allow plugins to tie into anything when they could, besides trying to stop plugins from being overly complicated or breaking things?

...

If you need the function, tell us and we can probably add it. Though here all you need is SCI_FINDTEXT, which is already available through scintilla_send_message(sci, SCI_FINDTEXT, flags, ttf).

Ok, I'll look into that.

Thanks,

Steve

Colomban Wendling

4:09 p.m.

On 01/06/2015 at 15:48, Steven Blatnick wrote:

...

[…]

Odd, I don't see this reply from Marius in my inbox. Was this in private separately?

No, it was sent to the mailing list just like the rest… maybe a spam filter got confused?

...

...
[…]

BTW, @Steven: search_find_text() is *NOT* part of the Geany plugin API and never has been. The fact that you can use it is an issue with the way the Geany API was exported, and it is fixed in the dev version (meaning it won't work anymore). Note also that this never worked on Windows.

...

Thanks for the information. I wonder if any of my other plugins include non-API calls. Is there an easy way to tell what is allowed and what shouldn't be?

Anything not in the API documentation shouldn't be used. And as said, the current Geany development version (1.25) has this fixed, so your plugin shouldn't load anymore if it uses something it shouldn't.

...

Is there a reason we don't allow plugins to tie into anything when they could, besides trying to stop plugins from being overly complicated or breaking things?

The reason is that we don't want to break the API every few minutes, so this means it has to be defined. This can't reasonably include every function in Geany, as it would basically mean we can't change anything inside Geany without potentially breaking plugins.

So we choose what to render public (based on needs basically), and we then commit to maintain this API (to a reasonable extent, at least, meaning we will only change it if there is an important reason to).

To use the example of search_find_text() as how non-API things can change, this function actually changed in the 1.24 cycle [1] to fix a real problem.

All this said, if you need a function that isn't part of the API, ask (or make a PR!) and we'll probably be happy to add it if it makes sense.

Regards, Colomban

[1] http://git.geany.org/geany/commit/?id=5412a244ba903624053cdaf7393732bc3af689...

Steven Blatnick

8 Jul

7:57 p.m.

So I've finally got a chance to look at my non-API calls. I was able to code around most of them, but there are two that would be much easier if we could make them APIs. (I haven't pushed any of these changes to my git repo yet.) Could we consider making these API?

* keybindings_load_keyfile - I dynamically add/remove a variable number of plugin keybindings based on the plugin settings, so this allows me to refresh the results easily. This allows my external-tools plugin to have any number of tools, each with its own keybinding. Otherwise, most plugins have a set number of bindings.
* keybindings_lookup_item - I know keybindings_get_item is available already, but I am attempting to look up a core group keybinding and not the plugin's own keybindings.
* keybindings_dialog_show_prefs_scroll - I remember someone saying the "Configure Plugins" window would have a button to this already in a later version, but I still don't see it. I only need this API if the button isn't added.

Let me know if this is possible or how I should proceed. I use geany with my plugins daily, and can't upgrade my code base until my plugins are working.

Thanks,

Steve

On 05/29/2015 07:21 PM, Colomban Wendling wrote:

...

BTW, @Steven: search_find_text() is *NOT* part of the Geany plugin API and never has been. The fact that you can use it is an issue with the way the Geany API was exported, and it is fixed in the dev version (meaning it won't work anymore).

Matthew Brush

9 Jul

3:18 a.m.

On 2015-07-08 10:57 AM, Steven Blatnick wrote:

...

So I've finally got a chance to look at my non-API calls. I was able to code around most of them, but there are two that would be much easier if we could make them APIs. (I haven't pushed any of these changes to my git repo yet.) Could we consider making these API?

keybindings_load_keyfile - I dynamically add/remove a variable number of plugin keybindings based on the plugin settings, so this allows me to refresh the results easily. This allows my external-tools plugin to have any number of tools with each their own keybinding. Otherwise, most plugins have a set number of bindings.

This sounds dubious.

I assume you're talking about `external-tools` plugin? Maybe I don't understand the code enough, but it looks to me like it's just leaking GeanyKeyGroups in `reload_tools()` and then calling `keybindings_load_keyfile()` happens to reload the key group it newly created?

I completely agree there needs to be a way to dynamically add/remove keybindings, but I'm not sure we should promote this way if I understand it correctly. IMO, it would be much better to fix Geany.

...

keybindings_lookup_item - I know keybindings_get_item is available already, but I am attempting to look up a core group keybinding and not plugin's own keybindings.

This sounds reasonable, though I think it would be better if made public to rename it to something like `keybindings_get_builtin_item()` or something. Also I think we should change the signature to use the correct types (those enums we already expose).

...

keybindings_dialog_show_prefs_scroll - I remember someone saying the "Configure Plugins" window would have a button to this already in a later version, but I still don't see it. I only need this API if the button isn't added.

Could probably make such a button/link use the same code as the "Keybindings" button in the Plugin Manager dialog, since it does just that. It might be a bit awkward UI-wise though.

...

Let me know if this is possible or how I should proceed. I use geany with my plugins daily, and can't upgrade my code base until my plugins are working.

Best is to make a PR with the changes you want. Second best is to raise an Issue and hope somebody else wants them enough to do it.

Cheers, Matthew Brush

Matthew Brush

4:58 a.m.

On 2015-07-08 10:57 AM, Steven Blatnick wrote:

...

[snip] Let me know if this is possible or how I should proceed. I use geany with my plugins daily, and can't upgrade my code base until my plugins are working.

I forgot to mention: if you made your plugin part of the Geany-Plugins project (I'm assuming it's FOSS, I couldn't find a license), core Geany developers will usually test API changes against the GP project to see what breaks, and you might sometimes even find patches to your plugin waiting in your inbox, to fix this kind of stuff for you :)

Cheers, Matthew Brush

Steven Blatnick

6 p.m.

Thanks! I'd love to get this into geany-plugins, but I keep procrastinating because of lack of time to figure out how to integrate into the build system. Any suggestions on where to start with that? I'm a bit unclear on what files are generated vs what I need to edit to add my plugins.

Thanks,

Steve


Lex Trotman

14 Jul

7 a.m.

Hi Steve,

There are some changes coming soon (hopefully) in the keybinding which will probably add API and might help you, check out #376.

Cheers Lex


Steven Blatnick

5:36 p.m.

Thanks for the heads up :-D


Matthew Brush

15 Jun

1:21 a.m.

On 2015-05-27 05:25 AM, marius buzea wrote:

...

Hello,

I would like to add GeanyHighlightSelectedWords, to Geany Plugins. [snip]

See https://github.com/geany/geany/pull/513

Cheers, Matthew Brush


participants (9)

Colomban Wendling
Frank Lanitz
Jiří Techet
Lex Trotman
marius buzea
Matthew Brush
Pavel Roschin
Steven Blatnick
Thomas Martitz