Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

29 May 2015

      ...
...
Does it have any major drawbacks? I read  it has to some kind
"prefix table" prior to running the search, but I  guess that's
negligible for all reasonable search terms?
I think KMP does not have drawbacks.   The prefix table is of length m+1,
m is the length of the string P for which you wish to find all occurrences
in some string T.    For example, P is "hello", m == 5, and you build a
prefix table of length m+1 == 6.    Then T may be any long string, for the
long string you do not need any additional memory.   You only need the extra
m+1 size_t locations for the prefix table, m is length of P, the string you
wish to search.
Its drawback is it is more complicated, and as I said you have to
benchmark to see if its worth it, things that were worthwhile back
when KMP was invented may not be worthwhile today.
...
Have a great day,
Marius Ioan Buzea

On Fri, 5/29/15, Thomas Martitz kugel@rockbox.org wrote:
Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins
 To: devel@lists.geany.org
 Date: Friday, May 29, 2015, 1:49 PM
Am 29.05.2015 um 12:44
 schrieb marius buzea:
...
Hello,
With KMP it is possible to search all occurrences of a m
 length string, into a n length string,
...
using O(m+n) machine operations.    Next page:
...
     http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

describes the algorithm.
The KMP works well
with the utf-8 encoding of unicode.    One property of
 utf8 is that
...
the encoding one unicode
symbol is not a substring of another utf8 substring.
 This
...
property allows to take the utf-8
encoding of the string you wish to search, and to
...
find this utf8 encoding string, in the
utf8 encoding of the text string.       Geany
 uses
...
scintilla, and scintilla uses utf8
to encode the document it displays, and scintilla has
...
a command that gives the raw utf8 byte
array for a [start, end) range.      So, KMP
...
gives great speed for searching all
occurrences, and may be used with the underlying
...
text representation of scintilla used by
geany.     The utf-8 encoding of a unicode
...
string of length n, is less than 6n, each
utf8 encoding is at most 6 bytes.
...
I also think that including this functionality/feature into
 Geany core would be a good choice.
...
It
would be a small tradeoff between keeping the core small,
 and adding this new functionality,
...
but
this is your choice.
...
If you wish to extend automark, then this is good choice
 too.   If you wish, and if it helps,
...
please reuse any part of the
implementation provided here:
...
http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/Gea...
If needed, I would help.
What should I do next?     Should I not do the
 pull request for GeanyHighlightSelectedWord?
...
It is okay with me.
GeanyHighlightSelectedWord would then be still available at
 sourceforge until
...
Geany provides this
functionality from its core, or from automark.
...
I wonder if this algorithm should be applied to
 all searches, and thus
 be integrated into
 scintilla. Does it have any major drawbacks? I read
 it has to some kind "prefix table"
 prior to running the search, but I
 guess
 that's negligible for all reasonable search terms?
Best regards
 _______________________________________________
 Devel mailing list
 Devel@lists.geany.org
 https://lists.geany.org/cgi-bin/mailman/listinfo/devel

Devel mailing list
Devel@lists.geany.org
https://lists.geany.org/cgi-bin/mailman/listinfo/devel

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins