[Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins

Lex Trotman elextr at xxxxx
Fri May 29 13:22:03 UTC 2015


>
>> Does it have any major drawbacks? I read  it has to some kind
>> "prefix table" prior to running the search, but I  guess that's
>> negligible for all reasonable search terms?
>
> I think KMP does not have drawbacks.   The prefix table is of length m+1,
> m is the length of the string P for which you wish to find all occurrences
> in some string T.    For example, P is "hello", m == 5, and you build a
> prefix table of length m+1 == 6.    Then T may be any long string, for the
> long string you do not need any additional memory.   You only need the extra
> m+1 size_t locations for the prefix table, m is length of P, the string you
> wish to search.

Its drawback is it is more complicated, and as I said you have to
benchmark to see if its worth it, things that were worthwhile back
when KMP was invented may not be worthwhile today.

>
>
> Have a great day,
> Marius Ioan Buzea
>
>
>
>
>
>
> --------------------------------------------
> On Fri, 5/29/15, Thomas Martitz <kugel at rockbox.org> wrote:
>
>  Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins
>  To: devel at lists.geany.org
>  Date: Friday, May 29, 2015, 1:49 PM
>
>  Am 29.05.2015 um 12:44
>  schrieb marius buzea:
>  > Hello,
>  >
>  >
>  >
>  With KMP it is possible to search all occurrences of a m
>  length string, into a n length string,
>  >
>  using O(m+n) machine operations.    Next page:
>  >          http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm
>  > describes the algorithm.
>  >
>  >
>  >
>  > The KMP works well
>  with the utf-8 encoding of unicode.    One property of
>  utf8 is that
>  > the encoding one unicode
>  symbol is not a substring of another utf8 substring.
>  This
>  > property allows to take the utf-8
>  encoding of the string you wish to search, and to
>  > find this utf8 encoding string, in the
>  utf8 encoding of the text string.       Geany
>  uses
>  > scintilla, and scintilla uses utf8
>  to encode the document it displays, and scintilla has
>  > a command that gives the raw utf8 byte
>  array for a [start, end) range.      So, KMP
>  > gives great speed for searching all
>  occurrences, and may be used with the underlying
>  > text representation of scintilla used by
>  geany.     The utf-8 encoding of a unicode
>  > string of length n, is less than 6n, each
>  utf8 encoding is at most 6 bytes.
>  >
>  >
>  >
>  >
>  I also think that including this functionality/feature into
>  Geany core would be a good choice.
>  > It
>  would be a small tradeoff between keeping the core small,
>  and adding this new functionality,
>  > but
>  this is your choice.
>  >
>  >
>  >
>  >
>  If you wish to extend automark, then this is good choice
>  too.   If you wish, and if it helps,
>  > please reuse any part of the
>  implementation provided here:
>  >    http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/GeanyHighlightSelectedWord/GeanyHighlightSelectedWord.c
>  > If needed, I would help.
>  >
>  >
>  >
>  What should I do next?     Should I not do the
>  pull request for GeanyHighlightSelectedWord?
>  > It is okay with me.
>  GeanyHighlightSelectedWord would then be still available at
>  sourceforge until
>  > Geany provides this
>  functionality from its core, or from automark.
>  >
>
>
>  I wonder if this algorithm should be applied to
>  all searches, and thus
>  be integrated into
>  scintilla. Does it have any major drawbacks? I read
>  it has to some kind "prefix table"
>  prior to running the search, but I
>  guess
>  that's negligible for all reasonable search terms?
>
>  Best regards
>  _______________________________________________
>  Devel mailing list
>  Devel at lists.geany.org
>  https://lists.geany.org/cgi-bin/mailman/listinfo/devel
>
> _______________________________________________
> Devel mailing list
> Devel at lists.geany.org
> https://lists.geany.org/cgi-bin/mailman/listinfo/devel


More information about the Devel mailing list