[Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins
Lex Trotman
elextr at xxxxx
Fri May 29 13:46:11 UTC 2015
On 29 May 2015 at 23:22, Lex Trotman <elextr at gmail.com> wrote:
>>
>>> Does it have any major drawbacks? I read it has to some kind
>>> "prefix table" prior to running the search, but I guess that's
>>> negligible for all reasonable search terms?
>>
>> I think KMP does not have drawbacks. The prefix table is of length m+1,
>> m is the length of the string P for which you wish to find all occurrences
>> in some string T. For example, P is "hello", m == 5, and you build a
>> prefix table of length m+1 == 6. Then T may be any long string, for the
>> long string you do not need any additional memory. You only need the extra
>> m+1 size_t locations for the prefix table, m is length of P, the string you
>> wish to search.
>
> Its drawback is it is more complicated, and as I said you have to
> benchmark to see if its worth it, things that were worthwhile back
> when KMP was invented may not be worthwhile today.
>
I might add that the search time may be irrelevant compared to the
time to display unless you have a huge buffer and a very small window.
>>
>>
>> Have a great day,
>> Marius Ioan Buzea
>>
>>
>>
>>
>>
>>
>> --------------------------------------------
>> On Fri, 5/29/15, Thomas Martitz <kugel at rockbox.org> wrote:
>>
>> Subject: Re: [Geany-Devel] pull request on GitHub, to add GeanyHighlightSelectedWords, into Geany Plugins
>> To: devel at lists.geany.org
>> Date: Friday, May 29, 2015, 1:49 PM
>>
>> Am 29.05.2015 um 12:44
>> schrieb marius buzea:
>> > Hello,
>> >
>> >
>> >
>> With KMP it is possible to search all occurrences of a m
>> length string, into a n length string,
>> >
>> using O(m+n) machine operations. Next page:
>> > http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm
>> > describes the algorithm.
>> >
>> >
>> >
>> > The KMP works well
>> with the utf-8 encoding of unicode. One property of
>> utf8 is that
>> > the encoding one unicode
>> symbol is not a substring of another utf8 substring.
>> This
>> > property allows to take the utf-8
>> encoding of the string you wish to search, and to
>> > find this utf8 encoding string, in the
>> utf8 encoding of the text string. Geany
>> uses
>> > scintilla, and scintilla uses utf8
>> to encode the document it displays, and scintilla has
>> > a command that gives the raw utf8 byte
>> array for a [start, end) range. So, KMP
>> > gives great speed for searching all
>> occurrences, and may be used with the underlying
>> > text representation of scintilla used by
>> geany. The utf-8 encoding of a unicode
>> > string of length n, is less than 6n, each
>> utf8 encoding is at most 6 bytes.
>> >
>> >
>> >
>> >
>> I also think that including this functionality/feature into
>> Geany core would be a good choice.
>> > It
>> would be a small tradeoff between keeping the core small,
>> and adding this new functionality,
>> > but
>> this is your choice.
>> >
>> >
>> >
>> >
>> If you wish to extend automark, then this is good choice
>> too. If you wish, and if it helps,
>> > please reuse any part of the
>> implementation provided here:
>> > http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/GeanyHighlightSelectedWord/GeanyHighlightSelectedWord.c
>> > If needed, I would help.
>> >
>> >
>> >
>> What should I do next? Should I not do the
>> pull request for GeanyHighlightSelectedWord?
>> > It is okay with me.
>> GeanyHighlightSelectedWord would then be still available at
>> sourceforge until
>> > Geany provides this
>> functionality from its core, or from automark.
>> >
>>
>>
>> I wonder if this algorithm should be applied to
>> all searches, and thus
>> be integrated into
>> scintilla. Does it have any major drawbacks? I read
>> it has to some kind "prefix table"
>> prior to running the search, but I
>> guess
>> that's negligible for all reasonable search terms?
>>
>> Best regards
>> _______________________________________________
>> Devel mailing list
>> Devel at lists.geany.org
>> https://lists.geany.org/cgi-bin/mailman/listinfo/devel
>>
>> _______________________________________________
>> Devel mailing list
>> Devel at lists.geany.org
>> https://lists.geany.org/cgi-bin/mailman/listinfo/devel
More information about the Devel
mailing list