[Geany-devel] tagmanager changes

Mon Apr 30 16:07:09 UTC 2012

On 29/04/2012 14:07, Lex Trotman wrote:
> On 29 April 2012 22:07, Nick Treleaven<nick.treleaven at btinternet.com>  wrote:
>> On 27/04/2012 06:30, Lex Trotman wrote:
>> I don't see this myself, I see some complicating issues which could be
>> resolved (and I'm willing to work on them), but generally a sound design for
>> what it provides and for extra things we may want to add.
>>
>
> Maybe my problem is simply the attempt to simulate object orientation
> and inheritance in C makes it look complicated for a spoilt C++
> programmer :)

Eh? Tagmanager isn't OOP.

>> I expect the design is better in some respects (and to be fair I didn't look
>> for better things), but finding a tag based on its name is something we are
>> always going to want to be fast. Even for scope completion, you still need
>> to lookup a tag structure from a name string. So I think we will always want
>> a sorted array of tags per document that we can bsearch (or something
>> equally fast).
>
> Yes, we could have one name table and then prune the results to the
> scope required, or a name table per scope (which makes the tables
> smaller and simple searches faster).

It seems to me that if we supported proper scope completion then we 
could still have one array per document, sorted first by scope, then by 
name.

>> I don't know, but we still need fast tag lookup based on name. If O(n) scope
>> lookup is too slow, we will need additional data structures arranged
>> differently, but whatever we have should have something like O(log n) lookup
>> for names as this is by far the most common operation.
>>
>
> Agree, name lookup should be as fast as possible, O(log n) what about
> O(1) like compilers :)
>
> Well since we can't use hashes due to the need to identify prefixes, a
> radix trie seems the best since it can be as fast as a hash ie two
> passes through the key, and that gives a sub-trie that begins with the
> prefix.  But as you say thats orthogonal to the question of scope
> layout.

I only know about a simple trie, but wouldn't that use a lot more memory?

I don't think we want name lookup 'as fast as possible', just no slower 
than we have already.

>>>>> * tags (and most types) are reference-counted (so they aren't
>>>>>    necessarily duplicated, e.g. in the multicache backend);
>>>>
>>>>
>>>>
>>>> I don't really understand src/symbols.c since the real-time parsing
>>>> change,
>>>> so don't really understand why this is needed.
>>>
>>>
>>> Blame C++ and overloaded names I think.
>>
>>
>> I looked at the thread about that, and from what I could tell, the problem
>> was for reparsing unsaved files. Wasn't the order OK for files that have
>> just been saved? (Also I don't follow what that has to do with reference
>> counting).
>
> Hmm, you're right, looks like Colomban is duplicating tags in nested
> caches, so all searches are just one cache, but in fact they are not
> duplicated, but reference the same tag using a reference count to tell
> when to destroy it.  I am not sure where the reference to real time
> parsing comes from?

Sorry, I meant reparsing not real time updates. But is your problem with 
overloaded symbols OK when the file is parsed after saving, i.e. the 
order is only wrong on reparsing?