[Geany-devel] tagmanager changes

Sun Apr 29 13:07:44 UTC 2012

On 29 April 2012 22:07, Nick Treleaven <nick.treleaven at btinternet.com> wrote:
> On 27/04/2012 06:30, Lex Trotman wrote:
>>
>> [...]
>>>
>>>
>>> I don't understand why tagmanager has to be replaced, why not just
>>> replace
>>> the parts you want to improve? Rewriting it is likely to lead to a new
>>> set
>>> of bugs and be hard to review and merge changes from master.
>
>
>> One of the problems with tagmanager is its complexity, leading to
>> considerable wariness on the part of many of us about changing it
>> since we don't understand what we might break.
>
>
> I don't see this myself, I see some complicating issues which could be
> resolved (and I'm willing to work on them), but generally a sound design for
> what it provides and for extra things we may want to add.
>

Maybe my problem is simply the attempt to simulate object orientation
and inheritance in C makes it look complicated for a spoilt C++
programmer :)

>
>> Actually documenting the overall structure of tagmanager and how it is
>> supposed to work would be a good thing, whats a workspace? what is it
>> meant to represent, how are scopes represented? etc.
>
>
> Isn't it clear from the data structures? Look at TMWorkspace. Scopes and
> other tag metadata is the same as CTags. Obviously if we had at-a-glance
> overall documentation that would be good.

Actually I meant to ask, it does have documentation comments, is it a
style that doxygen can read.  If so maybe that will help, I'll try
tomorrow.

As for scopes, all I can see is nested structures, not nested code
scopes.  I guess thats not surprising given that they are not parsed
by the parsers.  So AFAICT solving the example below is still an open
question.

>
> One confusing thing is that a TMTag can be used for an actual tag or for a
> file. Probably that could be cleaned up.
>
>
>>>>   - a "multi-cache" one that, as its name suggests, maintains multiple
>>>>     caches (sorted tags arrays).  it uses a little more memory and is
>>>>     slower on insertion since it maintains several sorted lists, but a
>>>>     search on an already existing cache is very fast.
>>>
>>>
>>>
>>> Won't this be slow for adding many tags at once? How is this design
>>> better
>>> than tagmanager?
>>
>>
>> Perhaps Colomban could confirm, but my first thought is that this is
>> for nested scopes.
>
>
> I expect the design is better in some respects (and to be fair I didn't look
> for better things), but finding a tag based on its name is something we are
> always going to want to be fast. Even for scope completion, you still need
> to lookup a tag structure from a name string. So I think we will always want
> a sorted array of tags per document that we can bsearch (or something
> equally fast).

Yes, we could have one name table and then prune the results to the
scope required, or a name table per scope (which makes the tables
smaller and simple searches faster).

>
> Also, I've probably sounded quite harsh on Colomban's design, but I'm
> commenting on what I think is important. I am genuinely interested in why
> his design decisions are better. It's a lot to take in all at once, so
> probably needs some explanation. Sorry if I didn't make that clear.
>
>
>> How does tagmanager handle nested scopes, or how would it need to be
>> modified to do so, considering the example (in C)
>>
>> { struct a o; struct a p;
>>    o./* struct a members */
>>   { struct b o;
>>     o./* struct b members */
>>     p./* struct a members */
>>   }
>>   o./* struct a members */
>>   p./* struct a members */
>>   { struct c o;
>>     o./* struct c members */
>>     p./* struct a members */
>>   }
>>   o./* struct a members */
>>   p./* struct a members */
>> }
>>
>> How much needs to be changed in tagmanager so that the right
>> autocompletes can be provided at each comment?  (assuming c.c is
>> taught to parse local variables of course)
>
>
> I don't know, but we still need fast tag lookup based on name. If O(n) scope
> lookup is too slow, we will need additional data structures arranged
> differently, but whatever we have should have something like O(log n) lookup
> for names as this is by far the most common operation.
>

Agree, name lookup should be as fast as possible, O(log n) what about
O(1) like compilers :)

Well since we can't use hashes due to the need to identify prefixes, a
radix trie seems the best since it can be as fast as a hash ie two
passes through the key, and that gives a sub-trie that begins with the
prefix.  But as you say thats orthogonal to the question of scope
layout.

>
>>>> * this "backend" abstraction might be really overkill, and maybe we
>>>>   could do better without it?;
>>>
>>>
>>>
>>> I don't see why having two is better. The memory overhead for a pointer
>>> array is not much vs. the size of the tag structures. Fast searching is
>>> important.
>>>
>>
>> It is but is it flexible enough to be expanded to nested scopes.
>
>
> see above.

Still open (at least to my mind).

>
>
>>>> * tags (and most types) are reference-counted (so they aren't
>>>>   necessarily duplicated, e.g. in the multicache backend);
>>>
>>>
>>>
>>> I don't really understand src/symbols.c since the real-time parsing
>>> change,
>>> so don't really understand why this is needed.
>>
>>
>> Blame C++ and overloaded names I think.
>
>
> I looked at the thread about that, and from what I could tell, the problem
> was for reparsing unsaved files. Wasn't the order OK for files that have
> just been saved? (Also I don't follow what that has to do with reference
> counting).

Hmm, you're right, looks like Colomban is duplicating tags in nested
caches, so all searches are just one cache, but in fact they are not
duplicated, but reference the same tag using a reference count to tell
when to destroy it.  I am not sure where the reference to real time
parsing comes from?

>
>
>>> I don't really see what the problem understanding it is. I thought scope
>>> completion was just tm_workspace_find_scoped and related functions, not
>>> some
>>> tagmanager-wide problem.
>>
>>
>> I think the fact that this isn't clear is the problem :)
>
>
> If I'm following you correctly, I think you're saying the design needs to
> change, which I accept may be true. What I was saying was that understanding
> the existing code for scope completion is not really that complex.

I think we are talking about different things, IIUC you are talking
about nested structures, I am talking about the {} in the example
above.  At the moment it doesn't exist in the tagmanager and is
different to the nested structure problem.

Cheers
Lex

>
>
> Regards,
> Nick
> _______________________________________________
> Geany-devel mailing list
> Geany-devel at uvena.de
> https://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel