[Geany-devel] tagmanager changes

Tue May 8 12:31:57 UTC 2012

On 07/05/2012 18:04, Nick Treleaven wrote:
> On 02/05/2012 05:46, Lex Trotman wrote:
>> Hi All,
>>
>> To summarise since the thread has several subthreads.
>>
>> 1. Tagmanager Understandability
>>
>> a. I generated the doxygen documentation for tagmanager, it works if
>> you set recursive, but didn't help much:
>>
>> - if its not OOP why does it say things like "TMWorkspace is derived
>> from TMWorkObject" and similar?
> 
> documentation bug IMO

I don't think so.  TM uses a more or less OOP-like approach.  See for
example TMWorkspace:

typedef struct
{
    TMWorkObject work_object; /*!< The parent work object */
    GPtrArray *global_tags; /*!< Global tags loaded at startup */
    GPtrArray *work_objects; /*!< An array of TMWorkObject pointers */
} TMWorkspace;

The first field (work_object) is the inherited "class", here
TMWorkObject.  And you'll see numerous places where the code uses such a
derived structure as a TMWorkObject (since it actually is one), which
looks quite like OOP.
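
To illustrate the idea, here is a minimal sketch of that struct-embedding
pattern.  The Demo* names are made up for this example and are not TM's
actual definitions; the only point is that a pointer to the derived struct
can be used as a pointer to its first member.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical base "class" (NOT the real TMWorkObject layout). */
typedef struct
{
    int type;
    const char *name;
} DemoBase;

/* Hypothetical derived "class": the base must be the FIRST member. */
typedef struct
{
    DemoBase base;
    int n_children;
} DemoDerived;

/* Works on any object whose first member is a DemoBase. */
static const char *demo_base_get_name(const DemoBase *obj)
{
    assert(obj != NULL);
    return obj->name;
}

/* "Upcast": C guarantees that a pointer to a struct, suitably converted,
 * points to its first member, so this cast is well defined. */
static const char *demo_derived_get_name(const DemoDerived *derived)
{
    return demo_base_get_name((const DemoBase *) derived);
}
```

The cast only works because the base is the first field; moving it
elsewhere in the struct would break every such use.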

Or see tm_workspace.c:44:tm_create_workspace():  it uses
tm_work_object_register() to register itself as a new type of work
object with a few methods (or vfuncs), and then initializes itself with
tm_work_object_init(), etc.
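
Roughly, and only as a sketch of the general register-then-init pattern
(the demo_* functions below are hypothetical and do not mirror the real
signatures of tm_work_object_register() or tm_work_object_init()), it
goes like this:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TYPES 8

typedef struct DemoObj DemoObj;
typedef void (*DemoFreeFunc)(DemoObj *obj);
typedef int (*DemoUpdateFunc)(DemoObj *obj);

/* Per-type table of "virtual functions". */
typedef struct
{
    DemoFreeFunc free_func;
    DemoUpdateFunc update_func;
} DemoClass;

/* Base object: only remembers which registered type it is. */
struct DemoObj
{
    int type;
};

static DemoClass demo_classes[DEMO_MAX_TYPES];
static int demo_n_classes = 0;

/* Registers a new type; returns its id, or -1 if the table is full. */
static int demo_register(DemoFreeFunc free_func, DemoUpdateFunc update_func)
{
    if (demo_n_classes >= DEMO_MAX_TYPES)
        return -1;
    demo_classes[demo_n_classes].free_func = free_func;
    demo_classes[demo_n_classes].update_func = update_func;
    return demo_n_classes++;
}

/* Initializes the base part of an object with its type id. */
static void demo_obj_init(DemoObj *obj, int type)
{
    obj->type = type;
}

/* Dispatches to the vfunc of the object's type (0 if it has none). */
static int demo_obj_update(DemoObj *obj)
{
    assert(obj->type >= 0 && obj->type < demo_n_classes);
    DemoUpdateFunc func = demo_classes[obj->type].update_func;
    return func ? func(obj) : 0;
}

/* A derived type providing its own update method. */
typedef struct
{
    DemoObj base; /* must be the first member */
    int n_updates;
} DemoWorkspace;

static int demo_workspace_update(DemoObj *obj)
{
    DemoWorkspace *ws = (DemoWorkspace *) obj;
    return ++ws->n_updates;
}
```

Callers only ever see a DemoObj pointer and the dispatch function; which
update method runs depends on the type id stored at init time.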

I fully understand Lex's questions about how it actually works, since
it introduces a second style of OOP in C, less well known than GObject
(though of course also less complex).  BTW, maybe porting to GObject
could help?

>> - its not clear how it all goes together, the workspace contains
>> global tags and work_objects, or is that files and whats the
> 
> workspace work objects are document tags. global tags explained in
> geany's manual.
> 
>> difference between source_file and file_entry?
> 
> It doesn't look like tm_file_entry_ is really used.
> 
>>
>> - similarly whats the difference between symbol and tag?
> 
> tm_symbol_ doesn't seem to be used.
> 
>>
>> 2. Ability to expand tagmanager to handle names declared in lexical
>> scopes (not to be confused with struct/class scopes).  Here is the
>> example again with some numbers so I can refer to them
>>
>> { struct a o; struct a p;
>>      o./* struct a members */
>>     { struct b o;
>>       o./* struct b members */
>>       p./* struct a members */
>>     }
>>     o./* struct a members */
>>     p./* struct a members<1>  */
>>     { struct c o;
>>       o./* struct c members */
>>       p./* struct a members<2>  */
>>     }
>>     o./* struct a members */
>>     p./* struct a members */
>>   }
>>
>> a. yep, tries use more memory than an array, the usual speed/space,
>> pick one, tradeoff :)
>>
>> b. @Nick, when you say sort by scope then name, are you wanting to
>> have an entry in the table for each declaration of the name?
> 
> no
> 
>>
>> - If so this makes the array much bigger to search and your search
>> speed depends on size, and it doesn't get you anything, you can't
>> search by scope since you don't know if the name is declared in the
>> scope you are in or an outer scope compare p at <1> and <2>
>>
>> - having a single name array which then points to scope info for the
>> name is a viable approach (disclosure, thats how I'm doing the symbol
>> table for a language I'm developing) but the table being searched is
>> usually larger than if you have nested arrays.  Being smaller these
>> are faster to search if the search isn't O(1), hence the suggestion of
>> trie instead of bsearch.
> 
> the gain in simplicity makes a bigger array to search worth it.
> Remember, global tags aren't included in the workspace array of
> tagmanager, so we're not talking a big number of tags, and we have O(log
> n) searching.
> 
> 
>> 4. Ctags parsers
>>
>> Agree with Nick that the parsers are usable, but if we start modifying
>> them to handle local declarations then they will be totally
>> incompatible with the Ctags project so I guess it doesn't matter other
>> than for getting languages we don't currently parse.
> 
> ctags c.c already parses local tags
> 
>>
>> 5. Overloaded symbols
>>
>> Since Colombans patch, overloaded symbols are now stable for all
>> practical code (I think theoretically it could get confused if the
>> overloads are on the same line but thats unlikely enough to ignore for
>> human generated code)
> 
> If you're talking about master, I think I still experienced wrong
> parenting on reparsing when removing lines.
> 
>> 6. Moving functionality from symbols.c to tagmanager
>>
>> a. Since its the 100th anniversary of the Titanic sinking, I think
>> that "shuffling the deckchairs" is an apt analogy, the functionality
>> has to be somewhere, its only useful to move it if the destination
>> significantly reduces the effort required.
> 
> I don't think I suggested moving functionality. I wondered whether TM
> could help make symbols.c less complex. I would need to understand the
> complexity to know whether this is appropriate or not.

Well, what symbols.c tries to do when updating the symbols tree is (as
documented in the comment above update_tree_tags(), BTW):

1) update tags that still exist and remove obsolete ones
2) add the remaining tags (new ones) to the tree

The implementation is a (tiny) little (bit) more complex than that for
performance reasons: it uses two hash tables, and a hash table entry can
now hold more than one tag, which deals with exact duplicates.
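
As a very simplified sketch of those two steps (NOT the actual
update_tree_tags() code: it uses a flat array and a linear name lookup
where symbols.c uses a tree and hash tables, and all demo_* names are
invented for the example):

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_ITEMS 32

typedef struct
{
    const char *name;
    int line;
} DemoTag;

/* Stand-in for the symbols tree: just a flat list of displayed tags. */
typedef struct
{
    DemoTag items[DEMO_MAX_ITEMS];
    size_t len;
} DemoTree;

/* Returns the index of the first not-yet-matched tag named `name`, or -1. */
static int demo_find_unused(const DemoTag *tags, const int *used,
                            size_t n_tags, const char *name)
{
    for (size_t i = 0; i < n_tags; i++)
        if (!used[i] && strcmp(tags[i].name, name) == 0)
            return (int) i;
    return -1;
}

static void demo_update_tree(DemoTree *tree, const DemoTag *tags,
                             size_t n_tags)
{
    int used[DEMO_MAX_ITEMS] = { 0 };
    size_t kept = 0;

    assert(n_tags <= DEMO_MAX_ITEMS);

    /* Step 1: update entries whose tag still exists, drop the others. */
    for (size_t i = 0; i < tree->len; i++)
    {
        int j = demo_find_unused(tags, used, n_tags, tree->items[i].name);
        if (j >= 0)
        {
            tree->items[kept] = tree->items[i];
            tree->items[kept].line = tags[j].line; /* refresh the data */
            kept++;
            used[j] = 1;
        }
    }
    tree->len = kept;

    /* Step 2: append the remaining (new) tags. */
    for (size_t j = 0; j < n_tags && tree->len < DEMO_MAX_ITEMS; j++)
        if (!used[j])
            tree->items[tree->len++] = tags[j];
}
```

Marking each new tag as used once it is matched is what lets two tags with
the same name each keep their own entry, which is the flat-array
counterpart of a hash table entry holding more than one tag.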

Not sure how tm could help here, unless maybe it provides a tree rather
than a flat list.