Re: [Geany-devel] tagmanager changes

26 Apr 2012

      On 24/04/2012 22:31, Colomban Wendling wrote:
...
Le 17/04/2012 18:20, Nick Treleaven a écrit :
Sorry for the long delays -- and also small activity -- recently.  I
have/had a lot of non-Geany stuff to do and stuff, the whole story, you
know.
No problem.
...
I finally committed it and pushed it so you can see it, comment, blame,
flame&  more:  see https://github.com/b4n/geany/tree/wip%2Fctagsmanager
A few points, as they comes to my mind:

it support asynchronous parsing (though not concurrent parsing);

What's the difference? Also, what does it buy us?
...

it is under the new ctagsmanager/ directory;
it uses the same tag parsers tagmanager used, in ctagsmanager/ctags;
all types have different names than the tagmanager ones, though
 currently the API is almost an 1:1 mapping -- and that's maybe a
 huge mistake?;

I don't understand why tagmanager has to be replaced, why not just 
replace the parts you want to improve? Rewriting it is likely to lead to 
a new set of bugs and be hard to review and merge changes from master.
...

there is 2 backends as of today:
a "simple" one that is simple and that doesn't waste (too much)
memory, but that searches in O(n);

Linear searching seems like a bad idea when we currently have O(log n). 
Removing x tags is O(n * x) by the look of it.
...

a "multi-cache" one that, as its name suggests, maintains multiple
caches (sorted tags arrays).  it uses a little more memory and is
slower on insertion since it maintains several sorted lists, but a
search on an already existing cache is very fast.

Won't this be slow for adding many tags at once? How is this design 
better than tagmanager?
Given my global tags merge change (see below), I think tagmanager needs 
some changes to avoid sorting the workspace tags array each time tags 
are updated, but this is certainly doable.
...
In practice I haven't yet tested anything big enough to see any
   difference in performances between these two backends, and that's
   something that probably isn't worth bothering about for now.

this "backend" abstraction might be really overkill, and maybe we
 could do better without it?;

I don't see why having two is better. The memory overhead for a pointer 
array is not much vs. the size of the tag structures. Fast searching is 
important.
...

tags (and most types) are reference-counted (so they aren't
 necessarily duplicated, e.g. in the multicache backend);

I don't really understand src/symbols.c since the real-time parsing 
change, so don't really understand why this is needed.
...

tag matching/finding is done using ctm_data_backend_find() (or a
 convenience wrapper), which takes 2 functions for performing the
 search:
a sort function, used to, heh, sort the tags to search and/or the
resulting list (the "simple" backend should use it to sort the
result; and the "multi-cache" backend uses it to sort the caches)

It doesn't sound a good idea to have to sort the haystack on each 
lookup. Lookup should be very fast, like tagmanager. Adding/removing 
tags is always less important because real-time updates can be disabled 
anyway.
...

a match function, use to check whether a tag should be included in
the results.  Like the sort function it returns a strcmp()-like
value, with the only difference that it probably returns 0 for more
tags.

It is somewhat similar to what tagmanager did, but it's more flexible
   -- and maybe complex, though many sort/match functions would already
   be provided.

no pruning is done yet, so there is duplicate in the results;
there is an (almost) working scope completeion implementation;
... I don't see anything to add, so I'll stop here :)

All this isn't of course written in stone: if we already redo a lot of
things, let's get something nice -- though IMO we'll always be better
than with tagmanager, since each time I wanted to touch it it took me
hours, and sometimes I even haven't been able to fix the problem
(thinking of e.g. scope completion...).
I don't really see what the problem understanding it is. I thought scope 
completion was just tm_workspace_find_scoped and related functions, not 
some tagmanager-wide problem.
...
...
Also, later in the thread he says that performance problems with
resorting global and workspace tags cannot be fixed with the design of
tagmanager. I've been working yesterday on improving this significantly
by merging the new tags each time instead of resorting *all* the tags. I
hope to commit this in the next few days.
Cool!  I haven't done any profiling on either tagmanager or may new
attempt, so I can't tell what's actually slow, but any improvement is
good to have :)
I finished the global tags merging change:
https://github.com/geany/geany/commit/712cdd6aa0005f65bd77b15c34cd2906786462...
IMO this should make loading global tags fast enough, even for heavy 
use, as the whole array is now only traversed once.
For avoiding resorting of workspace tags when only reparsing one source 
object, we can remove the source object's old tags & merge the new tags. 
This is all O(n) for the workspace array. I haven't looked into 
implementing this yet because now it's clear you're working on a 
competing change.
The other changes I've made were in response to a user's problem with 
generating C tag files, and some changes to the CTags files.
Regards,
Nick

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

Re: [Geany-devel] tagmanager changes