[Geany-devel] Request: multithreaded tag generation?

Matthew Brush mbrush at xxxxx
Tue Nov 8 01:44:16 UTC 2011


On 11/07/2011 08:11 AM, Thomas Martitz wrote:
> On 07.11.2011 17:06, Colomban Wendling wrote:
>> Hi,
>>
>> On 07/11/2011 16:35, Harold Aling wrote:
>>> Dear Geany Devs,
>>>
>>> I recently switched from GeanyPRJ to Gproject. Since Gproject doesn't
>>> support multiple open projects I have to switch between projects, but
>>> it takes up to 4 minutes to close one project and open another. A
>>> project consists of roughly 1000-2000 php-related files.
>>>
>>> The "Generate tags for all project files" causes this massive delay,
>>> but I really need that feature.
>>>
>>> At work I have a 2-core CPU, where 1 is completely idle and on my
>>> desktop at home there are 5 cores doing nothing while generating
>>> tags. Can't they be utilized to speed up the tag generation?
>> TL;DR: it's really not that easy.
>>
>
>
> Might or might not be related. It's rather annoying how long geany takes
> to load files. Fine for single files, but a session of (say) 50 files
> takes a while. And the situation on my end hasn't improved by buying an
> SSD, so I don't think it's I/O related; I suspect tagmanager.
>

I suspect it's that TagManager, for every single tag, is inserting the 
tag into the tags array, removing duplicates, and then re-sorting the 
entire array.
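
To illustrate the pattern I mean, here is a rough Python sketch. It is not 
the actual TM code (which is C), and both function names are made up for 
illustration. The first function dedupes and re-sorts the whole array once 
per tag; the second collects everything and does that work once.

```python
def add_tags_one_by_one(tags):
    """Hypothetical model of the suspected behaviour: after every single
    insertion, remove duplicates and re-sort the entire array."""
    array = []
    for tag in tags:
        array.append(tag)
        array = sorted(set(array))  # full dedupe + sort for each tag
    return array


def add_tags_batched(tags):
    """Alternative: collect all tags first, then dedupe and sort once."""
    return sorted(set(tags))
```

Both give the same final array; the difference is only in how many times 
the sort/dedupe pass runs while loading a big project.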

The actual code/algos in use in TM are quite optimized, but I think the 
whole approach is flawed.

The best way IMO would be to use a lightweight DB like SQLite, where you 
can slam a bunch of data into it while it's in memory and then deal with 
sorting/searching during the queries later (or rather let the DB engine 
deal with them).  IIUC, this would make threading much easier too, 
allowing one (or more) threads to be parsing and dumping tag info into 
the DB, while the UI is still running fine and seeing the new tags as 
they get inserted.  You could also flush the DB to disk and use it as an 
index, so next run, much less work needs to be done over.
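
A minimal sketch of what I have in mind, using Python's built-in sqlite3 
module just to keep it short (Geany itself would use the C API). The table 
layout, column names and function names are all assumptions for 
illustration, not anything that exists in Geany or TM. A single connection 
is shared between threads and guarded by a lock, which is the simplest 
safe arrangement, not necessarily the fastest.

```python
import sqlite3
import threading


def open_tag_db():
    # In-memory database. check_same_thread=False lets a worker thread
    # reuse this connection; callers serialize access with a lock.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute(
        "CREATE TABLE tags ("
        " name TEXT NOT NULL, file TEXT NOT NULL, line INTEGER NOT NULL,"
        " UNIQUE (name, file, line))"
    )
    return db


def insert_tags(db, lock, rows):
    # Bulk insert; the UNIQUE constraint plus OR IGNORE drops duplicates,
    # so no manual dedupe or re-sort is needed here.
    with lock:
        with db:  # commits the transaction on success
            db.executemany(
                "INSERT OR IGNORE INTO tags (name, file, line)"
                " VALUES (?, ?, ?)",
                rows,
            )


def find_tags(db, lock, prefix):
    # Sorting happens at query time. Note that LIKE is case-insensitive
    # for ASCII by default and treats '_' and '%' in prefix as wildcards.
    with lock:
        cur = db.execute(
            "SELECT name, file, line FROM tags"
            " WHERE name LIKE ? ORDER BY name, file, line",
            (prefix + "%",),
        )
        return cur.fetchall()


def save_index(db, lock, path):
    # Flush the in-memory database to a file for reuse on the next run.
    with lock:
        disk = sqlite3.connect(path)
        try:
            db.backup(disk)  # Connection.backup needs Python 3.7+
        finally:
            disk.close()
```

A parser thread would call insert_tags() with each batch of parsed tags 
while the UI thread calls find_tags() whenever it needs symbols, and 
save_index() would run on project close.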

The chief benefit would of course be dropping all the TM code that no 
one really understands :)

Cheers,
Matthew Brush


