On 29/04/2012 14:07, Lex Trotman wrote:
On 29 April 2012 22:07, Nick Treleavennick.treleaven@btinternet.com wrote:
On 27/04/2012 06:30, Lex Trotman wrote: I don't see this myself, I see some complicating issues which could be resolved (and I'm willing to work on them), but generally a sound design for what it provides and for extra things we may want to add.
Maybe my problem is simply the attempt to simulate object orientation and inheritance in C makes it look complicated for a spoilt C++ programmer :)
Eh? Tagmanager isn't OOP.
I expect the design is better in some respects (and to be fair I didn't look for better things), but finding a tag based on its name is something we are always going to want to be fast. Even for scope completion, you still need to lookup a tag structure from a name string. So I think we will always want a sorted array of tags per document that we can bsearch (or something equally fast).
Yes, we could have one name table and then prune the results to the scope required, or a name table per scope (which makes the tables smaller and simple searches faster).
It seems to me that if we supported proper scope completion then we could still have one array per document, sorted first by scope, then by name.
I don't know, but we still need fast tag lookup based on name. If O(n) scope lookup is too slow, we will need additional data structures arranged differently, but whatever we have should have something like O(log n) lookup for names as this is by far the most common operation.
Agree, name lookup should be as fast as possible, O(log n) what about O(1) like compilers :)
Well since we can't use hashes due to the need to identify prefixes, a radix trie seems the best since it can be as fast as a hash ie two passes through the key, and that gives a sub-trie that begins with the prefix. But as you say thats orthogonal to the question of scope layout.
I only know about a simple trie, but wouldn't that use a lot more memory?
I don't think we want name lookup 'as fast as possible', just no slower than we have already.
- tags (and most types) are reference-counted (so they aren't necessarily duplicated, e.g. in the multicache backend);
I don't really understand src/symbols.c since the real-time parsing change, so don't really understand why this is needed.
Blame C++ and overloaded names I think.
I looked at the thread about that, and from what I could tell, the problem was for reparsing unsaved files. Wasn't the order OK for files that have just been saved? (Also I don't follow what that has to do with reference counting).
Hmm, you're right, looks like Colomban is duplicating tags in nested caches, so all searches are just one cache, but in fact they are not duplicated, but reference the same tag using a reference count to tell when to destroy it. I am not sure where the reference to real time parsing comes from?
Sorry, I meant reparsing not real time updates. But is your problem with overloaded symbols OK when the file is parsed after saving, i.e. the order is only wrong on reparsing?