In-memory tagmanager parsing

Colomban Wendling

18 Apr 2010

3:48 p.m.

Hi,

I took a look at the tagmanager in an attempt to make it work in-memory, since it is not yet completely the case. I found why and how to fix the (C?) function arguments (attached patch), and it seems to work pretty well for C (and Python and PHP, though I did only a few tests with them).

For now the only defect I saw (with C) is with anonymous enumerations and the like (named anon_enum_NUM), for which NUM increases at each re-parse (I use document_update_tag_list(doc, TRUE) to re-parse); but I suppose it wouldn't be too difficult to fix, would it?

Then, I would like to know what else doesn't work, what needs to be fixed, etc., because I'd really love to have it working.

Enrico Tröger wrote: (on thread Function Definition)

...

Some time ago, I started working on this but it never really worked and additionally, it could work currently only for a few parsers (some of those are C, Fortran, SQL IIRC). To get it working reliably, some more work is needed and we would have to adjust *all* existing parsers which is by no means an easy task.

Do you know which parsers would *not* work? And hum, if each and every parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)
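To make the I/O layer idea a bit more concrete, here is a minimal sketch in C of what "emulating file I/O on a buffer" could look like. All names (Reader, reader_from_file, reader_from_buffer, reader_getc, reader_ungetc) are hypothetical; they exist neither in tagmanager nor in CTags, and this is not the small library mentioned above. The assumption is that a parser only needs getc()/ungetc()-style access to its input.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reader that hides whether the data comes from a real file
 * or from an in-memory buffer. Not part of tagmanager or CTags. */
typedef struct {
    FILE       *fp;   /* non-NULL when reading from a real file */
    const char *buf;  /* non-NULL when reading from memory */
    size_t      len;  /* size of buf in bytes */
    size_t      pos;  /* current read offset into buf */
} Reader;

Reader reader_from_file(FILE *fp)
{
    Reader r = { fp, NULL, 0, 0 };
    assert(fp != NULL);
    return r;
}

Reader reader_from_buffer(const char *buf, size_t len)
{
    Reader r = { NULL, buf, len, 0 };
    assert(buf != NULL);
    return r;
}

/* Counterpart of getc(): the next byte as an unsigned char, or EOF. */
int reader_getc(Reader *r)
{
    if (r->fp != NULL)
        return getc(r->fp);
    if (r->pos >= r->len)
        return EOF;
    return (unsigned char) r->buf[r->pos++];
}

/* Counterpart of ungetc(): step back over the byte that was just read.
 * For a buffer this only moves the offset; the data is never modified. */
int reader_ungetc(Reader *r, int c)
{
    if (c == EOF)
        return EOF;
    if (r->fp != NULL)
        return ungetc(c, r->fp);
    if (r->pos == 0)
        return EOF;
    r->pos--;
    return c;
}
```

With something like this, a parser would call reader_getc() instead of getc() and would not need to know which kind of source it is reading from.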

Regards, Colomban

Attachments:

0001-memory-tagmanager-line-start-fix.diff (text/x-patch — 1.7 KB)

Enrico Tröger

19 Apr

12:34 a.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Sun, 18 Apr 2010 15:48:14 +0200, Colomban wrote:

...

Hi,

I took a look at the tagmanager in an attempt to make it work in-memory, since it is not yet completely the case. I found why and how to fix the (C?) function arguments (attached patch), and it seems to work pretty well for C (and Python and PHP, though I did only a few tests with them).

What exactly do you mean? Do you refer to the broken function signature parsing when the in-memory parsing is enabled? But how is this related to Python and PHP parsers as those parsers work completely differently?

...

For now the only defect I saw (with C) is with anonymous enumerations and the like (named anon_enum_NUM), for which NUM increases at each re-parse (I use document_update_tag_list(doc, TRUE) to re-parse); but I suppose it wouldn't be too difficult to fix, would it?

No idea. I don't know the C parser that well (tagmanager/c.c). I always tried to touch it as little as possible, as it is very complex and very easy to break :(.

...

Then, I would like to know what else doesn't work, what needs to be fixed, etc., because I'd really love to have it working.

The parsers...

...

Enrico Tröger wrote: (on thread Function Definition)

...
Some time ago, I started working on this but it never really worked and additionally, it could work currently only for a few parsers (some of those are C, Fortran, SQL IIRC). To get it working reliably, some more work is needed and we would have to adjust *all* existing parsers which is by no means an easy task.

Do you know which parsers would *not* work? And hum, if each and every

All except C, Fortran, SQL and JavaScript. I searched my personal mail archives, the Geany and Geany-devel mailing list archives and everything else, but I didn't find any correspondence where I mentioned the experimental code I committed :(. Though I was very sure (until I didn't find any related mail) that I talked to Nick about this, either via personal mail or via a mailing list. Either my memory is playing tricks on me or I lost those mails.

Nick, do you remember any conversation with me about this, and do you maybe still have the relevant mails?

Anyway, the best documentation is the source, haha. The whole thing was added in SVN r3184 (http://geany.svn.sf.net/viewvc/geany?view=rev&revision=3184).

I don't remember all the details (sigh, this was a long time ago :( ), but I think the main problem was that most parsers read the data from the file so that it could not be easily converted to read from a buffer. Only the few above-mentioned parsers did it so that we could change the underlying data to be a buffer instead of a real file.

...

parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)

Yeah, that would be a clean and proper solution and would probably solve a lot of problems. But before doing this, we need to decide whether we want to stay as compatible as possible with the CTags project (which makes very slow progress and is almost dead), as we have done so far, or whether we want to spend time on modernising the parsers and adjusting them to work more the way we need (not sure how many differences there would be at all, though).

Once this decision is made, we can think about your question above about an I/O abstraction layer.

After all, yay for bringing this topic up again. I guess when we really get it working, it will be a great thing for users, i.e. us and many more :D.

Regards, Enrico

-- Get my GPG key from http://www.uvena.de/pub.asc

Colomban Wendling

1:26 a.m.

New subject: [Geany-devel] In-memory tagmanager parsing

Enrico Tröger a écrit :

...

On Sun, 18 Apr 2010 15:48:14 +0200, Colomban wrote:

...
Hi,

I took a look at the tagmanager in an attempt to make it work in-memory, since it is not yet completely the case. I found why and how to fix the (C?) function arguments (attached patch), and it seems to work pretty well for C (and Python and PHP, though I did only a few tests with them).

What exactly do you mean?

Oh, yep, I see my sentence was quite confusing... sorry.

...

Do you refer to the broken function signature parsing when the in-memory parsing is enabled? But how is this related to Python and PHP parsers as those parsers work completely differently?

Yes, I refer to the function signature parsing, which affects the C parser. I spoke of Python and PHP only because I enabled the #if 0'ed update_tags_from_buffer() code and they seemed to work quite well with it. But it seems from what you say below that it was a very hasty and wrong assumption that this means they actually use in-memory parsing...

...

...
For now the only defect I saw (with C) is with anonymous enumerations and the like (named anon_enum_NUM), for which NUM increases at each re-parse (I use document_update_tag_list(doc, TRUE) to re-parse); but I suppose it wouldn't be too difficult to fix, would it?

No idea. I don't know the C parser that well (tagmanager/c.c). I always tried to touch it as little as possible, as it is very complex and very easy to break :(.

Yep, true. Anyway, hum, what can one say about code that uses longjmp()? :-' Not to blame anyone, I never felt it. And for my original question, I think I'll just... see?

...

...
Then, I would like to know what else doesn't work, what needs to be fixed, etc., because I'd really love to have it working.

The parsers...

Haha! I must admit I hoped for a more precise answer :D

...

...
Enrico Tröger wrote: (on thread Function Definition)

...
Some time ago, I started working on this but it never really worked and additionally, it could work currently only for a few parsers (some of those are C, Fortran, SQL IIRC). To get it working reliably, some more work is needed and we would have to adjust *all* existing parsers which is by no means an easy task.

Do you know which parsers would *not* work? And hum, if each and every

All except C, Fortran, SQL and JavaScript. I searched my personal mail archives, the Geany and Geany-devel mailing list archives and everything else, but I didn't find any correspondence where I mentioned the experimental code I committed :(. Though I was very sure (until I didn't find any related mail) that I talked to Nick about this, either via personal mail or via a mailing list. Either my memory is playing tricks on me or I lost those mails.

Nick, do you remember any conversation with me about this, and do you maybe still have the relevant mails?

Anyway, the best documentation is the source, haha. The whole thing was added in SVN r3184 (http://geany.svn.sf.net/viewvc/geany?view=rev&revision=3184).

Ow. Well, indigestible but still instructive, thanks. But I'll wait hoping for Nick's remembrance, still.

...

I don't remember all details (sigh, this is long time ago :( ), but I think the main problem was that most parsers read the data from the file so that it could not be easily converted to read from a buffer. Only the few above mentioned parsers did it so that we could change the underlying data to be a buffer instead of a real file.

Hum, I don't fully understand. By "it" in the last sentence you mean read.[ch]?

...

...
parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)

Yeah, that would be a clean and proper solution and would probably solve a lot of problems. But before doing this, we need to decide whether we want to stay as compatible as possible with the CTags project(which makes very slow progress, almost dead) as we did before or whether we would spend time on modernising the parsers and adjust them to work more like we need it (not sure how many differences there would be at all though).

Once this decision is made, we can think about your question above about an I/O abstracting layer.

Yeah. I'd say that if it is that dead, waiting for updates and fixes from it is hope bordering on blindness. But OTOH I completely understand that the mere idea of becoming its maintainer might be quite... scary.

...

I guess when we really get it working, it will be a great thing for users, i.e. us and many more :D.

Definitely!

Regards, Colomban

Enrico Tröger

7:51 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Mon, 19 Apr 2010 01:26:36 +0200, Colomban wrote:

Hi,

...

...
...
Then, I would like to know what else doesn't work, what needs to be fixed, etc., because I'd really love to have it working.

The parsers...

Haha! I must admit I hoped for a more precise answer :D

I was afraid of that... :). For details, see below.

...

...
I don't remember all details (sigh, this is long time ago :( ), but I think the main problem was that most parsers read the data from the file so that it could not be easily converted to read from a buffer. Only the few above mentioned parsers did it so that we could change the underlying data to be a buffer instead of a real file.

Hum, I don't fully understand. By "it" in the last sentence you mean read.[ch]?

Yes, "it" referred to the way the parsers read data from the source files.

...

...
...
parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)

Yeah, that would be a clean and proper solution and would probably solve a lot of problems. But before doing this, we need to decide whether we want to stay as compatible as possible with the CTags project(which makes very slow progress, almost dead) as we did before or whether we would spend time on modernising the parsers and adjust them to work more like we need it (not sure how many differences there would be at all though).

Once this decision is made, we can think about your question above about an I/O abstracting layer.

Yeah. I'd say that if it is that dead, waiting for updates and fixes from it is hope bordering on blindness. But OTOH I completely understand that the mere idea of becoming its maintainer might be quite... scary.

That's the question for now, I guess. If we decide not to try to stay compatible with CTags, we could maybe adjust the parsers more easily to fit our needs, especially to read data from a buffer, and we could easily use GLib functions in the parsers, which could make the code a bit simpler, among other things. From what I noticed (mainly reading the svn log of the CTags repository), it sees a few commits every few weeks or months, mostly fixes but no real progress. I think we could go our own way and push our tagmanager copy in our direction, but OTOH it might not even be worth it. Not sure.

What about the others, any opinions?

Regards, Enrico

-- Get my GPG key from http://www.uvena.de/pub.asc

Nick Treleaven

20 Apr

3:56 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Mon, 19 Apr 2010 19:51:04 +0200 Enrico Tröger enrico.troeger@uvena.de wrote:

...

...
...
...
parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)

Yeah, that would be a clean and proper solution and would probably solve a lot of problems. But before doing this, we need to decide whether we want to stay as compatible as possible with the CTags project(which makes very slow progress, almost dead) as we did before or whether we would spend time on modernising the parsers and adjust them to work more like we need it (not sure how many differences there would be at all though).

Once this decision is made, we can think about your question above about an I/O abstracting layer.

Yeah. I'd say that if it is that dead, waiting for updates and fixes from it is hope bordering on blindness. But OTOH I completely understand that the mere idea of becoming its maintainer might be quite... scary.

That's the question for now, I guess. If we decide not to try to stay compatible with CTags, we could maybe adjust the parsers more easily to fit our needs, especially to read data from a buffer, and we could easily use GLib functions in the parsers, which could make the code a bit simpler, among other things. From what I noticed (mainly reading the svn log of the CTags repository), it sees a few commits every few weeks or months, mostly fixes but no real progress. I think we could go our own way and push our tagmanager copy in our direction, but OTOH it might not even be worth it. Not sure.

What about the others, any opinions?

I think we should try to stay fairly compatible with CTags as other projects use it also and may make improvements to their copies.

But IMO it's OK to change the I/O functions.

Regards, Nick

jordan

4:10 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On 04/20/2010 09:56 AM, Nick Treleaven wrote:

...

On Mon, 19 Apr 2010 19:51:04 +0200 Enrico Tröger enrico.troeger@uvena.de wrote:

...
...
...
...
parser must care about buffer VS file, wouldn't it be good to abstract this a little more? (with e.g. a little I/O layer - I already started a small library to check if it would be easy to emulate file I/O on buffer, and it seems not to be too hard)

Yeah, that would be a clean and proper solution and would probably solve a lot of problems. But before doing this, we need to decide whether we want to stay as compatible as possible with the CTags project(which makes very slow progress, almost dead) as we did before or whether we would spend time on modernising the parsers and adjust them to work more like we need it (not sure how many differences there would be at all though).

Once this decision is made, we can think about your question above about an I/O abstracting layer.

Yeah. I'd say that if it is that dead, waiting for updates and fixes from it is hope bordering on blindness. But OTOH I completely understand that the mere idea of becoming its maintainer might be quite... scary.

That's the question for now, I guess. If we decide not to try to stay compatible with CTags, we could maybe adjust the parsers more easily to fit our needs, especially to read data from a buffer, and we could easily use GLib functions in the parsers, which could make the code a bit simpler, among other things. From what I noticed (mainly reading the svn log of the CTags repository), it sees a few commits every few weeks or months, mostly fixes but no real progress. I think we could go our own way and push our tagmanager copy in our direction, but OTOH it might not even be worth it. Not sure.

What about the others, any opinions?

I think we should try to stay fairly compatible with CTags as other projects use it also and may make improvements to their copies.

But IMO it's OK to change the I/O functions.

Regards, Nick

How does Monodevelop handle its tag manager, given that, unlike in Anjuta and Geany, its tags always point to the correct line? -Jordan

Nick Treleaven

4:20 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Tue, 20 Apr 2010 10:10:33 -0400 jordan phosphor@primus.ca wrote:

...

How does Monodevelop handle its tag manager, given that, unlike in Anjuta and Geany, its tags always point to the correct line?

IDEs reparse the tags from memory in idle time.

Regards, Nick

Colomban Wendling

4:45 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

Nick Treleaven a écrit :

...

On Tue, 20 Apr 2010 10:10:33 -0400 jordan phosphor@primus.ca wrote:

...
How does Monodevelop handle its tag manager, given that, unlike in Anjuta and Geany, its tags always point to the correct line?

IDEs reparse the tags from memory in idle time.

Yeah, but it means that they can. Then the question is: what does Monodevelop use to do so? An enhanced version of CTags, something completely different, or what? (If nobody knows, I'll perhaps take a look at the code later.)

jordan

5:48 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On 04/20/2010 10:45 AM, Colomban Wendling wrote:

...

Nick Treleaven a écrit :

...
On Tue, 20 Apr 2010 10:10:33 -0400 jordan phosphor@primus.ca wrote:

...
How does Monodevelop handle its tag manager, given that, unlike in Anjuta and Geany, its tags always point to the correct line?

IDEs reparse the tags from memory in idle time.

Yeah, but it means that they can. Then the question is: what does Monodevelop use to do so? An enhanced version of CTags, something completely different, or what? (If nobody knows, I'll perhaps take a look at the code later.)

I had a quick look, and it looks to me like Monodevelop may use a temporary file holding the current buffer. It does use CTags, though the code is for the most part sparsely commented. It looks as if it creates a new thread, runs ctags against the temporary file containing the buffer, and then parses the output.

It looks like the relevant files are in monodevelop-2.2.2/src/addins/CBinding/Parser.
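The "temporary file with the current buffer" step can be sketched in plain C as follows. This is only an illustration of the general technique, not Monodevelop's code: buffer_to_tmpfile is a hypothetical helper, and it hands back a FILE* via the standard tmpfile() rather than a named file that an external ctags process could open.

```c
#include <assert.h>
#include <stdio.h>

/* Copy an editor buffer into an anonymous temporary file and rewind it,
 * so that code expecting a FILE* can read the unsaved contents.
 * Returns NULL on failure. The file is removed automatically when it is
 * closed with fclose(). */
FILE *buffer_to_tmpfile(const char *buf, size_t len)
{
    FILE *fp;

    assert(buf != NULL);
    fp = tmpfile();
    if (fp == NULL)
        return NULL;
    if (fwrite(buf, 1, len, fp) != len) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);  /* start reading from the beginning */
    return fp;
}
```

The cost of this approach is one extra copy of the buffer through the filesystem on every re-parse, which is what reading directly from memory would avoid.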

-Jordan

Enrico Tröger

25 Apr

11:36 a.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Tue, 20 Apr 2010 11:48:30 -0400, jordan wrote:

...

On 04/20/2010 10:45 AM, Colomban Wendling wrote:

...
Nick Treleaven a écrit :

...
On Tue, 20 Apr 2010 10:10:33 -0400 jordan phosphor@primus.ca wrote:

...
How does Monodevelop handle its tag manager, given that, unlike in Anjuta and Geany, its tags always point to the correct line?

IDEs reparse the tags from memory in idle time.

Yeah, but it means that they can. Then the question is: what does Monodevelop use to do so? An enhanced version of CTags, something completely different, or what? (If nobody knows, I'll perhaps take a look at the code later.)

I had a quick look, and it looks to me like Monodevelop may use a temporary file holding the current buffer. It does use CTags, though the code is for the most part sparsely commented. It looks as if it creates a new thread, runs ctags against the temporary file containing the buffer, and then parses the output.

This is similar to what we do: we tell the tagmanager (which is more or less an extension of CTags) to parse the file. The main differences are:
- we parse the source file itself, not a temporary file
- we don't do it in a separate thread

I'm not sure we can easily change the behaviour to use a temp file. Technically it is probably possible, but it might have side effects. This needs to be examined. About threads: we could probably do this, but I doubt it would gain much performance, because the parsing itself is probably way faster than the GUI part. I.e. we first parse the file, then retrieve the list of symbols from the tagmanager and fill the Symbol List with the results. Filling the Symbol List can't be done in a thread because GTK doesn't allow this. So we could only do the real parsing in a thread, not the GUI stuff, which probably takes way more time.

Regards, Enrico

-- Get my GPG key from http://www.uvena.de/pub.asc

Thomas Martitz

20 Apr

4:53 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

Am 20.04.2010 15:56, schrieb Nick Treleaven:

...

I think we should try to stay fairly compatible with CTags as other projects use it also and may make improvements to their copies.

But IMO it's OK to change the I/O functions.

Regards, Nick

Haven't Geany tags (the ones that are saved on disk) already been incompatible with ctags for a long time? At least the manual says so. Or is it internally still compatible with ctags?

I think trying to keep things sync'd with an inactive project (which it seems to be as mentioned in a previous mail) is the right way if you want to keep actual progress out.

Best regards.

Nick Treleaven

6:41 p.m.

New subject: [Geany-devel] ctags - Re: In-memory tagmanager parsing

On Tue, 20 Apr 2010 16:53:23 +0200 Thomas Martitz thomas.martitz@student.HTW-Berlin.de wrote:

...

Am 20.04.2010 15:56, schrieb Nick Treleaven:

...
I think we should try to stay fairly compatible with CTags as other projects use it also and may make improvements to their copies.

But IMO it's OK to change the I/O functions.

Regards, Nick

Haven't Geany tags (the ones that are saved on disk) already been incompatible with ctags for a long time? At least the manual says so. Or is it internally still compatible with ctags?

That's the global tag file format, not the source files.

...

I think trying to keep things sync'd with an inactive project (which it seems to be as mentioned in a previous mail) is the right way if you want to keep actual progress out.

I didn't say we shouldn't add features. I've added many myself. What I meant was not to start using GLib functions or making organisational changes unless there's a significant benefit.

As I already said, CTags is used in many projects which *are* actively developed. We may be able to merge changes from these.

Regards, Nick

Enrico Tröger

25 Apr

11:39 a.m.

New subject: [Geany-devel] ctags - Re: In-memory tagmanager parsing

On Tue, 20 Apr 2010 17:41:59 +0100, Nick wrote:

...

On Tue, 20 Apr 2010 16:53:23 +0200 Thomas Martitz thomas.martitz@student.HTW-Berlin.de wrote:

...
Am 20.04.2010 15:56, schrieb Nick Treleaven:

...
I think we should try to stay fairly compatible with CTags as other projects use it also and may make improvements to their copies.

But IMO it's OK to change the I/O functions.

Regards, Nick

Haven't Geany tags (the ones that are saved on disk) already been incompatible with ctags for a long time? At least the manual says so. Or is it internally still compatible with ctags?

That's the global tag file format, not the source files.

...
I think trying to keep things sync'd with an inactive project (which it seems to be as mentioned in a previous mail) is the right way if you want to keep actual progress out.

I didn't say we shouldn't add features. I've added many myself. What I meant was not to start using GLib functions or making organisational changes unless there's a significant benefit.

As I already said, CTags is used in many projects which *are* actively developed. We may be able to merge changes from these.

Sounds ok. Maybe we can get some kind of in-memory parsing by just modifying/extending the I/O layer and trying to keep changes to the actual parsers minimal.

Regards, Enrico

-- Get my GPG key from http://www.uvena.de/pub.asc

Nick Treleaven

19 Apr

5:34 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Mon, 19 Apr 2010 00:34:42 +0200 Enrico Tröger enrico.troeger@uvena.de wrote:

...

...
For now the only defect I saw (with C) is with anonymous enumerations and the like (named anon_enum_NUM), for which NUM increases at each re-parse (I use document_update_tag_list(doc, TRUE) to re-parse); but I suppose it wouldn't be too difficult to fix, would it?

I don't think this matters, it already happens after saving as well.

...

...
Enrico Tröger wrote: (on thread Function Definition)

...
Some time ago, I started working on this but it never really worked and additionally, it could work currently only for a few parsers (some of those are C, Fortran, SQL IIRC). To get it working reliably, some more work is needed and we would have to adjust *all* existing parsers which is by no means an easy task.

Do you know which parsers would *not* work? And hum, if each and every

All except C, Fortran, SQL and JavaScript. I searched my personal mail archives, the Geany and Geany-devel mailing list archives and everything else, but I didn't find any correspondence where I mentioned the experimental code I committed :(. Though I was very sure (until I didn't find any related mail) that I talked to Nick about this, either via personal mail or via a mailing list. Either my memory is playing tricks on me or I lost those mails.

Nick, do you remember any conversation with me about this, and do you maybe still have the relevant mails?

I remember we discussed this but no details - I can't find the email(s).

Regards, Nick

Enrico Tröger

7:44 p.m.

New subject: [Geany-devel] In-memory tagmanager parsing

On Mon, 19 Apr 2010 16:34:12 +0100, Nick wrote:

...

On Mon, 19 Apr 2010 00:34:42 +0200 Enrico Tröger enrico.troeger@uvena.de wrote:

...
...
For now the only defect I saw (with C) is with anonymous enumerations and the like (named anon_enum_NUM), for which NUM increases at each re-parse (I use document_update_tag_list(doc, TRUE) to re-parse); but I suppose it wouldn't be too difficult to fix, would it?

I don't think this matters, it already happens after saving as well.

Indeed. So this is an issue which can be fixed separately from the in-memory parsing discussion. Though I didn't notice it in the last 4 years, and probably neither did most other users. Simply not that critical, IMO.
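For what it's worth, one plausible cause of the growing NUM is a counter that lives across parses and is never reset. That is an assumption and has not been checked against tagmanager/c.c. Under that assumption, a fix could look like the sketch below, where anon_counter, parse_begin and next_anon_enum_name are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the parser's anonymous-name counter. */
static int anon_counter = 0;

/* Call at the start of every parse so that numbering restarts at 0
 * instead of continuing from the previous parse. */
void parse_begin(void)
{
    anon_counter = 0;
}

/* Write the next generated name, e.g. "anon_enum_0", into out. */
void next_anon_enum_name(char *out, size_t size)
{
    assert(out != NULL && size > 0);
    snprintf(out, size, "anon_enum_%d", anon_counter++);
}
```

Without the reset in parse_begin(), the second parse of the same file would continue numbering where the first one stopped, which matches the behaviour described above.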

...

...
...
Enrico Tröger wrote: (on thread Function Definition)

...
Some time ago, I started working on this but it never really worked and additionally, it could work currently only for a few parsers (some of those are C, Fortran, SQL IIRC). To get it working reliably, some more work is needed and we would have to adjust *all* existing parsers which is by no means an easy task.

Do you know which parsers would *not* work? And hum, if each and every

All except C, Fortran, SQL and JavaScript. I searched my personal mail archives, the Geany and Geany-devel mailing list archives and everything else, but I didn't find any correspondence where I mentioned the experimental code I committed :(. Though I was very sure (until I didn't find any related mail) that I talked to Nick about this, either via personal mail or via a mailing list. Either my memory is playing tricks on me or I lost those mails.

Nick, do you remember any conversation with me about this, and do you maybe still have the relevant mails?

I remember we discussed this but no details - I can't find the email(s).

Weird. Anyway, thanks for looking.

Regards, Enrico

-- Get my GPG key from http://www.uvena.de/pub.asc

devel@lists.geany.org

participants (5)

Colomban Wendling
Enrico Tröger
jordan
Nick Treleaven
Thomas Martitz