(Please let me know if I have asked the same question before.)
In Universal Ctags, people can write a parser either in C or in the .ctags language, which I named optlib.
Code written in optlib is translated into C with regcomp/regexec calls while the ctags executable is built. Optlib is extremely convenient for writing an indexer/parser for a small language.
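To illustrate what a regcomp/regexec-based match looks like, here is a minimal sketch in plain POSIX C. It is not the code that optlib generates: the helper name `match_forth_word` and the pattern are assumptions made up for this example, and a real parser would compile its patterns once rather than on every line.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ctags): detect a Forth colon
 * definition such as ": square dup * ;" at the start of a line and
 * copy the defined word's name into name_out.
 * Returns 1 on a match, 0 otherwise.
 */
int match_forth_word(const char *line, char *name_out, size_t out_size)
{
    regex_t re;
    regmatch_t m[2];
    int found = 0;

    assert(line != NULL && name_out != NULL && out_size > 0);

    /* Illustrative pattern only; the real Forth parser may differ.
     * Compiled per call for brevity; compile once in real code. */
    if (regcomp(&re, "^:[ \t]+([^ \t]+)", REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, line, 2, m, 0) == 0) {
        /* m[1] holds the byte offsets of the first capture group. */
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len >= out_size)
            len = out_size - 1;   /* truncate to fit the buffer */
        memcpy(name_out, line + m[1].rm_so, len);
        name_out[len] = '\0';
        found = 1;
    }

    regfree(&re);
    return found;
}
```

Calling `match_forth_word(": square dup * ;", buf, sizeof buf)` would place the word name in `buf`; a line that does not start with a colon definition leaves `buf` untouched and returns 0.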
In Universal Ctags, I received a pull request for a Forth language parser from @farvardin.
Initially, it was written in C. However, the parser was so simple that we rewrote it in optlib. After rewriting, we polished the pull request. It is mostly ready to merge (https://github.com/universal-ctags/ctags/pull/3812).
However, there is one issue. @farvardin wants to use the parser in Geany, but I heard from @farvardin that Geany doesn't support optlib-based parsers. (https://github.com/universal-ctags/ctags/pull/3812/files#r1322116241)
Is what @farvardin wrote correct? If yes, what should we do?
My [reply](https://github.com/geany/geany/pull/3377#issuecomment-1713642517) to @farvardin noted that Geany does not import the optlib directory from ctags so Geany probably does not support optlib parsers directly.
@techee should comment on whether the C file generated by optlib, which I see is part of https://github.com/universal-ctags/ctags/pull/3812, can be used directly by Geany; that is beyond my pay grade.
But in general, regular-expression-based parsers have not been used in Geany because of concerns about speed. The parsers have to run between keystrokes, so it would not be a good user experience if they caused significant delays.
This concern may be misplaced. How do they compare to the C/C++ parser?
The C/C++ parser does not seem to cause issues even with files of several thousand lines.
@masatake @elextr
I tried to compile the forth.c generated from forth.ctags and got a compilation error in Geany, so I supposed that it was because the ctags code in Geany was out of date.
Then I synchronised the Geany code with the latest ctags. This time it compiled without significant errors (if I remember correctly), but the resulting Geany binary didn't load at all (and didn't crash either).
> @techee should comment on whether the C file generated by optlib, which I see is part of https://github.com/universal-ctags/ctags/pull/3812, can be used directly by Geany; that is beyond my pay grade.
I don't think there's any issue with these parsers (apart from possible speed problems, depends on how complex and how many regexes there are). We currently don't use any but we could just grab the generated C source.
My personal (very arbitrary) benchmark of speed acceptability would be that there shouldn't be any significant lags when editing a 1000 LOC source on Raspberry Pi 4. I had this problem with the PEG-based Kotlin parser in
https://github.com/geany/geany/pull/3034
> I tried to compile the forth.c generated from forth.ctags and got a compilation error in Geany, so I supposed that it was because the ctags code in Geany was out of date.
Yes, we are like two years behind I'm afraid :-(. In general we want to make an update to the latest ctags early after every release and keep this ctags version until the next release. The problem is that our 3-times-a-year Geany release cycle degraded slightly ;-).
> Then I synchronised the Geany code with the latest ctags. This time it compiled without significant errors (if I remember correctly), but the resulting Geany binary didn't load at all (and didn't crash either).
There are always some surprises that have to be solved after updating to the newest ctags version. In general, you always have to update the mappings in
https://github.com/geany/geany/blob/master/src/tagmanager/tm_parser.c
for all parsers. If there's a mismatch, Geany will print an error telling you which mapping has the problem and won't load.
Anyway, if I'm not wrong, optlib parsers have existed for quite some time, so if you check out an older ctags version that roughly corresponds to the last ctags update we did in Geany and generate the parser source using that version, you should be able to compile it with Geany.
Then check some of the recent open (or closed) PRs adding filetype support in Geany to see how to integrate the new parser into Geany. Also check the HACKING file, though I'm afraid that the most up-to-date version is part of the not-yet-merged https://github.com/geany/geany/pull/3169.
> My personal (very arbitrary) benchmark of speed acceptability would be that there shouldn't be any significant lags when editing a 1000 LOC source on Raspberry Pi 4.
I just had a brief look at the parser and there are very few and simple regexes so I don't expect any performance problems.
I have evaluated the performance of the .ctags based Forth parser with https://github.com/universal-ctags/codebase.
```
[yamato@dev64]~/var/codebase% ./codebase ctags --ctags ~/bin/ctags Forth
version: db9fe7e0
features: +wildcards +regex +iconv +debug +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript +pcre2
log: results/db9fe7e0,Forth...............,..........,time......,default...,2023-09-13-00:41:19.log
tagsoutput: /dev/null
cmdline: + /home/yamato/bin/ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=Forth -o - -R code/4th code/duskos code/gforth code/ueforth code/uf
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/x86" : No such file or directory
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/arm64-v8a" : No such file or directory
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/x86_64" : No such file or directory
1621 files, 211549 lines (6278 kB) scanned in 0.5 seconds (13218 kB/s)
```
See the last line: 13218 kB/s.
For Go:
```
[yamato@dev64]~/var/codebase% ./codebase ctags --ctags ~/bin/ctags Go
version: db9fe7e0
features: +wildcards +regex +iconv +debug +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript +pcre2
log: results/db9fe7e0,Go..................,..........,time......,default...,2023-09-13-00:41:40.log
tagsoutput: /dev/null
cmdline: + /home/yamato/bin/ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=Go -o - -R code/buildah code/go code/kubernetes code/podman
28378 files, 8449364 lines (265525 kB) scanned in 11.3 seconds (23434 kB/s)
```
For C:
```
[yamato@dev64]~/var/codebase% ./codebase ctags --ctags ~/bin/ctags C
version: db9fe7e0
features: +wildcards +regex +iconv +debug +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript +pcre2
log: results/db9fe7e0,C...................,..........,time......,default...,2023-09-13-01:27:03.log
tagsoutput: /dev/null
cmdline: + /home/yamato/bin/ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=C -o - -R code/gforth code/linux code/mysql-server code/perl5 code/php-src code/postgresql code/qemu code/r-source code/ruby code/ueforth
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/x86" : No such file or directory
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/arm64-v8a" : No such file or directory
ctags: Warning: cannot open input file "code/gforth/arch/arm/android/libs/x86_64" : No such file or directory
ctags: Warning: cannot open input file "code/qemu/roms/edk2/EmulatorPkg/Unix/Host/X11IncludeHack" : No such file or directory
ctags: Warning: cannot open input file "code/qemu/roms/skiboot/ccan/heap/LICENSE" : No such file or directory
47673 files, 28992400 lines (827018 kB) scanned in 57.2 seconds (14469 kB/s)
```
The Forth parser is a bit slower than the C parser. But I think this is acceptable. NOTE: The ctags executable used in the performance test is built with --enable-debugging.
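To relate these totals to the "1000 LOC" criterion mentioned earlier, the figures can be turned into a rough per-1000-lines estimate. The sketch below is only back-of-the-envelope arithmetic: the helper names are made up for this example, it assumes scan time scales linearly with line count, it uses the rounded times printed by `--totals`, and it reflects the debug build on the test machine, not a Raspberry Pi 4.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: given a benchmark total (lines scanned and
 * seconds taken), estimate the milliseconds needed to scan `lines`
 * lines, assuming time scales linearly with the number of lines.
 */
double ms_for_lines(double total_lines, double total_seconds, double lines)
{
    assert(total_lines > 0.0);
    return total_seconds / total_lines * lines * 1000.0;
}

/* Print estimates for a 1000-line file from the totals quoted above. */
void print_estimates(void)
{
    printf("Forth: %.2f ms per 1000 lines\n",
           ms_for_lines(211549.0, 0.5, 1000.0));
    printf("Go:    %.2f ms per 1000 lines\n",
           ms_for_lines(8449364.0, 11.3, 1000.0));
    printf("C:     %.2f ms per 1000 lines\n",
           ms_for_lines(28992400.0, 57.2, 1000.0));
}
```

On these numbers all three parsers come out at a few milliseconds per 1000 lines on the test machine, so even a considerably slower machine leaves some headroom, under the assumptions stated above.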
> I have evaluated the performance of the .ctags based Forth parser with https://github.com/universal-ctags/codebase.
Thanks! This will be totally fine for Geany.
@masatake thanks indeed for the performance data.
Closed #3557 as completed.
Thank you. I will merge the optlib-based Forth parser first.