This is hopefully the last of the big TM-related refactorings from me to improve ctags parser management. I'll be nice afterwards, I promise :-).
There are several problems with the current mapping done in symbols.c:
1. All other language-specific mappings are done in `tm_parser.c` now and this is the only thing that is done elsewhere. Having all the mappings at one place makes things much clearer and makes `tm_parser.c` the only place to play with when introducing new parser or when updating a parser to a new upstream version.
2. The mapping is extremely confusing. First, there are several hard-coded iterator names in [TreeviewSymbols](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc...) which don't cover all tag types but which are just a subset of them. Then, there is [get_tag_type_iter()](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc...) which is another mapping that groups certain tag types to `TreeViewSymbols` members. So when looking at mappings defined in [add_top_level_items()](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc...), it isn't clear by just looking at it how tag types of certain languages get mapped to their roots without also having a look at `get_tag_type_iter()`.
3. Since the groupings in `get_tag_type_iter()` are hard-coded, some tag types have to be grouped together whether it makes sense for the given language or not. For instance, for C we have "Typedefs / Enums" grouped together because `get_tag_type_iter()` returns the same root for `tm_tag_typedef_t` and `tm_tag_enum_t` and even if we wanted to change this for C, we would affect other languages too because this mapping is the same for all languages.
4. Because of the hard-coded grouping of some tag types, we have to make a decision whether we want something to show as we want in the symbol tree or whether we map a ctags kind to tag type that is semantically close to the construct in the given language. For instance, we could separate "Typedefs / Enums" to separate roots in the symbol tree by e.g. mapping typedefs to `tm_tag_typedef_t` and enums to `tm_tag_field_t` which have separate roots but then enum is represented by `tm_tag_field_t` which would confuse some more advanced Geany features like scope completion.
5. In addition, the hard-coded grouping effectively reduces the number of roots to 11 which may not be enough for some languages.
6. Tag icons for autocompletion popup are hard-coded in `get_tag_class()` and may differ from the icons used by the symbol tree. This isn't fixable easily with the current way of mapping.
This patch tries to solve these problems by moving root symbol tree mappings to `tm_parser.c` (so all the mappings are at one place) together with more flexible and easier to maintain way of mapping definition.
For instance, consider kind mappings for the HAXE programming language which until now looked this way.
``` static TMParserMapEntry map_HAXE[] = { {'m', tm_tag_method_t}, // method {'c', tm_tag_class_t}, // class {'e', tm_tag_enum_t}, // enum {'v', tm_tag_variable_t}, // variable {'i', tm_tag_interface_t}, // interface {'t', tm_tag_typedef_t}, // typedef }; ```
In addition, after this patch, `tm_parser.c` contains also the following mapping for the symbol tree roots:
``` static TMParserMapGroup group_HAXE[] = { {_("Interfaces"), TM_ICON_STRUCT, tm_tag_interface_t}, {_("Classes"), TM_ICON_CLASS, tm_tag_class_t}, {_("Methods"), TM_ICON_METHOD, tm_tag_method_t}, {_("Types"), TM_ICON_MACRO, tm_tag_typedef_t | tm_tag_enum_t}, {_("Variables"), TM_ICON_VAR, tm_tag_variable_t}, }; ```
This declaration says that there are 5 roots with the given names, icons attached to these roots, and, finally, the TM types which will appear under these roots. Notice that there may be multiple types under a single root which can be OR-d using `|` because TM types are bit fields. This definition gives us enough flexibility to overcome the problems mentioned above and by having everything at one place, we can manage TM languages much more easily.
There isn't anything particularly interesting about the rest of the patch - there are 2 auxiliary functions in `tm_parser.c/h`:
- `tm_parser_get_sidebar_group()`: returns index of a group for the provided language and TM tag type - `tm_parser_get_sidebar_info()`: returns root name and icon for the provided language and group index
Inside `symbols.c`, `tv_iters` was converted to an array of size `MAX_SYMBOL_TYPES` of `GtkTreeIter` instead of the previous struct of hard-coded roots and the rest of the code is updated to use this array and the above 2 functions to get the mappings.
TODO: Update HACKING with updated description about how to add a ctags parser (will do after a review if this patch is considered OK). You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany/pull/3137
-- Commit Summary --
* Move symbol tree root mappings to tm_parser.c * Add code verifying that all tag types for a language are mapped to some group * Translate symbol tree roots
-- File Changes --
M po/POTFILES.in (1) M src/symbols.c (628) M src/tagmanager/tm_parser.c (1132) M src/tagmanager/tm_parser.h (18)
-- Patch Links --
https://github.com/geany/geany/pull/3137.patch https://github.com/geany/geany/pull/3137.diff
This is hopefully the last of the big TM-related refactorings from me to improve ctags parser management. I'll be nice afterwards, I promise :-).
Thats what you said last time, and the time before and ... :-P
But we are not grateful to you for shouldering that load.
The principles (as much as I understand them from the description) seem sound, but I'm not sure anyone but @b4n could review it, if he remembers any Geany now. Unless someone else wants to become a tagmanager expert to assist @techee.
If it doesn't get reviewed in a reasonable time but it seems to work for anyone testing it, I would suggest that it be merged and the next release be 1.99 (ie pre-2.0, so 2.0 can be stable, that would allow any other "too complex to review in detail" but "test ok for the things testers try" changes to be merged as well).
Of course I meant not UNgrateful!!!!
The principles (as much as I understand them from the description) seem sound, but I'm not sure anyone but @b4n could review it, if he remembers any Geany now. Unless someone else wants to become a tagmanager expert to assist @techee.
The times of unreviewable PRs posted by me are fortunately over and this PR _isn't_ one of them actually - the biggest bulk of this PR is "just" the moving of the language mappings to `tm_parser.c` and updating them to the new way of mapping, the actual "real" code change is maybe just around 100 LOCs and perfectly reviewable. And it's not much of the TM stuff that is affected by this PR, it's rather the `symbols.c` stuff.
The most worrisome part of this PR for me is the `"just" moving and updating the mapping` part for all the languages. I __really__ tried to do it carefully but one gets slightly delirious at language number 30 with 15 more remaining and it's always possible I might have introduced some error.
Anyway, after all the uctags parsers are merged (I'll wait with the remaining ones until this PR is merged), I plan to do a big testing session checking if everything works.
but one gets slightly delirious at language number 30 with 15 more remaining
So do reviewers, hence my comment ... :smile:
So do reviewers, hence my comment ...
Yeah and I don't think the mappings themselves should be reviewed (that would really be a horrible task), just the actual code changed maybe.
But even if there's some problem in the mappings, it won't be a complete disaster, it will just mean some tags will be displayed under incorrect roots or will be missing in the symbol tree, all the rest will work fine.
As we all know I'm not @b4n and I'm sure nobody is as good as he is in reviewing such things but anyway I gave this PR a test and review and it looks very good!!
Oh, and please don't take this as an application for becoming the next tagmanager expert :D.
Thank you so much @techee for this hard work. Much appreciated.
I got a few rather harmless compiler warnings: ``` tm_parser.c: In function ‘tm_parser_get_sidebar_info’: tm_parser.c:1087:19: warning: comparison of integer expressions of different signedness: ‘gint’ {aka ‘int’} and ‘guint’ {aka ‘unsigned int’} [-Wsign-compare] 1087 | if (group >= map->group_num) | ^~ tm_parser.c:1081:15: warning: unused variable ‘i’ [-Wunused-variable] 1081 | guint i; | ^ ``` Easy enough I guess.
``` At top level: tm_parser.c:74:25: warning: ‘map_C_old_parser’ defined but not used [-Wunused-variable] 74 | static TMParserMapEntry map_C_old_parser[] = { | ^~~~~~~~~~~~~~~~ ``` `map_C_old_parser` is used for Ferite and GLSL according to its comment. Ferite is gone and GLSL used the normal C parser, so I guess this whole map entry can be removed?
Apart from that, I'd suggest to merge this soon to get more real life testing (I use master for daily work).
@techee pushed 5 commits.
ca474ed7e5fede88878780b2ae2df64aaaebec1e fixup! Move symbol tree root mappings to tm_parser.c 6180305a87fe6673986382d20011de8c77748d7f fixup! Move symbol tree root mappings to tm_parser.c 2928e71a3de43fb73a8551d77b49e6d727fb3029 fixup! Move symbol tree root mappings to tm_parser.c fdc8a1bb1f9aeb27fa96426191f830008a72e1f0 Fix -Wsign-compare warning 834d0246c359fd590008a0fdb32c17b01c9ff3e2 Fix -Wunused-variable warnings
Oh, and please don't take this as an application for becoming the next tagmanager expert :D.
Too late, it's you now :-P.
I fixed the warnings and noticed a few more of `-Wsign-compare` and `-Wunused-variable` warnings which were introduced in some past commits so fixed these too while there.
map_C_old_parser is used for Ferite and GLSL according to its comment. Ferite is gone and GLSL used the normal C parser, so I guess this whole map entry can be removed?
Yes. And thanks to this we don't need `COMMON_C` and `COMMON_C_NEW_PARSER` as separate defines so I removed `COMMON_C_NEW_PARSER` and moved the entries directly to `COMMON_C`.
By the way, should the localization *.po files be updated as part of this PR (the translatable strings moved from `symbols.c` to `tm_parser.c`) or does this normally happen separately towards the end of the release cycle?
If it should be updated here, I'm not sure how to do it to get the same output as now with the `../` prefix - I'm getting diffs like below when running `make update-po`:
``` -#: ../geany.desktop.in.h:1 ../data/geany.glade.h:343 +#: geany.desktop.in:5 data/geany.glade:6294 msgid "Geany" msgstr "Geany" ```
@techee pushed 1 commit.
1ec68e50c4dc1117eb95c9ab0cd6ea2e88ea0662 Update HACKING with up-to-date information about ctags parser addition
@eht16 commented on this pull request.
@@ -500,13 +500,13 @@ other changes should be passed on to the maintainers at
http://scintilla.org. We normally update to a new Scintilla release
Very nice, also to clean up the old and outdated Anjuta references. :+1:
By the way, should the localization *.po files be updated as part of this PR (the translatable strings moved from `symbols.c` to `tm_parser.c`) or does this normally happen separately towards the end of the release cycle?
Usually, @frlan updates all the message catalogs when announcing a string freeze. So, there is no need to do it in this PR.
If it should be updated here, I'm not sure how to do it to get the same output as now with the `../` prefix - I'm getting diffs like below when running `make update-po`:
-#: ../geany.desktop.in.h:1 ../data/geany.glade.h:343 +#: geany.desktop.in:5 data/geany.glade:6294 msgid "Geany" msgstr "Geany"
Same here, not sure why. @frlan any idea?
Even though I have no idea why the .. did change, I would recommend to not add the geany.pot-Updates to a MR request, as it will cause always merge conflicts if the MR getting a little outdated. geany.pot as well as .po are more or less automatic generated files which is conflicting with the way git is working.
In this special case if there is only a string moving from one file to another most likely it will not even generate a fuzzy string on updating po-files later.
Translators are asked to run `make update-po` in prior of starting their work or download the catalog from from i18n-page which is including the new references by default.
So, I guess we can merge (after squashing).
Done. I just squashed the fixup commits, the rest makes sense separately I think.
Done. I just squashed the fixup commits, the rest makes sense separately I think.
Sure, sorry I was also talking only about the fixup commits.
Merged #3137 into master.
github-comments@lists.geany.org