[Github-comments] [geany/geany] Move symbol tree root mappings to tm_parser.c (PR #3137)

Mon Mar 21 21:29:34 UTC 2022

This is hopefully the last of the big TM-related refactorings from me to improve ctags parser management. I'll be nice afterwards, I promise :-).

There are several problems with the current mapping done in `symbols.c`:

1. All other language-specific mappings are now done in `tm_parser.c`;
this is the only one that is done elsewhere. Having all the
mappings in one place makes things much clearer and makes `tm_parser.c`
the only file to touch when introducing a new parser or when
updating a parser to a new upstream version.

2. The mapping is extremely confusing. First, there are several
hard-coded iterator names in [TreeviewSymbols](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc/src/symbols.c#L381) which cover only
a subset of the tag types. Then, there is
[get_tag_type_iter()](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc/src/symbols.c#L1022), another mapping that groups certain
tag types under `TreeviewSymbols` members. So when looking at the mappings
defined in [add_top_level_items()](https://github.com/geany/geany/blob/5a369a41e3cb6fd1bc57e602fa567e8c0d153bfc/src/symbols.c#L491), it isn't clear
how the tag types of a given language get mapped to their
roots without also looking at `get_tag_type_iter()`.

3. Since the groupings in `get_tag_type_iter()` are hard-coded,
some tag types have to be grouped together whether it makes sense
for the given language or not. For instance, for C we have
"Typedefs / Enums" grouped together because `get_tag_type_iter()`
returns the same root for `tm_tag_typedef_t` and `tm_tag_enum_t`.
Even if we wanted to change this for C, we would affect
other languages too because this mapping is the same for all
languages.

4. Because of the hard-coded grouping of some tag types, we have to
choose between showing something the way we want in the
symbol tree and mapping a ctags kind to the tag type that is
semantically closest to the construct in the given language. For
instance, we could split "Typedefs / Enums" into separate roots
in the symbol tree by e.g. mapping typedefs to `tm_tag_typedef_t`
and enums to `tm_tag_field_t`, which have separate roots, but then
an enum would be represented by `tm_tag_field_t`, which would confuse
some more advanced Geany features like scope completion.

5. In addition, the hard-coded grouping effectively limits
the number of roots to 11, which may not be enough for some languages.

6. Tag icons for the autocompletion popup are hard-coded in
`get_tag_class()` and may differ from the icons used by the symbol
tree. This isn't easily fixable with the current way of mapping.

This patch tries to solve these problems by moving the symbol
tree root mappings to `tm_parser.c` (so all the mappings are in one
place) and by introducing a more flexible and easier-to-maintain
way of defining them.

For instance, consider the kind mappings for the Haxe programming
language, which until now looked like this:

```
static TMParserMapEntry map_HAXE[] = {
	{'m', tm_tag_method_t},     // method
	{'c', tm_tag_class_t},      // class
	{'e', tm_tag_enum_t},       // enum
	{'v', tm_tag_variable_t},   // variable
	{'i', tm_tag_interface_t},  // interface
	{'t', tm_tag_typedef_t},    // typedef
};
```

In addition, after this patch, `tm_parser.c` also contains the
following mapping for the symbol tree roots:

```
static TMParserMapGroup group_HAXE[] = {
	{_("Interfaces"), TM_ICON_STRUCT, tm_tag_interface_t},
	{_("Classes"), TM_ICON_CLASS, tm_tag_class_t},
	{_("Methods"), TM_ICON_METHOD, tm_tag_method_t},
	{_("Types"), TM_ICON_MACRO, tm_tag_typedef_t | tm_tag_enum_t},
	{_("Variables"), TM_ICON_VAR, tm_tag_variable_t},
};
```

This declaration says that there are 5 roots with the given
names, the icons attached to these roots, and, finally, the TM types
which will appear under these roots. Notice that there may be
multiple types under a single root, which can be OR-ed together using
`|` because TM types are bit flags. This definition gives us enough
flexibility to overcome the problems mentioned above and, by having
everything in one place, we can manage TM languages much more
easily.
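
To illustrate why OR-ing types into one root works, here is a minimal, self-contained sketch of a group lookup. The `Demo*` names and bit values are made up for the example and are not the real TM definitions; the icon field is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TM tag types: every type occupies one bit. */
typedef enum {
	DEMO_TAG_CLASS     = 1 << 0,
	DEMO_TAG_ENUM      = 1 << 1,
	DEMO_TAG_INTERFACE = 1 << 2,
	DEMO_TAG_METHOD    = 1 << 3,
	DEMO_TAG_TYPEDEF   = 1 << 4,
	DEMO_TAG_VARIABLE  = 1 << 5
} DemoTagType;

/* Simplified group entry: root name plus a mask of the types shown under it. */
typedef struct {
	const char *name;
	unsigned int types;
} DemoGroup;

static const DemoGroup demo_groups[] = {
	{"Interfaces", DEMO_TAG_INTERFACE},
	{"Classes",    DEMO_TAG_CLASS},
	{"Methods",    DEMO_TAG_METHOD},
	{"Types",      DEMO_TAG_TYPEDEF | DEMO_TAG_ENUM},
	{"Variables",  DEMO_TAG_VARIABLE},
};

/* Return the index of the first group whose mask contains `type`, or -1. */
static int demo_find_group(unsigned int type)
{
	size_t i;

	assert(type != 0);
	for (i = 0; i < sizeof(demo_groups) / sizeof(demo_groups[0]); i++)
	{
		if (demo_groups[i].types & type)
			return (int) i;
	}
	return -1;
}
```

With this table, both `DEMO_TAG_TYPEDEF` and `DEMO_TAG_ENUM` resolve to the "Types" root, and splitting them for one language only means editing that language's table.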

There isn't anything particularly interesting about the rest of
the patch. There are 2 auxiliary functions in `tm_parser.c`/`tm_parser.h`:

- `tm_parser_get_sidebar_group()`: returns the index of the group for
  the provided language and TM tag type
- `tm_parser_get_sidebar_info()`: returns the root name and icon
  for the provided language and group index
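
As a rough idea of the shape of such a pair of helpers, the following sketch keeps one group table per language and looks things up by language id. Everything here (the `demo_` functions, their signatures, the ids and the mask values) is hypothetical and only meant to show the idea; the actual signatures are in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical icon and language ids, invented for this example. */
enum { DEMO_ICON_CLASS, DEMO_ICON_VAR };
enum { DEMO_LANG_A, DEMO_LANG_B, DEMO_LANG_COUNT };

typedef struct {
	const char *name;
	int icon;
	unsigned int types;   /* bit mask of the tag types under this root */
} DemoGroup;

typedef struct {
	const DemoGroup *groups;
	size_t count;
} DemoLangGroups;

static const DemoGroup demo_groups_a[] = {
	{"Classes",   DEMO_ICON_CLASS, 0x1},
	{"Variables", DEMO_ICON_VAR,   0x2 | 0x4},
};
static const DemoGroup demo_groups_b[] = {
	{"Variables", DEMO_ICON_VAR, 0x2},
};

/* One entry per language, indexed by the language id. */
static const DemoLangGroups demo_langs[DEMO_LANG_COUNT] = {
	{demo_groups_a, sizeof(demo_groups_a) / sizeof(demo_groups_a[0])},
	{demo_groups_b, sizeof(demo_groups_b) / sizeof(demo_groups_b[0])},
};

/* Index of the group holding `type` for `lang`, or -1 if there is none. */
static int demo_get_sidebar_group(int lang, unsigned int type)
{
	size_t i;

	assert(lang >= 0 && lang < DEMO_LANG_COUNT);
	for (i = 0; i < demo_langs[lang].count; i++)
	{
		if (demo_langs[lang].groups[i].types & type)
			return (int) i;
	}
	return -1;
}

/* Root name for (lang, group), storing the icon in *icon;
 * returns NULL and leaves *icon untouched if the index is out of range. */
static const char *demo_get_sidebar_info(int lang, int group, int *icon)
{
	assert(lang >= 0 && lang < DEMO_LANG_COUNT);
	if (group < 0 || (size_t) group >= demo_langs[lang].count)
		return NULL;
	*icon = demo_langs[lang].groups[group].icon;
	return demo_langs[lang].groups[group].name;
}
```

A caller could first ask for the group index of a tag, then use that index both to fetch the root's name and icon and to address its own per-group storage.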

Inside `symbols.c`, `tv_iters` was converted to an array of
`GtkTreeIter` of size `MAX_SYMBOL_TYPES` instead of the previous struct
of hard-coded roots, and the rest of the code was updated to use
this array and the above 2 functions to get the mappings.
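
One of the commits in this PR adds a check that all tag types for a language are mapped to some group. A simple way such a check could work, sketched here with hypothetical `Demo*` types rather than the code from the patch, is to OR together every type used by the kind map and every type covered by the groups, and report what is left over:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for a kind map entry and a group entry. */
typedef struct {
	char kind;
	unsigned int type;    /* single tag type bit */
} DemoMapEntry;

typedef struct {
	const char *name;
	unsigned int types;   /* bit mask of tag types under this root */
} DemoGroup;

/* Return the mask of tag types that the kind map uses but no group covers.
 * A result of 0 means every mapped type has a root in the symbol tree. */
static unsigned int demo_unmapped_types(const DemoMapEntry *map, size_t n_map,
	const DemoGroup *groups, size_t n_groups)
{
	unsigned int used = 0, grouped = 0;
	size_t i;

	assert(map != NULL && groups != NULL);
	for (i = 0; i < n_map; i++)
		used |= map[i].type;
	for (i = 0; i < n_groups; i++)
		grouped |= groups[i].types;
	return used & ~grouped;
}
```

Running something like this once per language at startup or in a unit test would catch a parser whose kind map gained a tag type without a matching root.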

TODO: Update HACKING with updated description about how to add a ctags parser (will do after a review if this patch is considered OK).

You can view, comment on, or merge this pull request online at:

  https://github.com/geany/geany/pull/3137

-- Commit Summary --

  * Move symbol tree root mappings to tm_parser.c
  * Add code verifying that all tag types for a language are mapped to some group
  * Translate symbol tree roots

-- File Changes --

    M po/POTFILES.in (1)
    M src/symbols.c (628)
    M src/tagmanager/tm_parser.c (1132)
    M src/tagmanager/tm_parser.h (18)

-- Patch Links --

https://github.com/geany/geany/pull/3137.patch
https://github.com/geany/geany/pull/3137.diff
