[Geany-devel] Highlighting setup refactoring?
Matthew Brush
mbrush at xxxxx
Wed Nov 9 01:46:03 UTC 2011
On 11/08/2011 02:55 PM, Colomban Wendling wrote:
> Hi guys,
>
> TL;DR: code refactoring, boring stuff :D
>
>
> I tried a rewrite of highlighting.c:styleset_*() functions for all this
> code to be better readable and hopefully more easily maintainable. I
> ended up with no logic change (yet at least ^^), but a big refactoring,
> in the form of arrays mapping the different values.
>
> The code itself can be found here:
> https://github.com/b4n/geany/commits/highlighting-rewrite, the most
> important commit being
> https://github.com/b4n/geany/commit/ab2663e9aef8aa17af330c54eb7e6223647aa0f5
>
> Basically, that's how I made this:
>
>
> A filetype/styling definition [1] is split in 4 variables/arrays:
>
> * highlighting_lexer_<LANG>: (guint)
> the SCI lexer, e.g. SCLEX_CPP
> * highlighting_styles_<LANG>: (HLStyle[])
> SCI style/named style mappings, e.g. SCE_ASM_DEFAULT to
> "default"
> * highlighting_keywords_<LANG>: (HLKeyword[])
> keywords ID/name mappings, e.g. 0 mapped to "primary"
> * highlighting_properties_<LANG>: (HLProperty[])
> default SCI properties and their values, only used to disable
> fold.cpp.comment.explicit ATM, may be dropped maybe.
>
> These definitions are enough to do all the initialization and SCI setup,
> and because of the naming scheme, can be hidden behind a macro:
>
> STYLESET_INIT_FROM_MAPPING(ft_id, config, configh,<LANG>)
> STYLESET_FROM_MAPPING(sci, ft_id,<LANG>)
>
> These does preprocessing "##" magic to hook the different idendifiers
> and call (roughly, there are actually array length too):
>
> styleset_init_from_mapping(ft_id, config, configh, styles, keywords)
> styleset_from_mapping(sci, ft_id, lexer, styles, keywords, properties)
>
> Finally, the old init_styleset_case(geany_ft_id, init_func) can be
> replaced by init_styleset_case_auto(<LANG>), since<LANG> is the same as
> in GEANY_FILETYPE_<LANG>.
>
>
> OK, this may look a little complicated at first glance (though the old
> wasn't that straightforward either). But OTOH, the definitions are
> (IMO) far easier to read and update:
>
> * SCI style/named styles are next to each other
> * No need to allocate enough space manually (new_styleset()), it's done
> automatically
> * All the repetitive calls are in proper loops, making the actually
> useful information more readable.
> * ...
>
These changes solve my biggest gripes with highlighting.c :) I *think*
there might be some performance advantages to doing it this way too
since all of the data is static/const and known upfront. I think the
compiler can treat it differently now (but I'm not a GCC expert, so I
could be wrong).
>
> What's left to do is porting DocBook, HTML, PHP and XML.
>
> For DocBook it is because it uses SCI_STYLESETEOLFILLED that my port
> don't support yet (could easily with one more column);
>
Is it possible to add the extra column, and just not use it in all the
other arrays where it's not required?
> For HTML, PHP and XML it's a bit harder, because they use a weird mixed
> setup (that I didn't try to understand too deeply yet). IMO, first
> thing to do would be to make each have its own styles, and inherit XML's
> one in filetypes definitions ([styling=XML]) rather than in-code. There
> wouldn't be manual duplication with the proposed setup, just alias XML
> definition to HTML and PHP.
> Not sure what to do with the Python filetype dependency though.
>
I *hate* how all these languages are mixed into a single lexer :)
>
> Anyway, I'd like your opinion on this, what to improve, what you feel
> good and bad about.
>
Heh, I actually did a tiny bit of work (mostly thinking) about basically
the exact same thing. I have a script I wrote in Python that builds the
lookup table/array code (nearly identical to yours) from
Scintilla.iface, we could perhaps look at using that to re-generate the
tables when Scintilla updates, or maybe even just for checking after
updates. I don't know if it's necessary though now that you've done the
bulk of the tables and they don't change *that* much between Scintilla
versions. Just a thought.
As we had discussed on IRC, I'd still rather not be storing duplicate
name-mapping in (hard)code like this. It seems cleaner to me for
example, to have the filetypes.* containing a '[style_map]' section
which maps named styles to scintilla style/state numbers. The problem
of course is that it would need comments everywhere that have the
SCE_LANG_FOO constants, otherwise there would just be plain magic
integers everywhere. I guess this approach is similar to SciTE
properties files.
Overall I think your changes are a great improvement in maintainability
and readability (and maybe performance). Nice work!
BTW, should we test your current repo or wait for the other filetypes to
be implemented?
Cheers,
Matthew Brush
More information about the Devel
mailing list