[Geany-devel] Highlighting setup refactoring?
mbrush at xxxxx
Wed Nov 9 01:46:03 UTC 2011
On 11/08/2011 02:55 PM, Colomban Wendling wrote:
> Hi guys,
> TL;DR: code refactoring, boring stuff :D
> I tried a rewrite of highlighting.c:styleset_*() functions for all this
> code to be better readable and hopefully more easily maintainable. I
> ended up with no logic change (yet at least ^^), but a big refactoring,
> in the form of arrays mapping the different values.
> The code itself can be found here:
> https://github.com/b4n/geany/commits/highlighting-rewrite, the most
> important commit being
> Basically, that's how I made this:
> A filetype/styling definition  is split in 4 variables/arrays:
> * highlighting_lexer_<LANG>: (guint)
> the SCI lexer, e.g. SCLEX_CPP
> * highlighting_styles_<LANG>: (HLStyle)
> SCI style/named style mappings, e.g. SCE_ASM_DEFAULT to
> * highlighting_keywords_<LANG>: (HLKeyword)
> keywords ID/name mappings, e.g. 0 mapped to "primary"
> * highlighting_properties_<LANG>: (HLProperty)
> default SCI properties and their values, only used to disable
> fold.cpp.comment.explicit ATM, may be dropped maybe.
> These definitions are enough to do all the initialization and SCI setup,
> and because of the naming scheme, can be hidden behind a macro:
> STYLESET_INIT_FROM_MAPPING(ft_id, config, configh,<LANG>)
> STYLESET_FROM_MAPPING(sci, ft_id,<LANG>)
> These does preprocessing "##" magic to hook the different idendifiers
> and call (roughly, there are actually array length too):
> styleset_init_from_mapping(ft_id, config, configh, styles, keywords)
> styleset_from_mapping(sci, ft_id, lexer, styles, keywords, properties)
> Finally, the old init_styleset_case(geany_ft_id, init_func) can be
> replaced by init_styleset_case_auto(<LANG>), since<LANG> is the same as
> in GEANY_FILETYPE_<LANG>.
> OK, this may look a little complicated at first glance (though the old
> wasn't that straightforward either). But OTOH, the definitions are
> (IMO) far easier to read and update:
> * SCI style/named styles are next to each other
> * No need to allocate enough space manually (new_styleset()), it's done
> * All the repetitive calls are in proper loops, making the actually
> useful information more readable.
> * ...
These changes solve my biggest gripes with highlighting.c :) I *think*
there might be some performance advantages to doing it this way too
since all of the data is static/const and known upfront. I think the
compiler can treat it differently now (but I'm not a GCC expert, so I
could be wrong).
> What's left to do is porting DocBook, HTML, PHP and XML.
> For DocBook it is because it uses SCI_STYLESETEOLFILLED that my port
> don't support yet (could easily with one more column);
Is it possible to add the extra column, and just not use it in all the
other arrays where it's not required?
> For HTML, PHP and XML it's a bit harder, because they use a weird mixed
> setup (that I didn't try to understand too deeply yet). IMO, first
> thing to do would be to make each have its own styles, and inherit XML's
> one in filetypes definitions ([styling=XML]) rather than in-code. There
> wouldn't be manual duplication with the proposed setup, just alias XML
> definition to HTML and PHP.
> Not sure what to do with the Python filetype dependency though.
I *hate* how all these languages are mixed into a single lexer :)
> Anyway, I'd like your opinion on this, what to improve, what you feel
> good and bad about.
Heh, I actually did a tiny bit of work (mostly thinking) about basically
the exact same thing. I have a script I wrote in Python that builds the
lookup table/array code (nearly identical to yours) from
Scintilla.iface, we could perhaps look at using that to re-generate the
tables when Scintilla updates, or maybe even just for checking after
updates. I don't know if it's necessary though now that you've done the
bulk of the tables and they don't change *that* much between Scintilla
versions. Just a thought.
As we had discussed on IRC, I'd still rather not be storing duplicate
name-mapping in (hard)code like this. It seems cleaner to me for
example, to have the filetypes.* containing a '[style_map]' section
which maps named styles to scintilla style/state numbers. The problem
of course is that it would need comments everywhere that have the
SCE_LANG_FOO constants, otherwise there would just be plain magic
integers everywhere. I guess this approach is similar to SciTE
Overall I think your changes are a great improvement in maintainability
and readability (and maybe performance). Nice work!
BTW, should we test your current repo or wait for the other filetypes to
More information about the Devel