[Geany-devel] I'd like to write a lexer for HAML. Can I invoke sub-lexers, and can I set up simple TDD?

Nathan Broadbent nathan.f77 at xxxxx
Tue Nov 29 16:30:51 UTC 2011


Dear Scintilla & Geany mailing lists,

I would like to write a Scintilla lexer for the HAML syntax. HAML is
used by many (if not most) Ruby on Rails developers. It is an indented
HTML markup language with inline Ruby evaluation. It also needs to
support many embedded syntaxes, such as JavaScript, Ruby, CSS,
Markdown, Textile, and plain HTML.

I've been studying the HTML lexer for the last 5 hours, and I'm
beginning to understand how everything fits together. But before I
start on the lexer, I would like to ask a few questions.

I've noticed that the HTML lexer function is passed 6 sets of
keywords, including HTML elements, Javascript keywords, Python
keywords, etc. It then defines 'classifyWordHT***' functions in order
to classify words for each of those sub-languages.

I was wondering if it would be possible / advisable to invoke a
different lexer for each sub-language, instead of trying to support
them all from one lexer?
For example, I could detect the opening tag for javascript, scan down
the file and count the character length until the javascript block
ends. Then I could pass the original Accessor to the javascript lexer
(LexerCPP) with the start position and length. This 'sub-lexer' would
colorize the text, and the main lexer could then skip that block of
text.

It seems that a Scintilla lexer isn't designed to do this on its own,
since Geany passes it all of the data about filetypes, keywords, and
highlighting mappings. But is there any reason why I couldn't
hard-code "javascript => SCLEX_CPP" in my HAML lexer?
Or is there an abstract way to query lexer mappings and keywords from
scintilla's caller application?
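
To make the sub-lexer idea concrete, here is a rough standalone sketch
of the control flow I have in mind. It does not use any Scintilla
headers: the style characters, LexJavascriptRange and LexHaml are all
hypothetical names I made up for illustration, the "keyword list" is a
single word, and the block is assumed to end at the first line that is
not indented. A real version would work through the Accessor and the
SCE_* style constants instead of a plain vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical style ids; a real lexer would use Scintilla's SCE_* constants.
enum Style : char { kDefault = 'd', kFilter = 'f', kJsWord = 'k', kJsText = 'j' };

// Stand-in for the sub-lexer (the role LexerCPP would play for javascript).
// It styles only the range [start, start + length) and nothing outside it.
// Toy behaviour: every occurrence of "var" is a keyword (no word-boundary check).
void LexJavascriptRange(const std::string &text, std::size_t start,
                        std::size_t length, std::vector<char> &styles) {
    const std::string keyword = "var";
    const std::size_t end = start + length;
    for (std::size_t i = start; i < end; ++i) styles[i] = kJsText;
    for (std::size_t pos = text.find(keyword, start);
         pos != std::string::npos && pos + keyword.size() <= end;
         pos = text.find(keyword, pos + 1)) {
        for (std::size_t i = 0; i < keyword.size(); ++i) styles[pos + i] = kJsWord;
    }
}

// Stand-in for the main HAML lexer. It only recognises a ":javascript" line
// in column 0, measures the indented block below it, hands that range to the
// sub-lexer, and then skips past it.
std::vector<char> LexHaml(const std::string &text) {
    std::vector<char> styles(text.size(), kDefault);
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t eol = text.find('\n', pos);
        const std::size_t lineEnd = (eol == std::string::npos) ? text.size() : eol;
        std::size_t next = (eol == std::string::npos) ? text.size() : eol + 1;
        if (text.compare(pos, lineEnd - pos, ":javascript") == 0) {
            for (std::size_t i = pos; i < lineEnd; ++i) styles[i] = kFilter;
            // Scan down the file: the block lasts while lines start with a space.
            const std::size_t blockStart = next;
            std::size_t blockEnd = next;
            while (blockEnd < text.size() && text[blockEnd] == ' ') {
                const std::size_t e = text.find('\n', blockEnd);
                blockEnd = (e == std::string::npos) ? text.size() : e + 1;
            }
            if (blockEnd > blockStart)
                LexJavascriptRange(text, blockStart, blockEnd - blockStart, styles);
            next = blockEnd;  // the main lexer skips the delegated block
        }
        pos = next;
    }
    return styles;
}
```

Calling LexHaml on a small buffer returns one style character per input
character, so the delegated range is easy to inspect. What the sketch
leaves open is exactly my question: where the sub-lexer's keywords and
style mappings would come from.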


My other question is: How do you test your lexers? Do you just compile
them and check them manually in SciTE or Geany?
I'd really like to start with some sample files, hand-code the
expected highlighting, then compile and run the sample files through
the lexer until all the expectations are met. I haven't been able to
find any documentation about testing, but please let me know if
someone has already written a script to automate this process.
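
Here is roughly the kind of check I am imagining, as a standalone
sketch. The expected highlighting is hand-written as one style
character per source character, and the lexer output is compared
against it. StyleSample is only a toy stand-in for the lexer under
test, and both function names and the 't'/'d' style letters are my own
invention, not anything from Scintilla or Geany.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in for the lexer under test: a '%tag' at the start of a line is
// styled 't', everything else 'd'. Newlines are copied into the style string
// so that expectations can be written line by line.
std::string StyleSample(const std::string &text) {
    std::string styles(text.size(), 'd');
    bool lineStart = true;
    bool inTag = false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\n') {
            styles[i] = '\n';
            lineStart = true;
            inTag = false;
            continue;
        }
        if (lineStart && c == '%') {
            inTag = true;           // tag begins at the first column
        } else if (inTag && c == ' ') {
            inTag = false;          // tag ends at the first space
        }
        if (inTag) styles[i] = 't';
        lineStart = false;
    }
    return styles;
}

// Returns the offset of the first difference between the hand-coded
// expectation and the lexer output, or std::string::npos if they match.
std::size_t FirstMismatch(const std::string &expected, const std::string &actual) {
    const std::size_t n =
        expected.size() < actual.size() ? expected.size() : actual.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (expected[i] != actual[i]) return i;
    }
    return expected.size() == actual.size() ? std::string::npos : n;
}
```

A test run would then load each sample file together with its
expectation file, call FirstMismatch on the two style strings, and
report the offset of the first difference, so I could keep fixing the
lexer until every sample passes.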


Thanks a lot for your time!

Best regards,
Nathan B


