This patch adds basic Prolog support (only the Scintilla lexer; there's no ctags parser). I used SWI-Prolog for the compile and run commands, since I believe it is the most commonly used Prolog implementation. I used the keywords from here:
https://github.com/mxw/vim-prolog/blob/master/syntax/prolog.vim
I only used Prolog at school many years ago but it's an interesting language and I believe Geany should support it (which is why #3086 resonated in my head).
Fixes #3086

You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany/pull/3171
-- Commit Summary --
* Add Prolog filetype support
-- File Changes --
M data/Makefile.am (1)
A data/filedefs/filetypes.prolog (73)
M data/filetype_extensions.conf (1)
M meson.build (1)
M scintilla/Makefile.am (1)
A scintilla/lexilla/lexers/LexVisualProlog.cxx (516)
M scintilla/lexilla/src/Lexilla.cxx (1)
M scintilla/scintilla_changes.patch (1)
M src/filetypes.c (1)
M src/filetypes.h (1)
M src/highlighting.c (17)
M src/highlightingmappings.h (35)
-- Patch Links --
https://github.com/geany/geany/pull/3171.patch
https://github.com/geany/geany/pull/3171.diff
This looks good but, to my thinking, incomplete without support for Visual Prolog, which is what Lexilla's lexer [actually targets][0]. I opened https://github.com/techee/geany/pull/4 to give an idea of what a more inclusive file def might look like.
![geany_visualprolog_ft](https://user-images.githubusercontent.com/59004801/163878216-624c6065-4381-4...)
[0]: https://github.com/ScintillaOrg/lexilla/blob/0bd13d84b3fef7a72553ab95c43d0d4...
@techee pushed 1 commit.
1497c01f1185ba3a0e909963ac18a5a9e82ed9f4 Add Prolog filetype support
@rdipardo I just updated this PR with suggestions you made in https://github.com/techee/geany/pull/4. For primary keywords I used the keywords from the VS extension you suggested (minus the keywords from the secondary group), for secondary keywords I used those you mentioned. Does the result look good to you?
@techee, It looks like the primary keyword set in your last commit left out the VisualProlog type specifiers [1] (or ["Domains"][0], as the spec calls them). Compare the appearance of `unsigned` in my [earlier screen capture][2] with the one below (in the [Himbeere colorscheme](https://github.com/geany/geany-themes)):
![geany_himbeere_visualprolog_hl](https://user-images.githubusercontent.com/59004801/165013360-9ab0265f-1bbf-4...)
I'm confident that SWI-Prolog users will be completely happy 👍🏼
![geany_himbeere_swi-pl-hl](https://user-images.githubusercontent.com/59004801/165013510-0c2ea3ca-7ac2-4...)
They're getting more lexical categories than even the VS Code extension recognizes. Notice, however, that `append` and `multifile` are styled by VS Code but not Geany:
![vscode_swi-pl_hl](https://user-images.githubusercontent.com/59004801/165013570-7bb85e59-3d2f-4...)
Neither is as complete as [prolog.vim](https://github.com/yochem/prolog.vim), but I think we have to accept that regex-capable parsers like Vim and Textmate grammars are simply better than Scintilla's match-every-character-of-one-lexeme-at-a-time model:
![prolog vim_swi-pl_hl](https://user-images.githubusercontent.com/59004801/165013653-8a8608a3-2254-4...)
---
[1]: See https://github.com/rdipardo/geany/commit/3baa85baa25cd36b4d0e437fe7cd7383a82... The complete list is quite brief:
any binary binaryNonAtomic boolean char compareResult factDB handle integer64 integerNative null pointer real real32 string8 symbol unsigned unsigned64 unsignedNative
[0]: https://wiki.visual-prolog.com/index.php?title=Language_Reference/Built-in_e...
[2]: https://user-images.githubusercontent.com/59004801/163895249-93885a44-09dd-4...
> Neither is as complete as [prolog.vim](https://github.com/yochem/prolog.vim), but I think we have to accept that regex-capable parsers like Vim and Textmate grammars are simply better than Scintilla's match-every-character-of-one-lexeme-at-a-time model:

Scintilla lexers are C++, so in theory they could do anything, just somebody has to code it :-)
> It looks like the primary keyword set in your last commit left out the VisualProlog type specifiers [1] (or ["Domains"](https://wiki.visual-prolog.com/index.php?title=Language_Reference/Built-in_e...), as the spec calls them). Compare the appearance of unsigned in my [earlier screen capture](https://user-images.githubusercontent.com/59004801/163895249-93885a44-09dd-4...) with the one below (in the [Himbeere colorscheme](https://github.com/geany/geany-themes)):

I'm getting slightly lost in what you propose to do - I just took the VS code keywords as you suggested but basically I could merge all the keywords together, i.e.: 1. VS Code 2. Vim 3. Visual Prolog I just suspect not many people will use Geany for Visual Prolog which seems to be Windows-only, proprietary and with an official IDE (whose authors probably wrote the lexer and use it in their IDE).
> Scintilla lexers are C++, so in theory they could do anything, just somebody has to code it :-)

Yeah, you should be able to do more things in the code than using regular expression.
> I just suspect not many people will use Geany for Visual Prolog

Sounds reasonable. Let's just settle for enough SWI-PL keywords to provide a common denominator between the Vim and VS Code implementations. Type specifiers are unique to Visual Prolog and could be better implemented by a tags parser anyway. Serious users would expect their custom types to be styled the way `typedef`'d structs currently are in C and family.
> > Scintilla lexers are C++, so in theory they could do anything, just somebody has to code it :-)
>
> Yeah, you should be able to do more things in the code than using regular expression.

Well, being easy but inefficient allowed Python and PHP to become the institutions they are today. Scintilla has followed that tradition to great success.
There was a [proposal made][0] a long time ago to teach Scintilla how to consume flex files. The idea was logical but completely antithetical to the nature of C++, as it would have meant that "Scintilla lexers could be resumed[^1] to more readable definition files, instead of inextricable suite of if / elseif / goto code."
[^1]: The OP is French and seems to have directly translated *résumé* in the sense of *abridged*, *reduced*, etc.
[0]: https://sourceforge.net/p/scintilla/feature-requests/1074
> Sounds reasonable. Let's just settle for enough SWI-PL keywords to provide a common denominator between the Vim and VS Code implementations. Type specifiers are unique to Visual Prolog and could be better implemented by a tags parser anyway. Serious users would expect their custom types to be styled the way typedef'd structs currently are in C and family.

So should I merge the vim and VS code keywords? Right now it's the VS code keywords only.
> Right now it's the VS code keywords only.

I recommended that set because it's much bigger overall, but it reflects only one syntactic category ([builtin functions][0]). We still need [Vim's][1] more general dictionary for high-frequency predicates like `append` and `write`. Merging them into the "primary" set is fine since they're both SWI-Prolog.
[0]: https://github.com/arthwang/vsc-prolog/blob/3fab7b5916c505d55efc1b7556249bbe...
[1]: https://github.com/yochem/prolog.vim/blob/master/syntax/prolog.vim
@rdipardo
> Well, being easy but inefficient allowed Python and PHP to become the institutions they are today. Scintilla has followed that tradition to great success.

I'm not sure I understand how Scintilla follows the "inefficient but easy" tradition, I would have said that writing everything in C++ follows the "difficult but efficient" tradition :smile:
Recognising that words (identifiers/names/whatever your language calls them) can represent several different syntactic constructs, and that these tend to change as the language evolves, Scintilla provides the facility for the application (that's Geany) to supply several lists of words, and facilities for the lexer to efficiently recognise if/which list a word is in, so members of those lists can be styled differently. Most lexers happily use this facility, but how many lists they support varies from lexer to lexer. This facility is even (mis)used by Geany to supply lists of typenames detected by the ctags parsers/tagfiles dynamically at runtime for some languages (eg C/C++).[^1]
The prolog lexer supports these lists:
```
static const char *const visualPrologWordLists[] = {
    "Major keywords (class, predicates, ...)",
    "Minor keywords (if, then, try, ...)",
    "Directive keywords without the '#' (include, requires, ...)",
    "Documentation keywords without the '@' (short, detail, ...)",
    0,
};
```
I think @techee only provided for two in the filetype file. Maybe they can all be allowed since there is no ctags parser so none need to be reserved for that. Then the lists might be better arranged.
[^1]: lexers run each keystroke so need to be fast and do little and ignore incomplete syntax, just identify the syntactic entities. Parsers need to understand the language to read declarations so they run mostly after a delay on the basis that if the meatware has stopped typing they will likely be thinking for a while, and so a parse delay is less likely to be noticed, and the code is also more likely to be legal enough to parse.
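The word-list facility described above can be sketched roughly as follows. This is a simplified stand-in, not Lexilla's actual `WordList` class: `Style`, `WordLists`, and `classifyWord` are hypothetical names, and the only keywords used are the samples mentioned in the list descriptions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical style identifiers; a real lexer uses SCE_* constants.
enum class Style { Default, MajorKeyword, MinorKeyword, Directive, DocKeyword };

// One set of words per list, in the order the lexer declares its lists.
using WordLists = std::vector<std::set<std::string>>;

// Return the style of the first list that contains `word`, or Default.
Style classifyWord(const WordLists &lists, const std::string &word) {
    const std::vector<Style> styles = {Style::MajorKeyword, Style::MinorKeyword,
                                       Style::Directive, Style::DocKeyword};
    for (std::size_t i = 0; i < lists.size() && i < styles.size(); ++i) {
        if (lists[i].count(word) != 0) {
            return styles[i];
        }
    }
    return Style::Default;
}
```

With only the first two lists filled in by the filetype file, words that belong in the directive or documentation lists simply fall through to the default style.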
> I'm not sure I understand how Scintilla follows the "inefficient but easy" tradition

Easy to implement:
see how little code is needed to use Scintilla
https://sourceforge.net/p/scintilla/feature-requests/1331/#392c/280c
Inefficient if you count the time spent chasing subtle bugs inside character-counting loops, or the "inextricable suite of if / elseif / goto code" https://sourceforge.net/p/scintilla/feature-requests/1074
> Easy to implement:

On 1331 "Easy" to use, Neil must have had his tongue in cheek, just ignore the thousands of lines of [Scite](https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/) which is just a "test editor", but not easy to write a lexer for a language which was what we are talking about.
1074 is one person's opinion, not gospel truth, only _my_ opinion is gospel truth :grin: [end humility]
"Worse is better" says the user experience is poor to simplify implementation, but I would argue the implementation difficulty of writing lexers in C++ results in a better user experience due to the speed, try opening a big HTML in Geany and in gedit (which uses regex syntax highlighting). But certainly the lexer development experience is worse if you are not a C++ist.
@techee pushed 1 commit.
e897865548c8c604bd9eca7a99672035d32d6063 fixup! Add Prolog filetype support
@rdipardo I merged the vim and vs code keywords into one - does it look good to you?
@techee, I think we're finally done with keywords. :+1:
Now, on to a messier issue I just detected.
In SWI-Prolog, relational operators are prefixed with `@` when used as arguments; for example, in the overloaded form of the very common [`sort` predicate][0].
Visual Prolog interprets the `@` token as the start of a verbatim string, and [Lexilla's lexer][1] imposes that style on every Prolog document:
~~~cpp
if (sc.state == SCE_VISUALPROLOG_DEFAULT) {
    if (sc.Match('@') && isOpenStringVerbatim(sc.chNext, closingQuote)) {
        sc.SetState(SCE_VISUALPROLOG_STRING_VERBATIM);
~~~
![geany_himbeere_swi-pl-sort-4](https://user-images.githubusercontent.com/59004801/165398111-5f5b43db-6ba2-4...)
I guess Geany could always intercept or override the `SCE_VISUALPROLOG_STRING_VERBATIM` lexical class, or simply ignore it. In my view it's the lexer that needs adapting to comply better with the Prolog implementation that users will actually use. That said, I don't think this warrants an upstream patch, since the lexer is just working as advertised.
[0]: https://www.swi-prolog.org/pldoc/doc_for?object=sort/4
[1]: https://github.com/ScintillaOrg/lexilla/blob/843bb9e1688305cd64484f09bac0b0e...
Geany can't really override any styles since the lexer will just put them back each time it runs, and Scintilla, not Geany, controls when that happens and what range of the file is re-lexed.
A style can't be "ignored" but it can be mapped to the default style, and we could do that for `SCE_VISUALPROLOG_STRING_VERBATIM` if it is only used in that situation.
But then in the example in your image that would mean the second `sort(0,` would not be styled as in the first occurrence.
Which is the least of the two evils?
> That said, I don't think this warrants an upstream patch, since the lexer is just working as advertised.

Well clearly someone uses Visual Prolog or they would not have contributed the lexer. Somebody could contribute a patch to Lexilla (controlled by a property) that changed `@` behaviour and any other differences.
> Somebody could contribute a patch to Lexilla (controlled by a property) that changed `@` behaviour and any other differences.

The optional exception for SWI-Prolog would have to be opt-*in* for the sake of editors that already consume this lexer. I'm even less enthusiastic about that idea because the track record of "multi-lexers" is a lousy one. The implementation of JavaScript template strings remains blocked by the need to [recognize `SCE_C_STRINGRAW`][0], and LexJSON spoils NPM project descriptors by insisting that a colon inside a property name must be a [compact IRI][1], even though
> JSON-LD is neither an update nor an extension to JSON. It is a separate specification of a JSON-based schema. Its relation to JSON is the same as, say, the relation of SVG to XML.
https://github.com/ScintillaOrg/lexilla/issues/72#issuecomment-1093150057
If Geany's users want to see their SWI-Prolog files in living colour, find them a SWI-Prolog lexer.
[0]: https://sourceforge.net/p/scintilla/feature-requests/1112/#1052
[1]: https://github.com/rizonesoft/Notepad3/issues/3899#issuecomment-1022493697
I agree that mixed language lexers/parsers tend to be problematic, and I can understand Neil's decision not to work on javascript when it became too complex (shudder, and I guess nobody else has stepped up either).
But this is just a difference between compilers, not languages. There is prior art in having differences in the lexer to accommodate differences in tools, for example LexASM.cpp allows comment characters to be varied to match the differing assemblers `as` which uses `#` and `asm` which uses `;`. It's certainly more likely to happen faster and be accepted sooner than a whole new lexer for the language, but whichever path is used, "somebody's" got to do it, we won't "find" a Scintilla lexer under the doormat.
In the meantime it can be left as is, or the `@` strings mapped to default style, or something else, which is preferable?
@techee pushed 1 commit.
a3a01b49bdfb93bbd6eecf2000feaa6d3413b87b Add an option to disable VisualProlog verbatim strings
> Somebody could contribute a patch to Lexilla (controlled by a property) that changed @ behaviour and any other differences.

I've just pushed a patch here disabling '@' as a literal string start character - it was pretty simple. There seem to be many lexers having configuration options like this so I don't expect there would be a problem upstream. If it works as expected here, I'll send a patch upstream.
@techee,
> If it works as expected here, I'll send a patch upstream

It would be ideal if `@` were also recognized as a SWI-Prolog operator [^1]; otherwise it's a welcome improvement.
![geany-swi-pl-blk-scheme](https://user-images.githubusercontent.com/59004801/165629647-09a1960f-f636-4...)
[^1]: cf. [prolog.vim](https://github.com/yochem/prolog.vim): ![prolog vim](https://user-images.githubusercontent.com/59004801/165629165-95cc805c-ed51-4...)
@techee pushed 1 commit.
abd4f4c1e004e227afc122c62f42b62dc242107b fixup! Add an option to disable VisualProlog verbatim strings
> It would be ideal if @ were also recognized as a SWI-Prolog operator [1](https://github.com/geany/geany/pull/3171#user-content-fn-1-928251d34ffe2578f...); otherwise it's a welcome improvement.

Done.
One more thing we might consider mapping when looking at the vim example is variables. Right now in filetypes.prolog we set
```
variable=default
```
but we could use e.g.
```
variable=parameter
```
(I'm not really sure what the right mapping is in this case.)
> One more thing we might consider mapping when looking at the vim example is variables.

Should be trivial since a Prolog variable must begin with a capital letter or an underscore [^1].
[^1]: https://www.let.rug.nl/bos/lpn//lpnpage.php?pagetype=html&pageid=lpn-htm...
@elextr Do you know how `filetype_extensions.conf` is processed? Both Prolog and Perl use the same extension `*.pl` (together with some other extensions) and I added it to the extensions list assuming that Perl will get higher priority than Prolog because it's above it in the alphabetically sorted list. I think `pl` should stay as the default extension for Perl which seems to work with this change, I'm just not sure if it's something I can rely on (alternatively I could remove `pl` from the prolog extension list).
> Should be trivial since a Prolog variable must begin with a capital letter or an underscore [1](https://github.com/geany/geany/pull/3171#user-content-fn-1-c5a53e0ad69d4a8a7...).

It's not something we need to implement - it's already implemented by the lexer. It's whether we should map it inside `filetypes.prolog` to something else (and what) than `default` or not.
@techee IIRC the extensions are checked by filetype array order, longest wins, first wins if same length.
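The rule as recalled above (array order, longest pattern wins, first wins on a tie) could look roughly like this. It is a sketch of the stated rule only, not Geany's actual detection code: `FileType`, `matches`, and `detectFileType` are hypothetical names, patterns are assumed to have the simple form `*suffix`, and the pattern lists are sample data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified filetype record: a name plus "*suffix" patterns.
struct FileType {
    std::string name;
    std::vector<std::string> patterns;
};

// True if `pattern` (assumed to look like "*suffix") matches `filename`.
bool matches(const std::string &pattern, const std::string &filename) {
    if (pattern.empty() || pattern[0] != '*') return pattern == filename;
    const std::string suffix = pattern.substr(1);
    return filename.size() >= suffix.size() &&
           filename.compare(filename.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Scan filetypes in array order. The longest matching pattern wins;
// the strict '>' keeps the earlier filetype when lengths are equal.
std::string detectFileType(const std::vector<FileType> &types, const std::string &filename) {
    std::string best = "None";
    std::size_t bestLen = 0;
    for (const FileType &ft : types) {
        for (const std::string &p : ft.patterns) {
            if (matches(p, filename) && p.size() > bestLen) {
                best = ft.name;
                bestLen = p.size();
            }
        }
    }
    return best;
}
```

Under this rule a `*.pl` file keeps resolving to Perl only for as long as Perl precedes Prolog in the array, which is the ordering dependence being asked about.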
@rdipardo just to expand on @techee comment above, the way the highlighting works is:
1. the lexer identifies syntactic entities and marks them with an enumeration (the `SCE_VISUALPROLOG_STRING_VERBATIM` values)
2. Geany maps those enumerations to a string name so humans can refer to it
3. Filetype files map that string to a style name, this is the mapping @techee is talking about
4. Colour scheme files map style names to actual styles

This system allows colour schemes (aka themes) to just specify a single set of styles and have similar entities in all languages be styled the same, but it depends on filetypes files mapping the syntactic entities in a sensible manner.
There is no style name for a "variable" because almost no language allows identifiers to be classed as variables (as distinct from functions and types and other stuff) purely syntactically, so no colour schemes will define a style for it. That's why @techee suggested the "parameter" style name.
It is possible to map to a style in the filetype file, but that then won't change when the colour scheme changes, so a dark colour that's OK on light schemes may not be visible on dark schemes. So it's always better to map to existing style names, even if their semantics are slightly different.
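The four-step chain described above can be modelled as three table lookups. This is only an illustration: the tables hold made-up sample data rather than Geany's real mappings, and `Table`, `lookup`, and `resolveColour` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// All tables passed in are illustrative sample data, not Geany's real mappings.
using Table = std::map<std::string, std::string>;

// Look up `key`, falling back to `fallback` when it is absent.
std::string lookup(const Table &t, const std::string &key, const std::string &fallback) {
    const auto it = t.find(key);
    return it == t.end() ? fallback : it->second;
}

// Resolve a lexer style to a colour:
// lexer enumeration -> entry name -> named style -> colour from the colour scheme.
std::string resolveColour(const std::string &lexerStyle, const Table &entryNames,
                          const Table &filetypeMapping, const Table &colourScheme) {
    const std::string entry = lookup(entryNames, lexerStyle, "default");
    const std::string named = lookup(filetypeMapping, entry, "default");
    return lookup(colourScheme, named, "#000000");
}
```

Because the filetype table only names a style, swapping the colour scheme table changes the resulting colour without touching the filetype file. A colour hard-coded at the filetype step would not follow the scheme.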
> It's not something we need to implement - it's already implemented by the lexer.

To be exact, underscores denote an *unused* variable (called a "singleton"). The lexer [maps them][0] to `SCE_VISUALPROLOG_ANONYMOUS`:
~~~cpp
} else if (sc.Match('_')) {
    sc.SetState(SCE_VISUALPROLOG_ANONYMOUS);
~~~
Only words in proper title-case [are styled][1] as `SCE_VISUALPROLOG_VARIABLE`:
~~~cpp
} else if (isUpperLetter(sc.ch)) {
    sc.SetState(SCE_VISUALPROLOG_VARIABLE);
~~~
![vpl-singleton](https://user-images.githubusercontent.com/59004801/165650011-e7d48c7f-b1b7-4...)
[0]: https://github.com/ScintillaOrg/lexilla/blob/b7b401a2b169aa58dffa96888aca96b...
[1]: https://github.com/ScintillaOrg/lexilla/blob/b7b401a2b169aa58dffa96888aca96b...
Upshot: you'll need to merge `SCE_VISUALPROLOG_ANONYMOUS` and `SCE_VISUALPROLOG_VARIABLE` into one style to emulate what Vim does:
![swipl-singleton-prlog vim](https://user-images.githubusercontent.com/59004801/165650732-1ffc9c64-bfa5-4...)
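As a rough illustration of the two branches quoted above and of the merged view attributed to Vim, assuming ASCII identifiers only. `TokenStyle`, `classifyIdentifier`, and `isPrologVariable` are hypothetical names standing in for the `SCE_VISUALPROLOG_*` styles and the lexer's own checks.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-ins for SCE_VISUALPROLOG_ANONYMOUS / SCE_VISUALPROLOG_VARIABLE.
enum class TokenStyle { Anonymous, Variable, Other };

// Mirror the two lexer branches: a leading '_' selects the anonymous style,
// a leading upper-case letter selects the variable style.
TokenStyle classifyIdentifier(const std::string &word) {
    if (word.empty()) return TokenStyle::Other;
    if (word[0] == '_') return TokenStyle::Anonymous;
    if (std::isupper(static_cast<unsigned char>(word[0]))) return TokenStyle::Variable;
    return TokenStyle::Other;
}

// Merged view: both classes count as "a variable".
bool isPrologVariable(const std::string &word) {
    const TokenStyle s = classifyIdentifier(word);
    return s == TokenStyle::Anonymous || s == TokenStyle::Variable;
}
```

In Geany the merge would not need code at all, since mapping both entries to the same named style in the filetype file has the same visual effect.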
> Upshot: you'll need to merge SCE_VISUALPROLOG_ANONYMOUS and SCE_VISUALPROLOG_VARIABLE into one style to emulate what Vim does:

That's possible in the filetype file, see explanation above.
@techee some more thoughts on the extensions thing, IIRC in the past we have removed duplicate extensions because the order is (to the user) non-deterministic, probably best to not add *.pl to prolog.
> > It would be ideal if @ were also recognized as a SWI-Prolog operator [. . .]
>
> Done.

One more thing. SWI-Prolog [uniquely][1] allows backticks to delimit [string literals][0]:

    ?- string_chars(`abc`, Chars).
    Chars = [a, b, c].

I've compared and Vim treats every delimited character sequence as an atom, no matter what the delimiter is, so nothing to be jealous of. But since our lexer already gives a dedicated style to strings, the difference will be apparent:
![geany-pl-sans-bkqouts-himbeere](https://user-images.githubusercontent.com/59004801/165697784-f4c0e8da-5a4f-4...)
I suppose this means yet another lexer property, like this?
~~~diff
diff --git a/scintilla/lexilla/lexers/LexVisualProlog.cxx b/scintilla/lexilla/lexers/LexVisualProlog.cxx
index 72a75882b..4eb5c2ec9 100644
--- a/scintilla/lexilla/lexers/LexVisualProlog.cxx
+++ b/scintilla/lexilla/lexers/LexVisualProlog.cxx
@@ -49,8 +49,10 @@ using namespace Lexilla;
 // Options used for LexerVisualProlog
 struct OptionsVisualProlog {
 	bool verbatimStrings;
+	bool backQuotedStrings;
 	OptionsVisualProlog() {
 		verbatimStrings = true;
+		backQuotedStrings = false;
 	}
 };
 
@@ -66,7 +68,8 @@ struct OptionSetVisualProlog : public OptionSet<OptionsVisualProlog> {
 	OptionSetVisualProlog() {
 		DefineProperty("lexer.visualprolog.verbatim.strings", &OptionsVisualProlog::verbatimStrings,
 			"Set to 0 to disable highlighting verbatim strings using '@'.");
-
+		DefineProperty("lexer.visualprolog.backquoted.strings", &OptionsVisualProlog::backQuotedStrings,
+			"Set to 1 to enable using back quotes (``) to delimit strings.");
 		DefineWordListSets(visualPrologWordLists);
 	}
 };
@@ -447,6 +450,9 @@ void SCI_METHOD LexerVisualProlog::Lex(Sci_PositionU startPos, Sci_Position leng
 			} else if (sc.Match('"')) {
 				closingQuote = '"';
 				sc.SetState(SCE_VISUALPROLOG_STRING);
+			} else if (options.backQuotedStrings && sc.Match('`')) {
+				closingQuote = '`';
+				sc.SetState(SCE_VISUALPROLOG_STRING);
 			} else if (sc.Match('#')) {
 				sc.SetState(SCE_VISUALPROLOG_KEY_DIRECTIVE);
 			} else if (isoperator(static_cast<char>(sc.ch)) || sc.Match('\\')) {
~~~
[0]: https://www.swi-prolog.org/pldoc/man?section=string#:~:text=back-quoted%20te...
[1]: https://wiki.visual-prolog.com/index.php?title=Language_Reference/Built-in_e...
@rdipardo LGBI
I am not a Prologist, but I wonder why SWI-prolog users would use Geany when SWI-prolog has a built-in editor? And why would open source users use SWI-prolog instead of GNU prolog?
> Upshot: you'll need to merge SCE_VISUALPROLOG_ANONYMOUS and SCE_VISUALPROLOG_VARIABLE into one style to emulate what Vim does:

So should we do that? I'm asking you because you seem to be the most proficient Prolog user around. Basically, when you edit `filetypes.prolog` and under the 'styling' section set e.g.:
```
variable=parameter
anonymous=parameter
```
does it look good to you? Or is there a better choice than `parameter` for the mapping?
> One more thing. SWI-Prolog [uniquely](https://wiki.visual-prolog.com/index.php?title=Language_Reference/Built-in_e...) allows backticks to delimit [string literals](https://www.swi-prolog.org/pldoc/man?section=string#:~:text=back-quoted%20te...):

Thanks for the patch, looks good to me, will add it.
> I am not a Prologist, but I wonder why SWI-prolog users would use Geany when SWI-prolog has a built-in editor? And why would open source users use SWI-prolog instead of GNU prolog?

@elextr SWI prolog is open source and as far as I can tell it is the most popular prolog implementation. On the other hand, Visual Prolog, for which the lexer is written is closed source and Windows only and this is the one I think doesn't need much of our attention.
@techee yeah, I wasn't worried about VPL, its got an IDE of its own, although IIUC the compiler is available as freeware like VC++ editions.
But I was just interested in why the GNU prolog was not the "most popular"[^1] open prolog, and like you I used @rdipardo as the most prolog savvy person here, so asked him.
[^1]: measurement method unspecified
> But I was just interested in why the GNU prolog was not the "most popular"[1](https://github.com/geany/geany/pull/3171#user-content-fn-1-9a210e28c532bc078...) open prolog, and like you I used @rdipardo as the most prolog savvy person here[2](https://github.com/geany/geany/pull/3171#user-content-fn-2-9a210e28c532bc078...), so asked him.

It was SWI that I used back at school so it was "most popular" for me :-). But judging from things like the number of stars (the true and only measure of quality, popularity etc.) at github, number of contributors (GNU prolog seems to be more or less a one-man-show) and commit frequency, SWI seems to win.
https://github.com/SWI-Prolog/swipl-devel https://github.com/didoudiaz/gprolog
> But judging from things like the number of stars (the true and only measure of quality, popularity etc.) at github

Oh, of course, case closed :-)
> GNU prolog seems to be more or less a one-man-show

Well, he is more prolific, but SWI-prolog seems to mostly be Jan, but then Lexilla shows as almost all Neil because of the patch not PR process the project has used in the past, so it may be unfair to both prologs.
@techee pushed 1 commit.
44e659da0f04d408156b55ff5ef2b2b07f97daf6 Support backquoted strings in prolog lexer
@techee pushed 1 commit.
9cfa1adbdac2a4f9c8fa41bbd39d1f7749fa9f84 fixup! Add Prolog filetype support
> @techee some more thoughts on the extensions thing, IIRC in the past we have removed duplicate extensions because the order is (to the user) non-deterministic, probably best to not add *.pl to prolog.

Agree, removed.
@techee,
> > Upshot: you'll need to merge SCE_VISUALPROLOG_ANONYMOUS and SCE_VISUALPROLOG_VARIABLE into one style to emulate what Vim does:
>
> So should we do that?

Yes, that looks right. "Interpretive" styles like muted colours for singletons have some value for learners, but Prologists won't need visual aids.
> There seem to be many lexers having configuration options like this so I don't expect there would be a problem upstream. If it works as expected here, I'll send a patch upstream.

Your PR will need [unit tests], and, to separate the mutually incompatible lexer options, those will need to be configured with [conditional properties]. There are no existing Prolog lexer tests, so a language-specific `SciTE.properties` is also needed.
It's [now possible][1] to easily assign lexer properties to a definite number of diverse file types, like this:
~~~
lexer.*.pro;*.P;*.pl=visualprolog
~~~
The test file doesn't have to be a coherent program. Like [AllStyles.rb], it can simply iterate all the lexical classes. The same content can be saved twice, once each with the *.pro and *.pl extensions. The feature under test is that, for example, a sequence like `@"verbatim string"` maps to `SCE_VISUALPROLOG_STRING_VERBATIM` in the *.pro file only, whereas the same text in the *.pl file is lexed with `@` styled as `SCE_VISUALPROLOG_OPERATOR` and the rest as `SCE_VISUALPROLOG_STRING`. Separating properties [by file type][2] can be done like this:
~~~ini
# Visual Prolog properties
match test01.pro
	lexer.visualprolog.verbatim.strings=1
	lexer.visualprolog.backquoted.strings=0

# ISO/SWI-Prolog properties
match test01.pl
	lexer.visualprolog.verbatim.strings=0
	lexer.visualprolog.backquoted.strings=1
~~~
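The behaviour the two test files are meant to pin down for `@` can be sketched as below. This is not the lexer's code: `AtStyle`, `opensVerbatim`, and `styleForAt` are hypothetical names, and `opensVerbatim` is an assumed simplification of the lexer's `isOpenStringVerbatim()` that treats any character other than a letter, digit, underscore, or whitespace as an opening delimiter.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical stand-ins for SCE_VISUALPROLOG_STRING_VERBATIM / SCE_VISUALPROLOG_OPERATOR.
enum class AtStyle { VerbatimString, Operator };

// ASSUMPTION: simplified stand-in for the lexer's isOpenStringVerbatim().
// Any character that is not alphanumeric, '_' or whitespace opens a verbatim string.
bool opensVerbatim(char next) {
    const unsigned char c = static_cast<unsigned char>(next);
    return next != '\0' && next != '_' && !std::isalnum(c) && !std::isspace(c);
}

// Style for an '@' character, given the character that follows it and the
// value of the lexer.visualprolog.verbatim.strings property.
AtStyle styleForAt(char next, bool verbatimStrings) {
    if (verbatimStrings && opensVerbatim(next)) return AtStyle::VerbatimString;
    return AtStyle::Operator;
}
```

With the property set to 1 (the `*.pro` case), `@"` starts a verbatim string; with it set to 0 (the `*.pl` case), the `@` is an operator and whatever follows is lexed on its own terms, so `@>=` in a `sort/4` call is no longer swallowed as a string.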
You can simply copy the keyword groups from SciTE's [default configuration], and assign them all to the *.pro extension. Leftover keyword groups can be filled with SWI-Prolog lexemes and assigned to the *.pl extension.
[unit tests]: https://github.com/ScintillaOrg/lexilla/blob/master/test#readme
[conditional properties]: https://github.com/ScintillaOrg/lexilla/commit/1499a42c12e454e8d6e3c1a6ada1d...
[default configuration]: https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/visualprolog.p...
[AllStyles.rb]: https://github.com/ScintillaOrg/lexilla/blob/master/test/examples/ruby/AllSt...
[0]: https://github.com/ScintillaOrg/lexilla/commit/1499a42c12e454e8d6e3c1a6ada1d...
[1]: https://github.com/ScintillaOrg/lexilla/commit/ef547f878b41e45612c7f14d0a8bd...
[2]: https://github.com/ScintillaOrg/lexilla/commit/2c9e00442f1876254694ebb2fd6ae...
@rdipardo Thanks! Do you have any good source of sample files that could be used as unit tests? It probably involves both VisualProlog samples and SWI prolog samples.
> Do you have any good source of sample files that could be used as unit tests?

Before anything, you'll need to apply this patch: [Fix-EOL-splitting-in-LexVisualProlog.diff.txt](https://github.com/geany/geany/files/8694983/Fix-EOL-splitting-in-LexVisualP...)
Lexilla's testing protocol has dramatically improved over the past year, and it now checks for consistency across EOL modes. A hard failure is raised if the CR and LF of a Windows EOL is in two different styles, e.g.,
![vpl-eol-splits](https://user-images.githubusercontent.com/59004801/168474150-2678acee-8906-4...)
In fairness to the lexer, the problem really comes from the flawed implementation of `StyleContext::atLineEnd` (mitigated in [Lexilla 5.1.0][0]). It doesn't account for newlines longer than a single character [^1] , so you can end up with `<false>CR<true>LF`
[^1]: https://github.com/ScintillaOrg/lexilla/blob/a35a59845e793d9d37d249cf097e71f...
[0]: https://github.com/ScintillaOrg/lexilla/blob/9818085b561cc59243afc3f768db206...
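The consistency rule described above (both characters of a Windows EOL must share one style) can be checked with a few lines of code. This is an illustrative sketch, not Lexilla's test runner: `findSplitEol` is a hypothetical helper and the style numbers in any caller are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return the offset of the first CRLF pair whose CR and LF carry different
// styles, or -1 if every pair is consistent.
// `styles[i]` is the style number assigned to `text[i]`.
long findSplitEol(const std::string &text, const std::vector<int> &styles) {
    assert(text.size() == styles.size());
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        if (text[i] == '\r' && text[i + 1] == '\n' && styles[i] != styles[i + 1]) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

A lexer that leaves a token's style one character too late would style the CR like the preceding token and the LF as default, and a check like this would report the offset of that CR.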
@rdipardo Just curious, don't you want to take over the lexilla part of this PR and submit the necessary changes upstream? I'm not as familiar with either Scintilla or Prolog as you seem to be, and you will probably be a better person to explain the necessary changes to Neil.
> Just curious, don't you want to take over the lexilla part of this PR and submit the necessary changes upstream?

This is ultimately a feature request: #3086. The broken EOL styles are a side issue, yes, but they're blocking the addition of the SWI-Prolog features. In truth, any new features are premature until the lexer functions properly for its original purpose.
I can take care of the EOL styles and the inaugural (Visual Prolog) tests that would require. But I'm not going to assume ownership of a feature request I didn't open.
@rdipardo OK, I'll try to prepare something myself (once I have more time).
@techee
> OK, I'll try to prepare something myself (once I have more time).

No rush: I've already proposed a fix for the EOL splitting: ScintillaOrg/lexilla#83 Best to wait for whatever Lexilla release gets the patch. 5.1.7 is [already in the hopper][0], so a successor version is more likely.
[0]: https://github.com/ScintillaOrg/lexilla/commit/342e86ebed1918d5a7a600b02bb9a...
The path is now clear for the SWI-PL feature request. All that's left is to:

- apply this to your Lexilla fork: [Lexilla-3d02c15.diff.txt](https://github.com/geany/geany/files/8790372/Lexilla-3d02c15.diff.txt)
- drop these files into `test/examples/visualprolog`: [Lexilla-swipl-tests.zip](https://github.com/geany/geany/files/8790379/Lexilla-swipl-tests.zip)
Geany normally does not maintain modifications from Scintilla/Lexilla releases, is the Lexilla patch in the latest release?
> is the Lexilla patch in the latest release?

Committed only yesterday (Aussie time), but still too late for 5.1.7: ScintillaOrg/lexilla@3d02c15f
@rdipardo Many thanks for the provided unit tests and all the other work here, it has been a great time-saver for me. I've just opened this PR upstream:
https://github.com/ScintillaOrg/lexilla/pull/89
@techee pushed 1 commit.
823ee03f5948d9ef79352bd68072a574866ace09 Update to the upstream version of VisualProlog lexer
The lexer changes have been merged upstream so I updated this PR with the upstream VisualProlog lexer.
The upstream Prolog lexer may be undergoing [a substantial refactoring](https://github.com/ScintillaOrg/lexilla/pull/178) soon. I can't seem to get Geany to build for me these days, but maybe now's a good time for @techee to synchronize his topic branch?
Sorry to disappoint you, but any PR should use a version of a lexer that matches the Lexilla version in Geany. If it's a version that is several releases into the future, it might go backward if someone upgrades Lexilla one step.
@techee pushed 2 commits.
76925635112c447415142b31d69abf573efeed29 Add Prolog filetype support
b4e795754e32a59081eb8e819516b988b4893096 Update to the latest Prolog lexer
@rdipardo I just rebased this PR on top of master and also made the necessary updates related to the updated lexer. Since you are probably the most proficient prolog user around, would you have a look at it if it looks alright?
@techee
> would you have a look at it if it looks alright?

It's everything I would want, thanks! But where are the symbols? Or is tag parsing still on the TODO list?
> Or is tag parsing still on the TODO list?

AFAICT from [uctags](https://github.com/universal-ctags/ctags/tree/master) nobody has written a parser for prolog.
> It's everything I would want, thanks! But where are the symbols? Or is tag parsing still on the TODO list?

Not on my todo list :-).
But if you want to create something, I can imagine a super-simple approach of implementing the parser which works just based on the common way prolog files are indented; indented lines would be ignored by the parser and for non-indented lines the string between the start of the line until `(` would be taken as the name of the tag. The erlang parser does something similar:
https://github.com/geany/geany/blob/master/ctags/parsers/erlang.c
Before starting to work on something like that, better to ask at the universal-ctags project whether the maintainer doesn't prefer the parser in the form of a regex-based parser.
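The indentation-based idea outlined above might look something like this. It is only a sketch of the heuristic, not a ctags parser: `extractTags` is a hypothetical function, and skipping `%` comment lines and `:-` directives, as well as collapsing consecutive clauses of the same predicate, are extra assumptions that the description does not mention.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Collect tag names: for each non-indented line, take the text from the
// start of the line up to the first '('.
// ASSUMPTIONS: lines starting with '%' (comments) or ':' (directives) are
// skipped, and consecutive clauses with the same name yield a single tag.
std::vector<std::string> extractTags(const std::string &source) {
    std::vector<std::string> tags;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || std::isspace(static_cast<unsigned char>(line[0])) ||
            line[0] == '%' || line[0] == ':') {
            continue;
        }
        const std::size_t paren = line.find('(');
        if (paren == std::string::npos || paren == 0) continue;
        const std::string name = line.substr(0, paren);
        if (tags.empty() || tags.back() != name) tags.push_back(name);
    }
    return tags;
}
```

The heuristic misses atoms defined without arguments and breaks on unconventionally indented code, which is the trade-off of working from layout rather than from the grammar.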
@b4n @eht16 OK to merge this PR? I'd like to avoid the situation like with the Raku parser (#3169, also updated on top of 2.0 now) where it rots in open PRs and then, before the release, it's too late to get merged.
I went through the various open PRs and open issues providing/requesting support of certain languages because some seem to be a bit neglected:

1. There's this PR adding Prolog support.
2. There's the Raku language support in #3169, which is hopefully in a mergeable state.
3. I created #3647 (Scintilla lexer for JSON) and #3648 (Scintilla lexer for Nim). For JSON there has already been a pull request in the past, both have also been requested in some open issues, and in general I agree that when there's a Scintilla lexer, we should use it.
4. Yesterday I reviewed #3480 (CIL language support), and if the minor comments I had get addressed, I think it should be merged.
5. There are various open PRs and issues asking for Dart, LESS, and SCSS; these could be added as external filetypes. There should be no problem using the C lexer for Dart and the CSS lexer for LESS/SCSS (the CSS lexer supports LESS/SCSS modes).

What do you think?
Maybe move the above comment to a separate issue so it doesn't get lost when this is merged (maybe with ticklist of filetypes to add).
This PR LGBI.
> Maybe move the above comment to a separate issue so it doesn't get lost when this is merged (maybe with ticklist of filetypes to add).

Yep, done in https://github.com/geany/geany/issues/3651
@techee
> Yep, done in #3651

Except for the "ticklist of filetypes to add", or *task*list, which GitHub [can automatically generate][0] for list items with a "tick box" beside them, e.g.
~~~markdown
- [ ] Needs a tick <!-- note the single space -->
- [X] Ticked!
~~~
[0]: https://docs.github.com/en/issues/managing-your-tasks-with-tasklists/creatin...
@techee pushed 2 commits.
b804e83c5314bdea602f2cf3a329d35351163e05 Add Prolog filetype support
63f074a9074995c98dbeefe971ec5e2f4aca8f6e Update to the latest Prolog lexer
I've rebased this poor PR, stuck in limbo for two years, on top of current master. I'll do some more self-review and testing, and if there are no objections, I'd merge it in about a week.
Merged #3171 into master.
github-comments@lists.geany.org