This commit changes JSON from a custom filetype to one with its own lexer, as requested in https://github.com/geany/geany/issues/384#issuecomment-583869454. The lexer is taken from Scintilla's upstream (version 3.10.4).
This has the following effects: - JavaScript comments (//, /* */) are no longer highlighted as comments, since JSON doesn't allow them - Text outside of quotes, except keywords and numbers, is now considered an error, and highlighted as such (in red)
It should also allow keys and string values to be highlighted in different colours, as in many other editors, but I couldn't get that to work.
Stopping comments from being highlighted is the main motivation for this - highlighting them misleads the user into thinking that JSON allows them. You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany/pull/2454
-- Commit Summary --
* Use Scintilla's JSON lexer for JSON instead of the JavaScript/C++ lexer
-- File Changes --
M data/Makefile.am (2) D data/filedefs/filetypes.JSON.conf (17) A data/filedefs/filetypes.json (32) M scintilla/Makefile.am (1) A scintilla/lexers/LexJSON.cxx (498) M scintilla/scintilla_changes.patch (2) M scintilla/src/Catalogue.cxx (1) M src/editor.c (1) M src/filetypes.c (1) M src/filetypes.h (1) M src/highlighting.c (2) M src/highlightingmappings.h (25)
-- Patch Links --
https://github.com/geany/geany/pull/2454.patch https://github.com/geany/geany/pull/2454.diff
Isn't it common that JSON parsers accept comments as an extension? I know many also don't, but unless you want to split up into two separate JSON-pure and JSON-with-comments file types it might be better to keep in the comment highlighting
@ell1e what is "common"? This is hard to define in a general way and the JSON standard does *not* allow comments.
@d5l6 I wonder if it wouldn't be better to keep the custom filetype and just add the Scintilla JSON lexer to Geany and update the custom filetype definition to use it. Otherwise we need also a migration path for existing users or they will end up with two JSON filetypes. Not sure which one would be used then.
@eht16 just google "JSON for comments", apart from posts telling you it's not allowed you'll find quite a bunch of different parsers to enable this. And Visual Studio Code (which is pretty popular) uses this style for all its configs, too. I don't have proper statistics, but I think it should be easy enough to see that a non-trivial amount of people uses this
@ell1e I still think a dedicated JSON filetype should (at least by default) not allow comments. But we don't even need to make a hard decision here as the Scintilla JSON lexer has a property to toggle highlighting of comments: `lexer.json.allow.comments`.
So, one can add: ``` [lexer_properties] lexer.json.allow.comments=1 ``` to the filetype definition and so enable highlighting of comments if desired. @d5l6 this could go as commented and documented option in the filetype definition (for reference see https://sourceforge.net/p/scintilla/code/ci/default/tree/lexers/LexJSON.cxx#...).
I think the best way could be to have a JSON with comments file type for the lenient parsing if the core JSON type to allow it. A global option as you suggest will basically force anyone to disable strict mode if they have just have just 1+ files that rely on the expanded syntax, which IMHO turns A option useless. It's not like usually a dev's choice, really, what files they have to deal with. So with a global option you might as well just leave things as they are right now (which I think wouldn't be the worst thing to do, actually)
The JSON lexer setting is global for the filetype, so there would need to be two filetypes to handle both commentable JSON and uncommentable JSON. Changing a filetype setting during editing is simply not usable. But AFAIK there is nothing to distinguish between the variants so different filetypes can be selected. The whole file would have to be searched for comments before it was distinguished if it the file is a JSON with comments or a JSON without comments.
And then just because a file has no comments when opened doesn't say the user doesn't want to add them if the tool they are feeding the JSON to allows, and Geany doesn't know that tool.
As a general principle the highlighting lexer and even the tags parser are not meant to be syntax checkers, they should be as inclusive as possible unless its possible to distinguish which variant is intended. For example the C++ lexer and parser don't distinguish if the file is C++98 or C++20 nor the C lexer/parser C89 to C21.
But the use the specific JSON lexer is still useful if it has other benefits, but it should be more inclusive by default, not less.
Changing a filetype setting during editing is simply not usable.
For what it's worth, I think that is what VS Code does: it attempts to auto-detect (not sure if just based on file extension or actually content) but t the docs suggest it shows the resulting filetype somewhere prominent, I think on the statusbar, where it can easily be changed. So it kind of expects you to manually switch during editing if it gets it wrong I believe. But I personally also prefer the JSON highlighter to just not be as strict in general, so that I don't need to do manual toggling if it gets it wrong
@ell1e I still think a dedicated JSON filetype should (at least by default) not allow comments.
But we don't even need to make a hard decision here as the Scintilla JSON lexer has a property to toggle highlighting of comments: lexer.json.allow.comments...
+1
The JSON lexer setting is global for the filetype, so there would need to be two filetypes to handle both commentable JSON and uncommentable JSON.
This seems OK. We could ship the proper JSON filetype and any users who want to derive some customized version of the language, could simply define a custom filetype inheriting from the real JSON filetype but with their overrides (such as the property @eht16 mentioned).
It's easy enough to switch filetypes at runtime, or add a new mapping for the custom JSON-like filetype to `filetype_mappings.conf` to have it auto-detected.
I would recommend to ship a `JSON with comments` filetype from the start because it does seem to be somewhat common to me (mostly due to MS's VS Code which popularized). Being able to extend it is always cool, but obviously the user experience is better if the right file mode is already available in the built-in list. With that given, it seems reasonable to then make the "regular" `JSON` type be strict about comments being not allowed and all that
Fwiw, "JSON with comments" usually means allowing trailing commas in objects and lists, comment type `// line comment`, comment type `/* block comment */`, and sometimes comment type `# alternate line comment`. I would recommend accepting these all as valid in a `JSON with comments` file type if you choose to provide one
would recommend to ship a JSON with comments filetype from the start because it does seem to be somewhat common to me (mostly due to MS's VS Code which popularized)...
Sure, like "JSON (strict)" filetype using the real JSON lexer, and then a custom filetype "JSON (loose)" inheriting from the Javascript lexer like the current JSON custom filetype does.
Sounds good! Although I'd call it "JSON (strict)" & "JSON (with comments)", because that is how most people seem to refer to the looser interpretation on the web. But that's just my personal preference, obviously
@d5l6 , update `LexJSON.cxx`, please: [Scintilla 3.21.1](https://github.com/geany/geany/commit/8f0909685dad46096ecf92152e82d309b0852b...) has `LexJSON.cxx` with small changes.
For what it's worth, I've now seen JSONC files (with comments) stored in `.json` multiple times, e.g. the launch settings in any VS Code project. So while technically a different file type, I do wonder how you'd want to detect it based on the file ending, making the idea of two different parsers for this IMHO quite impractical.
The thing is, I want the syntax highlighting to mostly make my file readable, not to teach me how "wrong" it is when it is actually correct given the actual file type, and I would assume that applies to many users. And it seems like there is no way to safely detect here if a comment in a `.json` file is an actual mistake, or just a quirk of whatever tool wrote/reads that file. So I don't think it makes sense to attempt it, or to mark it as "mistake" by default.
github-comments@lists.geany.org