`filetypes.verilog` only includes ancient Verilog-1995 keywords (plus `signed` and `unsigned` for some reason), but is missing plenty of the newer Verilog-2001 and the newest Verilog-2005 keywords, some of them very common, such as `generate`/`endgenerate`, `localparam`, `automatic`...
I have added all those "new" keywords to the `word3=` category to distinguish them from the "classic" keywords from the previous century, although I honestly don't know what's the difference between the two categories.
I have also moved all the keywords that used to be in `word3=` to `word=`, since I didn't see any reason to keep those keywords there (they seem to be related to "variable declarations" one way or another, but then again, so are many of the keywords listed in `word=`). This way, `word=` will be for the "old" keywords, and `word3=` for the "new" ones that "might not work in a Verilog tool made in the previous century".
Finally, I have added `$` to the list of `wordchars=`, because Verilog is special and considers `$` to be an identifier character like `_` (so e.g. `$finish` and `gotabout$350` are valid identifiers).

You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany/pull/4037
-- Commit Summary --
* filetypes.verilog: add Verilog-2005 keywords and $
-- File Changes --
M data/filedefs/filetypes.verilog (6)
-- Patch Links --
https://github.com/geany/geany/pull/4037.patch
https://github.com/geany/geany/pull/4037.diff
(Note that this PR is **unrelated** to **SystemVerilog;** all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)
As a side note, I see that `word2=` includes a bunch of system tasks and functions (`$display`, `$finish`, etc.), but the list is not complete (see [chapter 17 of the Verilog standard](https://www.eg.bucknell.edu/~csci320/2016-fall/wp-content/uploads/2015/08/ve...) for a complete list; there are 122 in total).
Do you think it would be a good idea to add those as well?
(Personally I've never used most of those; the only few I've ever used were properly highlighted.)
Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)
@cousteaulecommandant Thanks. I think none of the Geany developers uses Verilog, so I guess we'll trust your choices :-).
> I have added all those "new" keywords to the word3= category to distinguish them from the "classic" keywords from the previous century, although I honestly don't know what's the difference between the two categories.
I think it would be best to put the new keywords to the "word" list among the original keywords.
The only difference is the coloring of the various lists - see the mapping to the theme colors:

```
word=keyword_1
word2=keyword_2
word3=keyword_3
```
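To make the list-to-style mapping concrete, here is a minimal Python sketch. It assumes the usual `[styling]` and `[keywords]` sections of a Geany filetype file; only the three `word*=keyword_*` style lines come from the comment above, and the keyword values are placeholders, not the real contents of `filetypes.verilog`. Geany itself does not use `configparser`; it only stands in here as a convenient reader, and `style_of` is a hypothetical helper.

```python
import configparser

# Hypothetical excerpt: the style lines are from the thread,
# the keyword values are placeholders chosen for illustration.
EXCERPT = """
[styling]
word=keyword_1
word2=keyword_2
word3=keyword_3

[keywords]
word=always assign begin end
word2=$display $finish
word3=reg wire
"""

cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(EXCERPT)


def style_of(token):
    """Return the named style of the keyword list containing token, or None."""
    for list_name, words in cfg["keywords"].items():
        if token in words.split():
            return cfg["styling"][list_name]
    return None


for token in ("always", "$finish", "wire", "my_signal"):
    print(token, "->", style_of(token))
```

Which list a keyword sits in therefore only decides which named style it receives; whether two styles look different is up to the colour scheme.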
> Finally, I have added $ to the list of wordchars=, because Verilog is special and considers $ to be an identifier character like _ (so e.g. $finish and gotabout$350 are valid identifiers).
Note that Geany currently doesn't respect the `wordchars` characters for everything, and for some things the "old" hard-coded `a-zA-Z0-9_` is still used. Should be fixed eventually.
> (Note that this PR is unrelated to SystemVerilog; all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)
Just a note - the Verilog ctags parser also contains a SystemVerilog parser. We could enable it if you add the SystemVerilog filetype.
> As a side note, I see that word2= includes a bunch of system tasks and functions ($display, $finish, etc.), but the list is not complete (see [chapter 17 of the Verilog standard](https://www.eg.bucknell.edu/~csci320/2016-fall/wp-content/uploads/2015/08/ve...) for a complete list; there are 122 in total).
> Do you think it would be a good idea to add those as well?
> Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)
Depends whether Verilog users would expect them to be highlighted or not. The other "keyword" lists are used this way for other languages too but I don't know what's the common practice for Verilog.
> I have also moved all the keywords that used to be in word3= to word=, since I didn't see any reason to keep those keywords there (they seem to be related to "variable declarations" one way or another, but then again, so are many of the keywords listed in word=)
Some of them appear to be builtin types so maybe it makes sense to keep them in `word3` - other languages do something like that too. Not sure about `reg wire input output inout` though.
> (Note that this PR is unrelated to SystemVerilog; all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)
There's also this PR you might want to check https://github.com/geany/geany/pull/1831 - if it's alright, maybe that one could be merged.
> The only difference is the coloring of the various lists - see the mapping to the theme colors:
Honestly this is why I was confused. In the default theme, both word and word3 are bold navy blue and therefore indistinguishable (in fact I didn't know Geany treated them differently for Verilog until I saw it in filetypes.verilog), whereas word2 is dark red. It makes sense because this way navy blue = keywords but dark red = "standard functions". But it doesn't make sense that some keywords are considered one type and others another; I couldn't figure out which criterion was used to make keywords "word" or "word3" (the latter seem to be "stuff for net/variable declaration", but then again so are many of the keywords declared in "word").
> > I have also moved all the keywords that used to be in word3= to word=
>
> Some of them appear to be builtin types so maybe it makes sense to keep them in `word3` - other languages do something like that too. Not sure about `reg wire input output inout` though.
All of them are "keywords", [according to the standard (Annex B)](https://www.eg.bucknell.edu/~csci320/2016-fall/wp-content/uploads/2015/08/ve...). I tried to figure out a pattern (are they "types"? "Stuff used to declare variables"?) and had actually started making a list of which things were "variable stuff", moving not only reg and wire but also wand, wor, tri, tri0, tri1, realtype, signed, (unsigned), etc., but eventually I realized that (1) I don't know the standard well enough to see which of these make sense to be in a separate category of keywords; (2) the whole idea of "type" is a bit fuzzy, since everything in Verilog is basically "array of bits"; the stuff in that list is more like "qualifiers" mostly; (3) I have no idea what the original idea of having a separate list of keywords was meant for, since it's not mentioned anywhere, so I can't decide what should go in there and what shouldn't; and (4) overall I see no point in having two lists of keywords (neither C nor C++ seem to have those so I can't think of an analogy). So I thought it would be easiest to just put everything in a single category of keywords.
The only thing about "different" keywords I could find was in [section 19.11 of the standard](https://www.eg.bucknell.edu/~csci320/2016-fall/wp-content/uploads/2015/08/ve...), which states that you can actually "disable" all of the new keywords added in Verilog 2001 and 2005, so if your design is super old and uses variable names that have later become reserved keywords, you can disable those so that you can use them as regular identifiers. In this scenario, it might be useful to highlight these in a different color, since they're "keywords that can be forced to stop being keywords". And that's why I thought it might be a good idea to put them in the (now empty) word3= category.
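To illustrate why "keywords that can be forced to stop being keywords" form a natural group, here is a small Python sketch. It assumes the mechanism in question is the standard's `` `begin_keywords `` directive with version specifiers such as `"1364-1995"`; the per-version keyword sets below are deliberately partial and only meant as an illustration, and `is_reserved` is a hypothetical helper.

```python
# Keywords first reserved by each revision (partial, illustrative lists).
ADDED_IN = {
    "1364-1995": {"always", "assign", "begin", "end", "module", "endmodule", "reg", "wire"},
    "1364-2001": {"automatic", "generate", "endgenerate", "genvar", "localparam"},
    "1364-2005": {"uwire"},
}
VERSIONS = ["1364-1995", "1364-2001", "1364-2005"]


def is_reserved(word, version):
    """True if word is reserved under version, i.e. added in it or in an earlier one."""
    upto = VERSIONS[: VERSIONS.index(version) + 1]
    return any(word in ADDED_IN[v] for v in upto)


# An old design may use "generate" as a signal name under 1995 rules,
# while the same word is a keyword from 2001 onwards.
print(is_reserved("generate", "1364-1995"))
print(is_reserved("generate", "1364-2001"))
```

Under this model, everything outside the `"1364-1995"` set is what would go into the separate `word3=` list.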
> Note that Geany currently doesn't respect the `wordchars` characters for everything, and for some things the "old" hard-coded `a-zA-Z0-9_` is still used. Should be fixed eventually.
Honestly I don't quite understand what `wordchars` does, but it felt right to add the `$` there.
> > Do you think it would be a good idea to add those as well?
> >
> > Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)
>
> Depends whether Verilog users would expect them to be highlighted or not. The other "keyword" lists are used this way for other languages too but I don't know what's the common practice for Verilog.
Personally I think it's good to see `$finish` and `$display` etc highlighted as "this is something important", despite them not being keywords strictly speaking, so the current behavior is good. My question was whether to add the rest of functions starting with `$` that are in the standard. I have no idea what most of those functions do, to be honest.
> Just a note - the Verilog ctags parser also contains a SystemVerilog parser. We could enable it if you add the SystemVerilog filetype.
That's the plan, yes. The Verilog lexer also seems to support SystemVerilog, so I've got the two hardest parts covered. I already have a commit that adds the language, and a PR almost ready.
> > I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)
>
> There's also this PR you might want to check #1831 - if it's alright, maybe that one could be merged.
That one only adds keywords though; proper handling of SystemVerilog may require a bit more. (And the parser and the lexer are already made so why not?)
> I tried to figure out a pattern (are they "types"? "Stuff used to declare variables"?) and had actually started making a list of which things were "variable stuff", moving not only reg and wire but also wand, wor, tri, tri0, tri1, realtype, signed, (unsigned), etc., but eventually I realized that (1) I don't know the standard well enough to see which of these make sense to be in a separate category of keywords; (2) the whole idea of "type" is a bit fuzzy, since everything in Verilog is basically "array of bits"; the stuff in that list is more like "qualifiers" mostly; (3) I have no idea what the original idea of having a separate list of keywords was meant for, since it's not mentioned anywhere, so I can't decide what should go in there and what shouldn't; and (4) overall I see no point in having two lists of keywords (neither C nor C++ seem to have those so I can't think of an analogy). So I thought it would be easiest to just put everything in a single category of keywords.
It's really up to the contributor of filetype support. For instance, Java uses

```ini
primary=abstract assert break case catch class const continue default do else enum exports extends final finally for goto if implements import instanceof interface module native new non-sealed open opens package permits private protected provides public record requires return sealed static strictfp super switch synchronized this throw throws to transient transitive try uses var volatile when while with yield true false null
secondary=boolean byte char double float int long short void
# documentation keywords for javadoc
doccomment=author deprecated exception param return see serial serialData serialField since throws todo version
typedefs=
```

Those "secondary" keywords are normal keywords, just defining primitive types and are highlighted differently which kind of makes sense. But if you feel there's no such analogy in Verilog, it's probably best to have them all in one group.
> Honestly I don't quite understand what wordchars does, but it felt right to add the $ there.
I have kind of the same feeling ;-). See https://github.com/geany/geany/issues/4038
> Personally I think it's good to see $finish and $display etc highlighted as "this is something important", despite them not being keywords strictly speaking, so the current behavior is good. My question was whether to add the rest of functions starting with $ that are in the standard. I have no idea what most of those functions do, to be honest.
Then I'd say leave it as it is.
Regarding the keywords: as @techee said, the mapping of the keyword meaning in Geany to the one in Scintilla is sometimes kind of arbitrary.
While you are at it, Scintilla supports (in the meantime) even more keyword lists: https://github.com/ScintillaOrg/lexilla/blob/master/lexers/LexVerilog.cxx#L1...
Maybe it also helps to have a look at the SciTE configuration: https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/verilog.proper...
> Those "secondary" keywords are normal keywords, just defining primitive types and are highlighted differently which kind of makes sense. But if you feel there's no such analogy in Verilog, it's probably best to have them all in one group.
There is, but it's complicated. Basically there are a few "data types" proper (think C's char, int, float), and a ton of modifiers/specifiers describing how the signal behaves "physically" (think static, const, volatile, extern, etc).

Option 1: I think I can make a distinction between "keywords that are used to declare signals/variables/constants" and "keywords that are not used to declare signals/variables/constants". If that's common practice I can give it a try; I think it can be done.

Option 2 is what I have now: separating keywords into "old keywords" and "newer keywords from a newer standard that a very old file might be using for something else", but I don't think that'll be very useful.

Option 3 is to cram everything into a single category, which I think is what C does. In any case, the default theme uses the same color for both so it's hard for me to see the difference.
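As a rough sketch of how the split-by-role idea differs from the single-category one (option 3), the following Python snippet builds the `word=` and `word3=` values from one keyword set. The sets are an illustrative subset made up of keywords named in this thread, not the grouping that ended up in the PR, and `keyword_lists` is a hypothetical helper.

```python
# Illustrative subset only; the real lists in filetypes.verilog are longer.
DECLARATION = {"reg", "wire", "input", "output", "inout", "wand", "wor",
               "tri", "tri0", "tri1", "signed", "localparam"}
ALL_KEYWORDS = DECLARATION | {"always", "begin", "end", "generate",
                              "endgenerate", "automatic", "module", "endmodule"}


def keyword_lists(option):
    """Return (word, word3) as space-separated strings for option 1 or 3."""
    if option == 1:      # split by role: declarations go to word3
        word, word3 = ALL_KEYWORDS - DECLARATION, DECLARATION
    elif option == 3:    # single category: word3 stays empty
        word, word3 = ALL_KEYWORDS, set()
    else:
        raise ValueError("only options 1 and 3 are sketched here")
    return " ".join(sorted(word)), " ".join(sorted(word3))


for option in (1, 3):
    word, word3 = keyword_lists(option)
    print(f"option {option}:")
    print(f"  word={word}")
    print(f"  word3={word3}")
```

Option 2 would need per-standard information instead of a role set, so it is left out of this sketch.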
> > whether to add the rest of functions starting with $ that are in the standard. I have no idea what most of those functions do, to be honest.
>
> Then I'd say leave it as it is.
On second thought, I think I'd better add those as well. There are a few I use often that aren't listed (`$clog2` and `$signed` are super common, for example), and there's a handy list of all these *"system tasks and functions that are considered part of the Verilog HDL"* in the standard so this is a no-brainer.
----
> While you are at it, Scintilla supports (in the meantime) even more keyword lists: https://github.com/ScintillaOrg/lexilla/blob/master/lexers/LexVerilog.cxx#L1...
Those look pretty much like what we already have though.

* *Primary keywords and identifiers* -> `word=`
* *Secondary keywords and identifiers* -> `word3=`
* *System Tasks* -> `word2=`
* *User defined tasks and identifiers* -> guess this is meant to allow users to add their own keywords (or to detect them automatically from the files?)
* *Documentation comment keywords* -> I guess this is `docComment=` (although Geany doesn't seem to support special `///` or `/**` comments in Verilog like it does for C, so no `@todo`, `\param`, etc.)
* *Preprocessor definitions* -> this is the only one that I'm not sure about. Not sure if this refers to macro definitions (`` `define A_MACRO 42 ``) or macros per se (`` x = `A_MACRO; ``), or "preprocessor stuff in general" (`` `whatever ``), or "specifically the macros the user has defined", or "only the standard macros".
> Maybe it also helps to have a look at the SciTE configuration: https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/verilog.proper...
Apparently they cram all the keywords as keywords1 and leave keywords2 blank (my "Option 3"), keywords3 is the $ stuff (`word2=`) but more complete, keywords4 is blank as well, and keywords5 is not used for "documentation comments" but for "pragma comments" (like `//synopsys translate_off`), which is a nonstandard feature used by some implementations.
@cousteaulecommandant pushed 1 commit.
fbad48d33240f4548998fa02cb17758cbf8fdbb9 filetypes.verilog: add Verilog-2005 keywords and $
For the sake of "keeping the things the way they were", I've gone with option 1, and tried my best at grouping the keywords into things that are used for declaring signals/constants, and things that are not. I think the result is good.
I have also decided to add all the standard system functions/tasks.
This is how it looks now (with `word3` highlighted in a lighter blue). Signal declarations use keywords of one type, other control structures use the other. I think it looks nice. ![geany_verilog_keywords](https://github.com/user-attachments/assets/a5fee803-711f-4168-ac8a-03a653cdf...)
I've pushed the new version to this PR (and rebased it on top of master, while I was at it).
@techee commented on this pull request.
    @@ -37,7 +37,7 @@ mime_type=text/x-verilog
     # these characters define word boundaries when making selections and searching
     # using word matching options
    -#wordchars=_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
    +wordchars=_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789$
I just went through other filetypes and none of them defines `wordchars` (all of them have it commented-out). I think it's meant to be an example that can be overridden by users. So maybe best to do the same in this case too.
Apart from the `wordchars` comment, this PR looks good to me.
@cousteaulecommandant commented on this pull request.
OK, as you wish :)
I did notice that a few of them (R, CSS, Smalltalk, Raku) use a different list though, although it's commented out as you said. In any case, this only seems to affect what happens when you double-click a word or select "Find only whole words", but for instance Ctrl-click doesn't work if I declare a signal called `foo$bar`. Not too useful overall I guess. I'll leave it commented out for consistency with other formats as you request, but if it's OK I'll leave the $ in there.
(This field doesn't seem to be widely used across formats though. For example, LaTeX doesn't touch it, despite `_` not being a wordchar and `@` often being one, at least in packages.)
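For what it's worth, the double-click / whole-word effect described above can be modelled in a few lines of Python. This is only a simplified model of "select the run of word characters around the cursor", not Geany's or Scintilla's actual code; `word_at` and `DEFAULT_WORDCHARS` are hypothetical names.

```python
DEFAULT_WORDCHARS = "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def word_at(text, pos, wordchars=DEFAULT_WORDCHARS):
    """Return the run of wordchars surrounding index pos (empty string if none)."""
    if not (0 <= pos < len(text)) or text[pos] not in wordchars:
        return ""
    start = pos
    while start > 0 and text[start - 1] in wordchars:
        start -= 1
    end = pos + 1
    while end < len(text) and text[end] in wordchars:
        end += 1
    return text[start:end]


line = "assign foo$bar = 1'b0;"
# Index 8 is the first "o" of the signal name.
print(word_at(line, 8))                            # default character set
print(word_at(line, 8, DEFAULT_WORDCHARS + "$"))   # with $ added
```

With the default set the selection stops at the `$`; with `$` added, the whole `foo$bar` is picked up.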
@cousteaulecommandant pushed 1 commit.
ffc8f62b5d94625e614c53784bc1f3a7b00140d7 filetypes.verilog: add Verilog-2005 keywords
@techee commented on this pull request.
> I did notice that a few of them (R, CSS, Smalltalk, Raku) use a different list though, although it's commented out as you said. In any case, this only seems to affect what happens when you double-click a word or select "Find only whole words", but for instance Ctrl-click doesn't work if I declare a signal called foo$bar. Not too useful overall I guess.
Yes, I think it should eventually be improved in Geany, see https://github.com/geany/geany/issues/4038
...Sorry; I didn't realize I have to RESOLVE the code reviews after I fix them. :sweat_smile:
> ...Sorry; I didn't realize I have to RESOLVE the code reviews after I fix them. 😅
Nah, it doesn't matter :-)
Alright, let's merge this one too. Thanks for your contributions!
Merged #4037 into master.
github-comments@lists.geany.org