Skip to content

Conversation

digitcrusher
Copy link
Contributor

  1. Shortens:
[(field name: _ @here)
 (dot_index_expression field: _ @here)
 (method_index_expression method: _ @here)]

to just:

(member_identifier) @here

Such a construction is used quite heavily in my highlights queries since I apply a different highlight name to each of: fields, type fields, constant fields, function fields, metafields, and metamethods. From what I remember, my queries from before this change were about 20 lines longer.

  1. Reduces interdependencies between highlights queries resulting in much more maintainable code. Previously, any queries for fields had to come strictly after the queries for variables, if you wanted to highlight them differently or if you wanted to highlight built-in functions not only when they are called, because a query for (identifier) matched both variables and fields. The latter use case is not uncommon, see: pcall(require, 'maybe_not_there') As a proof of concept, I did exactly that: I extended the built-in functions query from just call expressions to any position in Lua code without shuffling anything around, wrote some tests (included in this commit) and they just worked. No questions asked.
  2. Fixes a bug where foo in { [foo] = 1 } was highlighted as a field name, even though it's just a variable, whose run-time value is the real field name.

@MunifTanjim
Copy link
Collaborator

I don't agree with these changes. I don't think these provide much value. Also I don't completely understand the reason these changes are needed.

But if @clason or @ObserverOfTime feels otherwise, I can look deeper into it.

P.S. This grammar is based on the extended bnf here: https://lua.org/manual/5.4/manual.html#9

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

Yeah, I agree -- these changes seem driven by philosophical differences on how a grammar should be written (basically, write the parser for the queries rather than vice versa).

And these query changes look quite breaking, which is not something I want to handle for a widely-used parser that should be (and is) stable.

@digitcrusher
Copy link
Contributor Author

Leaving this PR unmerged will leave the aforementioned bug (point 3) unfixed. I opened #61 for this, in case you have a different solution in mind.

Note that making field identifiers into a separate node is nothing out of the ordinary and is already done in the official C Tree-sitter grammar.

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

Yes, and I have opinions on that grammar. The tree-sitter org grammars are there for historical reasons, not because they are in any way "blessed" or official.

Please respect that different languages (and projects) have different approaches and philosophies; writing (performant) tree-sitter grammars is more of an art than a science.

Fixes a bug where foo in { [foo] = 1 } was highlighted as a field name, even though it's just a variable, whose run-time value is the real field name.

That can easily be handled on the query level (it's captured as both, with the field name simply having priority due to pattern order).

@digitcrusher
Copy link
Contributor Author

That can easily be handled on the query level (it's captured as both, with the field name simply having priority due to pattern order).

Could you please provide a sample query or a list of node names to match against? I don't see what information from the parse tree below I could use to distinguish foo and bar in t = { foo = 1, [bar] = 2 }:

(chunk [0, 0] - [1, 0]
  (assignment_statement [0, 0] - [0, 26]
    (variable_list [0, 0] - [0, 1]
      name: (identifier [0, 0] - [0, 1]))
    (expression_list [0, 4] - [0, 26]
      value: (table_constructor [0, 4] - [0, 26]
        (field [0, 6] - [0, 13]
          name: (identifier [0, 6] - [0, 9])
          value: (number [0, 12] - [0, 13]))
        (field [0, 15] - [0, 24]
          name: (identifier [0, 16] - [0, 19])
          value: (number [0, 23] - [0, 24]))))))

Field name identifiers and field names expressions are not the same entity and it's necessary to highlight them differently to ensure correct and consistent syntax highlighting.

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

( "[" . (identifier) @property . "]" )

@digitcrusher
Copy link
Contributor Author

Thanks! In case anyone encounters the same problem, the full set of highlight queries to distinguish the two aforementioned cases is:

(field name: (identifier) @field)
(field "[" . name: (identifier) @variable . "]")

I have to admit, it's not that longer than what you get with my PR: (field (member_identifier) @field)

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

I am not saying it's pretty, mind you, just that it can be done without breaking query changes -- which, again, I would like to avoid if possible. But in the end it's @MunifTanjim's decision whether this change is worth it.

@MunifTanjim
Copy link
Collaborator

MunifTanjim commented Jul 26, 2025

So I read #61, and this was indeed an oversight when I initially wrote the grammar.

I think, we should introduce a bracketed_expression node, similar to parenthesized_expression. I'm not sure if that'll cause a breaking change or not... It shouldn't.

diff --git a/grammar.js b/grammar.js
index e27e1ae..af69baf 100644
--- a/grammar.js
+++ b/grammar.js
@@ -440,16 +440,14 @@ module.exports = grammar({
     _prefix_expression: ($) =>
       prec(1, choice($.variable, $.function_call, $.parenthesized_expression)),
 
-    // var ::=  Name | prefixexp [ exp ] | prefixexp . Name
+    // var ::=  Name | prefixexp '[' exp ']' | prefixexp . Name
     variable: ($) =>
       choice($.identifier, $.bracket_index_expression, $.dot_index_expression),
-    // prefixexp [ exp ]
+    // prefixexp '[' exp ']'
     bracket_index_expression: ($) =>
       seq(
         field('table', $._prefix_expression),
-        '[',
-        field('field', $.expression),
-        ']'
+        field('name', $.bracketed_expression)
       ),
     // prefixexp . Name
     dot_index_expression: ($) =>
@@ -482,6 +480,8 @@ module.exports = grammar({
 
     // '(' exp ')'
     parenthesized_expression: ($) => seq('(', $.expression, ')'),
+    // '[' exp ']'
+    bracketed_expression: ($) => seq('[', $.expression, ']'),
 
     // tableconstructor ::= '{' [fieldlist] '}'
     table_constructor: ($) => seq('{', optional($._field_list), '}'),
@@ -493,9 +493,7 @@ module.exports = grammar({
     field: ($) =>
       choice(
         seq(
-          '[',
-          field('name', $.expression),
-          ']',
+          field('name', $.bracketed_expression),
           '=',
           field('value', $.expression)
         ),

But for some reason I'm getting this error when I try to install the compiled parser in neovim:

Error in BufReadPost Autocommands for "*":
Lua callback: ...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:36: BufReadPost Autocommands for "*"..FileType Autocommands for "*"..function <SNR>1_LoadFTPlugin[20]..script /opt/ho
mebrew/Cellar/neovim/HEAD-8bccd07/share/nvim/runtime/ftplugin/lua.lua: Vim(runtime):E5113: Lua chunk: ...m/HEAD-8bccd07/share/nvim/runtime/lua/vim/treesitter.lua:460: Parser could not be crea
ted for buffer 11 and language "lua"
stack traceback:
        [C]: in function 'assert'
        ...m/HEAD-8bccd07/share/nvim/runtime/lua/vim/treesitter.lua:460: in function 'start'
        .../neovim/HEAD-8bccd07/share/nvim/runtime/ftplugin/lua.lua:2: in main chunk
        [C]: in function 'nvim_cmd'
        ...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:36: in function <...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:35>
        [C]: in function 'pcall'
        vim/shared.lua: in function <vim/shared.lua:0>
        [C]: in function '_with'
        ...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:35: in function <...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:10>
stack traceback:
        [C]: in function '_with'
        ...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:35: in function <...llar/neovim/HEAD-8bccd07/share/nvim/runtime/filetype.lua:10>

Okay, it was breaking because of this query: https://github.com/nvim-treesitter/nvim-treesitter/blob/42fc28ba918343ebfd5565147a42a26580579482/queries/lua/injections.scm#L189-L202

@MunifTanjim
Copy link
Collaborator

@clason are you okay with the changes above?

Also (not related to this repo), should a bad query prevent the parser from loading completely in neovim? 🤔

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

That one is fairly low value; if we can't adapt it I would be fine removing it.

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

Pro tip: use ts_query_ls check $directory to verify queries work with the grammar.

@clason
Copy link
Collaborator

clason commented Jul 26, 2025

@clason are you okay with the changes above?

Also (not related to this repo), should a bad query prevent the parser from loading completely in neovim? 🤔

Yes, since the parser is loaded together with the queries, not in isolation. (Error messages can be improved, but it's difficult with the current code.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants