-
Notifications
You must be signed in to change notification settings - Fork 30
feat: discriminate member identifiers #60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
I don't agree with these changes. I don't think these provide much value. Also I don't completely understand the reason these changes are needed. But if @clason or @ObserverOfTime feels otherwise, I can look deeper into it. P.S. This grammar is based on the extended bnf here: https://lua.org/manual/5.4/manual.html#9 |
Yeah, I agree -- these changes seem driven by philosophical differences on how a grammar should be written (basically, write the parser for the queries rather than vice versa). And these query changes look quite breaking, which is not something I want to handle for a widely-used parser that should be (and is) stable. |
Leaving this PR unmerged will leave the aforementioned bug (point 3) unfixed. I opened #61 for this, in case you have a different solution in mind. Note that making field identifiers into a separate node is nothing out of the ordinary and is already done in the official C Tree-sitter grammar. |
Yes, and I have opinions on that grammar. The tree-sitter org grammars are there for historical reasons, not because they are in any way "blessed" or official. Please respect that different languages (and projects) have different approaches and philosophies; writing (performant) tree-sitter grammars is more of an art than a science.
That can easily be handled on the query level (it's captured as both, with the field name simply having priority due to pattern order). |
Could you please provide a sample query or a list of node names to match against? I don't see what information from the parse tree below I could use to distinguish (chunk [0, 0] - [1, 0]
(assignment_statement [0, 0] - [0, 26]
(variable_list [0, 0] - [0, 1]
name: (identifier [0, 0] - [0, 1]))
(expression_list [0, 4] - [0, 26]
value: (table_constructor [0, 4] - [0, 26]
(field [0, 6] - [0, 13]
name: (identifier [0, 6] - [0, 9])
value: (number [0, 12] - [0, 13]))
(field [0, 15] - [0, 24]
name: (identifier [0, 16] - [0, 19])
value: (number [0, 23] - [0, 24])))))) Field name identifiers and field names expressions are not the same entity and it's necessary to highlight them differently to ensure correct and consistent syntax highlighting. |
|
Thanks! In case anyone encounters the same problem, the full set of highlight queries to distinguish the two aforementioned cases is: (field name: (identifier) @field)
(field "[" . name: (identifier) @variable . "]") I have to admit, it's not that longer than what you get with my PR: |
I am not saying it's pretty, mind you, just that it can be done without breaking query changes -- which, again, I would like to avoid if possible. But in the end it's @MunifTanjim's decision whether this change is worth it. |
So I read #61, and this was indeed an oversight when I initially wrote the grammar. I think, we should introduce a diff --git a/grammar.js b/grammar.js
index e27e1ae..af69baf 100644
--- a/grammar.js
+++ b/grammar.js
@@ -440,16 +440,14 @@ module.exports = grammar({
_prefix_expression: ($) =>
prec(1, choice($.variable, $.function_call, $.parenthesized_expression)),
- // var ::= Name | prefixexp [ exp ] | prefixexp . Name
+ // var ::= Name | prefixexp '[' exp ']' | prefixexp . Name
variable: ($) =>
choice($.identifier, $.bracket_index_expression, $.dot_index_expression),
- // prefixexp [ exp ]
+ // prefixexp '[' exp ']'
bracket_index_expression: ($) =>
seq(
field('table', $._prefix_expression),
- '[',
- field('field', $.expression),
- ']'
+ field('name', $.bracketed_expression)
),
// prefixexp . Name
dot_index_expression: ($) =>
@@ -482,6 +480,8 @@ module.exports = grammar({
// '(' exp ')'
parenthesized_expression: ($) => seq('(', $.expression, ')'),
+ // '[' exp ']'
+ bracketed_expression: ($) => seq('[', $.expression, ']'),
// tableconstructor ::= '{' [fieldlist] '}'
table_constructor: ($) => seq('{', optional($._field_list), '}'),
@@ -493,9 +493,7 @@ module.exports = grammar({
field: ($) =>
choice(
seq(
- '[',
- field('name', $.expression),
- ']',
+ field('name', $.bracketed_expression),
'=',
field('value', $.expression)
), But for some reason I'm getting this error when I try to install the compiled parser in neovim:
Okay, it was breaking because of this query: https://github.com/nvim-treesitter/nvim-treesitter/blob/42fc28ba918343ebfd5565147a42a26580579482/queries/lua/injections.scm#L189-L202 |
@clason are you okay with the changes above? Also (not related to this repo), should a bad query prevent the parser from loading completely in neovim? 🤔 |
That one is fairly low value; if we can't adapt it I would be fine removing it. |
Pro tip: use |
Yes, since the parser is loaded together with the queries, not in isolation. (Error messages can be improved, but it's difficult with the current code.) |
to just:
Such a construction is used quite heavily in my highlights queries since I apply a different highlight name to each of: fields, type fields, constant fields, function fields, metafields, and metamethods. From what I remember, my queries from before this change were about 20 lines longer.
(identifier)
matched both variables and fields. The latter use case is not uncommon, see:pcall(require, 'maybe_not_there')
As a proof of concept, I did exactly that: I extended the built-in functions query from just call expressions to any position in Lua code without shuffling anything around, wrote some tests (included in this commit) and they just worked. No questions asked.foo
in{ [foo] = 1 }
was highlighted as a field name, even though it's just a variable, whose run-time value is the real field name.