-
Notifications
You must be signed in to change notification settings - Fork 203
Description
Hi,
I am investigating on an issue reported on pglast, where the stmt_location of the RawStmt are not consistent, when feeding a multi-statements SQL to pg_query_parse() function. I could not figure either a reason, or a workaround.
A little detail: pglast uses the location/stmt_location only when it asked to preserve the comments in the statement. In that case, it first collects all the comments using pg_query_scan(), and then it does its best to emit them as soon as it reprints (deparse, in libpg_query parlance) each node, roughly comparing the node position with the one of the collected comments.
Consider the reported sample SQL:
-- First comment
SELECT * FROM first_table;
-- Second comment
SELECT * FROM second_table;Adding it to examples/scan.c, I get the following:
version: 170000, tokens: 12, size: 114
"-- First comment" = [ 0, 16, SQL_COMMENT, NO_KEYWORD ]
"SELECT" = [ 17, 23, SELECT, RESERVED_KEYWORD ]
"*" = [ 24, 25, ASCII_42, NO_KEYWORD ]
"FROM" = [ 26, 30, FROM, RESERVED_KEYWORD ]
"first_table" = [ 31, 42, IDENT, NO_KEYWORD ]
";" = [ 42, 43, ASCII_59, NO_KEYWORD ]
"-- Second comment" = [ 45, 62, SQL_COMMENT, NO_KEYWORD ]
"SELECT" = [ 63, 69, SELECT, RESERVED_KEYWORD ]
"*" = [ 70, 71, ASCII_42, NO_KEYWORD ]
"FROM" = [ 72, 76, FROM, RESERVED_KEYWORD ]
"second_table" = [ 77, 89, IDENT, NO_KEYWORD ]
";" = [ 89, 90, ASCII_59, NO_KEYWORD ]
Notice the position of the second SELECT: 63.
Using pg_query_parse() I get this (reformatted, for clarity):
{
"version": 170000,
"stmts": [
{
"stmt": {
"SelectStmt": {
"targetList": [
{
"ResTarget": {
"val": {
"ColumnRef": {
"fields": [
{
"A_Star": {}
}
],
"location": 24
}
},
"location": 24
}
}
],
"fromClause": [
{
"RangeVar": {
"relname": "first_table",
"inh": true,
"relpersistence": "p",
"location": 31
}
}
],
"limitOption": "LIMIT_OPTION_DEFAULT",
"op": "SETOP_NONE"
}
},
"stmt_len": 42
},
{
"stmt": {
"SelectStmt": {
"targetList": [
{
"ResTarget": {
"val": {
"ColumnRef": {
"fields": [
{
"A_Star": {}
}
],
"location": 70
}
},
"location": 70
}
}
],
"fromClause": [
{
"RangeVar": {
"relname": "second_table",
"inh": true,
"relpersistence": "p",
"location": 77
}
}
],
"limitOption": "LIMIT_OPTION_DEFAULT",
"op": "SETOP_NONE"
}
},
"stmt_location": 43,
"stmt_len": 46
}
]
}As you can see, the second RawStmt is reported to start at 43, just after the semicolon ending the first SELECT, that is clearly wrong, while all other sub-nodes location are correct.
Thanks in advance for any clue!