Conversation
… client; rewrite original substition function to only check enough params are provided
…types are already serialised by json.dumps in an acceptable way and specific adapters have been removed
…dict is provided as parameters
…instead of the raw JSON item
There was a problem hiding this comment.
Pull Request Overview
This PR migrates pyrqlite from client-side string substitution to using the rqlite params API for parameter binding. This change improves security by allowing server-side parameter preparation and binding, and fixes issues where placeholder characters could appear inside string literals.
- Refactored parameter handling to use JSON-based parameter passing instead of string substitution
- Updated adapter functions to prepare Python values for JSON serialization rather than string formatting
- Enhanced parameter validation with proper string literal parsing to avoid false placeholder matches
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/test/test_dbapi.py | Removes expected failure decorator and adds iterator methods to test parameter sequence handling |
| src/pyrqlite/extensions.py | Updates adapters to prepare values for JSON serialization instead of string formatting |
| src/pyrqlite/cursors.py | Replaces string substitution with params API, adds string literal parsing and parameter validation |
src/test/test_dbapi.py
Outdated
| assert x == 0 | ||
| if x >= self.__len__(): |
There was a problem hiding this comment.
This bounds check will never trigger because it's placed after assert x == 0. The assertion will fail for any x >= 1, making the IndexError check unreachable. Consider removing the assertion or restructuring the logic.
| assert x == 0 | |
| if x >= self.__len__(): | |
| if x < 0 or x >= self.__len__(): |
There was a problem hiding this comment.
Removed redundant bounds check
… test object was modified to be iterable but still with a single item
|
Hi @joshiggins -- this ready for review? cc @zmedico |
|
Yes ready for review, thanks |
otoolep
left a comment
There was a problem hiding this comment.
Is there a unit test you can add that would fail before your change, but now passes?
src/test/test_dbapi.py
Outdated
|
|
||
| def __getitem__(self, x): | ||
| assert x == 0 | ||
| assert x == 0 |
There was a problem hiding this comment.
Trailing whitespace -- please remove.
…t interpreted as parameter placeholders
|
@otoolep added 2 tests to check that qmark and colon characters appearing inside string literals in the SQL statement are not interpreted as parameter placeholders. Failure message before this change: Also I enabled the test It looked like previously the zero byte was treated somewhere along the line as a string terminator and rqlite ends up getting a truncated SQL statement. When it's a bound parameter this test passes since the 5 characters (one of them being the zero byte) are stored and returned as expected. Failure message before this change: Finally I enabled |
|
Just so we're clear, can you tell me what you mean by "client side substitution"? Are you saying this library would rewrite, say, |
|
rqlite API and parameters: https://rqlite.io/docs/api/api/#parameterized-statements Surely there is no need to manipulate SQL statements strings. It's up to users to get them right, no? |
Yes, exactly. The original method is here pyrqlite/src/pyrqlite/cursors.py Line 91 in 46c2c95 |
| ''' | ||
| SQLite natively supports only the types TEXT, INTEGER, REAL, BLOB and | ||
| NULL | ||
| This function removes string literals from the SQL operation so we |
There was a problem hiding this comment.
@joshiggins -- this is what I don't understand then. Why modify the SQL string at all? Should it not be supplied in the SQL strings place holder in the Params API? The SQLite code will take care of it.
There was a problem hiding this comment.
Oh, I see, I had the wrong mental model for this change (I'm not super familiar with this library). This is about breaking apart the SQL entered by the client library so the API call to rqlite can be made correctly. Let me take a look now.
|
@joshiggins -- I've got a question for you. This change appears to be checking the parameter count within the SQL to the parameters supplied by the user. Why do that? Why not just push the stuff into the rqlite API as it, and let it return errors? What I mean is look at the db2 docs, an example: https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.execute If I was to code a client library I would just take the SQL string, shove that in the HTTP API request to rqlite, and then look at the type of parameters. If it's a list form the HTTP API request one way, if it's a dict, for the HTTP API request another way. This library was probably like this already, but I don't see any point in checking that the user has supplied the right number of parameters for the number of '?' or named params (as a dictionary) in the SQL query. rqlite will check all that, and return an error to the user. if there is an error Is there a good reason to also do the checking in a client library like this one? It could be brittle. By simply focusing on building the HTTP API request and sending that the rqlite, rqlite will do the checking for you in a bullet-proof manner (since rqlite in turn will use its copy of SQLite to check). |
| statements = json.dumps([self._get_operation_with_params(operation, parameters)]) | ||
| except TypeError as e: | ||
| raise InterfaceError(e) | ||
|
|
There was a problem hiding this comment.
This could catch an unexpected TypeError, so I would prefer that _get_operation_with_params internally converted TypeError to InterfaceError if needed.
A TypeError from json.dumps can be handled separately like this:
statements = self._get_operation_with_params(operation, parameters)
try:
statements = json.dumps([statements])
except TypeError as e:
raise InterfaceError(e)| try: | ||
| statements.append(self._get_operation_with_params(operation, parameters)) | ||
| except TypeError as e: | ||
| raise InterfaceError(e) |
There was a problem hiding this comment.
This could catch an unexpected TypeError, so I would prefer that _get_operation_with_params internally converted TypeError to InterfaceError if needed.
There was a problem hiding this comment.
There's a json.dumps(statements) call later in this function, and we want to do the TypeError to InterfaceError conversion for that.
| def _adapt_bytes(value): | ||
| # Use byte array for the params API | ||
| if not isinstance(value, bytes): | ||
| value = value.encode('utf-8') |
There was a problem hiding this comment.
Can't we safely omit this isinstance check because _adapt_from_python will only pass in bytes type here?
This PR makes pyrqlite use the params API instead of client side string substitution.
It fixes an issue where placeholder characters could not appear inside string literals in the SQL statement.
It also potentially improves security by allowing the server to prepare and bind the parameters.
These changes should not break existing use cases for pyrqlite.