Adding uc name validation in cli and demo's#260
Adding uc name validation in cli and demo's#260ravi-databricks wants to merge 3 commits intofeature/v0.0.11from
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## feature/v0.0.11 #260 +/- ##
===================================================
+ Coverage 89.73% 89.81% +0.07%
===================================================
Files 10 10
Lines 2094 2110 +16
Branches 405 411 +6
===================================================
+ Hits 1879 1895 +16
Misses 116 116
Partials 99 99
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
demo/generate_dabs_resources.py
Outdated
| The initialized runner configuration. | ||
| """ | ||
| run_id = uuid.uuid4().hex | ||
| self.validate_uc_catalog_name(self.args["uc_catalog_name"]) |
There was a problem hiding this comment.
Maybe we can do self.args.GET("uc_catalog_name") instead of self.args["uc_catalog_name"] and then in validate_uc_catalog_name throw an explicit error when catalog name is None? Because if it isn't present we'll get generic Python error that key isn't found
| """ | ||
| if not name or not name.strip(): | ||
| raise ValueError("'uc_catalog_name' must not be empty.") | ||
| if not re.match(r'^[a-zA-Z_][a-zA-Z0-9_]*$', name): |
There was a problem hiding this comment.
No. It's incorrect - per documentation, we can use any unicode character:
There was a problem hiding this comment.
only delimited identifiers can use any unicode character. Non delimited identifiers have restrictions.
Instead of rejecting the input, we can have another approach: if we spot a character not allowed in the non-delimited version, we automatically add backticks to convert to delimited identifier
src/cli.py
Outdated
| if self.onboard_layer.lower() not in ["bronze", "silver", "bronze_silver"]: | ||
| raise ValueError("onboard_layer must be one of bronze, silver, bronze_silver") | ||
| if self.uc_enabled and self.uc_catalog_name: | ||
| if not re.match(r'^[a-zA-Z_][a-zA-Z0-9_]*$', self.uc_catalog_name): |
…t with backticks, deduplicate
| logger = logging.getLogger('databricks.labs.dltmeta') | ||
|
|
||
|
|
||
| def validate_uc_catalog_name(name): |
There was a problem hiding this comment.
Why do we have the same implementation in the two places?
Added validation as per sql_ref_identifiers for uc catalog name in demo and cli