Persist workflow problems in UCX multiworkspace catalog #3756
JCZuurmond wants to merge 9 commits into main
Conversation
JCZuurmond left a comment
@asnare: I would like your guidance on this PR.
    end_line: int
    end_col: int

    __id_attributes__: ClassVar[tuple[str, ...]] = (
@asnare: This can only be str according to the current implementation. Do you know why? And what should we do with non-string fields?
These are constrained because the mechanism we use to encode them as an identifier is only defined for when they're a string. I vaguely recall proposing JSON-encoding the sequence, but that was rejected.
Anything that isn't a string needs to be made available (unambiguously) as a string.
> Anything that isn't a string needs to be made available (unambiguously) as a string.

Can we make these available as strings without changing the dataclass attributes? (We want to avoid changing them because the dataclass is persisted as a table in the UCX inventory, which makes the data types difficult to update.)
I think read-only properties will work?
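A minimal sketch of the read-only property idea, assuming the identifier mechanism looks attributes up by name (for example with `getattr`). The class name `ProblemRecord`, the fields `path` and `code`, the property names ending in `_id`, and the helper `id_values` are all hypothetical; only `end_line`, `end_col` and `__id_attributes__` appear in the diff above.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass(frozen=True)
class ProblemRecord:
    """Hypothetical record, not the actual UCX dataclass."""

    path: str
    code: str
    end_line: int
    end_col: int

    # Every name listed here must resolve to a string value.
    __id_attributes__: ClassVar[tuple[str, ...]] = (
        "path",
        "code",
        "end_line_id",
        "end_col_id",
    )

    @property
    def end_line_id(self) -> str:
        """String view of end_line; a property, so not a dataclass field."""
        return str(self.end_line)

    @property
    def end_col_id(self) -> str:
        """String view of end_col; a property, so not a dataclass field."""
        return str(self.end_col)


def id_values(record: object) -> tuple[str, ...]:
    """Hypothetical helper: collect the identifier values by attribute name."""
    names = type(record).__id_attributes__
    values = tuple(getattr(record, name) for name in names)
    for name, value in zip(names, values):
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string, got {type(value).__name__}")
    return values


record = ProblemRecord(path="a.py", code="some-code", end_line=3, end_col=7)
persisted_columns = [field.name for field in fields(ProblemRecord)]
identifier = id_values(record)
```

Because properties are not dataclass fields, `persisted_columns` still lists only the four declared attributes, so the persisted schema would stay unchanged under this assumption.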
    class WorkflowLinter:
        def __init__(
    class WorkflowLinter(CrawlerBase):
@asnare: This "crawler" does not fit the usual crawler format because it "crawls" multiple resources: JobProblems, UsedTable and DirectFsAccess. You looked into this. What is your conclusion on how to handle it?
This was a very difficult problem to solve, and I'm not sure I reached a conclusion. I had a sense that we could use some sort of "composite" crawler that produced the different things all at once and then split them up, but I never went through the detailed effort of working out how to graft those two worlds together.
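To make the "composite" idea concrete, here is a rough sketch under stated assumptions, not a worked-out design: a single pass yields records of all three kinds, and a splitting step groups them by type so that each group could be persisted as its own snapshot. The record fields, the `lint_all_jobs` stand-in and the `split_by_type` helper are hypothetical; only the three record names come from this thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


# Hypothetical, simplified stand-ins for the real record types.
@dataclass(frozen=True)
class JobProblem:
    job_id: int
    message: str


@dataclass(frozen=True)
class UsedTable:
    job_id: int
    table_name: str


@dataclass(frozen=True)
class DirectFsAccess:
    job_id: int
    path: str


Record = Union[JobProblem, UsedTable, DirectFsAccess]


def lint_all_jobs() -> Iterator[Record]:
    """Stand-in for one linting pass that discovers all record kinds at once."""
    yield JobProblem(job_id=1, message="example problem")
    yield UsedTable(job_id=1, table_name="catalog.schema.table")
    yield DirectFsAccess(job_id=2, path="dbfs:/example/path")
    yield JobProblem(job_id=2, message="another example problem")


def split_by_type(records: Iterable[Record]) -> dict[type, list[Record]]:
    """Group a mixed stream by record type, one list per future snapshot."""
    buckets: dict[type, list[Record]] = defaultdict(list)
    for record in records:
        buckets[type(record)].append(record)
    return dict(buckets)


snapshots = split_by_type(lint_all_jobs())
```

Each entry in `snapshots` could then be handed to whatever persists a single-type snapshot. How that would fit `CrawlerBase` is exactly the open question here.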
Changes
Persist workflow problems in UCX multiworkspace catalog:
- JobProblem ownership in UCX multiworkspace catalog
- WorkflowLinter to be a crawler for similar API for persisting snapshot in multiworkspace catalog
- JobProblemOwnership to identify owners for job problems
Linked issues
Resolves #3237
Functionality
- migration-progress
Tests