Lineage is not published to Purview - START events missing environment-properties #236

@gerson23

Describe the bug
The OpenLineageIn function skips START events because they are missing the environment-properties field. As a result, the PurviewOut function does not process the corresponding COMPLETE events, and no lineage is published to Purview.

However, as discussed in OpenLineage/OpenLineage#2203, this can be considered a valid scenario: the OL model is cumulative, so a subsequent RUNNING event should carry the environment-properties information that the START event lacked.

Expected behavior
START events should be accepted even when the environment-properties field is missing. RUNNING events should be accepted as well and, when they carry environment-properties, used to fill in that information. A COMPLETE event can then be processed properly and lineage published to Purview.
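
To make the expected behavior concrete, here is a minimal sketch of the cumulative handling in Python. It is not the accelerator's actual code: the `RunEventCache` class and its `handle` method are hypothetical names, the cache is in-memory only, and the sketch assumes environment-properties arrives as a run facet at `run.facets["environment-properties"]`.

```python
import copy
from typing import Any, Dict, Optional, Set

# Assumption: the facet lives under run.facets with this key.
ENV_FACET = "environment-properties"


class RunEventCache:
    """Hypothetical in-memory accumulator keyed by OpenLineage runId."""

    def __init__(self) -> None:
        self._env_by_run: Dict[str, Dict[str, Any]] = {}
        self._seen_runs: Set[str] = set()

    def handle(self, event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """Return an enriched COMPLETE event, or None if nothing is ready."""
        run = event.get("run", {})
        run_id = run.get("runId")
        if run_id is None:
            return None

        event_type = event.get("eventType")
        env = run.get("facets", {}).get(ENV_FACET)

        if event_type in ("START", "RUNNING"):
            # Accept the event even without the facet; remember the facet
            # whenever one of these events does carry it.
            self._seen_runs.add(run_id)
            if env is not None:
                self._env_by_run[run_id] = env
            return None

        if event_type == "COMPLETE":
            if run_id not in self._seen_runs:
                return None  # no earlier event was accepted for this run
            if env is None:
                env = self._env_by_run.get(run_id)
            if env is None:
                return None  # facet never arrived; cannot enrich
            merged = copy.deepcopy(event)
            merged.setdefault("run", {}).setdefault("facets", {})[ENV_FACET] = env
            # Drop cached state once the run has completed.
            self._seen_runs.discard(run_id)
            self._env_by_run.pop(run_id, None)
            return merged

        return None
```

With this shape, a START without the facet, followed by a RUNNING event that has it, followed by a COMPLETE, yields a COMPLETE event carrying the cached facet. A real implementation would need durable storage in place of the in-memory dictionaries.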

Environment

  • OpenLineage Version: OL 1.5.0+ (I changed the parameter for the newer version)
  • Databricks Runtime Version: 14.3
  • Cluster Type: Job

Labels

bug
