Create CopyExec node on Spark side, move related logic from native #1995

Open
@mbutrovich

Description

While working on #1887, I looked at the logic for wrapping nodes with CopyExec (mostly for unpacking dictionaries). We have logic in several places in planner.rs that determines when to insert CopyExec nodes into the native physical plan. We should move that logic to the Spark side and define a new CopyExec node that translates 1:1 on the native side. This has a few benefits:

  • CopyExec nodes will show up in the Spark UI representation of the plan
  • More logic about plan translation is consolidated on the Spark side, which offers benefits of its own:
    • Easier to understand what the final native plan will be during Comet rule execution on the Spark side
    • Easier to cache serialized native plans
    • Less redundant work across Spark workers in native code
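
To make the proposal concrete, here is a minimal sketch of the intended split. It is written in Rust only so that it is self-contained; in practice the insertion rule would live on the Spark (JVM) side. Every name in it (`SparkPlan`, `NativePlan`, `insert_copy`, `needs_copy`, `to_native`) is hypothetical and not part of Comet's actual API, and the condition in `needs_copy` is a placeholder assumption, not the real rule from planner.rs.

```rust
// Hypothetical, simplified plan model. None of these types exist in Comet.

/// Plan as the Spark side would build and serialize it.
#[derive(Debug, Clone, PartialEq)]
enum SparkPlan {
    Scan { dictionary_encoded: bool },
    Copy(Box<SparkPlan>),
    Sort(Box<SparkPlan>),
}

/// Native plan. It mirrors SparkPlan node for node.
#[derive(Debug, Clone, PartialEq)]
enum NativePlan {
    Scan { dictionary_encoded: bool },
    Copy(Box<NativePlan>),
    Sort(Box<NativePlan>),
}

/// Placeholder condition (an assumption for illustration): copy only when the
/// direct child is a scan that may emit dictionary-encoded batches.
fn needs_copy(plan: &SparkPlan) -> bool {
    matches!(plan, SparkPlan::Scan { dictionary_encoded: true })
}

/// "Spark-side" rule: decide once, during planning, where Copy nodes go.
fn insert_copy(plan: SparkPlan) -> SparkPlan {
    match plan {
        SparkPlan::Sort(child) => {
            let child = insert_copy(*child);
            if needs_copy(&child) {
                SparkPlan::Sort(Box::new(SparkPlan::Copy(Box::new(child))))
            } else {
                SparkPlan::Sort(Box::new(child))
            }
        }
        SparkPlan::Copy(child) => SparkPlan::Copy(Box::new(insert_copy(*child))),
        scan @ SparkPlan::Scan { .. } => scan,
    }
}

/// "Native-side" translation: strictly 1:1, with no insertion decisions left.
fn to_native(plan: &SparkPlan) -> NativePlan {
    match plan {
        SparkPlan::Scan { dictionary_encoded } => NativePlan::Scan {
            dictionary_encoded: *dictionary_encoded,
        },
        SparkPlan::Copy(child) => NativePlan::Copy(Box::new(to_native(child))),
        SparkPlan::Sort(child) => NativePlan::Sort(Box::new(to_native(child))),
    }
}
```

In this sketch the Copy node is already present in the plan the Spark side produces, so it would be visible wherever that plan is displayed, and each worker only performs the mechanical `to_native` mapping instead of repeating the insertion decision.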
