Skip to content

Reconciling point intersection methods #2673

@jlarsen-usgs

Description

@jlarsen-usgs

I'm starting this issue/discussion as part of my review of pull requests related to point intersection within the modflow grid. Currently there are two approaches that tackle this problem. My personal opinion is to select one of the two methods for intersection to ease the code maintenance burden.

#2654 which is an optimization for the grid.intersect() method

and

#2657 which extends the GridIntersect class and adds a points_to_cellids() method.

Both methods accomplish similar goals but take different approaches. I've also done some profiling of these methods as well as a geopandas based method to get cellids and here are the results (StructuredGrid, Python 3.12, Windows 11 OS):

Image

Results here show that convex hull method #2654 is slightly slower than the grid intersect points_to_cellids method #2657 for up to 10,000 points. Around 100,000 points these two methods perform similarly. At 1,000,000 pts the convex hull method is about 6x faster. Profiling revealed about half of the execution time of the grid intersect method was due to recarray construction (~1.7 seconds for 1 million pts).

The geopandas method that I tested for comparison used spatial joins between geodataframes of the grid and points. This method was the fastest for up to 100,000 points of intersection, at 1,000,000 points it was slightly slower than the convex hull method. It was also only a few lines of code, but I didn't do any data prep to formalize the method as it was a test (so it would be slightly slower and more than a few lines of code if it was updated for "production").

Profiling results for Vertex/UnstructuredGrid's are very different, however.

Image

Performance of #2657 (grid intersect) applied to a voronoi grid (3 layer, 2317 nodes per layer) is consistent to the performance it had on a structured grid. Performance of #2654 (convex hull) is poor on a voronoi grid and profiling was halted at 5000 points of intersection, due to slow performance. The difference in performance between structured and voronoi grid for #2654 is due to very different approaches taken for different grid types. For structured grids, a vectorized "sweep-line" type search is performed which is fast. For unstructured and vertex grids the convex hull method is applied and performance is slow, likely due to a lot of looping in python to search for the grid cell in an indexed neighborhood.

Consideration should also be given to the maintenance burden of the code. #2657 reduces the total amount of code in the repository by about 200 lines. #2654 adds about 1,100 lines of code to the repository, with the core feature being around 964 lines of code.

Given the results outlined here, #2657 (grid intersect improvements) seems like the best possible option moving forward. It is consistently fast (although not the fastest in all regards) across grid types. With the addition of .points_to_cellids() on GridIntersect I think we could use the .intersection() functionality on the grid as a thin wrapper that calls GridIntersect() instead of maintaining a separate intersection routine.

My recommendation is to close #2654 and not merge it in its current state as it does not perform well for vertex and unstructured grids. Any additional code review and development work should be focused on #2657 and improving the existing GridIntersect utility.

Comments and thoughts are welcome.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions