Skip to content

Conversation

@jlarsen-usgs
Copy link
Contributor

This PR supersedes the draft PR #2573 and continues the cleanup of legacy shapefile exporting code and integration of geopandas GeoDataFrame support.

@codecov
Copy link

codecov bot commented Dec 23, 2025

Codecov Report

❌ Patch coverage is 56.47208% with 343 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.4%. Comparing base (556c088) to head (1fec868).
⚠️ Report is 105 commits behind head on develop.

Files with missing lines Patch % Lines
flopy/export/shapefile_utils.py 40.7% 45 Missing ⚠️
flopy/plot/plotutil.py 2.5% 38 Missing ⚠️
flopy/mf6/data/mfdatalist.py 40.9% 36 Missing ⚠️
flopy/mf6/data/mfdataplist.py 50.0% 30 Missing ⚠️
flopy/utils/binaryfile/__init__.py 7.4% 25 Missing ⚠️
flopy/modflow/mfsfr2.py 54.7% 24 Missing ⚠️
flopy/mf6/data/mfdataarray.py 65.1% 23 Missing ⚠️
flopy/utils/datafile.py 32.0% 17 Missing ⚠️
flopy/mf6/mfpackage.py 48.3% 16 Missing ⚠️
flopy/utils/util_array.py 68.0% 16 Missing ⚠️
... and 12 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2671      +/-   ##
===========================================
+ Coverage     55.5%    72.4%   +16.8%     
===========================================
  Files          644      667      +23     
  Lines       124135   130810    +6675     
===========================================
+ Hits         68947    94739   +25792     
+ Misses       55188    36071   -19117     
Files with missing lines Coverage Δ
flopy/discretization/grid.py 74.3% <100.0%> (-1.7%) ⬇️
flopy/mf6/data/mfdatautil.py 60.6% <100.0%> (-17.3%) ⬇️
flopy/utils/gridgen.py 75.8% <100.0%> (-11.3%) ⬇️
flopy/mf6/mfmodel.py 57.5% <88.2%> (-23.4%) ⬇️
flopy/discretization/structuredgrid.py 53.2% <72.7%> (+5.7%) ⬆️
flopy/mbase.py 70.3% <82.3%> (-2.4%) ⬇️
flopy/utils/geospatial_utils.py 91.5% <40.0%> (-1.6%) ⬇️
flopy/utils/particletrackfile.py 95.6% <94.8%> (-0.4%) ⬇️
flopy/discretization/unstructuredgrid.py 80.4% <66.6%> (-1.1%) ⬇️
flopy/discretization/vertexgrid.py 78.9% <54.5%> (-4.8%) ⬇️
... and 15 more

... and 537 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jlarsen-usgs jlarsen-usgs marked this pull request as ready for review December 23, 2025 17:35
@jlarsen-usgs
Copy link
Contributor Author

@wpbonelli it looks like the smoke tests are failing on test_mf2005_copy and also failed on the prior commit to FloPy. I can't currently reproduce the error on Windows. Any insight into what might be causing this?

Copy link
Member

@wpbonelli wpbonelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to be able to lean on geopandas for shapefile export. I guess we should look at using xarray for netcdf export similarly

General comment re: deprecations, we usually put an expected removal version in the deprecation warning which I think is typically ~2 minor releases ahead?

@property
def geo_dataframe(self):
"""
Returns a geopandas GeoDataFrame of the model grid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the running convention is to add deprecation notes to docstrings as well as the warning in the code itself

filename : str or PathLike
path of the shapefile to write
mg : flopy.discretization.grid.Grid object
modelgrid : flopy.discretization.grid.Grid object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest keeping/deprecating mg just making it optional and introducing modelgrid alongside it to avoid breakage. or, since this function will go away soon, is it worth renaming this parameter at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed back to mg and added deprecation notes


def model_attributes_to_shapefile(
path: Union[str, PathLike],
filename: Union[str, PathLike],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as mg/modelgrid above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted naming convention and added deprecation notes

Parameters
----------
shpname : str or PathLike
filename : str or PathLike
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted naming convention and added deprecation notes

geoms,
shpname: Union[str, PathLike] = "recarray.shp",
mg=None,
filename: Union[str, PathLike] = "recarray.shp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted naming convention and added deprecation notes


return gdf

def export(self, f, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we would ideally deprecate export() now that the geodataframe approach is the recommended way to export to shapefile, but since export() also handles netcdf we have to wait til there is a similar netcdf export capability via e.g. xarray? working in the same way as the geopandas approach in this PR like to_xarray().to_netcdf()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long term I think we should deprecate export() for methods like .to_netcdf(), .to_vtk(), .to_raster() to be explicit and follow similar conventions as pandas, geopandas, etc...

self.repeating = True
self.empty_keys = {}

def to_geodataframe(self, gdf=None, kper=0, sparse=False, truncate_attrs=False, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case you have in mind for sparse=False? Since stress period data is sparse by default (at least in 3.x) it seems reasonable to return a sparse geodataframe too, either exclusively or at least by default?

Kind of like what you're doing in shapefile_export_example.py for WEL.

I guess I'm wondering when someone calling to_geodataframe() on stress period data would want one containing the whole grid, rather than just the relevant boundary conditions. In that case couldn't they get the grid geodataframe and .merge() it manually with the SPD frame or pass the grid frame into the gdf parameter here.

Performance wise, it seems expensive to always do the full grid then dropna().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sparse=False is useful when exporting from higher level objects like gwf.to_geodataframe(), from multiple individual packages into the same geodataframe, and if a user is exporting multiple stress periods with different boundary conditions from the same package. Because the geodataframe is constructed in the modelgrid object, it is created as a single layer representation of the whole grid and then we filter after the fact. For operations like gwf.to_geodataframe() there is a single construction of geometries and the geodataframe is then passed through to each package/datatype export.

Additionally sparse operates a little differently depending on if it's used on a TransientList or a MFList like object. On a TransientList the dataframe and attributes are constructed using a full grid representation and then filtered after all stress-period data is added. This is a much simpler operation than trying to reconcile multiple sparse dataframes from many stress-periods into a single dataframe.

We could cache a copy of the polygons after creation to improve performance; however, we don't currently give an option on the grid to construct a GeoDataFrame with only certain cells. We could add a nodes= parameter or kwarg to the grid .to_geodataframe() method which would only construct geometries for those nodes, however this will most likely be used in limited cases like exporting data from a single stress period or exporting SFR/LAK/UZF packagedata.


@property
def geo_dataframe(self):
def geodataframe(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to_geodataframe... elsewhere and not here, other than the fact the old property didn't have to_...?

suggest also deprecating the old one to avoid breakage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments here:

  1. geodataframe was used to keep the convention consistent with other properties on the GeoSpatialUtil class (e.g., shapely, geojson, points, etc...).

  2. I did not add a deprecation here because this is a low-level object that generally shouldn't be called by end users. This serves as a "catch all" in methods that handle vector geospatial data and converts it to the type that the function was originally written for. I can keep the old method around and add a deprecation warning in the off chance someone is using this method to convert vector data outside of the intended use case.

optional attribute name, default uses util2d name
forgive : bool
optional flag to continue if data shape not compatible with GeoDataFrame
truncate_attrs : bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think GeoDataFrame.to_file() already truncates for you when writing to shapefile, is this parameter necessary? It seems like having our own API for this would be useful if we let the user customize the truncation but otherwise redundant?

Copy link
Contributor Author

@jlarsen-usgs jlarsen-usgs Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncate_attrs flag is definitely useful. to_file() does truncate the attributes, but it just cuts off the end and can potentially write a dbf file with multiple attributes that have identical names. truncate_attrs maintains the way we previously constructed attributes in flopy by removing text from the middle of the attribute name and keeping the layer and stress period number.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it- didn't realize this. thanks!

geoms = GeoSpatialCollection(geoms, "Point")
gdf = gpd.GeoDataFrame.from_features(geoms)

for col in list(welldata):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of looping to add columns, could this just be

gdf = gpd.GeoDataFrame(welldata, geometry=geoms)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated this, need to use:

geoms = GeoSpatialCollection(geoms, "Point").shape
gdf = gpd.GeoDataFrame(welldata, geometry=geoms)

for this type of GeoDataFrame construction because it expects an arraylike object

Copy link

@christianlangevin christianlangevin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a quick look through this PR. Really nice, @jlarsen-usgs. Excited to see it rolled out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants