Regression: pandas.read_parquet hangs when using filprofiler 2022.09.0 #415

@kdebrab

Description

I hope the following is sufficient for reproducing the issue.

Writing with df.to_parquet works fine; the code hangs when reading the data back with pd.read_parquet. The parquet engine used is pyarrow. No error is raised; the Docker container simply hangs forever.

python: 3.10.7
OS: Linux
pandas: 1.4.4
numpy: 1.23.3
pyarrow: 9.0.0

Disabling filprofiler resolves the issue (I use the API, enabled conditionally through an environment variable, as documented in https://pythonspeed.com/fil/docs/api.html#using-the-python-api). Reverting to filprofiler 2022.06.0, with everything else exactly the same, also resolves the issue.
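
For anyone trying to reproduce this, the setup described above might look roughly like the sketch below. It is not the original code: the environment variable name `USE_FIL`, the helper names `roundtrip` and `maybe_profile`, and the DataFrame contents are all hypothetical, and this small frame may not be enough to trigger the hang. The only Fil call used is `filprofiler.api.profile(callable, output_path)` from the linked documentation.

```python
import os


def roundtrip(path):
    """Write a small DataFrame to Parquet and read it back with pyarrow."""
    # Imported lazily so maybe_profile() can be used without pandas installed.
    import pandas as pd

    # Hypothetical data; the data from the original report is not known.
    df = pd.DataFrame({"a": range(1000), "b": [float(i) for i in range(1000)]})
    df.to_parquet(path, engine="pyarrow")  # reported to work
    return pd.read_parquet(path, engine="pyarrow")  # reported to hang


def maybe_profile(func, output_dir, env_var="USE_FIL"):
    """Run func under Fil only when env_var is "1"; otherwise call it directly.

    USE_FIL is a hypothetical variable name chosen for this sketch.
    """
    if os.environ.get(env_var) == "1":
        # Imported only when profiling is requested.
        from filprofiler.api import profile

        return profile(func, output_dir)
    return func()
```

A call such as `maybe_profile(lambda: roundtrip("/tmp/example.parquet"), "/tmp/fil-result")` would then exercise the round trip, with profiling active only when `USE_FIL=1`. Both paths are placeholders. According to the linked documentation, the script also has to be started through Fil's launcher for the API to work.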

Labels: NEXT, bug