@alessandro-nori alessandro-nori commented Aug 28, 2025

Rationale for this change

Now that pyarrow FileIO supports ADLS, we can update the SCHEMA_TO_FILE_IO mapping for abfs and wasb to include ARROW_FILE_IO, similar to how it's handled for s3.

We're keeping FsspecFileIO as the preferred default, since the PyArrowFileIO implementation for Azure is only available in pyarrow >= 20.0.0.

Are these changes tested?

Are there any user-facing changes?

The SCHEMA_TO_FILE_IO entries for wasb and abfs now map to both FsspecFileIO and PyArrowFileIO, with FsspecFileIO listed first.
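
As a rough illustration of the mapping described above, here is a minimal sketch of a scheme-to-FileIO lookup with fsspec listed first. It is not the actual PyIceberg code: the dictionary name `SCHEME_TO_FILE_IO_SKETCH`, the helper `candidate_file_ios`, and the dotted class paths are assumptions made for the example.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Assumed dotted paths for the two implementations (illustrative only).
FSSPEC_FILE_IO = "pyiceberg.io.fsspec.FsspecFileIO"
ARROW_FILE_IO = "pyiceberg.io.pyarrow.PyArrowFileIO"

# Hypothetical stand-in for the real mapping: each scheme lists its
# candidate implementations in order of preference.
SCHEME_TO_FILE_IO_SKETCH: Dict[str, List[str]] = {
    "abfs": [FSSPEC_FILE_IO, ARROW_FILE_IO],
    "wasb": [FSSPEC_FILE_IO, ARROW_FILE_IO],
}


def candidate_file_ios(location: str) -> List[str]:
    """Return the candidate FileIO paths for a location, preferred first."""
    scheme = urlparse(location).scheme
    return SCHEME_TO_FILE_IO_SKETCH.get(scheme, [])
```

With this ordering, a caller that tries candidates in sequence would reach for fsspec first and only fall back to pyarrow if fsspec is unavailable.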

@NikitaMatskevich NikitaMatskevich left a comment

Very good catch!

@alessandro-nori alessandro-nori changed the title from "use arrow_file_io as default for Azure" to "use PyArrowFileIO as default for abfs and wasb schemes" Aug 28, 2025
@kevinjqliu kevinjqliu left a comment

thanks for the PR @alessandro-nori

I would prefer to swap the order and default to fsspec for now. Users can still opt in to pyarrow. The pyarrow AzureFileSystem implementation is only available in versions >= 20.0.0, which is quite recent:

MIN_PYARROW_VERSION_SUPPORTING_AZURE_FS = "20.0.0"
if version.parse(pyarrow.__version__) < version.parse(MIN_PYARROW_VERSION_SUPPORTING_AZURE_FS):
    raise ImportError(
        f"pyarrow version >= {MIN_PYARROW_VERSION_SUPPORTING_AZURE_FS} required for AzureFileSystem support, "
        f"but found version {pyarrow.__version__}."
    )

we can revisit this in the future when we bump up our pyarrow version

WDYT?
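
For readers who want to see the version gate in isolation, here is a minimal standard-library sketch that answers "is this pyarrow version new enough?" without raising. The helpers `release_tuple` and `supports_azure_fs` are hypothetical names for this example. Unlike `version.parse`, the comparison only looks at the leading numeric components, so pre-release suffixes such as `.dev1` are ignored. That is a simplification, not the library's behavior.

```python
import re
from typing import Tuple

MIN_PYARROW_VERSION_SUPPORTING_AZURE_FS = "20.0.0"


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Extract leading numeric components, padded to at least three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad so that "20.0" compares equal to "20.0.0".
    return parts + (0,) * max(0, 3 - len(parts))


def supports_azure_fs(pyarrow_version: str) -> bool:
    """True if the given version meets the minimum for AzureFileSystem."""
    minimum = release_tuple(MIN_PYARROW_VERSION_SUPPORTING_AZURE_FS)
    return release_tuple(pyarrow_version) >= minimum
```

A check like this could let a caller skip the pyarrow candidate on older installations rather than hit the ImportError above.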

@alessandro-nori
Thanks for the review @kevinjqliu !
I agree, I inverted the default 👍

@kevinjqliu kevinjqliu left a comment

LGTM!

@kevinjqliu kevinjqliu merged commit 3457bc2 into apache:main Aug 28, 2025
10 checks passed
@kevinjqliu
Thanks @alessandro-nori and thank you for the review @NikitaMatskevich
