
Set rm_files to be a synchronous method #503

Open: wants to merge 3 commits into base: main

Conversation

anjaliratnam-msft
Collaborator

This addresses the GitHub issue where rm_files is not implemented. The fix is simple: the result of sync_wrapper(_rm_files) was never assigned to anything, so it is now bound to rm_files. Tests were also added to verify that it works as expected.
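To illustrate why the unassigned call had no effect, here is a minimal sketch. The `sync_wrapper` below is a simplified stand-in written for this example (an assumption: the real fsspec helper runs the coroutine on its own event loop rather than calling `asyncio.run`), and the `Before` and `After` classes are hypothetical.

```python
import asyncio
import functools


def sync_wrapper(func):
    """Simplified stand-in for fsspec's sync_wrapper (assumption, see lead-in)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Run the coroutine to completion and return its result.
        return asyncio.run(func(*args, **kwargs))

    return wrapper


class Before:
    async def _rm_files(self, paths):
        return [f"deleted {p}" for p in paths]

    sync_wrapper(_rm_files)  # result is discarded, so no sync method exists


class After:
    async def _rm_files(self, paths):
        return [f"deleted {p}" for p in paths]

    rm_files = sync_wrapper(_rm_files)  # bound as a class attribute


print(hasattr(Before(), "rm_files"))
print(After().rm_files(["a.txt"]))
```

In the `Before` class the wrapper is created and immediately thrown away, which is why the method was reported as not implemented.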

adlfs/spec.py Outdated
@@ -1248,7 +1248,7 @@ async def _rm_files(
for file in file_paths:
self.invalidate_cache(self._parent(file))

- sync_wrapper(_rm_files)
+ rm_files = sync_wrapper(_rm_files)
Collaborator

Ah, this is something that I missed as well when we went over the original GitHub feature request. Based on the fsspec specification, there is no shared rm_files() interface; there is only rm_file(). The original GitHub issue, #497, also only requests rm_file().

So, it is not appropriate to set a sync wrapper like this, because rm_files does not appear to be a shared interface across file systems. Instead, it would probably make sense to implement the feature request by adding an async _rm_file that is a thin wrapper over the _rm implementation.

Even more interesting, it seems there used to be a _rm_file() implementation prior to PR #383. Because the underlying AsyncFileSystem mirrors methods, I suspect that adlfs might have supported rm_file() at one point, which would make this a regression. It would be great to confirm whether adlfs ever supported rm_file() in a version prior to that PR.
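As a rough model of the mirroring idea mentioned above, the sketch below exposes `_name` coroutines as sync `name` methods, but only for names that a shared base interface declares. Everything here (`BaseInterface`, `ToyFS`, and this `mirror_sync_methods`) is a hypothetical stand-in written for illustration, and the exact rule fsspec applies is an assumption, not a copy of its code.

```python
import asyncio
import inspect


class BaseInterface:
    """Hypothetical stand-in for the shared sync interface."""

    def rm_file(self, path):
        raise NotImplementedError


def mirror_sync_methods(obj):
    """Toy rule: mirror `_name` coroutines onto `name`, but only for
    names that the shared interface declares."""
    for name in dir(obj):
        if not name.startswith("_") or name.startswith("__"):
            continue
        public = name[1:]
        coro = getattr(obj, name)
        if inspect.iscoroutinefunction(coro) and hasattr(BaseInterface, public):
            # Capture the coroutine in a default argument to avoid late binding.
            setattr(obj, public, lambda *a, _c=coro, **kw: asyncio.run(_c(*a, **kw)))


class ToyFS(BaseInterface):
    async def _rm_file(self, path):
        return f"deleted {path}"

    async def _rm_files(self, paths):
        return [f"deleted {p}" for p in paths]


fs = ToyFS()
mirror_sync_methods(fs)
print(fs.rm_file("container/blob.txt"))
print(hasattr(fs, "rm_files"))
```

Under this model, defining an async `_rm_file` is enough to get a working `rm_file()`, while `_rm_files` gets no sync counterpart because the shared interface has no `rm_files`.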

Collaborator Author

This was the previous implementation of rm_file I found, but it looks like it was only ever used by rm and was not callable.

Collaborator

kyleknap left a comment

It's looking better. I just had a couple of follow-up suggestions on the direction we take this.

adlfs/spec.py Outdated
if p != "":
    await self._rm_files(container_name, [p.rstrip(delimiter)])
else:
    await self._rmdir(container_name)
Collaborator

Looking more at rm_file, it seems its sole purpose is to delete a single file; it does not support deleting a directory. This also seems to be the intention based on other implementations of rm_file:

  • s3fs - Only deletes a single object which would be a blob in adlfs
  • local - Uses os.remove() which only handles removing files and not directories.

I think it would make sense to stick with this contract to be consistent, especially if rm_file was never actually exposed publicly.
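The local-filesystem side of this contract can be checked with the standard library alone. This is a small sketch of `os.remove()` behavior, not of any fsspec or adlfs code; the file and directory names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    file_path = os.path.join(root, "blob.txt")
    dir_path = os.path.join(root, "subdir")
    with open(file_path, "w") as f:
        f.write("data")
    os.mkdir(dir_path)

    os.remove(file_path)  # removing a single file succeeds
    file_removed = not os.path.exists(file_path)

    try:
        os.remove(dir_path)  # removing a directory is rejected
        dir_rejected = False
    except OSError:
        # The concrete subclass varies by platform
        # (for example IsADirectoryError or PermissionError).
        dir_rejected = True

print(file_removed, dir_rejected)
```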

adlfs/spec.py Outdated
@@ -1278,6 +1278,31 @@ async def _rm_files(

sync_wrapper(_rm_files)

async def _rm_file(
    self, path: typing.Union[str, typing.List[str]], delimiter: str = "/", **kwargs
Collaborator

On a similar note, I'm not sure we should expose a delimiter for this method, since there is nothing recursive about this public contract. Instead, we could always use / if we need any splitting logic. Not supporting a delimiter also seems consistent with the other implementations I linked.
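A rough sketch of what a delimiter-free signature could look like is shown here. `ToyBlobFS` is a hypothetical in-memory stand-in, not the adlfs implementation, and raising `FileNotFoundError` for container-only or directory-like paths is an assumption made for the example.

```python
import asyncio


class ToyBlobFS:
    """In-memory stand-in for a blob store (hypothetical, not adlfs)."""

    def __init__(self, blobs):
        # Full paths of the form "container/blob/name".
        self.blobs = set(blobs)

    async def _rm_file(self, path: str, **kwargs):
        # Always split on "/"; no delimiter parameter is exposed.
        container, _, blob = path.strip("/").partition("/")
        if not blob:
            raise FileNotFoundError(f"{path!r} does not name a single blob")
        full = f"{container}/{blob}"
        if full not in self.blobs:
            # Directory-like prefixes land here too: only exact blobs match.
            raise FileNotFoundError(path)
        self.blobs.remove(full)


fs = ToyBlobFS({"data/a.txt", "data/dir/b.txt"})
asyncio.run(fs._rm_file("data/a.txt"))
print(sorted(fs.blobs))
```

Accepting a single `str` path and matching only exact blob names keeps the method in line with the single-file contract discussed in the earlier comment.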
