Skip to content

(WIP) Implement mutation api's#1

Open
anujkumar93 wants to merge 58 commits intoBIOINF-976--DNAnexus_apifrom
BIOINF-982--mutation_api
Open

(WIP) Implement mutation api's#1
anujkumar93 wants to merge 58 commits intoBIOINF-976--DNAnexus_apifrom
BIOINF-982--mutation_api

Conversation

@anujkumar93
Copy link
Copy Markdown
Owner

@anujkumar93 anujkumar93 commented Sep 5, 2018

Motivation

As part of stor support for DNAnexus paths

Implementation

The following api's, including their test cases are handled in this PR for DNAnexus paths:

  • copy
  • copytree
  • remove
  • rmtree
  • rename
  • makedirs_p
    The private methods constructed to support the above copy and copytree api's are the following. Although these methods are functional, they have not been explicitly tested for every edge case as they are built as private functions.
  • _clone
  • _move
  • _clonetree
  • _movetree

A couple other things taken care of in this PR:

  • VCR cassettes are optimized by mocking out time.sleep. If reading back from cassettes, no need to wait to read.
  • login configuration is set up. Users only need to set their DX_AUTH_TOKEN to be able to interact with DX through stor. We can possibly set a default value for this.

@anujkumar93 anujkumar93 changed the title (WIP) Implement copy/copytree/remove/rmtree/open (WIP) Implement mutation api's and open Sep 5, 2018
@anujkumar93 anujkumar93 requested a review from pkaleta September 6, 2018 21:49
@anujkumar93 anujkumar93 self-assigned this Sep 6, 2018
Copy link
Copy Markdown

@jtratner jtratner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like the right approach, but I'm wondering about edge cases and also the necessity of the _is_folder() check

Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
@anujkumar93 anujkumar93 changed the title (WIP) Implement mutation api's and open (WIP) Implement mutation api's Sep 7, 2018
@anujkumar93
Copy link
Copy Markdown
Owner Author

Agreed. DX platform has separate concepts of folder and file, and a virtual path on stor can refer to either/both.

Copy link
Copy Markdown

@jtratner jtratner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting pretty close! Really at the level of edge cases now - awesome! I had a bunch of comments. (Note - I haven't gone through test cases yet - I plan to go back and look at them after you've made changes to address my comments below)

Most of my comments boil down to:

  1. be really clear on what exceptions may be raised by a method
  2. clear error messages around duplicates with better phrasing
  3. commenting on "target directory already exists" behavior

I have a more general concern: right now there are a lot of uncaught DXErrors that can bubble up but that are also undocumented in the method definitions. (and, because they aren't wrapped, make it harder to write generic code to interact with storage).

One part is just starting to list out which exceptions can happen (particularly handling authentication errors), but another issue are race conditions around file existence. Some of which need to be documented and others which need to have slightly more defensive coding applied.

Needs defensive coding: if we resolve a file to a canonical object in process A, then delete it in process B, then call copy() on the object in process A; rather than raise a NotFoundError, process A will raise an unhandled DXError. This looks to be the case in pretty much every place where we call out to dnanexus resources. While we can't cover every case, we at least need to wrap DNAnexus exceptions.

If an error is recoverable, obv should use a try/except; however, if an error is not recoverable, might be nice just to use a contextmanager that wraps the exception, e.g.:

with wrap_dx_exceptions():
    file_handler.get_download_url()

then within the contextmanager can do whatever exception parsing you like.

Needs documentation: virtual paths are resolved the first time they are encountered; if a virtual path changes either its target file object or its location during an operation, that has undefined behavior in stor.

Comment thread stor/dx.py
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Copy link
Copy Markdown

@jtratner jtratner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really close to being done - bit of code to DRY up first. Please see my comments for additional notes.

I'd really really like to see documentation written for this before we ship it.

Comment thread requirements.txt
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py
Comment thread stor/dx.py
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
Comment thread stor/dx.py Outdated
jtratner and others added 17 commits October 26, 2018 12:12
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
Copy link
Copy Markdown

@jtratner jtratner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nearly there!!

Comment thread stor/swift.py
Comment thread stor/test.py Outdated
Comment thread stor/tests/test_dx.py
Comment thread stor/tests/test_dx.py
Comment thread stor/tests/test_dx.py
Comment thread stor/tests/test_integration_dx.py Outdated
Comment thread stor/tests/test_integration_dx.py Outdated
Comment thread stor/tests/test_integration_dx.py
Comment thread stor/tests/test_integration_dx.py
time.sleep(.5)
self.assertFalse((self.test_dir / which_obj).exists())

@skipIf(six.PY3, 'dxpy3 assumes utf-8 encoding, not suitable for gzip')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it shouldn't assume utf-8 encoding if you open with rb - can you put up a small example on the GH dx-toolkit bug tracker to demonstrate issue? (e.g., something like creating and uploading gzipped file that you then try to open using mode='rb')

pretty clear (but small) bug

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no mode parameter in dxpy.DXFile.read(). For stor, if mode is 'r', DXPath.read_object() is called, and if mode is 'rb', DXPath.read_object().decode() is invoked. DXPath.read_object() and hence dxpy.DXFile.read() is supposed to return the raw bytes. They return the raw bytes by using decode('utf'-8') (it's hardcoded).

I opened the issue at https://github.com/dnanexus/dx-toolkit/issues/426 . Please take a look.

@pkaleta pkaleta removed their request for review December 16, 2022 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants