(WIP) Implement mutation api's#1
anujkumar93 wants to merge 58 commits into BIOINF-976--DNAnexus_api from
jtratner
left a comment
this looks like the right approach, but I'm wondering about edge cases and also the necessity of the _is_folder() check
Agreed. DX platform has separate concepts of folder and file, and a virtual path on stor can refer to either/both.
jtratner
left a comment
Getting pretty close! Really at the level of edge cases now - awesome! I had a bunch of comments. (Note - I haven't gone through test cases yet - I plan to go back and look at them after you've made changes to address my comments below)
Most of my comments boil down to:
- be really clear on what exceptions may be raised by a method
- clear error messages around duplicates with better phrasing
- commenting on "target directory already exists" behavior
I have a more general concern: right now there are a lot of uncaught DXErrors that can bubble up but that are also undocumented in the method definitions. (and, because they aren't wrapped, make it harder to write generic code to interact with storage).
One part is just starting to list out which exceptions can happen (particularly handling authentication errors), but another issue is race conditions around file existence. Some of these need to be documented and others need slightly more defensive coding applied.
Needs defensive coding: if we resolve a file to a canonical object in process A, then delete it in process B, then call copy() on the object in process A; rather than raise a NotFoundError, process A will raise an unhandled DXError. This looks to be the case in pretty much every place where we call out to dnanexus resources. While we can't cover every case, we at least need to wrap DNAnexus exceptions.
If an error is recoverable, obv should use a try/except; however, if an error is not recoverable, might be nice just to use a contextmanager that wraps the exception, e.g.:
    with wrap_dx_exceptions():
        file_handler.get_download_url()
then within the contextmanager can do whatever exception parsing you like.
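A minimal sketch of that context manager idea, written with stand-in exception classes so it runs without dxpy installed. `DXError`, `ResourceNotFound`, `RemoteError`, and `NotFoundError` below are hypothetical placeholders for whatever dxpy and stor actually expose, and `get_download_url_of_deleted_file` only simulates the race described above; the wrapping pattern is the point, not the names. It uses Python 3 `raise ... from` syntax.

```python
from contextlib import contextmanager


# Hypothetical stand-ins for the platform's exceptions (assumption, not dxpy's real classes).
class DXError(Exception):
    pass


class ResourceNotFound(DXError):
    pass


# Hypothetical stand-ins for generic storage exceptions on the stor side.
class RemoteError(Exception):
    pass


class NotFoundError(RemoteError):
    pass


@contextmanager
def wrap_dx_exceptions():
    """Translate platform errors raised inside the block into storage errors."""
    try:
        yield
    except ResourceNotFound as e:
        # e.g. the object was deleted by another process after it was resolved
        raise NotFoundError(str(e)) from e
    except DXError as e:
        # any other platform error becomes a generic remote error
        raise RemoteError(str(e)) from e


def get_download_url_of_deleted_file():
    # Simulates calling a file handler whose underlying object no longer exists.
    raise ResourceNotFound("object no longer exists")
```

Callers can then catch `NotFoundError` (or the broader `RemoteError`) without knowing anything about the underlying platform library, and the original exception stays available on `__cause__` for debugging.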
Needs documentation: virtual paths are resolved the first time they are encountered; if a virtual path changes either its target file object or its location during an operation, that has undefined behavior in stor.
jtratner
left a comment
really close to being done - bit of code to DRY up first. Please see my comments for additional notes.
I'd really really like to see documentation written for this before we ship it.
Co-Authored-By: anujkumar93 <anujkumar.maverick@gmail.com>
    time.sleep(.5)
    self.assertFalse((self.test_dir / which_obj).exists())

    @skipIf(six.PY3, 'dxpy3 assumes utf-8 encoding, not suitable for gzip')
it shouldn't assume utf-8 encoding if you open with rb - can you put up a small example on the GH dx-toolkit bug tracker to demonstrate issue? (e.g., something like creating and uploading gzipped file that you then try to open using mode='rb')
pretty clear (but small) bug
There is no mode parameter in dxpy.DXFile.read(). For stor, if mode is 'r', DXPath.read_object() is called, and if mode is 'rb', DXPath.read_object().decode() is invoked. DXPath.read_object(), and hence dxpy.DXFile.read(), is supposed to return the raw bytes. Instead, they produce their result by calling decode('utf-8') (it's hardcoded).
I opened the issue at https://github.com/dnanexus/dx-toolkit/issues/426 . Please take a look.
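For reference, a small standalone illustration of why a hardcoded UTF-8 decode cannot work for gzipped content. It does not touch dxpy or stor at all; it only shows that gzip output is not valid UTF-8, so any read path that decodes unconditionally will fail on such files. The helper name `decodes_as_utf8` is made up for this example.

```python
import gzip

# Compress a small payload the way a gzipped upload would be produced.
payload = gzip.compress(b"hello stor")


def decodes_as_utf8(data):
    """Return True if the bytes are valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The original text is valid UTF-8, the compressed bytes are not:
# the gzip magic number starts with 0x1f 0x8b, and 0x8b is not a valid
# UTF-8 start byte.
plain_ok = decodes_as_utf8(b"hello stor")
gzip_ok = decodes_as_utf8(payload)

# Reading the bytes untouched and decompressing afterwards works fine.
roundtrip = gzip.decompress(payload)
```

Here `plain_ok` is `True`, `gzip_ok` is `False`, and `roundtrip` equals the original bytes, which is the behavior a binary read mode needs to preserve.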
check for DX_AUTH_TOKEN with every dxpy api call
…ar93/stor into BIOINF-982--mutation_api
Motivation
As part of stor support for DNAnexus paths
Implementation
The following APIs, including their test cases, are handled in this PR for DNAnexus paths:
The private methods constructed to support the above copy and copytree APIs are the following. Although these methods are functional, they have not been explicitly tested for every edge case, as they are built as private functions.
A couple of other things taken care of in this PR:
DX_AUTH_TOKEN must be set to be able to interact with DX through stor. We can possibly set a default value for this.
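One way the "check the token before every call" idea could look, as a rough sketch only. It assumes `DX_AUTH_TOKEN` is read from the process environment, which this PR description does not actually state; `UnauthorizedError`, `requires_dx_auth_token`, and `list_project` are hypothetical names invented for the example and are not taken from stor or dxpy.

```python
import functools
import os


class UnauthorizedError(Exception):
    """Hypothetical error raised when no auth token is configured."""


def requires_dx_auth_token(func):
    """Fail early, with a clear message, if DX_AUTH_TOKEN is missing or empty."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: the token lives in the environment under this name.
        if not os.environ.get("DX_AUTH_TOKEN"):
            raise UnauthorizedError(
                "DX_AUTH_TOKEN is not set; cannot talk to the DX platform")
        return func(*args, **kwargs)
    return wrapper


@requires_dx_auth_token
def list_project():
    # Placeholder for a function that would call into the platform API.
    return ["file-a", "file-b"]
```

A decorator keeps the check in one place, so each method that calls out to the platform fails with the same documented exception instead of an unwrapped library error.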