DNAnexus module within stor#2
Conversation
| NotFoundError = stor_exceptions.NotFoundError | ||
|
|
||
|
|
||
| def _parse_dx_error(exc, **kwargs): |
There was a problem hiding this comment.
this looks unused now - should you remove it? It's also missing a second else clause when none of the cases match that would lead to a confusing error message
There was a problem hiding this comment.
Yep, outdated and removed.
| wrapper.__doc__ = 'No-op for %r' % attr_name | ||
| return wrapper | ||
|
|
||
| abspath = _noop('abspath') |
There was a problem hiding this comment.
is this going to lead to bugs around dx://Proj:/parent/subdir/../anotherfile.txt - do we care about that behavior?
There was a problem hiding this comment.
abspath is not used for DXPaths in code base, and seeing that OBS paths in general have abspath as a noop, it is upto the user to use abspath cautiously with DXPaths.
| project=self.canonical_project | ||
| )[0] | ||
| else: | ||
| raise DXError('DX Projects cannot have a temporary download url') |
There was a problem hiding this comment.
looks pretty close to the right way to do this; however a couple points:
- since the if clause always raise an error, you can check it first, raise the exception and dedent here, i.e.:
if not self.resource:
raise DXError('Can only generate a download URL for individual files')
try:
return file_handler.get_download_url(...)
except: pass
- I think you want to check
resourceand not canonical resource before raising an error, no reason to go hit an API to canonicalize a project before raising this exception.
There was a problem hiding this comment.
Agreed on both points, and changed.
| return file_handler.get_download_url( | ||
| duration=lifetime, | ||
| preauthenticated=True, | ||
| filename=filename, |
There was a problem hiding this comment.
if you're already describing this, should filename just be what's returned from describe() if not present? what happens if filename is None here?
There was a problem hiding this comment.
If filename is None here, the get_download_url defaults to the name of the source file, which is the behavior we want.
| else: | ||
| raise DXError('DX Projects cannot have a temporary download url') | ||
| except ValueError: | ||
| raise DXError('DXPaths ending in folders cannot have a temporary download url') |
There was a problem hiding this comment.
how would you hit this exception? wouldn't you know this because you can't generate a canonical resource for this method?
I'm thinking it'd be clearer to write something like:
if not self.resource: error
canonical = self.canonicalize() # causes error if it's a folder already??
in the second case I think it's fine to just have an error that's just about data objects, etc
There was a problem hiding this comment.
Agreed after discussion. Changed the method to reflect this.
| if canonicalize: | ||
| yield DXCanonicalPath('dx://{}:/{}'.format(obj['project'], obj['id'])) | ||
| else: | ||
| dx_p = DXVirtualPath(self.drive + proj_name + ':' + obj['describe']['folder']) |
There was a problem hiding this comment.
I think this'd be clearer written as:
'{drive}{proj_name}:{folder}/{name}'.format(drive=drive, proj_name=proj_name, folder=obj['describe']['folder'], name=obj['describe']['name'])
``` which I believe does the same thing but avoids a level of indirection
There was a problem hiding this comment.
Yep, changed here and other places where it would be more intuitive to write like this.
| pass | ||
| # otherwise we could be a directory, so try to grab first | ||
| # file/subfolder | ||
| if self._is_folder(): |
There was a problem hiding this comment.
can you help me understand what the _is_folder check adds here? why not just try to list it directly and return False on the NotFoundError?
There was a problem hiding this comment.
can you ensure there's an integration test that shows the following is True?
# empty dirs should be seen as existing
folder_path = dx_p / 'myfolder'
folder_path.makedirs_p()
assert stor.exists(folder_path)
There was a problem hiding this comment.
should be a documented diff between swift vs. s3 vs. posix
Swift & S3 - "empty" folder is exists = False
Posix & DX - "empty" folder is exists = True
There was a problem hiding this comment.
Removed _is_folder
There is a unit test TestExists.test_true_empty_dir to check that empty dirs are seen as existing. Should I add an integration test as well?
| """ | ||
| if not self.resource: | ||
| return None | ||
| if self._is_folder(): |
There was a problem hiding this comment.
when I look here, it feels like this should just raise a NotFoundError that specifies no data object could be found for the given path.
| 'folder': self.resource.parent, | ||
| 'project': self.canonical_project | ||
| }] | ||
| results = dxpy.resolve_data_objects(objects=objects)[0] |
There was a problem hiding this comment.
aside, should you pass in limit=2 here, just cuz we only care 0 or 1
There was a problem hiding this comment.
Unfortunately resolve_data_objects doesn't have a limit argument, hence cannot pass it.
| .format(self.project)) | ||
| return proj_dict['id'] | ||
| except DXError as e: | ||
| raise DuplicateProjectError('Duplicate projects were found with given name ({})' |
There was a problem hiding this comment.
this seems like a dangerous check, because it's possible you'd swallow up a different error (e.g., about authentication or something else). can you list projects with a limit of 2 and then handle 0/1/2 projects with appropriate exceptions?
There was a problem hiding this comment.
Changed DXError to DXSearchError which is only raised when there are multiple projects.
Since find_one_project does not have a limit argument, the only other way is to pass zero_ok=False, in which case DXSearchError is raised for 0/2 projects and there's no way to differentiate the two scenarios.
|
|
||
| Args: | ||
| canonicalize (boolean): whether to return canonicalized paths | ||
| only (str): "objects" for only objects, "folders" for only folder, |
There was a problem hiding this comment.
makes sense to keep because using it below
| """List the path as a dir, returning top-level directories and files. | ||
|
|
||
| Args: | ||
| canonicalize (boolean): whether to return canonicalized paths |
There was a problem hiding this comment.
canonicalize (bool, default False): if True, return canonical paths
There was a problem hiding this comment.
Changed here and other places. Will include in next commit.
Yes, added a maximum limit of 15s, after which the test case fails |
Agreed. Made a 'setup_file' method that waits for the files to close before returning the handler. |
Implement apis for DNAnexus inside stor. Also implement unit and integration testing for DNAnexus paths inside stor. Setup DNAnexus credentials for testing.
Motivation
As part of the epic to support DNAnexus paths within stor.
Implementation