feat(download): v2 file download via /files/{fileId} #686
Open
jhagberg wants to merge 6 commits into
This was referenced May 13, 2026
Interface-only change. Implementations stubbed until later tasks in this PR.
Resolves UserArg (path or ID) via ListFiles + legacy substring match,
then GETs /s3/{dataset}/{filePath}. Returns the raw body for the
caller to stream to disk.
Substring matching preserved from download.getFileIDURL — known
imprecise but legacy-compatible. v2 uses exact filePath filter.
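To make the "known imprecise" caveat concrete, here is a minimal sketch contrasting the two matching strategies. The `File` struct and both function names are hypothetical stand-ins, and the substring logic is an assumed approximation of the legacy behavior, not a copy of `download.getFileIDURL`.

```go
package main

import (
	"fmt"
	"strings"
)

// File is a hypothetical, trimmed-down stand-in for a listing entry.
type File struct {
	FileID   string
	FilePath string
}

// matchSubstring approximates the legacy v1 behavior (assumption): the first
// entry whose path contains userArg, or whose ID equals it, wins.
func matchSubstring(files []File, userArg string) (File, bool) {
	for _, f := range files {
		if strings.Contains(f.FilePath, userArg) || f.FileID == userArg {
			return f, true
		}
	}
	return File{}, false
}

// matchExact accepts only a full path or full ID match.
func matchExact(files []File, userArg string) (File, bool) {
	for _, f := range files {
		if f.FilePath == userArg || f.FileID == userArg {
			return f, true
		}
	}
	return File{}, false
}

func main() {
	files := []File{
		{FileID: "F1", FilePath: "dir/sample.c4gh.bak"},
		{FileID: "F2", FilePath: "dir/sample.c4gh"},
	}
	a, _ := matchSubstring(files, "dir/sample.c4gh")
	b, _ := matchExact(files, "dir/sample.c4gh")
	fmt.Println(a.FileID, b.FileID) // prints: F1 F2
}
```

With listing order as the tie-breaker, the substring variant picks the `.bak` entry even though an exact match exists, which is the kind of imprecision an exact filter avoids.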
Resolves UserArg via exact filePath filter (paths) or list+match (ids),
follows server-provided downloadUrl, sends X-C4GH-Public-Key. 403 from
either list or download surfaces as ambiguous 'does not exist or
access denied' (existence-leakage prevention).
File gains a DownloadURL field (empty on v1). Required so downloads
can follow the server's URL instead of constructing /files/{id}
client-side.
datasetCase/recursiveCase/fileCase now call a shared downloadOne helper that goes through client.DownloadFile, followed by the new writeBodyToDisk helper (progress-bar + .part atomic rename) extracted from the old downloadFile.

Removes the #679 backward-compat wrappers (download.GetDatasets, download.GetFilesInfo, download.File alias) and the legacy helpers that moved to apiclient in #679 (setupCookieJar, cookieJar, cookiePath, getBody) plus getFileIDURL and downloadFile. UserArg (path or fileId) is now resolved inside V1Client.DownloadFile / V2Client.DownloadFile.

recursiveCase uses v2's pathPrefix filter server-side when --api-version=v2; v1 still filters client-side since it has no server-side filter.

Adds a test for the v2-missing-pubkey guard and replaces the old downloadFile-cleanup test with a writeBodyToDisk-cleanup test driven by an errReader that fails mid-stream. Tests for deleted helpers (TestGetDatasets, TestGetFilesInfo, TestFileIdUrl, TestGetBody*, TestSetupCookiejar) are removed; their behavior is covered by apiclient/v1_test.go (TestV1Client_ListDatasets_*, TestV1Client_DownloadFile_*).
Pulls the dev c4gh pubkey from the reencrypt container, calls DownloadFile for the seeded test file, verifies size and non-empty body.
Renames the new V1Client.DownloadFile and V2Client.DownloadFile call sites from c.cfg.Version to c.cfg.ClientVersion, mirrors the v2 "SDA-Client-Version alongside User-Agent" header pair that #681 picked up, and updates the downloadOneWithClient test fixture.
Related issue(s) and PR(s)
Closes #677. Stacks on #685.
Description
Wires v2 file download through `GET /files/{fileId}` plus the server-provided `downloadUrl` plus the `X-C4GH-Public-Key` header, retires the legacy `/s3`-transfer helpers, and shrinks `download/download.go` to a `downloadOne` → `writeBodyToDisk` flow sitting on `apiclient.Client.DownloadFile`. Removes the "v2 download is not yet implemented" guard that #685 put in.

Key design points
- `Client.DownloadFile(ctx, DownloadRequest) (DownloadResult, error)` is the new abstraction. `DownloadResult` carries the canonical `File` (the authoritative filename, since UserArg can be a bare fileId), the response body, and the server's `Content-Length`.
- `V2Client.DownloadFile` resolves UserArg via `ExactPath` (paths) or a `ListFiles` scan (bare ids), then hits the server-provided `downloadUrl` with `X-C4GH-Public-Key`. URLs go through `url.Parse` + `ResolveReference` so both relative and absolute forms work.
- When the resolved URL is cross-origin relative to `BaseURL`, the `Authorization` header is dropped. A pre-signed cross-origin URL is self-authenticating; leaking the bearer to a third-party host is the kind of thing you read about in CVE writeups.
- A 403 from either list or download is flattened to `dataset/file does not exist or access denied: <arg>`; the existence-leakage contract is preserved from v1.
- `downloadOne` uses `result.File.FilePath` (anonymized) for the on-disk path. Downloading by fileId now produces a meaningfully named file instead of `outdir/<fileId>`. A containment check rejects any `outputPath` escaping `outDir`, so a server-provided path with `../` can't write outside the configured output directory.
- Removes `downloadFile`, `getFileIDURL`, `GetDatasets`, `GetFilesInfo`, `getBody`, `setupCookieJar`, the `download.File` alias, and the `cookieJar`/`cookiePath`/`appVersion` package vars. The `apiclient.WithV1CookieJar` option also retires here in the same refactor.
- Carries the rename (`Version` → `ClientVersion`) and `int64` widening from the review-amend of #679 (feat(apiclient): Client interface + V1Client wrapper) into the new V1/V2 `DownloadFile` sites, and mirrors the `SDA-Client-Version` + `User-Agent` header pair from #681 (feat(download): v2 CI harness + minimal ListDatasets) on v2.

Out of scope (tracked separately)
- […] `datasetCase`/`recursiveCase` (separate follow-up).

How to test
Local:

- `go build ./...`, `go vet ./...`, `gofmt -l .`: clean.
- `go test ./...`: 176 pass.
- `golangci-lint run --timeout 5m`: 0 issues.

Integration (build tag `integration`, against the live v2 dev stack):

- `TestV2_ListDatasets_Smoke`, `TestV2_ListFiles_Smoke`, `TestV2_ListFiles_ExactPath_Smoke`, `TestV2_ListFiles_PathPrefix_NoMatch`, `TestV2_DatasetInfo_Smoke`, `TestV2_DownloadFile_EndToEnd`: all pass.

Manual, rebuilt `sda-cli` against the live dev stack:

- `download --api-version v2 --dataset-id EGAD00000000001 --pubkey <pem> test-file.c4gh` writes the file (~1.2 kB c4gh bytes).
- `download --api-version v2 ... EGAF00000000001` (by fileId) writes `test-file.c4gh`: the canonical name, not the bare fileId.
- `download --api-version v2` without `--pubkey` errors with `v2 downloads require --pubkey`.
- `download --api-version v2 --dataset-id DOES_NOT_EXIST ...` errors with `dataset/file does not exist or access denied: <arg>` (403 flattened).

Test plan
- `go test ./...`: 176 pass
- `go vet ./...`: clean
- `golangci-lint run`: clean
- `gofmt -l .`: clean