feat(download): v2 CI harness + minimal ListDatasets#681
Merged
Conversation
8bead00 to
89d9148
Compare
368286e to
b711dd3
Compare
89d9148 to
aa48c66
Compare
b711dd3 to
60a9c6b
Compare
aa48c66 to
6882860
Compare
This was referenced Apr 24, 2026
60a9c6b to
0ead8d5
Compare
6882860 to
f00c5f7
Compare
This was referenced Apr 24, 2026
nanjiangshu
reviewed
May 12, 2026
nanjiangshu
reviewed
May 12, 2026
nanjiangshu
reviewed
May 12, 2026
nanjiangshu
reviewed
May 12, 2026
0ead8d5 to
dfcea63
Compare
jhagberg
added a commit
that referenced
this pull request
May 13, 2026
Three changes from review comments (nanjiangshu): - v2.ListDatasets uses url.JoinPath(BaseURL, "datasets") instead of string concatenation. Handles trailing slashes correctly and surfaces a clear error on a malformed BaseURL. - v2 getJSON sends SDA-Client-Version alongside the existing User-Agent header so log pipelines keyed on the v1 header still see the version on v2 traffic. - v2_types.go toFile() drops the int(f.DecryptedSize) cast now that apiclient.File.DecryptedFileSize is int64 (from the #679 amend).
f00c5f7 to
82e21b0
Compare
jhagberg
added a commit
that referenced
this pull request
May 13, 2026
Three follow-ups after rebasing on the amended #679 and #681: - v2_test.go: int64 assertion on the widened File.DecryptedFileSize - v2.go: blank line before paginate return (lint nlreturn) - list.go: drop now-redundant int(info.Size) cast on DatasetInfo.Size (both formatFileSizeOutput and DatasetInfo.Size are int64)
jhagberg
added a commit
that referenced
this pull request
May 13, 2026
Renames the new V1Client.DownloadFile and V2Client.DownloadFile call sites from c.cfg.Version to c.cfg.ClientVersion, mirrors the v2 "SDA-Client-Version alongside User-Agent" header pair that #681 picked up, and updates the downloadOneWithClient test fixture.
dfcea63 to
e9d7558
Compare
jhagberg
added a commit
that referenced
this pull request
May 19, 2026
- Switch ListDatasets URL construction to url.JoinPath for trailing-slash tolerance (nanjiangshu); thread error from JoinPath through paginate. - Send SDA-Client-Version alongside User-Agent on v2 calls so any downstream monitoring keyed on the SDA header still sees v2 traffic (nanjiangshu). - Widen v2File.toFile() conversion to drop the int(f.DecryptedSize) cast that paired with the v1 int field; relies on #679 widening DecryptedFileSize to int64. - Pick up #679's package rename apiclient -> downloadclient and Config field/parameter renames (ClientVersion, apiVersion) in v2 code paths.
82e21b0 to
43c4f11
Compare
jhagberg
added a commit
that referenced
this pull request
May 19, 2026
- Switch ListDatasets URL construction to url.JoinPath for trailing-slash tolerance (nanjiangshu); thread error from JoinPath through paginate. - Send SDA-Client-Version alongside User-Agent on v2 calls so any downstream monitoring keyed on the SDA header still sees v2 traffic (nanjiangshu). - Widen v2File.toFile() conversion to drop the int(f.DecryptedSize) cast that paired with the v1 int field; relies on #679 widening DecryptedFileSize to int64. - Pick up #679's package rename apiclient -> downloadclient and Config field/parameter renames (ClientVersion, apiVersion) in v2 code paths.
43c4f11 to
ba8cb90
Compare
jhagberg
added a commit
that referenced
this pull request
May 19, 2026
Three follow-ups on the rebased base: - v2_test.go: int64 assertion on the widened File.DecryptedFileSize - v2.go: blank line before paginate return (lint nlreturn) - integration_v2_test.go: pick up the apiclient -> downloadclient rename and the ClientVersion field name (the v2 integration smoke test was added before #679's review-amend landed)
jhagberg
added a commit
that referenced
this pull request
May 19, 2026
Two follow-ups on the rebased base: - v2.go DownloadFile: emit SDA-Client-Version alongside User-Agent on v2 so monitoring keyed on the SDA header sees v2 download traffic (matches #681's review-amend on v2 list calls). - download_test.go downloadOneWithClient: use the renamed Config field (ClientVersion, not Version) so the v1 download test fixture matches the post-rename API. This commit also carries the gofmt re-sort of download/download.go and download_test.go imports after the apiclient -> downloadclient rename cascaded into pr4.
nanjiangshu
approved these changes
May 20, 2026
Contributor
nanjiangshu
left a comment
There was a problem hiding this comment.
Looks good to me, well done!
zeidlitz
approved these changes
May 21, 2026
Contributor
zeidlitz
left a comment
There was a problem hiding this comment.
Looks pretty straight forward to me. Left a few questions for curiosity.
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Carries forward fallout from #679/#681 review-amends, plus the JoinPath consistency fix flagged by nanjiang on ListFiles and applied to DatasetInfo for symmetry: - v2.go ListFiles + DatasetInfo: switch from string concat to url.JoinPath(c.cfg.BaseURL, "datasets", url.PathEscape(datasetID), ...) matching ListDatasets. PathEscape stays because the SDA schema allows DOI-format dataset IDs (e.g. "10.1234/567") which contain "/", as jbygdell pointed out. - v2_test.go: int64 assertion on the widened File.DecryptedFileSize - v2.go: blank line before paginate return (lint nlreturn) - integration_v2_test.go: pick up the apiclient -> downloadclient rename and the ClientVersion field name (the v2 integration smoke test was added before #679's review-amend landed)
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Two follow-ups on the rebased base: - v2.go DownloadFile: emit SDA-Client-Version alongside User-Agent on v2 so monitoring keyed on the SDA header sees v2 download traffic (matches #681's review-amend on v2 list calls). - download_test.go downloadOneWithClient: use the renamed Config field (ClientVersion, not Version) so the v1 download test fixture matches the post-rename API. This commit also carries the gofmt re-sort of download/download.go and download_test.go imports after the apiclient -> downloadclient rename cascaded into pr4.
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
- Switch ListDatasets URL construction to url.JoinPath for trailing-slash tolerance (nanjiangshu); thread error from JoinPath through. - Send SDA-Client-Version alongside User-Agent on v2 calls so any downstream monitoring keyed on the SDA header still sees v2 traffic (nanjiangshu). - Widen v2File.toFile() conversion to drop the int(f.DecryptedSize) cast that paired with the v1 int field; relies on #679 widening DecryptedFileSize to int64. - Extract const maxErrorBodyBytes = 200 and use it in v2.go's error-body truncation (zeidlitz); the same constant becomes useful in #685's chore commit (caps the body read with io.LimitReader, max+1). - Drop the dead 'invalid base URL' string-equality blocks in list.go's datasetFiles and Datasets (zeidlitz noted errors.Is would be nicer); after #679's fail-fast validation in downloadclient.New() the new error is 'Config.BaseURL is required' and it fires before ListFiles or ListDatasets is ever called, so the special case was unreachable. - Picks up #679's package rename apiclient -> downloadclient and Config field/parameter renames (ClientVersion, apiVersion) in v2 code paths.
ba8cb90 to
6e99807
Compare
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Carries forward fallout from #679/#681 review-amends, plus the JoinPath consistency fix flagged by nanjiang on ListFiles and applied to DatasetInfo for symmetry, plus zeidlitz's maxErrorBodyBytes constant adopted by the LimitReader site that this PR introduces: - v2.go ListFiles + DatasetInfo: switch from string concat to url.JoinPath(c.cfg.BaseURL, "datasets", url.PathEscape(datasetID), ...) matching ListDatasets. PathEscape stays because the SDA schema allows DOI-format dataset IDs (e.g. "10.1234/567") which contain "/", as jbygdell pointed out. - v2.go LimitReader: switch the literal 201 to maxErrorBodyBytes+1 (zeidlitz; constant defined in #681's review-amend). - v2_test.go: int64 assertion on the widened File.DecryptedFileSize. - v2.go: blank line before paginate return (lint nlreturn). - integration_v2_test.go: pick up the apiclient -> downloadclient rename and the ClientVersion field name (the v2 integration smoke test was added before #679's review-amend landed).
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Three follow-ups on the rebased base: - v2.go DownloadFile: emit SDA-Client-Version alongside User-Agent on v2 so monitoring keyed on the SDA header sees v2 download traffic (matches #681's review-amend on v2 list calls). - v2.go DownloadFile error path: adopt maxErrorBodyBytes for the LimitReader cap and the body truncation (constant from #681). - download_test.go downloadOneWithClient: use the renamed Config field (ClientVersion, not Version) so the v1 download test fixture matches the post-rename API. This commit also carries the gofmt re-sort of download/download.go and download_test.go imports after the apiclient -> downloadclient rename cascaded into pr4.
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
- Switch ListDatasets URL construction to url.JoinPath for trailing-slash tolerance (nanjiangshu); thread error from JoinPath through. - Send SDA-Client-Version alongside User-Agent on v2 calls so any downstream monitoring keyed on the SDA header still sees v2 traffic (nanjiangshu). - Widen v2File.toFile() conversion to drop the int(f.DecryptedSize) cast that paired with the v1 int field; relies on #679 widening DecryptedFileSize to int64. - Extract const maxErrorBodyBytes = 200 and use it in v2.go's error-body truncation (zeidlitz); the same constant becomes useful in #685's chore commit (caps the body read with io.LimitReader, max+1). - Drop the dead 'invalid base URL' string-equality blocks in list.go's datasetFiles and Datasets (zeidlitz noted errors.Is would be nicer); after #679's fail-fast validation in downloadclient.New() the new error is 'Config.BaseURL is required' and it fires before ListFiles or ListDatasets is ever called, so the special case was unreachable. - Picks up #679's package rename apiclient -> downloadclient and Config field/parameter renames (ClientVersion, apiVersion) in v2 code paths.
6e99807 to
b834ee9
Compare
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Carries forward fallout from #679/#681 review-amends, plus the JoinPath consistency fix flagged by nanjiang on ListFiles and applied to DatasetInfo for symmetry, plus zeidlitz's maxErrorBodyBytes constant adopted by the LimitReader site that this PR introduces: - v2.go ListFiles + DatasetInfo: switch from string concat to url.JoinPath(c.cfg.BaseURL, "datasets", url.PathEscape(datasetID), ...) matching ListDatasets. PathEscape stays because the SDA schema allows DOI-format dataset IDs (e.g. "10.1234/567") which contain "/", as jbygdell pointed out. - v2.go LimitReader: switch the literal 201 to maxErrorBodyBytes+1 (zeidlitz; constant defined in #681's review-amend). - v2_test.go: int64 assertion on the widened File.DecryptedFileSize. - v2.go: blank line before paginate return (lint nlreturn). - integration_v2_test.go: pick up the apiclient -> downloadclient rename and the ClientVersion field name (the v2 integration smoke test was added before #679's review-amend landed).
jhagberg
added a commit
that referenced
this pull request
May 21, 2026
Three follow-ups on the rebased base: - v2.go DownloadFile: emit SDA-Client-Version alongside User-Agent on v2 so monitoring keyed on the SDA header sees v2 download traffic (matches #681's review-amend on v2 list calls). - v2.go DownloadFile error path: adopt maxErrorBodyBytes for the LimitReader cap and the body truncation (constant from #681). - download_test.go downloadOneWithClient: use the renamed Config field (ClientVersion, not Version) so the v1 download test fixture matches the post-rename API. This commit also carries the gofmt re-sort of download/download.go and download_test.go imports after the apiclient -> downloadclient rename cascaded into pr4.
Record the sensitive-data-archive commit that the upcoming v2 integration-CI job will check out. Pinning prevents upstream drift from silently breaking our CI. Pin: 608878fa453770fcb3962bf0239366905c125982 — latest origin/main commit that directly touched dev-tools/download-v2-dev/ (Copilot-review fixes to the dev compose added in neicnordic/sensitive-data-archive#2368).
v2's FileInfo has checksums as array and a server-provided downloadUrl. Conversion to the shared File type happens in v2File.toFile() — maps checksums to legacy scalar fields (prefer sha256).
First real v2 wire call. Single-page only — pagination arrives in #676 with an explicit error if the server returns a non-null nextPageToken (prevents silent result truncation). ListFiles and DatasetInfo are still stubs that error out with 'not implemented until #676'. The list command adapts accordingly: --dataset <id> fails with that stub error (needs ListFiles), while --datasets branches on version and prints just dataset IDs for v2 (per-dataset file count + size enrichment lands with DatasetInfo in #676). download_test.go and list_test.go cover both paths. All '#N' references in V2Client doc comments, stub-error messages, and the boundary tests use the real sub-issue numbers so GitHub auto-links to the intended issue rather than whatever PR happens to carry that short number in the repository.
… error Clarifies the Config.Version godoc so it documents both v1 (SDA-Client-Version header) and v2 (User-Agent "sda-cli/<version>") mappings, trims the pagination-not-yet-implemented error message to the first clause, and points the trimmed message at the real follow-up sub-issue (#676) rather than a placeholder.
Boots dev-tools/download-v2-dev/ from neicnordic/sensitive-data-archive at a pinned commit, fetches a dev token from mockauth, and runs tests guarded by //go:build integration. Pin documented in docs/v2-dev-stack-pin.md. First integration test: V2Client.ListDatasets against the seeded EGAD00000000001 dataset.
- Wait loop 120s → 240s covers cold-runner image-pull time. - Wait for both download (:8085/health/ready) and mockauth (:8000/tokens) before the token-fetch step — mockauth's first-run pip install can lag behind download's readiness, making /tokens fail intermittently. - Dump all compose services on failure, not just download (postgres, minio, reencrypt, or mockauth failures show up too). - Run the log dump BEFORE teardown so the dump still has containers to read from (teardown removes the compose project).
After the apiclient abstraction in #679 landed, list.datasetFiles and list.Datasets returned raw V1Client errors on HTTP / transport failures, dropping the 'failed to get files/datasets, reason: ...' prefix that callers saw before. This restores the legacy prefix for both commands while leaving the bare 'invalid base URL' error untouched (TestListDatasetNoUrl and TestListDatasetsNoUrl assert on that string directly). Caught by code review on #675. download already had the wrap in getFileIDURL but list lost it during the abstraction refactor.
- Switch ListDatasets URL construction to url.JoinPath for trailing-slash tolerance (nanjiangshu); thread error from JoinPath through. - Send SDA-Client-Version alongside User-Agent on v2 calls so any downstream monitoring keyed on the SDA header still sees v2 traffic (nanjiangshu). - Widen v2File.toFile() conversion to drop the int(f.DecryptedSize) cast that paired with the v1 int field; relies on #679 widening DecryptedFileSize to int64. - Extract const maxErrorBodyBytes = 200 and use it in v2.go's error-body truncation (zeidlitz); the same constant becomes useful in #685's chore commit (caps the body read with io.LimitReader, max+1). - Drop the dead 'invalid base URL' string-equality blocks in list.go's datasetFiles and Datasets (zeidlitz noted errors.Is would be nicer); after #679's fail-fast validation in downloadclient.New() the new error is 'Config.BaseURL is required' and it fires before ListFiles or ListDatasets is ever called, so the special case was unreachable. - Picks up #679's package rename apiclient -> downloadclient and Config field/parameter renames (ClientVersion, apiVersion) in v2 code paths.
b834ee9 to
779d968
Compare
jhagberg
added a commit
that referenced
this pull request
May 26, 2026
Carries forward fallout from #679/#681 review-amends, plus the JoinPath consistency fix flagged by nanjiang on ListFiles and applied to DatasetInfo for symmetry, plus zeidlitz's maxErrorBodyBytes constant adopted by the LimitReader site that this PR introduces: - v2.go ListFiles + DatasetInfo: switch from string concat to url.JoinPath(c.cfg.BaseURL, "datasets", url.PathEscape(datasetID), ...) matching ListDatasets. PathEscape stays because the SDA schema allows DOI-format dataset IDs (e.g. "10.1234/567") which contain "/", as jbygdell pointed out. - v2.go LimitReader: switch the literal 201 to maxErrorBodyBytes+1 (zeidlitz; constant defined in #681's review-amend). - v2_test.go: int64 assertion on the widened File.DecryptedFileSize. - v2.go: blank line before paginate return (lint nlreturn). - integration_v2_test.go: pick up the apiclient -> downloadclient rename and the ClientVersion field name (the v2 integration smoke test was added before #679's review-amend landed).
jhagberg
added a commit
that referenced
this pull request
May 26, 2026
Three follow-ups on the rebased base: - v2.go DownloadFile: emit SDA-Client-Version alongside User-Agent on v2 so monitoring keyed on the SDA header sees v2 download traffic (matches #681's review-amend on v2 list calls). - v2.go DownloadFile error path: adopt maxErrorBodyBytes for the LimitReader cap and the body truncation (constant from #681). - download_test.go downloadOneWithClient: use the renamed Config field (ClientVersion, not Version) so the v1 download test fixture matches the post-rename API. This commit also carries the gofmt re-sort of download/download.go and download_test.go imports after the apiclient -> downloadclient rename cascaded into pr4.
zeidlitz
approved these changes
May 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Related issue(s) and PR(s)
Closes #675. Stacks on #679. Part of umbrella issue #663.
Description
Wires
V2Client.ListDatasets(single-page) against the real v2 server, lands the v2 wire types (v2File, response envelopes,datasetInfoResponse) that #685 and #686 consume, and adds a CI workflow that boots a pinnedneicnordic/sensitive-data-archivedev stack for integration tests.ListFilesandDatasetInforemain stubs until #685. Also restores the legacy"failed to get X, reason: ..."error-wrap prefix for thelistcommand (accidentally dropped when the apiclient abstraction landed in #679).Updated from review (nanjiangshu):
V2Client.ListDatasetsusesurl.JoinPathinstead of string concatenation,getJSONnow sendsSDA-Client-VersionalongsideUser-Agentso v1-style log pipelines still see the version on v2, and theint(f.DecryptedSize)cast inv2File.toFile()drops out following theFile.DecryptedFileSizewidening toint64in #679. The HTTP-client timeout question from review is being discussed inline —Client.Timeoutwould cut multi-GB downloads in #686, so the counter-suggestion isTransport.ResponseHeaderTimeout.How to test
go build ./...,go vet ./...,go test ./...(174 pass) — clean.go test -tags=integration ./...against a booted dev stack —TestV2_ListDatasets_Smokepasses.sda-cli list --datasets --api-version v2prints the seeded dataset IDs.sda-cli list --dataset X --api-version v2errors with the stub message until feat(download): v2 full list — pagination, filters, DatasetInfo #685.integration-v2exercises the full setup on GitHub Actions.Test plan
go test ./...: 174 passgo vet ./...: cleangolangci-lint run: cleangofmt -l .: cleanTestV2_ListDatasets_Smokeagainst the live dev stack--datasetsprintsEGAD00000000001;--dataset X --api-version v2returns stub error;--api-version v9rejected