Describe the bug
core / ubuntu-latest / hf / hf_dataset failed on main in the Behavior Test workflow.
The triggering commit only changed README.md and website/static/img/architectural.png, so the failure does not appear to be caused by an HF service code change in that commit.
The job created a temporary private dataset repo successfully:
OPENDAL_HF_REPO_ID=opendal/test-dataset-26216207614-test-b4d1f2ca
OPENDAL_HF_REPO_TYPE=dataset
OPENDAL_TEST=hf
The behavior run then hit repeated 10s I/O timeouts against the real HF dataset backend. The failures were spread across write/delete/list-related tests instead of one assertion:
will retry Write (attempt 1) after 1s because: Unexpected (temporary) at write => io operation timeout reached
Context:
timeout: 10
will retry Delete (attempt 1) after 1s because: Unexpected (temporary) at delete => io operation timeout reached
Context:
timeout: 10
failures:
behavior::test_read_full
behavior::test_batch_delete
behavior::test_list_file_with_recursive
behavior::test_list_dir_with_file_path
test result: FAILED. 88 passed; 4 failed; 0 ignored; 0 measured; 0 filtered out; finished in 119.18s
Post-job cleanup also failed:
Cleanup failed: HTTP 403: {"error":"You have read access but not the required permissions for this operation"}
This suggests we should inspect both the HF dataset behavior-test setup and the token/repo-permission model. It may also be backend slowness/rate-limiting or our 10s timeout being too aggressive for HF/XET writes under this test shape.
cc @kszucs since you authored the recent HF/XET write path and related HF fixes.
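To help separate the two symptoms above (timeouts during the run, a 403 during cleanup), here is a minimal triage sketch that tallies log lines by operation and failure kind. It uses only the Rust standard library and matches on the literal substrings seen in this report; the `Category` enum, the function names, and the "non-retry" label are hypothetical and are not part of OpenDAL.

```rust
use std::collections::BTreeMap;

/// Coarse failure kinds for triage (hypothetical names, not OpenDAL types).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Category {
    Timeout,
    Permission,
    Other,
}

/// Classify one log line using substrings that appear in this report.
/// Returns `None` for lines that are neither retries nor cleanup failures.
fn classify(line: &str) -> Option<Category> {
    if line.contains("io operation timeout reached") {
        Some(Category::Timeout)
    } else if line.contains("HTTP 403") {
        Some(Category::Permission)
    } else if line.starts_with("will retry ") || line.starts_with("Cleanup failed") {
        Some(Category::Other)
    } else {
        None
    }
}

/// Extract the operation name from a "will retry <Op> (attempt N) ..." line.
fn retried_operation(line: &str) -> Option<&str> {
    line.strip_prefix("will retry ")?.split_whitespace().next()
}

/// Count classified lines per (operation, category).
/// Lines that are not retries are grouped under the label "non-retry".
fn tally(log: &str) -> BTreeMap<(String, Category), usize> {
    let mut counts = BTreeMap::new();
    for line in log.lines().map(str::trim) {
        if let Some(category) = classify(line) {
            let op = retried_operation(line).unwrap_or("non-retry").to_string();
            *counts.entry((op, category)).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    // Sample lines copied from the failing job's log.
    let log = "\
will retry Write (attempt 1) after 1s because: Unexpected (temporary) at write => io operation timeout reached
will retry Delete (attempt 1) after 1s because: Unexpected (temporary) at delete => io operation timeout reached
Cleanup failed: HTTP 403: {\"error\":\"You have read access but not the required permissions for this operation\"}";

    for ((op, category), n) in tally(log) {
        println!("{op}: {category:?} x{n}");
    }
}
```

Running this over the full job log should show whether the timeouts cluster on one operation or are spread evenly, which would help tell a backend-wide slowdown from a problem in a single write or delete path.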
Steps to Reproduce
Run the core behavior test matrix for the HF dataset setup on Linux:
# In CI this is generated by .github/workflows/test_behavior_core.yml
# via .github/services/hf/hf_dataset/action.yml.
OPENDAL_TEST=hf \
OPENDAL_HF_REPO_TYPE=dataset \
OPENDAL_HF_REPO_ID=<temporary-private-dataset-repo> \
OPENDAL_HF_TOKEN=<token> \
RUST_TEST_THREADS=1 \
cargo test -p opendal --features services-hf,tests behavior
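Before running the command above locally, it can help to confirm that all four variables are set, since a missing token or repo ID could surface later as a less obvious backend error. The following is a minimal preflight sketch using only the Rust standard library; `REQUIRED` and `missing_vars` are hypothetical helpers for illustration and do not exist in the OpenDAL test harness.

```rust
use std::env;

/// Variables set by the reproduction command in this report.
const REQUIRED: [&str; 4] = [
    "OPENDAL_TEST",
    "OPENDAL_HF_REPO_TYPE",
    "OPENDAL_HF_REPO_ID",
    "OPENDAL_HF_TOKEN",
];

/// Return the names of required variables that are unset or blank.
/// `lookup` is injected so the check can be exercised without touching
/// the real process environment.
fn missing_vars<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    REQUIRED
        .iter()
        .copied()
        .filter(|&name| lookup(name).map_or(true, |value| value.trim().is_empty()))
        .collect()
}

fn main() {
    let missing = missing_vars(|name| env::var(name).ok());
    if missing.is_empty() {
        println!("all HF behavior-test variables are set");
    } else {
        // Print names only; never echo the token value.
        // A CI preflight would likely exit non-zero here; this sketch only reports.
        eprintln!("missing or empty: {}", missing.join(", "));
    }
}
```

This check only verifies that the variables are present. It does not validate that the token has write or delete permission on the repo, which is the open question raised by the 403 during cleanup.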
The observed failure happened in GitHub Actions on Ubuntu 24.04 with Rust 1.95.0.
Expected Behavior
The HF dataset behavior test should pass reliably, or fail with a clear HF/OpenDAL error that identifies the real backend/token problem. It should not fail multiple unrelated behavior tests via generic 10s I/O timeouts.
Additional Context
Relevant local files:
.github/services/hf/hf_dataset/action.yml
.github/actions/hf-temp-repo/setup.js
.github/actions/hf-temp-repo/cleanup.js
core/services/hf/
Related but different issue: #7367 tracked the Java blocking HF/XET segfault and was mitigated by disabling Java HF behavior tests. This issue is about the core Rust HF dataset behavior job timing out on main.
Triggering commit on main: 10975e960a18e9133b66683ebc2a7f4ad69d2885, docs: update architecture overview image (#7574).