Refactor storage access layer using obstore and obspec #27
Open
nvictus wants to merge 31 commits into higlass:master from nvictus:refactor-nezar
- Fix logic in _resolve_search_dir and _resolve_path functions
- Add detailed docstrings explaining prefix handling behavior
- Clarify comments about directory vs file prefix logic

- Add tests/test_caching.py with 40 tests for LRUCache and CachedStore
- Add tests/test_httpfs.py with 30 tests for FUSE operations
- Test multi-tier caching behavior and scheme-based cache isolation
- Test path_to_url helper, load_store function, and all FUSE methods
- Use obstore.MemoryStore for realistic testing scenarios
- Include proper exception handling and error condition testing

- Refactor CachedStore to use base_url parameter instead of separate scheme
- Add dedicated cache key generation methods for metadata and blocks
- Include URI scheme in cache keys to prevent collisions across storage backends
- Fix critical bug in get_range() method passing wrong path parameter
- Simplify list_with_delimiter() implementation

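To illustrate why including the URI scheme in cache keys prevents collisions, here is a minimal sketch. The function names `metadata_key` and `block_key` and the key layout are hypothetical and are not taken from the PR; the only idea borrowed from the commit message is that keys are derived from a `base_url` that carries the scheme.

```python
def metadata_key(base_url: str, path: str) -> str:
    """Hypothetical cache key for the metadata of one object."""
    # base_url keeps its scheme (e.g. "https://..." or "s3://..."),
    # so the same host/path under two backends yields two distinct keys.
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}#meta"


def block_key(base_url: str, path: str, block_index: int) -> str:
    """Hypothetical cache key for one fixed-size block of an object."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}#block={block_index}"


if __name__ == "__main__":
    # Same name under two different schemes: the keys do not collide.
    print(metadata_key("https://example", "README.txt"))
    print(metadata_key("s3://example", "README.txt"))
    print(block_key("s3://example", "/README.txt", 3))
```

With a scheme-free key such as `example/README.txt`, an HTTPS object and an S3 object sharing that name would overwrite each other in a shared cache.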
- Update to use obspec.exceptions for proper error handling with map_exception()
- Replace isinstance checks with proper exception mapping
- Enhance error handling in getattr(), readdir(), and read() methods
- Add proper FUSE error codes (EACCES, ENOENT, EIO) for different error types
- Update _load_cached_store() method for consistent store creation
- Improve logging and debug output formatting

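As a rough illustration of translating storage errors into FUSE error codes, the sketch below uses Python's built-in exception classes as stand-ins. It does not use `obspec.exceptions` or `map_exception()`, and both the helper name `to_fuse_errno` and the particular pairing of exceptions with EACCES, ENOENT, and EIO are assumptions rather than the PR's actual mapping.

```python
import errno

# Assumed mapping from (stand-in) exception types to errno codes.
# Order matters: more specific types should come first.
_ERRNO_TABLE = [
    (PermissionError, errno.EACCES),    # access denied by the backend
    (FileNotFoundError, errno.ENOENT),  # object or prefix does not exist
]


def to_fuse_errno(exc: Exception) -> int:
    """Return the errno a FUSE handler could report for `exc` (hypothetical helper)."""
    for exc_type, code in _ERRNO_TABLE:
        if isinstance(exc, exc_type):
            return code
    # Anything unrecognized is reported as a generic I/O error.
    return errno.EIO


if __name__ == "__main__":
    print(to_fuse_errno(FileNotFoundError("missing")) == errno.ENOENT)
    print(to_fuse_errno(RuntimeError("boom")) == errno.EIO)
```

Falling back to EIO keeps unexpected backend failures visible to the caller instead of masking them as a missing file.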
64015cc to f36122b
f36122b to 9363f9c
9363f9c to 02a6a52
This PR migrates from custom HTTP fetching to obspec/obstore for standardized cloud storage access: HTTP(S), S3, GCS, Azure. It also implements a custom obspec store for FTP and a CachedStore wrapper that manages metadata caching and the two-tier caching of blocks.

There is no longer a separate mount point for each URL scheme. Instead, the URL scheme is given explicitly in the path and the request is dispatched to the appropriate storage client, e.g.:

simple-httpfs -f --log log.txt /tmp/cloud &
cat /tmp/cloud/https://example.com/README.txt...
cat /tmp/cloud/s3://example/README.txt...
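The scheme-based dispatch described above has to begin by recovering a URL from the path seen under the mount point. The following is a minimal sketch of that step only; `split_mount_path` is a hypothetical helper, not the PR's `path_to_url`, and it assumes the path may arrive with the `//` after the scheme collapsed to a single `/`, as path resolution usually does. It does not handle the trailing suffix shown in the examples above.

```python
def split_mount_path(path: str) -> tuple[str, str]:
    """Split a mount-relative path into (scheme, url). Hypothetical helper."""
    rel = path.lstrip("/")
    # Accept both "https://host/..." and the collapsed form "https:/host/...".
    scheme, sep, rest = rel.partition(":/")
    if not sep or not scheme:
        raise ValueError(f"no URL scheme found in path: {path!r}")
    url = f"{scheme}://{rest.lstrip('/')}"
    return scheme, url


if __name__ == "__main__":
    # The scheme would then select the storage client to dispatch to.
    print(split_mount_path("/https:/example.com/README.txt"))
    print(split_mount_path("/s3://example/README.txt"))
```

The returned scheme can serve as the lookup key for a per-backend store, while the reconstructed URL identifies the object within it.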
Configuring the storage clients and their request behavior is supported through the store_configs, credential_providers, client_options, and retry_config options. These are not yet exposed via the CLI, but can be configured with environment variables (see docs). The default behavior includes retry attempts. When no storage config is provided for an object storage backend, we set skip_signature to True to support public buckets.

Additional updates
FUSE interface
readdir operation to support ls.
noappledouble)
Modernization
Issues