Add Support for HTTP Headers in URL Fetch Requests with Secure Storage for Landing Requests #20924
base: dev
lib/galaxy/managers/landing.py
Outdated
    )
except Exception:
    log.warning("Failed to encrypt headers in landing request state", exc_info=True)
    pass  # Continue without encryption if vault fails
I would rather this fail outright than risk storing things that should be encrypted in an unencrypted fashion - especially given the rest of the app will assume the encryption has already occurred. Does this make testing harder or something?
The new version should cover this. Thank you!!
This approach has made the admin configuration trivial and deployment much easier as a result. The existing upload process is already... sort of exploitable... I mean we don't do a great job at rate limiting Galaxy (maybe this has improved?) and we let most URIs be accessed for users on behalf of the Galaxy server - it is scary from a security perspective. Allowing users to set arbitrary headers including (especially?) user-agent makes it an even richer target for hacking, I would suspect.

If we shipped an allow-list of headers, the URI patterns that allow each header, and whether the header should be secured, then I would be much more comfortable from a security perspective. It would be much harder to configure, but we would know exactly what the exploit surface is. Additionally, I trust the list of headers is relatively complete and well thought through, but again I would be more comfortable with an explicit allow-list because we would understand the exploit surface exactly.

I'm not a -1 on any of this though - I'm just expressing my concerns and telling you how I would have had it work - which people may think would be too much config. Still, one can imagine blends of the approaches - maybe it is off by default but there is a configuration that allows any requests like this, or a restricted set of requests, and admins can decide on their level of comfort. Even if people believe I'm being too cautious, or appropriately cautious but the admin/deployment burden of addressing it would be too steep - I would still strongly encourage that we don't allow the user agent to be overridden - if an API wants Galaxy to access it, it shouldn't require a non-Galaxy user agent.

What we pick to allow through makes me anxious - but after that - the actual implementation of allowing those headers and securing them seems really well thought through. It seems to fit with our existing APIs beautifully - that part of the implementation seems perfect to me.
Thank you @jmchilton! As always, great constructive feedback! I will add some configuration to make this functionality more explicitly controlled 👍
I've added the config option to specify URL patterns and sets of allowed headers for each pattern. This should be more explicit while maintaining flexibility for admins to allow general safe headers. I've also updated the PR description with the updates.
Introduces the ability to specify optional HTTP headers for URL-based data fetching. These headers are passed to the fetch logic to enhance flexibility in handling authenticated or customized requests.
Introduces functions to identify, encrypt, and decrypt sensitive HTTP headers securely using Galaxy's Vault system.
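A minimal sketch of what identifying, encrypting, and decrypting sensitive headers could look like. The placeholder format and the key layout `headers/{landing_uuid}/{header_name}` come from the PR description; everything else is an assumption for illustration: `InMemoryVault` is a hypothetical stand-in for Galaxy's vault, and the function names and the exact sensitive-header set are not taken from the PR's code.

```python
from typing import Dict

# Assumption: a small sensitive set based on the headers named in the sample config notes.
SENSITIVE_HEADERS = {"authorization", "x-auth-token", "x-api-key"}


class InMemoryVault:
    """Hypothetical stand-in for a vault: stores and returns string secrets."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def write_secret(self, key: str, value: str) -> None:
        self._secrets[key] = value

    def read_secret(self, key: str) -> str:
        return self._secrets[key]


def placeholder_for(name: str) -> str:
    # "X-API-Key" becomes "__VAULT_HEADER_X_API_KEY__"
    return f"__VAULT_HEADER_{name.upper().replace('-', '_')}__"


def vault_key(landing_uuid: str, name: str) -> str:
    # Key layout as described in the PR: headers/{landing_uuid}/{header_name}
    return f"headers/{landing_uuid}/{name}"


def encrypt_headers(headers: Dict[str, str], vault: InMemoryVault, landing_uuid: str) -> Dict[str, str]:
    """Move sensitive values into the vault and return a copy holding placeholders."""
    stored = {}
    for name, value in headers.items():
        if name.lower() in SENSITIVE_HEADERS:
            vault.write_secret(vault_key(landing_uuid, name), value)
            stored[name] = placeholder_for(name)
        else:
            stored[name] = value
    return stored


def decrypt_headers(stored: Dict[str, str], vault: InMemoryVault, landing_uuid: str) -> Dict[str, str]:
    """Replace placeholders with the real values read back from the vault."""
    resolved = {}
    for name, value in stored.items():
        if value == placeholder_for(name):
            resolved[name] = vault.read_secret(vault_key(landing_uuid, name))
        else:
            resolved[name] = value
    return resolved
```

Only the placeholder-bearing dictionary would be persisted with the landing request; the real values live in the vault until the fetch is performed.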
Refactors header encryption and decryption logic to remove tool-specific dependencies, enabling support for workflow landings.
Introduces a new integration test to verify the encryption of sensitive headers in workflow landing requests. Ensures that headers containing authorization tokens and API keys are securely encrypted using Galaxy's vault system and not stored in plain text in the database. Refactors helper methods to support both tool and workflow landing request models.
Introduces warning logs to capture encryption and decryption failures in the landing request state, providing better visibility into issues with header processing. This helps in diagnosing and addressing potential problems during runtime without halting execution.
Ensures that issues with the vault or encryption process are surfaced immediately.
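The fail-fast behavior requested in review can be sketched as follows. This is an illustration, not the PR's code: `HeaderEncryptionError` and `store_sensitive_header` are hypothetical names, and `write_secret` is any callable that persists a secret.

```python
from typing import Callable


class HeaderEncryptionError(Exception):
    """Hypothetical error raised when a sensitive header cannot be secured."""


def store_sensitive_header(write_secret: Callable[[str, str], None], key: str, value: str) -> str:
    """Write a secret and raise on failure instead of continuing unencrypted."""
    try:
        write_secret(key, value)
    except Exception as exc:
        # Do not fall back to plain-text storage; abort the request instead.
        raise HeaderEncryptionError(f"could not encrypt header at {key!r}") from exc
    return key
```

Raising here means the landing request is rejected rather than saved with a plain-text credential, which matches the assumption elsewhere in the app that encryption has already happened.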
Introduces a utility function to recursively check for sensitive headers in nested data structures, enhancing the ability to identify headers requiring encryption. Includes unit tests covering various cases such as nested headers, non-sensitive headers, and edge cases to ensure robustness.
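A recursive check of this kind might look like the sketch below. The function name, the convention that headers live under a `"headers"` key, and the sensitive set are assumptions for illustration rather than the PR's actual implementation.

```python
from typing import Any

# Assumption: illustrative sensitive set, compared case-insensitively.
SENSITIVE_HEADERS = {"authorization", "x-auth-token", "x-api-key"}


def has_sensitive_headers(data: Any) -> bool:
    """Return True if any nested "headers" mapping contains a sensitive header name."""
    if isinstance(data, dict):
        headers = data.get("headers")
        if isinstance(headers, dict) and any(
            isinstance(name, str) and name.lower() in SENSITIVE_HEADERS for name in headers
        ):
            return True
        # Keep searching every nested value, including non-sensitive "headers" dicts.
        return any(has_sensitive_headers(value) for value in data.values())
    if isinstance(data, (list, tuple)):
        return any(has_sensitive_headers(item) for item in data)
    return False
```

A caller could use the result to decide whether a request state needs the vault at all before attempting encryption.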
Introduces a new module to configure and manage allowed HTTP request headers for external URL fetches.
Ensures that when multiple URL patterns match a given URL, header permissions (allowance and sensitivity) are correctly consolidated.
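One way consolidation could work is sketched here under stated assumptions: a header is allowed if any matching pattern allows it, and it is treated as sensitive if any matching pattern marks it sensitive. The `HeaderRule` type is hypothetical, and `fnmatch.fnmatchcase` is used only as a simplified matcher (its `*` also matches `/`), so this is not the PR's matching logic.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List


@dataclass
class HeaderRule:
    """Hypothetical rule: a URL pattern plus header name -> is_sensitive."""

    url_pattern: str
    headers: Dict[str, bool]


def consolidate(url: str, rules: List[HeaderRule]) -> Dict[str, bool]:
    """Merge permissions from every rule whose pattern matches the URL."""
    merged: Dict[str, bool] = {}
    for rule in rules:
        if not fnmatchcase(url, rule.url_pattern):
            continue
        for name, sensitive in rule.headers.items():
            key = name.lower()  # header names are case-insensitive
            # Assumption: sensitivity is sticky, so one "sensitive" marking wins.
            merged[key] = merged.get(key, False) or sensitive
    return merged
```

Making sensitivity sticky errs on the side of encrypting a value whenever any applicable rule asks for it.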
Introduces a new sample configuration to define an allow-list for HTTP headers in external URL fetch requests. This mechanism allows administrators to specify which headers are permitted for different URL patterns, improving security and control over fetch requests. The configuration also supports marking headers as sensitive, prompting encryption of their values. The sample provides illustrative examples for common services like GitHub, AWS S3, and generic cloud storage.
Adds common authentication-related headers (Authorization, X-Auth-Token, X-API-Key) to the default sensitive list for HTTPS URLs in the sample configuration. This provides a more secure default example for users, preventing accidental exposure of sensitive credentials. Includes a new comment advising users to only employ the minimum necessary configuration for their specific needs, reinforcing security best practices.
Currently, we can only fetch data from public URLs without any authentication or custom headers.
This PR introduces support for HTTP headers in URL fetch requests for landing requests. Headers are controlled through pattern-based configuration, and sensitive headers are automatically encrypted using Galaxy's vault system before storing in the database.
🚀 Features
1. Pattern-Based URL Header Configuration
- Wildcard URL patterns (*, ?, **)
- Configured in config/url_headers_conf.yml
2. Automatic Sensitive Header Encryption
3. Secure Storage Architecture
- Placeholders in stored state (e.g. __VAULT_HEADER_AUTHORIZATION__)
- Vault key path: headers/{landing_uuid}/{header_name}
🔧 Configuration
Example Configuration File
How Pattern Matching Works
- Wildcards: * = any chars, ? = single char, ** = recursive
🔧 How It Works
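The wildcard semantics listed for pattern matching can be sketched as a small glob-to-regex translation. This is an assumption-laden illustration: it takes `*` and `?` to stay within one path segment (not crossing `/`) and `**` to match across segments, which is one common reading of "recursive"; the PR's real matcher may differ. The function names are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a URL glob into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # recursive: any characters, including "/"
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # assumption: any characters within one segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")    # exactly one character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(parts))


def url_matches(url: str, pattern: str) -> bool:
    """Return True only when the whole URL matches the pattern."""
    return glob_to_regex(pattern).fullmatch(url) is not None
```

Using `fullmatch` keeps a pattern from matching a URL that merely contains an allowed prefix, which matters when the pattern decides which headers may be sent.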
API Usage Examples
Creating a Data Landing Request with Headers
Creating a Workflow Landing Request with Headers
Under the Hood: Encryption Process
{ "headers": { "Authorization": "__VAULT_HEADER_AUTHORIZATION__", "X-API-Key": "__VAULT_HEADER_X_API_KEY__" } }
🔒 Security Features
Pattern-Based Access Control
Vault Configuration Required
This feature requires a configured Galaxy vault. See the vault documentation for setup instructions.
Fallback Behavior
✅ Testing
🎯 Use Cases
This enhancement enables several important use cases:
How to test the changes?