-
Notifications
You must be signed in to change notification settings - Fork 9
add Airflow 3 support #147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This will be the place for all Airflow 2 specific code.
These are being made accessible to Airflow's plugin entrypoints via astronomer_starship.plugins.
Any imports should happen via the compat module.
We jsonify the corresponding error inside the flask specific starship_route function.
We turn the None result into a no content response in `starship_route`.
The instance's DB session shall not be shared across requests.
We replace the following tools with Ruff and extend the selected rules: * black * blacken-docs * isort
We want to keep flask-limiter contrained, but we don't want to install it if it's not needed (i.e on Airflow 3.x).
This base class will be used in Airflow 2 and Airflow 3 compatibility layers and is Airflow version agnostic.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #147 +/- ##
===========================================
- Coverage 41.28% 18.70% -22.59%
===========================================
Files 16 21 +5
Lines 1153 1957 +804
===========================================
- Hits 476 366 -110
- Misses 677 1591 +914 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…BaseStarshipAirflow
The dev2 folder is for Airflow 2 development, while dev3 will be for Airflow 3 development.
It's faster and we don't need to setup python or just.
The following API endpoints are implemented: * health * telescope * airflow version * info * env_vars * pools * variables * connections
All needed methods in BaseStarshipAirflow are defined and have to be either replaced by a common implementation or be implemented in the corresponding subclass.
We know have two different StarshipApi implementations and can't easily maintain doc strings in either. By moving into the markdown file it's also easier to adhere to certain markdown formatting rules.
In Airflow 3 all dags are stored serialized, so the argument `store_serialized_dags` has been removed.
fritz-astronomer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally lgtm
Is hard to tell what's new-new vs copypaste in a different form.
Would be good if it could be more-DRY, but I can understand if that's not really possible, just because of fastapi vs flask.
* dev: AF3 starship UI * test: add tests and codecov
yeah, the part which is DRY has been moved to the common module. At the end we need two isolated code paths to avoid any conflicts with airflow 2 and 3 dependencies. i did move the API docs out of the StarshipApi class in order to DRY. :) |
For now we don't support migrating DAG bundles. The existing code is able to operate without this, while setting the dag version ID for task instance and dag runs to the latest version automatically. In future version we can consider adding full supported and documented API endpoints to migrate DAG bundles.
Since we added the UI to Airflow 3 too, we also need to build the frontend when we start or reload the dev3 instance.
m1racoli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akshaykumarsalunke this amazing work!
Just some things:
- I removed the update dag version ID endpoint as I believe it's currently not needed for the existing APIs to work. In the future we can add full support for migrating DAG bundle versions.
- I've left a commend about the re-generation if TI UUIDs as I am not sure it's really needed and it only will make adding support for additional Airflow 3 models much harder.
| # to avoid PK conflicts with source data | ||
| if "id" in item: | ||
| if table_name in ["task_instance", "task_instance_history"]: | ||
| item["id"] = str(uuid.uuid4()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akshaykumarsalunke I am wondering if we really need a new random UUID for task instances. 🤔
Since those IDs are already random in the source. Wouldn't re-generation just have the same likelihood of conflicts? Maybe even worse because the source IDs were already unique within their environment.
I'd like to have a really good reason to re-generate those IDs in the target, because this will make it much much harder to migrate task reschedules (less important), task instance notes and HITL detail if we decide to add support for those in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. Preserving the UUID will give us flexibility to introduce additional functionality in the future.
This PR provides the foundation for Airflow 3 support (#118) in Starship by restructuring the code base into Airflow version agnostic and Airflow version specific parts.
Structure
Airflow version specifics
StarshipAPIPluginand handles the translation from the HTTP framework toStarshipCompatabilityLayer. In particular the implementation of StarshipRoute (orstarship_route) handles framework specifics and translates framework agnostic responses (dicts, exceptions, None, ...) into framework specific response implementations.StarshipAirflowand it's minor-version specifc subclasses and makes them accessible viaStarshipCompatabilityLayer. Implementations of StarshipAirflow handle DB schema changes and other internals specific to the corresponding Airflow versions. HTTP framework specific imports/code should be avoided. If an implementation of a specific endpoint supports Airflow 2 and 3 then it should be implemented inBaseStarshipAirflow(for exampleget_poolsetc.). On the contrary*_attrsimplementations should only be done in subclasses to avoid the complexity of tracking schema changes across many generations of Airflow versions (i.e.StarshipAirflowin _af3 starts with a fresh initial definition of these)StarshipPluginresponsible for hosting the Starship UI.Note that existing import paths of
StarshipCompatabilityLayer,StarshipAPIPluginandStarshipPluginremain unchanged.Implemented features
The following features marked with ✅ are implement for Airflow 3 and anything else will be considered for future work.
Additional improvements