@m1racoli m1racoli commented Nov 7, 2025

This PR provides the foundation for Airflow 3 support (#118) in Starship by restructuring the code base into Airflow-version-agnostic and Airflow-version-specific parts.

Structure

  • common: responsible for implementing common functionality and primitives used across Airflow versions.
  • compat: handles version specific imports.
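As a rough illustration of what "handles version specific imports" could look like, here is a minimal sketch that maps an Airflow version string to one of the `_af2` / `_af3` sub-packages and imports only the matching one. The function names `backend_package` and `load_backend` are hypothetical and not taken from the PR; the actual compat module may work differently.

```python
from importlib import import_module


def backend_package(airflow_version: str) -> str:
    """Map an Airflow version string to a version-specific sub-package name.

    Hypothetical helper: only the `_af2` / `_af3` names come from the PR.
    """
    major = int(airflow_version.split(".", 1)[0])
    if major == 2:
        return "_af2"
    if major == 3:
        return "_af3"
    raise RuntimeError(f"Unsupported Airflow version: {airflow_version}")


def load_backend(airflow_version: str, root: str = "astronomer_starship"):
    """Import only the sub-package matching the running Airflow version.

    Importing lazily keeps Airflow 2 dependencies out of an Airflow 3
    process, and the other way around.
    """
    return import_module(f"{root}.{backend_package(airflow_version)}")
```

Keeping the dispatch in one place means the rest of the code never has to branch on the Airflow version itself.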

Airflow version specifics

  • starship_api: Implements StarshipAPIPlugin and handles the translation from the HTTP framework to StarshipCompatabilityLayer. In particular the implementation of StarshipRoute (or starship_route) handles framework specifics and translates framework agnostic responses (dicts, exceptions, None, ...) into framework specific response implementations.
  • starship_compatability: Implements StarshipAirflow and its minor-version-specific subclasses and makes them accessible via StarshipCompatabilityLayer. Implementations of StarshipAirflow handle DB schema changes and other internals specific to the corresponding Airflow versions. HTTP-framework-specific imports and code should be avoided here. If an implementation of a specific endpoint supports both Airflow 2 and 3, it should be implemented in BaseStarshipAirflow (for example get_pools). In contrast, *_attrs implementations should only be done in subclasses to avoid the complexity of tracking schema changes across many generations of Airflow versions (i.e. StarshipAirflow in _af3 starts with a fresh initial definition of these).
  • starship: Implements StarshipPlugin responsible for hosting the Starship UI.

Note that existing import paths of StarshipCompatabilityLayer, StarshipAPIPlugin and StarshipPlugin remain unchanged.
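The translation step described for starship_route can be sketched without any web framework. The example below is an assumption-laden illustration, not the PR's implementation: `ApiError`, `route`, `get_pool`, and `delete_pool` are hypothetical names, and a plain `(status, body)` tuple stands in for a Flask or FastAPI response object.

```python
import json
from functools import wraps
from http import HTTPStatus


class ApiError(Exception):
    """Hypothetical framework-agnostic error carrying an HTTP status code."""

    def __init__(self, message: str, status: int = HTTPStatus.BAD_REQUEST):
        super().__init__(message)
        self.status = int(status)


def route(handler):
    """Wrap a handler so its framework-agnostic result becomes (status, body)."""

    @wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            result = handler(*args, **kwargs)
        except ApiError as exc:
            # Errors are serialized to JSON in one place only.
            return exc.status, json.dumps({"error": str(exc)})
        if result is None:
            # A None result is turned into a "no content" response.
            return int(HTTPStatus.NO_CONTENT), ""
        return int(HTTPStatus.OK), json.dumps(result)

    return wrapper


@route
def get_pool(name: str):
    if name != "default_pool":
        raise ApiError(f"pool {name!r} not found", HTTPStatus.NOT_FOUND)
    return {"name": name, "slots": 128}


@route
def delete_pool(name: str):
    return None
```

With this shape, the handlers stay free of framework imports, and only the wrapper would need a Flask variant and a FastAPI variant.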

Implemented features

The features marked with ✅ below are implemented for Airflow 3; everything else will be considered for future work.

  • Starship API
    • health: ✅
    • telescope: ✅
    • airflow_version: ✅
    • info: ✅
    • env_vars: ✅
    • pools: ✅
    • variables: ✅
    • connections: ✅
    • dags: ❌
    • dag_runs: ❌
    • task_instances: ❌
    • task_instance_history: ❌
    • task_logs: ❌
    • xcom: ❌
  • Starship UI: ❌
  • Airflow 3 specific models (backfills, teams, asset triggers, ...): ❌

Additional improvements

  • use ruff for linting and formatting (remove black, pylint and isort)
  • use prek in the GitHub Action (easier to set up and faster to use)
  • update pre-commit hooks
  • improve package metadata
  • refine test configuration

This will be the place for all Airflow 2 specific code.
These are being made accessible to Airflow's plugin entrypoints via astronomer_starship.plugins.
Any imports should happen via the compat module.
We jsonify the corresponding error inside the Flask-specific starship_route function.
We turn the None result into a no content response in `starship_route`.
The instance's DB session shall not be shared across requests.
We replace the following tools with Ruff and extend the selected rules:

* black
* blacken-docs
* isort
We want to keep flask-limiter constrained, but we don't want to install
it if it's not needed (i.e. on Airflow 3.x).
This base class will be used in Airflow 2 and Airflow 3 compatibility layers
and is Airflow version agnostic.

codecov-commenter commented Nov 7, 2025

Codecov Report

❌ Patch coverage is 18.57242% with 1175 lines in your changes missing coverage. Please review.
✅ Project coverage is 18.70%. Comparing base (de68415) to head (e869a13).
⚠️ Report is 133 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| astronomer_starship/_af2/starship_compatability.py | 0.00% | 427 Missing ⚠️ |
| astronomer_starship/_af3/starship_compatability.py | 18.67% | 209 Missing ⚠️ |
| astronomer_starship/common.py | 33.19% | 157 Missing ⚠️ |
| astronomer_starship/_af2/starship_api.py | 0.00% | 135 Missing ⚠️ |
| tests/docker_test/docker_test.py | 14.92% | 57 Missing ⚠️ |
| astronomer_starship/_af3/starship_api.py | 57.02% | 52 Missing ⚠️ |
| astronomer_starship/_af2/starship.py | 0.00% | 44 Missing ⚠️ |
| astronomer_starship/_af3/starship.py | 36.92% | 41 Missing ⚠️ |
| ..._starship/providers/starship/operators/starship.py | 0.00% | 17 Missing ⚠️ |
| tests/api_integration_test.py | 20.00% | 12 Missing ⚠️ |

... and 9 more
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #147       +/-   ##
===========================================
- Coverage   41.28%   18.70%   -22.59%     
===========================================
  Files          16       21        +5     
  Lines        1153     1957      +804     
===========================================
- Hits          476      366      -110     
- Misses        677     1591      +914     


The dev2 folder is for Airflow 2 development, while dev3 will be for Airflow 3 development.
It's faster and we don't need to set up Python or just.
The following API endpoints are implemented:

* health
* telescope
* airflow_version
* info
* env_vars
* pools
* variables
* connections
All needed methods in BaseStarshipAirflow are defined and have to be
either replaced by a common implementation or be implemented in the
corresponding subclass.
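A minimal sketch of that split, assuming a hypothetical `pool_attrs` method as one of the `*_attrs` definitions and a hypothetical `fetch_pool_rows` stand-in for the real DB query: the base class provides the shared `get_pools` logic, while anything schema-specific raises `NotImplementedError` until a subclass supplies it. The class bodies below are illustrative and not copied from the PR.

```python
class BaseStarshipAirflow:
    """Illustrative base class holding only version-agnostic logic."""

    def pool_attrs(self) -> dict:
        # Schema-specific: intentionally left to version-specific subclasses.
        raise NotImplementedError("pool_attrs must be defined per Airflow version")

    def fetch_pool_rows(self) -> list:
        # Hypothetical stand-in for a DB query; real code would use a session.
        raise NotImplementedError("fetch_pool_rows must be defined per Airflow version")

    def get_pools(self) -> list:
        # Common implementation: works for any version that defines the
        # two methods above.
        attrs = self.pool_attrs()
        return [{key: row[key] for key in attrs} for row in self.fetch_pool_rows()]


class StarshipAirflow3(BaseStarshipAirflow):
    """Illustrative Airflow 3 subclass starting from a fresh attrs definition."""

    def pool_attrs(self) -> dict:
        return {"name": str, "slots": int}

    def fetch_pool_rows(self) -> list:
        return [{"name": "default_pool", "slots": 128, "internal": "ignored"}]
```

Calling `get_pools()` on the subclass returns only the attributes it declares, while calling it on the bare base class fails loudly instead of silently returning nothing.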
We now have two different StarshipApi implementations and can't easily
maintain docstrings in either. Moving the docs into the Markdown file
also makes it easier to adhere to certain Markdown formatting rules.
In Airflow 3 all dags are stored serialized, so the argument
`store_serialized_dags` has been removed.
@m1racoli m1racoli marked this pull request as ready for review November 20, 2025 02:22

@fritz-astronomer fritz-astronomer left a comment

generally lgtm
It's hard to tell what's new-new vs. copy-paste in a different form.
Would be good if it could be more-DRY, but I can understand if that's not really possible, just because of fastapi vs flask.

m1racoli commented Dec 5, 2025

> generally lgtm
> Is hard to tell what's new-new vs copypaste in a different form.
> Would be good if it could be more-DRY, but I can understand if that's not really possible, just because of fastapi vs flask.

Yeah, the part that is DRY has been moved to the common module. In the end we need two isolated code paths to avoid any conflicts between Airflow 2 and Airflow 3 dependencies.

I did move the API docs out of the StarshipApi class in order to DRY things up. :)

For now we don't support migrating DAG bundles. The existing code
is able to operate without this, while automatically setting the DAG
version ID for task instances and DAG runs to the latest version.

In a future version we can consider adding fully supported and documented
API endpoints to migrate DAG bundles.
Since we added the UI to Airflow 3 too, we also need to build the frontend
when we start or reload the dev3 instance.

@m1racoli m1racoli left a comment

@akshaykumarsalunke this is amazing work!

Just some things:

  • I removed the endpoint for updating the DAG version ID, as I believe it's currently not needed for the existing APIs to work. In the future we can add full support for migrating DAG bundle versions.
  • I've left a comment about the regeneration of TI UUIDs, as I am not sure it's really needed and it will only make adding support for additional Airflow 3 models much harder.

# to avoid PK conflicts with source data
if "id" in item:
    if table_name in ["task_instance", "task_instance_history"]:
        item["id"] = str(uuid.uuid4())

@akshaykumarsalunke I am wondering if we really need a new random UUID for task instances. 🤔

Since those IDs are already random in the source, wouldn't regeneration just have the same likelihood of conflicts? Maybe even a worse one, because the source IDs were already unique within their environment.

I'd like to have a really good reason to re-generate those IDs in the target, because this will make it much much harder to migrate task reschedules (less important), task instance notes and HITL detail if we decide to add support for those in the future.
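To make the concern concrete, here is a small self-contained sketch. The row shapes (`id`, `task_id`, `ti_id`, `content`) and the helper names are hypothetical and only loosely modeled on the snippet above; the sketch just shows that rows referencing a task instance by ID stay valid when the ID is preserved and dangle when it is regenerated.

```python
import uuid

# Hypothetical source data: one task instance and one note pointing at it.
source_task_instances = [{"id": str(uuid.uuid4()), "task_id": "extract"}]
source_notes = [{"ti_id": source_task_instances[0]["id"], "content": "retry manually"}]


def migrate(rows: list, regenerate_ids: bool) -> list:
    """Copy rows to the target, optionally assigning fresh random IDs."""
    migrated = []
    for row in rows:
        item = dict(row)
        if regenerate_ids:
            item["id"] = str(uuid.uuid4())
        migrated.append(item)
    return migrated


def dangling_notes(task_instances: list, notes: list) -> list:
    """Return notes whose ti_id no longer matches any migrated task instance."""
    known_ids = {ti["id"] for ti in task_instances}
    return [note for note in notes if note["ti_id"] not in known_ids]


preserved = migrate(source_task_instances, regenerate_ids=False)
regenerated = migrate(source_task_instances, regenerate_ids=True)
```

With preserved IDs, `dangling_notes(preserved, source_notes)` is empty. With regenerated IDs the note no longer resolves, so migrating dependent tables would need an extra old-to-new ID mapping.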


@akshaykumarsalunke akshaykumarsalunke Dec 8, 2025

Yes, I agree. Preserving the UUID will give us flexibility to introduce additional functionality in the future.
