Dagster's dbt integration makes it easy to represent dbt models as Dagster assets. Once the models are represented in Dagster, some concepts need translating. How do dbt schemas, model definitions, sources, tests, seeds, freshness, and exposures map to Dagster concepts? When and how would I use dbt freshness vs. Dagster's freshness policies?
core dbt concepts

dbt model
A model in a dbt project maps to an asset in a Dagster project. A table in a data warehouse is an example of an asset, and since dbt models create tables and views, Dagster maps your dbt models to assets. The dbt model's code, the path to the table or view it creates, and other metadata about the model are shown in the asset's description in the Dagster UI. This mapping is the basis of the dagster-dbt integration.

dbt run
Materializing a Dagster asset that corresponds to a dbt model usually means invoking dbt run.

dbt incremental materializations
Dagster uses partitions to incrementally add or upsert records into tables. Rather than tying the incremental update to when the run happens, partitions let you define how the data is split and run individual slices. For example, data can be updated by a partition of time (day, week, month, etc.) or of a dimension (state, region, store, experiment). This translation is more involved. See the following links for tips and implementation details: #7683, #14477, or dagster-io/hooli-data-eng-pipelines#21

dbt schema
"Schema" in dbt can mean multiple things. Users typically develop dbt projects in their own development schema/environment. In Dagster, you configure which schema you write to with your database's Resource, much like you would configure your […]. If you'd like to see the path (i.e. which database and schema) a dbt model materializes into, you can find it in the asset metadata in Dagster's UI.

dbt test
dbt test does not have a direct counterpart in Dagster. However, many teams use dbt run and dbt test together, so that a job fails if a dbt test does not pass. This is done in Dagster by having the Dagster asset materialization command invoke […].

dbt freshness
dbt freshness does not have a direct counterpart in Dagster. The main idea of dbt freshness is to help dbt, which only handles SQL transformations, know when upstream source tables are updated.
In Dagster, the upstream source tables are modeled directly as Dagster assets (or observable source assets). This allows Dagster to show and control the full lineage between source systems and dbt transformations. So instead of using dbt freshness to monitor source data for updates, Dagster can launch dbt runs immediately after source data is updated.
Don't confuse Dagster freshness policies with dbt freshness. Dagster freshness policies are declarative: they specify when assets (including dbt models) should be run to meet stakeholder expectations, whereas dbt freshness only observes when data has been updated.

dbt exposures
Similar to freshness, dbt exposures do not have a direct counterpart in Dagster. Exposures help document which systems downstream of dbt depend on dbt-managed warehouse transformations. In Dagster, these downstream systems (BI dashboards, ML models, etc.) can be modeled directly as Dagster assets. This allows Dagster to show the full lineage of data, and because Dagster is not just a transformation tool, downstream assets can actually be acted upon, not just documented. For example, a Tableau extract can be triggered after the dbt model it's powered by is materialized.

To summarize, because Dagster can show and manage data assets both upstream and downstream of dbt, there is no direct need for, or translation of, dbt freshness or exposures. For details on these common mappings and additional customization options, see: #14477

Examples from https://github.com/dagster-io/hooli-data-eng-pipelines illustrate three points: Dagster assets cover lineage both upstream and downstream of dbt; Dagster dbt assets include the model in the description and the schema as table metadata; and materializing a Dagster asset corresponds to a dbt run command.
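To make the incremental-materialization mapping above a bit more concrete, here is a minimal sketch of how one daily partition could be turned into a dbt invocation. It is not taken from the linked discussions and only uses the Python standard library. The helper names (`daily_partition_window`, `dbt_build_args`), the var names `min_date` / `max_date`, and the model name `orders` are all hypothetical. It also assumes the `dbt build` command, which runs and tests the selected models in one invocation, together with dbt's `--select` and `--vars` flags.

```python
import json
from datetime import date, timedelta


def daily_partition_window(partition_key: str) -> tuple[str, str]:
    """Return the [start, end) dates covered by a daily partition key like '2024-01-15'."""
    start = date.fromisoformat(partition_key)
    end = start + timedelta(days=1)
    return start.isoformat(), end.isoformat()


def dbt_build_args(partition_key: str, select: str) -> list[str]:
    """Build dbt CLI arguments that scope a run to a single daily partition."""
    min_date, max_date = daily_partition_window(partition_key)
    # Hypothetical var names: the incremental model's SQL would need to read
    # them via var('min_date') and var('max_date') to filter its input rows.
    dbt_vars = {"min_date": min_date, "max_date": max_date}
    # --vars accepts a YAML string, and JSON is valid YAML.
    return ["build", "--select", select, "--vars", json.dumps(dbt_vars)]


if __name__ == "__main__":
    # "orders" is a placeholder model name.
    print(dbt_build_args("2024-01-15", "orders"))
```

In a real project, the partition key would come from the Dagster run rather than a literal string, and the resulting arguments would be handed to whatever invokes dbt. The point of the sketch is only that the slice of data is chosen by the partition, not by the time the run happens.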
@slopp I just came across this topic. Maybe the answer should be updated to indicate that dbt tests are now interpreted as Asset Checks, as per: https://dagster.io/blog/dagster-asset-checks#:~:text=any%20dbt%20test%20can%20be%20understood%20as%20an%20Asset%20Check.