Major changes to dg and Components leading up to RC in early July
#30285
Replies: 1 comment
Heads-up for `dg` early adopters upgrading to Dagster ≥ 1.10.18: the `dg` CLI is now installed in each project's Python environment instead of as a global tool. Migration steps:
Then, the mismatch warning disappears. Full details in the changelog → https://github.com/dagster-io/dagster/blob/master/CHANGES.md#11018-core--02618-libraries
About two months ago we previewed `dg` and Components in a GitHub discussion, followed by a webinar last month. If you are new to the system and want to learn more, we have docs in Labs for both `dg` and Components. We have been making great progress in concert with our early partners and are ready to move on to the next phase of the project: Release Candidate (RC).
`dg` changes leading up to RC
Starting this week, we are pushing major changes to the `dg` CLI interface and project layout over the next two releases (`1.10.18` and `1.10.19`). This is in preparation for our reclassification of `dg` and Components as RC at the beginning of July. At that point, `dg` and Components will be integrated into our mainline documentation, used in our tutorials, and recommended for production use.
From here on out, we are focusing on stabilization, docs, and building Components-style integrations for the technologies that will most benefit from them, without any more major changes in vocabulary or project layout.
We're confident, based on our own and our partners' experiences, that this is going to be a broadly used set of features, increasingly at the core of the Dagster experience. It makes Dagster easier to use and more powerful at the same time, and it is opt-in, additive, and incrementally adoptable. In fact, we've migrated our entire internal data platform to this new stack, in place and incrementally, and we're thrilled with the process and results.
We thank all the early adopters and appreciate your patience in dealing with the changes and bugs that are inevitable in a preview. Your participation is an invaluable part of the process; thank you for being a part of it. Once the RC lands, there will be a step-function improvement in stability.
You do not need to immediately migrate to these changes, and we have strived to maintain backwards compatibility for our existing partners. However, we will remove support for some of these APIs for the RC.
Structural changes in the `dg` CLI architecture
`dg` is a traditional Python CLI installed on a per-project basis
In our preview release, `dg` was a tool installable using the `uv tool` system (and later `brew`, `curl`, and so forth). This was a constant source of confusion and introduced version skew issues. Instead, `dg` is now installed on a per-virtual-environment basis, more in line with traditional Python CLIs.
`create-dagster` for project and workspace scaffolding
However, we still want the ability to "bootstrap" a project without a pre-existing virtual environment. For this we have a new CLI, `create-dagster`, invocable via `pipx` or `uvx`, and installable via `brew`, `curl`, and as a `uv` tool. It is responsible for creating a new Dagster project on your local machine and provides a happy path for users who do not have an opinionated virtual environment toolchain.
`dg` is package-manager-neutral
While we default to `uv` in `create-dagster` in our documentation, nothing else ties the system to `uv`. Because `dg` is installed in your local environment, you can use any Python environment management system, whether that be `pip`, `poetry`, or others.
CLI ontology
Namespaced scaffold commands
There is no longer a global namespace of scaffolds that spans projects, definitions, and components. Instead, we are namespacing this command, which we believe is more intuitive and organized. There will be `dg scaffold defs` and `dg scaffold component` to start. `dg scaffold project` has been moved to the new `create-dagster` CLI.
`dg scaffold defs`
One of the new namespaces is `defs`, which means that you are scaffolding a set of definitions at a specific path in the `defs` hierarchy. This is in line with the move from `component.yaml` to `defs.yaml` detailed below.
No more "plugins"
Relatedly, we are almost entirely hiding the term "plugin" as an external-facing noun. While it continues to live on in our internal architecture (meaning "a Python package introspectable by `dg`"), as a user you should not encounter the term outside of diagnostics.
Project Layout Changes
No more `lib`
Previously we had a `lib` folder meant for autodiscovery of components and scaffolding, which also required the inclusion of a `.gitignore` file. The purpose of this folder was unclear to users, and it implied that we required all code outside of `defs` to reside there, which is not the case.
We still want a default auto-discovery location for scaffolded component classes (via `dg scaffold component`), so there is a `components` directory that components are scaffolded into and discoverable by the `dg components list` command.
`definitions.py` no longer required at project root
Prior to this release, a `definitions.py` file at the root of the project was required. This is no longer true, and we support "autoloading" the `defs` from within the framework. This cleans up the project layout considerably.
No more special handling of `component.py` or `definitions.py` in the `defs` hierarchy
We previously hardcoded special behavior for when our autoloader discovered `definitions.py` and `component.py` in the `defs` folder. For new projects this is no longer true. We have preserved the behavior for existing projects that called `load_defs` from the root `definitions.py` to avoid thrashing existing users.
Components changes
All component classes are imported at top-level `dagster` rather than `dagster.components`
Up until now our component classes (`Component`, `ComponentLoadContext`, etc.) were in the `dagster.components` subpackage. They are now exported at the top level of `dagster`.
`component.yaml` → `defs.yaml`
We have changed the name of the file that declares a component from `component.yaml` to `defs.yaml`. We will support backwards compatibility for a few weeks, but will eliminate it once we push out the RC.
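The two changes above (top-level imports and the file rename) are mechanical, so they can be scripted. Below is a minimal sketch of a one-off migration helper that uses only the Python standard library. It is not part of `dg`: the function names `rewrite_component_imports` and `rename_component_yaml` are hypothetical, and it assumes that every name you import from `dagster.components` is re-exported from top-level `dagster`, which you should verify against your installed version. The legacy file name is a parameter in case your project uses a different spelling.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches only `from dagster.components import ...`; deeper submodules such as
# `dagster.components.something` are deliberately left untouched.
_OLD_IMPORT = re.compile(
    r"^([ \t]*)from[ \t]+dagster\.components[ \t]+import\b", re.MULTILINE
)


def rewrite_component_imports(source: str) -> str:
    """Return `source` with component imports pointed at top-level `dagster`."""
    return _OLD_IMPORT.sub(r"\1from dagster import", source)


def rename_component_yaml(
    defs_root: Path,
    legacy_name: str = "component.yaml",  # assumption: adjust to your project
    new_name: str = "defs.yaml",
    dry_run: bool = True,
) -> list[tuple[Path, Path]]:
    """Plan (and optionally perform) the rename of legacy component files.

    Returns the list of (old_path, new_path) pairs. Existing `new_name`
    files are never overwritten; those directories are skipped.
    """
    planned: list[tuple[Path, Path]] = []
    # sorted() materializes the matches before any file is renamed.
    for old_path in sorted(Path(defs_root).rglob(legacy_name)):
        new_path = old_path.with_name(new_name)
        if new_path.exists():
            continue
        planned.append((old_path, new_path))
        if not dry_run:
            old_path.rename(new_path)
    return planned
```

Running `rename_component_yaml(root)` with the default `dry_run=True` only reports what would change, so you can review the plan before renaming anything.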
`defs.yaml` supports `---` for multiple instances
In our own usage and that of our partners, it was common to want to instantiate multiple instances of components in a single YAML file. We now support this case with the `---` sigil, similar to other familiar tools such as Kubernetes.
First-class support for "inline components"
We found that it was common to build custom components that were only used in a single folder in the `defs` hierarchy, and it felt disorganized to force this component into the global `lib` directory. As a result, we are more formally supporting the notion of "inline components" in our scaffolding, where the Python component class lives directly alongside the definitions that are created by it.
`dg scaffold defs inline-component --typename ComponentName path/to/defs`
Components Templating Frontend
More flexible mechanism for injecting variables into templates
Components comes with a YAML templating system (internally we call it `Resolved`) that allows for the injection of Python objects on a per-field basis. We’ve found that a killer feature of the system is the ability to inject functions as template variables and invoke them. (We colloquially refer to these as UDFs.) Until now, the only way to do that was on a per-component-type basis, and the only way to add additional variables and UDFs was through entirely new custom components.
We now allow a user to mix in new variables usable in a template via a new `@template_var` decorator that can live in the `defs` hierarchy alongside where they are used.
Injectable by default
We have now made template fields “injectable” by default, meaning that you can use UDFs and variables at any spot in the templating hierarchy. Previously one needed to “opt into” this capability at the per-field level, which was overly constraining and untenable when it came to constructing reusable libraries of schema.
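To make the idea of injecting functions as template variables concrete, here is a toy sketch that uses only the standard library. It is not Dagster's `Resolved` implementation and does not use the real `@template_var` decorator: `resolve_field`, `resolve_tree`, `table_name`, and the `{{ ... }}` handling are all hypothetical stand-ins. It shows a scope that holds both plain variables and a function, with resolution applied at every level of a nested structure rather than only on opted-in fields.

```python
import re
from collections.abc import Mapping
from typing import Any

# One `{{ expression }}` placeholder; braces inside the expression are not supported.
_EXPR = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def resolve_field(value: str, scope: Mapping) -> Any:
    """Resolve one templated string against `scope`.

    A field that is exactly one placeholder yields the Python object itself;
    otherwise placeholders are interpolated into the surrounding string.
    """

    def evaluate(expr: str) -> Any:
        # eval keeps the sketch short; never use it on untrusted input.
        return eval(expr, {"__builtins__": {}}, dict(scope))

    whole = _EXPR.fullmatch(value.strip())
    if whole:
        return evaluate(whole.group(1))
    return _EXPR.sub(lambda m: str(evaluate(m.group(1))), value)


def resolve_tree(node: Any, scope: Mapping) -> Any:
    """Resolve every string at every depth ("injectable by default")."""
    if isinstance(node, str):
        return resolve_field(node, scope)
    if isinstance(node, Mapping):
        return {key: resolve_tree(val, scope) for key, val in node.items()}
    if isinstance(node, list):
        return [resolve_tree(item, scope) for item in node]
    return node


def table_name(suffix: str) -> str:
    """Hypothetical user-defined function exposed to templates."""
    return f"analytics_{suffix}"


scope = {"env": "prod", "max_rows": 100, "table_name": table_name}
attributes = {
    "target": "{{ table_name('orders') }}",
    "options": {"limit": "{{ max_rows }}", "paths": ["s3://bucket-{{ env }}/raw"]},
}
resolved = resolve_tree(attributes, scope)
```

Here `resolved["target"]` comes from calling the injected function, `resolved["options"]["limit"]` stays an integer rather than a string, and the nested list entry is interpolated without any per-field opt-in.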
Conclusion
We are releasing these changes as part of `1.10.18`, to be released on Thursday.