Releases: datahub-project/datahub

v0.8.18

10 Dec 19:45
d651040

DataHub Release 0.8.18 is here!

Release Highlights

  1. Metadata Service Authentication: Make authenticated requests to the Metadata Service APIs (GraphQL + Rest.li)

    1. Video Demo
    2. Technical Deep Dive
  2. Redshift Lineage: Out-of-the-box support for ingesting Dataset->Dataset lineage from Redshift system tables. Includes Tables, Views, and COPY from S3

    1. Video Demo
  3. Apache NiFi Connector (Beta): Integration with Apache NiFi to extract DataJobs and DataFlows! Read the source docs here. This source is currently incubating in beta.

  4. Mode Connector (Beta): Integration with Mode Analytics to extract reports, charts, and more! Read the source docs here. This source is currently incubating in beta.

  5. Add Aspects without a fork: This is a major milestone towards No-Code UI

    1. Watch the No Code UI Sneak Peek
  6. Glossary Term Transformer: Allows users to add tags or glossary terms to entities based on a regex match filter (Shoutout to Community Member ecooklin!)

  7. Bug Fixes:

    1. [metadata service] Fix empty search query failing to resolve
    2. [metadata service] Log4j vulnerability addressed! We highly recommend upgrading to the latest release.
    3. [metadata ingestion] [bigquery] Fix handling of partitioned & snapshotted tables for lineage usage, and basic table indexing.
    4. [metadata-service] [recommendations] Fix issue where recently viewed and most popular recommendations were not showing up when user urn contains special chars.
    5. [metadata ingestion] Add config to specify ca certificate path for datahub-rest sink
    6. [metadata ingestion][snowflake] Handling for special characters in snowflake databases and schemas.
    7. [ui] Fix Groups page not showing asset ownership correctly
    8. [ui] Fix issue where markdown links were not clickable.
    9. [metadata service] Improve search & recommendations performance by ~50%, homepage load by ~50%.
    10. [cli] Fix delete-by-search not accepting an auth token
    11. [metadata service][policies] Fix invalid Tag creation policy
    12. [metadata service][upgrade] Fix Spring injection of Entity Client inside datahub-upgrade
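
The regex-based matching behind the Glossary Term Transformer (item 6 above) can be pictured with a small standalone sketch. This is not the transformer's actual API or configuration: the `RULES` mapping, the term URNs, and the `terms_for` helper are hypothetical names chosen for illustration, and matching against a plain dataset name is an assumption.

```python
import re
from typing import Dict, List

# Hypothetical rules: regex pattern -> glossary term URNs to attach.
RULES: Dict[str, List[str]] = {
    r".*\.pii\..*": ["urn:li:glossaryTerm:Sensitive"],
    r".*_email$": ["urn:li:glossaryTerm:Email"],
}


def terms_for(dataset_name: str, rules: Dict[str, List[str]] = RULES) -> List[str]:
    """Collect the terms of every rule whose pattern matches the whole name."""
    matched: List[str] = []
    for pattern, terms in rules.items():
        if re.fullmatch(pattern, dataset_name):
            # Keep rule order and avoid attaching the same term twice.
            matched.extend(t for t in terms if t not in matched)
    return matched
```

A name such as `warehouse.pii.user_email` matches both rules and receives both terms, while `warehouse.public.orders` matches neither and receives none.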

Backwards Incompatible Changes

  • The standalone Spring GraphQL Service has been removed. (Replaced in full by Metadata Service GraphQL API)
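
Clients that move to the Metadata Service GraphQL API, with Metadata Service Authentication enabled, need to attach a token to each request. The sketch below only builds such a request and does not send it. The `/api/graphql` path, the `Bearer` authorization scheme, and the `build_graphql_request` helper are assumptions for illustration; check the Technical Deep Dive for the exact details of your deployment.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GraphQL POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/graphql",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # assumed token scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the headers easy to inspect first.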

v0.8.17

19 Nov 07:58
f1045f8

Notable Changes

  • Added Recommendations and redesigned the home page!
    • Modular way to add recommendations throughout the application
    • Recommendation modules for top platforms, recently viewed, popular entities, top tags/terms were added to home page
    • Search page also has top tags/terms module on the bottom
  • Ingestion Sources
    • DBT enhancements
      • Create dbt platform entities to capture dbt node types such as models, tests, sources, and seeds, and link dbt entities with other dbt or underlying platform entities.
    • OpenAPI specs
    • Kafka Connect (Regex based transformers, BigQuery sink)
    • Trino Usage (Starburst)
  • Improved lineage viz performance and lineage viz UX
    • Improved layout logic
    • Nodes can be dragged and dropped
  • Fixes for the delete API not always deleting all of an entity's data
  • Improved documentation for adding a custom Metadata Ingestion Source
    • Fixes description rendering for Charts, Dashboards, Flows, Jobs
  • Add YAML configuration file for Metadata Service
  • Filter search results by Sub-Type (Looker Explore, View, etc)
  • Support proxying DataHub Frontend requests to Metadata Service at /api/gms
  • Multi-platform (x86, arm64) support for Docker images (Apple M1 support)
  • Graph Service: DGraph support (phase 1)

DataHub v0.8.16

21 Oct 21:00
dd8c592

Release Highlights

  • Important bug fixes: properties for DataJob and DataFlow, and descriptions for Datasets, should now show correctly in the UI
  • Search redesign! Single search experience across all entity types with left filter bar
  • Added searchAcrossEntities endpoint on both GraphQL and Rest.li that pulls search results for all entity types and mixes them together
  • Dataset-level lineage: Added support for ingesting dataset-level lineage for BigQuery, and for linking external tables in Redshift to the corresponding table in the external data catalog.
  • Performance optimization: GraphQL now calls the entity service directly, instead of calling the entity resource over HTTP, to hydrate GraphQL models.
  • The “filter” input model used by the “search” API now supports disjunctive normal form (an OR of ANDs). The previous filter model (a criteria array) should continue to work as expected.
  • Added foundations (models) for search insights, i.e. highlights shown in the search result previews.
  • Improved the Add Owner experience: full-text search is now used to find users and groups.
  • User & Group Management Screens!
    • View all users (and those who have logged in)
    • View all groups
    • Create new groups
    • Add and remove group members
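
The disjunctive normal form supported by the search filter can be read as "match if any one AND-group matches in full". The sketch below evaluates such a filter against a single document. The dictionary keys (`or`, `and`, `criteria`, `field`, `value`) are assumptions modelled on the wording of the release note, and exact string equality is used as the only condition for simplicity.

```python
from typing import Any, Dict

Criterion = Dict[str, str]  # assumed shape: {"field": ..., "value": ...}


def matches(doc: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Evaluate a filter against one document.

    New form: {"or": [{"and": [criterion, ...]}, ...]}  (OR of ANDs)
    Old form: {"criteria": [criterion, ...]}            (a single AND group)
    """

    def holds(c: Criterion) -> bool:
        return str(doc.get(c["field"])) == c["value"]

    if flt.get("or"):
        # True as soon as one AND-group is fully satisfied.
        return any(all(holds(c) for c in clause["and"]) for clause in flt["or"])
    # Legacy criteria array: every criterion must hold.
    return all(holds(c) for c in flt.get("criteria", []))
```

Treating the legacy criteria array as one AND-group is what lets both shapes share the same evaluation logic in this sketch.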

Breaking Changes

None

DataHub v0.8.15

29 Sep 19:35
268d112
Pre-release

Notable Changes

  • Support the “NONE” Client Authentication Method for OIDC login.
  • Migrated to the new UI for Charts, Dashboards, Data Flows (Pipelines), Data Jobs (Tasks) profile pages
  • Primary and Foreign Keys rendered in the UI
  • Ingestion
    • Support for redshift-usage source
    • Fixes for looker ingestion
    • datahub cli supports -f/--force option to skip confirmations

Changelog

DataHub v0.8.14

17 Sep 17:51
97bed71

Release Highlights

  • Small bug fixes over 0.8.13

Notable Changes

  • Fix bug in OIDC config for setting response type
  • Add WAU chart in the analytics page
  • Starting with acryl_datahub==0.8.13.1 (PyPI), Looker and LookML ingestion name views differently than before. To start with a clean slate, delete the old LookML metadata; to keep the old behavior, specify view_naming_pattern = "{name}" in both your Looker and LookML ingestion recipes.
  • Populate the user email field in usage statistics to correctly show top users on the entity page
  • Full changelog below

Changelog

DataHub v0.8.13

16 Sep 00:48
f665ffc

Release Highlights

  • Support for aggregated statistics with respect to the timeseries aspect. Moved usage stats functionality to use the new framework.
  • Auto-ingest common data platforms on GMS boot! No more generic logos.
  • Fixes re-ingestion of modified policies at startup
  • Full changelog below

Breaking Changes

  • The usage stats endpoint now uses the time-series aspect index in Elastic, so statistics ingested previously will be lost. Please re-run usage ingestion (e.g. bigquery-usage, snowflake-usage) to backfill your usage statistics history.

Changelog

DataHub v0.8.12

09 Sep 05:16
940cbb1

Release Highlights

  • RBAC Phase 1: Added abilities to control access through policies in the UI and backend
  • Dataset page refresh!!! + improved home page, search and browse screens
  • Added the ability to monitor DataHub through Prometheus and provided example Grafana dashboards
  • GraphQL API browser hosted on /api/graphql endpoint.
  • Support for Business Glossary ingestion through yml file
  • Support for Azure AD ingestion source

Notable Changes

  • Fixed unicode rendering bug introduced in v0.8.11
  • Added the ability to search by properties in the customProperties bag: supports case-insensitive matches of the form ‘key=value’
    • For instance, query “encoding=utf-8” will return entities with “encoding”: “utf-8” in the property bag
  • Full changelog below
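
The ‘key=value’ search over the customProperties bag can be pictured as follows. This is a minimal sketch, not the search implementation: the `search_by_property` helper and the in-memory `entities` mapping are hypothetical, and comparing both the key and the value case-insensitively is an assumption.

```python
from typing import Dict, List, Tuple


def parse_query(query: str) -> Tuple[str, str]:
    """Split 'key=value' into a lower-cased (key, value) pair."""
    key, sep, value = query.partition("=")
    if not sep or not key.strip() or not value.strip():
        raise ValueError(f"expected a query of the form key=value, got {query!r}")
    return key.strip().lower(), value.strip().lower()


def search_by_property(entities: Dict[str, Dict[str, str]], query: str) -> List[str]:
    """Return names of entities whose property bag contains the queried pair."""
    key, value = parse_query(query)
    return [
        name
        for name, props in entities.items()
        if any(k.lower() == key and v.lower() == value for k, v in props.items())
    ]
```

With this sketch, the query `encoding=utf-8` matches a property bag holding `"encoding": "utf-8"` as well as one holding `"Encoding": "UTF-8"`.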

Changelog

DataHub v0.8.11

25 Aug 05:35
d1b5792

Release Highlights

  • Business Glossary: Phase 1 is feature complete. Full support for UI viewing and API-based edits; UI edits are not supported.
  • Users and Groups: Just-in-time User and Group provisioning on login (SSO/OIDC), basic Group pages with membership information
  • New Integrations: Redash

Notable Changes

  • GraphQL and REST APIs are now both served by datahub-metadata-service (new name for gms). Frontend is now a proxy. Container names are not changed.
  • Kafka source will no longer tokenize on . in the topic name. This will result in a flat browse experience in the UI.
  • Airflow lineage emission will only populate specific properties of Tasks and DAGs to limit bloat and avoid leaking environment variables.
  • Schema history feature turned off in UI based on feedback from the community. Will re-emerge in a future release!
  • MongoDB collections with extremely wide schemas will have schema fields sampled to keep the UI responsive.
  • Full changelog below.
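
The effect of the Kafka tokenization change on browsing can be illustrated with a toy function. The `browse_segments` helper and its `tokenize_on_dot` flag are hypothetical and only contrast the two behaviors; they are not part of the Kafka source.

```python
from typing import List


def browse_segments(topic: str, tokenize_on_dot: bool) -> List[str]:
    """Return the browse hierarchy levels derived from a topic name."""
    if tokenize_on_dot:
        # Earlier behavior: each dot-separated part becomes a nested level.
        return topic.split(".")
    # Behavior from this release: the whole topic name is a single level.
    return [topic]
```

A topic named `payments.orders.v1` previously produced three nested levels and now appears as one flat entry.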

ChangeLog

DataHub v0.8.10

13 Aug 18:16
39a0081

Release Highlights

Bugfix release for 0.8.9

  • [#3096] Fix dependency injection issue introduced by this PR
  • Increase REST emitter timeout to 30 seconds by default

ChangeLog

  • #3095 @shirshanka fix(ingest): increasing default ingestion REST timeout to 30 seconds
  • #3096 @dexter-mh-lee fix(upgrade): Fix MAE consumer and upgrade's dependency issue
  • #3092 @jensenity fix(postgres): fix postgres setup to handle existing database

DataHub v0.8.9

13 Aug 05:01
c13d83b
Pre-release

Release Highlights

  • Support for nested structs, union types and key-value schemas in Kafka
  • Support for JDBC Connector based sources in Kafka Connect
  • Support for Okta as a source for User and Group metadata
  • Support for using AWS Glue schema registry

Breaking Changes

  • [#3079]: Introduces a change to fieldPath encoding in schema metadata. Note: This is a backwards-compatible change for the storage layer. Old fieldPaths will still be rendered correctly. At read time, fieldPaths in the new encoding are translated to the old encoding to discover tags written before this change. Tags and descriptions applied to fields earlier (and therefore stored in the old format) will be migrated when new tags are applied or descriptions are edited.
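
The read-time translation from the new fieldPath encoding to the old one can be sketched as below. The release note does not spell out the new encoding, so this assumes it annotates paths with bracketed segments such as `[version=2.0]` and `[type=string]`; the `to_old_field_path` helper is a hypothetical name, not the function used in the codebase.

```python
import re

# Assumed annotation syntax: anything enclosed in square brackets.
_ANNOTATION = re.compile(r"\[[^\]]*\]")


def to_old_field_path(field_path: str) -> str:
    """Drop bracketed annotations, keeping only the dot-separated field names."""
    stripped = _ANNOTATION.sub("", field_path)
    # Removing annotations leaves empty segments behind; discard them.
    return ".".join(segment for segment in stripped.split(".") if segment)
```

Paths already in the old encoding contain no bracketed segments and pass through unchanged, which is what allows tags written before the change to be looked up by the translated path.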

Important Bug Fixes

  • [#3070] Chart and Dataset lineage was broken in release 0.8.8. This has been fixed via [gma-125].

ChangeLog