
DBT-809 Microbatch support for dbt-hive adapter #164


Open: wants to merge 2 commits into base: main

niranjancdw
Collaborator

Describe your changes

Added support for the microbatch incremental strategy in the dbt-hive adapter.

Internal Jira ticket number or external issue link

https://cloudera.atlassian.net/browse/DBT-809

Testing procedure/screenshots (if appropriate):

https://gist.github.com/niranjancdw/36b901d49b55fbe11f2bc4c9237a0d7d

Checklist before requesting a review

  • I have performed a self-review of my code
  • I have formatted my added/modified code to follow PEP 8 standards
  • I have checked suggestions from the Python linter to make sure the code is of good quality

niranjancdw requested review from niteshy and Copilot on July 2, 2025, 09:20

Copilot AI left a comment


Pull Request Overview

Adds support for the new microbatch incremental strategy in the dbt-hive adapter by extending validation macros, integrating the strategy into materialization logic, and adding functional tests.

  • Extend file‐format and incremental‐strategy validators to allow microbatch and enforce partition_by
  • Wire microbatch through strategy dispatch and early validation in incremental.sql
  • Add functional tests covering missing partition_by and bump test adapter dependency

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/functional/adapter/incremental/test_incremental_microbatch.py | New tests for microbatch strategy, including missing partition_by |
| dev-requirements.txt | Bumped dbt-tests-adapter from 1.9.* to 1.10.* |
| dbt/include/hive/macros/materializations/incremental/validate.sql | Added avro format, support for microbatch, and partition_by check |
| dbt/include/hive/macros/materializations/incremental/strategies.sql | Treat microbatch as an insert_overwrite variant |
| dbt/include/hive/macros/materializations/incremental/incremental.sql | Hooked validation macros before running incremental SQL |
| dbt/adapters/hive/relation.py | Introduced _render_subquery_alias helper for conditional aliases |
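
To make the two behavioral changes summarized above concrete, here is a minimal Python sketch of the rules as described: microbatch requires `partition_by`, and microbatch is dispatched as an insert_overwrite variant. The real logic lives in Jinja macros in `validate.sql` and `strategies.sql`; the names `CompilationError`, `validate_microbatch`, and `resolve_strategy` are hypothetical stand-ins, not the adapter's API.

```python
from typing import Optional, Sequence


class CompilationError(Exception):
    """Hypothetical stand-in for the compiler error the macro raises."""


def validate_microbatch(strategy: str, partition_by: Optional[Sequence[str]]) -> None:
    """Reject a microbatch config that has no partition_by (assumed rule)."""
    if strategy == "microbatch" and not partition_by:
        raise CompilationError(
            "dbt-hive 'microbatch' incremental strategy requires a "
            "`partition_by` config."
        )


def resolve_strategy(strategy: str) -> str:
    """Map microbatch onto insert_overwrite; leave other strategies as-is."""
    return "insert_overwrite" if strategy == "microbatch" else strategy


if __name__ == "__main__":
    # Valid config: passes validation and dispatches as insert_overwrite.
    validate_microbatch("microbatch", ["event_date"])
    print(resolve_strategy("microbatch"))

    # Missing partition_by: validation fails before any SQL would run.
    try:
        validate_microbatch("microbatch", None)
    except CompilationError as err:
        print(err)
```

Running the validation before dispatch mirrors the "early validation" described for `incremental.sql`: a misconfigured model fails with a clear message instead of producing an unpartitioned overwrite.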
Comments suppressed due to low confidence (3)

tests/functional/adapter/incremental/test_incremental_microbatch.py:59

  • [nitpick] Class name TestHiveMicroBatchNoPartitionKey uses inconsistent casing for “MicroBatch” compared to TestHiveMicrobatch. Consider renaming to TestHiveMicrobatchNoPartitionKey for consistency.
class TestHiveMicroBatchNoPartitionKey:

dbt/include/hive/macros/materializations/incremental/validate.sql:66

  • [nitpick] This compiler error message is split across multiple lines and may include unintended whitespace or line breaks. Consider consolidating it into a single line or trimming surrounding whitespace to ensure the output is clean.
      dbt-hive 'microbatch' incremental strategy requires a `partition_by` config.
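
The whitespace concern can be illustrated with a small Python sketch. It assumes the message is emitted with the leading indentation and line breaks of the template source; `normalize_message` is a hypothetical helper, and in the macro itself the equivalent would more likely be Jinja whitespace control or the `trim` filter.

```python
def normalize_message(message: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(message.split())


# Message as it might appear when written across indented template lines.
raw_message = """
      dbt-hive 'microbatch' incremental strategy requires a `partition_by` config.
"""

if __name__ == "__main__":
    print(repr(raw_message))
    print(repr(normalize_message(raw_message)))
```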

tests/functional/adapter/incremental/test_incremental_microbatch.py:42

  • Tests cover the error path for missing partition_by, but there isn’t a functional test verifying a successful microbatch run when partition_by is provided. Consider adding a positive test case to confirm correct behavior.
class TestHiveMicrobatch(BaseMicrobatch):
