Conversation

@Anhui-tqhuang commented Nov 20, 2025

related issue:

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Add support for Azure Data Lake Gen2 storage; use the following flag to enable it:

is_azure_data_lake_gen2: true

Verification

test config

type: AZURE
config:
  storage_create_container: true
  storage_account: staks1120033525main
  storage_account_key: xxx
  container: pr-238
  is_azure_data_lake_gen2: true

e2e tests

export THANOS_TEST_OBJSTORE_SKIP="MEMORY,FILESYSTEM,GCS,S3,SWIFT,COS,ALIYUNOSS,BOS,OCI,OBS"
export AZURE_STORAGE_ACCOUNT='staks1120033525main'
export AZURE_STORAGE_ACCESS_KEY='xxxx'
export IS_AZURE_DATA_LAKE_GEN2='true'
go test -timeout 30s -run ^TestObjStore_AcceptanceTest_e2e$ github.com/thanos-io/objstore/objtesting
ok  	github.com/thanos-io/objstore/objtesting	85.917s

run thanos locally

$ ./.bin/thanos tools bucket inspect --objstore.config-file ./config.yaml
ts=2025-11-20T06:56:09.685825Z caller=factory.go:39 level=info msg="loading bucket configuration"
ts=2025-11-20T06:56:12.085033Z caller=azure_data_lake_gen2.go:45 level=info msg="Azure Data Lake Gen2 filesystem successfully created" address=pr-238
ts=2025-11-20T06:56:13.735965Z caller=fetcher.go:690 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.649292625s duration_ms=1649 cached=0 returned=0 partial=0
| ULID | FROM | UNTIL | RANGE | UNTIL-DOWN | #SERIES | #SAMPLES | #CHUNKS | COMP-LEVEL | COMP-FAILED | LABELS | RESOLUTION | SOURCE |
|------|------|-------|-------|------------|---------|----------|---------|------------|-------------|--------|------------|--------|
ts=2025-11-20T06:56:13.737848Z caller=main.go:174 level=info msg=exiting
$ ./.bin/thanos tools bucket verify --objstore.config-file ./config.yaml
ts=2025-11-20T07:02:12.228115Z caller=factory.go:39 level=info msg="loading bucket configuration"
ts=2025-11-20T07:02:14.381131Z caller=verify.go:139 level=info verifiers=overlapped_blocks,index_known_issues msg="Starting verify task"
ts=2025-11-20T07:02:14.381936Z caller=overlapped_blocks.go:30 level=info verifiers=overlapped_blocks,index_known_issues verifier=overlapped_blocks msg="started verifying issue"
ts=2025-11-20T07:02:16.314397Z caller=fetcher.go:690 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=1.932397125s duration_ms=1932 cached=0 returned=0 partial=0
ts=2025-11-20T07:02:16.315963Z caller=index_issue.go:34 level=info verifiers=overlapped_blocks,index_known_issues verifier=index_known_issues msg="started verifying issue" with-repair=false
ts=2025-11-20T07:02:16.602837Z caller=fetcher.go:690 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=286.107166ms duration_ms=286 cached=0 returned=0 partial=0
ts=2025-11-20T07:02:16.602994Z caller=index_issue.go:76 level=info verifiers=overlapped_blocks,index_known_issues verifier=index_known_issues msg="verified issue" with-repair=false
ts=2025-11-20T07:02:16.603063Z caller=verify.go:158 level=info verifiers=overlapped_blocks,index_known_issues msg="verify task completed"
ts=2025-11-20T07:02:16.603778Z caller=main.go:174 level=info msg=exiting

@Anhui-tqhuang force-pushed the support-azure-data-lake-gen-2 branch from 6779b70 to 42d4a8b on November 20, 2025 06:28
Signed-off-by: Anhui-tqhuang <[email protected]>
@ringerc commented Nov 20, 2025

This is a fantastic workaround for the issues reported in #239.

I note that it duplicates a lot of the azure provider, though. Is it feasible to instead probe the Azure Storage container/bucket to determine whether it's a gen2 bucket and switch the logic used within the main azure provider? Perhaps by probing the type, then creating the correct provider impl internally, and sharing routines that have common logic?

MSIResource string `yaml:"msi_resource"`

// IsAzureDataLakeGen2 indicates whether the provided storage account is an Azure Data Lake Gen2 account.
IsAzureDataLakeGen2 bool `yaml:"is_azure_data_lake_gen2"`
@ringerc Nov 20, 2025
Can this be interrogated from the account at runtime, rather than relying on user configuration?

It looks like the azblob Go SDK doesn't support gen2 at all; a different datalake SDK is required, which I expect is why this PR duplicates so much.

The blob SDK has [func (client *BlobClient) getAccountInfoHandleResponse(resp *http.Response) (BlobClientGetAccountInfoResponse, error)](https://github.com/Azure/azure-sdk-for-go/blob/49a431a28f26a3e5ccf3f4b8c00ccada08572a60/sdk/storage/azblob/internal/generated/zz_blob_client.go#L1274-L1280), which is aware of HNS (gen2). This is exposed as func (client *BlobClient) GetAccountInfo(ctx context.Context, options *BlobClientGetAccountInfoOptions) (BlobClientGetAccountInfoResponse, error); see the Go doc for service.Client's GetAccountInfo(...).

For the underlying API info, it's exposed in the Create and List APIs as properties.isHnsEnabled. It's also present in the Get Account Information REST API as the header x-ms-is-hns-enabled when using API version >= 2019-07-07.

So it seems like the main azure client should be able to create a service.Client, call client.GetAccountInfo(), and runtime-dispatch to either the azblob SDK or the azdatalake SDK as required.
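
A minimal sketch of that probe, assuming the azblob service client with shared-key auth; the function name, wiring, and nil-header fallback are illustrative, not from this PR:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/service"
)

// isHNSEnabled asks the storage account whether a hierarchical namespace
// (Data Lake Gen2) is enabled, via the x-ms-is-hns-enabled header that
// GetAccountInfo surfaces.
func isHNSEnabled(ctx context.Context, account, key string) (bool, error) {
	cred, err := service.NewSharedKeyCredential(account, key)
	if err != nil {
		return false, err
	}
	client, err := service.NewClientWithSharedKeyCredential(
		fmt.Sprintf("https://%s.blob.core.windows.net/", account), cred, nil)
	if err != nil {
		return false, err
	}
	resp, err := client.GetAccountInfo(ctx, nil)
	if err != nil {
		return false, err
	}
	// Absent header (older API versions) falls back to false.
	return resp.IsHierarchicalNamespaceEnabled != nil && *resp.IsHierarchicalNamespaceEnabled, nil
}

func main() {
	hns, err := isHNSEnabled(context.Background(), "staks1120033525main", "xxx")
	if err != nil {
		log.Fatal(err)
	}
	// Dispatch point: use the azdatalake client when hns, azblob otherwise.
	fmt.Println("HNS enabled:", hns)
}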

@ringerc
I'm working on a revision of this patch to add autodetection.

One open question is whether the default behaviour should be autodetection, or whether the user should have to explicitly configure it. The only reason not to autodetect by default is that someone might have a running configuration that uses the gen1 azblob client against a gen2 azdatalake account, where switching the client could introduce a behaviour change that breaks the existing app.

I have no evidence that such a regression could occur, though; it's possible that all gen1 SDK ops performed on a gen2 account work unchanged when switching to the gen2 SDK.
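
For illustration only, explicit opt-in could take the shape of a tri-state knob; the azure_data_lake_gen2 flag and its auto value below are hypothetical and not part of this PR:

type: AZURE
config:
  storage_account: staks1120033525main
  storage_account_key: xxx
  container: pr-238
  # hypothetical: "auto" probes GetAccountInfo; true/false pin the client
  azure_data_lake_gen2: auto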

@Anhui-tqhuang (author)
Exactly right, the configuration might not be required.

@Anhui-tqhuang (author)
@ringerc I pushed a commit adding auto-detection: 3efbe73

Comment on lines +267 to +270
if os.Getenv("IS_AZURE_DATA_LAKE_GEN2") == "true" && bkt.Provider() == AZURE {
expected = []string{"obj_5.some", "id1/", "id2/"} // Azure Data Lake Gen2 keeps empty dirs.
}

@ringerc

If I understand correctly, this ^ is the cause of the problem observed with using the objstore client in Loki where it tries to open a directory and fails.

The Loki code is written with the assumption that when all contents of a bucket subpath are removed, the subpath ("directory") is implicitly removed too.

This is the case with a gen1 store, but is not the case for a gen2 store where directories are actual filesystem entities, not just delimiters in the names of objects in a flat namespace. This behaviour difference will persist irrespective of whether azblob or azdatalake is used as the client to communicate with a gen2 store.
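
A sketch of that difference against the objstore Bucket interface; the helper and object name are hypothetical, and the gen2 listing mirrors the expectation encoded in the test above:

import (
	"context"

	"github.com/thanos-io/objstore"
)

// listAfterDelete removes the last object under "id1/" and lists the bucket
// root. On a flat-namespace (gen1) store the "id1/" prefix vanishes with its
// last object; on an HNS (gen2) store the directory is a real entity and
// lingers, empty.
func listAfterDelete(ctx context.Context, bkt objstore.Bucket) ([]string, error) {
	if err := bkt.Delete(ctx, "id1/chunk_000001"); err != nil { // hypothetical object name
		return nil, err
	}
	var names []string
	err := bkt.Iter(ctx, "", func(name string) error {
		names = append(names, name)
		return nil
	})
	// gen1: ["obj_5.some", "id2/"]    gen2: ["obj_5.some", "id1/", "id2/"]
	return names, err
}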

Signed-off-by: Anhui-tqhuang <[email protected]>