[azure-storage-file-datalake] Checking for a non-existent file induces memory leak #45999

@hallmeier

  • Package Name: azure-storage-file-datalake
  • Package Version: 12.23.0 (current)
  • Operating System: Linux and Darwin
  • Python Version: 3.14.3 (current)

Describe the bug
When the Azure file client checks for a file that does not exist, objects in the calling scopes are no longer garbage collected.

To Reproduce
I reproduced this on Linux machines running in Azure Kubernetes and on a Mac, both inside and outside a Docker container. Minimal working example (MWE):

import os

from azure.storage.filedatalake import FileSystemClient
from time import sleep

account_name = os.environ["ACCOUNT_NAME"]
file_system_name = os.environ["FILE_SYSTEM_NAME"]
sas_token = os.environ["SAS_TOKEN"]
directory_name = os.environ["DIRECTORY_NAME"]
file_name = os.environ["FILE_NAME"]


file_system_client = FileSystemClient(account_url=f"https://{account_name}.dfs.core.windows.net",
                                      file_system_name=file_system_name, credential=sas_token)
directory_client = file_system_client.get_directory_client(directory_name)


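def check_file():
    # FILE_NAME must name a file that does not exist in the directory
    file_client = directory_client.get_file_client(file_name)
    file_client.exists()
    # Large local list that should be freed when this function returns
    big_list = list(range(10000000))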


i = 0
while True:
    i += 1
    print(f"Iteration {i}")
    check_file()
    sleep(0.5)

This fills about 1 GB of memory every 5 iterations, eventually leading to an out-of-memory error.
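
To put numbers on the growth instead of watching process memory, a small helper along these lines may be useful. It is only a diagnostic sketch: `measure_growth` is a hypothetical name, and it relies solely on the standard-library `tracemalloc` and `gc` modules, not on anything in the SDK.

```python
import gc
import tracemalloc


def measure_growth(func, iterations=5, collect=False):
    """Return the traced heap size in bytes after each call to func.

    With collect=True, a full garbage collection runs after every call,
    which helps distinguish reference cycles (reclaimable by the cycle
    collector) from objects that are still reachable.
    """
    tracemalloc.start()
    try:
        sizes = []
        for _ in range(iterations):
            func()
            if collect:
                gc.collect()  # force a full cycle collection
            current, _peak = tracemalloc.get_traced_memory()
            sizes.append(current)
        return sizes
    finally:
        tracemalloc.stop()
```

Calling `measure_growth(check_file)` and `measure_growth(check_file, collect=True)` side by side should show whether the retained lists are part of collectable cycles or are held by a live reference. If the sizes stay flat only with `collect=True`, the objects are garbage that the collector has not yet visited.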

Expected behavior
This should be able to run forever, as it does when the existence check is omitted or when the checked file exists.
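
The expectation can also be checked per call. The following is a minimal sketch using a weak reference; `Marker` and `local_is_released` are hypothetical names introduced here for illustration, and `Marker` merely stands in for a large local object such as `big_list`.

```python
import gc
import weakref


class Marker:
    """Hypothetical stand-in for a large local object such as big_list."""


def local_is_released(action, collect=False):
    """Run action() in a scope that holds a Marker and report whether
    the Marker has been freed once that scope has returned."""

    def scope():
        marker = Marker()
        ref = weakref.ref(marker)  # does not keep marker alive
        action()
        return ref

    ref = scope()
    if collect:
        gc.collect()  # also reclaim objects that are only held by cycles
    return ref() is None
```

With the clients from the MWE, `local_is_released(lambda: file_client.exists())` would be expected to return `True` in both the existing-file and the missing-file case. A `False` result for the missing file alone would point to something still referencing the caller's frame.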

Metadata

Labels

  • Service Attention. Workflow: This issue is responsible by Azure service team.
  • customer-reported. Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention. Workflow: This issue needs attention from Azure service team or SDK team
  • question. The issue doesn't require a change to the product in order to be resolved. Most issues start as that
