Automatic cleanup of items in /tmp #102

KelvinLinBU · 2025-05-30T19:51:07Z

Closes #38. Items downloaded into /tmp now use tempfile.NamedTemporaryFile so that they are cleaned up automatically.

QuanMPhm · 2025-06-09T18:12:18Z

src/openstack_billing_db/fetch.py

-    logger.info(f"Downloading {key} to {download_location}.")
-    s3.download_file(s3_bucket, key, download_location)
-
+    tmp_gz = tempfile.NamedTemporaryFile(delete=True, suffix=".gz")


In this function, we check whether the fetched S3 file is actually a gzip archive (.gz), and only then do we run gzip to decompress it. Therefore, you should remove suffix=".gz", since the file is not guaranteed to be a gz archive. You should also add the suffix check back in.
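A minimal sketch of what that could look like, assuming the temp file's suffix is taken from the S3 key itself. The helper names `temp_file_for_key` and `decompress_if_gzipped` are hypothetical and do not exist in fetch.py, and Python's gzip module stands in for the gzip subprocess call only so that the example is self-contained:

```python
import gzip
import pathlib
import shutil
import tempfile


def temp_file_for_key(key):
    """Create a temp file whose suffix mirrors the S3 key (hypothetical helper)."""
    # ".gz" for "dump.sql.gz", ".sql" for "dump.sql"
    suffix = pathlib.PurePosixPath(key).suffix
    return tempfile.NamedTemporaryFile(delete=True, suffix=suffix)


def decompress_if_gzipped(src_path, dest_file):
    """Copy src_path into dest_file, decompressing only when it ends in .gz."""
    opener = gzip.open if src_path.endswith(".gz") else open
    with opener(src_path, "rb") as src:
        shutil.copyfileobj(src, dest_file)
    dest_file.flush()
```

With this shape, the suffix check lives in one place and a non-archive download is passed through unchanged instead of being fed to gzip.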

QuanMPhm · 2025-06-09T18:19:25Z

src/openstack_billing_db/fetch.py

-    s3.download_file(s3_bucket, key, download_location)
-
+    tmp_gz = tempfile.NamedTemporaryFile(delete=True, suffix=".gz")
+    s3.download_file(s3_bucket, key, tmp_gz.name)


You reference tmp_gz.name several times in this function. You can instead do this:

Suggested change

s3.download_file(s3_bucket, key, tmp_gz.name)

download_location = tmp_gz.name

Generally, in our repos, whenever we want to reference an object attribute (or any value that requires some traversal to get) more than once, we create a variable for it. This has two advantages:

You can give the value a more informative name.

It can produce a simpler diff. In this case, some of your lines are marked as changed only because you used tmp_gz.name instead of download_location. Following the suggestion above would avoid that and make your PR a bit more succinct.

QuanMPhm · 2025-06-09T18:21:21Z

src/openstack_billing_db/fetch.py

-    s3.download_file(s3_bucket, key, download_location)
-
+    tmp_gz = tempfile.NamedTemporaryFile(delete=True, suffix=".gz")
+    s3.download_file(s3_bucket, key, tmp_gz.name)


Any reason why you moved the s3.download_file statement before the logger.info(f"Downloading {key} to {tmp_gz.name}") call? This changes the logging behavior: the message is now emitted after the download instead of before it.

QuanMPhm · 2025-06-09T18:22:00Z

src/openstack_billing_db/fetch.py

-        if command.returncode != 0:
-            raise Exception(f"Error uncompressing {download_location}.")
+    tmp_sql = tempfile.NamedTemporaryFile(delete=True, suffix=".sql", mode="wb")
+    logger.info(f"Uncompressing {tmp_gz.name}")


Same variable-naming suggestion here for tmp_sql.name.

QuanMPhm · 2025-06-09T18:25:13Z

src/openstack_billing_db/fetch.py

-            raise Exception(f"Error uncompressing {download_location}.")
+    tmp_sql = tempfile.NamedTemporaryFile(delete=True, suffix=".sql", mode="wb")
+    logger.info(f"Uncompressing {tmp_gz.name}")
+    result = subprocess.run(["gzip", "-cd", tmp_gz.name], stdout=tmp_sql)


Any reason why you renamed the variable from command to result?

QuanMPhm · 2025-06-09T18:25:51Z

src/openstack_billing_db/fetch.py

        raise Exception(
            f"Error converting {path_to_dump} to SQLite compatible"
-            f" at {destination_path}."
+            f" at {tmp_converted.name}."


QuanMPhm · 2025-06-09T18:28:51Z

src/openstack_billing_db/fetch.py

Doing a small test myself, it seems that a file created with tempfile.NamedTemporaryFile(delete=True) is deleted as soon as the file object is closed, which happens when the object goes out of scope at the end of the function:

>>> def write_temp():
...     tmp_f = tempfile.NamedTemporaryFile(delete=True, mode='w+')
...     tmp_f.write("test\n")
...     tmp_f.flush()
...     return tmp_f.name
...
>>> temp_f = write_temp()
>>> with open(temp_f) as f:
...     print(f.read())
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/3j/jrd7vlnj1tv7c3vfpf1c9mlm0000gq/T/tmpzqv8wdld'
>>>

Did this work for you in your local environment? It seems Mr. Lars' original comment may be right, and this may require some code refactoring. If you can't find a solution that only involves editing the functions in fetch.py, let me know and we can discuss how to move forward with the issue.
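One possible direction, sketched under the assumption that callers can consume the file inside a `with` block: return the open NamedTemporaryFile object rather than its name, so the caller decides when cleanup happens. `write_temp` mirrors the test above; `read_back` is a hypothetical caller, not a function in fetch.py:

```python
import os
import tempfile


def write_temp():
    """Return the open temp file object; the file lives until it is closed."""
    tmp_f = tempfile.NamedTemporaryFile(delete=True, mode="w+")
    tmp_f.write("test\n")
    tmp_f.flush()
    # Returning the object (not tmp_f.name) keeps a reference alive,
    # so the file is not deleted when this function returns.
    return tmp_f


def read_back():
    """Hypothetical caller: read the file by name, then let the with-block delete it."""
    with write_temp() as tmp_f:
        path = tmp_f.name
        with open(path) as f:
            contents = f.read()
    # Leaving the with-block closed tmp_f, which deleted the file.
    return contents, path
```

If several intermediate files are needed at once, tempfile.TemporaryDirectory used as a context manager in the caller is another option, since everything inside the directory is removed when the block exits.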

Automatic cleanup of items in /tmp

47a6ac8

QuanMPhm self-requested a review June 3, 2025 14:53

QuanMPhm requested changes Jun 9, 2025
