@jimmysway
Contributor

Closes #92

Added fetch function as an out-of-band utility to easily fetch invoices

@jimmysway jimmysway requested a review from QuanMPhm August 19, 2025 18:43
@QuanMPhm
Contributor

@jimmysway It's not perfectly clear to me what the objective of the original issue is, so I will wait until @knikolla is back from PTO before giving this a review

Contributor

@QuanMPhm QuanMPhm left a comment

With every new feature introduced, we expect test cases to be written for it as well. In this case, I would recommend writing a simple test that mocks the S3 bucket, calls fetch(), and asserts that self.data is loaded with the expected data.

@jimmysway jimmysway requested a review from QuanMPhm September 5, 2025 15:33
Contributor

I would suggest adding this test to test_base_invoice.py instead, since fetch() belongs to the base Invoice class. Doing so also means the test will be discovered by unittest. Right now, this file cannot be discovered by unittest, so your test is never run by the workflow.
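
For illustration, a minimal sketch of one likely reason a pytest-style test is invisible to unittest: the default unittest loader only collects methods on TestCase subclasses, not bare module-level functions. The module and test names below are hypothetical, and the thread does not confirm that this is the exact cause in this repo.

```python
import types
import unittest


def test_plain_function() -> None:
    """pytest-style test: a bare module-level function."""
    assert True


class TestInsideTestCase(unittest.TestCase):
    """unittest-style test: a method on a TestCase subclass."""

    def test_method(self) -> None:
        self.assertTrue(True)


# Build a throwaway module holding both styles (hypothetical module name).
sketch_module = types.ModuleType("test_sketch")
sketch_module.test_plain_function = test_plain_function
sketch_module.TestInsideTestCase = TestInsideTestCase

# The default loader collects only TestCase subclasses from a module,
# so the bare function is skipped.
suite = unittest.defaultTestLoader.loadTestsFromModule(sketch_module)
collected = suite.countTestCases()
```

Here only the TestCase method is collected; the bare function would need a pytest run to be picked up.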

from process_report.tests import util as test_utils


def test_fetch_with_mock_s3_bucket(mocker: "MockerFixture") -> None:
Contributor

Regardless of whether the test is written for unittest or pytest, I would suggest using unittest.mock to mock the S3 bucket and read_csv, since that is what we use in the rest of this repo. This also avoids the need to add another dependency.
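
A rough sketch of what such a unittest.mock-based test could look like. SketchInvoice is a hypothetical stand-in for the repo's base Invoice class. Its constructor arguments, the download_file call, and the column names are all assumptions made for illustration, since the actual fetch() implementation is not shown in this thread.

```python
import unittest
from unittest import mock

import pandas


class SketchInvoice:
    """Hypothetical stand-in for the base Invoice class (not the real one)."""

    def __init__(self, name: str, invoice_month: str) -> None:
        self.name = name
        self.invoice_month = invoice_month
        self.data = None

    def fetch(self, s3_bucket) -> None:
        # Assumed behavior: download the CSV from the bucket, then load it.
        local_path = f"{self.name} {self.invoice_month}.csv"
        s3_bucket.download_file(local_path, local_path)
        self.data = pandas.read_csv(local_path)


class TestFetch(unittest.TestCase):
    @mock.patch("pandas.read_csv")
    def test_fetch_with_mock_s3_bucket(self, mock_read_csv):
        # Made-up data returned in place of a real CSV read.
        expected_data = pandas.DataFrame(
            {"Invoice Month": ["2025-08"], "Cost": [10.0]}
        )
        mock_read_csv.return_value = expected_data
        mock_s3_bucket = mock.MagicMock()  # no real S3 access happens

        test_invoice = SketchInvoice("invoice", "2025-08")
        test_invoice.fetch(mock_s3_bucket)

        mock_s3_bucket.download_file.assert_called_once()
        mock_read_csv.assert_called_once()
        self.assertTrue(test_invoice.data.equals(expected_data))
```

Because both the bucket and read_csv are mocked, the test needs no network access, no files on disk, and no extra test dependency.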

test_invoice.fetch(mock_s3_bucket)

assert test_invoice.data is not None
pd.testing.assert_frame_equal(test_invoice.data, expected_data)
Contributor

I would suggest

Suggested change
- pd.testing.assert_frame_equal(test_invoice.data, expected_data)
+ assert test_invoice.data.equals(expected_data)

I think this looks simpler, and is also what the rest of the repo does
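
For context, a small sketch of how the two checks behave when frames differ. The data is made up for illustration. DataFrame.equals returns a boolean, while pandas.testing.assert_frame_equal raises an AssertionError that describes the mismatch.

```python
import pandas as pd

expected = pd.DataFrame({"Cost": [1.0, 2.0]})
actual = pd.DataFrame({"Cost": [1.0, 2.5]})

# DataFrame.equals returns a boolean, so a failing assert carries no detail.
frames_match = actual.equals(expected)

# assert_frame_equal raises AssertionError with a description of the mismatch.
try:
    pd.testing.assert_frame_equal(actual, expected)
    mismatch_report = ""
except AssertionError as err:
    mismatch_report = str(err)
```

Either form works as a test assertion. The equals form is shorter and, per the comment above, matches the rest of the repo.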

@jimmysway jimmysway force-pushed the fix/92-implement-fetch branch from 8aba5e7 to 1368d46 Compare September 10, 2025 16:10
@jimmysway jimmysway requested a review from QuanMPhm September 10, 2025 16:10
Contributor

@QuanMPhm QuanMPhm left a comment

Just a question

import pandas
from typing import TYPE_CHECKING

if TYPE_CHECKING:
Contributor

What is the use of this check?

Contributor Author

https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING

Using this pattern essentially lets you import types only when running type checkers or using code editors, not during actual program execution. This avoids slow imports or circular dependencies while still keeping type hints working properly.
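
A minimal sketch of the pattern described above, using only the standard library. The describe function is a hypothetical example and is not code from this PR.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers; skipped at runtime,
    # because TYPE_CHECKING is False when the program actually runs.
    from unittest.mock import MagicMock


def describe(mock_obj: "MagicMock") -> str:
    # The annotation is a string, so the name MagicMock does not need
    # to exist at runtime for this function to be defined or called.
    return type(mock_obj).__name__
```

The quoted annotation is what makes this safe: an unquoted MagicMock in the signature would raise NameError at definition time unless annotation evaluation is deferred.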

Contributor

@QuanMPhm QuanMPhm Sep 12, 2025

I see. That is interesting. People come up with tools for everything.

I'd argue that this change is beyond the scope of this issue. Introducing the use (or in this case, the implication) of type checkers is separate from adding a fetch function, and should be done in its own PR.

Do note also, by using TYPE_CHECKING here, there's pressure to refactor the entire repo to follow this pattern as well. At least in our repos, whenever a code pattern or tool is proposed to a repo, we will want that feature to be applied uniformly across the entire repo, not just in one-off locations. This generally makes the repo more cohesive and easier to maintain.

That being said, if you'd like to propose using TYPE_CHECKING and encourage the use of type checkers, I would suggest opening a new PR for it and asking people for their opinion. I, for one, am ambivalent.

self.assertTrue(result_invoice.equals(answer_invoice))

@mock.patch("pandas.read_csv")
def test_fetch_with_mock_s3_bucket(self, mock_read_csv: "MagicMock") -> None:
Contributor Author

We're conditionally importing the MagicMock due to the type hints in this function.

Contributor

Following up on the comment above: I suppose it's fine to keep this type hint as a string.

@jimmysway jimmysway requested a review from QuanMPhm September 11, 2025 19:53
Contributor

@QuanMPhm QuanMPhm left a comment

See my comment here

Development

Successfully merging this pull request may close these issues.

Implement a fetch function on the Invoice

2 participants