Skip to content

Explore fixing test issues for older pytorch versions #3434

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

goanpeca
Copy link
Collaborator

@goanpeca goanpeca commented Jul 8, 2025

Supersedes #3356

Description:

Check list:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)

@github-actions github-actions bot added module: metrics Metrics module ci CI labels Jul 8, 2025
@goanpeca goanpeca changed the title Fix testing issues for older pytorch versions Explore fixing test issues for older pytorch versions Jul 8, 2025
@github-actions github-actions bot added the docker label Jul 8, 2025
@goanpeca goanpeca force-pushed the fix/pytorch-versions branch from 6f5a787 to 6b337b8 Compare July 8, 2025 21:22
@goanpeca goanpeca closed this Jul 9, 2025
@goanpeca goanpeca reopened this Jul 9, 2025
@goanpeca goanpeca force-pushed the fix/pytorch-versions branch from 9b1a70a to 08cf6f6 Compare July 31, 2025 18:26
@goanpeca goanpeca marked this pull request as ready for review July 31, 2025 18:27
@Copilot Copilot AI review requested due to automatic review settings July 31, 2025 18:27
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses compatibility issues with older PyTorch versions by removing or modifying features that were introduced in newer versions. The changes primarily focus on replacing stable=True parameter in torch.argsort() (added in PyTorch 1.13) with version-specific handling and fixing device comparison syntax.

  • Remove .bool() calls on tensor operations that already return boolean tensors for older PyTorch compatibility
  • Replace direct device object comparisons with device type string comparisons
  • Add version checks to conditionally use the stable parameter in torch.argsort()

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/ignite/metrics/vision/test_object_detection_map.py Replace device object comparisons with device type string comparisons
tests/ignite/metrics/test_mean_average_precision.py Remove redundant .bool() calls on torch.randint() operations
tests/ignite/distributed/utils/test_native.py Add version checks and test ordering for older PyTorch compatibility
ignite/metrics/vision/object_detection_average_precision_recall.py Add version-based handling for stable parameter in torch.argsort() and fix tensor operations
ignite/metrics/mean_average_precision.py Add version checks for stable parameter and fix device comparisons
ignite/metrics/gan/fid.py Add .numpy() calls for scipy operations that require numpy arrays
.github/workflows/pytorch-version-tests.yml Fix Python version string formatting and installation commands

@@ -37,7 +37,7 @@ def fid_score(
diff = mu1 - mu2

# Product might be almost singular
covmean, _ = scipy.linalg.sqrtm(sigma1.mm(sigma2), disp=False)
covmean, _ = scipy.linalg.sqrtm(sigma1.mm(sigma2).numpy(), disp=False)
Copy link
Preview

Copilot AI Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting the tensor to numpy adds an unnecessary GPU-to-CPU transfer. Consider checking if the input tensors are on GPU and handle the conversion more efficiently, or use PyTorch native operations if available.

Copilot uses AI. Check for mistakes.

Comment on lines +51 to 53
tr_covmean = np.sum(np.sqrt(((np.diag(sigma1.numpy()) * eps) * (np.diag(sigma2.numpy()) * eps)) / (eps * eps)))

return float(diff.dot(diff).item() + torch.trace(sigma1) + torch.trace(sigma2) - 2 * tr_covmean)
Copy link
Preview

Copilot AI Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple .numpy() calls on the same tensors (sigma1 and sigma2) create redundant GPU-to-CPU transfers. Consider converting these tensors to numpy once and reusing the numpy arrays.

Suggested change
tr_covmean = np.sum(np.sqrt(((np.diag(sigma1.numpy()) * eps) * (np.diag(sigma2.numpy()) * eps)) / (eps * eps)))
return float(diff.dot(diff).item() + torch.trace(sigma1) + torch.trace(sigma2) - 2 * tr_covmean)
tr_covmean = np.sum(np.sqrt(((np.diag(sigma1) * eps) * (np.diag(sigma2) * eps)) / (eps * eps)))
return float(diff.dot(diff).item() + torch.trace(torch.tensor(sigma1)) + torch.trace(torch.tensor(sigma2)) - 2 * tr_covmean)

Copilot uses AI. Check for mistakes.

@goanpeca goanpeca force-pushed the fix/pytorch-versions branch from 08cf6f6 to 8cd193c Compare July 31, 2025 18:28
@goanpeca
Copy link
Collaborator Author

@vfdev-5 I think these fixes all the issues on pytorch versions.

@goanpeca
Copy link
Collaborator Author

Ok, it seems now there are som typing issues

ignite/distributed/comp_models/native.py:182: error: Invalid index type
"int | float" for "Tensor"; expected type
"SupportsIndex | bool | int | slice[Any, Any, Any] | EllipsisType | Tensor | _NestedSequence[bool | int | slice[Any, Any, Any] | EllipsisType | Tensor | None] | None | tuple[SupportsIndex | bool | int | slice[Any, Any, Any] | EllipsisType | Tensor | _NestedSequence[bool | int | slice[Any, Any, Any] | EllipsisType | Tensor | None] | None, ...]"
 [index]
                local_rank = rank - cumsum_sizes[node_rank].item()
                                                 ^~~~~~~~~
ignite/distributed/comp_models/native.py:183: error: Incompatible return value
type (got "tuple[int, int | float]", expected "tuple[int, int]")  [return-value]
                return int(local_rank), node_rank
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:137: error: Need type annotation for "_task" 
[var-annotated]
            self._task = Task.current_task()
                         ^~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:861: error: Name "Task" already defined
(possibly by an import)  [no-redef]
                    from trains import Task
                    ^~~~~~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:871: error: Need type annotation for "_task" 
[var-annotated]
            self._task = Task.current_task()
                         ^~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:936: error: Name "WeightsFileHandler" already
defined (possibly by an import)  [no-redef]
                    from trains.binding.frameworks import WeightsFileHandl...
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:960: error: Argument 1 to "add_pre_callback"
of "WeightsFileHandler" has incompatible type "Callable[[str, Any], Any]";
expected "Callable[[str | CallbackType, ModelInfo], ModelInfo | None]" 
[arg-type]
    ... pre_cb_id = WeightsFileHandler.add_pre_callback(cb_context.pre_callba...
                                                        ^~~~~~~~~~~~~~~~~~~~~~~
ignite/handlers/clearml_logger.py:961: error: Argument 1 to "add_post_callback"
of "WeightsFileHandler" has incompatible type "Callable[[str, Any], Any]";
expected "Callable[[str | CallbackType, ModelInfo], ModelInfo]"  [arg-type]
    ...ost_cb_id = WeightsFileHandler.add_post_callback(cb_context.post_callb...
                                                        ^~~~~~~~~~~~~~~~~~~~~~~~
Found 8 errors in 2 files (checked 154 source files)

@goanpeca
Copy link
Collaborator Author

I dont understand where those issues are coming from now. I run mypy locally and no errors are reported. :\

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Aug 1, 2025

Maybe, let's just ignore them with appropriate comment ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants