PyTorch emits UserWarning for deprecated __floordiv__ operation #926

Closed
@osma

Describe the bug
When lemmatizing certain Finnish expressions, PyTorch emits a UserWarning about the deprecated __floordiv__ operation. Lemmatization still works, and the warning is shown only once per process/session.

This appears to be quite rare: only certain combinations of words trigger it. When processing a large file in Finnish, however, it is eventually triggered. I've done similar lemmatization of long documents in Swedish and English and never saw this warning with those languages.

To Reproduce

This code will trigger the warning for me:

import stanza
nlp = stanza.Pipeline(lang='fi', processors='tokenize,mwt,pos,lemma')
doc = nlp("ettei se")

Output:

2022-01-14 13:39:50 INFO: Loading these models for language: fi (Finnish):
=======================
| Processor | Package |
-----------------------
| tokenize  | tdt     |
| mwt       | tdt     |
| pos       | tdt     |
| lemma     | tdt     |
=======================

2022-01-14 13:39:50 INFO: Use device: cpu
2022-01-14 13:39:50 INFO: Loading: tokenize
2022-01-14 13:39:50 INFO: Loading: mwt
2022-01-14 13:39:50 INFO: Loading: pos
2022-01-14 13:39:51 INFO: Loading: lemma
2022-01-14 13:39:51 INFO: Done loading processors!
[REDACTED]/lib/python3.8/site-packages/stanza/models/common/beam.py:86: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  prevK = bestScoresId // numWords
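
The warning text distinguishes rounding toward zero ('trunc') from rounding toward negative infinity ('floor'). As a minimal pure-Python sketch of that difference (the helper names trunc_div and floor_div are made up for illustration and are not part of PyTorch or Stanza), the two modes only disagree when the quotient is negative:

```python
import math


def trunc_div(a: int, b: int) -> int:
    """Divide and round toward zero (what the warning calls 'trunc')."""
    return math.trunc(a / b)


def floor_div(a: int, b: int) -> int:
    """Divide and round toward negative infinity (Python's // on ints)."""
    return a // b


# Compare both rounding modes on a positive and a negative dividend.
for a, b in [(7, 2), (-7, 2)]:
    print(a, b, trunc_div(a, b), floor_div(a, b))
```

For non-negative operands both helpers return the same value, so the choice of rounding mode would only matter if the divided values could be negative.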

Expected behavior
Expected no UserWarning.

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Python version: 3.8.10 from Ubuntu system package 3.8.10-0ubuntu1~20.04.2
  • Stanza version: 1.3.0 (installed from PyPI in a virtual environment)
  • PyTorch version: 1.10.1 (installed from PyPI in a virtual environment)

Additional context

According to the warning message, the problem seems to be this line:

prevK = bestScoresId // numWords

Here is a PR fixing the same warning in another codebase: NVIDIA/MinkowskiEngine#407
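
A possible fix, following the suggestion in the warning message itself, might look like the sketch below. It is not the actual Stanza code: the tensor values are invented, the snake_case names are stand-ins for bestScoresId and numWords, and it assumes bestScoresId holds non-negative flat indices (which the expression suggests but this report does not confirm). Under that assumption 'floor' and 'trunc' give the same result.

```python
import torch

# Hypothetical stand-ins for the values in beam.py; numbers are illustrative only.
num_words = 5                                  # assumed size of the second dimension
best_scores_id = torch.tensor([0, 4, 7, 13])   # assumed non-negative flat indices

# Explicit rounding mode instead of the deprecated `//` on tensors.
prev_k = torch.div(best_scores_id, num_words, rounding_mode='floor')

# Remainder within each row, recovered from the quotient.
word_id = best_scores_id - prev_k * num_words

print(prev_k.tolist())
print(word_id.tolist())
```

The rounding_mode argument of torch.div is the one named in the warning, so this should work with the PyTorch 1.10.1 listed in the environment above.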
