[BUG] Use of weakref.proxy for rtsys._memsys_module appears to be incorrect #573

@cpcloud

Describe the bug

I am setting up pytest-randomly in order to catch test-order-dependent bugs and this was the first one I found.

When a context is reset and the context's module dict holds the only strong reference to the object behind rtsys._memsys_module, that object gets collected, even though the rtsys APIs that depend on it are still available for use.

The following code is the smallest reproducer I could come up with.

Steps/Code to reproduce bug

import pytest

from numba.cuda.cudadrv import devices
from numba.cuda.memory_management.nrt import rtsys


@pytest.fixture
def alloc_init():
    rtsys.ensure_allocated()
    rtsys.ensure_initialized()


@pytest.fixture
def ctx(alloc_init):
    ctx = devices.get_context()
    yield ctx
    ctx.reset()

    # this fails with a ReferenceError
    str(rtsys._memsys_module)


def test_nothing(ctx):
    """
    The sequence that causes the failure is:

    1. the alloc_init fixture creates a weakref proxy to a CudaPythonModule
       (rtsys._memsys_module) from devices.get_context()
    2. ctx.reset() calls dict.clear() on the context's module dict
    3. since there's only a single strong ref to `rtsys._memsys_module` it gets collected
       during the clear
    4. All ops on the module (a weakref proxy) fail, because it is now
       referencing a dead object
    """

Expected behavior

Since rtsys is a module global, with APIs that liberally use _memsys_module, I would not expect that clearing my current context would immediately make them unusable.
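
The mechanism described in the reproducer's docstring can be shown without CUDA at all. The following is a minimal pure-Python sketch, not the numba-cuda implementation: `FakeModule`, `FakeContext`, `load`, and `proxy_survives_reset` are hypothetical names made up for illustration, and the only real API used is the standard library's `weakref.proxy`. It assumes, as in the sequence above, that the context's dict holds the only strong reference to the module.

```python
import gc
import weakref


class FakeModule:
    """Hypothetical stand-in for the loaded module object."""

    def __init__(self, name):
        self.name = name


class FakeContext:
    """Hypothetical stand-in for a context that owns its modules."""

    def __init__(self):
        self.modules = {}

    def load(self, name):
        module = FakeModule(name)
        # The dict entry is the only strong reference that outlives this call.
        self.modules[name] = module
        # The caller only ever receives a weak proxy.
        return weakref.proxy(module)

    def reset(self):
        # Dropping the dict entries drops the last strong reference.
        self.modules.clear()


def proxy_survives_reset():
    """Return True if the proxy is still usable after the context reset."""
    ctx = FakeContext()
    memsys = ctx.load("memsys")
    assert memsys.name == "memsys"  # usable while the context holds the module

    ctx.reset()
    gc.collect()  # unnecessary under CPython refcounting, harmless elsewhere

    try:
        memsys.name
    except ReferenceError:
        return False
    return True


print(proxy_survives_reset())
```

In this sketch, any holder of the proxy loses access the moment the owning dict is cleared, which matches the behavior reported for rtsys._memsys_module after ctx.reset().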
