namedtuple is incompatible with num_workers>0 in PyTorch DataLoader #32

@KonstantinWilleke

Description

The datasets' __getitem__ cannot return a namedtuple when the DataLoader uses multiple parallel workers, and those workers are needed to reach high data-loading throughput on powerful compute nodes.
This is a fundamental PyTorch limitation: each worker process instantiates its own Dataset object, and the namedtuple class is created inside each instance. Because the class is defined per-process rather than at module level, the workers cannot collate their batches into the "custom" namedtuple.

Potential workarounds:

  • write a custom collate_fn for the DataLoader that converts the entire batch into a namedtuple.
  • return each batch as a dictionary, e.g. {device_name: value_tensor}
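A minimal sketch of the first workaround. The field names and the DataPoint type are hypothetical stand-ins for the datasets' actual namedtuple; the key point is that both the namedtuple class and the collate_fn are defined at module level, so worker processes can pickle them:

```python
from collections import namedtuple

# Hypothetical sample type standing in for the dataset's per-device namedtuple.
# Defining it at module level (not inside Dataset.__init__) is what makes it
# picklable across DataLoader worker processes.
DataPoint = namedtuple("DataPoint", ["images", "responses"])

def namedtuple_collate(batch):
    """Collate a list of DataPoint samples into one DataPoint of batched fields.

    Here the fields are gathered into plain lists; with real tensor data you
    would stack them instead (e.g. with torch.stack or
    torch.utils.data.default_collate applied per field).
    """
    return DataPoint(*(list(field) for field in zip(*batch)))

# Usage (assuming a dataset whose __getitem__ returns a DataPoint):
#   loader = DataLoader(dataset, batch_size=64, num_workers=4,
#                       collate_fn=namedtuple_collate)
```

The dictionary workaround sidesteps the pickling problem entirely, since PyTorch's default collate already handles dicts of tensors, at the cost of losing attribute-style access to the batch fields.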

Metadata

Labels

needs-to-be-checked (Is this issue still relevant and aligned with our current goals?)
