Closed
Labels
needs-to-be-checked: Is this issue still relevant and aligned with our current goals?
Description
The `__getitem__` of the datasets cannot return a namedtuple when multiple parallel workers are used. Parallel workers are needed to reach high data-loading speeds on powerful compute nodes.
This is a fundamental PyTorch issue: each worker process instantiates its own `Dataset` object, so the namedtuple class is defined separately in each worker, and the default collation cannot merge the batches into a "custom" namedtuple.
Potential workarounds:
- write a custom `collate_fn` for the `DataLoader` which converts the entire batch into a namedtuple
- have each batch return a dictionary, as in `{device_name: value_tensor}`
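A minimal sketch of the first workaround. The key point is that the namedtuple class is defined at module level, so every worker process resolves the same class instead of a per-`Dataset` copy. The `Sample` type and its fields are hypothetical placeholders for the real dataset's fields, and plain lists stand in for stacked tensors (in real use you would stack each field with `torch.stack` and pass `collate_fn=collate_to_namedtuple` to the `DataLoader`):

```python
from collections import namedtuple

# Hypothetical record type; the real dataset's fields will differ.
# Defining it at module level (not inside the Dataset) means every
# worker process picks up the same class, so collation succeeds.
Sample = namedtuple("Sample", ["signal", "label"])

def collate_to_namedtuple(batch):
    """Collate a list of plain (signal, label) tuples into one Sample.

    __getitem__ should return a plain tuple; this function converts the
    whole batch into the custom namedtuple. Plain lists are used here to
    keep the sketch dependency-free; with PyTorch each field would be
    stacked into a tensor instead.
    """
    signals, labels = zip(*batch)
    return Sample(signal=list(signals), label=list(labels))

batch = collate_to_namedtuple([(0.1, 0), (0.2, 1)])
print(batch.signal)  # [0.1, 0.2]
print(batch.label)   # [0, 1]
```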