CUDA error: device-side assert triggered #110

@Raviteja-banda

I have been trying to run speaker_id.py on a customised LibriSpeech dataset, but I have been stuck on this error for the past few days. I have already checked for a mismatch between the number of labels and the number of output neurons, and they agree. I changed the config file as follows:
class_lay=251
batch_size=8
N_batches=251 (since the total number of training files, excluding validation data, is 2008, and 2008 / 8 = 251)

[windowing]
fs=16000
cw_len=200
cw_shift=10

[cnn]
cnn_N_filt=80,60,60
cnn_len_filt=251,5,5
cnn_max_pool_len=3,3,3
cnn_use_laynorm_inp=True
cnn_use_batchnorm_inp=False
cnn_use_laynorm=True,True,True
cnn_use_batchnorm=False,False,False
cnn_act=leaky_relu,leaky_relu,leaky_relu
cnn_drop=0.0,0.0,0.0

[dnn]
fc_lay=2048,2048,2048
fc_drop=0.0,0.0,0.0
fc_use_laynorm_inp=True
fc_use_batchnorm_inp=False
fc_use_batchnorm=True,True,True
fc_use_laynorm=False,False,False
fc_act=leaky_relu,leaky_relu,leaky_relu

[class]
class_lay=251
class_drop=0.0
class_use_laynorm_inp=False
class_use_batchnorm_inp=False
class_use_batchnorm=False
class_use_laynorm=False
class_act=softmax

I have also changed the data paths in the config file.

Below is the stack trace of the error:

C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [1,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [4,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [5,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [6,0,0] Assertion t >= 0 && t < n_classes failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:242: block: [0,0,0], thread: [7,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
File "D:\SincNet-librispeech\speaker_id.py", line 245, in <module>
loss.backward()
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\_tensor.py", line 487, in backward
torch.autograd.backward(
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\autograd\__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: device-side assert triggered
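
The assertion in the trace, t >= 0 && t < n_classes, says that at least one target label passed to the loss lies outside the range 0 to n_classes - 1. With class_lay=251 the valid labels are 0 through 250, so having 251 distinct speakers is not sufficient on its own: the label values themselves must fall in that range. The following is a minimal sketch of how that could be checked and repaired. It assumes the labels are held in a dictionary mapping file paths to integer speaker IDs; the function names and the example dictionary are hypothetical and not taken from speaker_id.py.

```python
def find_out_of_range(labels, n_classes):
    """Return {key: label} for every label outside [0, n_classes - 1]."""
    return {k: v for k, v in labels.items() if not (0 <= int(v) < n_classes)}


def remap_to_contiguous(labels):
    """Map arbitrary integer IDs onto 0..K-1, preserving sorted order."""
    unique_ids = sorted(set(labels.values()))
    id_to_index = {orig: idx for idx, orig in enumerate(unique_ids)}
    remapped = {k: id_to_index[v] for k, v in labels.items()}
    return remapped, id_to_index


# Hypothetical label dictionary: file path -> raw speaker ID.
example_labels = {"spk19/a.wav": 19, "spk26/b.wav": 26, "spk251/c.wav": 251}

# Labels that would trip the assertion when n_classes is 251.
bad = find_out_of_range(example_labels, n_classes=251)

# Contiguous labels 0..K-1 that are safe for a K-way (or larger) output layer.
remapped, id_to_index = remap_to_contiguous(example_labels)
```

In this example the label 251 is flagged because the largest valid index for 251 classes is 250; labels that start at 1 instead of 0 fail in the same way. Because CUDA kernels run asynchronously, the Python traceback may point at loss.backward() rather than at the loss computation itself; running once on the CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1 set, usually reports the failing call more precisely.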

Any possible solutions?
