Conversation

@NanoNabla
Contributor

I implemented distributed training using Horovod, similar to what already made it into DeepSpeech.
I opened discussion #1849 a while ago to ask whether you want this feature, but there hasn't been any answer yet.

I tried to keep the changes as minimal as possible; it is still possible to run the undistributed code path. I also noticed a slight performance improvement from using Horovod on one of our IBM machines with 6 V100 cards.

I didn't add any CI configuration because I don't have any experience with it.

If you need any help with Horovod don't hesitate to ask.
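For context, here is a minimal sketch (not the actual patch) of the usual Horovod integration pattern for TF1-style training code, keeping the single-process path intact. The `use_horovod` flag is a hypothetical name; the PR may wire this differently.

```python
# Sketch only: typical Horovod wiring for TF1-style graph training,
# guarded so the undistributed code path still works unchanged.
import tensorflow.compat.v1 as tf

use_horovod = True  # e.g. set from a command-line flag (hypothetical)

if use_horovod:
    import horovod.tensorflow as hvd
    hvd.init()

# Pin each worker process to a single GPU when running under Horovod.
session_config = tf.ConfigProto()
if use_horovod:
    session_config.gpu_options.visible_device_list = str(hvd.local_rank())

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
if use_horovod:
    # Averages gradients across all workers on every step.
    optimizer = hvd.DistributedOptimizer(optimizer)

hooks = []
if use_horovod:
    # Broadcast initial variables from rank 0 so all workers start identically.
    hooks.append(hvd.BroadcastGlobalVariablesHook(0))
```

Training would then be launched with something like `horovodrun -np 6 python train.py`, one process per GPU.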

@CLAassistant

CLAassistant commented Aug 3, 2021

CLA assistant check
All committers have signed the CLA.

@NanoNabla
Contributor Author

My PR seems to have gone unnoticed since I opened it in May.
Are you interested in parallel training as in DeepSpeech?

If you are, I will try to get my PR into a mergeable state again. Otherwise, feel free to close this PR.
