CNN scripts cleanup #2

Closed
evaklimentova wants to merge 3 commits into main from miRBind_CNN

Conversation

@evaklimentova
Collaborator

Rewriting scripts for miRBind 2022 CNN training.

The encoding script takes a TSV dataset file as input and outputs two .npy files: one with the encoded 2D binding matrices and the other with the corresponding labels. Because the dataset files are now much larger than the older datasets, processing is done in smaller batches.
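
As a rough illustration of the batch-wise encoding described above, here is a minimal sketch, not the code from this PR. The column names `noncodingRNA`, `gene` and `label`, the 50x20 matrix size, the Watson-Crick-only pairing rule, and the helper names `encode_pair` and `encode_tsv` are all assumptions made for the example. The outputs are preallocated on disk with `np.lib.format.open_memmap`, so only one batch of rows is held in memory at a time.

```python
import csv
from itertools import islice

import numpy as np

# Assumed pairing rule: Watson-Crick pairs only (U is treated as T).
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def encode_pair(mirna, target, mirna_len=20, target_len=50):
    """Return a (target_len, mirna_len) matrix with 1.0 where bases can pair."""
    mirna = mirna.upper().replace("U", "T")[:mirna_len]
    target = target.upper().replace("U", "T")[:target_len]
    matrix = np.zeros((target_len, mirna_len), dtype=np.float32)
    for i, t in enumerate(target):
        for j, m in enumerate(mirna):
            if (t, m) in PAIRS:
                matrix[i, j] = 1.0
    return matrix  # positions beyond the sequence ends stay zero


def encode_tsv(tsv_path, data_path, labels_path, batch_size=1024):
    """Encode a TSV file batch by batch into two .npy files."""
    # First pass: count rows so both output arrays can be preallocated.
    with open(tsv_path, newline="") as handle:
        n_rows = sum(1 for _ in csv.DictReader(handle, delimiter="\t"))

    data = np.lib.format.open_memmap(
        data_path, mode="w+", dtype=np.float32, shape=(n_rows, 50, 20, 1)
    )
    labels = np.lib.format.open_memmap(
        labels_path, mode="w+", dtype=np.float32, shape=(n_rows,)
    )

    # Second pass: encode and write one batch of rows at a time.
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        start = 0
        while True:
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            stop = start + len(batch)
            data[start:stop, :, :, 0] = np.stack(
                [encode_pair(row["noncodingRNA"], row["gene"]) for row in batch]
            )
            labels[start:stop] = [float(row["label"]) for row in batch]
            start = stop

    data.flush()
    labels.flush()
```

Reading the file twice trades some I/O for a fixed memory footprint; if the row count is already known, the first pass could be skipped.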

The training script takes the encoded dataset, the dataset's pos:neg ratio (used for balancing during training), and the size of the dataset. It trains a model with the same hyperparameters as described in the miRBind paper. What differs from the original version is how the dataset is handled internally: the data is loaded into memory only when needed (per batch).
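
The per-batch loading described above could look roughly like the sketch below. It is not the script from this PR: the generator name `iter_batches`, the `neg_per_pos` parameter, and the choice to balance classes through per-sample weights are assumptions for illustration, and the actual script may balance differently. The arrays are opened with `np.load(..., mmap_mode="r")`, so each batch is copied into memory only when it is requested.

```python
import numpy as np


def iter_batches(data_path, labels_path, batch_size=32, neg_per_pos=1.0):
    """Yield (x, y, sample_weights) batches without loading the whole dataset."""
    # mmap_mode="r" keeps the arrays on disk; slices are read lazily.
    data = np.load(data_path, mmap_mode="r")
    labels = np.load(labels_path, mmap_mode="r")

    for start in range(0, len(labels), batch_size):
        stop = start + batch_size
        # np.array copies just this slice into regular in-memory arrays.
        x = np.array(data[start:stop])
        y = np.array(labels[start:stop])
        # Assumed balancing scheme: weight each positive by the number of
        # negatives per positive so both classes contribute equally overall.
        weights = np.where(y == 1, neg_per_pos, 1.0).astype(np.float32)
        yield x, y, weights
```

With a 1:10 pos:neg dataset, one would call `iter_batches(data_path, labels_path, neg_per_pos=10.0)` and pass each batch to the training step of whichever framework is in use.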

@katarinagresova
Member

@evaklimentova I think many parts of the code are the same as the code in the miRBench_paper repo. Please incorporate the changes made in miRBench_paper here as well. Would it make sense to extract the common code into a separate repo and import it from there?

@evaklimentova
Collaborator Author

> @evaklimentova I think many parts of the code are the same as the code in the miRBench_paper repo. Please incorporate the changes made in miRBench_paper here as well. Would it make sense to extract the common code into a separate repo and import it from there?

Yep, I will transfer the changes here at some point.

I'm not sure about making a separate repo; I think it's a bit of overkill for just two scripts...

@katarinagresova
Member

@evaklimentova @davidcechak is this pull request still valid? I think David already put this code into his #7. Can we close it? It has been open for half a year now.

@evaklimentova
Collaborator Author

> @evaklimentova @davidcechak is this pull request still valid? I think David already put this code into his #7. Can we close it? It has been open for half a year now.

I would discard the branch. All the scripts regarding the miRBench CNN are in that repo and up to date, so I guess we don't need to duplicate them here.

@katarinagresova
Member

> > @evaklimentova @davidcechak is this pull request still valid? I think David already put this code into his #7. Can we close it? It has been open for half a year now.
>
> I would discard the branch. All the scripts regarding the miRBench CNN are in that repo and up to date, so I guess we don't need to duplicate them here.

Well, they are replicated in @davidcechak's PR. I will close this one. Thanks!
