CNN scripts cleanup by evaklimentova · Pull Request #2 · BioGeMT/miRBind_2.0

evaklimentova · 2024-10-02T12:07:32Z

Rewriting scripts for miRBind 2022 CNN training.

The encoding script takes the TSV dataset file as input and outputs two .npy files: one with the encoded 2D binding matrices and one with the corresponding labels. Because the dataset files are now much bigger than the older datasets, processing is done in smaller batches.
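A minimal sketch of what such batched encoding could look like, not the script from this PR. The column names (`noncodingRNA`, `gene`, `label`), the 50x20 matrix shape, the Watson-Crick-only pairing rule, and the function names `encode_pair` and `encode_tsv` are all assumptions made for illustration. The output arrays are preallocated on disk with `np.lib.format.open_memmap`, so only one batch of rows is held in memory at a time.

```python
import csv
import itertools

import numpy as np

# Assumed matrix shape: 50 target positions x 20 miRNA positions.
SHAPE = (50, 20)
# Assumed pairing rule: Watson-Crick pairs only (U is treated as T).
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def encode_pair(mirna, target, shape=SHAPE):
    """Return a 2D matrix with 1.0 where target[i] pairs with mirna[j]."""
    matrix = np.zeros(shape, dtype=np.float32)
    target = target[: shape[0]].upper().replace("U", "T")
    mirna = mirna[: shape[1]].upper().replace("U", "T")
    for i, t in enumerate(target):
        for j, m in enumerate(mirna):
            if (t, m) in PAIRS:
                matrix[i, j] = 1.0
    return matrix


def encode_tsv(tsv_path, data_out, labels_out, batch_size=1000,
               mirna_col="noncodingRNA", target_col="gene", label_col="label"):
    """Encode a TSV file batch by batch into two .npy files; return row count."""
    # First pass: count rows so the output arrays can be preallocated on disk.
    with open(tsv_path, newline="") as fh:
        n_rows = sum(1 for _ in csv.DictReader(fh, delimiter="\t"))

    data = np.lib.format.open_memmap(
        data_out, mode="w+", dtype=np.float32, shape=(n_rows, *SHAPE, 1))
    labels = np.lib.format.open_memmap(
        labels_out, mode="w+", dtype=np.float32, shape=(n_rows,))

    # Second pass: read, encode, and write one batch of rows at a time.
    with open(tsv_path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        start = 0
        while True:
            batch = list(itertools.islice(reader, batch_size))
            if not batch:
                break
            stop = start + len(batch)
            data[start:stop, :, :, 0] = [
                encode_pair(row[mirna_col], row[target_col]) for row in batch]
            labels[start:stop] = [float(row[label_col]) for row in batch]
            start = stop

    data.flush()
    labels.flush()
    return n_rows
```

Both output files are regular .npy files, so they can be read back with `np.load`. The sketch assumes a non-empty dataset.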

The training script takes the encoded dataset, the dataset's pos:neg ratio (used for balancing during training), and the size of the dataset. It trains a model with the same hyperparameters as described in the miRBind paper. What differs from the original version is how the dataset is handled internally: the data is loaded into memory only when needed, one batch at a time.
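One way to get per-batch loading from .npy files is to memory-map them and copy only the current slice, as in the sketch below. This is an illustration under assumptions, not the code from this PR: the function name `iter_batches`, the `neg_per_pos` parameter, and the use of the pos:neg ratio as a per-sample weight on positives are all hypothetical choices.

```python
import numpy as np


def iter_batches(data_path, labels_path, batch_size=32, neg_per_pos=1.0):
    """Yield (x, y, weights) batches without loading the whole dataset."""
    # mmap_mode="r" maps the files instead of reading them into memory.
    data = np.load(data_path, mmap_mode="r")
    labels = np.load(labels_path, mmap_mode="r")

    for start in range(0, len(labels), batch_size):
        stop = start + batch_size
        # np.array copies just this slice from disk into memory.
        x = np.array(data[start:stop])
        y = np.array(labels[start:stop])
        # Assumed balancing scheme: with a 1:N pos:neg ratio, weight each
        # positive sample by N so both classes contribute equally.
        weights = np.where(y == 1, neg_per_pos, 1.0)
        yield x, y, weights
```

A training loop would consume these batches one at a time, for example by passing each `(x, y, weights)` triple to a per-batch training call of whichever framework is used.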

katarinagresova · 2024-11-07T16:04:24Z

@evaklimentova I think many parts of the code are the same as the code in the miRBench_paper repo. Please incorporate the changes made in miRBench_paper here as well. Would it make sense to extract the common code into a separate repo and import it from there?

evaklimentova · 2024-11-07T16:09:33Z

Yep, I will transfer the changes here at some point.

I'm not sure about making a separate repo; I think it's a bit of overkill for just two scripts...

katarinagresova · 2025-04-10T10:16:21Z

@evaklimentova @davidcechak is this pull request still valid? I think David already put this code into his #7. Can we close it? It has been open for half a year now.

evaklimentova · 2025-04-10T11:08:27Z

I would discard the branch. All the scripts regarding the miRBench CNN are in that repo and up to date, so I guess we don't need to duplicate them here.

katarinagresova · 2025-04-10T11:12:00Z

Well, they are replicated in @davidcechak's PR. I will close this one. Thanks!

evaklimentova requested a review from stephaniesamm October 2, 2024 12:07

evaklimentova assigned davidcechak Oct 2, 2024

evaklimentova added 2 commits October 15, 2024 11:23

Sync with main

7eb1cef

miRBind CNN cleanup

e92e058

evaklimentova force-pushed the miRBind_CNN branch from ee09861 to e92e058 Compare October 16, 2024 07:32

adding pipeline for running miRBin CNN training

63f0558

katarinagresova closed this Apr 10, 2025

evaklimentova wants to merge 3 commits into main from miRBind_CNN


3 participants
