Releases: trinezac/SG_optimization

MSPminer_benchmark

14 Jun 09:24
c862129

The following scripts and data files are used to benchmark MSPminer on the simulated gene catalogue.

SG identification requires two inputs: a read count matrix, and a record of each species with its respective genes and their gene lengths.
The data used for the analysis, provided by Borderes M, et al.*, can be found at https://zenodo.org/record/4306051#.Yg5xgy8w2u5
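
As a rough illustration of the two inputs described above, the following R sketch builds a toy read count matrix and a species record with gene lengths. All object names, identifiers, and the layout (genes as rows, samples as columns) are assumptions made for illustration only; they are not the format that SG_refinement_SCG.R actually expects, which is produced by mspminer_format_conversion.R.

```r
# Hypothetical toy inputs; names and layout are illustrative assumptions,
# not the format required by SG_refinement_SCG.R.

# Read count matrix: one row per gene, one column per sample.
counts <- matrix(
  c(10, 0, 5,
    20, 4, 0,
     0, 8, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3"),
                  c("sampleA", "sampleB", "sampleC"))
)

# Gene lengths in base pairs, named by gene identifier.
gene_lengths <- c(gene1 = 1000, gene2 = 2000, gene3 = 500)

# Species record: which genes belong to which species.
species_genes <- list(species1 = c("gene1", "gene2"),
                      species2 = "gene3")

# Basic consistency checks: every gene in the count matrix has a length,
# and every gene assigned to a species appears in the count matrix.
stopifnot(all(rownames(counts) %in% names(gene_lengths)))
stopifnot(all(unlist(species_genes) %in% rownames(counts)))
```

Checks like the two `stopifnot()` calls are a cheap way to catch mismatched gene identifiers before running a longer analysis.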

  • Functions.R: Contains the functions used for SG optimization.
  • SG_refinement_SCG.R: Main script for optimizing the SGs.
  • abundance_profiles.R: Takes the optimized SGs from SG_refinement_SCG.R and creates relative abundance estimates.
  • mspminer_format_conversion.R: Takes the data from Borderes M, et al.* and converts it to the format used by SG_refinement_SCG.R.
  • tax_df.RDS: Created using the information in the supplementary file, Table S3, provided by Borderes M, et al.*

*Borderes M, et al. A comprehensive evaluation of binning methods to recover human gut microbial species from a non-redundant reference gene catalog. NAR Genom Bioinform. 2021 Mar 1;3(1):lqab009.