Caitlin Cherryh
June 2024
The evolutionary relationships between the Metazoa have been particularly difficult to resolve, due to the deep evolutionary timescales and due to a rapid radiation at the root of the animal tree. Both incomplete lineage sorting and long branch attraction have been hypothesised to impact metazoan tree inference. By applying concordance factors, the topological variation at key branches within the metazoan tree can be quantified to determine the evolutionary processes involved. This repository contains the scripts necessary to perform this analyis by calculating and analysing the gene and quartet concordance factors of 12 empirical metazoan phylogenetic datasets.
- Scripts
- All scripts necessary to completely replicate this analysis are included in the
code/folder - Each script includes an overview, a list of necessary parameters or file paths, and a list of software necessary to run that script
- All scripts necessary to completely replicate this analysis are included in the
- Output
- All files containing output from each stage of the analysis can be found in the
output/folder dataset_included_taxa.tsv: Table with one column per empirical dataset. Column names are dataset names, and all taxon names within that dataset are listed one per rowempirical_alignment_files: Table listing dataset name, dataset ID, alignment file and partition file for each empirical datasetinput_partition_gene_tree_estimation.csv: Input file for estimating gene treesinput_c60_gene_tree_estimation.csv: Input file for estimating gene treesresults_gene_tree_likelihood.csv: Table with log likelihood and BIC score for all gene trees estimated under partition modelsinput_paths_cf_analysis.csv: Input file for concordance factor analysisresults_cf_gCF_values.csv: Gene concordance factor results for all datasetsresults_cf_gCF_values_formatted.csv: Gene concordance factor results for all datasets, formatted differently for plottingresults_cf_qCF_values.csv: Quartet concordance factor results for all datasetsresults_cf_qCF_values_formatted.csv: Quartet concordance factor results for all datasets, formatted differently for plotting
- All files containing output from each stage of the analysis can be found in the
- Constraint trees
- The
constraint_treesdirectory contains a filealternative_phylogenetic_hypotheses.nxe, which contains Newick trees for the three evolutionary hypotheses considered in this study
- The
- Instructions for replication
- Instructions for replicating these analyses are in this
README.mdfile.
- Instructions for replicating these analyses are in this
To fully replicate the analyses, follow these steps:
- Download and install the following software programs:
- Download alignments
- Download the figshare repository for this project
- Estimate phylogenetic trees from each empirical dataset with Partition and C60 substitution models
- Run the script
01_empirical_tree_estimation.R
- Run the script
- Estimate gene trees from each empirical dataset
- To estimate gene trees with the models from the best partitioning scheme, run the script
02_gene_tree_estimation.R. This file contains code to estimate unconstrained gene trees and constrained gene trees. Constrained gene trees are not needed for the concordance analysis and are only used in the constrained tree analysis - To estimate gene trees with a C60 model, run the script
02_C60_gene_tree_estimation.R
- To estimate gene trees with the models from the best partitioning scheme, run the script
- Calculate concordance factors
- Run the script
03_empirical_concordance_factors.R
- Run the script
- Plot results
- Run the script
05_plot_figures_cf.R
- Run the script
- Generate gene trees and constrained gene trees using the script
02_gene_tree_estimation.R - Run the script
04_single_gene_processing.R - Plot results using script
05_plot_figures_cf.R
- Run the script
5_plot_trees.R
If you replicate any part of these analyses or use functions from these scripts, please cite this repository. Thank you!
Caitlin Cherryh, 2024. "Evaluating the impact of conflicting phylogenetic signal on hypotheses of early animal evolution", GitHub repository. https://github.com/caitlinch/ancient_ILS (Accessed 4/6/2024)