Skip to content

Training the RDP classifier -c option #23

@ddavis3739

Description

@ddavis3739

RDPstaff,

I am trying to retrain the RDP classifier and have an issue with the -c option. I have already prepped my seq and tax files (end of email) and trained RDP against them.

It output 4 files (below), but none of them is the properties file.

bergeyTrainingTree.xml logWordPrior.txt
genus_wordConditionalProbList.txt wordConditionalProbIndexArr.txt

Do I need to include the -c file to get this? If so, there is no information anywhere on how to generate it that I can find so I was hoping can help. According to the README, "It should at least three columns: name, rank and mean for the lowest rank taxon to be trained". What do you mean by mean in the context of this file? Furthermore, how should I go about generating the whole file?

SEQ FILE

AB353770|AB353770.1.1740_U Root;Eukaryota;Alveolata;Dinoflagellata;Dinophyceae;Peridiniales;Kryptoperidiniaceae;Unruhdinium
ATGCTTGTCTCAAAGATTAAGCCATGCATGTCTCAGTATAAGCTTTTACATGGCGAAACTGCGAATGGCTCATTAAAACAGTTACAGTTTATTTGAAG (cont.)

TAX FILE

0*Root*-1*0*rootrank
1*Eukaryota*0*1*domain
2*Alveolata*1*2*supergroup
3*Dinoflagellata*2*3*division
4*Dinophyceae*3*4*class
5*Peridiniales*4*5*order

Thanks for the help

-Andrew Davis

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions