Error: invalid base: 0067 #137

@holmrenser

I'm trying to use simpleaf to build an index for Glycine max (soybean). The genome and GTF files required some preprocessing to get them properly formatted.

I ran the following command (using simpleaf 0.16.2):

simpleaf index --output simpleaf_index --fasta ../../g_max.genome.fasta --gtf ../../g_max.longest_transcripts.gtf --rlen 91 --threads 16 --use-piscem

Which resulted in the following output:

2024-06-04T10:05:48.261414Z  INFO simpleaf::simpleaf_commands::indexing: preparing to make reference with roers
2024-06-04T10:05:50.342651Z  INFO grangers::reader::gtf: Finished parsing the input file. Found 0 comments and 752330 records.
2024-06-04T10:05:51.029383Z  INFO roers: Built the Grangers object for 752330 records
2024-06-04T10:05:51.237147Z  WARN grangers::grangers_info: The exon_number column contains null values. Will compute the exon number from exon start position .
2024-06-04T10:05:51.527120Z  WARN roers: Found missing gene_id and/or gene_name; Imputing. If both missing, will impute using transcript_id; Otherwise, will impute using the existing one.
2024-06-04T10:05:51.549542Z  INFO roers: Proceed 278761 exon records from 55589 transcripts
Error: invalid base: 0067

The error message is a bit cryptic, so I don't really know what to do. I tried searching some of the related Rust repositories but haven't found the source of the error message yet.

If relevant, I can provide the genome and GTF files.

EDIT:

Upon further investigation, this seems to stem from the noodles crate: https://github.com/zaeleus/noodles/blob/906f5237c68fc6b04a73010580d3c4fed2c7b66e/noodles-fasta/src/record/sequence/complement.rs#L24. However, I don't yet understand what is wrong.

Quick Python check, reading 67 as a decimal byte value:

>>> bytes([67])
b'C'

C is a standard base, so it should be possible to reverse complement it?
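
One possibility worth ruling out, offered as an unconfirmed sketch: the check above reads 67 as a decimal value, but the error message may print the offending byte in hexadecimal. I have not verified how noodles formats it. If it is hexadecimal, 0x67 is a lowercase `g` rather than `C`, which would point at lowercase (for example soft-masked) bases in the FASTA file. The helper names `decode_base` and `count_sequence_bytes` below are hypothetical and exist only for this illustration; the second one tallies every character found in sequence lines so unexpected ones stand out.

```python
from collections import Counter


def decode_base(code: str) -> dict:
    """Interpret an error code such as '0067' as decimal and as hexadecimal.

    Which of the two readings matches the error message is an assumption
    to be checked, not something this function can decide.
    """
    return {
        "decimal": chr(int(code, 10)),
        "hex": chr(int(code, 16)),
    }


def count_sequence_bytes(fasta_lines) -> Counter:
    """Count each character seen in sequence lines, skipping FASTA headers."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and '>' header lines
        counts.update(line)
    return counts


print(decode_base("0067"))  # {'decimal': 'C', 'hex': 'g'}

# Tiny in-memory stand-in for a FASTA file with soft-masked bases.
example = [">chr1", "ACGTacgt", "NNgg"]
print(count_sequence_bytes(example))
```

To run the tally on the real genome, pass an open file handle instead of the example list, e.g. `with open("g_max.genome.fasta") as fh: print(count_sequence_bytes(fh))`. Any characters other than uppercase A, C, G, T, and N in the result would be candidates for what the reverse-complement step rejects.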
