Passing bam::record::Record between threads causes a segfault #293

@DonFreed

Thank you for the very nice library!

I'm working on a tool that passes reads between threads. Unfortunately, the tool is producing non-deterministic segmentation faults and other memory errors. I've traced some of these issues back to rust-htslib, which crashes seemingly at random when bam::record::Record values are passed between threads. I'm not very familiar with this library, so I'm wondering whether this is expected behavior.

Here is a simplified example that can reproduce the crash:

use std::error::Error;
use std::str;
use std::sync::mpsc::{self, Receiver};
use std::thread;

use rust_htslib::bam::{Read, Reader};
use rust_htslib::bam::record::Record;

fn sum_mapqs(rx: Receiver<Option<Record>>) -> Result<(), Box<dyn Error>> {
    let mut total_mapq = 0u64;
    loop {
        match rx.recv() {
            Ok(x) => {
                match x {
                    Some(read) => {
                        let mapq = read.mapq();
                        total_mapq = total_mapq.saturating_add(mapq as u64);
                    },
                    None => {  // No more data
                        println!("Total MapQ: {}", total_mapq);
                        return Ok(());
                    },
                }
            },
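            Err(e) => {
                // recv() only fails once every sender is gone, so stop instead of looping forever
                eprintln!("Error receiving data: {}", e);
                return Err(Box::new(e));
            }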
        }
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut bam = match Reader::from_stdin() {
        Ok(bam) => { bam },
        Err(e) => { return Err(Box::new(e)); },
    };

    // Initialize the writer thread
    let (tx, rx) = mpsc::channel();
    let writer = thread::spawn(move || {
        if let Err(e) = sum_mapqs(rx) {
            eprintln!("Error writing output - {}", e);
        }
    });

    for read in bam.records() {
        match read {
            Ok(read) => {
                eprintln!("Parsed read: {}", str::from_utf8(read.qname()).unwrap());
                if let Err(e) = tx.send(Some(read)) {
                    eprintln!("Error sending data to writer thread");
                    return Err(Box::new(e));
                }
            },
            Err(e) => {
                return Err(Box::new(e));
            },
        }
    }

    // Close the spawned thread
    let _ = tx.send(None);
    let _ = writer.join().unwrap();
    Ok(())
}

Compiling with RUSTFLAGS="-g" cargo build and then piping a SAM file through stdin produces the following:

$ cat test.sam | target/debug/rust-htslib-crash
...
Parsed read: H203:185:D2990ACXX:4:1101:11561:5493
23559 Broken pipe             cat test.sam
23560 Segmentation fault      (core dumped) | target/debug/rust-htslib-crash

Re-running the same command will produce the crash in different parts of the program. I've attached a backtrace from one of the crashes: crash backtrace.txt
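
For comparison, here is a minimal sketch of a variant that never moves a Record across the thread boundary. It uses only the standard library; `ReadSummary` and `sum_mapqs_owned` are hypothetical names introduced for illustration, and I am assuming the needed fields would be copied out on the reading thread (for example from `read.qname()` and `read.mapq()`) before sending. This is only a way to sidestep or isolate the problem, not a fix for whatever is happening inside rust-htslib.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical owned copy of the fields the worker needs.
/// It holds no pointers into htslib-managed memory.
#[derive(Debug, Clone, PartialEq)]
pub struct ReadSummary {
    pub qname: Vec<u8>,
    pub mapq: u8,
}

/// Sums mapping qualities on a worker thread that only ever
/// receives owned `ReadSummary` values over the channel.
pub fn sum_mapqs_owned(reads: Vec<ReadSummary>) -> u64 {
    let (tx, rx) = mpsc::channel::<ReadSummary>();

    let worker = thread::spawn(move || {
        // The iterator ends once every sender has been dropped,
        // so no `None` sentinel is needed.
        rx.iter().map(|r| u64::from(r.mapq)).sum::<u64>()
    });

    for read in reads {
        // Assumption: in the real tool each summary would be built
        // on this thread from the record's qname and mapq.
        tx.send(read).expect("worker thread hung up");
    }

    // Dropping the sender tells the worker there is no more data.
    drop(tx);
    worker.join().expect("worker thread panicked")
}
```

If a variant like this runs cleanly on the same input, that would suggest the crash is tied to the Record itself being moved to another thread rather than to the channel logic.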
