Memory leak when running multiple sessions #1165

Open
@scodesido-at-proton

Description

It seems that rendering PDFs multiple times within the same process causes a pretty bad memory leak. I've been able to reproduce it with a relatively minimal example that compiles some very basic inline TeX with no extra packages. The leak appears as soon as the session is run with .run(); removing that line makes the leak go away.

My context is that I want to use tectonic to render PDFs inside a web service (related to the discussion in here). I first noticed the leak there, and a leak like this is obviously not acceptable in a long-running service.

This is the Cargo.toml of the minimal example to reproduce, which uses the most recent available versions (although the problem also arises in e.g. 0.14 and in other minor versions of the dependency crates).

[package]
name = "memtest"
version = "0.1.0"
edition = "2021"

[dependencies]
memory-stats = "1.1"
tectonic = "0.15"
tectonic_bundles = "0.3"
tectonic_engine_xetex = "0.4"
tectonic_bridge_harfbuzz = { version = "0.2", features = ["external-harfbuzz"] }

And this is the code that showcases the memory leak:

use memory_stats::memory_stats;
use std::fs::create_dir_all;
use tectonic::{
    driver::{OutputFormat, ProcessingSessionBuilder},
    status::NoopStatusBackend,
};
use tectonic_bundles::{cache::Cache, get_fallback_bundle_url, itar::IndexedTarBackend};

pub fn main() {
    println!("Starting");
    create_dir_all("./tectonic_cache/formats/").expect("Creating format cache folder");

    for n in 0..=200 {
        let mut status = NoopStatusBackend::default();

        let mut cache = Cache::get_for_custom_directory("./tectonic_cache/");
        let bundle_url = get_fallback_bundle_url(tectonic_engine_xetex::FORMAT_SERIAL);
        let bundle = cache
            .open::<IndexedTarBackend>(&bundle_url, false, &mut status)
            .expect("Opening cache");

        let tex = r"
            \documentclass{article}
            \begin{document}
                hello
            \end{document}
        ";

        let mut session_builder = ProcessingSessionBuilder::default();
        session_builder
            .bundle(Box::new(bundle))
            .primary_input_buffer(tex.as_bytes())
            .filesystem_root(".")
            .tex_input_name("texput.tex")
            .format_name("latex")
            .format_cache_path("./tectonic_cache/formats/")
            .keep_logs(false)
            .keep_intermediates(false)
            .print_stdout(false)
            .output_format(OutputFormat::Pdf)
            .do_not_write_output_files();

        let mut session = session_builder
            .create(&mut status)
            .expect("Creating session");

        session.run(&mut status).expect("Compiling TeX");

        if n % 10 == 0 {
            let mem_use = memory_stats().expect("Checking memory");
            println!(
                "n={} physical={:.1} MiB virtual={:.1} MiB",
                n,
                (mem_use.physical_mem as f64) / 1024.0 / 1024.0,
                (mem_use.virtual_mem as f64) / 1024.0 / 1024.0,
            );
        }
    }
}

which outputs:

Starting
n=0 physical=57.6 MiB virtual=113.3 MiB
n=10 physical=75.4 MiB virtual=138.1 MiB
n=20 physical=85.1 MiB virtual=148.1 MiB
n=30 physical=99.4 MiB virtual=159.1 MiB
n=40 physical=105.1 MiB virtual=168.1 MiB
n=50 physical=115.1 MiB virtual=178.1 MiB
n=60 physical=125.1 MiB virtual=188.1 MiB
n=70 physical=135.1 MiB virtual=198.1 MiB
n=80 physical=145.1 MiB virtual=208.1 MiB
n=90 physical=155.1 MiB virtual=218.1 MiB
n=100 physical=165.1 MiB virtual=228.1 MiB
n=110 physical=175.1 MiB virtual=238.1 MiB
n=120 physical=185.1 MiB virtual=248.1 MiB
n=130 physical=195.1 MiB virtual=258.1 MiB
n=140 physical=205.1 MiB virtual=268.1 MiB
n=150 physical=215.1 MiB virtual=278.1 MiB
n=160 physical=225.1 MiB virtual=288.1 MiB
n=170 physical=235.1 MiB virtual=298.1 MiB
n=180 physical=245.1 MiB virtual=308.1 MiB
n=190 physical=255.1 MiB virtual=318.1 MiB
n=200 physical=265.1 MiB virtual=328.1 MiB

This specific test was run on Debian Bookworm, but I first noticed the leak inside a Docker image based on Alpine, so I doubt it's related to the underlying libs. The leak is quite bad, and it gets worse if e.g. the document includes images, which I have left out of the example for the sake of simplicity.
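To put a rough number on "quite bad", the growth per run can be estimated from the n=0 and n=200 lines of the output above. This is only a back-of-the-envelope sketch: the helper name growth_per_run is made up for illustration, and it assumes the growth is roughly linear, which the intermediate samples suggest.

```rust
/// Average growth per iteration, in MiB, between two memory samples.
/// `growth_per_run` is a hypothetical helper, not part of any crate.
fn growth_per_run(first_mib: f64, last_mib: f64, runs: u32) -> f64 {
    (last_mib - first_mib) / f64::from(runs)
}

fn main() {
    // Values copied from the n=0 and n=200 lines of the output above.
    let first_column = growth_per_run(57.6, 265.1, 200);
    let second_column = growth_per_run(113.3, 328.1, 200);

    println!("first column:  {:.2} MiB per run", first_column);
    println!("second column: {:.2} MiB per run", second_column);
}
```

Both columns grow by roughly 1 MiB per compiled document, even for this trivial input.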

As for why I create a Bundle and a Session on every iteration: the .into_file_data() method of ProcessingSession consumes the session, which means I cannot reuse the session for multiple runs. The .bundle() method of ProcessingSessionBuilder consumes the bundle, and the .create() method consumes the builder too. It seems ownership is required all the way from creating the bundle to getting the PDF, and it doesn't look like this can be worked around.
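The ownership chain described above can be illustrated with a toy model. The types below (Bundle, SessionBuilder, Session) are hypothetical stand-ins, not the tectonic types; they only mirror the by-value signatures as I understand them, and the returned bytes are a placeholder.

```rust
// Hypothetical stand-ins that only mimic the ownership flow described above.
struct Bundle;

#[derive(Default)]
struct SessionBuilder {
    bundle: Option<Box<Bundle>>,
}

struct Session {
    _bundle: Box<Bundle>,
}

impl SessionBuilder {
    // Takes the boxed bundle by value: the caller gives up ownership of it.
    fn bundle(&mut self, b: Box<Bundle>) -> &mut Self {
        self.bundle = Some(b);
        self
    }

    // Takes `self` by value: the builder cannot be used again afterwards.
    fn create(self) -> Result<Session, String> {
        let bundle = self.bundle.ok_or("no bundle configured")?;
        Ok(Session { _bundle: bundle })
    }
}

impl Session {
    // Borrows the session mutably, so running it does not consume it.
    fn run(&mut self) -> Result<(), String> {
        Ok(())
    }

    // Takes `self` by value: fetching the output ends the session.
    fn into_file_data(self) -> Vec<u8> {
        b"%PDF".to_vec() // placeholder bytes, not a real PDF
    }
}

// Every render needs a fresh bundle, builder and session.
fn render_once() -> Result<Vec<u8>, String> {
    let mut builder = SessionBuilder::default();
    builder.bundle(Box::new(Bundle));
    let mut session = builder.create()?;
    session.run()?;
    Ok(session.into_file_data())
}

fn main() {
    for _ in 0..3 {
        let pdf = render_once().expect("rendering");
        println!("{} bytes", pdf.len());
    }
}
```

With signatures shaped like this, nothing created inside render_once can be hoisted out of the loop, which is why the reproduction rebuilds everything on each iteration.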
