MusicSheetConvert

A Python toolkit for converting between staff notation, numbered notation (jianpu), MIDI, and MusicXML, with built-in optical music recognition (Audiveris) and high-fidelity rendering (MuseScore 4).

Features

End-to-end OCR pipeline — drop in a sheet image or PDF, get MIDI, jianpu text, and a re-rendered PNG in one command.
Bidirectional conversion between MusicXML, MIDI, and jianpu (numbered) text via music21 and pretty_midi.
Complex-score builders for grand staff, multiple key signatures (C / G / D / F / Bb / Eb / Am / Em / Dm), ii-V-I 7th-chord progressions, and the first four measures of Bach's Prelude in C major (BWV 846).
High-fidelity staff rendering via MuseScore 4 (PNG, PDF, SVG).
Pure-Python core — no daemon, no web server, just a CLI and a library.

Demo

$ python -m src.main --demo bach_prelude --render --out data/demo_bach

MusicXML : data\demo_bach\bach_prelude_m1_m4.musicxml
MIDI     : data\demo_bach\bach_prelude_m1_m4.mid
jianpu   : data\demo_bach\jianpu\bach_prelude_m1_m4.txt
PNG      : data\demo_bach\rendered\bach_prelude_m1_m4.png

The generated Bach prelude is recognized correctly by Audiveris 5.10.2 when fed back through the OCR pipeline:

$ python -m src.main data/ocr_e2e/preprocessed/bach_prelude-1.png --render

MusicXML : data\ocr_e2e\musicxml\bach_prelude-1.mxl
MIDI     : data\ocr_e2e\midi\bach_prelude-1.mid
jianpu   : 3+ 5+ 7+ 3+ 1+ 2+ 1+ 6+ 2+ 5+ 7 2+ 3+ 5+ 1++ 3+ 5---
PNG      : data\ocr_e2e\rendered\bach_prelude-1.png

Architecture

                +------------------+
                |  PDF / PNG image |
                +---------+--------+
                          |
                  preprocess (OpenCV)
                          |
                          v
                  +-------+--------+
                  |   binarized    |
                  |      PNG       |
                  +-------+--------+
                          |
                  Audiveris OMR (batch)
                          |
                          v
                  +-------+--------+
                  |  MusicXML/.mxl |
                  +-------+--------+
                          |
        +-----------------+------------------+------------------+
        |                 |                  |                  |
   music21           core (jianpu)      MuseScore 4           ...
        |                 |                  |
        v                 v                  v
   MIDI / .mid      jianpu / .txt     staff PNG / PDF

Module	Responsibility
`src/main.py`	CLI + pipeline orchestration (no conversion logic).
`src/music_convert/core.py`	Pure-Python MusicXML <-> MIDI <-> jianpu.
`src/music_convert/advanced.py`	Grand-staff, key-signature, chord, Bach-prelude builders.
`src/ocr/__init__.py`	Audiveris 5.x adapter with auto-discovery.
`src/preprocess/__init__.py`	Image / PDF preprocessing (grayscale, adaptive threshold, scaling).
`src/render/__init__.py`	MuseScore 4 rendering (PNG, PDF, SVG, MusicXML).

See docs/ARCHITECTURE.md for more detail.

Installation

1. Clone the repository

git clone https://github.com/Unk1ndledAC/MusicSheetConvert.git
cd MusicSheetConvert

2. Create a Python venv and install dependencies

# Before running either script, edit the paths inside it to match your Python installation and virtual environment location.
# Windows
scripts\install.bat

# Unix / Git Bash
./scripts/install.sh

3. Install external tools

Tool	Why	Where to get it
Audiveris 5.x	Optical music recognition	https://github.com/Audiveris/audiveris/releases
MuseScore 4	Staff-notation rendering (PNG / PDF / SVG)	https://musescore.org/download
Poppler	PDF -> image (used by `pdf2image`)	https://github.com/oschwartz10612/poppler-windows/releases

If Audiveris or MuseScore is installed in a non-default location, set the AUDIVERIS_EXE or MUSESCORE_EXE environment variable to the executable's path. Otherwise, the find_audiveris() and find_musescore() helpers fall back to a PATH lookup, then to a set of default install paths.
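
The lookup order described above (environment variable, then PATH, then default install paths) can be sketched as follows. This is a minimal illustration, not the project's actual code: `find_tool` is a hypothetical helper, and the executable names and default paths passed to it are assumptions you would need to supply yourself.

```python
import os
import shutil
from pathlib import Path
from typing import Iterable, Optional


def find_tool(env_var: str, exe_names: Iterable[str],
              default_paths: Iterable[str] = ()) -> Optional[Path]:
    """Resolve an external executable: env var, then PATH, then default paths."""
    # 1) An explicit override via the environment variable wins.
    override = os.environ.get(env_var)
    if override and Path(override).is_file():
        return Path(override)
    # 2) Otherwise look each candidate name up on PATH.
    for name in exe_names:
        found = shutil.which(name)
        if found:
            return Path(found)
    # 3) Finally, try the caller-supplied install locations.
    for candidate in default_paths:
        if Path(candidate).is_file():
            return Path(candidate)
    return None


# Hypothetical usage; the executable name here is an assumption:
# audiveris = find_tool("AUDIVERIS_EXE", ["Audiveris"])
```

Returning `None` instead of raising lets the caller decide whether a missing tool is fatal or merely a reason to skip a step.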

Quick Start

Run the full pipeline on a sheet image

python -m src.main path/to/score.png --out data/output --render --bpm 100

This produces, under data/output/:

data/output/
├── preprocessed/   # binarized copies of the inputs
├── musicxml/       # .mxl outputs from Audiveris
├── midi/           # .mid files
├── jianpu/         # jianpu text files
└── rendered/       # staff-notation PNGs (only when --render is passed)
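
For orientation, the layout above could be created with a few lines of pathlib. This is only a sketch of the directory structure; `prepare_output_dirs` is a hypothetical name, and the real pipeline may create these folders lazily or in a different order.

```python
import tempfile
from pathlib import Path
from typing import Dict

# Subdirectories that are always present, per the tree above.
SUBDIRS = ("preprocessed", "musicxml", "midi", "jianpu")


def prepare_output_dirs(root: Path, render: bool = False) -> Dict[str, Path]:
    """Create the output subdirectories and return them keyed by name."""
    # "rendered" is only added when rendering was requested.
    names = SUBDIRS + (("rendered",) if render else ())
    dirs = {name: root / name for name in names}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs


# Demonstrate in a throwaway directory rather than under data/output.
with tempfile.TemporaryDirectory() as tmp:
    created = prepare_output_dirs(Path(tmp) / "output", render=True)
    layout = sorted(created)
print(layout)
```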

Run the full pipeline on a PDF

python -m src.main path/to/scorebook.pdf --out data/output --render --bpm 90

Each page becomes a separate result with the suffix _p<NNN>.
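
As an illustration of the naming scheme, the sketch below derives per-page names from a PDF path. It assumes that NNN is the 1-based page number zero-padded to three digits, which the suffix pattern suggests but this README does not state; `page_output_stem` is a hypothetical helper.

```python
from pathlib import Path


def page_output_stem(pdf_path: str, page_number: int) -> str:
    """Return the per-page stem for one page of a PDF input."""
    # Assumption: 1-based page number, zero-padded to three digits.
    return f"{Path(pdf_path).stem}_p{page_number:03d}"


stems = [page_output_stem("path/to/scorebook.pdf", n) for n in (1, 2, 12)]
print(stems)
```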

Skip OCR when the input is already MusicXML

python -m src.main path/to/score.musicxml --out data/output --skip-ocr --render

Generate a complex score from a preset (no input file)

# Two-hand grand staff in any of the supported keys
python -m src.main --demo grand_staff --key Bb --render --out data/demo_bb

# ii-V-I 7th-chord progression
python -m src.main --demo chord_progression --key C --render --out data/demo_c

# All 9 supported key signatures in one piece
python -m src.main --demo key_demonstration --render --out data/demo_keys

# First 4 measures of Bach's Prelude in C major
python -m src.main --demo bach_prelude --render --out data/demo_bach

Use as a Python library

from pathlib import Path
from src.music_convert import core
from src.music_convert.advanced import build_grand_staff, NoteSpec
from src.ocr import image_to_musicxml
from src.render import musicxml_to_png

# 1) MusicXML -> MIDI
core.musicxml_to_midi("in.musicxml", "out.mid", core.ConvertOptions(bpm=100))

# 2) MusicXML -> jianpu text
print(core.musicxml_to_jianpu_text("in.musicxml"))

# 3) Build a complex score from scratch
score = build_grand_staff(
    right_hand_measures=[[NoteSpec(["C4"], 1.0), NoteSpec(["D4"], 1.0),
                         NoteSpec(["E4"], 1.0), NoteSpec(["F4"], 1.0)]],
    left_hand_measures=[[NoteSpec(["C3", "E3", "G3"], 4.0)]],
    key_name="C", title="Demo"
)

# 4) OMR (image -> MusicXML)
xml_path = image_to_musicxml("score.png", Path("data/output/musicxml"))

# 5) Render MusicXML to PNG via MuseScore
musicxml_to_png(xml_path, Path("data/output/rendered/score.png"))

CLI Reference

python -m src.main [INPUT] [OPTIONS]

Option	Description
`INPUT`	Path to a PDF / PNG / JPG / MusicXML file. Omit when using `--demo`.
`--out DIR`	Output root directory (default `./data/output`).
`--bpm N`	MIDI tempo in beats per minute (default 90).
`--render`	Additionally render staff-notation PNG via MuseScore.
`--skip-ocr`	Skip Audiveris OCR (use when the input is already MusicXML).
`--audiveris-bin PATH`	Explicit path to the Audiveris executable.
`--demo {grand_staff,chord_progression,key_demonstration,bach_prelude}`	Generate a preset complex score instead of reading a file.
`--key {C,G,D,F,Bb,Eb,Am,Em,Dm}`	Key signature for the generated score.
`--title TEXT`	Title for the generated score.
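
The option table maps naturally onto an argparse parser. The sketch below mirrors the table only; it is not the parser in src/main.py, and details the table does not specify (the integer type for --bpm, no default for --key or --title) are assumptions.

```python
import argparse

DEMOS = ["grand_staff", "chord_progression", "key_demonstration", "bach_prelude"]
KEYS = ["C", "G", "D", "F", "Bb", "Eb", "Am", "Em", "Dm"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the options listed in the table above."""
    parser = argparse.ArgumentParser(prog="python -m src.main")
    # INPUT is optional so that --demo can be used without a file.
    parser.add_argument("input", nargs="?", metavar="INPUT",
                        help="PDF / PNG / JPG / MusicXML file")
    parser.add_argument("--out", default="./data/output", metavar="DIR",
                        help="output root directory")
    parser.add_argument("--bpm", type=int, default=90, metavar="N",
                        help="MIDI tempo")
    parser.add_argument("--render", action="store_true",
                        help="also render a staff-notation PNG")
    parser.add_argument("--skip-ocr", action="store_true",
                        help="skip OCR for MusicXML input")
    parser.add_argument("--audiveris-bin", metavar="PATH",
                        help="explicit Audiveris executable")
    parser.add_argument("--demo", choices=DEMOS,
                        help="generate a preset score")
    parser.add_argument("--key", choices=KEYS,
                        help="key signature for the generated score")
    parser.add_argument("--title", help="title for the generated score")
    return parser


args = build_parser().parse_args(
    ["--demo", "grand_staff", "--key", "Bb", "--render", "--out", "data/demo_bb"]
)
print(args.demo, args.key, args.render, args.bpm)
```

Using `choices` for --demo and --key makes argparse reject unsupported presets and keys before any pipeline work starts.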

Jianpu Text Grammar

The numbered-notation (jianpu) text format used by this project is intentionally minimal:

Token	Meaning
`1` `2` `3` `4` `5` `6` `7`	do, re, mi, fa, sol, la, ti (middle C = 1)
`1+` `1++`	One / two octaves up
`1-`	One octave down
`#1` `b3`	Sharp / flat
`/`	Chord separator (e.g. `1/3/5`)
`|`	Bar separator (kept for readability, otherwise ignored)
whitespace	Token separator

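Each token defaults to one quarter-note beat. To extend the grammar (for example, with explicit duration tokens such as 1- / 1. / 1_), see _emit in src/music_convert/core.py. Note that 1- already denotes one octave down in the table above, so a duration extension would have to resolve that clash.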
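
To make the grammar concrete, here is a small standalone parser that turns jianpu text into MIDI pitch numbers. It is a sketch written from the token table, not the code in core.py. It assumes a C major scale with `1` at MIDI note 60, that repeated `-` marks stack the same way `+` marks do, and that bar separators are whitespace-delimited; `parse_note` and `parse_jianpu` are hypothetical names.

```python
import re
from typing import List

# Semitone offsets of scale degrees 1..7 above the tonic (major scale).
DEGREE_TO_SEMITONE = {"1": 0, "2": 2, "3": 4, "4": 5, "5": 7, "6": 9, "7": 11}
MIDDLE_C = 60  # MIDI note number for middle C, which the table maps to "1".

# Optional accidental prefix, one degree digit, optional run of + or - marks.
NOTE_RE = re.compile(r"^([#b]?)([1-7])(\++|-+)?$")


def parse_note(token: str) -> int:
    """Convert one note token such as '#1', 'b3', '5+' or '1-' to a MIDI pitch."""
    match = NOTE_RE.match(token)
    if match is None:
        raise ValueError(f"not a jianpu note token: {token!r}")
    accidental, degree, octave_marks = match.groups()
    pitch = MIDDLE_C + DEGREE_TO_SEMITONE[degree]
    pitch += {"#": 1, "b": -1, "": 0}[accidental]
    if octave_marks:
        direction = 1 if octave_marks[0] == "+" else -1
        pitch += 12 * direction * len(octave_marks)
    return pitch


def parse_jianpu(text: str) -> List[List[int]]:
    """Parse jianpu text into events; each event is a list of MIDI pitches."""
    events = []
    for token in text.split():
        if token == "|":  # bar separators carry no musical meaning here
            continue
        # A chord is several note tokens joined by "/".
        events.append([parse_note(part) for part in token.split("/")])
    return events


print(parse_jianpu("1 #1 b3 | 1/3/5 1+ 1++ 1-"))
# prints [[60], [61], [63], [60, 64, 67], [72], [84], [48]]
```

Every event would last one quarter-note beat under the default described above; durations are deliberately left out of this sketch.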

Testing

The repository ships with four test suites. They auto-skip when an external dependency is missing, so they can be run anywhere.

# 1) MusicXML <-> MIDI <-> jianpu roundtrip
python tests/smoke.py

# 2) End-to-end CLI pipeline (skip OCR)
python tests/e2e_cli.py

# 3) Complex scores: grand staff, 9 key signatures, chords, Bach prelude
python tests/complex_e2e.py

# 4) Real Audiveris + MuseScore pipeline (skipped if either is missing)
python tests/ocr_e2e.py

On a fully configured machine, the four suites print:

ALL TESTS PASSED          # smoke
E2E OK                    # e2e_cli
ALL E2E PASSED            # complex_e2e
ALL OCR E2E PASSED        # ocr_e2e

Known Limitations

Jianpu durations are emitted as fixed quarter-note beats. Real durations, ties, and triplets are decoded from MusicXML but not yet re-encoded into the compact jianpu text.
Audiveris accuracy on synthesized or clean scores is good; on handwritten or noisy scans it can drop considerably, especially for dense broken-chord patterns (the OCR step may collapse two adjacent eighth notes into a single quarter note).
The MuseScore 4 CLI writes PNG output as <basename>-1.png rather than <basename>.png; the wrapper renames the file back to the user-requested path.
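
The filename normalization mentioned in the last item could look roughly like this. It is a sketch of the idea only, covering the single-page case; `normalize_png_output` is a hypothetical name and the render wrapper may handle this differently (for example, for multi-page scores).

```python
import tempfile
from pathlib import Path


def normalize_png_output(requested: Path) -> Path:
    """Move '<basename>-1.png' (as written by the renderer) to the requested path."""
    produced = requested.with_name(f"{requested.stem}-1{requested.suffix}")
    if produced.is_file():
        # Path.replace overwrites an existing target on all platforms.
        produced.replace(requested)
    if not requested.is_file():
        raise FileNotFoundError(f"no rendered PNG found for {requested}")
    return requested


# Demonstration with a stand-in file instead of a real MuseScore run.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "score.png"
    (Path(tmp) / "score-1.png").write_bytes(b"stand-in image data")
    result = normalize_png_output(target)
    names = sorted(p.name for p in Path(tmp).iterdir())
print(names)
```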

Contributing

We welcome issues and pull requests. See CONTRIBUTING.md for development setup, code style, and the contribution workflow.

License

This project is released under the MIT License. See LICENSE for the full text.
