DEMUN: DEtector of MUsic Notation
Please use the following text to cite this item:
Dvořák, Vojtěch; et al., 2025,
DEMUN: DEtector of MUsic Notation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-6051.
Authors
Dvořák, Vojtěch; et al.
Date issued
2025-12-10
Description
DEMUN stands for DEtector of MUsic Notation, which describes exactly what it does: it detects whether a given input page contains music notation. Crucially, DEMUN can do this at very large scale: digital libraries with tens of millions of digitized pages (or more!) and very diverse documents. The users of DEMUN are digital libraries and their technical staff who are tasked with providing and maintaining metadata for their collections. There is no GUI, just components intended to run on servers. DEMUN is not intended for library end-users. “Using DEMUN” means setting it up and running it as part of a library collection processing pipeline.
For DEMUN to be usable in practice, it must meet two requirements: accuracy on extremely imbalanced data, and speed. To achieve both, DEMUN is decomposed into two stages: a pre-filter (Stage 1), and a main stage with a “clean” filter that also includes a notation type classifier (Stage 2). The pre-filter (Stage 1) is intended to be run on premises: it has low hardware requirements and can be massively parallelized without having to take network speed and bandwidth into account. The “clean” filter (Stage 2) has significantly higher accuracy, but requires a GPU to run quickly. Such computing infrastructure is typically not easily available to libraries (though if it is, the second stage can of course be run locally as well).
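The two-stage decomposition can be illustrated with a minimal sketch of a filter cascade. This is not DEMUN's actual code: the `cascade` function, the `Detection` record, the scorer interface and both threshold values are hypothetical names and numbers chosen for illustration only. The sketch shows just the control flow: every page passes through the cheap pre-filter, and only the survivors reach the expensive second stage.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical interface: a scorer maps a page identifier to a score in
# [0, 1] that the page contains music notation. These stand in for the real
# Stage 1 and Stage 2 models, whose interfaces are not described here.
Scorer = Callable[[str], float]


@dataclass
class Detection:
    page: str
    stage1_score: float
    stage2_score: float


def cascade(
    pages: Iterable[str],
    stage1: Scorer,
    stage2: Scorer,
    stage1_threshold: float = 0.1,  # assumed value: permissive, to keep recall high
    stage2_threshold: float = 0.5,  # assumed value: stricter, to cut false positives
) -> List[Detection]:
    """Run the cheap pre-filter on every page and the expensive filter
    only on pages that the pre-filter lets through."""
    detections: List[Detection] = []
    for page in pages:
        s1 = stage1(page)
        if s1 < stage1_threshold:
            continue  # rejected by the pre-filter; Stage 2 is never called
        s2 = stage2(page)
        if s2 >= stage2_threshold:
            detections.append(Detection(page, s1, s2))
    return detections
```

With a collection in which the vast majority of pages contain no music, most pages would be rejected in the first branch, so the GPU-bound second stage would only see a small fraction of the input.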
DEMUN achieves a total false positive rate (FPR) below 0.02%. In experiments with 4,000,000 images from the collections of the Moravian Library, it outputs roughly 3 sheet music images for every 1 false positive, while the input collection contains some 2,500 non-music pages for every page that contains sheet music. In terms of the ratio of music to non-music pages (from about 1:2,500 in the input to about 3:1 in the output), DEMUN thus achieves a nearly 7,500x increase in sheet music concentration, and is capable of discovering musical material at a useful rate. The combined false negative rate (FNR) is estimated at 5%, which means DEMUN misses only approx. one page of music in twenty. Its two-stage architecture cuts processing time down to approx. a day per 10,000,000 images.
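The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "increase in sheet music concentration" is meant as a ratio of music-to-non-music odds (output versus input), and it treats the rounded numbers from the description as exact; both are assumptions of this example, not statements from the DEMUN documentation.

```python
# Figures quoted in the description (rounded there, treated as exact here).
non_music_per_music = 2_500   # non-music pages per music page in the input
true_per_false = 3            # music pages in the output per false positive
fnr = 0.05                    # estimated combined false negative rate
images_per_day = 10_000_000   # approximate processing rate

# Odds of music vs. non-music: 1:2,500 before filtering, 3:1 after.
# Their ratio is the concentration increase under the odds interpretation.
enrichment = true_per_false * non_music_per_music

# Share of output pages that really contain music.
precision = true_per_false / (true_per_false + 1)

# Output odds = (TPR / FPR) * input odds, so the FPR implied by the
# quoted figures is TPR / enrichment.
tpr = 1 - fnr
implied_fpr = tpr / enrichment

# Average throughput implied by "a day per 10,000,000 images".
images_per_second = images_per_day / (24 * 60 * 60)

print(f"enrichment:  {enrichment}x")
print(f"precision:   {precision:.0%}")
print(f"implied FPR: {implied_fpr:.4%}")
print(f"throughput:  {images_per_second:.1f} images/s")
```

Under these assumptions the implied FPR comes out at roughly 0.013%, which is consistent with the stated bound of 0.02%, and the throughput corresponds to a little over a hundred images per second on average.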
This capability to accurately, quickly (and relatively cheaply!) discover music in large-scale library collections independently of pre-existing metadata is currently unique.
Acknowledgement
Ministry of Culture
Project code: DH23P03OVV008
Project name: OmniOMR – rozpoznávání hudebního záznamu pomocí strojového učení pro digitální knihovny (OmniOMR: music notation recognition using machine learning for digital libraries)
Files in this item
- Name: demun-main_1.0.0.zip
- Size: 11.08 MB
- Format: application/zip
- Description: Zip
- MD5: 0730ff5f6194682ec6223b91f7604753

- demun-1.0.0
- .gitignore (4 kB)
- README.md (3 kB)
- Makefile (1 kB)
- requirements.txt (154 B)
- src
- setup_dataset_bic.py (1 kB)
- dataset_generator.py (4 kB)
- setup_all_background_datasets.py (240 B)
- setup_dataset_bicm.py (958 B)
- run_train.py (1 kB)
- setup_dataset_mcm.py (1 kB)
- dataset_loader.py (2 kB)
- background_dataset_generator.py (3 kB)
- setup_dataset_sipa.py (1 kB)
- utils.py (3 kB)
- square_pad.py (1008 B)
- logger.py (341 B)
- constants.py (1 kB)
- run_val.py (3 kB)
- setup_all_datasets.py (671 B)
- slurm_script_generator.py (2 kB)
- run_predict.py (4 kB)
- docs
- images
- notation-type-examples.png (1 MB)
- padding.jpg (364 kB)
- README.md (115 B)
- results
- bic_confusion_matrix_normalized.png (91 kB)
- README.md (1 kB)
- backmcm_confusion_matrix.png (114 kB)
- backbic_confusion_matrix_normalized.png (88 kB)
- bic_confusion_matrix.png (86 kB)
- sipa_confusion_matrix.png (110 kB)
- mcm_confusion_matrix.png (104 kB)
- backsipa_confusion_matrix.png (112 kB)
- mcm_confusion_matrix_normalized.png (118 kB)
- backsipa_confusion_matrix_normalized.png (110 kB)
- sipa_confusion_matrix_normalized.png (117 kB)
- backbic_confusion_matrix.png (92 kB)
- backmcm_confusion_matrix_normalized.png (110 kB)
- technical-doc.md (5 kB)
- experiment-design.md (1 kB)
- user-doc.md (4 kB)
- images
- .env (448 B)
- models
- demun_sipa_2025-11-06_13-04-09.pt (3 MB)
- demun_bic_2025-11-06_13-03-41.pt (3 MB)
- demun_mcm_2025-11-06_13-00-16.pt (3 MB)
- Name: demun-prefilter-1.0.0.zip
- Size: 77.11 MB
- Format: application/zip
- Description: Zip
- MD5: 9c3d80d693e849e622701067e56aed08

- demun-prefilter-1.0.0
- config.json (48 B)
- model_b4.pth (67 MB)
- .gitignore (7 kB)
- README.md (5 kB)
- app.py (11 kB)
- requirements.txt (130 B)
- model_b0.pth (15 MB)
- Name: demun-technical-documentation.pdf
- Size: 5.95 MB
- Format: application/pdf
- Description: Adobe PDF
- MD5: 211d13cd2ed77387f799d438a34e83df
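
The MD5 checksums listed above can be used to check that the downloaded files are intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of` and `verify` and the idea of keeping all three downloads in a single directory are assumptions of this example, not part of the DEMUN distribution.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed for this item.
EXPECTED_MD5 = {
    "demun-main_1.0.0.zip": "0730ff5f6194682ec6223b91f7604753",
    "demun-prefilter-1.0.0.zip": "9c3d80d693e849e622701067e56aed08",
    "demun-technical-documentation.pdf": "211d13cd2ed77387f799d438a34e83df",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks so that
    large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(download_dir):
    """Map each expected file name to True (checksum matches), False
    (checksum differs) or None (file not found in download_dir)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(download_dir) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, ok in verify(".").items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

MD5 is adequate here for detecting corrupted or truncated downloads, but it is not a safeguard against deliberate tampering.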


