Poltronieri, Andrea (2025)
Harmonising music information retrieval with semantics: from data integration to multimodality, doctoral dissertation, Alma Mater Studiorum Università di Bologna.
PhD programme in Computer Science and Engineering, Cycle 37. DOI 10.48676/unibo/amsdottorato/11758.
Abstract
In the era of big data and machine learning, the fragmentation of musical datasets and the lack of standardised representations hinder advancements in Music Information Retrieval (MIR). The multifaceted nature of music complicates both the representation of content, with small, task-specific datasets scattered across various formats, and of context (metadata), which lacks consistent terminology. These challenges increase the effort required for data collection and pre-processing, reduce reproducibility, and limit the scalability of MIR models. To address these issues, this thesis proposes a unified semantic model to foster interoperability and advance MIR tasks. A specific instance of this fragmentation can be found in harmonic annotations, where harmony is inconsistently represented across datasets, formats, and notational systems. Taking harmony as a use case, this thesis develops a standardised workflow to harmonise disconnected datasets, enabling the creation of large-scale unified corpora. Building on these harmonised datasets, a key contribution is the exploration of harmonic similarity to reveal connections across diverse tracks, periods, and genres through novel state-of-the-art similarity functions. While integrating symbolic data offers significant advantages, certain limitations persist. Primary challenges include the limited diversity of annotated data, often biased toward a narrow range of musical genres, and the inherent ambiguity and subjectivity of harmonic annotations. Such challenges have led MIR tasks like Audio Chord Estimation (ACE) to hit a "glass ceiling," where neither increasing computational power nor the volume of data has led to improved results. To address these issues, this thesis explores a multimodal approach combining audio and chord annotations. We propose a method for enriching datasets with aligned audio annotations and introduce a new ACE model that embeds music theory concepts such as consonance and dissonance. This model aims to mitigate chord vocabulary imbalance and annotation subjectivity, advancing the state of the art in audio-based harmonic analysis.
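To make the idea of harmonising heterogeneous chord annotations concrete, the sketch below reduces Harte-style chord labels (a common notational system in ACE datasets) to a shared maj/min vocabulary. This is a minimal, hypothetical illustration of the kind of vocabulary-mapping step such a workflow involves, not the thesis's actual pipeline; the mapping table and function names are the author's assumptions.

```python
import re

# Hypothetical mapping from a few Harte chord qualities to a reduced
# maj/min vocabulary; real harmonisation workflows cover far more qualities.
MAJMIN = {"maj": "maj", "maj7": "maj", "7": "maj", "6": "maj",
          "min": "min", "min7": "min", "min6": "min"}

def to_majmin(label: str) -> str:
    """Reduce a Harte-style label (e.g. 'C:maj7/3') to root:maj, root:min, or N."""
    if label in ("N", "X"):  # no-chord / unknown symbols stay out of the vocabulary
        return "N"
    # Root is A-G with optional sharps/flats; quality follows ':' up to any bass slash.
    m = re.match(r"^([A-G][#b]*)(?::([^/]+))?", label)
    if not m:
        return "N"
    root, quality = m.group(1), m.group(2) or "maj"  # bare root implies major
    reduced = MAJMIN.get(quality)
    return f"{root}:{reduced}" if reduced else "N"  # unmapped qualities -> no-chord
```

Collapsing every dataset's labels onto one such shared vocabulary is what allows otherwise disconnected corpora to be merged and compared.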
Document type
Doctoral thesis
Author
Poltronieri, Andrea
Cycle
37
Keywords
music information retrieval; harmony; computational musicology; chord estimation; signal processing; symbolic music processing; music similarity; ontology; semantic web; knowledge graph
DOI
10.48676/unibo/amsdottorato/11758
Date of defence
9 April 2025