Giuliani, Luca
(2025)
Detection and enforcement of non-linear correlations for fair and robust machine learning applications, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in
Computer Science and Engineering, 37th cycle. DOI 10.48676/unibo/amsdottorato/12027.
Full-text documents available:
Abstract
Detecting correlations is crucial in several Machine Learning tasks, such as identifying patterns or enforcing relational constraints. In the realm of algorithmic fairness, correlations are particularly significant, as fairness indicators typically quantify the degree of dependence between a sensitive input attribute and a target variable. However, traditional measures have focused solely on categorical protected attributes due to technical limitations, thus neglecting continuous sensitive information such as age, income, degree of disability, or other aggregated numerical variables. To overcome these limitations, recent research has suggested using the Hirschfeld–Gebelein–Rényi (HGR) correlation coefficient as a measure of fairness. HGR extends Pearson's coefficient to detect non-linear correlations by employing two mapping functions called copula transformations; in this dissertation, we present a novel computational approach for estimating it by means of user-defined kernel functions parameterized through a vector of mixing coefficients. Our approach is deterministic, more robust, and more interpretable than existing methods, and features other advantageous properties that make it more trustworthy for practical applications. We demonstrate its benefits over other computational techniques on both synthetic data and real-world benchmarks; then, following a minor variation of the HGR semantics, we introduce the Generalized Disparate Impact (GeDI) indicator, which broadens the legal notion of disparate impact to continuous input variables. Empirical findings confirm that this indicator can effectively reduce unfairness across three benchmark datasets, as well as in a practical use case involving long-term fairness in ranking systems; moreover, we show that the two measures fit into a unified framework and are equivalent up to a data-dependent scaling factor.
To conclude, we discuss ongoing and future works regarding both methodological extensions of our Kernel-Based HGR method and potential applications in intersectional fairness and causal discovery.
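The kernel-based estimation strategy described in the abstract can be illustrated with a minimal sketch: each variable is mapped through a user-defined kernel (here a polynomial kernel, purely as an assumption), and the two vectors of mixing coefficients that maximize the Pearson correlation of the mapped values are found in closed form via canonical correlation analysis. The function name, kernel choice, and solver below are illustrative and not the dissertation's actual implementation.

```python
import numpy as np

def kernel_hgr(x, y, degree=3):
    """Hedged sketch of a kernel-based HGR estimate.

    Maps x and y through polynomial kernels of degrees 1..degree and
    returns the maximal Pearson correlation achievable by linearly
    mixing the kernel features, i.e. the top canonical correlation.
    """
    # Build polynomial feature matrices and center each column.
    F = np.column_stack([x ** k for k in range(1, degree + 1)])
    G = np.column_stack([y ** k for k in range(1, degree + 1)])
    F = F - F.mean(axis=0)
    G = G - G.mean(axis=0)
    # Orthonormalize each feature block via QR; the top singular value
    # of Qf.T @ Qg is then the largest correlation over all mixings.
    Qf, _ = np.linalg.qr(F)
    Qg, _ = np.linalg.qr(G)
    s = np.linalg.svd(Qf.T @ Qg, compute_uv=False)
    return float(s[0])
```

On a quadratic relation such as y = x^2 with x symmetric around zero, Pearson's coefficient is near zero while this kernel-based estimate is near one, which is the kind of non-linear dependence the dissertation targets.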
Document type
Doctoral thesis
Author
Giuliani, Luca
Supervisor
PhD programme
Cycle
37
Coordinator
Disciplinary sector
Competition sector
Keywords
Non-Linear Correlations,
Polynomial Regression,
Algorithmic Fairness,
Machine Learning,
Fair Machine Learning,
Robust Machine Learning,
Constrained Machine Learning
DOI
10.48676/unibo/amsdottorato/12027
Defense date
9 April 2025
URI