CINECA IRIS Institutional Research Information System

Fair Associative Co-clustering

Peiretti, Federico (first author; member of the Collaboration Group); Pensa, Ruggero G. (last author; member of the Collaboration Group)

2026-01-01

Abstract

Co-clustering is a powerful data mining tool that extracts summary information from a data matrix by simultaneously computing row and column clusters that provide a compact representation of the data. However, if the matrix contains data about individuals, the co-clustering results may be influenced by the societal biases reproduced in the data. Consequently, downstream tasks such as recommendation may inherit these biases, compromising the fairness and integrity of the overall knowledge discovery or machine learning process. Despite extensive research on fairness in clustering, this issue has not been addressed for co-clustering algorithms. To fill this gap, this paper proposes a novel fair co-clustering algorithm. The algorithm is based on an associative measure derived from Goodman-Kruskal's tau, which has demonstrated good convergence properties. The algorithm ensures optimal clustering and fairness performance by implementing an in-process rebalancing mechanism inspired by the fair assignment problem. An extensive experimental validation demonstrates the efficacy of the approach, including a comparison with a state-of-the-art method that uses co-clustering for fair recommendation.
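To make the associative measure mentioned in the abstract more concrete, the following is a minimal sketch of how Goodman-Kruskal's tau can be computed on the contingency table induced by a co-clustering of a non-negative matrix. It is not the authors' algorithm: it covers only the measure, not the optimization or the fairness rebalancing, and the function names `cocluster_table` and `goodman_kruskal_tau` as well as the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def cocluster_table(data, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Sum the entries of a non-negative matrix inside each
    (row cluster, column cluster) block. Hypothetical helper."""
    data = np.asarray(data, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    table = np.zeros((n_row_clusters, n_col_clusters))
    # Unbuffered accumulation: entry (i, j) is added to block
    # (row_labels[i], col_labels[j]).
    np.add.at(table, (row_labels[:, None], col_labels[None, :]), data)
    return table


def goodman_kruskal_tau(table):
    """Goodman-Kruskal tau for predicting the column category from the
    row category: (sum_ij p_ij^2 / p_i. - sum_j p_.j^2) / (1 - sum_j p_.j^2)."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()
    row_marg = p.sum(axis=1)
    col_marg = p.sum(axis=0)
    baseline = np.sum(col_marg ** 2)
    if np.isclose(baseline, 1.0):
        # Only one column category carries mass: nothing left to predict.
        return 0.0
    keep = row_marg > 0  # skip empty row clusters to avoid division by zero
    explained = np.sum(p[keep] ** 2 / row_marg[keep, None])
    return (explained - baseline) / (1.0 - baseline)


# Toy matrix with two clean diagonal blocks.
toy = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]])
good = cocluster_table(toy, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
bad = cocluster_table(toy, [0, 1, 0, 1], [0, 0, 1, 1], 2, 2)
tau_good = goodman_kruskal_tau(good)
tau_bad = goodman_kruskal_tau(bad)
```

In this toy case the partition that matches the block structure yields a tau of 1, while a row partition that mixes the two blocks yields a tau of 0, which is why a co-clustering algorithm would prefer the former. The measure is asymmetric, so the tau for predicting row clusters from column clusters can be obtained by passing the transposed table.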

Year: 2026
Event title: The 2025 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2025)
Event location: Porto, Portugal
Event date: September 15-19, 2025
Volume title: Machine Learning and Knowledge Discovery in Databases. Research Track.
Publisher: Springer
Volume no.: 16013
Pages (from): 282
Pages (to): 300
ISBN: 9783032059611; 9783032059628
DOI: https://dx.doi.org/10.1007/978-3-032-05962-8_17
Product URL (open-access archives, full text on publisher site, etc.): https://link.springer.com/chapter/10.1007/978-3-032-05962-8_17
Keywords: Clustering, Fairness, High-dimensional data
All authors: Peiretti, Federico; Pensa, Ruggero G.
Appears in types: 04A-Conference paper in volume

Files in this item:

File	Access	Description	File type	Size	Format
ecmlpkdd2025_preprint.pdf	Open access	Preprint	Preprint (first draft)	1.41 MB	Adobe PDF
ecmlpkdd2025_printed.pdf	Restricted access	Publisher's PDF	Publisher's PDF	1.87 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2096930
