An Associative Approach to Fair Co-clustering

Peiretti, Federico (first author); Pensa, Ruggero G. (last author)
2026-01-01

Abstract

Co-clustering is a powerful data mining tool that extracts summary information from a data matrix by simultaneously computing row and column clusters that provide a compact representation of the data. However, if the matrix contains data about individuals, the co-clustering results may be influenced by societal biases reproduced in the data. Despite extensive research on fairness considerations in clustering, this issue has not been addressed in the context of co-clustering algorithms. This paper proposes a novel fair co-clustering algorithm based on an associative measure derived from Goodman-Kruskal's tau, which has demonstrated good convergence properties. The algorithm ensures optimal clustering and fairness performance by implementing an in-process rebalancing mechanism inspired by the fair assignment problem. An extensive experimental validation is provided to demonstrate the efficacy of our approach.
Year: 2026
Conference: 33rd Symposium on Advanced Database Systems (SEBD 2025)
Location: Ischia, Italy
Dates: June 16-19, 2025
Proceedings: SEBD 2025 - Symposium on Advanced Database Systems 2025
Publisher: CEUR-WS.org
Volume: 4182
Pages: 446-454
URL: https://ceur-ws.org/Vol-4182/paper68.pdf
Keywords: Clustering, Fairness, High-dimensional data
Authors: Peiretti, Federico; Pensa, Ruggero G.
Files in this record:
File: sebd2025_online.pdf
Access: Open access
Description: online PDF
File type: Publisher's PDF
Size: 1.19 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2130350