CINECA IRIS Institutional Research Information System

Co-clustering: a Survey of the Main Methods, Recent Trends and Open Problems

Elena Battaglia (co-first author; member of the Collaboration Group); Federico Peiretti (co-first author; member of the Collaboration Group); Ruggero Gaetano Pensa (last author; member of the Collaboration Group)

2024-01-01

Abstract

Since its early formulations, co-clustering has gained popularity and interest both within and outside the machine learning community as a powerful learning paradigm for clustering high-dimensional data with good explainability properties. The simultaneous partitioning of all the modes of the input data tensors (rows and columns in a data matrix) is both a method for improving clustering on one mode while performing dimensionality reduction on the other mode(s), and a tool for providing an actionable interpretation of the clusters in the main mode as summaries of the features in each of the other modes. Hence, it is useful in many complex decision systems and data science applications. In this paper, we survey the co-clustering literature by reviewing the main co-clustering methods, with a special focus on the work done in the last twenty-five years. We identify, describe and compare the main algorithmic categories, and provide a practical characterization with respect to similar unsupervised techniques. We also try to explain why it is still a powerful tool despite the apparent recent decrease in interest shown by the machine learning community. To this end, we review the most recent trends in co-clustering research and outline the open problems and promising future research perspectives.

Year: 2024
Journal title: ACM COMPUTING SURVEYS
Volume: 57
Issue: 2
Pages (from): 1
Pages (to): 33
DOI: https://dx.doi.org/10.1145/3698875
Product URL (open-access archives, full text on publisher's site, etc.): https://dl.acm.org/doi/10.1145/3698875
Keywords: cluster analysis, surveys and overviews, clustering
All authors: Elena Battaglia; Federico Peiretti; Ruggero Gaetano Pensa
Appears in types: 03B - Review in journal / Literature survey in journal / Critical note

Files in this product:

File	Size	Format
csur2024_printed.pdf (open access; description: PDF open access; file type: publisher's PDF)	869.7 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2019731
