Parameter-Less Tensor Co-clustering

Battaglia, Elena (first author); Pensa, Ruggero G. (last author)
2019-01-01

Abstract

Tensor co-clustering has proven useful in many applications because it copes well with high-dimensional and sparse data. However, setting up a co-clustering algorithm properly requires specifying the desired number of clusters for each mode as an input parameter. This choice is already difficult in relatively simple settings, such as flat clustering on data matrices, and it is even harder on tensors. To address this issue, we propose a tensor co-clustering algorithm that does not require the number of desired co-clusters as input: it optimizes an objective function based on a measure of association between discrete random variables (Goodman and Kruskal's τ) that is not affected by their cardinality. We show the effectiveness of our algorithm on both synthetic and real-world datasets, and compare it with state-of-the-art co-clustering methods based on tensor factorization.
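
To make the association measure named in the abstract concrete, the following is a minimal sketch of Goodman and Kruskal's τ computed from a two-way contingency table of counts. It uses the standard textbook definition (proportional reduction in prediction error) and is not the authors' implementation; the abstract does not say how τ is combined across tensor modes into the objective function, so that part is not shown. The function name `goodman_kruskal_tau` and the convention that rows are the predictor X and columns the predicted variable Y are assumptions made for illustration.

```python
def goodman_kruskal_tau(table):
    """Goodman and Kruskal's tau for predicting the column variable Y
    from the row variable X, given a contingency table of counts.

    Illustrative sketch only: the name and the rows-predict-columns
    convention are assumptions, not taken from the paper.
    """
    total = sum(sum(row) for row in table)
    if total <= 0:
        raise ValueError("contingency table must contain positive counts")

    n_cols = len(table[0])
    row_sums = [sum(row) for row in table]
    col_sums = [sum(row[j] for row in table) for j in range(n_cols)]

    # Expected error when guessing Y from its marginal distribution alone.
    error_marginal = 1.0 - sum((c / total) ** 2 for c in col_sums)
    if error_marginal == 0.0:
        # Y takes a single value, so there is no error left to reduce.
        return 0.0

    # Expected error when guessing Y from its distribution conditional on X.
    concentration = 0.0
    for row, r in zip(table, row_sums):
        if r == 0:
            continue  # empty rows carry no probability mass
        for cell in row:
            concentration += (cell / total) ** 2 / (r / total)
    error_conditional = 1.0 - concentration

    # Proportional reduction in error, a value between 0 and 1.
    return (error_marginal - error_conditional) / error_marginal


if __name__ == "__main__":
    print(goodman_kruskal_tau([[5, 0], [0, 5]]))  # X determines Y
    print(goodman_kruskal_tau([[2, 2], [2, 2]]))  # X and Y independent
```

The measure is asymmetric: transposing the table gives the τ for predicting X from Y, which generally differs. Because τ is a ratio of prediction errors, it stays in the same range whatever the number of rows and columns, which is in line with the abstract's remark that it is not affected by the cardinality of the variables.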
Year: 2019
Conference: DS2019: 22nd International Conference on Discovery Science
Location: Split, Croatia
Dates: October 28-30, 2019
Book title: Discovery Science. DS 2019.
Publisher: Springer
Volume: 11828
Pages: 205-219
ISBN: 978-3-030-33777-3; 978-3-030-33778-0
URL: https://link.springer.com/chapter/10.1007/978-3-030-33778-0_17
Keywords: Clustering, Higher-order data, Unsupervised learning
Authors: Battaglia, Elena; Pensa, Ruggero G.
Files in this item:

ds2019_battaglia_draft.pdf
Open access
Description: paper (postprint)
File type: POSTPRINT (AUTHOR'S FINAL VERSION)
Size: 550.04 kB
Format: Adobe PDF

ds2019_battaglia_printed.pdf
Restricted access
Description: PDF online
File type: PUBLISHER'S PDF
Size: 622.68 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1714020
Citations
  • PubMed Central: not available
  • Scopus: 2
  • Web of Science (ISI): 1