CINECA IRIS Institutional Research Information System

Matrix Factorization with Interval-Valued Data

Mao-Lin Li; Francesco Di Mauro; K. Selcuk Candan; Maria Luisa Sapino

2021-01-01

Abstract

With many applications relying on multi-dimensional datasets for decision making, matrix factorization (or decomposition) is becoming the basis for many knowledge discovery and machine learning tasks, from clustering, trend detection, and anomaly detection to correlation analysis. Unfortunately, a major shortcoming of matrix analysis operations is that, although they are effective on scalar data, they are not designed for non-scalar observations such as intervals and become difficult to apply when such data are present. Yet, in many applications, the available data are inherently non-scalar for various reasons, including imprecision in data collection, conflicts in aggregated data, data summarization, or privacy concerns, where one is provided with a reduced, clustered, or intentionally noisy and obfuscated version of the data to hide information. In this paper, we propose matrix decomposition techniques that account for interval-valued data. We show that naive ways of dealing with such imperfect data may introduce errors into the analysis, and we present factorization techniques that are especially effective when the amount of imprecise information is large.
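As a rough illustration of what interval-valued input can look like, the sketch below stores a matrix as a pair of lower-bound and upper-bound arrays and applies one plausible naive treatment: collapsing every interval to its midpoint and running an ordinary truncated SVD. This is not the method proposed in the paper; the function names (`naive_midpoint_svd`, `fraction_outside`) and the sample data are hypothetical, chosen only to show that the midpoint baseline discards interval widths and that a low-rank reconstruction can land outside the original intervals.

```python
import numpy as np


def naive_midpoint_svd(lower, upper, rank):
    """Naive baseline (an assumption, not the paper's technique):
    replace each interval by its midpoint, then take a truncated SVD."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    if lower.shape != upper.shape or np.any(lower > upper):
        raise ValueError("lower and upper must share a shape, with lower <= upper")
    midpoints = (lower + upper) / 2.0  # interval widths are lost at this step
    u, s, vt = np.linalg.svd(midpoints, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


def fraction_outside(lower, upper, approx, tol=1e-9):
    """Share of reconstructed entries that fall outside their original interval."""
    outside = (approx < lower - tol) | (approx > upper + tol)
    return float(outside.mean())


# Hypothetical 3x3 data; scalar observations have lower == upper.
lower = np.array([[1.0, 2.0, 3.0],
                  [2.0, 3.5, 6.0],
                  [0.0, 1.0, 1.0]])
upper = np.array([[1.0, 4.0, 3.0],
                  [2.0, 4.5, 6.0],
                  [2.0, 1.0, 3.0]])

# Full-rank factorization reproduces the midpoints exactly (up to rounding).
u, s, vt = naive_midpoint_svd(lower, upper, rank=3)
full_reconstruction = (u * s) @ vt

# Rank-1 factorization: compare its reconstruction against the intervals.
u1, s1, vt1 = naive_midpoint_svd(lower, upper, rank=1)
rank1_reconstruction = (u1 * s1) @ vt1
rank1_outside = fraction_outside(lower, upper, rank1_reconstruction)
```

Comparing `rank1_outside` across ranks gives a quick sense of how much a midpoint-only factorization disagrees with the interval bounds it was derived from.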

Year: 2021
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 33
Issue: 4
Pages (from): 1644
Pages (to): 1658
DOI: https://dx.doi.org/10.1109/TKDE.2019.2942310
Product URL (open access archives, full text on publisher site, etc.): https://doi.org/10.1109/TKDE.2019.2942310
Keywords: Matrix decomposition, Semantics, Probabilistic logic, Principal component analysis, Singular value decomposition
All authors: Mao-Lin Li, Francesco Di Mauro, K. Selcuk Candan, Maria Luisa Sapino
Appears in categories: 03A-Journal Article

Files in this record:

File	Access	Description	File type	Size	Format
tkde18_revised_submission.pdf	Open access	Article and appendix	Postprint (author's final version)	1.31 MB	Adobe PDF
Sapino-08844796.pdf	Restricted access		Publisher's PDF	2.85 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1726448
