CINECA IRIS Institutional Research Information System

Towards Content Sensitivity Analysis

Battaglia, Elena (co-first author); Bioglio, Livio (co-first author); Pensa, Ruggero G. (last author)

2020-01-01

Abstract

With the availability of user-generated content on the Web, malicious users have access to huge repositories of private (and often sensitive) information regarding a large part of the world’s population. The self-disclosure of personal information, in the form of text, pictures and videos, exposes the authors of such content (and not only them) to many criminal acts such as identity theft, stalking, burglary, fraud, and so on. In this paper, we propose a way to evaluate the harmfulness of any form of content by defining a new data mining task called content sensitivity analysis. According to our definition, a score can be assigned to any object (text, picture, video...) according to its degree of sensitivity. Even though the task is similar to sentiment analysis, we show that it has its own peculiarities and may lead to a new branch of research. Through some preliminary experiments, we show that content sensitivity analysis cannot be addressed as a simple binary classification task.

Year: 2020
Event title: 18th International Symposium on Intelligent Data Analysis, IDA 2020
Event location: Konstanz, Germany
Event date: April 27–29, 2020
Volume title: Advances in Intelligent Data Analysis XVIII
Publisher: Springer
Volume no.: 12080
Pages (from): 67
Pages (to): 79
ISBN: 978-3-030-44583-6; 978-3-030-44584-3
DOI: https://dx.doi.org/10.1007/978-3-030-44584-3_6
Product URL (open-access archives, full text on publisher's site, etc.): https://link.springer.com/chapter/10.1007/978-3-030-44584-3_6
Keywords: Privacy, Text mining, Text categorization
All authors: Battaglia, Elena; Bioglio, Livio; Pensa, Ruggero G.
Appears in publication types: 04A-Conference paper in volume

Files in this item:

File	Size	Format
ida2020_open.pdf (open access; description: PDF online (open access); file type: publisher's PDF)	579.55 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1736937
