CINECA IRIS Institutional Research Information System

Classification-based Content Sensitivity Analysis

Battaglia, Elena (co-first author); Bioglio, Livio (co-first author); Pensa, Ruggero G. (last author)

2020-01-01

Abstract

With the availability of user-generated content on the Web, malicious users have access to huge repositories of private (and often sensitive) information regarding a large part of the world’s population. In this paper, we propose a way to evaluate the harmfulness of text content by defining a new data mining task called content sensitivity analysis. According to our definition, a score can be assigned to any text sample according to its degree of sensitivity. Even though the task is similar to sentiment analysis, we show that it has its own peculiarities and may lead to a new branch of research. Through preliminary experiments, we show that content sensitivity analysis cannot be addressed as a simple binary classification task.

Year: 2020
Publication language: English
Invited or contributed: Contributed
Event type: 1 - Conference
Event title: 28th Symposium on Advanced Database Systems (SEBD 2020)
Event location: Villasimius, Italy
Event date: June 21-24, 2020
Event scope: National
Volume editors: Maristella Agosti, Maurizio Atzori, Paolo Ciaccia, Letizia Tanca
Volume title: Proceedings of the 28th Italian Symposium on Advanced Database Systems, Villasimius, Sud Sardegna, Italy (virtual due to Covid-19 pandemic), June 21-24, 2020
Referee: Scientific committee
Publisher name: CEUR-WS.org
Publisher city: Aachen
Publisher country: Germany
Volume number: 2646
First page: 326
Last page: 333
Number of pages: 8
Series title (if an ISSN exists): CEUR Workshop Proceedings
Scopus ID: 2-s2.0-85090899358
Product URL (open access archives, full text on publisher site, etc.): http://ceur-ws.org/Vol-2646/12-paper.pdf
Keywords: privacy, text mining, text categorization
Co-authors affiliated with foreign institutions: no
Product compliant with the University's open access regulation?: 1 - product with a file in Open Access version (I will attach the file at step 6 - Upload)
Number of authors: 3
Type: info:eu-repo/semantics/conferenceObject
Type: 04-CONTRIBUTION IN CONFERENCE PROCEEDINGS::04A-Conference paper in volume
All authors: Battaglia, Elena; Bioglio, Livio; Pensa, Ruggero G.
Faculty site type: 273
Fulltext: open
Appears in types: 04A-Conference paper in volume

Files in this item:

File	Size	Format
sebd2020_2_online.pdf (open access; description: PDF online (open access); file type: publisher's PDF)	423.41 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1749119
