With the availability of user-generated content in the Web, malicious users dispose of huge repositories of private (and often sensitive) information regarding a large part of the world’s population. The self-disclosure of personal information, in the form of text, pictures and videos, exposes the authors of such contents (and not only them) to many criminal acts such as identity thefts, stalking, burglary, frauds, and so on. In this paper, we propose a way to evaluate the harmfulness of any form of content by defining a new data mining task called content sensitivity analysis. According to our definition, a score can be assigned to any object (text, picture, video...) according to its degree of sensitivity. Even though the task is similar to sentiment analysis, we show that it has its own peculiarities and may lead to a new branch of research. Thanks to some preliminary experiments, we show that content sensitivity analysis can not be addressed as a simple binary classification task.
Towards Content Sensitivity Analysis
Battaglia, ElenaCo-first
;Bioglio, LivioCo-first
;Pensa, Ruggero G.
Last
2020-01-01
Abstract
With the availability of user-generated content in the Web, malicious users dispose of huge repositories of private (and often sensitive) information regarding a large part of the world’s population. The self-disclosure of personal information, in the form of text, pictures and videos, exposes the authors of such contents (and not only them) to many criminal acts such as identity thefts, stalking, burglary, frauds, and so on. In this paper, we propose a way to evaluate the harmfulness of any form of content by defining a new data mining task called content sensitivity analysis. According to our definition, a score can be assigned to any object (text, picture, video...) according to its degree of sensitivity. Even though the task is similar to sentiment analysis, we show that it has its own peculiarities and may lead to a new branch of research. Thanks to some preliminary experiments, we show that content sensitivity analysis can not be addressed as a simple binary classification task.File | Dimensione | Formato | |
---|---|---|---|
ida2020_open.pdf
Accesso aperto
Descrizione: PDF online (open access)
Tipo di file:
PDF EDITORIALE
Dimensione
579.55 kB
Formato
Adobe PDF
|
579.55 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.