DP-DILCA: Learning Differentially Private Context-based Distances for Categorical Data (Discussion Paper)
Elena Battaglia (first author); Ruggero G. Pensa (last author)
2021-01-01
Abstract
Distance-based machine learning methods have limited applicability to categorical data, since they do not capture the complexity of the relationships among different values of a categorical attribute. Nonetheless, categorical attributes are common in many application scenarios, including clinical and health records, census and survey data. Although distance learning algorithms exist for categorical data, they may disclose private information about individual records if applied to a secret dataset. To address this problem, we introduce a differentially private algorithm for learning distances between any pair of values of a categorical attribute according to the way they are co-distributed with the values of other categorical attributes forming the so-called context. We show empirically that our approach consumes little privacy budget while providing accurate distances.

File | Size | Format
---|---|---
sebd2021_dpdilca_open.pdf (open access; publisher's PDF; description: online PDF, open access) | 997.39 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.