Leveraging insurance customer data to characterize socioeconomic indicators of Swiss municipalities
Schifanella, Rossano;
2021-01-01
Abstract
The availability of reliable socioeconomic data is critical for designing urban policies and implementing location-based services; however, their temporal and geographical coverage often remains limited. We explore the potential of insurance customer data to predict the socioeconomic indicators of Swiss municipalities. First, we define a feature space by aggregating individual customer data at the city level along several behavioral and user profile dimensions. Second, we collect official statistics shared by the Swiss authorities on a wide spectrum of categories: Population, Transportation, Work, Space and Territory, Housing, and Economy. Third, we adopt two spatial regression models, exploring global and local geographical dependencies, to investigate how well these indicators can be predicted. Results consistently show a correlation between insurance customer characteristics and official socioeconomic indexes. Performance fluctuates depending on the category, with values of R² > 0.6 for several target variables under 5-fold cross-validation. As a case study, we focus on predicting the percentage of the population using public transportation, and we discuss the implications at a regional scale. We believe that this methodology can support official statistical offices, and it could open up new opportunities for the characterization of socioeconomic traits at highly granular spatial and temporal scales.

File | Size | Format
---|---|---
journal.pone.0246785.pdf (open access; file type: publisher's PDF) | 2.03 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.