Start-ups survival through a crisis. Combining machine learning with econometrics to measure innovation

Guerzoni M.;Nava C. R.;Nuccio M.
2021-01-01

Abstract

This paper shows how data science can improve empirical research in economics by leveraging large datasets and extracting information that would otherwise be unsuitable for a traditional econometric approach. As a test bed for our framework, we use machine learning algorithms to create a new holistic measure of innovation that follows the definition in a 2012 Italian law aimed at boosting new high-tech firms. We adopt this measure to analyse the impact of innovativeness on a large population of Italian firms that entered the market at the beginning of the 2008 global crisis. The methodological contribution is organised in three steps. First, we train seven supervised learning algorithms to recognise innovative firms in 2013 firmographic data and select the combination of those models with the best predictive power. Second, we apply this combination to the 2008 dataset and predict which firms would have been labelled as innovative according to the definition in the 2012 law. Finally, we use this new indicator as a regressor in a survival model to explain firms' ability to remain in the market after 2008. The results suggest that innovative firms are more likely to survive than the rest of the sample, but that the survival premium is likely to depend on location.
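
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the features `rd` and `skilled` are hypothetical, simple threshold classifiers stand in for the seven supervised learning algorithms, and a Kaplan-Meier estimator stands in for the survival model. Durations are drawn at random, so this toy data is not expected to show any survival premium.

```python
import random
from itertools import combinations

random.seed(0)

def make_firms(n):
    """Synthetic firms with two made-up features and a noisy 'innovative' label."""
    firms = []
    for _ in range(n):
        rd, skilled = random.random(), random.random()
        label = int(rd + skilled + random.gauss(0, 0.2) > 1.2)
        firms.append({"rd": rd, "skilled": skilled, "innovative": label})
    return firms

def score(firm, key):
    """Feature value used by a classifier; 'sum' combines both features."""
    return firm["rd"] + firm["skilled"] if key == "sum" else firm[key]

def accuracy(predict, firms):
    return sum(predict(f) == f["innovative"] for f in firms) / len(firms)

def fit_stump(train, key):
    """One-feature threshold classifier; threshold chosen by training accuracy."""
    candidates = [i / 20 for i in range(41)]  # 0.0, 0.05, ..., 2.0
    best = max(candidates,
               key=lambda t: accuracy(lambda f: int(score(f, key) > t), train))
    return lambda f: int(score(f, key) > best)

def majority(models):
    """Combine an odd number of classifiers by majority vote."""
    return lambda f: int(sum(m(f) for m in models) * 2 > len(models))

def kaplan_meier(durations, exited):
    """Return [(t, S(t))] at each observed exit time; exited == 0 means censored."""
    curve, surv = [], 1.0
    for t in sorted({d for d, e in zip(durations, exited) if e}):
        at_risk = sum(1 for d in durations if d >= t)
        exits = sum(1 for d, e in zip(durations, exited) if d == t and e)
        surv *= 1 - exits / at_risk
        curve.append((t, surv))
    return curve

# Step 1: train on labelled "2013" firms and keep the combination
# (a single model or a three-model vote) with the best validation accuracy.
labelled_2013 = make_firms(600)
train, valid = labelled_2013[:400], labelled_2013[400:]
stumps = {k: fit_stump(train, k) for k in ("rd", "skilled", "sum")}
subsets = [c for r in (1, 3) for c in combinations(stumps, r)]
best_subset = max(
    subsets, key=lambda c: accuracy(majority([stumps[k] for k in c]), valid))
ensemble = majority([stumps[k] for k in best_subset])

# Step 2: label an unlabelled "2008" cohort with the selected combination.
cohort_2008 = [{k: v for k, v in f.items() if k != "innovative"}
               for f in make_firms(300)]
for f in cohort_2008:
    f["predicted_innovative"] = ensemble(f)

# Step 3: compare survival curves by predicted label.
for f in cohort_2008:
    f["years"] = random.randint(1, 10)
    f["exited"] = int(f["years"] < 10)  # still active at year 10 = censored
curves = {
    g: kaplan_meier(
        [f["years"] for f in cohort_2008 if f["predicted_innovative"] == g],
        [f["exited"] for f in cohort_2008 if f["predicted_innovative"] == g])
    for g in (0, 1)
}
```

In a regression setting, the predicted label would enter a survival model as a covariate alongside controls such as location; the grouped Kaplan-Meier curves here are only the simplest way to show how the predicted indicator feeds into the survival step.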
Year: 2021
Volume: 30
Issue: 5
Pages: 468-493
Keywords: economic crisis; innovation; machine learning; start-ups; survival analysis
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1770710