
Notable Site Recognition using Deep Learning on Mobile and Crowd-sourced Imagery

Rossano Schifanella
2020

Abstract

Being able to automatically recognise notable sites in the physical world using artificial intelligence embedded in mobile devices can pave the way to new forms of urban exploration and open novel channels of interactivity between residents, travellers, and cities. Although the development of outdoor recognition systems has been a topic of interest for a while, most works have been limited in geographic coverage due to the lack of high-quality image data that can be used for training site recognition engines. As a result, prior systems usually lack generality and operate on a limited scope of pre-selected sites. In this work, we design a mobile system that can automatically recognise sites of interest and project relevant information to a user who navigates the city. We build a collection of notable sites using Wikipedia and then exploit online services such as Google Images and Flickr to collect large collections of crowd-sourced imagery describing those sites. These images are then used to train minimal deep learning architectures that can be effectively deployed to dedicated applications on mobile devices. By conducting an evaluation and performing a series of online and real-world experiments, we recognise a number of key challenges in deploying a site recognition system and highlight the importance of incorporating mobile contextual information to facilitate the visual recognition task. The similarity in the feature maps of objects that undergo identification, the presence of noise in crowd-sourced imagery, and arbitrary user-induced inputs are among the factors that impede correct classification for deep learning models.
We show how curating the training data through the application of a class-specific image de-noising method, and incorporating information such as user location, orientation, and attention patterns, can allow for significant improvement in classification accuracy and the construction of an end-to-end system that can effectively be used to recognise sites in the wild.
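The abstract mentions a class-specific image de-noising method applied to the crowd-sourced training data, but does not detail it. As a minimal sketch of one plausible variant — centroid-based outlier filtering in a feature-embedding space, where images far from their class centroid are treated as noise — the function name, inputs, and keep fraction below are illustrative assumptions, not the paper's actual method:

```python
import math

def denoise_class_images(embeddings, keep_fraction=0.8):
    """Keep the images closest to their class centroid; drop likely noise.

    embeddings: list of feature vectors (lists of floats), all for ONE
    site class, e.g. outputs of a pretrained CNN's penultimate layer.
    Returns the indices of the retained images, closest first.
    """
    dim = len(embeddings[0])
    # Per-dimension mean of the class: the class "centroid".
    centroid = [sum(v[i] for v in embeddings) / len(embeddings)
                for i in range(dim)]

    def dist_to_centroid(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, centroid)))

    # Rank images by distance and keep the closest fraction.
    order = sorted(range(len(embeddings)),
                   key=lambda i: dist_to_centroid(embeddings[i]))
    n_keep = max(1, round(keep_fraction * len(embeddings)))
    return order[:n_keep]
```

Running this per class before training would discard crowd-sourced photos (e.g. selfies or interiors tagged with a landmark's name) whose embeddings sit far from the visual consensus of the class.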
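The abstract also highlights that mobile contextual signals — user location and orientation — facilitate the visual recognition task. A hedged sketch of how such a geometric pre-filter could work (all function names, thresholds, and coordinates are hypothetical, not taken from the paper): restrict the classifier's candidate label set to sites that lie within a given radius of the user and roughly inside the camera's horizontal field of view.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def candidate_sites(user_lat, user_lon, heading_deg, sites,
                    radius_m=500.0, fov_deg=90.0):
    """Keep only sites that are nearby AND in the direction the camera points.

    sites: iterable of (name, lat, lon) tuples.
    heading_deg: compass heading of the device, degrees clockwise from north.
    """
    out = []
    for name, lat, lon in sites:
        if haversine_m(user_lat, user_lon, lat, lon) > radius_m:
            continue
        # Smallest angular difference between site bearing and camera heading.
        diff = abs((bearing_deg(user_lat, user_lon, lat, lon)
                    - heading_deg + 180.0) % 360.0 - 180.0)
        if diff <= fov_deg / 2:
            out.append(name)
    return out
```

Shrinking the label set this way both reduces confusion between visually similar landmarks and lets a small on-device model remain accurate over a citywide catalogue of sites.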
Published in: 2020 21st IEEE International Conference on Mobile Data Management (MDM), Versailles, June 30 - July 3, 2020
Publisher: IEEE
Pages: 137-147
ISBN: 978-1-7281-4664-5, 978-1-7281-4663-8
URL: https://ieeexplore.ieee.org/abstract/document/9162277
Keywords: Mobile Computing, Deep Learning, Location-based Application, End-to-End System
Jimin Tan, Anastasios Noulas, Diego Sáez, Rossano Schifanella
Files in this item:

File: 1910.09705.pdf
Access: open access
File type: PREPRINT (FIRST DRAFT)
Size: 7.65 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/2318/1795557