Shrinking for success: Multimodal dimensionality reduction for sustainable recommender systems

Geninatti Cossatin A.; Mauro N.
2025-01-01

Abstract

Multimodal item representations are widely used in modern recommender systems to capture various aspects of items. However, the high dimensionality of these representations poses challenges in terms of computational efficiency and resource usage. In this paper, we propose a fusion method, named MDR, based on attention bottlenecks to obtain condensed multimodal item representations. Our approach, which lies at the intersection of dimensionality reduction and information fusion, leverages a Transformer autoencoder architecture to learn a compact item representation that preserves the salient information from the original high-dimensional features. We evaluate the impact of the condensed representations on the recommendation performance and resource consumption of a set of recommendation algorithms using the Cornac framework. Experiments on four datasets show that the condensed item representations produced by our fusion method enable the recommender algorithms to achieve comparable or even improved recommendation performance compared to the original, high-dimensional representations. Moreover, using the condensed representations significantly reduces training time, RAM usage, and GPU utilization. These findings highlight the potential of our approach to enhance the efficiency and sustainability of multimodal recommender systems without compromising their effectiveness.
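
To make the idea of an attention bottleneck concrete, the following is a minimal sketch and not the authors' MDR implementation. It assumes that a small number of bottleneck query vectors attend over the concatenated tokens of two modalities, so that many feature values are condensed into a few. All names (`bottleneck_fuse`, `text_tokens`, `image_tokens`, `queries`) and all sizes are hypothetical, the features are random, and the learned projections, multi-head attention, decoder, and training loop of a real Transformer autoencoder are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bottleneck_fuse(modality_tokens, bottleneck_queries):
    """Single-head scaled dot-product attention without learned projections.

    Each bottleneck query attends over all modality tokens (used as both
    keys and values), so the output has one vector per bottleneck query.
    """
    dim = len(bottleneck_queries[0])
    fused = []
    for q in bottleneck_queries:
        weights = softmax([dot(q, t) / math.sqrt(dim) for t in modality_tokens])
        fused.append(
            [sum(w * t[i] for w, t in zip(weights, modality_tokens)) for i in range(dim)]
        )
    return fused


rng = random.Random(0)
dim = 8  # hypothetical embedding size

# Hypothetical item features: 6 text tokens and 10 image tokens.
text_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
image_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(10)]

# Two bottleneck queries; random here, learned in a trained model.
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]

all_tokens = text_tokens + image_tokens
condensed = bottleneck_fuse(all_tokens, queries)
flat = [x for row in condensed for x in row]
print(len(all_tokens) * dim, "->", len(flat))  # prints "128 -> 16"
```

In this toy setting the 16 input tokens of size 8 are reduced to 2 bottleneck vectors of size 8. In an autoencoder setup, a decoder would then be trained to reconstruct the original features from these vectors, which is what would push the bottleneck to retain the salient information.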
Year: 2025
Volume: 296
Pages: 1-22
Keywords: Attention bottlenecks; Multimodal recommender systems; Sustainable computing; Transformer
Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2094917