Shrinking for success: Multimodal dimensionality reduction for sustainable recommender systems

Geninatti Cossatin, A.; Mauro, N.

doi:10.1016/j.eswa.2025.128929

Multimodal item representations are widely used in modern recommender systems to capture various aspects of items. However, the high dimensionality of these representations poses challenges in terms of computational efficiency and resource usage. In this paper, we propose a fusion method, named MDR, based on attention bottlenecks to obtain condensed multimodal item representations. Our approach, which sits at the intersection of dimensionality reduction and information fusion, leverages a Transformer autoencoder architecture to learn a compact item representation that preserves the salient information from the original high-dimensional features. We evaluate the impact of the condensed representations on the recommendation performance and resource consumption of a set of recommendation algorithms using the Cornac framework. Experiments on four datasets show that the condensed item representations produced by our fusion method enable the recommender algorithms to match or even exceed the recommendation performance obtained with the original, high-dimensional representations. Moreover, using the condensed representations significantly reduces training time, RAM usage, and GPU utilization. These findings highlight the potential of our approach to enhance the efficiency and sustainability of multimodal recommender systems without compromising their effectiveness.
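To make the idea of an attention bottleneck concrete, the following is a minimal sketch of the general technique, not the MDR architecture itself, whose details the abstract does not give. It assumes two hypothetical modalities (a 16-dimensional visual vector and a 12-dimensional text vector), a shared width of 8, and a single bottleneck token; all names, sizes, and the random, untrained weights are illustrative assumptions. The decoder and the reconstruction loss that an autoencoder would need are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Stand-in for learned weights (assumption: untrained, Gaussian)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def bottleneck_fuse(modality_tokens, bottleneck, w_q, w_k, w_v):
    """Cross-attention in which the bottleneck tokens are the queries.

    Every modality token can only influence the output through the few
    bottleneck tokens, which is what forces the representation to be compact.
    """
    q = matmul(bottleneck, w_q)
    k = matmul(modality_tokens, w_k)
    v = matmul(modality_tokens, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
d_model = 8  # hypothetical condensed width

# Hypothetical raw item features, one row vector per modality.
visual = [[rng.gauss(0.0, 1.0) for _ in range(16)]]
text = [[rng.gauss(0.0, 1.0) for _ in range(12)]]

# Project each modality to the shared width, then stack them as tokens.
tokens = (matmul(visual, random_matrix(16, d_model, rng))
          + matmul(text, random_matrix(12, d_model, rng)))

bottleneck = random_matrix(1, d_model, rng)  # one bottleneck token
fused, weights = bottleneck_fuse(
    tokens,
    bottleneck,
    random_matrix(d_model, d_model, rng),
    random_matrix(d_model, d_model, rng),
    random_matrix(d_model, d_model, rng),
)

print(len(fused), len(fused[0]))  # one fused vector of width d_model
print(weights[0])                 # attention paid to each modality token
```

In this toy setting the 28 raw feature values per item are condensed into a single 8-dimensional vector, which a downstream recommender would consume in place of the original features. In a trained model the weights would be learned by reconstructing the original features from the fused vector.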