Keyword categorization is an essential tool for SEO (Search Engine Optimization), digital marketing, and online advertising. Keywords are among the most valuable pieces of information for inferring users' intents and interests. An effective keyword categorization method makes it possible to understand which types of content are in greatest demand, and can help improve future content strategies or marketing/ad campaigns. In this paper, we present a novel deep learning model for multilingual keyword categorization. The model relies on fastText multilingual word embeddings, and its architecture is inspired by the DeepSets model. To make use of (training) words not included in the pre-trained fastText embeddings, we initialize each of them as the average embedding over all of its co-occurring words. Then, we fine-tune these representations by allowing the network to back-propagate the error to the input. We assess the quality of our proposal on a real-world dataset provided by a Spanish company, in which keywords are categorized according to the Google Product Taxonomy (GPT). Empirical results show that our model can achieve high accuracy scores while being extremely efficient.
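The initialization of out-of-vocabulary words described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `init_oov_embeddings`, the whitespace tokenization, the toy two-dimensional vectors, and the choice to count every co-occurrence (rather than each distinct neighbour once) are all assumptions made for illustration. The subsequent fine-tuning by back-propagation is not shown.

```python
from collections import defaultdict


def init_oov_embeddings(keywords, pretrained, dim):
    """Hypothetical helper: give each word missing from `pretrained` the mean
    vector of the in-vocabulary words it co-occurs with in the keywords."""
    sums = defaultdict(lambda: [0.0] * dim)
    counts = defaultdict(int)
    for keyword in keywords:
        tokens = keyword.lower().split()  # assumption: whitespace tokenization
        known = [t for t in tokens if t in pretrained]
        for token in tokens:
            if token in pretrained:
                continue  # already has a pre-trained vector
            for other in known:
                acc = sums[token]
                for i, value in enumerate(pretrained[other]):
                    acc[i] += value
                counts[token] += 1
    # Words that never co-occur with a known word get no vector here.
    return {word: [x / counts[word] for x in sums[word]] for word in sums}


# Toy stand-in for pre-trained fastText vectors (illustrative values only).
pretrained = {"running": [1.0, 0.0], "shoes": [0.0, 1.0], "red": [1.0, 1.0]}
keywords = ["running shoes nike", "red nike", "zzz"]
oov = init_oov_embeddings(keywords, pretrained, dim=2)
```

In this toy example, `nike` receives the mean of the vectors for `running`, `shoes`, and `red`, while `zzz` has no known neighbours and is left without a vector, so a real pipeline would need a fallback such as random initialization.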
Efficient Multilingual Deep Learning Model for Keyword Categorization
Polato, M
2021-01-01
File | Access | File type | Size | Format
---|---|---|---|---
SSCI2021___keyword_categorization.pdf | Open access | Preprint (first draft) | 345.31 kB | Adobe PDF
Efficient_Multilingual_Deep_Learning_Model_for_Keyword_Categorization.pdf | Restricted access | Publisher's PDF | 226.91 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.