Most existing document and web search engines rely on keyword-based queries. To find matches, these queries are processed by retrieval algorithms that rely on word frequencies, topic recency, document authority, and (in some cases) available ontologies. In this paper, we propose an approach to exploring text collections using a novel keywords-by-concepts (KbC) graph, which supports navigation using domain-specific concepts as well as keywords that characterize the text corpus. The KbC graph is a weighted graph, created by tightly integrating keywords extracted from documents with concepts obtained from domain taxonomies. Documents in the corpus are associated with the nodes of the graph based on evidence supporting contextual relevance; thus, the KbC graph supports contextually informed access to these documents. The construction of the KbC graph relies on a spreading-activation-like technique, which mimics the way the brain links and constructs knowledge. We also present the CoSeNa (Context-based Search and Navigation) system, which leverages the KbC model as the basis for document exploration as well as contextually informed media integration.
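The abstract does not spell out how the KbC graph is built, so the following is only a minimal sketch of generic spreading activation over a weighted graph, the family of techniques the abstract refers to. It is not the authors' construction algorithm. The function name `spread_activation`, the parameters `decay`, `threshold`, and `max_steps`, and the toy graph with its weights are all assumptions made for illustration.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Generic spreading activation (illustrative, not the paper's method).

    graph: dict mapping node -> {neighbor: edge weight in [0, 1]}
    seeds: dict mapping node -> initial activation
    Returns a dict mapping each reached node to its accumulated activation.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                # Energy passed along an edge is damped by weight and decay.
                passed = energy * weight * decay
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break  # nothing left to propagate
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical toy graph mixing a keyword with taxonomy-style concepts.
toy_graph = {
    "earthquake": {"seismology": 0.8, "damage": 0.5},
    "seismology": {"fault": 0.9},
    "damage": {},
    "fault": {},
}
scores = spread_activation(toy_graph, {"earthquake": 1.0})
```

In this sketch, activation starts at a seed term and fades as it travels, so nodes reached through strong, short paths end up with higher scores than nodes reached through weak or long ones. A weighted keyword-concept graph could use such scores as edge or node weights, though the actual weighting scheme is defined in the paper itself.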
Context-informed Knowledge Extraction from Document Collections to Support User Navigation / Mario Cataldi; Claudio Schifanella; K. Selcuk Candan; Maria Luisa Sapino; Luigi Di Caro. - In: JOURNAL OF MULTIMEDIA PROCESSING TECHNOLOGY. - ISSN 0976-4127. - 1(2010), pp. 74-94.
Title: | Context-informed Knowledge Extraction from Document Collections to Support User Navigation |
Authors: | Mario Cataldi; Claudio Schifanella; K. Selcuk Candan; Maria Luisa Sapino; Luigi Di Caro |
Publication date: | 2010 |
Volume: | 1 |
First page: | 74 |
Last page: | 94 |
URL: | http://www.dline.info/jmpt/v1n2.php |
Journal: | JOURNAL OF MULTIMEDIA PROCESSING TECHNOLOGY |
Appears in types: | 03A-Journal Article |