img

Notice détaillée

Topic distillation with query-dependent link connections

Article Ecrit par: Mingfang, Wu ; Falk, Scholer ;

Résumé: Searchers on the Web often aim to find key resources about a topic. Finding such results is called topic distillation. Previous research has shown that the use of sources of evidence such as page indegree and URL structure can significantly improve search performance on interconnected collections such as the Web, beyond the use of simple term distribution statistics. This article presents a new approach to improve topic distillation by exploring the use of external sources of evidence: link structure, including query dependent indegree and outdegree; and web page characteristics, such as the density of anchor links. Our experiments with the TREC .GOV collection, an 18GB crawl of the US .gov domain from 2002, show that using such evidence can significantly improve search effectiveness, with combinations of evidence leading to significant performance gains over both full-text and anchor-text baselines. Moreover, we demonstrate that, at a different scope level, both local query-dependent outdegree and query-dependent indegree out-performed their global query-independent counterparts; and at the same scope level, outdegree out-performed indegree. Adding query-dependent indegree or page characteristics to query-dependent outdegree could have a small, but not significant, improvement.


Langue: Français