On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures
Cluster-based information retrieval systems often use a similarity measure to compute the association among text documents. In this thesis, we focus on a class of similarity measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown that QSS measures positively influence the outcome of a clustering procedure. These studies used QSS measures in conjunction with the ltc term-weighting scheme. Several term-weighting schemes have since superseded the ltc scheme and demonstrated better retrieval performance. We test whether one of these schemes, INQUERY, offers any benefit over the ltc scheme when used with QSS measures. The testing procedure uses the Nearest Neighbor (NN) test to quantify the clustering effectiveness of each combination of QSS measure and term-weighting scheme. The NN tests are applied to standard test document collections, and the results are tested for statistical significance. Comparing the NN test results with those obtained for the ltc scheme, we find several instances where the INQUERY scheme improves the clustering effectiveness of QSS measures. To apply the NN test, we designed a software test framework, Ferret, which complements the features provided by dtSearch, a search engine. The framework automates the generation of NN coefficients by processing standard test document collection data. We describe the construction and operation of the Ferret test framework.
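As a rough illustration of the evaluation idea summarized above, the following Python sketch computes ltc-style weights and an NN coefficient for a toy collection. It is not the Ferret implementation and does not use dtSearch. The function names (`ltc_vectors`, `cosine`, `nn_coefficient`) are hypothetical, the ltc formula assumed here is the common SMART reading (logarithmic term frequency, inverse document frequency, cosine normalization), and plain cosine similarity stands in for a QSS measure, which would additionally take the query into account. The NN coefficient is taken to be the average number of relevant documents among the k nearest neighbors of each relevant document.

```python
import math
from collections import Counter


def ltc_vectors(docs):
    """Build ltc-style weight vectors (assumed formula, see lead-in).

    docs: list of token lists. Returns one dict (term -> weight) per document.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        # l: 1 + ln(tf), t: ln(N / df)
        raw = {t: (1.0 + math.log(f)) * math.log(n / df[t]) for t, f in tf.items()}
        # c: cosine (unit-length) normalization
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm > 0 else {})
    return vectors


def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def nn_coefficient(vectors, relevant, k=5):
    """Average count of relevant documents among the k nearest neighbors
    of each relevant document. Plain cosine is a stand-in for a QSS measure."""
    counts = []
    for i in relevant:
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: cosine(vectors[i], vectors[j]), reverse=True)
        counts.append(sum(1 for j in others[:k] if j in relevant))
    return sum(counts) / len(counts) if counts else 0.0


# Toy collection: documents 0 and 1 are assumed relevant to some query.
docs = [
    ["apple", "banana"],
    ["apple", "banana", "cherry"],
    ["dog", "cat"],
    ["dog", "cat", "bird"],
]
vectors = ltc_vectors(docs)
score = nn_coefficient(vectors, relevant={0, 1}, k=1)
```

Comparing two term-weighting schemes under this setup would amount to swapping the weighting function, recomputing the coefficient per query, and testing the paired differences for statistical significance.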
Kini, Ananth Ullal (2005). On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures. Master's thesis, Texas A&M University. Available electronically from