Relevance (information retrieval)

In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user. Relevance may include concerns such as timeliness, authority or novelty of the result.

History

The concern with the problem of finding relevant information dates back at least to the first publication of scientific journals in the 17th century.^{[citation needed]}

The formal study of relevance began in the 20th century with the study of what would later be called bibliometrics. In the 1930s and 1940s, S. C. Bradford used the term "relevant" to characterize articles relevant to a subject (cf., Bradford's law). In the 1950s, the first information retrieval systems emerged, and researchers noted the retrieval of irrelevant articles as a significant concern. In 1958, B. C. Vickery made the concept of relevance explicit in an address at the International Conference on Scientific Information.^[1]

Since 1958, information scientists have explored and debated definitions of relevance. A particular focus of the debate was the distinction between "relevance to a subject" or "topical relevance" and "user relevance".^[1]

Evaluation

The information retrieval community has emphasized the use of test collections and benchmark tasks to measure topical relevance, starting with the Cranfield Experiments of the early 1960s and culminating in the TREC evaluations that continue to this day as the main evaluation framework for information retrieval research.^[2]

In order to evaluate how well an information retrieval system retrieved topically relevant results, the relevance of retrieved results must be quantified. In Cranfield-style evaluations, this typically involves assigning a relevance level to each retrieved result, a process known as relevance assessment. Relevance levels can be binary (indicating a result is relevant or that it is not relevant), or graded (indicating results have a varying degree of match between the topic of the result and the information need). Once relevance levels have been assigned to the retrieved results, information retrieval performance measures can be used to assess the quality of a retrieval system's output.

In contrast to this focus solely on topical relevance, the information science community has emphasized user studies that consider user relevance.^[3] These studies often focus on aspects of human-computer interaction (see also human-computer information retrieval).

Clustering and relevance

The cluster hypothesis, proposed by C. J. van Rijsbergen in 1979, asserts that two documents that are similar to each other have a high likelihood of being relevant to the same information need. With respect to the embedding similarity space, the cluster hypothesis can be interpreted globally or locally.^[4] The global interpretation assumes that there exist some fixed set of underlying topics derived from inter-document similarity. These global clusters or their representatives can then be used to relate relevance of two documents (e.g. two documents in the same cluster should both be relevant to the same request). Methods in this spirit include:

cluster-based information retrieval^[5]^[6]
cluster-based document expansion such as latent semantic analysis or its language modeling equivalents.^[7] It is important to ensure that clusters – either in isolation or combination – successfully model the set of possible relevant documents.

A second interpretation, most notably advanced by Ellen Voorhees,^[8] focuses on the local relationships between documents. The local interpretation avoids having to model the number or size of clusters in the collection and allow relevance at multiple scales. Methods in this spirit include:

multiple cluster retrieval^[6]^[8]
spreading activation^[9] and relevance propagation^[10] methods
local document expansion^[11]
score regularization^[12]

Local methods require an accurate and appropriate document similarity measure.

Problems and alternatives

The documents which are most relevant are not necessarily those which are most useful to display in the first page of search results. For example, two duplicate documents might be individually considered quite relevant, but it is only useful to display one of them. A measure called "maximal marginal relevance" (MMR) has been proposed to manage this shortcoming. It considers the relevance of each document only in terms of how much new information it brings given the previous results.^[13]

In some cases, a query may have an ambiguous interpretation, or a variety of potential responses. Providing a diversity of results can be a consideration when evaluating the utility of a result set.^[14]

References

^ ^a ^b Mizzaro, Stefano (1997). "Relevance: The whole history" (PDF). Journal of the American Society for Information Science. 48 (9): 810–832. doi:10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U.
^ Sanderson, P. Clough, M. (2013-06-15). "Evaluating the performance of information retrieval systems using test collections". informationr.net. Retrieved 2020-05-28.{{cite web}}: CS1 maint: multiple names: authors list (link)
^ Yunjie, Xu (2006). "Relevance judgment: What do information users consider beyond topicality?". Journal of the American Society for Information Science and Technology. 57 (7): 961–973. doi:10.1002/asi.20361.
^ F. Diaz, Autocorrelation and Regularization of Query-Based Retrieval Scores. PhD thesis, University of Massachusetts Amherst, Amherst, MA, February 2008, Chapter 3.
^ Croft, W.Bruce (1980). "A model of cluster searching based on classification". Information Systems. 5 (3): 189–195. doi:10.1016/0306-4379(80)90010-1.
^ ^a ^b Griffiths, Alan; Luckhurst, H. Claire; Willett, Peter (1986). "Using interdocument similarity information in document retrieval systems" (PDF). Journal of the American Society for Information Science. 37: 3–11. doi:10.1002/(SICI)1097-4571(198601)37:1<3::AID-ASI1>3.0.CO;2-O.
^ X. Liu and W. B. Croft, “Cluster-based retrieval using language models,” in SIGIR ’04: Proceedings of the 27th annual international conference on Research and development in information retrieval, (New York, NY, USA), pp. 186–193, ACM Press, 2004.
^ ^a ^b E. M. Voorhees, “The cluster hypothesis revisited,” in SIGIR ’85: Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 188–196, ACM Press, 1985.
^ S. Preece, A spreading activation network model for information retrieval. PhD thesis, University of Illinois, Urbana-Champaign, 1981.
^ T. Qin, T.-Y. Liu, X.-D. Zhang, Z. Chen, and W.-Y. Ma, “A study of relevance propagation for web search,” in SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 408–415, ACM Press, 2005.
^ A. Singhal and F. Pereira, “Document expansion for speech retrieval,” in SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 34–41, ACM Press, 1999.
^ Qin, Tao; Liu, Tie-Yan; Zhang, Xu-Dong; Chen, Zheng; Ma, Wei-Ying (2005). "A study of relevance propagation for web search" (PDF). Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. p. 408. doi:10.1145/1076034.1076105. ISBN 1595930345. S2CID 15310025.
^ Carbonell, Jaime; Goldstein, Jade (1998). "The use of MMR, diversity-based reranking for reordering documents and producing summaries". Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 335–336. CiteSeerX 10.1.1.50.2490. doi:10.1145/290941.291025. ISBN 978-1581130157. S2CID 6334682.
^ "Diversity in Document Retrieval (DDR) 2012".