Precision and recall

Precision measures how well a system retrieves only the relevant documents. Recall measures how well a system retrieves all the relevant documents. The relative importance of these metrics varies based on the type of search.
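
As a rough illustration of the two definitions, precision can be computed as the share of retrieved documents that are relevant, and recall as the share of relevant documents that were retrieved. The sketch below assumes relevance is a simple yes/no judgment per document; the function name and the document IDs are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs the system returned (hypothetical data)
    relevant:  IDs a human judged relevant (hypothetical data)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]        # what the system returned
relevant = ["d1", "d2", "d5", "d6", "d7"]   # what it should have found

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned documents are relevant, and two of the five relevant documents were found, so the system scores higher on precision than on recall.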

For the sample search in which a few good documents are sufficient, precision outweighs recall. Most Google searchers, for example, want a few good results fast without sifting through false drops. Precision is even more important for the known-item or existence search in which a specific document (or web site) is desired. In fact, this type of search has more in common with data retrieval than information retrieval—because there is a single, correct answer.

But for the exhaustive search, in which all or nearly all relevant documents are desired, recall is the key metric. Lawyers and researchers are willing to sacrifice precision in the interest of finding the smoking gun or the data that makes a difference.

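Recall in full-text searching falls dramatically as the collection increases in size. Because language is beautifully various, the same concept can be described in many different words, and a literal match on the searcher's term misses documents that use another one. Controlled vocabularies are necessary for larger content sets to keep recall high.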
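
One way to picture the effect of a controlled vocabulary is a small synonym ring that expands the searcher's term into its equivalents before matching. This is only a toy sketch under simple assumptions: the documents, the vocabulary, and the `search` function are all hypothetical, and matching is done on whole lowercase words.

```python
# Hypothetical controlled vocabulary: each set groups equivalent terms.
SYNONYM_RINGS = [
    {"car", "automobile", "auto"},
]

# Hypothetical collection: document ID -> text.
DOCS = {
    "d1": "used car prices",
    "d2": "automobile safety ratings",
    "d3": "auto repair guide",
    "d4": "garden tools",
}


def search(query, docs, vocabulary=None):
    """Return IDs of documents containing the query term or an equivalent."""
    terms = {query}
    for ring in vocabulary or []:
        if query in ring:
            terms |= ring  # expand the query with its equivalent terms
    return [doc_id for doc_id, text in docs.items()
            if terms & set(text.lower().split())]


plain = search("car", DOCS)                    # literal full-text match
expanded = search("car", DOCS, SYNONYM_RINGS)  # match via the vocabulary

print(plain)
print(expanded)
```

In this toy collection three documents are about cars, but the literal search finds only the one that happens to use the word "car"; the expanded search finds all three.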