Search ranking/relevance algorithms

Tom · Oct 7, 2008 · Viewed 22.5k times

When developing a database of articles for a knowledge base (for example), what are the best ways to sort and display the most relevant answers to a user's question?

Would you use additional data, such as keyword weighting based on whether previous users found the article helpful, or do you find a simple keyword-matching algorithm to be sufficient?

Answer

hippietrail · Oct 20, 2012

Perhaps the easiest and most naive approach that gives immediately useful results is to implement tf-idf:

Variations of the tf–idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document's relevance given a user query. tf–idf can be successfully used for stop-words filtering in various subject fields including text summarization and classification.
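
To make the idea concrete, here is a minimal sketch of tf-idf ranking in plain Python. It is only one of the many variations mentioned above: it assumes whitespace tokenization, term frequency normalized by document length, and idf computed as log(N / df). The names `tokenize` and `rank` are made up for this illustration and do not come from any library.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumption; real systems
    usually also strip punctuation and apply stemming)."""
    return text.lower().split()


def rank(query, documents):
    """Return (score, index) pairs for all documents, best match first."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    results = []
    for index, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in counts:
                continue  # term absent from this document contributes nothing
            tf = counts[term] / len(tokens)      # term frequency
            idf = math.log(n_docs / df[term])    # inverse document frequency
            score += tf * idf
        results.append((score, index))

    # Highest score first; the sort is stable, so ties keep document order.
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


if __name__ == "__main__":
    articles = [
        "how to reset your password",
        "billing and invoices explained",
        "password rules and password expiry",
    ]
    for score, index in rank("password", articles):
        print(round(score, 3), articles[index])
```

Note that with this idf formula a term that occurs in every document scores zero, which is why very common words stop influencing the ranking. Signals such as "was this article helpful?" votes could later be combined with this score, but that goes beyond what plain tf-idf does.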

In a recent related question of mine here, I learned of an excellent free book on this topic, which you can download or read online:

An Introduction to Information Retrieval