Reputation: 1941
I was reading about inverted index (used by the text search engines like Solr, Elastic Search etc) and as I understand (if we take "Person" as an example):
The attribute to Person relationship is inverted:
John -> PersonId(1), PersonId(2), PersonId(3)
London -> PersonId(1), PersonId(2), PersonId(5)
I can now search the person records for 'John who lives in London'
Doesn't this solve all the problems? Why do we have the forward (or regular database index) at all? Or in other words, in what cases the regular indexing is useful? Please explain. Thanks.
Upvotes: 16
Views: 13112
Reputation: 529
In forward index, the input is a document and the output is words contained in the document.
{
doc1: [word1, word2, word3],
doc2: [word4, word5]
}
In the reverse/inverted index, the input is a word, and the output is all the documents in which the words are contained.
{
word1: [doc1, doc10, doc3],
word2: [doc5, doc3]
}
Search engines make use of reverse/inverted index to get us documents from keywords.
Upvotes: 1
Reputation: 5796
Here's an explanation of inverted index, from Elasticsearch:
Elasticsearch uses a structure called an inverted index, which is designed to allow very fast full-text searches. An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears. https://www.elastic.co/guide/en/elasticsearch/guide/current/inverted-index.html
Inverted indexing is for fast full text search. Regular indexing is less efficient, because the engine looks through all entries for a term, but very fast with indexing!
You can say this:
But, it's always context related. If you compare it with MySQL: myisam has fast read, innodb has fast insert/update and slower read.
Read more here: https://www.found.no/foundation/indexing-for-beginners-part3/
Upvotes: 3
Reputation: 25221
The point that you're missing is that there is no real technical distinction between a forward index and an inverted index. "Forward" and "inverted" in this case are just descriptive terms to distinguish between:
The concept of an inverted index only makes sense if the concept of a regular (forward) index already exists. In the context of a search engine, a forward index would be the term vector; a list of terms contained within a particular document. The inverted index would be a list of documents containing a given term.
When you understand that the terms "forward" and "inverted" are really just relative terms used to describe the nature of the index you're talking about - and that really an index is just an index - your question doesn't really make sense any more.
Upvotes: 36