Reputation: 3244
I read about Clustered and Non-Clustered Index and concluded that , we use Hashing for Clustered Index and B+ tree (or B tree) for Non-Clustered Index. Is my conclusion right? If not, then what is the difference between these two at data structure level?
Upvotes: 1
Views: 1967
Reputation: 142433
This answer was written when the question seemed to be focused on MySQL.
Since you have tagged the question [mysql], we should limit the discussion to what is possible with that RDBMS.
The one "Engine" you should mostly be using these days is InnoDB. It has primarily B+Tree indexes. (It also has FULLTEXT
and SPATIAL
, but I won't get into them.) It does not have "Hash".
The PRIMARY KEY
is a clustered B+Tree. All the data is ordered by the PK and stored into that one B+Tree. The PK can only be clustered and can only be BTree. The PK is also Unique (in MySQL).
Secondary keys are also B+Trees. They contain the column(s) of the secondary index. At the leaf nodes is the column(s) of the primary key. A "point query" via a secondary index requires drilling down the secondary BTree, then drilling down the Primary key. Secondary keys cannot be "clustered".
For general purpose use, a BTree is the best. A Hash is rarely any better. A Hash is useless for range scans, where as a BTree, especially a B+Tree, is excellent for such.
Hashes are terrible when the index is bigger than can be cached in RAM. Typically the 'next' key you want to access is not adjacent to the last one you touched, so there is a good chance of needing I/O.
Upvotes: 1
Reputation: 5031
In simple words we can explain like below.
Both the Clustered and non-clustered index follows b-tree structure,
for clustered index leaf node of the b-tree structure contains the Actual data.
for non clustered index leaf node of the b-tree structure points the address of the actual data.
Upvotes: 0