Reputation: 1437
How are indexes in Hive different from partitions? As far as I know both improve query performance, so in what way do they differ?
In what situations would I use indexing or partitioning? Can I use them together?
Kindly suggest.
Upvotes: 2
Views: 1235
Reputation: 182
Partitions let users store data files in different HDFS directories based on a chosen parameter (date, for example, if you want to store your data files by date), thus minimizing the number of files to scan when users run queries.
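To make the directory idea concrete, here is a rough Python sketch of it. It is only a toy model, not how Hive is implemented: the function names, the sample rows, and the `dt` column are all made up for illustration. The one thing taken from Hive is the `column=value` naming of partition directories.

```python
from collections import defaultdict


def write_partitioned(rows, key):
    """Group rows into one 'directory' per distinct value of the partition key."""
    dirs = defaultdict(list)
    for row in rows:
        # Hive names partition directories <column>=<value>, e.g. dt=2015-01-01
        dirs[f"{key}={row[key]}"].append(row)
    return dict(dirs)


def query(dirs, key, value):
    """Read only the directory that matches the filter; all others are skipped."""
    scanned = dirs.get(f"{key}={value}", [])
    return scanned, len(scanned)


# Hypothetical sample data
rows = [
    {"id": 1, "dt": "2015-01-01", "amount": 10},
    {"id": 2, "dt": "2015-01-01", "amount": 25},
    {"id": 3, "dt": "2015-01-02", "amount": 7},
]

dirs = write_partitioned(rows, "dt")
result, rows_scanned = query(dirs, "dt", "2015-01-02")
```

A filter on `dt` only touches the one matching directory, so `rows_scanned` counts the rows in that directory rather than the rows in the whole table.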
Indexes also help fetch data faster, but they require index tables to be built, and the data being indexed is stored in those tables. This leads to storing that data twice.
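A similar toy sketch for the index side, again in Python and again only an assumption-laden illustration rather than Hive's actual index format: the file names `part-0` and `part-1`, the `user` column, and both helper functions are hypothetical. The point is that the index is a separate structure that repeats the indexed values and records where each one lives.

```python
def build_index(files, column):
    """Build a separate lookup structure: column value -> list of (file, offset)."""
    index = {}
    for name, rows in files.items():
        for offset, row in enumerate(rows):
            # The indexed value is copied into the index, so it is stored twice
            index.setdefault(row[column], []).append((name, offset))
    return index


def lookup(files, index, value):
    """Jump straight to the recorded positions instead of scanning every row."""
    return [files[name][offset] for name, offset in index.get(value, [])]


# Hypothetical data files of one table
files = {
    "part-0": [{"id": 1, "user": "ann"}, {"id": 2, "user": "bob"}],
    "part-1": [{"id": 3, "user": "ann"}],
}

index = build_index(files, "user")
matches = lookup(files, index, "ann")
```

Unlike a partition, this does not change where the table's own files live; it adds an extra structure that has to be built and kept alongside the data.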
Upvotes: 1
Reputation: 10139
Partition:
Suppose you have a table that holds transactions created by your applications. This table gets bigger day by day. If you partition it by day, the database effectively creates a separate table for each day, but you still see only one table. This makes your daily queries more efficient.
Index: An index is used to access your table records quickly.
Upvotes: 0