Nipun

Reputation: 4319

Indexing a Spark in-memory table

I have registered a temp table in Spark and cached it in memory. I run many range queries against one column of this table, a timestamp. The table has around 4 million records, and filtering on the timestamp range takes around 25 seconds. I do this around 50 times to get the records between different pairs of times. Is there a way to build a B-tree index on this column so that my queries run much faster?

Upvotes: 0

Views: 179

Answers (1)

Arnon Rotem-Gal-Oz

Reputation: 25909

Write the filter so that it gets all the relevant records in one go: filter(x => x.field >= date1 && x.field <= date2)
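A minimal sketch of that idea, under some assumptions: the `Event` record type, the epoch-millisecond `ts` field, and the `byWindow` helper are hypothetical names used only for illustration, and a plain Scala `Seq` stands in for the cached table. The same two-sided predicate could be passed to `filter` on an RDD or typed Dataset. Splitting the filtered records into windows is a guess at what the 50 repeated queries are for, not something the question states.

```scala
// Hypothetical record type; `ts` is assumed to be epoch milliseconds.
final case class Event(id: Int, ts: Long)

object RangeFilterSketch {
  // Keep the records whose timestamp lies in [from, to], in a single pass.
  def inRange(events: Seq[Event], from: Long, to: Long): Seq[Event] =
    events.filter(e => e.ts >= from && e.ts <= to)

  // Split the records of [from, to] into consecutive windows of `width`
  // milliseconds, keyed by window index, so the data is scanned once
  // rather than once per window.
  def byWindow(events: Seq[Event], from: Long, to: Long, width: Long): Map[Long, Seq[Event]] =
    inRange(events, from, to).groupBy(e => (e.ts - from) / width)

  def main(args: Array[String]): Unit = {
    val events = Seq(Event(1, 5L), Event(2, 10L), Event(3, 25L), Event(4, 40L), Event(5, 55L))
    // Records with 10 <= ts <= 40.
    println(inRange(events, 10L, 40L).map(_.id))
    // Indices of the non-empty 10 ms windows inside the same range.
    println(byWindow(events, 10L, 40L, 10L).keys.toSeq.sorted)
  }
}
```

Both bounds are inclusive here, matching the `>=` and `<=` in the answer. If the windows must not share boundary records, one bound would need to become strict.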

Upvotes: 1
