Random Reads and Scans inside the same HBase cluster

Question

We have a situation where we host data for:

MapReduce/Spark jobs (disk accessed by sequential reads)
Random reads (disk accessed by seeks)

All inside the same cluster/table.

With YARN we can manage resources such as CPU and RAM, but during intensive scans the HDDs can become a bottleneck and slow down random-read performance. How can disk I/O be managed as a resource?

How are situations like this generally handled?


Answers (1)
