Spark - Kudu predicate pushdown

I'm using Kudu and Spark Streaming for a real-time dashboard. My problem is that when I join a batch from Spark Streaming with a Kudu table, no predicate is pushed down to Kudu: Spark takes 2-3 seconds to fetch the entire table and only then filters it. Is there any way to avoid this?

Thanks,

Alexandru

Upvotes: 0

Answers (1)

GarlicSauce

Reputation: 17

1. Kudu is a columnar storage engine, so you can select only the columns you need. This reduces the amount of data pulled from Kudu.

2. Kudu predicate pushdown supports >, <, >=, <=, =, BETWEEN, and IN. Maybe you can filter the data read from Kudu explicitly and then cache the filtered result; with an explicit filter, predicate pushdown may be triggered.
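As a rough sketch of both points in PySpark: instead of joining against the whole table, collect the join keys from the micro-batch and read Kudu with an explicit column list and an IN filter, then join against that smaller result. This assumes the kudu-spark connector is on the classpath and registers the short format name "kudu" (older releases may need the full class name instead), and that the number of keys per batch is small enough to collect on the driver. The function names and parameters here (distinct_keys, read_kudu_slice, key_column, and so on) are made up for illustration.

```python
def distinct_keys(batch_rows, key_field):
    """Collect the distinct, non-null join keys of one micro-batch, in first-seen order."""
    seen = set()
    keys = []
    for row in batch_rows:
        value = row[key_field]  # works for dicts and for pyspark Row objects
        if value is not None and value not in seen:
            seen.add(value)
            keys.append(value)
    return keys


def read_kudu_slice(spark, kudu_master, kudu_table, key_column, keys, columns):
    """Read only the requested columns and only the rows whose key is in `keys`."""
    # Imported here so that distinct_keys stays usable without Spark installed.
    from pyspark.sql.functions import col

    kudu_df = (
        spark.read.format("kudu")  # assumption: short name provided by kudu-spark
        .option("kudu.master", kudu_master)
        .option("kudu.table", kudu_table)
        .load()
    )
    # select() prunes columns; isin() is an IN predicate, which is in the
    # list of predicates the connector can push down.
    return kudu_df.select(*columns).filter(col(key_column).isin(keys))
```

Inside the per-batch handler you would call distinct_keys on the collected key column of the batch, pass the result to read_kudu_slice, and join the batch with the returned DataFrame. Calling explain() on that DataFrame is one way to check whether the filter actually reached the Kudu scan.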

Upvotes: -1
