Pig and Hive connectivity to DataStax Cassandra runs a huge number of maps

Question

I am using DSE 3.2.4. I have created three tables: one with 10M rows, one with 50k rows, and one with just 10 rows. When I run a simple Pig or Hive query over these tables, it runs the same number of mappers for all of them.

In Pig, pig.splitCombination is true by default, and in that case only one map runs. If I set it to false, 513 maps run.

In Hive, 513 maps run by default.

I tried setting the following properties:

`mapred.min.split.size=134217728` in `mapred-site.xml`: still running 513 maps for all the tables
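
For reference, a property in `mapred-site.xml` is normally declared as a `<property>` element inside the top-level `<configuration>` element. The fragment below is only a sketch of what such an entry would look like with the value above (134217728 bytes = 128 MiB). The actual file contents were not posted, so the surrounding layout is an assumption.

```xml
<!-- Sketch only: assumed layout of the mapred-site.xml entry; the real file was not posted. -->
<configuration>
  <property>
    <!-- Minimum input split size in bytes: 134217728 = 128 * 1024 * 1024 -->
    <name>mapred.min.split.size</name>
    <value>134217728</value>
  </property>
</configuration>
```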

`set pig.splitCombination=false` in the Pig shell: now running only 1 map for all the tables

But no luck.

Finally, I found mapred.map.tasks = 513 in job.xml.

I tried to change this in `mapred-site.xml`, but the change is not reflected.

Please help me with this.
