sudheer

Reputation: 337

Datastax Cassandra PIG Running only one MAP

I am using DataStax Cassandra 3.1.4 with two nodes. I am running Pig with CqlStorage() against a table of 12 million rows, but only one map task runs for a simple Pig command.

I tried changing split_size in my Pig relation, but it didn't work.

Here is my sample query.

x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;

I didn't find an input.split.size property in my mapred-site.xml, so I am assuming the default split size is 64*1024.

I also tried set pig.splitCombination false;

Now it runs 513 maps regardless of the number of records. I tried the same thing from Hive.

I connected to Cassandra from Hive and ran a simple select-all query with where col1>value. This table has only 10 records, but the query still runs 513 maps.

Please help me with this.

Thanks

Upvotes: 1

Views: 99

Answers (1)

psanford

Reputation: 5670

Try this setting:

set pig.splitCombination false;

By default, Pig combines what it considers small splits into a single map task.
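
As a rough sketch, this is how the setting might sit in the script from the question. It assumes the set statement is issued before the load, and it reuses the keyspace, table name, and split_size value from the question unchanged; none of these values are recommendations.

```pig
-- Turn off split combination so small input splits are not merged into one map.
set pig.splitCombination false;

-- Load from Cassandra; split_size=1000 is the value used in the question.
x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;
```

With combination turned off, the number of maps should follow the number of input splits reported by the storage layer rather than Pig's combined total.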

Upvotes: 1
