e90jimmy

Reputation: 272

Using Pig with Cassandra CQL3

When I run the following Pig script against a Cassandra schema created with CQL3:

-- This script simply gets a row count of the given column family  
rows = LOAD 'cassandra://Keyspace1/ColumnFamily/' USING CassandraStorage();
counted = foreach (group rows all) generate COUNT($1);
dump counted;

I get the following error:

Error: Column family 'ColumnFamily' not found in keyspace 'KeySpace1'

I understand that this is by design, but I have been having trouble finding the correct way to load CQL3 tables into Pig.

Can someone point me in the right direction? Is there a missing bit of documentation?

Upvotes: 1

Views: 505

Answers (5)

RussS

Reputation: 16576

The best way to access CQL3 tables in Pig is to use the CqlStorage handler.

The syntax is similar to what you have above:

row = LOAD 'cql://Keyspace/ColumnFamily/' USING CqlStorage();
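
Applied to the row count script from the question, this might look like the sketch below. It assumes that CqlStorage is already on Pig's classpath (for example, because Pig is started through the pig_cassandra wrapper from Cassandra's examples/pig directory) and it reuses the question's placeholder names Keyspace1 and ColumnFamily.

```pig
-- Sketch only: row count of a CQL3 table loaded through CqlStorage.
-- Assumption: Cassandra's Pig classes are on the classpath, so the short
-- name CqlStorage resolves without a REGISTER or a fully qualified name.
-- Keyspace1 and ColumnFamily are the placeholder names from the question.
rows = LOAD 'cql://Keyspace1/ColumnFamily/' USING CqlStorage();

-- GROUP ... ALL puts every row into a single bag ($1), which COUNT then sizes.
counted = FOREACH (GROUP rows ALL) GENERATE COUNT($1);
DUMP counted;
```

CQL3 folds unquoted identifiers to lowercase, so if the keyspace and table were created without quotes, the names in the URL may need to be written in lowercase.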

More info in the dev blog post.

Upvotes: 0

marcostrama

Reputation: 161

As e90jimmy said, it's supported in Cassandra 1.2.8, but there is an issue when using the counter column type. Alex Liu fixed it, but because of a regression problem in 1.2.7 the patch did not go in:

https://issues.apache.org/jira/browse/CASSANDRA-5234

To work around this, either wait until 2.0 becomes production ready, or download the source, apply the patch from the link above yourself, and rebuild the Cassandra .jar. That has worked for me so far.

Upvotes: 0

e90jimmy

Reputation: 272

This is now supported in Cassandra 1.2.8.

Upvotes: 1

Louis Simoneau

Reputation: 1791

According to https://github.com/alexliu68/cassandra/pull/3, this fix appears to be planned for the 1.2.6 release of Cassandra. It sounds like they're trying to get that out in the reasonably near future, but there's no firm ETA.

Upvotes: 0

Lyuben Todorov

Reputation: 14153

As you mention, this is by design: updating Thrift to allow this would compromise backwards compatibility. Instead of creating keyspaces and column families using CQL (I'm guessing you used cqlsh), try using the C* CLI.

Take a look at these issues as well:

Upvotes: 0
