D. Müller
D. Müller

Reputation: 3426

Cassandra create duplicate table with different primary key

I'm new to Apache Cassandra and have the following issue:

I have a table with PRIMARY KEY (userid, countrycode, carid). As described in many tutorials this table can be queried by using following filter criteria:

This is fine for most cases, but now I need to query the table by filtering only on

Here, the documentation sais that is the best solution to create another table with a modified primary key, in this case PRIMARY KEY (userid, carid, countrycode).

The question here is, how to copy the data from the "original" table to the new one with different index?

And another important question concerning the duplication of a huge table: What about the storage needed to save both tables instead of only one?

Upvotes: 1

Views: 4053

Answers (1)

user1401472
user1401472

Reputation: 2301

You can use COPY command to export from one table and import into other table.

From your example - I created 2 tables. user_country and user_car with respective primary keys.

CREATE KEYSPACE user WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',  'datacenter1' : 2 } ;
CREATE TABLE user.user_country ( user_id text, country_code text, car_id text, PRIMARY KEY (user_id, country_code, car_id));
CREATE TABLE user.user_car ( user_id text, country_code text, car_id text, PRIMARY KEY (user_id, car_id, country_code));

Let's insert some dummy data into one table.

cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('1', 'IN', 'CAR1');
cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('2', 'IN', 'CAR2');
cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('3', 'IN', 'CAR3');
cqlsh> select * from user.user_country ;

 user_id | country_code | car_id
---------+--------------+--------
       3 |           IN |   CAR3
       2 |           IN |   CAR2
       1 |           IN |   CAR1

(3 rows)

Now we will export the data into a CSV. Observe the sequence of columns mentioned.

cqlsh> COPY user.user_country (user_id,car_id, country_code) TO 'export.csv';
Using 1 child processes

Starting copy of user.user_country with columns [user_id, car_id, country_code].
Processed: 3 rows; Rate:       4 rows/s; Avg. rate:       4 rows/s
3 rows exported to 1 files in 0.824 seconds.

export.csv can now be directly inserted into other table.

cqlsh> COPY user.user_car(user_id,car_id, country_code) FROM 'export.csv';
Using 1 child processes

Starting copy of user.user_car with columns [user_id, car_id, country_code].
Processed: 3 rows; Rate:       6 rows/s; Avg. rate:       8 rows/s
3 rows imported from 1 files in 0.359 seconds (0 skipped).
cqlsh>
cqlsh>
cqlsh> select * from user.user_car ;

 user_id | car_id | country_code
---------+--------+--------------
       3 |   CAR3 |           IN
       2 |   CAR2 |           IN
       1 |   CAR1 |           IN

(3 rows)
cqlsh>

About your other question - yes the data will be duplicated, but that's how cassandra is used.

Upvotes: 2

Related Questions