Reputation: 3426
I'm new to Apache Cassandra and have the following issue:
I have a table with PRIMARY KEY (userid, countrycode, carid)
. As described in many tutorials this table can be queried by using following filter criteria:
This is fine for most cases, but now I need to query the table by filtering only on
Here, the documentation sais that is the best solution to create another table with a modified primary key, in this case PRIMARY KEY (userid, carid, countrycode)
.
The question here is, how to copy the data from the "original" table to the new one with different index?
And another important question concerning the duplication of a huge table: What about the storage needed to save both tables instead of only one?
Upvotes: 1
Views: 4053
Reputation: 2301
You can use COPY command to export from one table and import into other table.
From your example - I created 2 tables. user_country and user_car with respective primary keys.
CREATE KEYSPACE user WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 2 } ;
CREATE TABLE user.user_country ( user_id text, country_code text, car_id text, PRIMARY KEY (user_id, country_code, car_id));
CREATE TABLE user.user_car ( user_id text, country_code text, car_id text, PRIMARY KEY (user_id, car_id, country_code));
Let's insert some dummy data into one table.
cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('1', 'IN', 'CAR1');
cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('2', 'IN', 'CAR2');
cqlsh> INSERT INTO user.user_country (user_id, country_code, car_id) VALUES ('3', 'IN', 'CAR3');
cqlsh> select * from user.user_country ;
user_id | country_code | car_id
---------+--------------+--------
3 | IN | CAR3
2 | IN | CAR2
1 | IN | CAR1
(3 rows)
Now we will export the data into a CSV. Observe the sequence of columns mentioned.
cqlsh> COPY user.user_country (user_id,car_id, country_code) TO 'export.csv';
Using 1 child processes
Starting copy of user.user_country with columns [user_id, car_id, country_code].
Processed: 3 rows; Rate: 4 rows/s; Avg. rate: 4 rows/s
3 rows exported to 1 files in 0.824 seconds.
export.csv can now be directly inserted into other table.
cqlsh> COPY user.user_car(user_id,car_id, country_code) FROM 'export.csv';
Using 1 child processes
Starting copy of user.user_car with columns [user_id, car_id, country_code].
Processed: 3 rows; Rate: 6 rows/s; Avg. rate: 8 rows/s
3 rows imported from 1 files in 0.359 seconds (0 skipped).
cqlsh>
cqlsh>
cqlsh> select * from user.user_car ;
user_id | car_id | country_code
---------+--------+--------------
3 | CAR3 | IN
2 | CAR2 | IN
1 | CAR1 | IN
(3 rows)
cqlsh>
About your other question - yes the data will be duplicated, but that's how cassandra is used.
Upvotes: 2