raluca

Reputation: 153

Denormalize two Cassandra column families with existing data in Cassandra

I have two column families in Cassandra with the same structure:

  CREATE TABLE keyspace.table (
      id bigint,
      firstname text,
      lastname text,
      countryId bigint,
      cityId bigint,
      solr_query text,
      PRIMARY KEY (id)
  );

The second table has three times as many rows as the first. For example, in the first table I would have the row (firstname, lastname, countryId, cityId):

John, Doe, 100, 8

and in the second table I would have the rows (firstname, lastname, countryId, cityId):

John, Doe, 222, 1
John, Doe, 100, 8
John, Doe, 333, 9

I need to merge these into a single column family using collections, so that there is only one row for John Doe with the primary countryId (100) and a collection holding the other pairs. I added a user-defined type:

CREATE TYPE hints_info (
  countryId bigint,
  cityId bigint
);

and added a collection to my first column family:

ALTER TABLE table1 ADD hints map<int, frozen<hints_info>>;

So for the example above my row would look like:

John, Doe, 100, "{1: {countryId :222, cityId :1}, 2: {countryId :333, cityId :9}}"
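A single row's map column can be populated directly in CQL, since map columns support appending entries with +. A sketch, assuming John Doe's row has id = 1:

  UPDATE table1
  SET hints = hints + {1: {countryId: 222, cityId: 1},
                       2: {countryId: 333, cityId: 9}}
  WHERE id = 1;

This only writes one row at a time, though; the open question is how to feed it with the matching rows from the second column family.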

Can I do this from Cassandra, using CQL? That is, take each row from the first column family, match it with the rows from the second column family, and add the missing info that doesn't exist in the first column family into a collection? All my data is already imported into Cassandra. If I were to use C#, for example, what would be the syntax to update the collection in my existing column family with the data from the second column family?

Thank you!

Upvotes: 2

Views: 231

Answers (1)

Nenad Bozic

Reputation: 3784

Cassandra does not support reading from one table and inserting into another in a single operation. CQL is designed for simple read/write operations and does not have the complex query support you find in the relational world.

I would advise creating a third table with the desired structure and copying data from the two original tables gradually until you are satisfied. That way your application can keep working with the existing table structure while you migrate in the background, and you still have the data in the original tables if something goes wrong. Once you are satisfied, you just change your application logic to read from the new table.
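As a sketch, such a third table could reuse the schema and the hints_info UDT from the question (table_merged is a placeholder name):

  CREATE TABLE keyspace.table_merged (
      id bigint,
      firstname text,
      lastname text,
      countryId bigint,
      cityId bigint,
      solr_query text,
      hints map<int, frozen<hints_info>>,
      PRIMARY KEY (id)
  );

Your background job can then write each merged row here while reads continue to hit the original tables.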

As for the migration, you have a couple of options:

  1. Handle this in your application as a background job which pulls from both tables, applies your application logic, and inserts into the new table
  2. Use Spark, which has a Cassandra connector and tools for data transformation, and can write the data into the new table in the shape you want
  3. We created a small migration tool which does exactly this; it has schema and data migrations, and you can write a data migration which pulls from the two original tables and writes to the new one

Option 2 has a lot of overhead just for this use case, since you must install and learn a whole new tool (Spark); that is why we created the migration tool for C#, Java and Scala for this kind of small ETL task. Option 1 is the easiest, and if you already have infrastructure in your code for background jobs, go for it.

Upvotes: 0
