niaomingjian
niaomingjian

Reputation: 3742

How can I get the primary keys of all records in Cassandra?

I have inserted plenty of data into Cassandra. Now I'd like to randomly query one record. But I don't know the primary key. So I want a way to fetch the primary keys of all data in Cassandra,then I can use one random primary key to query the data.

Can I get it from the data directory of Cassandra? I only found .db files in that directory.

Upvotes: 3

Views: 9996

Answers (3)

Roy van der Valk
Roy van der Valk

Reputation: 537

To get the primary key values you can use distinct in CQL:

cqlsh> select distinct pkey_column FROM table;

Or with multiple:

cqlsh> select distinct pkey_column_1, pkey_column_2 FROM table;

Upvotes: 3

xmas79
xmas79

Reputation: 5180

You need to use the TOKEN function in order to avoid unbound SELECT * FROM xxx; queries. This external post explains how to traverse all your dataset, but you can adopt it to your use case with very small efforts.

Upvotes: 1

Aaron
Aaron

Reputation: 57843

Which version of Cassandra are you on? If you're on Cassandra 3, you could query system_schema.columns from the command line, and grep for the PRIMARY KEY parts like this:

$ bin/cqlsh -u aploetz -p pass 
    -e'SELECT keyspace_name,table_name,column_name,kind 
       FROM system_schema.columns' | grep _key
               test |                           fail |                          id | partition_key
      stackoverflow |                 hourly_average |              application_id | partition_key
      stackoverflow |                         meters |                   device_id | partition_key
      stackoverflow |                         meters |                  parking_id | partition_key
      stackoverflow |                        meters2 |                   device_id | partition_key
      stackoverflow |                        meters2 |                  parking_id | partition_key

Then to get the clustering keys, you could modify that slightly:

$ bin/cqlsh -u aploetz -p pass
    -e'SELECT keyspace_name,table_name,column_name,kind
       FROM system_schema.columns' | grep clustering
      stackoverflow |                 hourly_average |                   device_id |    clustering
      stackoverflow |                 hourly_average |                 location_id |    clustering
      stackoverflow |                 hourly_average |                  partner_id |    clustering
      stackoverflow |                         meters |                        date |    clustering
      stackoverflow |                         meters |                      status |    clustering
      stackoverflow |                        meters2 |                        date |    clustering
      stackoverflow |                        meters2 |                      status |    clustering

Note that if you want to filter these results by keyspace and table, you can specify those in the WHERE clause.

Upvotes: 3

Related Questions