user12211419

How to stream data with golang query to Cassandra

I have the following code:

cluster := gocql.NewCluster("our-cass")
cass, err := cluster.CreateSession()
if err != nil { ... }
defer cass.Close()
iter := cass.Query(`SELECT * FROM cmuser.users LIMIT 9999999999;`).Iter()
c := iter.Columns()
scanArgs := make([]interface{}, len(c))

for i := 0; i < len(scanArgs); i++ {
    scanArgs[i] = makeType(c[i])
}

for iter.Scan(scanArgs...) { ... }

The problem is that we have way too many rows in that table, but I need to read all of them in order to migrate the data to another DB. Is there a way to stream the data from Cassandra? Unfortunately, we don't have a sequence for the primary key of the table; we are using a UUID for the PK. That means we can't use the simple technique of two for loops, one incrementing a counter and the other going through the rows that way.

Upvotes: 1

Views: 1486

Answers (2)

phonaputer

Reputation: 1530

Gocql has some options for paging (assuming your Cassandra version is at least version 2).

Gocql's Session has a method SetPageSize.

And Gocql's Query has a similar method, PageSize.

This may help you break up your query. Here's what the code would look like:

cluster := gocql.NewCluster("our-cass")
cass, err := cluster.CreateSession()
if err != nil {
    // handle the connection error before using the session
}
defer cass.Close()

iter := cass.Query(`SELECT * FROM cmuser.users;`).PageSize(5000).Iter()

// use the iter as usual to iterate over all results
// this will send additional CQL queries when it needs to get new pages
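
If you'd rather set a default page size once for the whole session instead of per query, a minimal sketch (assuming the same cluster and table as above) could look like this:

cluster := gocql.NewCluster("our-cass")
cass, err := cluster.CreateSession()
if err != nil {
    // handle the connection error before using the session
}
defer cass.Close()

// SetPageSize sets the default page size for all queries on this session,
// so individual queries no longer need to call PageSize themselves.
cass.SetPageSize(5000)

iter := cass.Query(`SELECT * FROM cmuser.users;`).Iter()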

Upvotes: 0

Aaron

Reputation: 57798

For a pure Cassandra way to do this, you could run a range query on the token ranges as they are broken-up per-node.

First, find the token ranges:

$ nodetool ring

Datacenter: dc1
==========
Address   Rack       Status State   Load         Owns                Token
                                                                      8961648479018332584
10.1.4.3  rack3      Up     Normal  1.34 GB      100.00%             -9023369133424793632
10.1.4.1  rack2      Up     Normal  1.56 GB      100.00%             -7946127339777435347
10.1.4.3  rack3      Up     Normal  1.34 GB      100.00%             -7847456805881540087
...

etc. (this output might be large, depending on the number of nodes and tokens on each node)

Then adjust your query to use the token() function on your partition key. As I don't know what your PRIMARY KEY definition is, I'm going to guess and use users_id as the partition key:

SELECT * FROM cmuser.users
WHERE token(users_id) > 8961648479018332584
  AND token(users_id) <= -9023369133424793632;

Once that completes, move to the next token range:

SELECT * FROM cmuser.users
WHERE token(users_id) > -9023369133424793632
  AND token(users_id) <= -7946127339777435347;

Breaking your query down like this helps to ensure that it's only reading from a single node at a time. That should allow the query to read the data sequentially from the cluster (and from the disk), without worrying about timeouts.
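
To drive those range queries from Go, here is a rough gocql sketch. It assumes the cmuser.users table and the users_id partition key guessed above, and that you've copied the ring tokens from nodetool ring into a sorted slice (the token values below are placeholders); it's an illustration, not a drop-in migration tool:

package main

import (
    "fmt"
    "log"

    "github.com/gocql/gocql"
)

func main() {
    cluster := gocql.NewCluster("our-cass")
    cass, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer cass.Close()

    // Tokens taken from `nodetool ring`, sorted ascending.
    // Replace these placeholders with your cluster's actual tokens.
    tokens := []int64{-9023369133424793632, -7946127339777435347, -7847456805881540087}

    // Query one token range at a time so each request reads from a single node.
    // Note: the range that wraps from the highest token back to the lowest
    // (like the first example query above) also has to be covered.
    for i := 0; i < len(tokens)-1; i++ {
        iter := cass.Query(
            `SELECT * FROM cmuser.users WHERE token(users_id) > ? AND token(users_id) <= ?`,
            tokens[i], tokens[i+1],
        ).PageSize(5000).Iter()

        row := map[string]interface{}{}
        for iter.MapScan(row) {
            // write the row to the target DB here
            fmt.Println(row)
            row = map[string]interface{}{}
        }
        if err := iter.Close(); err != nil {
            log.Fatal(err)
        }
    }
}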

Upvotes: 2
