dlu

Reputation: 91

Cassandra "truncate" does not empty tables

I encountered this problem recently. When I populated my tables (called event and index) to more than 1 million rows and then tried to truncate them for new tests, the tables were not empty after the truncation. CQL showed something like:

cqlsh> select count(*) from event limit 100000000;

 count
---------
 2033492


cqlsh> truncate event;
cqlsh> select count(*) from event limit 100000000;

 count
-------
    25

(1 rows)

cqlsh> select count(*) from event limit 100000000;

 count
-------
    27

(1 rows)

cqlsh> select count(*) from event limit 100000000;

 count
-------
    34

(1 rows)

cqlsh> select event_id, dateOf(time_token), unixTimestampOf(time_token), writetime(time_token) from event limit 100000000;

 event_id                             | dateOf(time_token)       | unixTimestampOf(time_token) | writetime(time_token)
--------------------------------------+--------------------------+-----------------------------+-----------------------
 567c4f2b-c86a-4663-a8ec-50f70d183b62 | 2014-07-22 22:29:04-0400 |               1406082544416 |      1406082544416000
 20a2f9e7-cdcb-4c2d-93e7-a646d0910e6b | 2014-07-22 15:12:29-0400 |               1406056349772 |      1406056349774000
 ... ...
 0d983cec-4ba5-4df8-ada8-eb347add57bf | 2014-07-22 22:20:53-0400 |               1406082053926 |      1406082053930000

(34 rows)

cqlsh>

After the "truncate" command, "select count(*)" returned quickly changing numbers that stabilized at 34. To be sure no other program was inserting records at the time, I ran a CQL statement showing that all the remaining records were created on July 22 or 23, which was 4 to 5 days earlier.

I tried the "truncate" command several times, and the results were the same.

This happened in 2 environments. The first environment is on my laptop, where I created a 3-instance Cassandra cluster using localhost IPs (127.0.0.2, 127.0.0.3, and 127.0.0.4); the second environment is a 3-node Cassandra cluster, with each node on a separate Linux CentOS 6.5 machine. I am using Cassandra 2.0.6.

Could someone help me to figure out what is going on? Thanks in advance.

Upvotes: 3

Views: 4971

Answers (3)

dlu

Reputation: 91

It is a bug in Cassandra 2.0.6, and it was fixed by at least 2.0.10.

Apparently, it is not a well-known (or well-publicized) bug, as many DataStax experts did not know about it either when I reproduced it for them at the Cassandra Summit 2014. They were also puzzled until the CQL architect dropped by and said he had fixed a mysterious bug in a recent release. He asked me to upgrade to 2.0.10, and the problem was gone. There are no more lingering records after "truncate" in 2.0.10.

Upvotes: 6

Richard

Reputation: 11100

Truncate doesn't truncate hints, so hints awaiting delivery will still get delivered. This could be causing your issue, especially if you inserted lots of rows quickly, which could have caused a few dropped mutations. However, hints are normally delivered in minutes, not days, so there must be something else wrong if hints are causing your issue. You can see when hints are delivered from the logs.

The safest way to delete all data is to drop the table and recreate under a different name (or in a different keyspace).
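In cqlsh, the drop-and-recreate approach might look like the sketch below. The column definitions here are hypothetical, since the question does not show the event table's schema; substitute your actual schema.

```
cqlsh> -- recreate under a new name with the same schema (columns shown are hypothetical)
cqlsh> CREATE TABLE event_v2 (
   ...     event_id uuid PRIMARY KEY,
   ...     time_token timeuuid
   ... );
cqlsh> DROP TABLE event;
```

After this, point the application at the new table name (or new keyspace), since clients referencing the old name will fail once it is dropped.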

Upvotes: 2

user3195649

Reputation: 437

One thing you absolutely have to make sure of before truncating is that all the nodes are up.

If you are using Astyanax:

    // keyspace is of type Keyspace
    keyspace.truncateColumnFamily(columnFamilyName);

Note: Even after truncating, you will have to manually delete all the table metadata.

Upvotes: 0
