Reputation: 171

Cassandra Query Timeout with small set of data

I am having a problem with Cassandra 2.1.17. I have a table with about 40k "rows" in it, and the one partition I am having trouble with has roughly 5k entries.

Table is:

create table billing (
    accountid uuid,
    date timeuuid,
    credit double,
    debit double,
    type text,
    primary key (accountid,date)
) with clustering order by (date desc);

There is a lot of inserting into and deleting from this table.

My problem is that the table seems to have become corrupted somehow: I can no longer select data past a certain point in a partition.

From cqlsh I can run something like this:

SELECT accountid,date,credit,debit,type FROM billing WHERE accountid=XXXXX-xxxx-xxxx-xxxxx... AND date < 3d466d80-189c-11e7-8a57-f33cbced2fc5 limit 2;

First I ran a select with a limit of 10000. Paging through the results works for around 5000 rows, then towards the end it gives a timeout error.

If I then take the second-to-last timeuuid and select with limit 2, it fails; limit 1 works.

If I use the last timeuuid with < and limit 1, it also fails.

I am not sure what is wrong, or how to diagnose and fix what happened.

I have tried a repair and forced a compaction, but the issue is still there.

Thank you for any help.

Upvotes: 2

Answers (3)

nevsv

Reputation: 2466

Start by running a manual compaction on your table.
You can also increase the read_request_timeout_in_ms parameter in the Cassandra config (cassandra.yaml).
Consider moving to the leveled compaction strategy if you have a lot of deletes and updates.

Upvotes: 4

Adrien Piquerez

Reputation: 1044

I think you have too many tombstones in this partition.

What is a tombstone?

To remember that a record has been deleted, Cassandra writes a special value called a "tombstone". Like any other value a tombstone has a TTL, but it is not compacted away as easily as other values. Cassandra keeps it longer to avoid inconsistencies such as deleted data reappearing.

How to watch tombstones?

nodetool cfstats gives you an idea of how many tombstones you have on average per slice.

How to fix the issue?

A tombstone is preserved for gc_grace_seconds, which is set per table. You have to reduce it and then run a major compaction to fix the issue.
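
To make the timing concrete, here is a minimal Python sketch (not Cassandra code) of when a tombstone becomes eligible to be dropped by compaction: the delete time plus gc_grace_seconds. The function name and the example delete time are made up for illustration; 864000 seconds (10 days) is the default gc_grace_seconds.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default: 10 days


def purgeable_at(deleted_at: datetime, gc_grace_seconds: int) -> datetime:
    """Earliest time at which compaction may drop a tombstone written at deleted_at."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


# Hypothetical delete time, for illustration only.
deleted_at = datetime(2017, 4, 3, 12, 0, tzinfo=timezone.utc)

with_default = purgeable_at(deleted_at, DEFAULT_GC_GRACE_SECONDS)
with_one_day = purgeable_at(deleted_at, 86400)

print(with_default.isoformat())  # 2017-04-13T12:00:00+00:00
print(with_one_day.isoformat())  # 2017-04-04T12:00:00+00:00
```

With the default, rows deleted today keep slowing reads for at least ten days, no matter how often you compact.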

Upvotes: 3

Marko Švaljek

Reputation: 2101

It looks to me like you are hitting a lot of tombstones when you do selects. The thing is, while they are there, Cassandra still has to go over them. Several things can produce them: TTLs on insert statements, a lot of deletes, inserting nulls, etc.
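
A toy Python model of this, which would also explain why limit 1 works while limit 2 times out. It is a deliberately simplified sketch, not Cassandra's actual read path, and the partition contents and the slice_read helper are hypothetical. The idea: a read scans the partition in clustering order until it has collected enough live rows, and every tombstone on the way costs work without contributing a result.

```python
from typing import List, Optional, Tuple

# One cell per clustering key, already in clustering order (newest first).
# A value of None stands for a tombstone (a deleted row).
Cell = Tuple[int, Optional[str]]


def slice_read(partition: List[Cell], before: int, limit: int) -> Tuple[List[Cell], int]:
    """Return up to `limit` live cells with key < `before`,
    plus the number of tombstones scanned along the way."""
    live: List[Cell] = []
    tombstones_scanned = 0
    for key, value in partition:
        if key >= before:
            continue  # outside the requested slice
        if value is None:
            tombstones_scanned += 1  # has to be read, yields nothing
            continue
        live.append((key, value))
        if len(live) == limit:
            break  # enough live rows, stop scanning
    return live, tombstones_scanned


# Hypothetical partition: 3 live rows followed by 10,000 deleted ones.
partition: List[Cell] = [(10002, "a"), (10001, "b"), (10000, "c")]
partition += [(k, None) for k in range(9999, -1, -1)]

rows_1, scanned_1 = slice_read(partition, before=10001, limit=1)
rows_2, scanned_2 = slice_read(partition, before=10001, limit=2)
print(len(rows_1), scanned_1)  # 1 0
print(len(rows_2), scanned_2)  # 1 10000
```

With limit 1 the read stops at the last live row. With limit 2 it keeps looking for a second live row and walks through every tombstone behind it.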

My bet would be that you need to adjust gc_grace_seconds on the table and run repairs more often. But be careful and don't set it too low (one round of repair has to finish before this time).
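
As a rough sanity check before lowering the value, something like the following Python sketch can help. It encodes a rule of thumb, not an official formula: the time between repair starts plus the time one repair round takes should stay below gc_grace_seconds. The function name and the weekly, six-hour schedule are assumptions for illustration.

```python
from datetime import timedelta


def repair_fits(gc_grace_seconds: int, repair_interval: timedelta, repair_duration: timedelta) -> bool:
    """Rule of thumb: a repair round started every repair_interval and taking
    repair_duration should complete within gc_grace_seconds."""
    worst_case = repair_interval + repair_duration
    return worst_case.total_seconds() < gc_grace_seconds


# Hypothetical schedule: repair starts weekly and takes about 6 hours.
weekly = timedelta(days=7)
six_hours = timedelta(hours=6)

ok_with_default = repair_fits(864000, weekly, six_hours)  # default: 10 days
ok_with_three_days = repair_fits(3 * 86400, weekly, six_hours)
print(ok_with_default, ok_with_three_days)  # True False
```

So with a weekly repair you could not safely drop gc_grace_seconds to three days without also repairing more often.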

It's all nicely explained here: https://opencredo.com/cassandra-tombstones-common-issues/

Upvotes: 2
