Inv3r53
Inv3r53

Reputation: 2959

performance issue in a select query from a single table

I have a table as below

dbo.UserLogs

-------------------------------------
Id | UserId |Date | Name| P1 | Dirty
-------------------------------------

There can be several records per userId[even in millions]

I have clustered index on Date column and query this table very frequently in time ranges. The column 'Dirty' is non-nullable and can take either 0 or 1 only so I have no indexes on 'Dirty'

I have several millions of records in this table and in one particular case in my application i need to query this table to get all UserId that have at least one record that is marked dirty.

I tried this query - select distinct(UserId) from UserLogs where Dirty=1

I have 10 million records in total and this takes like 10min to run and i want this to run much faster than this.

[i am able to query this table on date column in less than a minute.]

Any comments/suggestion are welcome.

my env 64bit,sybase15.0.3,Linux

Upvotes: 3

Views: 204

Answers (2)

lexu
lexu

Reputation: 8849

my suggestion would be to reduce the amount of data that needs to be queried by "archiving" log entries to an archive table in suitable intervals.

You can still access all entries if you provide a union-view over current and archived log data, but accessing current logs would be much reduced.

Upvotes: 2

Guffa
Guffa

Reputation: 700262

Add an index containing both the UserId and Dirty fields. Put UserId before Dirty in the index as it has more unique values.

Upvotes: 1

Related Questions