Reputation: 1473
I have a basic analytics mysql database table which keeps track of all ipaddresses and urls visited on a user's visit and the time they visited. There are a large number of inserts (millions per day).
After a few days, running a query on the table to try to find out how many users visited on a particular day takes a very long time.
Should I add an index to the table? Will it recreate the index after every insert, and is it worth it?
Or is there a better way to speed up my analytics queries?
This is probably a common situation (everyone has logs). What is the best way to maintain this table?
Upvotes: 7
Views: 2581
Reputation: 13506
+1 for the INSERT DELAYED in another answer.
After midnight, you could store the total nr of visitors in another table like
insert into visitor_cache
select date, count(1)
from log
where date < today
group by date
;
Afterwards you could delete the records from your log table.
This way you only have to use the log table for a running total of the current day.
Upvotes: 1
Reputation: 615
If possible, it is better to update totals rather then inserting 10000000 records. For instance, you can have one record per day with the amount of users that day. This will make it much more performant to read out than an aggregate. Even if you want the statistical data for a year, you're only aggregating over 365 records instead of 300 million.
If you must keep them separately, you could try only storing the recent ones as records, and the older ones as cumulative records, cleaning up your database every now and then. If you really need a LOT of statistical data, you should look into other types of databases (data warehouses) that were designed specifically for this purpose.
Upvotes: 1
Reputation: 47321
MYSQL 5.0 introdcue Archive Storage Engine,
I believe that answer most your question
Should I add an index to the table
- yes, if your want to perform search
Will it recreate the index after every insert
- is handle internally by mysql
worth it
- depends, adding index will slow-down database write operation a bit (depending how many index and record length)
Or is there a better way to speed up my analytic queries
- if you just want to perform simple calculation, you can consider split the data daily (each table to hold a logging for each day). Or you can prepare some sort of query cache warm-up script
Upvotes: 1
Reputation: 6572
Any table that will later be queried should use indexes. Look at INSERT DELAYED which returns quickly and is designed for logging:-
http://dev.mysql.com/doc/refman/5.1/en/insert-delayed.html
This is probably the best option for log tables, although you could also look at the Archive storage engine, but that is more concerned with space than lookups:-
http://dev.mysql.com/doc/refman/5.1/en/archive-storage-engine.html
Upvotes: 3
Reputation: 33678
It will not recreate it but it will update the index. Yes, that takes some time, though not much more than a simple table update.
Anyway, if you are going to select from the tables with WHERE, you have no other choice than to create an index, otherwise those selects will be painfully slow.
Upvotes: 1