Reputation: 31962
I want to calculate the standard deviation between page views on my site. I'd like to do this using pure MySQL - without querying the whole table to the webserver - and return a single number to the PHP code for further use. Each page view is stored as a visitor_id
- page_id
- visit_count
trio as per the following schema:
+============+=========+=============+
| visitor_id | page_id | visit_count |
+============+=========+=============+
| 1 | 2 | 7 |
+------------+---------+-------------+
| 2 | 2 | 4 |
+------------+---------+-------------+
| 1 | 1 | 17 |
+------------+---------+-------------+
| 3 | 2 | 12 |
+------------+---------+-------------+
| 1 | 3 | 639478 |
+------------+---------+-------------+
| 2 | 1 | 6 |
+------------+---------+-------------+
page_id
refers to a PRIMARY_KEY in the pages table, visitor_id
refers to a PRIMARY_KEY in the visitors table. The above table's primary key is the visitor_id
- page_id
pair, since the same page seen by the same visitor is recorded by increasing the visit_count
of the corresponding row, instead of creating a new one.
Before calculating standard deviation, the entries should be grouped together by page_id
, their visit_count
summed (visitor_id can be ignored here), so, effectively, I want to calculate the deviation of the following:
+=========+=============+
| page_id | visit_count |
+=========+=============+
| 2 | 23 |
+---------+-------------+
| 1 | 23 |
+---------+-------------+
| 3 | 639478 |
+---------+-------------+
I'm aware of the possible PHP solutions, but I'm interested in a MySQL one.
Upvotes: 0
Views: 57
Reputation: 1270301
If you want the standard deviation for each page (i.e., the visitors are the population):
select page_id, sum(visit_count) as visit_count, std(visit_count) as visit_std
from table1
group by page_id;
If you want the standard deviation over the pages:
select std(visit_count) as page_std
from (select page_id, sum(visit_count) as visit_count
from table1
group by page_id
) t;
Upvotes: 1
Reputation: 2434
You could create a new table that stores timestamp + current views so you can view a history of changes in views. You'd be able to check the last two timestamped entries and how much the difference is between the two as well as a whole bunch of other stuff you haven't even thought of yet. Like graphs. Or pie charts showing activity increases per week day. Mmmm pie.
Upvotes: 0