Reputation: 3094
I know the fullVisitorId, visitID, visitNumber, and totals.pageviews fields, which together describe a session.
I'm trying to write an efficient query that looks only at new users and produces the following table:
user | #visits | #total_pages | #pages/visit
Sample DB that I just came up with:
Row  fullVisitorId  visitID  visitNumber  page_views
1    27             15       1            10
2    27             20       2            15
3    27             1921     3            5
4    654            15       1            28
5    575            16       3            1
6    547            16       1            4
The user column is populated with fullVisitorId, and #visits is the number of visits.
Since I'm only interested in new users, I first wrote a query to select them, i.e. the users that have a visit with visitNumber = 1:
SELECT fullVisitorId FROM (SELECT fullVisitorId, visitID, visitNumber, page_views FROM [table] WHERE visitNumber = 1) GROUP BY fullVisitorId
With this, the first column of my table, user, is complete.
For the second column, #visits, I need the number of visits, which is simply the count of unique visitIDs that belong to the user. It should return 3 for user 27 and 1 for users 654 and 547 (note that user 575 is excluded). I would first remove the old users (only new ones are required) with something like the following, and then count the number of visits:
SELECT * FROM [table] WHERE fullVisitorId IN (1stQuery)
The third column would aggregate the number of pages viewed and return 30 (10 + 15 + 5) for user 27, and so on.
The last column is simply the division of #total_pages and #visits.
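The steps described so far (filter to new users with the first query, then count visits, sum pages, and divide) can be sketched on the sample data. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for BigQuery, and the table name sessions is made up for the example.

```python
import sqlite3

# Sample rows from the question:
# (fullVisitorId, visitID, visitNumber, page_views)
rows = [
    (27, 15, 1, 10),
    (27, 20, 2, 15),
    (27, 1921, 3, 5),
    (654, 15, 1, 28),
    (575, 16, 3, 1),
    (547, 16, 1, 4),
]

# In-memory SQLite database; "sessions" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "fullVisitorId INTEGER, visitID INTEGER, "
    "visitNumber INTEGER, page_views INTEGER)"
)
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", rows)

# Keep only users that have a visit with visitNumber = 1 (the first query),
# then count distinct visits, sum page views, and divide the two.
query = """
SELECT fullVisitorId,
       COUNT(DISTINCT visitID) AS visits,
       SUM(page_views) AS total_pages,
       1.0 * SUM(page_views) / COUNT(DISTINCT visitID) AS pages_per_visit
FROM sessions
WHERE fullVisitorId IN (
    SELECT fullVisitorId FROM sessions WHERE visitNumber = 1
)
GROUP BY fullVisitorId
ORDER BY fullVisitorId
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

User 575 never has visitNumber = 1 in the sample, so the IN filter drops it before the aggregation runs.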
Ideally, at the end I would get:
user | #visits | #total_pages | #pages/visit
27   | 3       | 30           | 10
654  | 1       | 28           | 28
547  | 1       | 4            | 4
Upvotes: 0
Views: 2201
Reputation: 1271013
I would just do this with aggregation:
SELECT fullVisitorId,
       COUNT(*) AS NumVisits,
       SUM(page_views) AS TotalPages,
       AVG(page_views) AS AvgPages
FROM [table]
GROUP BY fullVisitorId
HAVING MIN(visitNumber) = 1;
I don't think you gain anything by making the query more complicated by filtering before the aggregation.
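As a quick sanity check, the aggregation above can be run against the sample rows from the question. This is a rough sketch, not BigQuery itself: it assumes SQLite (via Python's sqlite3 module) behaves the same way for this query, and the table name sessions, the column name page_views, and the helper new_user_stats are chosen only for the example.

```python
import sqlite3


def new_user_stats(rows):
    """Aggregate per user, keeping only users whose first visit is present.

    rows: iterable of (fullVisitorId, visitID, visitNumber, page_views).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "fullVisitorId INTEGER, visitID INTEGER, "
        "visitNumber INTEGER, page_views INTEGER)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", rows)
    # HAVING filters whole groups after aggregation: a user is kept only
    # if the smallest visitNumber among that user's rows is 1.
    query = """
    SELECT fullVisitorId,
           COUNT(*) AS NumVisits,
           SUM(page_views) AS TotalPages,
           AVG(page_views) AS AvgPages
    FROM sessions
    GROUP BY fullVisitorId
    HAVING MIN(visitNumber) = 1
    ORDER BY fullVisitorId
    """
    stats = conn.execute(query).fetchall()
    conn.close()
    return stats


# Sample rows from the question.
sample = [
    (27, 15, 1, 10),
    (27, 20, 2, 15),
    (27, 1921, 3, 5),
    (654, 15, 1, 28),
    (575, 16, 3, 1),
    (547, 16, 1, 4),
]

for row in new_user_stats(sample):
    print(row)
```

Note that COUNT(*) equals the number of visits only when the table holds exactly one row per visit, as in the sample; otherwise COUNT(DISTINCT visitID) would be the safer choice.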
Upvotes: 1