Reputation: 450
I have a table that contains user event data. Some entries are revenue events and some entries are load events. The load events have url's that contain URL parameters and both revenue and load events have session_id and user_id data attached. I wrote a query that will pull the user and session IDs from this table based off of URL parameters on load events for the past 45 days and I would like to then look for revenue events with matching user and session IDs and total the amount for them. Here is my first query:
SELECT user_id, session_id
FROM events
WHERE event_type = 'load'
AND client_id = 1234
AND calling_url LIKE '%utm_source=so%'
AND server_stamp BETWEEN DATE_SUB(NOW(), INTERVAL ? DAY) AND NOW()
A query that would get revenue from a single user would be something like this:
SELECT SUM(revenue_subtotal)
FROM events
WHERE event_type = 'revenue'
AND client_id = 1234
AND user_id = 'thisUUID'
AND session_id = 'thisSID'
So I am trying to find a way to link these two queries to sum up the revenue from all user/session ID combos returned from the first query directly in SQL (all in one full query).
Unfortunately this table has many columns so giving the full table definition and full sample data is a bit difficult to portray on SO but I think I have spelled it out well enough. Please let me know though if any more information is needed or any idea of how to accomplish this. As of right now the only way I have is to get the results of the first query in PHP then loop through a do a new query for each row returned, but this can easily get into the thousands so I'm trying to keep in all in SQL if at all possible.
Upvotes: 1
Views: 292
Reputation: 147196
Without sample data it's hard to be 100% certain but you should be able to do a self-join on the client_id
, user_id
and session_id
to get the data you want:
SELECT e1.user_id, e1.session_id, COALESCE(SUM(e2.revenue_subtotal), 0) AS revenue
FROM events e1
LEFT JOIN events e2 ON e2.client_id = e1.client_id
AND e2.session_id = e1.session_id
AND e2.user_id = e1.user_id
AND e2.event_type = 'revenue'
WHERE e1.event_type = 'load'
AND e1.client_id = 1234
AND e1.calling_url LIKE '%utm_source=so%'
AND e1.server_stamp BETWEEN DATE_SUB(NOW(), INTERVAL ? DAY) AND NOW()
GROUP BY e1.user_id, e1.session_id
Note we use a LEFT JOIN
so we can still get a result when there are no revenue
events, and COALESCE
to make the output 0 in those cases.
Note also that you need a GROUP BY
clause on this query to separate the results for each event.
Upvotes: 4