Reputation: 207
I have a table with columns:
id , conversation_id , session_id , user_id , message , created_at
Every time a user starts a conversation with an employee, a new session starts (with a different session number). All messages between all employees and users are stored in this table. The created_at column is a timestamp. I need to filter sessions by employee number and calculate the average response time between the first message a user sends and the first message sent back by a specific employee, across all sessions, disregarding outlying data where either the customer or the employee did not reply (only one user in the session).
I know this is complicated, but please help!
In this example, in the user_id column, 4 is the employee (keep in mind there are other employees). Every time a new conversation starts, the session_id changes. I have to go through each session for a specific employee, take the timestamp of the first message sent by the customer and of the first message sent by the employee, take the difference, sum all the differences and then take the average, while making sure that the session actually contains two users (filtering out outlying data).
So far, I've come up with this:
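The steps described above can be sketched in plain Python before turning them into SQL. This is only an illustration of the intended calculation: the sample rows, the timestamp format and the function name `average_first_response` are assumptions, not part of the real table.

```python
from datetime import datetime

# Hypothetical sample rows: (session_id, user_id, created_at). Not real data.
ROWS = [
    (1, 1028, "2015-06-01 10:00:00"),  # customer opens session 1
    (1, 4,    "2015-06-01 10:00:30"),  # employee 4 replies 30 s later
    (1, 1028, "2015-06-01 10:01:00"),  # later messages are ignored
    (2, 1029, "2015-06-01 11:00:00"),  # customer opens session 2
    (2, 4,    "2015-06-01 11:01:30"),  # employee 4 replies 90 s later
    (3, 1030, "2015-06-01 12:00:00"),  # nobody replied: session is skipped
    (4, 1031, "2015-06-01 13:00:00"),
    (4, 5,    "2015-06-01 13:00:10"),  # handled by another employee
]


def average_first_response(rows, employee_id):
    """Mean seconds between the first customer message and the first reply
    by employee_id, over sessions in which both of them wrote something."""
    first_customer = {}  # session_id -> earliest timestamp from anyone else
    first_employee = {}  # session_id -> earliest timestamp from employee_id
    for session_id, user_id, created_at in rows:
        ts = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
        bucket = first_employee if user_id == employee_id else first_customer
        if session_id not in bucket or ts < bucket[session_id]:
            bucket[session_id] = ts
    # Keep only sessions that contain both the employee and another user.
    diffs = [
        (first_employee[s] - first_customer[s]).total_seconds()
        for s in first_employee
        if s in first_customer
    ]
    return sum(diffs) / len(diffs) if diffs else None


print(average_first_response(ROWS, 4))
```

Note that this sketch does not discard sessions in which the employee wrote first; those would show up as negative differences and would need an extra filter.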
SELECT * FROM messages WHERE session_id IN ( SELECT session_id FROM messages WHERE user_id =4 ) GROUP BY session_id, user_id
to get the first message from each customer and employee (it gives something like this).
So, in this specific example, I would omit line 41040, because its session contains only 1 person (column 3, id 1028) and is considered outlying data.
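One possible way to find the sessions worth keeping is to group by session and require both sides to be present. The sketch below uses Python's sqlite3 module with an in-memory table so it can be run as-is; the table is cut down to three columns and the rows are made up for illustration. The same HAVING conditions should also be valid in MySQL, where a comparison likewise evaluates to 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, session_id INTEGER, user_id INTEGER, created_at TEXT)"
)
# Hypothetical rows: only session 10 contains both employee 4 and a customer.
conn.executemany(
    "INSERT INTO messages (session_id, user_id, created_at) VALUES (?, ?, ?)",
    [
        (10, 1028, "2015-06-01 10:00:00"),
        (10, 4,    "2015-06-01 10:00:30"),
        (11, 1028, "2015-06-01 11:00:00"),  # customer only
        (12, 4,    "2015-06-01 12:00:00"),  # employee only
        (13, 1031, "2015-06-01 13:00:00"),
        (13, 5,    "2015-06-01 13:00:10"),  # a different employee
    ],
)

two_sided_sessions = [
    row[0]
    for row in conn.execute(
        """
        SELECT session_id
        FROM messages
        GROUP BY session_id
        HAVING SUM(user_id = 4) > 0    -- employee 4 wrote at least once
           AND SUM(user_id <> 4) > 0   -- someone else wrote at least once
        ORDER BY session_id
        """
    )
]
print(two_sided_sessions)
```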
Upvotes: 0
Views: 2627
Reputation: 207
I'm actually appalled by some of the comments... StackOverflow is meant to be a community for helping others. Why bother even taking up comment space if you're going to complain about my punctuation or give a vague, useless answer?
Anyway, I figured it out.
Basically, I joined the same table to itself, but only queried the necessary data. In the first subquery, I selected the employee's messages and grouped them by session number. In the second, I followed the same procedure but extracted only the messages from the users. Joining the two on the session id automatically omits any session where either the user or the employee is not present. In my case the GROUP BY returned the first row of each group (I didn't have to manipulate it, because I was actually looking for the first message in the session), although MySQL does not guarantee which row it picks for non-aggregated columns, and this form only runs when ONLY_FULL_GROUP_BY is disabled. I then took the average of the difference between the message timestamps of the user and the employee. In this specific situation, the number 4 is the employee number. Also, the HAVING AVG_RESP > 0 was necessary in this situation to remove outlying data when tests are performed. Here is what the query looks like:
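SELECT AVG(AVG_RESP)
FROM (
    -- seconds between the customer's message and the employee's reply, per session
    SELECT TIME_TO_SEC(TIMEDIFF(t.created_at, u.created_at)) AS AVG_RESP
    FROM (
        -- one row per session, written by employee 4
        SELECT * FROM messages
        WHERE session_id IN (
            SELECT session_id FROM messages
            WHERE user_id = 4) AND user_id = 4
        GROUP BY session_id
    ) AS t
    JOIN (
        -- one row per session, written by anyone else in employee 4's sessions
        SELECT * FROM messages
        WHERE session_id IN (
            SELECT session_id FROM messages
            WHERE user_id = 4) AND user_id != 4
        GROUP BY session_id
    ) AS u
    ON t.session_id = u.session_id
    GROUP BY t.session_id
    -- drop zero or negative differences
    HAVING AVG_RESP > 0
) AS ar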
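For anyone who cannot rely on GROUP BY handing back the first row, a variant that asks for MIN(created_at) explicitly might look like the sketch below. It is not the query above. It is a rewrite under assumptions, run through Python's sqlite3 module so it is self-contained: the rows are made up, and SQLite's strftime('%s', ...) stands in for MySQL's TIME_TO_SEC(TIMEDIFF(...)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (session_id INTEGER, user_id INTEGER, created_at TEXT)"
)
# Hypothetical rows, not real data.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, 1028, "2015-06-01 10:00:00"),
        (1, 4,    "2015-06-01 10:00:30"),  # reply after 30 s
        (2, 1029, "2015-06-01 11:00:00"),
        (2, 4,    "2015-06-01 11:01:30"),  # reply after 90 s
        (3, 1030, "2015-06-01 12:00:00"),  # no reply: dropped by the join
    ],
)

QUERY = """
SELECT AVG(resp_seconds)
FROM (
    SELECT CAST(strftime('%s', e.first_reply) AS INTEGER)
         - CAST(strftime('%s', c.first_message) AS INTEGER) AS resp_seconds
    FROM (
        -- earliest message by the employee in each session
        SELECT session_id, MIN(created_at) AS first_reply
        FROM messages
        WHERE user_id = :employee
        GROUP BY session_id
    ) AS e
    JOIN (
        -- earliest message by anyone else in each session
        SELECT session_id, MIN(created_at) AS first_message
        FROM messages
        WHERE user_id <> :employee
        GROUP BY session_id
    ) AS c ON c.session_id = e.session_id
) AS diffs
WHERE resp_seconds > 0  -- same role as HAVING AVG_RESP > 0
"""

avg_seconds = conn.execute(QUERY, {"employee": 4}).fetchone()[0]
print(avg_seconds)
```

The inner join still drops sessions where one side is missing, so the IN (...) subquery is no longer needed in this form.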
Hopefully this helps someone in the future, unlike the people who leave ridiculous, useless comments.
Upvotes: 1