Reputation: 44
I'm making a MySQL script executed in PHP. So I have 3 tables.
Messages who contains at least 3 000 000 rows (userid, messageid, timestamp, received, opened, clicked, deliveryid)
Users (user(unique), profile, profile_actual_timestamp, last_delivery_id_sent)
events_clicked_data who contains at least 2 000 000 rows and detailed clicked event. This user click on this profile(like GAMES) on this message(deliveryId) (userid, profile, deliveryId, eventDate)
So here is sample data for messages and users tables.
CREATE TABLE messages
(`user_id` varchar(100), `message_id` int, `timestamp` datetime, `received` varchar(5), `opened` varchar(5), `clicked` varchar(5), `delivery_id` int);
INSERT INTO messages
(`user_id`, `message_id`, `timestamp`, `received`, `opened`, `clicked`, `delivery_id`)
VALUES
("BillyStuff", 12,'2016-05-16 00:00:00', 'true', 'true', 'true', 8),
("BillyStuff", 11,'2016-05-14 00:00:00', 'true', 'true', 'true' , 7),
("BillyStuff", 8,'2016-04-03 00:00:00', 'true', 'false', 'false' , 6),
("BillyStuff", 4,'2016-04-02 00:00:00', 'true', 'false', 'false', 5),
("JohnDoe", 15 ,'2016-05-16 00:00:00', 'true', 'true', 'false' , 4),
("JohnDoe", 13 ,'2016-05-14 00:00:00', 'true', 'true', 'true', 3),
("Donnie", 15 ,'2016-05-16 00:00:00', 'true', 'true', 'true' , 4),
("Donnie", 13 ,'2016-05-14 00:00:00', 'true', 'true', 'true', 3)
CREATE TABLE users
(`user_id` varchar(100), `profile` varchar(100), `profile_actual_timestamp` datetime, `last_delivery_id_sent` int);
INSERT INTO users
(`user_id`, `profile`, `profile_actual_timestamp`, `last_delivery_id_sent`)
VALUES
("BillyStuff", "Game", "2016-01-01 00:00:00", 1),
("JohnDoe", "Book", "2016-01-01 00:00:00", 1),
("Donnie", "Book", "2016-05-16 00:00:00", 4)
I want to get users with 2 messages clicked after timestamp (profile_actual_timestamp means last time it was updated) in profile.
In this case I only get BillyStuff because Donnie is already up to date if I check if profile_actual_timestamp.
After this, I need to check by deliveryId and user in events_clicked's table if same categories was clicked.
CREATE TABLE events_clicked_data
(`user_id` varchar(100), `profile` varchar(100), `deliveryId` int, `eventDate` datetime);
INSERT INTO users
(`user_id`, `profile`, `deliveryId`, `eventDate`) VALUES
("BillyStuff", "Book", 8,"2016-01-01 00:00:00"),
("BillyStuff", "Book", 7,"2016-01-01 00:00:00"),
("JohnDoe", "Book", 3,"2016-01-01 00:00:00"),
("Donnie", "Book", 4,"2016-05-16 00:00:00"),
("Donnie", "Game", 3,"2016-05-16 00:00:00")
In this case, i need to update BillyStuff's profile and change it to "Book" instead of "Game" because he clicked on the same categorie twice in his last messages after the last time he was updated (profile_actual_timestamp)
So its been a really good puzzle for me this week and Im wondering if you guys can help me with this one.
originalid = userid (not necessary original, depend on table)
e.name = name of profil clicked like game.
select originalid,
name
from (
select @g := if(@u = originalid, if (@p = name, @g, @g + 1), 1) as grp,
@u := originalid as originalid,
@p := name as name
from (
select u.originalid,
m.message_sendtime_timestamp,
e.name
from bienvenue_nouveau_client_dev u
inner join messages_nouveaux_clients m
on m.originalid = u.originalid
inner join events_clicked_data e
on e.originalId = u.originalid
and e.deliveryId = m.deliveryId
where m.message_sendtime_timestamp >= u.profil_actuel_timestamp
and m.clicked = 'TRUE'
limit 1000000000000000
order by u.originalid,
m.message_sendtime_timestamp desc
) alias
) alias2
where grp = 1
group by originalid, name
having count(*) > 1
Whatever I change I got an error like this:
Upvotes: 0
Views: 75
Reputation: 350667
The first query can be done as follows:
select u.*
from users u
inner join messages m
on m.user_id = u.user_id
where m.timestamp >= u.profile_actual_timestamp
and m.clicked = 'true'
group by u.user_id
having count(m.message_id) > 1
This second query will give you the users that used the same profile in their last two click events, if these events happened both later than the last update:
select user_id,
profile
from (
select @g := 0 + if(@u = user_id, if (@p = profile, @g, @g + 1), 1) as grp,
@u := user_id as user_id,
@p := profile as profile
from (
select u.user_id,
m.timestamp,
e.profile
from users u
inner join messages m
on m.user_id = u.user_id
left join events_clicked_data e
on e.user_id = u.user_id
and e.deliveryId = m.delivery_id
where m.timestamp >= u.profile_actual_timestamp
order by u.user_id,
m.timestamp desc
) alias
) alias2
where grp = 1
group by user_id, profile
having count(*) > 1
This query depends on variables, and is a bit risky, in that it must return the innermost results in the specified order, and that it must evaluate the middle select list (with variable assignments) in top-to-bottom order. This happens consistently, but in theory it is not guaranteed.
The (high) limit
in the inner query is there to make sure the order by
clause is applied, not to actually limit anything.
The variables @u and @p track the user_id and profile. Based on their previous values, the variable @g is calculated: it is reset to 1 whenever the user_id changed. Otherwise it is kept at the same value if also the profile didn't change, and it is incremented when the profile did change.
This way the grp values identify groups in which the profile is the same and uninterrupted in order of descending timestamp. The most recent group per used has number 1, which is the only one of interest in the outer query. The outer query then requires that this group 1 has more than one record (i.e. more than one occurrence of the same profile).
Upvotes: 1