Reputation: 391
SELECT *
FROM `eBayorders`
WHERE (`OrderIDAmazon` IS NULL
OR `OrderIDAmazon` = "null")
AND `Flag` = "True"
AND `TYPE` = "GROUP"
AND (`Carrier` IS NULL
OR `Carrier` = "null")
AND LEFT(`SKU`, 1) = "B"
AND datediff(now(), `TIME`) < 4
AND (`TrackingInfo` IS NULL
OR `TrackingInfo` = "null")
AND `STATUS` = "PROCESSING"
GROUP BY `Name`,
`SKU`
ORDER BY `TIME` ASC LIMIT 7
I am trying to make sure that none of the names and skus will show up in the same result. I am trying to group by name and then sku, however I ran into the problem where a result showed up that has the same name and different skus, which I dont want to happen. How can I fix this query to make sure that there is always distinct names and skus in the result set?!
For example say I have an Order:
Name: Ben Z, SKU : B000334, oldest
Name: Ben Z, SKU : B000333, second oldest
Name: Will, SKU: B000334, third oldest
Name: John, SKU: B000036, fourth oldest
The query should return only:
Name: Ben Z, SKU : B000334, oldest
Name: John, SKU: B000036, fourth oldest
This is because all of the Names should only have one entry in the set along with SKU.
Upvotes: 2
Views: 11244
Reputation: 150633
In my case, this happened because I had an error in my SQL code. When I posted the SQL into a repl, I got this error:
ERROR 1055 (42000): Expression #4 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'example.example.meta_value' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
But when I ran the same query in my PHP code using WordPress's $wpdb
, I did not get any error! Instead, I got rows with duplicate values. It turns out that WordPress disables some SQL error checking.
So have a look at your sql_mode setting, to see if it is preventing you from discovering the error in your query.
Upvotes: 0
Reputation: 2571
SELECT T1.*
FROM eBayorders T1
JOIN
( SELECT `Name`,
`SKU`,
max(`TIME`) AS MAX_TIME
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL OR `OrderIDAmazon` = "null") AND `Flag` = "True" AND `TYPE` = "GROUP" AND (`Carrier` IS NULL OR `Carrier` = "null") AND LEFT(`SKU`, 1) = "B" AND datediff(now(), `TIME`) < 4 AND (`TrackingInfo` IS NULL OR `TrackingInfo` = "null") AND `STATUS` = "PROCESSING"
GROUP BY `Name`,
`SKU`) AS dedupe ON T1.`Name` = dedupe.`Name`
AND T1.`SKU` = dedupe.`SKU`
AND T1.`Time` = dedupe.`MAX_TIME`
ORDER BY `TIME` ASC LIMIT 7
Your database platform should have complained because your original query had items in the select list which were not present in the group by (generally not allowed). The above should resolve it.
An even better option would be the following if your database supported window functions (MySQL doesn't, unfortunately):
SELECT *
FROM
( SELECT *,
row_number() over (partition BY `Name`, `SKU`
ORDER BY `TIME` ASC) AS dedupe_rank
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL OR `OrderIDAmazon` = "null") AND `Flag` = "True" AND `TYPE` = "GROUP" AND (`Carrier` IS NULL OR `Carrier` = "null") AND LEFT(`SKU`, 1) = "B" AND datediff(now(), `TIME`) < 4 AND (`TrackingInfo` IS NULL OR `TrackingInfo` = "null") AND `STATUS` = "PROCESSING" ) T
WHERE dedupe_rank = 1
ORDER BY T.`TIME` ASC LIMIT 7
Upvotes: 2
Reputation: 118
You are trying to obtain a result set which doesn't have repeats in either the SKU nor the Name column.
You might have to add a subquery
to your query, to accomplish that. The inner query would group by Name, and the Outer query would group by SKU, such that you won't have repeats in either column.
Try this :
SELECT *
FROM
(SELECT *
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL
OR `OrderIDAmazon` = "null")
AND `Flag` = "True"
AND `TYPE` = "GROUP"
AND (`Carrier` IS NULL
OR `Carrier` = "null")
AND LEFT(`SKU`, 1) = "B"
AND datediff(now(), `TIME`) < 4
AND (`TrackingInfo` IS NULL
OR `TrackingInfo` = "null")
AND `STATUS` = "PROCESSING"
GROUP BY Name)
GROUP BY `SKU`
ORDER BY `TIME` ASC LIMIT 7
Upvotes: 2
Reputation: 52240
With this approach you just filter out rows that do not contain the largest/latest value for TIME.
SELECT SKU, Name
FROM eBayOrders o
WHERE NOT EXISTS (SELECT 0 FROM eBayOrders WHERE Name = o.name and Time > o.Time)
GROUP BY SKU, Name
Note: If two records have exactly the same Name and Time values, you may still end up getting duplicates, because the logic you have specified does not provide any way to break up a tie.
Upvotes: 0
Reputation: 415735
There are two problems here.
The first is the ANSI standard says that if you have a GROUP BY
clause, the only things you can put in the SELECT
clause are items listed in GROUP BY
or items that use an aggregate function (SUM, COUNT, MAX, etc). The query in your question selects all the columns in the table, even those not in the GROUP BY
. If you have multiple records that match a group, the table doesn't know which record to use for those extra columns.
MySql is dumb about this. A sane database server would throw an error and refuse to run that query. Sql Server, Oracle and Postgresql will all do that. MySql will make a guess about which data you want. It's not usually a good idea to let your DB server make guesses about data.
But that doesn't explain the duplicates... just why the bad query runs at all. The reason you have duplicates is that you group on both Name
and SKU
. So, for example, for Ben Z
's record you want to see just the oldest SKU. But when you group on both Name
and SKU
, you get a seperate group for { Ben Z, B000334 }
and { Ben Z, B000333 }
... that's two rows for Ben Z, but it's what the query asked for, since SKU is also part of what determines a group.
If you only want to see one record per person, you need to group by just the person fields. This may mean building that part of the query first, to determine the base record set you need, and then JOINing to this original query as part of your full solution.
Upvotes: 4