Reputation: 1018
There's a query I need to modify. What the query currently does is return search results (ads) based on Ad Title and Ad Description . If any of the search words are either found in ad title or ad description, it returns those results
I want to modify the query so that each ad appears in search results only once for a given ad title... So if there were 5 ads found with the same ad title for the given words in the search , it should return only 1 ad for that ad title...
$sql = "SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
COUNT(*) AS piccount, p.picfile,
scat.subcatname, cat.catid, cat.catname $xfieldsql
FROM t_ads a
INNER JOIN t_cities ct ON a.cityid = ct.cityid
INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
INNER JOIN t_cats cat ON scat.catid = cat.catid
LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
WHERE $where
AND $visibility_condn
AND (feat.adid IS NULL OR feat.featuredtill < NOW())
$loc_condn
GROUP BY a.adid
ORDER BY a.createdon DESC
LIMIT $offset, $ads_per_page";
Edit: $where contains the search expression... if regular expression search is turned on it uses regex otherwise not... $sqlsearch contains the search words that were input by the user...
if ($regex_search) {
$where = "(a.adtitle RLIKE '[[:<:]]{$searchsql}[[:>:]]' OR a.addesc RLIKE '[[:<:]]{$searchsql}[[:>:]]')";
} else {
$where = "(a.adtitle LIKE '$searchsql' OR a.addesc LIKE '$searchsql')";
Upvotes: 1
Views: 705
Reputation: 39394
The "proper" way to do this would be tackle the route cause by working out why the duplicates are appearing in the first place. It will be something to do with the JOINs but without looking at the data I'm unable to answer that. If, however you'd like a quick(ish) and dirty way to remove duplicates, could try something like below.
Disclaimer: This is completely untested so there's more likely to be a mistake or two in here - but hopefully no dealbreaker.
SELECT a2.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct2.cityname,
COUNT(*) AS piccount, p2.picfile,
scat2.subcatname, cat2.catid, cat2.catname $xfieldsql
FROM
(SELECT subq1.title, MIN(subq1.adid) AS adid
FROM
(SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
COUNT(*) AS piccount, p.picfile,
scat.subcatname, cat.catid, cat.catname
FROM t_ads a
INNER JOIN t_cities ct ON a.cityid = ct.cityid
INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
INNER JOIN t_cats cat ON scat.catid = cat.catid
LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
WHERE $where
AND $visibility_condn
AND (feat.adid IS NULL OR feat.featuredtill < NOW())
$loc_condn
GROUP BY a.adid) subq1
GROUP BY subq.title) subq2
INNER JOIN t_ads a2 ON a2.adid = subq2.adid
INNER JOIN t_cities ct2 ON a2.cityid = ct2.cityid
INNER JOIN t_subcats scat2 ON a2.subcatid = scat2.subcatid
INNER JOIN t_cats cat2 ON scat2.catid = cat2.catid
LEFT OUTER JOIN t_adxfields axf2 ON a2.adid = axf2.adid
LEFT OUTER JOIN t_adpics p2 ON a2.adid = p2.adid AND p2.isevent = '0'
LEFT OUTER JOIN t_featured feat2 ON a2.adid = feat2.adid AND feat2.adtype = 'A'
ORDER BY a2.createdon DESC
LIMIT $offset, $ads_per_page
This could be massively simplified and tidied up e.g. by removing some of the stuff from the subquery but am just giving the general idea to (hopefully) get you up and running...
subq2
simply groups by title and picks out an adid
from each group (chose to use MIN
here but could have used MAX
instead).
subq1
is the original query but with ordering and limits removed since these are applied by the outer query.
The outer query joins back on the de-duped IDs and joins back to the ads and other tables (giving them different aliases) in order to select the fields from your original query.
Upvotes: 3