Reputation: 127
I have a table which contains two columns where I'm checking out for duplicates.I have written code for duplicates and it is
DbCommand cmd = db.GetSqlStringCommand("SELECT MyTable.* FROM MyTable. INNER JOIN(SELECT MainUrl, COUNT(*) as counter FROM MyTable GROUP BY MainUrl HAVING COUNT(*) >1) t ON t.MainUrl=MyTable.MainUrl");
int count = db.ExecuteNonQuery(cmd);
DataSet ds = db.ExecuteDataSet(cmd);
return ds;
Whereas in my columns I have Amazon.com,http://Amazon.com,https://Amazon.com.I want even these two duplicates.I want to return all the columns in my table with those duplicates.But from my code I can't achieve this,Can anyone help me out from this
Upvotes: 4
Views: 77
Reputation: 11
Considering one field. We can use this:
SELECT field, Count(*) FROM table
GROUP BY field
HAVING Count(*) > 1
ORDER BY Count /*optional DESC*/
Upvotes: 0
Reputation: 31879
Using CHARINDEX
and SUBSTRING
:
It considers 'amazon.com'
, http://amazon.com'
, and https://amazon.com'
as duplicate.
WITH MyTable(OriginalURL) AS(
SELECT 'http://Amazon.com' UNION ALL
SELECT 'https://Amazon.com' UNION ALL
SELECT 'Amazon.com' UNION ALL
SELECT 'http://Stackoverflow.com' UNION ALL
SELECT 'Stackoverflow.com' UNION ALL
SELECT 'http://google.com'
)
SELECT
CASE
WHEN CHARINDEX('http://', OriginalURL, 1) > 0 THEN SUBSTRING(OriginalURL, 8, LEN(OriginalURL) - 7)
WHEN CHARINDEX('https://', OriginalURL, 1) > 0 THEN SUBSTRING(OriginalURL, 9, LEN(OriginalURL) - 8)
ELSE OriginalURL
END AS OriginalURL
, COUNT(*) AS DupeCount
FROM MyTable
GROUP BY
CASE
WHEN CHARINDEX('http://', OriginalURL, 1) > 0 THEN SUBSTRING(OriginalURL, 8, LEN(OriginalURL) - 7)
WHEN CHARINDEX('https://', OriginalURL, 1) > 0 THEN SUBSTRING(OriginalURL, 9, LEN(OriginalURL) - 8)
ELSE OriginalURL
END
HAVING COUNT(*) > 1
Upvotes: 1