FinalGirl321
FinalGirl321

Reputation: 115

find rows in database with 2 or more values of a column in common

I have a table called articletag for a blog database that says which article has which tag:

Art_Id  Tag_id
       1    3
       2    3
       3    3
       4    3
       1    1
       3    1
       4    1
       2    2
       5    5

another way to see this data is:

1, "blog", "first"
2, "blog", "second"
3, "blog", "first"
4, "blog", "first"
5, "seaside"

Tag_id 3 = 'blog' Tag_id 1 = 'first' Tag_id 5 = 'seaside' Tag_id 2 = 'second'

I am specifically looking for any articles with 2 or more words in common among EVERY article in the database and EVERY word tag (these tags are unique, btw) Looking at the denormalized example above the answer should be 1,3,4, as articles with 2 or more words in common. Those 3 articles clearly share "blog" and "first." The output should be

art_id

1 3 4

I have been trying for hours to get this right. The best I came up with was finding which tag_id shows up 2 or more times using:

Select a.* 
from articletag a 
  join (
    select t.tag_id 
    from articletag t 
    group by t.tag_id 
    having count(*) >=2 
  ) b on b.tag_id = a.tag_id 

But what I really want is which Article_id's have 2 or more words in common Can anyone please help?

Upvotes: 2

Views: 69

Answers (3)

Michał Turczyn
Michał Turczyn

Reputation: 37377

Try this:

declare @x table (art_id int, tag_id int)
insert into @x values 
(1,    3),
(2,    3),
(3,    3),
(4,    3),
(1,    1),
(3,    1),
(4,    1),
(2,    2),
(5,    5)

select distinct art_id from (
    select [x1].art_id,
           COUNT(*) over (partition by [x1].art_id,[x2].art_id) [cnt] 
    from @x [x1] join @x [x2]
    on [x1].tag_id = [x2].tag_id and [x1].art_id <> [x2].art_id
) a where cnt > 1

Upvotes: 0

Yogesh Sharma
Yogesh Sharma

Reputation: 50163

You could also use cte to find the Art_Ids which have same combination

;with cte as
(
    select Tag_id
    from table
    group by Tag_id
    having count(*) >= 2
)

select t.Art_Id 
from cte c inner join table t 
        on t.Tag_id = c.Tag_id
group by t.Art_Id 
having count(*) = (select count(1) from cte)

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521429

We can try doing a self join here:

SELECT t1.Art_id, t2.Art_id
FROM articletag t1
INNER JOIN articletag t2
    ON t2.Art_id > t1.Art_id AND
       t1.Tag_id = t2.Tag_id
GROUP BY
    t1.Art_id, t2.Art_id
HAVING
    COUNT(DISTINCT t1.Tag_id) >= 2;

enter image description here

Demo

Note that I am seeing 1-3, 1-4, and 3-4 as being the articles which have two or more tags in common.

Upvotes: 2

Related Questions