Tohid
Tohid

Reputation: 22523

SQL Intersection and Union

I have a table called words, consisting of three columns word(VARCHAR(16)), doc_id(INT), weight(DOUBLE).

Here is what I need to do, I have two queries:

SELECT doc_id, weight FROM words WHERE word = 'bla';

doc_id    weight
------    ------
1         0.14
2         0.61
3         0.32

and

SELECT doc_id, weight FROM words WHERE word = 'blabla';

doc_id    weight
------    ------
2         0.19
3         0.45
4         0.14

I need to get the intersection of the two on doc_id and select the lower weight value as the weight, i.e. I want the results to be:

doc_id    weight
------    ------
2         0.19
3         0.32

Is there a way to do that in a single query? Doing it in the program makes it damn slow!

I also need to get their UNION and select the higher weight value, i.e. I want the results to be:

doc_id    weight
------    ------
1         0.14
2         0.61
3         0.45
4         0.14

Keep in mind that the column word and doc_id are not unique, so one word can be assigned to many docs.

Upvotes: 1

Views: 735

Answers (5)

egrunin
egrunin

Reputation: 25053

Intersection:

SELECT doc_id, MIN(weight) as MinWeight FROM words
WHERE doc_id IN
    (SELECT doc_id FROM words WHERE word = 'bla')
AND doc_id IN
    (SELECT doc_id FROM words WHERE word = 'blabla')
GROUP BY doc_id

Union:

SELECT doc_id, MAX(weight) as MaxWeight FROM words
WHERE word IN ('bla', 'blabla')
GROUP BY doc_id

Upvotes: 0

goat
goat

Reputation: 31813

select w1.doc_id
     , least(w1.weight, w2.weight) weight
  from words w1
 inner 
  join words w2
    on w1.doc_id = w2.doc_id
 where w1.word = 'bla1'
   and w2.word = 'bla2'



select doc_id
     , max(weight) weight
  from words
 where word in ('blah1', 'blah2')
 group
    by doc_id

Upvotes: 1

Teja
Teja

Reputation: 13524

As per my understanding the below query should work for your req.

-- Lower Weight Query SELECT Z.doc_id,MIN(weight) as LOWER_WEIGHT FROM ( SELECT doc_id, weight FROM words WHERE word = 'bla' INTERSECT SELECT doc_id, weight FROM words WHERE word = 'blabla' )Z GROUP BY Z.DOC_ID;

-- Higher Weight Query

SELECT Z.doc_id,MAX(weight) as HIGHER_WEIGHT FROM ( SELECT doc_id, weight FROM words WHERE word = 'bla' UNION SELECT doc_id, weight FROM words WHERE word = 'blabla' )Z GROUP BY Z.DOC_ID;

Regards, Venk

Upvotes: 0

Andreas Wederbrand
Andreas Wederbrand

Reputation: 39951

For the intersect part it sounds like you want "the lowest weight for all doc_id where the doc_id has one row for the word 'bla' AND one row for the word 'blabla'". That can be found by

(untested)
select w1.doc_id, least(min(w1.weight), min(w2.weight)) as minweight 
  from words w1, words w2
 where w1.doc_id = w2.doc_id
   and w1.word = 'bla'
   and w2.word = 'blabla'
 group by w1.doc_id;

For the union part what you want is "the highest weight for all doc_id where the doc_id has one row for the word 'bla' OR one row for the word 'blabla'". That can be found by

(untested)
select doc_id, max(weight) as maxWeight
  from words
 where word in ('bla', 'blabla')
 group by doc_id;

Upvotes: 1

Lamak
Lamak

Reputation: 70638

I think that you want this:

SELECT doc_id, MIN(weight) MinWeight, MAX(weight) MaxWeight
FROM words
WHERE word IN ('bla','blabla')
GROUP BY doc_id

Upvotes: 0

Related Questions