user1481769
user1481769

Reputation: 1

SQL - combining multiple results from same table

I have a large table (20,000,000+ rows) of this format:

id ref  feature val
1  a    pos     pc
2  a    gen     m
3  a    syll    Cv
4  b    pos     pc
5  b    L5      harry
6  c    syll    Cv
7  d    gen     m

How do I select a list of refs that satisfy multiple pairs of feature+val?

E.g. Select distinct ref where feature='pos' and val = 'pc', gives a,b Select distinct ref where feature = 'gen' and val= 'm' gives a,d

but how do I return the combined refs a,b and d, for those refs having feature='pos' and val = 'pc' with those refs having feature = 'gen' and val= 'm'?

I need to be able to extend the solution to many such pairs in the same search.

The search needs to be fast, so advice on indexes would be helpful as well.

Should I select from the same table N times, one for each feature+val pair? And if so, how?

Upvotes: 0

Views: 242

Answers (3)

Samy Vilar
Samy Vilar

Reputation: 11130

sqlite> CREATE TABLE t (id INT, ref VARCHAR, feature VARCHAR, val VARCHAR);
sqlite> 
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (1,  "a",    "pos",     "pc");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (2,  "a",    "gen",     "m");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (3,  "a",    "syll",    "Cv");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (4,  "b",    "pos",     "pc");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (5,  "b",    "L5",      "harry");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (6,  "c",    "syll",    "Cv");
sqlite> INSERT INTO t (id, ref, feature, val) VALUES (7,  "d",    "gen",     "m");
sqlite> 
sqlite> SELECT DISTINCT ref FROM t WHERE (feature = 'pos' AND val = 'pc') OR (feature = 'gen' and val= 'm');
a
b
d
sqlite> 

the running time should be linear, 20 million rows may seem big, but it shouldn't much of an issue with todays hardware and most implementations, indexing agains the filter types might improve performance, making sure your rows are fixed width may also have slight improvement.

If you need more configurations simply add them using OR well you get the idea, I would recommend testing it, and seeing how long the query takes, I seriously doubt it would be to long, though I don't know the system nor the implementation you are using, so I can't really say much.

Upvotes: 0

Lieven Keersmaekers
Lieven Keersmaekers

Reputation: 58441

Typically, you would use an OR statement to add those additional clauses

  SELECT DISTINCT ref
  FROM   YourTable
  WHERE  (feature = 'pos' AND val = 'pc')
         OR (feature = 'gen' AND val = 'm')
         ...

You might be able to simplify this a bit by using a persistent calculated column on concatenating both feature and val.

  SELECT DISTINCT ref
  FROM   YourTable
  WHERE  (featureval = 'pospc')
         OR (featureval = 'genm')

but the chaining of OR statements remain.

Indexes

Each statements performance has to be evaluated by looking at the execution plan but for given statement, a Covering Index on (feature, val, ref) wouldn't hurt.

Upvotes: 0

Sam Peacey
Sam Peacey

Reputation: 6034

You can select combinations like that using 'OR':

Select distinct ref 
from table
where ( feature='pos' and val = 'pc')
  or ( feature = 'gen' and val= 'm' )

If that's going to get too unwieldy, you probably want to move your feature / val combinations into a different table and do a join on that.

If you have a join table called join_table (now there's an inspired name!) with feature and val columns, you could do something like this:

select distinct ref
from table t
  inner join join_table j on t.feature = j.feature and t.val = j.val

Upvotes: 1

Related Questions