John Nolt
John Nolt

Reputation: 21

Given a set of values, what SQL query will find the row in my db with the most of those values?

I have a simple database that uses a form entry to determine what file to display. But almost none of the form fields are required, and many of my database entries are variations on a similar theme.

How do I craft an SQL query that will return the entry in my data base that has columns most closely matching the entered form values?

For example here is an example of my database:

Filename    Val1    Val2    Val3    Val4    Val5
one         red     blue                    yellow
two         red     blue
three       red     blue    green           yellow
four        red     blue    green   violet

Values coming in from my form:

Val1: red
Val2: blue
Val3: blue
Val4: red
Val5: yellow

Is there a single SQL query will return filename "one", because it has the most columns matching a value in the form entry, and no columns that are something OTHER than an entered value? If there isn't, how do use the form entries to find filename "one"?

I'm sure there is something really simple that I am just missing due to inexperience, but obviously "AND" won't work, and "OR" will return every row.

Upvotes: 2

Views: 55

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270513

I think you want something like this:

select t.*
from table t
order by ((col1 in (val1, val2, val3, val4, val5)) +
          (col2 in (val1, val2, val3, val4, val5)) +
          (col3 in (val1, val2, val3, val4, val5)) +
          (col4 in (val1, val2, val3, val4, val5)) +
          (col5 in (val1, val2, val3, val4, val5))
         ) desc
limit 1;

This counts the number of matching values and returns the row with the most matches. It might have unexpected behavior for the null values. If so, here is a simple fix:

select t.*
from table t
order by ((coalesce(col1, '') in (val1, val2, val3, val4, val5)) +
          (coalesce(col2, '') in (val1, val2, val3, val4, val5)) +
          (coalesce(col3, '') in (val1, val2, val3, val4, val5)) +
          (coalesce(col4, '') in (val1, val2, val3, val4, val5)) +
          (coalesce(col5, '') in (val1, val2, val3, val4, val5))
         ) desc
limit 1;

EDIT:

It looks like the values are "positional" so in can be replaced with =:

select t.*
from table t
order by ((coalesce(col1, '') = val1) +
          (coalesce(col2, '') = val2) +
          (coalesce(col3, '') = val3) +
          (coalesce(col4, '') = val4) +
          (coalesce(col5, '') = val5)
         ) desc
limit 1;

Upvotes: 1

Related Questions