Reputation: 1894
In PostgreSQL, I would like to select a row based on some criteria, but if no row matches the criteria, I would like to return the first row. The table actually contains an ordinal column, so the task should be easier (the first row is the one with ordinal 0). For example:
SELECT street, zip, city
FROM address
WHERE street LIKE 'Test%' OR ord = 0
LIMIT 1;
But in this case, there is no way to guarantee the order of the records that match, and I have nothing to order them by. What would be the way to do this using a single SELECT
statement?
Upvotes: 3
Views: 1539
Reputation: 656241
I would like to select a row based on some criteria, but if no row matches the criteria, I would like to return the first row
But slow.
Logically, you don't actually need a WHERE
clause:
SELECT street, zip, city
FROM address
ORDER BY (street LIKE 'Test%') DESC NULLS LAST, ord
-- ORDER BY street NOT LIKE 'Test%', ord -- shorter, maybe less clear
LIMIT 1;
ORDER BY expr DESC
sorts null values first. If null values are possible, you'll want to add NULLS LAST
. Or invert the logic: NOT LIKE
instead of LIKE
, and use default sort order ASC
where null values sort last. See:
If street
is defined NOT NULL
this is irrelevant, but that has not been defined. (Exact table definition would clarify.)
More importantly, if multiple rows have a matching street
(which is to be expected), the returned row would be arbitrary and could change between calls - typically undesirable. Add an (arbitrary unless you know better) tiebreaker to make it deterministic. This query picks the row with the smallest ord
to be deterministic. (Below query uses the street name to break ties.)
This form is also flexible in that it does not rely on the existence of a row with ord = 0
. Instead, the row with the smallest ord
is picked.
For a big table and only left-anchored patterns (like you show) this COLLATE "C"
index radically improves performance:
CREATE INDEX address_street_collate_c_idx ON address (street COLLATE "C");
See:
Plus, I assume an index on address(ord)
. (The PK?)
This query makes use of the index(es):
( -- uses address_street_collate_c_idx
SELECT street, zip, city
FROM address
WHERE street LIKE 'Test%'
ORDER BY street COLLATE "C"
LIMIT 1 -- logically redundant, but helps in this query
)
UNION ALL
( -- uses index on (ord)
SELECT street, zip, city
FROM address
ORDER BY ord
LIMIT 1 -- logically redundant, but helps in this query
)
LIMIT 1
The 2nd query uses the street name as tiebreaker. It's cheapest to pick the fist hit from the index.
The second SELECT
of the UNION ALL
is never executed if the first SELECT
produces enough rows (in our case: 1). If you test with EXPLAIN ANALYZE
, you'll see (never executed)
in this case.
Related:
UNION ALL
In reply to Gordon's comment. The manual:
Multiple
UNION
operators in the sameSELECT
statement are evaluated left to right, unless otherwise indicated by parentheses.
Bold emphasis mine.
And LIMIT
makes Postgres stop evaluating as soon as enough rows are found. That's why you see (never executed)
in the output of EXPLAIN ANALYZE
.
If you add an outer ORDER BY
before the final LIMIT
this optimization is not possible. Then all rows have to be collected to see which might sort first.
This query with cheap index scans and LIMIT 1
will never break.
But exact guarantees are under investigation. See:
Upvotes: 3
Reputation: 1269443
You are on the right track. Just add an order by
:
SELECT street, zip, city
FROM address
WHERE street LIKE 'Test%' OR ord = 0
ORDER BY (CASE WHEN street LIKE 'Test%' THEN 1 ELSE 0 END) DESC
LIMIT 1;
Or, alternately:
ORDER BY ord DESC
Either of these will put the ord = 0
row last.
EDIT:
Erwin brings up a good point that from the perspective of index usage, an OR
in the WHERE
clause is not the best approach. I would modify my answer to be:
SELECT *
FROM ((SELECT street, zip, city
FROM address
WHERE street LIKE 'Test%'
LIMIT 1
)
UNION ALL
(SELECT street, zip, city
FROM address
WHERE ord = 0
LIMIT 1
)
) t
ORDER BY (CASE WHEN street LIKE 'Test%' THEN 1 ELSE 0 END) DESC
LIMIT 1;
This allows the query to make use of two indexes (street
and ord
). Note that this is really only because the LIKE
pattern does not start with a wildcard. If the LIKE
pattern starts with a wildcard, then this form of the query would still do a full table scan.
Upvotes: 5
Reputation: 35780
You can do the following:
SELECT street, zip, city
FROM address
WHERE (EXISTS(SELECT * FROM address WHERE street LIKE 'Test%') AND street LIKE 'Test%') OR
(NOT EXISTS(SELECT * FROM address WHERE street LIKE 'Test%') AND ord = 0)
Upvotes: 0
Reputation: 1585
How about something like this... (I'm not familiar with PostgreSQL so syntax might be slightly off)
SELECT street, zip, city, 1 as SortOrder
FROM address
WHERE street LIKE 'Test%'
--
union all
--
SELECT street, zip, city, 2 as SortOrder
FROM address
WHERE ord = 0
ORDER BY SortOrder
LIMIT 1;
Upvotes: 2