Reputation: 23
I have two tables - countries (id, name) and users (id, name, country_id). Each user belongs to one country. I want to select 10 random users from the same random country. However, there are countries that have less than 10 users, so I can't use them. I need to select only from those countries, that have at least 10 users.
I can write something like this:
SELECT * FROM(
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
WHERE u.country_id =
(
SELECT *
FROM
(
SELECT c.id
FROM countries c
JOIN
(
SELECT users.country_id, COUNT(*) as cnt
FROM users
{MANY_OTHER_JOINS_AND_CONDITIONS}
GROUP BY users.country_id
) X ON X.country_id = c.id
WHERE X.cnt >= 10
ORDER BY DBMS_RANDOM.RANDOM
) Y
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM < 10
However, In my real scenario, I have more conditions and joins to other tables for determining which user is applicable. By using this query, I must have these conditions on two places - in query that actually selects data and in the count subquery.
Is there any way how to write query like this but without having those other conditions on two places (which is probably not good performance-wise)?
Upvotes: 2
Views: 132
Reputation: 146349
To get countries with more than 10 users:
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as rn
FROM users
GROUP BY users.country_id having count(*) > 10
Use this as a sub-query to choose a country and grab some users:
with ctry as (
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as ctry_rn
FROM users
GROUP BY users.country_id having count(*) > 10
)
, usr as (
select user_id
, row_number() over (order by dbms_random.value()) as usr_rn
from ctry
join users
on users.country_id = ctry.country_id
where ctry.ctry_rn = 1
)
select users.*
from usr
join users
on users.user_id = usr.user_id
where usr.usr_rn <= 10
/
This example ignores your {MANY_OTHER_JOINS_AND_CONDITIONS}
: please inject them back where you need them.
Upvotes: 0
Reputation: 416149
You can use a CTE for the user criteria to avoid repeating the logic and to allow the DB to cache that set once (though in my experience the DB isn't as good at that as it should be, so check your execution plan).
I'm more of a Sql Server guy than Oracle, and syntax is subtly different so this may need some tweaks yet, but try this:
WITH SafeUsers (ID, Name, country_id) As
(
--criteria for users only has to specified here
SELECT ID, Name, country_id
FROM users
WHERE ...
),
RandomCountry (ID) As
(
SELECT ID
FROM (
SELECT u.country_id AS ID
FROM SafeUsers u -- but we reference it HERE
GROUP BY u.country_id
HAVING COUNT(u.Id) >= 10
ORDER BY DBMS_RANDOM.RANDOM
) c
WHERE ROWNUM = 1
)
SELECT u.*
FROM (
SELECT s.*
FROM SafeUsers s -- and HERE
INNER JOIN RandomCountry r ON s.country_id = r.ID
ORDER BY DBMS_RANDOM.RANDOM
) u
WHERE ROWNUM <= 10
And by removing nesting and introducing names for each intermediate step, this query is suddenly much easier to read and maintain.
Upvotes: 5
Reputation: 133400
you could create a view for
create view user_with_many_cond as
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
ths looking to your query
You could use having instead of a where outside the query
The order by seems could be placed inside the inner query
so the filter for one row
SELECT * FROM(
SELECT *
FROM user_with_many_cond u
WHERE u.country_id =
(
SELECT c.id
FROM countries c
JOIN
(
SELECT users.country_id, COUNT(*) as cnt
FROM user_with_many_cond
GROUP BY users.country_id
HAVING cnt >=10
ORDER BY DBMS_RANDOM.RANDOM
) X ON X.country_id = c.id
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM < 10
Upvotes: 0