Reputation: 3347
I'm not a pro in SQL at all :) Having a very critical performance issue. Here is the info directly related to problem.
I have 2 tables in my DB- table condos
and table goods
.
table condos
have the fields:
table items:
I have 1000+ entities in condos
table and 1000+ in items
table.
The problem is how i perform items search
currently it is:
For example, i want to get all the items for city = Sydney
SELECT condos.condo_id FROM public.condos WHERE city = 'Sydney'
SELECT * FROM public.items WHERE item.condo_id = ?
for each condo_id
i get in step 1.The issue is that once i get 1000+ entities in condos
table, the request is performed 1000+ times for each condo_id
belongs to 'Sydney'. And the execution of this request takes more then a 2 minutes which is a critical performance issue.
So, the questions is:
What is the best way for me to perform such search ? should i put a 1000+ id's in single WHERE
request? or?
For add info, i use PostgreSQL 9.4 and Spring MVC.
Upvotes: 2
Views: 565
Reputation: 9492
Essentially what you need is to eliminate the N+1 query and at the same time ensure that your City field is indexed. You have 3 mechanisms to go. One is already stated in one of the other answers you have received this is the SUBSELECT approach. Beyond this approach you have another two.
You can use what you have stated :
SELECT condos.condo_id FROM public.condos WHERE city = 'Sydney'
SELECT *
FROM public.items
WHERE items.condo_id IN (up to 1000 ids here)
the reason why I am stating up to 1000 is because some SQL providers have limitations.
You also can do join as a way to eliminate the N+1 selects
SELECT *
FROM public.items join public.condos on items.condo_id=condos.condo_id and condos.city='Sydney'
Now what is the difference in between the 3 queries.
Pros of Subselect query is that you get everything at once. The Cons is that if you have too many elements the performance may suffer:
Pros of simple In clause. Effectivly solves the N+1 problem, Cons may lead to some extra queries compared to the Subselect
Joined query pros, you can initialize in one go both Condo and Item. Cons leads to some data duplication on Condo side
If we have a look into a framework like Hibernate, we can find there that in most of the cases as a fetch strategy is used either Joined either IN strategies. Subselect is used rarely.
Also if you have critical performance you may consider reading everything In Memory and serving it from there. Judging from the content of these two tables it should be fairly easy to just upload it into a Map.
Effectively everything that solves your N+1 query problem is a solution in your case if we are talking of just 2 times 1000 queries. All three options are solutions.
Upvotes: 1
Reputation: 11619
Use a table join
to perform a query such that you do not need to perform a additional query. In your case you can join condos
and items
by condo_id
which is something like:
SELECT i.*
FROM public.items i join public.condos c on i.condo_id = c.condo_id
WHERE c.city = 'Sydney'
Note that performance tuning is a board topic. It can varied from environment to environment, depends on how you structure the data in table and how you organize the data in your code.
Here is some other suggestion that may also help:
Try to add index to the field where you use sorting and searching, e.g. city
in condos
and condo_id
in items
. There is a good answer to explain how indexing work.
I also recommend you to perform EXPLAIN
to devises a query plan for your query whether there is full table search that may cause performance issue.
Hope this can help.
Upvotes: 2
Reputation: 311393
You could use the first query as a subquery in an in
operator in the second query:
SELECT *
FROM public.items
WHERE item.condo_id IN (SELECT condos.condo_id
FROM public.condos
WHERE city = 'Sydney')
Upvotes: 0