ruhewo
ruhewo

Reputation: 107

Better Performance with SQL and Java Program

I have a code where I am getting the data from various sources and sorting and ordering them to be send to the user.

I am taking the data by firing a query which contains multiple joins to a list of DTO, then again I am firing another query which further contains multiple joins to the same list of DTO. then I am adding both the lists of DTOs to be presented to the user.

Query 1: Select * from TABLE1, TABLE2....

Query 2: Select * from TABLE5, TABLE7....

dto1.addAll(dto2);

dto1.sort(Comparator....);

I am sorting it again programatically is because of below reason:

Query 1 returned sorted data lets assume
1,2,3,4
Query 2 returned sorted data lets assume
1,2,3,4

After combining both the lists, I will get
1,2,3,4,1,2,3,4
Expected data
1,1,2,2,3,3,4,4

My question is, on which case performance will be better?

  1. fetch the sorted data from both the queries, add the list and then sort and order them.
  2. fetch the unsorted data from both the queries, add the list and then sort and order only once.

In the first case, it will get sorted thrice, but on the second case, it will sort and order only once.

When I tested with putting hundreds of thousands of records in the table, I didn't found much difference, second case was a bit faster than the first one.

So, in case of efficiency and performance, which one should be recommended?

Upvotes: 0

Views: 707

Answers (3)

The Impaler
The Impaler

Reputation: 48750

I think all three will have similar performance. You could get a little bit higher speed using one or the other but I don't think it will be significant.

Now, in terms of load, that's a different story. Are you more limited by CPU resources (in your local machine) or by database resources (in the remote DB server)? Most of the time the database will be sitting there idle while your application will be processing a lot of other stuff. If that's the case, I would prefer to put the load on the database, rather than the application itself: that is, I would let the database combine and sort the data in a single SQL call; then the application would simply use the ready-to-use data.

Edit on Dec 22. 2018:

If both queries run on the same database, you can run them as a single one and combine the results using a CTE (Common Table Expression). For example:

with
x (col1, col2, col3, col4, ...) as (
  select * from TABLE1, TABLE2... -- query 1
  union all
  select * from TABLE5, TABLE7... -- query 2
)
select * from x
order by col1

The ORDER BY at then end operates over the combined result. Alternatively, if your database doesn't support CTEs, you can write:

select * from (
  select * from TABLE1, TABLE2... -- query 1
  union all
  select * from TABLE5, TABLE7... -- query 2
) x
order by col1

Upvotes: 1

Rick James
Rick James

Reputation: 142208

Do it all in MySQL:

( SELECT ... )
UNION ALL
( SELECT ... )
ORDER BY ...

Don't worry about sorting in the two selects; wait until the end to do it.

ALL assumes that there are no dups you need to get rid of.

This approach may be fastest simply because it is a single SQL request to the database. And because it does only one sort.

Upvotes: 1

flopcoder
flopcoder

Reputation: 1175

I think 2nd one is better performer because if you run a sorting algorithm after merging your two list. So you don't need to run sort query to db. So database sorting query cost not requiring your 2nd query.

But if you retrieve data with sorted order and then again you run sorting algorithm then it will must take some more cost to execute although its negligible.

Upvotes: 0

Related Questions