Reputation: 1626

Optimizing db query for nested data

I have a database table like this with around 1 million rows:

id    prev    curr    next
1     25      26      27
2     26      27      28
3     27      45      46
4     45      46      47
5     46      47      48
6     47      59      41
..............
..............

On the java side, what I need to do is, for a given input like (curr = 45 and diff = 2), I need to get the list of items like 45, 27, 26. For input (curr = 59 and diff = 1), I need to get 59, 47 As you can see above, the prev and next are not always 1 less and 1 greater than curr value.

Currently what I do in java is based on diff value, I query table to get the prev for curr. Then using the prev as curr, I again query the table and continue till I get what I'm looking for. But for higher diff values like 20 or 30, this is too many DB calls.

Does anyone has any thoughts on doing all of this in 1 DB query? Since there are too many rows in table, fetching and keeping the data locally is not an option.

Edited with answer to comments:

The table contains other columns too and so data won't just be 16MB. I just trimmed out the not relevant columns.
No duplicate in curr field.
There are other functions that use curr and next columns similar to prev and curr.
This is not a tree stucture. So last row is a valid row.

Upvotes: 1

Answers (3)

chabzjo

Reputation: 626

I think you could to this with two db calls:

SELECT id, curr, next FROM ...

which should still be a relatively small amount of data.

From there, loop through the data with javascript diff # of times to find the id you need.

Then:

SELECT * from ... WHERE id = {the record you need}

Upvotes: 0

Joop Eggen

Reputation: 109613

Hierarchical data: prev of prev of prev of curr, so prev+ of curr is a missing operation in relational SQL.

You could create a prev+ table as (curr, prevplus, level) so that prev^level gives prevplus. To fill such a table is not that difficult, and even changes can be done; in mysql (because of the self-reference) with a temporary table.

Then a query would be with level <= 2.

Of course the table prevplus becomes large.

Upvotes: 0

Mike Brant

Reputation: 71422

You could self-join the table X number of times (based on the value for diff, but this is probably not a very efficient way to do things if you need to to support large values for diff.

This seems like a schema problem to me. Without having further understanding of how you are assigning the order of the items when writing to the database, it would be hard to give suggestions on how the schema might be altered to allow for easier read queries.

Upvotes: 1

Optimizing db query for nested data

Answers (3)

Related Questions