tempname
tempname

Reputation: 3

optimizing child/parent structure in one table with a lot of data

I have a table which has a simple parent child structure

products:
- id
- product_id
- time_created
- ... a few other columns

It is a parent if product_id IS NULL. Product id behaves here like parent_id. Data inside looks like this:

id | product_id 
 1     NULL 
 2      1      
 3      1         
 4     NULL         
 4      4         

This table is updated every night a new versions are added.

Every user is using a lot of these products but only one version. User is notified if new rows are added for an product_id.

He can stop using id:2 and start using id:3. An another user will continue using id:2 etc.

products table is updated every night and it grows pretty fast. There are around 500000 rows at the moment and every night adds around 20000, probably 5-7000000 changes (new rows) per year.

Is there a way to optimize this database/table structure? Should I change anything? Is it a problem to have so much data in one table?

Upvotes: 0

Views: 492

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270021

Your question is not clear. The sample data is suggesting that the parent-child relationship is only one level deep. If so, this is not a particularly hard problem. You can create a query to look up the most recent product id for each product -- and I'm assuming this is the one with the maximum id:

select id, product_id,
       max(id) over (partition by coalsesce(product_id, id)) as biggest_id
from table t;

This is then a lookup table, to get the biggest id. It would produce:

id | product_id | biggest_id
 1     NULL           3
 2      1             3
 3      1             3
 4     NULL           4
 4      4             4

If your table has deeper hierarchies, you can solve the problem using recursive CTEs, or by doing the calculation when the table is updated.

Upvotes: 1

Related Questions