Reputation: 3
I have a table which has a simple parent child structure
products:
- id
- product_id
- time_created
- ... a few other columns
It is a parent if product_id IS NULL. Product id behaves here like parent_id. Data inside looks like this:
id | product_id
1 NULL
2 1
3 1
4 NULL
4 4
This table is updated every night a new versions are added.
Every user is using a lot of these products but only one version. User is notified if new rows are added for an product_id.
He can stop using id:2 and start using id:3. An another user will continue using id:2 etc.
products table is updated every night and it grows pretty fast. There are around 500000 rows at the moment and every night adds around 20000, probably 5-7000000 changes (new rows) per year.
Is there a way to optimize this database/table structure? Should I change anything? Is it a problem to have so much data in one table?
Upvotes: 0
Views: 492
Reputation: 1270021
Your question is not clear. The sample data is suggesting that the parent-child relationship is only one level deep. If so, this is not a particularly hard problem. You can create a query to look up the most recent product id for each product -- and I'm assuming this is the one with the maximum id
:
select id, product_id,
max(id) over (partition by coalsesce(product_id, id)) as biggest_id
from table t;
This is then a lookup table, to get the biggest id. It would produce:
id | product_id | biggest_id
1 NULL 3
2 1 3
3 1 3
4 NULL 4
4 4 4
If your table has deeper hierarchies, you can solve the problem using recursive CTEs, or by doing the calculation when the table is updated.
Upvotes: 1