Mark Magnus
Mark Magnus

Reputation: 71

Postgres 10.3 heavily partitioned table and cannot delete any records

Anyone having this problem on native partitioned tables?

Partitioned table has 7202 partitions. No partition contains more than 50 records. Partitioning is done on a foreign key.

Any delete operation i.e.

delete from contacts where id = ?
delete from contacts where id = ? and account_id = ?
delete from contacts where account_id = ?

results in out of memory condition.

Default Postgres Configuration with exception max_locks_per_transaction = 1024

Postgres Logs:

2018-03-15 14:26:40.340 AEDT [7120] LOG:  server process (PID 8177) was terminated by signal 9: Killed
2018-03-15 14:26:40.340 AEDT [7120] DETAIL:  Failed process was running: delete from contacts where id = 82398 and account_id = 9000
2018-03-15 14:26:40.354 AEDT [7120] LOG:  terminating any other active server processes
2018-03-15 14:26:40.367 AEDT [3821] WARNING:  terminating connection because of crash of another server process
2018-03-15 14:26:40.367 AEDT [3821] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-03-15 14:26:40.367 AEDT [3821] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-03-15 14:26:40.369 AEDT [7726] mark@postgres WARNING:  terminating connection because of crash of another server process
2018-03-15 14:26:40.369 AEDT [7726] mark@postgres DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-03-15 14:26:40.369 AEDT [7726] mark@postgres HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-03-15 14:26:40.392 AEDT [7749] mark@partitioning_development WARNING:  terminating connection because of crash of another server process
2018-03-15 14:26:40.392 AEDT [7749] mark@partitioning_development DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-03-15 14:26:40.392 AEDT [7749] mark@partitioning_development HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-03-15 14:26:40.569 AEDT [7120] LOG:  all server processes terminated; reinitializing
2018-03-15 14:26:40.639 AEDT [9244] LOG:  database system was interrupted; last known up at 2018-03-15 13:08:47 AEDT
2018-03-15 14:26:41.745 AEDT [9251] mark@postgres FATAL:  the database system is in recovery mode
2018-03-15 14:26:41.746 AEDT [9252] mark@postgres FATAL:  the database system is in recovery mode
2018-03-15 14:26:44.778 AEDT [9244] LOG:  database system was not properly shut down; automatic recovery in progress
2018-03-15 14:26:44.798 AEDT [9244] LOG:  redo starts at 0/56782CE0
2018-03-15 14:26:44.798 AEDT [9244] LOG:  invalid record length at 0/56782D18: wanted 24, got 0
2018-03-15 14:26:44.798 AEDT [9244] LOG:  redo done at 0/56782CE0
2018-03-15 14:26:44.870 AEDT [7120] LOG:  database system is ready to accept connections

Upvotes: 3

Views: 1582

Answers (2)

JosMac
JosMac

Reputation: 2302

We have exactly the same problem on PostgreSQL 11, on table with multilevel native partitioning. Main table is partitioned by shops and each shop by period (year-month) over last several years. I.e. several thousands of partitions together.

When we need to do deletes or updates over all shops PostgreSQL crashes. PostgreSQL is able to handle deletion/updates over one level of partitioning - i.e. from one specific shop and its monthly partitions. Because here we have only like several dozens of partitions for each shop.

But database crashes when we try to delete or update over main top parent table - i.e. several thousands of partitions are being addressed here.

Crashes are still the same - monitoring shows that PostgreSQL starts to use huge amount of memory and eventually is killed by OOM killer. Tuning of work_mem or other settings seems to have only very small influence - PostgreSQL just crashes later.

So we have to do all deletes/updates on this table using cycle over shops and do deletes or updates over these partitions separately. But at least this works.

For explanation - these fine grained partitions are incredibly useful for our client portal. Because we store aggregated data there and construct queries using directly specific shop-monthly partition so clients see data very quickly. And native partitioning takes care of distributing data over whole structure during inserts which is amazing...

Upvotes: 0

Mark Magnus
Mark Magnus

Reputation: 71

From Amit Langote, pgsql-bugs

I can reproduce OOM being triggered on my modest development machine, so perhaps that's what's happening in your case too.

This is unfortunately expected, given that the underlying planning mechanism cannot cope beyond a few hundred partitions. :-( See a relevant note in the documentation; last line of the page at this link: https://www.postgresql.org/docs/devel/static/ddl-partitioning.html.

Until things improve in that area, one workaround might be to perform the delete operation directly on the partition, as it's possible to do that. Or redesign your schema to use less number of partitions.

I know this is not actually a solution but rather a cautionary tale.

It would seem that native partitioning in postgresql 10, does not fit our use case. Part of my brief was to evaluate it's suitability. I suspected there would be a cost to aggressive partitioning but didn't expect memory problems.

Still feel free to post your own experiences and solutions.

Upvotes: 3

Related Questions