Yavanosta

Reputation: 1670

Append binary content to bytea column in PostgreSQL database

My case:

I have some binary data, which has been broken into a lot of parts. A table FileParts contains fileId, partNo, and data.

I need to assemble all the parts into another table, MyFilesStorage.

Now I'm choosing between two implementations:

DO $CODE$
declare
  r record;
begin
  -- start from an empty value, then append each part directly in the table
  UPDATE public.MyFilesStorage SET mainFileData = E''::bytea WHERE id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;

  for r in ( select data 
             from public.FileParts 
             where fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
             order by partNo ) loop
    -- one UPDATE per part, appending to the stored value
    UPDATE public.MyFilesStorage SET mainFileData = mainFileData || r.data WHERE id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;
  end loop;

end;
$CODE$

Here I set the data to empty, then read the parts one by one and append each of them to the main table.

Another variant:

DO $CODE$
declare
  r record;
  p_result bytea;
begin
  -- accumulate the whole file in a variable first
  p_result := E''::bytea;

  for r in ( select data 
             from public.FileParts 
             where fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
             order by partNo ) loop
    p_result := p_result || r.data;
  end loop;

  -- then write it out in a single UPDATE
  UPDATE public.MyFilesStorage SET mainFileData = p_result WHERE id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;
end;
$CODE$

Here I use a temporary variable. The second version is much quicker, but I don't know which one takes more memory. In the second I clearly need enough RAM to hold the whole file; what about the first? Will PostgreSQL load the whole existing content to evaluate mainFileData = mainFileData || r.data?

Maybe there is another way to do this, because both variants are very slow? In Oracle I use DBMS_LOB.APPEND for this operation.
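I haven't found a direct equivalent, but PostgreSQL's server-side large-object API seems the closest thing, since it supports true appends. A rough sketch of what I mean, assuming the column could be switched from bytea to an oid referencing a large object (the lo_* calls are the standard server-side functions; 131072 is the INV_WRITE mode constant):

DO $CODE$
declare
  r record;
  v_loid oid;
  v_fd integer;
begin
  v_loid := lo_create(0);           -- 0: let the server pick the OID
  v_fd := lo_open(v_loid, 131072);  -- 131072 = INV_WRITE
  for r in ( select data 
             from public.FileParts 
             where fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
             order by partNo ) loop
    perform lowrite(v_fd, r.data);  -- writes at the current position, i.e. appends
  end loop;
  perform lo_close(v_fd);
  -- v_loid would then be stored in MyFilesStorage in a column of type oid
end;
$CODE$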

Upvotes: 1

Views: 5174

Answers (2)

vyegorov

Reputation: 22845

Your approach looks correct; check the PostgreSQL manual here.

You can also define your own aggregate to do the job for you:

CREATE AGGREGATE bytea_add(bytea)
(
    sfunc = byteacat,  -- byteacat is the function behind the || operator for bytea
    stype = bytea,
    initcond = E''
);

And then use plain SQL, like this:

UPDATE public.MyFilesStorage SET mainFileData = (
    SELECT bytea_add(data) FROM public.FileParts
     WHERE fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
     -- ORDER BY partNo -- will not work in an aggregating subquery
 )
 WHERE id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;

EDIT:

UPDATE public.MyFilesStorage mfs SET mainFileData = fp.all_data
  FROM (
    SELECT bytea_add(data) OVER (ORDER BY partNo) AS all_data,
           rank() OVER (ORDER BY partNo DESC) AS pos
      FROM public.FileParts
     WHERE fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
 ) AS fp
 WHERE fp.pos = 1
   AND mfs.id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;

You can check the output of the inner SQL separately.
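On PostgreSQL 9.0 or later (an assumption about the version in play), you can skip both the custom aggregate and the window trick: the built-in string_agg has a bytea variant, and an ORDER BY is allowed inside the aggregate call itself:

UPDATE public.MyFilesStorage SET mainFileData = (
    SELECT string_agg(data, ''::bytea ORDER BY partNo)
      FROM public.FileParts
     WHERE fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
 )
 WHERE id = 'e14a26c0-db4b-47e1-8b66-e091fb3ba199'::uuid;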

Upvotes: 3

Daniel Vérité

Reputation: 61506

The first version is slower because PostgreSQL doesn't do in-place updates at the storage level; it creates a new version of the row for each UPDATE. So for a row that grows from 0MB to 100MB in 10MB increments, what really gets written to disk is not 10×10MB but rather 10MB+20MB+30MB+...+90MB+100MB = 550MB. On the other hand, memory consumption stays low, with no more than 10MB allocated at a time.

The second version is faster, with only 100MB to write, but it needs to allocate 100MB in memory.

The structure of the FileParts table, with ordered chunks, is generally easier to manage for large contents, so why bother converting it into the other, monolithic structure?
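For instance, a reader can consume the file in order, chunk by chunk, with a single query, without the server ever building the monolithic value in memory:

SELECT data
  FROM public.FileParts
 WHERE fileId = '89cb8598-436b-49b3-bb1c-34534c6d068e'::uuid
 ORDER BY partNo;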

Upvotes: 3
