How to update a large (1 million+ rows) postgres column of jsonb type values

Question

Trying to update a specific array inside of a jsonb type in a column called params , and having issues with how long it’s taking. For example, there is a table with a row that contains an array owners

{
  "hidden": false,
  "org_id": "34692",
  "owners": [
    "tim@facebuk.com"
  ],
  "deleted": false
}

And another example

{
  "hidden": false,
  "org_id": "34692",
  "owners": [
    "tim@google.com"
  ],
  "deleted": false
}

And there’s essentially a million of these rows (all with different email domains as owners. I have this query which I want to execute across all of these rows:

UPDATE table 
  set params = CASE 
                 WHEN params->>'owners' NOT LIKE '%google.com%' THEN jsonb_set(params, '{owners}', concat('"', substr(md5(random()::text), 0, 25), '@googlefake.com"')::jsonb) 
                 ELSE params 
               END

I’ve tested with a dataset of 100, and it executes perfectly time, but doing this with a 1000x multiple, makes the query forever execute, and I’ve no clue if it will actually successfully complete. Not entirely sure how to speed up this process or utilize this in a better fashion. I did try indexing e.g. CREATE INDEX ON table((params->>'owners')); to no avail. Query has run >1 hour, and there are multiple rows similar to this.

Am i indexing incorrectly? Also, I've looked into the gin operator and @> won't help since each owner field differs

Update:

UPDATE table AS "updatetarget" 
                 SET params = jsonb_set(params, '{owners}', concat('"', substr(md5(random()::text), 0, 25), '@googlefake.com"')::jsonb)

query updated to this and still taking awhile. Is there some way to index the key so i can make the second query faster?

How to update a large (1 million+ rows) postgres column of jsonb type values

Answers (1)

Related Questions