Jayke
Jayke

Reputation: 21

Why is postgres UPDATE function so slow?

I have a table with about 180K records defined as such:

CREATE TABLE table( 
    id text UNIQUE,
    classification text,
    author text,
    date DATE,
    score float,
    transcript text
    )

Nothing here is too remarkable except for one thing. All text values are < 200 characters, except for the transcript column. The text in the transcript column oftentimes ends up at hundreds of characters, up to 3MB of data. That being said, when I run the following command in the pipeline where I'm setting new classifications:

conn = connect()
conn.cursor().execute("UPDATE table SET classification = '%s' WHERE author = '%s'" % (classification, author))

The command takes an extraordinary amount of time. I've waited +30 minutes and I've yet to see this command finish. Furthermore, it locks these rows for this time which freezes up an attached webapp that queries the table. I can't quite understand why this is taking so long.

Lastly, of note, each author has fewer than 40 entries in the table. So the "WHERE" condition should never affect more than that many entries.

Upvotes: 1

Views: 543

Answers (2)

wildplasser
wildplasser

Reputation: 44250

You can at least try to suppress the idempotent updates, which can cause a lot of needless row-versions.


conn.cursor().execute("UPDATE table SET classification = '%s'
WHERE  classification <> '%s'
AND author = '%s'" % (sector_long, sector_long, ticker) )

Besides: since you are complaining about this update locking other processes, it could also be other processes [b]locking your update.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522396

I don't know the relative size of your table or the number of records, but if you don't already have an index on the author column, you can try adding add one:

CREATE INDEX idx ON yourTable (author);

Upvotes: 1

Related Questions