joshm1
joshm1

Reputation: 553

Simple postgresql Insert/Update queries taking *seconds* to respond every 5-7 minutes

I have a Ruby on Rails app that is using a Postgresql database. I've noticed that my database performance has huge spikes every 5-7 minutes.

newrelic request response time - database spikes

I'm seeing 1+ second response times for simple queries like:

UPDATE users SET last_seen_at = ? where id = ?

or

INSERT INTO emails (email, created_at, updated_at) VALUES (?, ?, ?)

The VPS is an AWS EC2 instance (m2.2xlarge) with a 4 core Xeon 2.4ghz and 34gb of memory.

Here is my postgresql.conf

I made to following changes to the conf to try to figure it out (like reducing the # of checkpoint timeouts) to no avail.

root:/etc/postgresql/9.2/main# diff postgresql.conf.bck postgresql.conf
176,178c176,178
< #checkpoint_segments = 3    # in logfile segments, min 1, 16MB each
< #checkpoint_timeout = 5min    # range 30s-1h
< #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
---
> checkpoint_segments = 10    # in logfile segments, min 1, 16MB each
> checkpoint_timeout = 30min    # range 30s-1h
> checkpoint_completion_target = 0.9 # checkpoint target duration, 0.0 - 1.0
361c361
< #log_min_duration_statement = -1  # -1 is disabled, 0 logs all statements
---
> log_min_duration_statement = 2s  # -1 is disabled, 0 logs all statements
370,371c370,371
< #debug_print_rewritten = off
< #debug_print_plan = off
---
> #debug_print_rewritten = on
> #debug_print_plan = on
376c376
< #log_duration = off
---
> #log_duration = on
378c378
< #log_hostname = off
---
> #log_hostname = on
399c399
< #log_lock_waits = off     # log lock waits >= deadlock_timeout
---
> log_lock_waits = on     # log lock waits >= deadlock_timeout

Upvotes: 1

Views: 2404

Answers (1)

jjanes
jjanes

Reputation: 44137

You have a serious IO problem at the end of the checkpoints. Note that the slow queries are mostly COMMIT, which should do almost nothing but flush the WAL log, and that it took 41.604 s to sync the files (including 11 s to sync one file!).

There is probably nothing much you can do from within PostgreSQL to improve this. I've heard rumors that lowering shared_buffers might help, but I have not seen that first hand.

You probably need to make changes on the operating system, like lowering /proc/sys/vm/dirty_ratio so that it doesn't allow so much dirty data to build up between checkpoints. Also, if you can separate your WAL logs to separate disks from the main data, that can help.

What filesystem are you using? What kernel/distro?

There is also the possibility that your workload simply can't be accommodated by the IO system you are using, and you need to move to more capable hardware.

Upvotes: 1

Related Questions