user2024300
user2024300

Reputation: 169

Occasional PostgreSQL "Duplicate key value violates unique constraint" error from Go insert

I have a table with the unique constraint

CREATE UNIQUE INDEX "bd_hash_index" ON "public"."bodies" USING btree ("hash");

I also have a Go program that takes "body" values on a channel, filters out the duplicates by hashing, and inserts only the non-duplicates into the database. Like this:

import (
    "crypto/md5"
    "database/sql"
    "encoding/hex"
    "log"
    "strings"
    "time"
)

type Process struct {
    DB                  *sql.DB
    BodiesHash          map[string]bool
    Channel             chan BodyIterface
    Logger              *log.Logger
}

func (pr *Process) Run() {
    bodyInsert, err := pr.DB.Prepare("INSERT INTO bodies (hash, type, source, body, created_timestamp) VALUES ($1, $2, $3, $4, $5)")
    if err != nil {
        pr.Logger.Println(err)
        return
    }
    defer bodyInsert.Close()

    hash := md5.New()

    for p := range pr.Channel {
        nowUnix := time.Now().Unix()

        bodyString := strings.Join([]string{
            p.GetType(),
            p.GetSource(),
            p.GetBodyString(),
        }, ":")
        hash.Write([]byte(bodyString))
        bodyHash := hex.EncodeToString(hash.Sum(nil))
        hash.Reset()

        if _, ok := pr.BodiesHash[bodyHash]; !ok {
            pr.BodiesHash[bodyHash] = true

            _, err = bodyInsert.Exec(
                bodyHash,
                p.GetType(),
                p.GetSource(),
                p.GetBodyString(),
                nowUnix,
            )
            if err != nil {
                pr.Logger.Println(err, bodyString, bodyHash)
            }
        }
    }   
}

But periodically I get the error

"pq: duplicate key value violates unique constraint "bd_hash_index""

in my log file. I can't image how it can be, because I check the hash for uniqueness before I do an insert. I am sure that when I call go processDebugBody.Run() the bodies table is empty.

The channel was created as a buffered channel with:

    processDebugBody.Channel = make(chan BodyIterface, 1000)

Upvotes: 2

Views: 3021

Answers (1)

tomasz
tomasz

Reputation: 13052

When you execute a query outside of transaction with sql.DB, it automatically retries when there's a problem with connection. In the current implementation, up to 10 times. For example, notice maxBadConnRetries in sql.Exec.

Now, it really happens only when underlying driver returns driver.ErrBadConn and specification states the following:

ErrBadConn should be returned by a driver to signal to the sql package that a driver.Conn is in a bad state (such as the server having earlier closed the connection) and the sql package should retry on a new connection.

To prevent duplicate operations, ErrBadConn should NOT be returned if there's a possibility that the database server might have performed the operation.

I think driver implementations are a little bit careless in implementing this rule, but maybe there is some logic behind it. I've been studying implementation of lib/pq the other day and noticed this scenario would be possible.

As you pointed out in the comments you have some SSL errors issued just before seeing duplicates, so this seems like a reasonable guess.

One thing to consider is to use transactions. If you lose the connection before committing the transaction, you can be sure it will be rolled back. Also the statements of the transactions are not retransmitted automatically on bad connections, so this problem might be solved – you will most probably will se SSL errors being propagated directly to you application though, so you'll need to retry on your own.

I must tell you I've been also seeing SSL renegotiation errors on postgres using Go 1.3 and that's why I've disabled SSL for my internal DB for time being (sslmode=disable in the connection string). I was wondering whether version 1.4 has solved the issue, as one thing on changelog was The crypto/tls package now supports ALPN as defined in RFC 7301 (ALPN states for Application-Layer Protocol Negotiation Extension).

Upvotes: 1

Related Questions