Alireza
Alireza

Reputation: 61

Sometime, there is a delay required to execute queries successful in Cockroachdb

I'm trying to migrate our PostgreSQL database to Cockroach DB. So, I'm running our moch tests after setting up a single node server on my local (Macbook Pro). There were some issues which I fixed one by one, but the last one. The issue is that the tests (which were passed) started to fail with the following error messages:

Error: current transaction is aborted, commands ignored until end of transaction block
.

This is my running node configuration (which is run in insecure mode):

CockroachDB node starting at 2019-01-27 13:11:46.17270052 +0000 UTC (took 0.9s)
build:               CCL v2.1.1 @ 2018/11/19 18:20:16 (go1.10.3)
webui:               http://localhost:8080
sql:                 postgresql://root@localhost:26257?sslmode=disable
client flags:        cockroach <client cmd> --host=localhost:26257 --insecure

I tried to run the sql queries by a 1-second delay after insertion of the records and it started to work.

I would appreciate if somebody can explain why this kind of delay may required, and any solution.

Upvotes: 1

Views: 314

Answers (2)

Ron
Ron

Reputation: 11

The error you are seeing is a result of trying to issue another command in the context of a transaction had already been aborted.

But without seeing what was happening inside the transaction that caused it to abort I wouldn't be able to tell why the transaction aborted.

Could you share your DML, DDL, and Schema that you're using?

Thanks,

Ron

Upvotes: 1

Alireza
Alireza

Reputation: 61

Thanks to Ron. The problem was in code where I tried to handle error code 40001 (Retryable Errors). The proper way to handle this situation is to rollback the transaction before retrying, which I didn't. The correct code should looks like:

.catch( (e) => {
  // check whether the error is "retryable error" (40001) or not. if yes,
  // retry the whole batch, if not rollback and then re-throw the exception
  return client.query('rollback').then(() => {
    if (e.code === '40001') {
      return batch(list)
    } else {
      throw e
    }
  })
})

but I had done it wrong like:

// *** THIS IS THE WRONG CODE.
.catch( (e) => {
  // check whether the error is "retryable error" (40001) or not. if yes,
  // retry the whole batch, if not rollback and then re-throw the exception
  if (e.code == 40001) {
    return batch(list)
  } else {
    return client.query('rollback').then(() => {
      throw e
    })
  }
})

So the connection which made the aborted transaction was not rolled back before putting back into the connection pool, and it caused the error next time being used.

Thinking about why the 1-second delay helps to overcome the situation, I guess that the connection was killed by connection pool manager after a second of being idle.

Upvotes: 1

Related Questions