eager

Reputation: 53

Embedded database with concurrent read and write

I am looking for an embedded database for a single-writer, single-reader problem. One process writes to the database while a separate process reads from it, and both processes run at the same time.

I have looked at solutions like RocksDB, which allows multiple readers and a single writer. However, a reader does not see writes made after it opened the database, so it has to open the DB again to get the latest view.

Any help would be great.

EDIT

The code I wrote for RocksDB -

writer.cc

#include <cassert>
#include <cstdio>
#include <string>
#include <unistd.h>
#include <iostream>

#include "rocksdb/db.h"
#include "rocksdb/slice.h"
#include "rocksdb/options.h"

using namespace rocksdb;

std::string kDBPath = "./db";

int main() {
  DB* db;
  Options options;
  options.IncreaseParallelism();
  options.OptimizeLevelStyleCompaction();
  options.create_if_missing = true;

  Status s = DB::Open(options, kDBPath, &db);
  assert(s.ok());

  for (int i = 0 ; ; i++) {
    int key = i;
    Slice kslice((char*)&key, sizeof(int));
    int value = i*i;
    Slice vslice((char*)&value, sizeof(value));
    s = db->Put(WriteOptions(), kslice, vslice);
    std::cout <<  "writing " << i << " : " << i*i << std::endl;
    assert(s.ok());
    sleep(1);
  }
  delete db;
  return 0;
}

The output is as expected:

writing 0 : 0
writing 1 : 1
writing 2 : 4
writing 3 : 9
writing 4 : 16
writing 5 : 25
writing 6 : 36
writing 7 : 49
writing 8 : 64
writing 9 : 81
...

reader.cc

#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <unistd.h>
#include <iostream>

#include "rocksdb/db.h"
#include "rocksdb/slice.h"
#include "rocksdb/options.h"

using namespace rocksdb;
using namespace std;

std::string kDBPath = "./db";

int main() {
  DB* db;
  Options options;
  options.IncreaseParallelism();
  options.OptimizeLevelStyleCompaction();

  Status s = DB::OpenForReadOnly(options, kDBPath, &db);
  assert(s.ok());
  int i = 0;

  while(true) {
    sleep(1);
    std::string value;
    Slice kslice((char*)&i, sizeof(int));
    Status s = db->Get(ReadOptions(), kslice, &value);
    if (!s.ok()) {
      std::cout << i << " " << s.ToString() << std::endl;
      break;
    }
    int a;
    memcpy(&a, value.c_str(), sizeof(a));
    std::cout << i << ":" << a << std::endl;
    i++;
  }
  delete db;
  return 0;
}

The output is (with the reader started after key 3 was written but before key 4):

0:0
1:1
2:4
3:9
4 NotFound: 

One possible solution which I tried is:

  Iterator* it = db->NewIterator(ReadOptions());
  int start = 0;
  Slice kslice((char*)&start, sizeof(int));
  it->Seek(kslice);
  bool flag = true;

  while (true) {
    int key, value;
    for ( ; it->Valid() ; it->Next()) {
      memcpy(&key, it->key().ToString().c_str(), sizeof(int));
      memcpy(&value,  it->value().ToString().c_str(), sizeof(int));
      cout << key << " - " << value << endl;
      if (!it->status().ok()) {
        cout << it->status().ToString() << endl;
        flag = false;
      }
    }
    if (!flag)
      break;
    sleep(1);
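    delete it;  // release the old iterator before closing the DB it belongs to
    delete db;  // close the stale read-only handle before reopening
    Status s = DB::OpenForReadOnly(options, kDBPath, &db);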
    assert(s.ok());
    Slice kslice((char*)&key, sizeof(int));
    it = db->NewIterator(ReadOptions());
    it->Seek(kslice);
    it->Next();
  }

And the output is as expected:

writing 0 : 0
writing 1 : 1
writing 2 : 4
writing 3 : 9
writing 4 : 16
writing 5 : 25
writing 6 : 36
writing 7 : 49
writing 8 : 64
writing 9 : 81
...

However, I want to avoid reopening the database again and again for every update.
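
A side note on the iterator approach: the keys above are the raw bytes of an int, and RocksDB's default comparator orders keys bytewise. On a little-endian machine that order stops matching numeric order once keys exceed 255, so Seek/Next may not visit keys in the order they were written. A minimal sketch of one way around this, assuming 32-bit unsigned keys; the helper names EncodeKeyBE and DecodeKeyBE are hypothetical and not part of RocksDB:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode a 32-bit key as 4 big-endian bytes so that
// bytewise (memcmp-style) ordering matches numeric ordering.
std::string EncodeKeyBE(std::uint32_t k) {
  std::string out(4, '\0');
  out[0] = static_cast<char>((k >> 24) & 0xFF);
  out[1] = static_cast<char>((k >> 16) & 0xFF);
  out[2] = static_cast<char>((k >> 8) & 0xFF);
  out[3] = static_cast<char>(k & 0xFF);
  return out;
}

// Hypothetical helper: inverse of EncodeKeyBE.
// Returns false if the input is not exactly 4 bytes long.
bool DecodeKeyBE(const std::string& bytes, std::uint32_t* k) {
  if (bytes.size() != 4) return false;
  *k = (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[0])) << 24) |
       (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[1])) << 16) |
       (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[2])) << 8) |
       static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[3]));
  return true;
}
```

The writer would then pass EncodeKeyBE(i) as the key to Put, and the reader would decode it->key().ToString() with DecodeKeyBE. The size check also guards against copying from a key shorter than expected.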

Upvotes: 3

Views: 4137

Answers (2)

hyc

Reputation: 1443

RocksDB is explicitly documented to only support multithread concurrency within a single process. You can't use it safely from multiple processes.

LMDB is explicitly documented to support multiprocess concurrency, and LMDB readers and writers run without blocking each other. It will do what you want.

Upvotes: 5

Venki

Reputation: 149

I recommend BerkeleyDB (BDB). You can perform concurrent read and write operations in different processes without any issue. The database takes care of consistency, so you do not need to use any locks explicitly.

Another notable database is LMDB (Lightning Memory-Mapped Database), which was developed as a drop-in replacement for BDB.

There are a few more embeddable databases out there, but these two could fit your needs.

Disclosure: I did use BDB in the past in an application.

Upvotes: 4
