Awdsa

Reputation: 13

How to implement ColumnFamilies in RocksDB in Java?

I am trying to use column families in RocksDB through the Java bindings.

RocksDB.loadLibrary();
        String threat = "threat_data";
        String ipRange = "ip_range";
        options = new DBOptions();
        options.setCreateIfMissing(true);
        options.setCreateMissingColumnFamilies(true);
        ColumnFamilyOptions cfOpts = new ColumnFamilyOptions().optimizeUniversalStyleCompaction();
        List<ColumnFamilyDescriptor> cfDescriptors = Arrays.asList(
                new ColumnFamilyDescriptor(RocksDB.DEFAULT_COLUMN_FAMILY, cfOpts),
                new ColumnFamilyDescriptor(threat.getBytes(), cfOpts),
                new ColumnFamilyDescriptor(ipRange.getBytes(),cfOpts)
        );
        List<ColumnFamilyHandle> cfHandles = new ArrayList<>();
        rocksDb = RocksDB.open(options, new File("/tmp/benchmark", "rockdb-threat-detection.db").getAbsolutePath(),cfDescriptors,cfHandles);
        
        // Look up the handle whose column family name matches "threat_data".
        cfHandleThreat = cfHandles.stream().filter(x -> {
            try {
                return new String(x.getName()).equals(threat);
            } catch (RocksDBException e) {
                e.printStackTrace();
            }
            return false;
        }).findFirst().orElseThrow(() -> new IllegalStateException("No handle for " + threat));
        
        // Look up the handle whose column family name matches "ip_range".
        cfHandleIp = cfHandles.stream().filter(x -> {
            try {
                return new String(x.getName()).equals(ipRange);
            } catch (RocksDBException e) {
                e.printStackTrace();
            }
            return false;
        }).findFirst().orElseThrow(() -> new IllegalStateException("No handle for " + ipRange));

I am creating two column families, threat_data and ip_range. But when I read from them using the get() function, performance drops sharply.

mapThreat.get(ipToLong("157.49.194.173"))

Performance differs drastically between using column families and not using them. Am I doing something wrong, and how can I improve performance?

Upvotes: 0

Views: 916

Answers (1)

Asad Awadia

Reputation: 1521

Are all gets slow, or only the first one? There isn't much you can do, as column families are just virtual dataspaces.
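To check whether only the first get is slow, you could time a few consecutive calls and compare them. This is only a rough sketch: GetTimer and timeCalls are hypothetical names, and the HashMap lookup in main is a stand-in for your real RocksDB get (which throws a checked RocksDBException, so you would need to wrap it in a try/catch inside the lambda).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of RocksDB: records how long each of n
// consecutive calls takes, so the first call can be compared with later ones.
final class GetTimer {
    private GetTimer() {}

    static long[] timeCalls(Supplier<?> call, int n) {
        long[] nanos = new long[n];
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            call.get();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Stand-in lookup; replace with the real get against the column family.
        Map<Long, String> standIn = new HashMap<>();
        standIn.put(42L, "value");

        long[] nanos = timeCalls(() -> standIn.get(42L), 5);
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("call " + i + ": " + nanos[i] + " ns");
        }
    }
}
```

If only call 0 stands out, the cost is probably one-time warm-up rather than the column families themselves.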

The only alternative is to not use column families and instead prefix your keys with the column family name.
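A minimal sketch of the key-prefix approach, assuming your keys are long values (as the ipToLong call in the question suggests). PrefixedKeys, its key method, and the ':' separator are my own hypothetical choices, not part of the RocksDB API. The resulting byte[] is what you would pass to put and get on the default column family.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of RocksDB: builds keys of the form
// <family name> ':' <8-byte big-endian id> for a single column family.
final class PrefixedKeys {
    private static final byte SEPARATOR = (byte) ':';

    private PrefixedKeys() {}

    static byte[] key(String family, long id) {
        byte[] prefix = family.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(prefix.length + 1 + Long.BYTES)
                .put(prefix)
                .put(SEPARATOR)
                .putLong(id) // ByteBuffer writes big-endian by default
                .array();
    }

    public static void main(String[] args) {
        // The id here is an arbitrary example value.
        byte[] threatKey = key("threat_data", 12345L);
        byte[] ipKey = key("ip_range", 12345L);
        System.out.println(threatKey.length + " bytes, " + ipKey.length + " bytes");
    }
}
```

Because the id is written big-endian, non-negative ids under the same prefix sort in numeric order with a bytewise key comparison, which keeps each "family" contiguous in the keyspace.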

Upvotes: 0
