
Reputation: 31

B-tree implementation for variable size keys

I'm looking to implement a B-tree (in Java) for a "one use" index where a few million keys are inserted, and queries are then made a handful of times for each key. The keys are <= 40 byte ascii strings, and the associated data always takes up 6 bytes. The B-tree structure has been chosen because my memory budget does not allow me to keep the entire temporary index in memory.

My issue is about the practical details in choosing a branching factor and storing nodes on disk. It seems to me that there are two approaches:

1. One node always fits within one block. This is achieved by choosing a branching factor k such that, even for the worst-case key length, the storage required for keys, data and control structures is <= the system block size. k is likely to be low, and nodes will in most cases have a lot of empty room.
2. One node can be stored on multiple blocks. Branching factor is chosen independent of key size. Loading a single node may require that multiple blocks are loaded.

The questions are then:

Is the second approach what is usually used for variable-length keys? or is there some completely different approach I have missed?
Given my use case, would you recommend a different overall solution?

In closing, I should mention that I'm aware of the jdbm3 project and am considering using it. I will attempt to implement my own in any case, both as a learning exercise and to see whether case-specific optimization can yield better performance.

Edit: Reading about SB-Trees at the moment: "S(b)-Trees" and "Algorithms and Data Structures for External Memory".

Upvotes: 3

Answers (3)

Jan Kotek

Reputation: 1084

The JDBM BTree is already self-balancing. It also has defragmentation, which is very fast and solves all the problems described above.

One node can be stored on multiple blocks. Branching factor is chosen independent of key size. Loading a single node may require that multiple blocks are loaded.

Not necessarily. JDBM3 uses mapped memory, so it never reads a full block from disk into memory. It creates a 'view' on top of the block and only reads the partial data that is actually needed. So instead of reading a full 4KB block, it may read just 2x128 bytes. This depends on the underlying OS block size.

Is the second approach what is usually used for variable-length keys? or is there some completely different approach I have missed?

I think you missed the point that increasing the size on disk decreases performance, as more data has to be read. A single tree can also use both approaches (the first for newly inserted nodes, the second after defragmentation).

Anyway, a flat file with a mapped memory buffer is probably best for your problem, since you have a fixed record size and just a few million records.
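A minimal sketch of what such a flat file could look like, under assumptions that are not in the question: keys are zero-padded to 40 bytes so that every record is exactly 46 bytes, the records are written in sorted key order, and lookups are a binary search over a `MappedByteBuffer`. The class name `FlatFileIndex` and its methods are hypothetical, and the `write` helper builds the whole file in memory purely for demonstration; a real build step would have to sort within the memory budget.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical read-only index: sorted fixed-size records in a memory-mapped file. */
final class FlatFileIndex {
    static final int KEY_SIZE = 40;                       // assumed: keys zero-padded to 40 bytes
    static final int VALUE_SIZE = 6;
    static final int RECORD_SIZE = KEY_SIZE + VALUE_SIZE; // 46 bytes per record

    private final MappedByteBuffer map;
    private final int records;

    FlatFileIndex(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            records = (int) (ch.size() / RECORD_SIZE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo-only writer: dumps an already sorted map as fixed-size records. */
    static void write(Path file, SortedMap<String, byte[]> sorted) {
        ByteBuffer out = ByteBuffer.allocate(sorted.size() * RECORD_SIZE);
        for (Map.Entry<String, byte[]> e : sorted.entrySet()) {
            if (e.getValue().length != VALUE_SIZE) {
                throw new IllegalArgumentException("value must be 6 bytes");
            }
            out.put(pad(e.getKey())).put(e.getValue());
        }
        try {
            Files.write(file, out.array());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Encodes the key as ASCII and pads it with zero bytes to KEY_SIZE. */
    static byte[] pad(String key) {
        byte[] raw = key.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > KEY_SIZE) throw new IllegalArgumentException("key too long");
        return Arrays.copyOf(raw, KEY_SIZE);
    }

    /** Binary search over the mapped records; returns the value or null if absent. */
    byte[] get(String key) {
        byte[] wanted = pad(key);
        byte[] probe = new byte[KEY_SIZE];
        ByteBuffer view = map.duplicate();   // private position, shared content
        int lo = 0, hi = records - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            view.position(mid * RECORD_SIZE);
            view.get(probe);                 // reads only this record's key bytes
            int cmp = compare(probe, wanted);
            if (cmp == 0) {
                byte[] value = new byte[VALUE_SIZE];
                view.get(value);             // the value directly follows the key
                return value;
            }
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return null;
    }

    /** Unsigned lexicographic comparison of two equal-length byte arrays. */
    private static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return 0;
    }
}
```

With this layout a lookup touches roughly log2(n) records, and the operating system decides which pages of the file are actually kept in memory.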

Also have a look at LevelDB. It has a new Java port which almost beats JDBM:

https://github.com/dain/leveldb

http://code.google.com/p/leveldb/

Upvotes: 1

A.H.

Reputation: 66243

I'm missing option C here:

At least two tuples always fit into one block, and the block size is chosen accordingly. Blocks are filled up with as many key/value pairs as possible, which means the branching factor is variable. If the block size is much greater than the average size of a (key, value) tuple, the wasted space will be very low. Since the optimal IO size for disks is usually 4k or greater and you have a maximum tuple size of 46, this is automatically true in your case.
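As a rough illustration of option C, the following sketch packs variable-length entries back to back into one block until the next entry no longer fits. The layout (a 2-byte entry count, then a 1-byte key length, the key and the 6-byte value per entry), the 4096-byte block size and the class name `PackedLeaf` are all assumptions made for the example; child pointers, sorted insertion and node splitting are left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical leaf page: entries are packed back to back until the block is full. */
final class PackedLeaf {
    static final int BLOCK_SIZE = 4096;  // assumed I/O block size
    static final int HEADER_SIZE = 2;    // entry count, stored as a short
    static final int MAX_KEY_SIZE = 40;  // from the question
    static final int VALUE_SIZE = 6;     // from the question

    private final ByteBuffer block = ByteBuffer.allocate(BLOCK_SIZE);
    private int count = 0;

    PackedLeaf() {
        block.position(HEADER_SIZE);     // leave room for the entry count
    }

    /** Appends one entry; returns false when it no longer fits (a caller would split). */
    boolean tryAppend(String key, byte[] value) {
        byte[] k = key.getBytes(StandardCharsets.US_ASCII);
        if (k.length > MAX_KEY_SIZE || value.length != VALUE_SIZE) {
            throw new IllegalArgumentException("key > 40 bytes or value != 6 bytes");
        }
        int needed = 1 + k.length + VALUE_SIZE;  // length byte + key + value
        if (block.remaining() < needed) {
            return false;
        }
        block.put((byte) k.length).put(k).put(value);
        count++;
        block.putShort(0, (short) count);        // absolute put, position is unchanged
        return true;
    }

    int count() {
        return count;
    }

    /** Linear scan of the page; returns the 6-byte value or null if the key is absent. */
    byte[] find(String key) {
        byte[] wanted = key.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer view = block.duplicate();     // private position, shared content
        view.position(HEADER_SIZE);
        for (int i = 0; i < count; i++) {
            byte[] k = new byte[view.get()];     // length byte, at most 40
            view.get(k);
            byte[] v = new byte[VALUE_SIZE];
            view.get(v);
            if (Arrays.equals(k, wanted)) {
                return v;
            }
        }
        return null;
    }
}
```

With this particular layout, a block holds 87 entries when every key has the maximum length of 40 bytes (47 bytes per entry), and 240 entries when every key is 10 bytes long (17 bytes per entry), so the effective branching factor follows the actual key sizes rather than the worst case.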

And for all options you have some variants: B* or B+ Trees (see Wikipedia).

Upvotes: 2

A.H.

Reputation: 66243

You could avoid this hassle if you use some embedded database. Those have solved these problems and some more for you already.

You also write: "a few million keys" ... "[max] 40 byte ascii strings" and "6 bytes [associated data]". This does not add up. One gig of RAM would allow you more than "a few million" entries.
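The arithmetic behind that remark can be checked with a few lines. The figure of five million entries is only an assumed stand-in for "a few million", and the calculation counts raw key and value bytes (46 per entry at most) as they would sit in a packed byte array; ordinary Java `String` and map objects carry additional per-object overhead that is not modelled here.

```java
/** Back-of-the-envelope size estimate for the index, counting raw bytes only. */
final class IndexSizeEstimate {
    static final int MAX_ENTRY_BYTES = 40 + 6;  // worst-case key plus 6-byte value

    /** Raw bytes needed for the given number of worst-case entries. */
    static long rawBytes(long entries) {
        return entries * MAX_ENTRY_BYTES;
    }

    /** Number of worst-case entries that fit into the given byte budget. */
    static long entriesThatFit(long budgetBytes) {
        return budgetBytes / MAX_ENTRY_BYTES;
    }

    public static void main(String[] args) {
        // Assumed example: 5 million entries need 230,000,000 raw bytes.
        System.out.println(rawBytes(5_000_000L));      // prints 230000000
        // 1 GiB holds a little over 23 million worst-case entries.
        System.out.println(entriesThatFit(1L << 30));  // prints 23342213
    }
}
```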

Upvotes: 0
