Binary-tree data storage implementation

Question

I have started using binary trees in C++, and I must say I really like the idea and things are clear to me, until I think about storing the data on disk in an order that lets me read a chunk of it instantly later on. So far I have kept everything (the nodes) in RAM, but that is just a simple app, not a real-life one. I am not interested in writing the whole binary tree to disk, since that would be useless: I would have to read it all back into memory again! What I am after is a method like the one MySQL uses, for example. I haven't found any good article on this, so I would appreciate it if someone could include some URLs or books.

António Dias · Accepted Answer

The main difference between a B-tree and a B+tree: in a B+tree the leaf nodes are linked to each other for fast lookups and sequential reads. The links can point in ascending order, descending order, or both (as I saw in one IBM DB).

You should write it to disk: if the table or file grows, you will have memory problems. (SEEK operations on files ARE REALLY FAST. You can create a 1 GB file on disk in less than 1 second... C# FileStream, method SetLength.)
If you need to support multiple readers/writers, you need concurrency control over the index and the table (or file)... Are you going to do that in memory? If a power failure occurs, how do you roll back? Yeah, you don't.

E.g.: field f1 is indexed.

WHERE 1=1 (no need to access the B+tree: give me everything, and the order is irrelevant)

WHERE 1=1 ORDER BY f1 ASC/DESC (needs the B+tree: give me everything in ascending/descending order)

WHERE f1>=100 (needs the B+tree: look up the leaf node where the key = 100 and return all leaf items by following the right pointers. If the read is multithreaded, the rows will probably come back in a strange order, but that is no problem... there is no ORDER BY in the clause).

WHERE f1>=100 ORDER BY f1 ASC (needs the B+tree: look up the leaf node where the key = 100 and return all leaf items by following the right pointers. This scan shouldn't be multithreaded; following the B+tree, the rows come out naturally in order).
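The range queries above can be sketched in a few lines. This is only an illustration of the linked-leaf idea, not a full B+tree: the inner nodes are replaced by a binary search over each leaf's first key, keys are assumed unique, and the names `Leaf`, `build_leaves` and `range_scan` are made up for the example.

```python
from bisect import bisect_left, bisect_right

class Leaf:
    """One B+tree leaf: sorted keys plus a pointer to the right sibling."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def build_leaves(sorted_keys, per_leaf=3):
    """Split sorted keys into leaves and link each leaf to its right neighbour."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves

def range_scan(leaves, low):
    """Yield every key >= low in ascending order (WHERE f1 >= low ORDER BY f1)."""
    # Stand-in for descending the inner nodes: pick the leaf by its first key.
    first_keys = [leaf.keys[0] for leaf in leaves]
    leaf = leaves[max(bisect_right(first_keys, low) - 1, 0)]
    pos = bisect_left(leaf.keys, low)
    while leaf is not None:
        yield from leaf.keys[pos:]
        leaf, pos = leaf.next, 0  # follow the right pointer

leaves = build_leaves([10, 50, 90, 100, 120, 150, 200, 300])
result = list(range_scan(leaves, 100))
```

Because the scan only ever moves right along the leaf chain, the rows come out already sorted, which is why the ORDER BY case costs nothing extra.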

Field f2 is of type string and indexed with a B+tree.

WHERE name LIKE '%ODD' (internally, the compared value must be reversed so that the wildcard ends up at the end: the search becomes "starts with 'DDO' and ends with anything" on a reversed index. 'DDOT' is in that group, so 'TODD' must belong to the result!!!! Tricky, tricky logic ;P)

With the statement WHERE name LIKE '%OD%' (contains 'OD'), things get hot :)))) A lookup for 'OD%' on the normal index plus a lookup for 'DO%' on the reversed index only finds the values that start or end with 'OD'. Values with 'OD' strictly in the middle cannot be located through either index, so the engine has to fall back to scanning the index or the table. Note that in standard SQL 'ODABCD', 'ABCDOD' and 'ODODODODOD' all match '%OD%'.
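The suffix case ('%ODD') can be sketched as a prefix search over reversed keys. This is a minimal illustration that assumes a sorted list stands in for the reversed index; `build_reversed_index` and `ends_with` are hypothetical names, not part of any database API.

```python
from bisect import bisect_left

def build_reversed_index(names):
    """Sorted (reversed name, original name) pairs stand in for a reversed index."""
    return sorted((name[::-1], name) for name in names)

def ends_with(index, suffix):
    """Answer LIKE '%suffix' as a prefix search on the reversed keys."""
    prefix = suffix[::-1]
    start = bisect_left(index, (prefix,))  # first entry whose key is >= prefix
    matches = []
    for reversed_key, name in index[start:]:
        if not reversed_key.startswith(prefix):
            break  # keys are sorted, so no later entry can match
        matches.append(name)
    return matches

index = build_reversed_index(["TODD", "ODD", "MARY", "DODO", "ODDS"])
found = ends_with(index, "ODD")
```

Only the entries that actually match are touched, which is the whole point of keeping a second, reversed copy of the keys.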

Consider what I said, and check some more things if you're going to go deep:
- Fast I/O on files: C# FileStream with buffering disabled and the write-through flag on (FileOptions.WriteThrough).
- Memory-mapped files / memory views: yes, we can manipulate a huge file in small portions as we need them.
- Indexes: bitmap index, hash index (hash function; perfect hash function; hash collisions), sparse index, dense index, B+tree, R-tree, reversed index.
- Multithreading: locks, mutexes, semaphores.
- Transactional considerations (log file; 2-phase commit; 3-phase commit).
- Locks (database, table, page, record).
- Deadlocks: 3 ways to pick the process to kill (the longest-running conflicting process; the youngest conflicting process; the process that locks the most objects). Modern RDBMSs use a mix of the 3 ways...
- SQL parsing (AST).
- Caching recurrent queries.
- Triggers, procedures, views, etc.
- Passing parameters to the procedures (can use the object type ;P)
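As an illustration of the memory-mapped file item in the checklist, here is a small sketch that maps only a window of a larger file instead of reading the whole thing. The helper name `read_window`, the file size and the `RECORD-42` payload are assumptions made up for the demo.

```python
import mmap
import os
import tempfile

def read_window(path, offset, length):
    """Map only the part of the file around `offset` and return `length` bytes."""
    # The mapping offset must be a multiple of the allocation granularity.
    granularity = mmap.ALLOCATIONGRANULARITY
    aligned = (offset // granularity) * granularity
    delta = offset - aligned
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=aligned) as view:
            return view[delta:delta + length]

# Demo: a file several granules long with one small record inside it.
size = 4 * mmap.ALLOCATIONGRANULARITY
record_offset = 2 * mmap.ALLOCATIONGRANULARITY + 100
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(size)          # grow the file without writing its whole content
    f.seek(record_offset)
    f.write(b"RECORD-42")
    path = f.name

window = read_window(path, record_offset, 9)
os.remove(path)
```

Only the mapped window is brought into the process address space; the rest of the file stays on disk until something asks for it.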

DON'T LOAD EVERYTHING INTO MEMORY. INTELLIGENT SOLUTIONS LOAD PARTS AS THEY NEED THEM AND RELEASE THEM WHEN THEY ARE NO LONGER NEEDED. Why? Your DB engine (and PC) stays more responsive when it uses less memory. Using a B+tree, looking up a leaf node through the branch nodes needs just about 2 disk I/Os (roughly one per tree level that is not already cached). Once you have looked up the value, you get the record's long pointer (its offset in the main file). SEEK to that position in the main file and read the content. This is very fast. Memory is faster... Yes, it is, but can you put 10 GB of a B+tree in memory? If so, how does your DB engine start to behave? Slowly?
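The "look up the pointer, SEEK, read" step might look like the sketch below. It assumes fixed-size records and uses a plain dict as a stand-in for the B+tree; the record layout (`<i16s`) and the function names are invented for the example.

```python
import os
import struct
import tempfile

# Hypothetical record layout: 4-byte int key + 16-byte name, padded with zeros.
RECORD = struct.Struct("<i16s")

def append_record(f, key, name):
    """Write a record at the end of the data file and return its offset."""
    f.seek(0, os.SEEK_END)
    offset = f.tell()
    f.write(RECORD.pack(key, name.encode()))
    return offset

def read_record(f, offset):
    """SEEK straight to the record and read only its bytes."""
    f.seek(offset)
    key, raw = RECORD.unpack(f.read(RECORD.size))
    return key, raw.rstrip(b"\0").decode()

with tempfile.TemporaryFile() as f:
    index = {}  # stand-in for the B+tree: key -> offset in the data file
    for key, name in [(100, "alice"), (250, "bob"), (7, "carol")]:
        index[key] = append_record(f, key, name)
    found = read_record(f, index[250])
```

Only the index has to be searched; the data file itself is never scanned, no matter how large it grows.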

Forget binary trees and conventional B-trees: they are for academic tutorials. In real life they are replaced by hash tables or B+trees (B PLUS TREE, showing storage in ascending order: http://en.wikipedia.org/wiki/B%2B_tree).

Consider using dataspaces for the DB data on multiple disks, so you can parallelize disk I/O. Don't forget to mirror them... Each dataspace should hold a fragment of the table with a fragment of the index and a partial log file. You should develop the coordinator, which wisely dispatches the queries to the sub-units.

E.g.: 3 dataspaces... INSERT INTO etc. should only happen in 1 dataspace.

But SELECT * FROM TB_XPTO should be sent to all dataspaces.

SELECT * FROM TB_XPTO ORDER BY an indexed field should be sent to all dataspaces. Each dataspace executes the instruction, so now we have 3 subsets, each in its own order.

The results arrive at the coordinator, which reorders (merges) them again. Confusing, BUT FAST!!!!!!
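The coordinator's reordering step does not need a full sort: since every dataspace already returns its subset in order, a k-way merge is enough. A minimal sketch, assuming each dataspace hands back a sorted list (`coordinator_select_ordered` is a hypothetical name):

```python
import heapq

def coordinator_select_ordered(sorted_subsets):
    """Merge the already-sorted results of each dataspace into one ordered result."""
    return list(heapq.merge(*sorted_subsets))

# Each dataspace has executed "ORDER BY f1" on its own fragment.
dataspace_a = [3, 40, 95]
dataspace_b = [1, 60]
dataspace_c = [7, 8, 100]
merged = coordinator_select_ordered([dataspace_a, dataspace_b, dataspace_c])
```

`heapq.merge` only ever compares the current head of each subset, so the coordinator can start returning rows before all dataspaces have finished.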

The coordinator should control the master transaction.

If dataspace A committed, dataspace B committed, and dataspace C is in an uncommitted state, the coordinator will roll back C, B and A.

If dataspace A committed, dataspace B committed, and dataspace C committed, the coordinator will commit the overall transaction.

COORDINATOR LOG:
CREATE MASTER TRANSACTION UID 121212, CHILD TRANSACTIONS(1111,2222,3333)

DATA SPACE A LOG:
1111 INSERT len byte array
1111 INSERT len byte array
COMMIT 1111

DATA SPACE B LOG:
2222 INSERT len byte array
2222 INSERT len byte array
COMMIT 2222

DATA SPACE C LOG:
3333 INSERT len byte array
3333 ---> Nothing more..... Power failure here!!!!!!!

On startup, the coordinator checks whether the DB was properly closed; if not, it checks its log file. Well, a master commit line like COMMIT 121212 is missing. So it will query the dataspaces about their log consistency. A and B reply COMMITTED, but C, after checking its log file, detects a failure and replies UNCOMMITTED. The master coordinator FORCES DATASPACES A, B, C TO ROLL BACK 1111, 2222, 3333. After that, it rolls back its own master transaction and sets DB state=OK.
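The recovery walk-through above can be sketched as follows. The logs are plain lists of strings in the same shape as the example logs; the `ROLLBACK <id>` line the coordinator writes, and the choice to commit when every child turns out to be committed, are assumptions of this sketch rather than something the example logs spell out.

```python
def child_status(log_lines, txn_id):
    """A child transaction counts as committed only if its COMMIT line was written."""
    return "COMMITTED" if f"COMMIT {txn_id}" in log_lines else "UNCOMMITTED"

def recover(coordinator_log, master_id, child_logs):
    """Decide what to do with a master transaction found in the log at startup.

    child_logs maps each child transaction id to that dataspace's log lines.
    Returns (action, child ids that must be rolled back).
    """
    if f"COMMIT {master_id}" in coordinator_log:
        return "OK", []  # master commit is on disk, nothing to repair
    statuses = {txn: child_status(lines, txn) for txn, lines in child_logs.items()}
    if all(status == "COMMITTED" for status in statuses.values()):
        coordinator_log.append(f"COMMIT {master_id}")
        return "COMMIT", []
    # At least one dataspace did not finish: every child is rolled back.
    coordinator_log.append(f"ROLLBACK {master_id}")  # hypothetical log line
    return "ROLLBACK", sorted(child_logs)

coordinator_log = ["CREATE MASTER TRANSACTION UID 121212, CHILD TRANSACTIONS(1111,2222,3333)"]
child_logs = {
    1111: ["1111 INSERT len byte array", "1111 INSERT len byte array", "COMMIT 1111"],
    2222: ["2222 INSERT len byte array", "2222 INSERT len byte array", "COMMIT 2222"],
    3333: ["3333 INSERT len byte array"],  # power failure before COMMIT 3333
}
action, rolled_back = recover(coordinator_log, 121212, child_logs)
```

A real engine would also have to undo the inserts recorded in each child log; the sketch only covers the decision the coordinator makes.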

The main point here is speed on inserts, selects, updates, and deletes.

Consider keeping the index well balanced. Many deletes will unbalance it, and an unbalanced index loses performance.... Add a header at the start of the index file to keep track of this. Some math here would help: if deletes exceed 5% of the records, rebalance it and reset the counter. If an update touches an indexed field, count it too.
Be smart when choosing the index type for a field. If the column is Gender, there are only 2 options (I hope, lol.... oops, it can be nullable too....), so a bitmap index is a good fit. If the distinctness (cardinality) of a field is 100% (all values different), like a sequence applied to a field as Oracle does, or an identity field as SQL Server does, a B+tree is a good fit. If a field is some kind of geometric type, like in Oracle, the R-tree is the best. For strings, a reversed index is a good fit, or a B+tree if the values are mostly distinct.
Houston, we have a problem.... NULL field values should be considered in the index too. It's a value too!!!! E.g.: WHERE f1 IS NULL
Add some socket functionality: an async TCP/IP server.
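For the low-cardinality case (Gender, including NULL), a bitmap index can be sketched with one integer per distinct value, where bit n is set when row n holds that value. This is an uncompressed toy version under that assumption; `build_bitmap_index` and `rows_matching` are made-up names.

```python
def build_bitmap_index(values):
    """One bitmap (a Python int) per distinct value; None (NULL) gets its own bitmap."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_matching(bitmaps, value):
    """Return the row numbers whose bit is set in the bitmap for `value`."""
    bits = bitmaps.get(value, 0)
    return [row for row in range(bits.bit_length()) if (bits >> row) & 1]

gender = ["F", "M", None, "F", "M", "M"]
index = build_bitmap_index(gender)
female_rows = rows_matching(index, "F")
null_rows = rows_matching(index, None)  # WHERE gender IS NULL
```

Treating NULL as just another key keeps `IS NULL` queries on the index instead of forcing a table scan.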

- If you delete a record, don't resize the file right away. Mark it as deleted. You should keep some metrics here too. If unused space > x and transactions = 0, take a database lock, reallocate the pointers, then resize the database. As holes appear in the DB file, you can try page locks instead of a database lock... Things can keep going and no one gets hurt.... Find the last unlocked page of the DB and lock it for yourself. Look for a deleted page that you can fill with your page. Not found: release the lock. Found: move the page to the new position, fix the pointers, mark the old page as deleted, resize the file, release the lock. Why so many operations? To keep the log well formed!!!! You can split a page into smaller pages, but then you get fragmentation (argh... we lost speed, commander?)... 2 algorithms come in here: Best-Fit and Worst-Fit.... Google them. The best is .... using both :P
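The two placement strategies named above differ only in which free hole they pick. A minimal sketch, assuming the free space is tracked as a list of (offset, length) pairs; the function names and the sample holes are illustrative.

```python
def best_fit(free_blocks, size):
    """Pick the smallest free block that still fits: leaves the least waste."""
    candidates = [block for block in free_blocks if block[1] >= size]
    return min(candidates, key=lambda block: block[1], default=None)

def worst_fit(free_blocks, size):
    """Pick the largest free block: leaves the biggest reusable remainder."""
    candidates = [block for block in free_blocks if block[1] >= size]
    return max(candidates, key=lambda block: block[1], default=None)

# Free holes left by deleted records, as (offset, length) pairs.
free_blocks = [(0, 64), (200, 512), (1000, 128)]
best = best_fit(free_blocks, 100)
worst = worst_fit(free_blocks, 100)
```

Best-fit keeps large holes intact for large records, while worst-fit avoids leaving slivers too small to reuse, which is one reason an engine might mix the two.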

And if you solve all of this stuff, you can shout out loud "DAMN, I MADE A DATABASE... I'M GONNA NAME IT ORACLE!!!!" ;P
