[Metakit] Implementing a search tree in Metakit?

Jean-Claude Wippler jcw at equi4.com
Thu Dec 1 12:45:43 CET 2005


Magnus Lie Hetland wrote:

> jyl at mod3.net <jyl at mod3.net>:
>
>> Check out http://e4graph.sourceforge.net/
>
> Thanks -- I've seen it (a while ago), but I guess I'll have another
> look.
>
> My main questions were really about how Metakit would perform (i.e.,
> the number of disk accesses etc.) under an implementation roughly like
> the one I described :)

The main thing to keep in mind is the column-wise nature of data,  
including subviews:
	http://www.equi4.com/mkviews.html

Lots of tiny subviews are not going to scale, because of the  
fragmentation.  This is one of the things which e4graph adresses.   
Fragmentation due to lots of tiny columns is something I plan to  
improve in the future (by inling more small amounts of data into  
their parent data-structures).

MK's "blocked" views are very much like B-trees in a column wise  
world.  Their depth is fixed at two, but it turns out that this gets  
many of the benefits of B-trees:
	http://www.equi4.com/mkblocked.html
It takes a bit of mental juggling to see how blocked subviews and  
columns fit together.

Also, again due to columns: keep in mind that as you iterate over  
data structures in MK, only properties/columns which actually  
participate in the iteration will cause disk access.  And that  
columns with ints in them are considerably more efficient to access  
in random order (and often a lot smaller, due to 1..32-bit adaptive  
int-width vector sizing).

As always, if performance is your top priority then there is really  
no way around trying out various approaches.  You can expect order-of- 
magnitude differences, and I've dabbled in this columnar world long  
enough to know that there are lots and lots of surprises either way.   
YMWVAL: your mileage *will* vary a lot.

-jcw



More information about the Metakit mailing list