Hashed and blocked views

mapping views

Metakit 2.3/2.4 introduces a number of new storage options, collectively called "mapping views". The term comes from the fact that the underlying views, as stored on file, have a slightly different structure from the ones you will see while in use. There is a mapping between the two, which lets Metakit play a number of tricks.

Mapping views are special in that you set them up after opening a file, specifying the underlying views to use, and that you only access and make changes through the returned mapped view from then on. Direct changes to the underlying views will break the illusion and can easily make things inconsistent.

Hashed views work in combination with a second view to provide O(1) hashed access by key value. The data is - as before - stored in the original view. The second view is just used as a hash index and is completely managed by Metakit. The second view can be either persistent or transient (i.e. not stored on file).

To use a hashed view, you must apply the "hash" view mapping, which takes three input arguments: 1) the data view, 2) the secondary hash view, and 3) the number of properties involved in the key. The resulting view looks identical to the data view, but when looking up a key value, Metakit will detect the presence of the hash info and will use it to very quickly locate the row with the given key.

There are a number of rules to make this work:

The secondary view must have a fixed structure: exactly two int properties, called _H and _R, in that order. The size and contents of this view are managed by the hash view mapping.
The key is always taken as the first N properties of the data. N is usually 1, but it can be higher in case of "compound keys".

For the infinitely curious: the secondary view implements a hash table, with spare slots, so it will have more rows than the actual data view. The size of the secondary view is always a power of two.
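The idea behind the secondary view can be illustrated with a small toy model in plain Python. This is only a sketch and does not call Metakit: the class name `HashedView`, the linear probing, the growth threshold, and the choice to overwrite a row when its key already exists are all assumptions made for illustration. What it takes from the text above is that the data rows stay in their own view, that the index holds pairs corresponding to _H and _R, that the key is the first N properties, and that the index size is a power of two with spare slots.

```python
class HashedView:
    """Toy model of a hashed view (illustrative only, not Metakit's code)."""

    def __init__(self, num_keys=1):
        self.num_keys = num_keys   # N: how many leading properties form the key
        self.data = []             # stands in for the data view (rows as tuples)
        self.slots = [None] * 8    # stands in for the secondary view: (_H, _R) or None

    def _key(self, row):
        # The key is always the first N properties of the row.
        return tuple(row[:self.num_keys])

    def _probe(self, key):
        # Return the slot holding this key, or the first free slot found.
        h = hash(key)
        mask = len(self.slots) - 1          # works because the size is a power of two
        i = h & mask
        while self.slots[i] is not None:
            stored_hash, row_index = self.slots[i]
            if stored_hash == h and self._key(self.data[row_index]) == key:
                return i
            i = (i + 1) & mask              # linear probing (an assumption)
        return i

    def _grow(self):
        # Double the index and re-insert every (_H, _R) pair.
        old = [slot for slot in self.slots if slot is not None]
        self.slots = [None] * (len(self.slots) * 2)
        mask = len(self.slots) - 1
        for h, r in old:
            i = h & mask
            while self.slots[i] is not None:
                i = (i + 1) & mask
            self.slots[i] = (h, r)

    def append(self, row):
        key = self._key(row)
        i = self._probe(key)
        if self.slots[i] is not None:
            # Key already present: overwrite the row (assumed behaviour).
            self.data[self.slots[i][1]] = row
            return
        self.data.append(row)
        self.slots[i] = (hash(key), len(self.data) - 1)
        if len(self.data) * 2 > len(self.slots):
            self._grow()                    # keep at least half the slots spare

    def find(self, *key):
        # Return the row position for the key, or -1 if absent.
        i = self._probe(tuple(key))
        return -1 if self.slots[i] is None else self.slots[i][1]


names = HashedView(num_keys=2)              # compound key: (first, last)
names.append(("Ada", "Lovelace", 36, 5))
names.append(("Alan", "Turing", 41, 9))
print(names.find("Alan", "Turing"))         # position of the matching row
print(names.find("Grace", "Hopper"))        # -1, no such key
```

Because every slot stores the full hash next to the row position, a lookup can reject most non-matching slots without touching the data rows at all.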

Blocked views offer scalability, i.e. they support views which can contain far more rows without becoming slow when making changes. The trade-off is that plain positional access and iteration are somewhat less efficient. Blocked views are implemented as a view of smaller subviews, with all the blocking details fully managed by Metakit. The result looks, walks, and quacks like one huge view.

Unlike hashed views, blocked views need a slightly different definition of the view in which data is being stored. If you wanted a view with say 4 properties, you'd now have to redefine the view as a view of views, i.e. the description:

names[first:S,last:S,age:I,shoesize:I]

needs to be changed into:

names[_B[first:S,last:S,age:I,shoesize:I]]

To use a blocked view, you pass this new definition to the "blocked" view mapping, which takes a single argument, being the redefined view. The virtual result will look like the view you actually expect, but internally Metakit will fool around with how it stores things in subviews, moving rows around to keep all subviews reasonably balanced.
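The change to the description is purely mechanical, so it can be scripted. The helper below is a hypothetical convenience function, not part of Metakit; it only rewrites the description string by wrapping the property list in a _B subview, as in the example above.

```python
def to_blocked_description(description):
    """Wrap the properties of 'name[props]' in a _B subview: 'name[_B[props]]'.

    Hypothetical helper for illustration; it only manipulates the string.
    """
    name, bracket, rest = description.partition("[")
    if not name or not bracket or not rest.endswith("]"):
        raise ValueError("expected a description of the form name[prop:T,...]")
    properties = rest[:-1]                  # drop the closing bracket
    return "%s[_B[%s]]" % (name, properties)


print(to_blocked_description("names[first:S,last:S,age:I,shoesize:I]"))
# prints: names[_B[first:S,last:S,age:I,shoesize:I]]
```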

For the infinitely curious: blocked views implement a structure which is very similar to B-trees, but fixed to two levels. As a result, a virtual view of a million rows will actually be stored as roughly 1,000 subviews of 1,000 rows each (on average).
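A rough feel for the two-level layout can be had from the following toy model in plain Python. It is a sketch under stated assumptions and not Metakit's implementation: the class name `BlockedView`, the `max_block` limit, the split-in-half rule, and the linear scan over blocks are all chosen for brevity. It shows why an insert only has to shift rows inside one small block, while positional access first has to find the right block.

```python
class BlockedView:
    """Toy two-level view: a list of small blocks that acts like one long list."""

    def __init__(self, max_block=1000):
        self.max_block = max_block   # assumed limit on rows per block
        self.blocks = [[]]           # stands in for the _B subviews

    def __len__(self):
        return sum(len(block) for block in self.blocks)

    def _locate(self, index):
        # Map a virtual row position to (block number, offset inside the block).
        if index < 0:
            raise IndexError(index)
        for b, block in enumerate(self.blocks):
            if index < len(block):
                return b, index
            index -= len(block)
        raise IndexError("row position out of range")

    def __getitem__(self, index):
        b, offset = self._locate(index)
        return self.blocks[b][offset]

    def insert(self, index, row):
        if index == len(self):
            # Appending: the row goes at the end of the last block.
            b, offset = len(self.blocks) - 1, len(self.blocks[-1])
        else:
            b, offset = self._locate(index)
        block = self.blocks[b]
        block.insert(offset, row)            # only this one block is shifted
        if len(block) > self.max_block:
            # Split an over-full block in two to keep the blocks balanced.
            half = len(block) // 2
            self.blocks[b:b + 1] = [block[:half], block[half:]]


view = BlockedView(max_block=4)
for n in range(10):
    view.insert(len(view), n)                # append rows 0..9
view.insert(0, -1)                           # insert at the front
print([view[i] for i in range(len(view))])   # rows in virtual order
print([len(block) for block in view.blocks]) # how the rows are spread over blocks
```

The extra `_locate` step is the price mentioned earlier: positional access and iteration do a little more work, in exchange for inserts and deletes that stay cheap as the view grows.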

More details on blocked views are on a separate page.


	Copyright © 2020 Equi4 Software. Metakit is a trademark of Equi4 Software.