Equi4 Software: Metakit

commit modes

This page describes the different commit modes available in Metakit from release 2.3 onwards. This information was copied from the Metakit Wiki.

The second argument to c4_Storage::c4_Storage is an integer that selects one of four open modes:

  • -1 = open read-only, original contents
  • 0 = open read-only, most recently committed contents
  • 1 = open in exclusive read-write mode
  • 2 = open in "commit-extend" mode

The first and last modes are new. Commit-extend is a new feature, which only writes at the end of the datafile. This allows concurrent access (N readers + 1 extender), though at some point, the file will need to be opened in the "normal" exclusive r/w mode (1), with a commit to force re-use of data space in the file.

The "open read-only, original contents" mode (-1) only differs from normal read-only mode (0) for files on which commit-extend has been applied. In that case, mode -1 will present the last view of the file before commit-extends were applied.
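As a quick reference, the four open modes can be given names. The sketch below is not part of the Metakit API: the enum OpenMode and both helper functions are hypothetical and exist only for illustration. Only the numeric values come from the list above.

```cpp
// Hypothetical names for the four integer open modes described above.
// Metakit itself takes a plain int; this enum is for illustration only.
enum OpenMode : int {
    kReadOnlyOriginal = -1,  // read-only, contents before any commit-extends
    kReadOnly         = 0,   // read-only, most recently committed contents
    kReadWrite        = 1,   // exclusive read-write
    kCommitExtend     = 2    // changes are only appended to the datafile
};

// True for the modes that allow changes to the datafile (1 and 2).
constexpr bool AllowsChanges(int mode) {
    return mode == kReadWrite || mode == kCommitExtend;
}

// True for the modes that can coexist with other readers of the same file:
// both read-only modes, plus the single commit-extend writer.
constexpr bool AllowsConcurrentReaders(int mode) {
    return mode != kReadWrite;
}

static_assert(AllowsChanges(kCommitExtend), "mode 2 allows changes");
static_assert(!AllowsConcurrentReaders(kReadWrite), "mode 1 is exclusive");
```

With the real library, the mode is simply passed as the second constructor argument, for example c4_Storage storage("data.mk", 2), where the file name is made up for this example.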

Commit choices

The normal commit requires the same exclusive access mode as before. This call is only possible for files opened in mode 1, and therefore can only be done when no other readers exist.

The commit-extend mode requires open mode 1 or 2, i.e. a mode which allows changes. The main value of this mode is that it allows concurrent reading, and does so without any locking requirement or possible contention. Only one "extender" can have the file open at any point in time, but with the proper synchronization (outside MetaKit), different processes could alternate in obtaining that "extender" access and then relinquishing it again.

Finally, there is a new commit-aside mechanism, which works in any file mode. This is possible because the commit saves its changes in a secondary MetaKit datafile. That second datafile must be open in mode 1 or 2, and can itself therefore use either normal commits or commit-extends. In fact, the secondary datafile could even be read-only, if it has itself been set up to work in commit-aside mode. In other words, commit-aside can be stacked. See below for more details.
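The rules in this section can be condensed into one small table. The sketch below encodes them as a function. CommitKind and CommitAllowed are hypothetical names, not Metakit identifiers, and the mode argument refers to the main datafile only (the aside file has its own requirements, described above).

```cpp
// Illustration only: which kind of commit the text above allows for a
// main datafile opened in a given mode (-1, 0, 1 or 2).
enum class CommitKind { Normal, Extend, Aside };

constexpr bool CommitAllowed(CommitKind kind, int mode) {
    return kind == CommitKind::Normal ? mode == 1                  // exclusive r/w only
         : kind == CommitKind::Extend ? (mode == 1 || mode == 2)   // any writable mode
         : (mode >= -1 && mode <= 2);                              // aside: any open mode
}

static_assert(!CommitAllowed(CommitKind::Normal, 2),
              "a normal commit is not possible in commit-extend mode");
static_assert(CommitAllowed(CommitKind::Aside, 0),
              "commit-aside works even on a read-only main file");
```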

Commit-extend

Commit-extend consumes disk space, because every commit stores its changes at the end of the datafile. Because a change to any part of a column causes the entire column to be saved again, disk usage can grow rapidly when frequent commits alter large datasets.

In a way, commit-extend is like an ordinary commit, but one which never re-uses free space inside the datafile. It is relatively fast, because free space is in fact not even tracked in commit-extend mode.

The way to reclaim free space is to re-open the datafile in exclusive mode (1), and then commit normally. There are a number of ways to actually bring the file back to its minimal size (this won't happen right away, since space will only become re-usable after the commit completes successfully).
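To get a feel for the growth, here is a rough estimate. It assumes, as a simplification, that every commit-extend rewrites the same set of columns in full and that file header overhead is negligible. The function name and the figures are made up for this example.

```cpp
#include <cstdint>

// Rough estimate of datafile size after a number of commit-extends,
// assuming each commit appends `changedColumnBytes` of rewritten columns
// and nothing is ever re-used. This is arithmetic, not Metakit code.
constexpr std::uint64_t SizeAfterExtends(std::uint64_t initialBytes,
                                         std::uint64_t changedColumnBytes,
                                         std::uint64_t commits) {
    return initialBytes + changedColumnBytes * commits;
}

// A 10 MB datafile in which each commit touches 4 MB worth of columns
// grows to 410 MB after 100 commit-extends.
static_assert(SizeAfterExtends(10000000ULL, 4000000ULL, 100ULL) == 410000000ULL,
              "10 MB + 100 * 4 MB = 410 MB");
```

Under these assumptions the growth depends on the size of the columns touched, not on how many rows actually changed, which is why periodic clean-up in exclusive mode matters.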

Commit-aside

Commit-aside uses a secondary "aside" file, which is a regular MetaKit datafile, but one whose contents are managed exclusively by MetaKit.

File changes saved with commit-aside require much less space than with commit-extend, because only modified sections of columns are saved. This means that commit-aside performance is far more proportional to the number of changes than to the size of the entire dataset (normal commit speed is highly related to total column sizes).

To use commit-aside, both datafiles must be opened. The main datafile can be opened in any mode, since it will not be altered. The secondary "aside" file must be writable. In commit-aside mode, the secondary file is then associated with the main file (using c4_Storage::SetAside). From then on, commits on the main file cause differences to be written to the aside file, and a "real" commit on the aside file to be issued.

While in "aside mode", four different commit/rollback calls can be issued:

  • commit fast: this saves changes to the aside file, and commits the aside file
  • commit full: this folds all aside changes back into the main file, commits the main file, and clears the aside changes (this mode requires write access to the main file)
  • rollback fast: revert the state to the last commit (aside or normal, whichever was last)
  • rollback full: this reverts to the last full commit, and has as side effect that the aside mode is turned off (the aside file is no longer special after this)
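The four calls listed above can be pictured as transitions on a small state machine. The following toy model is not Metakit code: each "file" is reduced to a single int, all names are hypothetical, and the handling of pending aside changes on a full rollback is an interpretation of the description rather than documented behaviour.

```cpp
// Toy model of the four calls available in aside mode.
struct ToyAsideSession {
    int  mainValue;     // state stored in the main datafile (last full commit)
    int  asideValue;    // state stored in the aside file (last fast commit)
    bool asidePending;  // true if the aside file holds changes not yet in main
    bool asideMode;     // true while the aside file is attached
    int  working;       // uncommitted in-memory state

    explicit ToyAsideSession(int initial)
        : mainValue(initial), asideValue(initial),
          asidePending(false), asideMode(true), working(initial) {}

    // commit fast: save changes to the aside file only.
    void CommitFast() { asideValue = working; asidePending = true; }

    // commit full: fold everything into the main file and clear the aside
    // changes (in reality this needs write access to the main file).
    void CommitFull() { mainValue = working; asidePending = false; }

    // rollback fast: back to the last commit, aside or normal.
    void RollbackFast() { working = asidePending ? asideValue : mainValue; }

    // rollback full: back to the last full commit; aside mode is turned off.
    void RollbackFull() {
        working = mainValue;
        asidePending = false;  // assumption: pending aside changes are dropped
        asideMode = false;
    }
};
```

In the C++ API these choices appear to map onto a boolean "full" argument of c4_Storage::Commit and c4_Storage::Rollback. Treat that as an assumption and check mk4.h for the release in use.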

Commit-aside datafiles are useless without the original datafile. In fact, they become invalid the moment that original datafile is modified. To detect this, MetaKit stores a monotonically increasing generation number in each datafile, and stores the current value in the commit-aside file. There is one situation in which changes are actually meant to invalidate the aside file: when a full commit is done.
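The generation-number check can be sketched as follows. This is a guess at the logic implied by the paragraph above, not the actual implementation: the struct and function names are hypothetical, and the text does not say how or where the numbers are stored.

```cpp
// Toy stand-ins for the two files; only the generation numbers are modelled.
struct ToyDatafile  { unsigned long generation; };      // bumped on every change
struct ToyAsideFile { unsigned long baseGeneration; };  // copied from the main file

// Attaching an aside file records the main file's current generation.
inline ToyAsideFile AttachAside(const ToyDatafile& mainFile) {
    return ToyAsideFile{mainFile.generation};
}

// Any commit that modifies the main file increases its generation.
inline void CommitToMain(ToyDatafile& mainFile) { ++mainFile.generation; }

// The aside file is only usable while the main file is unchanged.
inline bool AsideIsValid(const ToyDatafile& mainFile, const ToyAsideFile& aside) {
    return aside.baseGeneration == mainFile.generation;
}
```

In this model a full commit calls CommitToMain, so the old aside contents are invalidated on purpose, which matches the one intended case mentioned above.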

Combined modes

The features described above open a range of options for dealing with multi-user / multi-tasking scenarios, even though they do not yet handle all possible cases.

First of all, commit-extend offers extreme performance in the case when lots of readers require access to a consistent state, but only few changes are made. Readers will see the state at time of opening the datafile (they can close and re-open any time to see new committed changes). This mode is efficient because readers never wait, not even while the extender is committing changes.
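The snapshot behaviour seen by readers can be illustrated with an append-only history. This is a conceptual model only, with hypothetical names. It assumes the file already holds at least one committed state when a reader opens it.

```cpp
#include <cstddef>
#include <vector>

// Append-only history of committed states; each entry stands for one commit.
struct ToyExtendFile {
    std::vector<int> committed;
    void CommitExtend(int state) { committed.push_back(state); }
};

// A reader remembers how many commits existed when it opened the file and
// keeps seeing that state until it re-opens.
class ToyReader {
public:
    explicit ToyReader(const ToyExtendFile& file)
        : file_(&file), seen_(file.committed.size()) {}

    // State as of the last open; requires at least one commit at that time.
    int State() const { return file_->committed[seen_ - 1]; }

    // Close and re-open: pick up whatever has been committed since.
    void Reopen() { seen_ = file_->committed.size(); }

private:
    const ToyExtendFile* file_;
    std::size_t seen_;
};
```

Because the extender only appends, nothing a reader has already seen is ever overwritten, which is why readers need no locks in this mode.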

The drawback of commit extend is the potentially large disk space consumption, and hence the need to reclaim and clean up periodically. This requires exclusive access.

Commit-aside can be very effective for single-task situations with lots of commits, since commit performance is relatively high, and disk space consumption is limited. But again, at some point, changes will need to be consolidated, and again exclusive access mode is needed to fold all changes back into the main datafile and to clear the aside file.

Finally, an interesting option is to use commit-aside with a read-only main datafile, and commit-extend for the secondary aside file. Since only changes are saved, the aside file will not grow that quickly, and since both files support concurrency, multiple readers can get at the latest consistent state without delay. The only contention arises when there are multiple writers: these will have to find a way to take turns in making changes.

Yet more scenarios will become available later, once the main datafile can also be used as the aside file, i.e. modes can be combined.

status

The 2.4.9 implementation still has limitations which reduce the usefulness of the new commit modes:

  • commit extend and commit aside cannot yet be combined into one file
  • commit aside does not write minimal diffs but full columns, so commit aside files are not nearly as small as they could become once this is implemented
