[Metakit] newbie question - writing derived view back to db

Wolfgang Lipp paragate at gmx.net
Thu Jan 20 21:26:41 CET 2005


> BK> will get to in a moment.  You make a point about the speed of mk
> BK> versus the speed of raw python dictionary,
> I don't think it could ever get as good as operations on disk-based
> data structures being comparable in speed to memory-based data
> structures. It's unrealistic to expect that I think, unless you're
> some sort of Donald Knuth on steroids.

agreed. i was writing under the impression that all of
an mk storage always gets fully loaded into memory on
opening. i am happy to hear this is not the case. my
caching example was only possible in such a simple way
because i know my tables are not too big for memory. i
am thinking about functionality somewhere in my wrapping
class that manages such caching on demand.
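
a minimal sketch of what such on-demand caching could look like, assuming only that the wrapper can be handed some callable that yields rows as dictionaries. the names `LazyIndex`, `fetch_rows` and `key` are made up for illustration and are not part of metakit; the index is built on first lookup rather than when the storage is opened:

```python
class LazyIndex:
    """hypothetical helper: builds an in-memory lookup table on first use."""

    def __init__(self, fetch_rows, key):
        # fetch_rows: any callable returning an iterable of dict-like rows
        # (an assumption; with metakit it might wrap iteration over a view)
        self._fetch_rows = fetch_rows
        self._key = key
        self._index = None
        self.loads = 0  # how often the rows were actually fetched

    def get(self, value):
        # build the index only when somebody asks for a row
        if self._index is None:
            self._index = {row[self._key]: row for row in self._fetch_rows()}
            self.loads += 1
        return self._index.get(value)

    def invalidate(self):
        # call after writing to the table so the next get() reloads
        self._index = None
```

this only pays off under the same condition as before: the table has to fit into memory once it is touched.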

> view = storage.getas("test[_B[a:s,b:s,c:s]]").blocked()
> view.append(('1','2','3'))

> 600000, time: 21.30, delta: 2.78
> Values written, now syncing, time: 22.83
> After syncing: 23.08
> end.

i tried this myself. the results so far are slightly
puzzling to me. here is a short test report:

============================================================

table creation strings for blocked and unblocked views:

     node[_B[name:S,comment:S,termnr:I]]
     node[name:S,comment:S,termnr:I]

data: several thousand rows with nearly identical
content. all data was produced prior to each test run and
kept in memory throughout. for mode t1, a list of
tuples was produced; for mode t0, a list of
dictionaries::

     [
         ('*0*', 'QfoY', 88),
         ('*1*', 'dn', 430),
         ('*2*', 'CJnnTZTLD', 502),
         ... ]

     [
         {'termnr': 88, 'comment': 'QfoY', 'name': '*0*'},
         {'termnr': 430, 'comment': 'dn', 'name': '*1*'},
         {'termnr': 502, 'comment': 'CJnnTZTLD', 'name': '*2*'},
         ... ]



core code::

     stopwatch.start( _inter( '$ax $bx $ex $tx' ) )
     if useAppend:
         if useTuples:
             for entry in ENTRYTUPLES:
                 targetTable.append( entry )
         else:
             for entry in ENTRYDICTS:
                 targetTable.append( **entry )
     else:
         if useTuples:
             targetTable[ 0 : ROWCOUNT ] = ENTRYTUPLES
         else:
             targetTable[ 0 : ROWCOUNT ] = ENTRYDICTS
     stopwatch.stop()
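
the `stopwatch` object used above is not shown here. a minimal stand-in could look like the following sketch; it assumes nothing beyond a `start( label )` / `stop()` pair that accumulates wall-clock seconds per label, and the class name and the `timings` attribute are made up for illustration:

```python
import time


class Stopwatch:
    """hypothetical stand-in for the stopwatch used in the core code."""

    def __init__(self):
        self.timings = {}     # label -> accumulated seconds
        self._label = None
        self._started = None

    def start(self, label):
        self._label = label
        self._started = time.perf_counter()

    def stop(self):
        # add the elapsed wall-clock time to the current label
        elapsed = time.perf_counter() - self._started
        self.timings[self._label] = self.timings.get(self._label, 0.0) + elapsed
        self._label = None
        return elapsed
```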

the test runs created 16 data storages with identical sizes of about 2MB each.

results::

     test run 2, 100'000 rows

     TOTAL      :     749.9780
     a0 b0 x0 t0:      10.1340       ****
     a1 b1 x1 t1:      10.3540       *****
     a0 b0 x0 t1:      11.0860       *****
     a1 b0 x0 t0:      11.7670       *****
     a1 b1 x0 t1:      14.6310       ******
     a0 b0 x1 t1:      22.5230       **********
     a0 b0 x1 t0:      27.2790       ************
     a1 b1 x1 t0:      28.9120       *************
     a1 b0 x1 t0:      41.5600       ******************
     a0 b1 x1 t1:      60.7580       ***************************
     a1 b0 x0 t1:      63.1210       ****************************
     a0 b1 x1 t0:      65.4740       *****************************
     a1 b0 x1 t1:      67.9580       ******************************
     a1 b1 x0 t0:      78.3630       **********************************
     a0 b1 x0 t1:     109.0270       ************************************************
     a0 b1 x0 t0:     114.1740       **************************************************


     test run 2, 50'000 rows

     TOTAL      :     187.3690
     a1 b1 x1 t1:       3.7960       ********
     a0 b0 x0 t1:       4.0160       ********
     a1 b1 x0 t1:       4.4560       *********
     a0 b0 x0 t0:       4.9470       **********
     a1 b0 x0 t0:       4.9770       **********
     a0 b0 x1 t1:       5.3680       ***********
     a0 b0 x1 t0:       6.6990       **************
     a1 b1 x1 t0:       9.2140       *******************
     a1 b0 x1 t0:      11.0960       ***********************
     a1 b0 x0 t1:      13.3400       ****************************
     a1 b1 x0 t0:      14.8110       *******************************
     a1 b0 x1 t1:      16.5630       ***********************************
     a0 b1 x1 t1:      16.6240       ***********************************
     a0 b1 x1 t0:      17.5650       *************************************
     a0 b1 x0 t1:      23.6340       *************************************************
     a0 b1 x0 t0:      23.9150       **************************************************

     a0  --  use slice assignment (see code)
     a1  --  use append with loop (see code)

     b0  --  do not use blocked view
     b1  --  use blocked view

     x0  --  use normal commit mode
     x1  --  use extend commit mode

     t1  --  use tuples (see code)
     t0  --  use dictionaries (see code)

============================================================

there are huge differences in the timings, but i find
myself unable to distill any kind of clear policy for
using metakit from them -- all of the 0s and 1s seem to
be scattered all over the plot for all four options. i
would have expected that slice assignment from a list of
tuples on a blocked view in a storage opened using
extend-commit (a0 b1 x1 t1) would be fastest. even if we
concede that the top-runners in both cases somehow
corroborate that expectation, the rest of the ranking
does not. furthermore, the results do not seem to allow
the interpretation that these factors act together in a
synergistic way. even if we say that factor b (blocked
views) does not kick in here because even 100'000 rows
are not enough, the other factors still do not appear
to act together.
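
one rough way to check whether any single factor helps on its own is to average the timings per factor level. the sketch below does this for the 100'000-row run, with the numbers copied from the table above; the function name `factor_means` is made up. on these numbers the b0 mean comes out near 32 and the b1 mean near 60, but such averages say nothing about how the factors interact:

```python
from collections import defaultdict

# timings of the 100'000-row run, copied from the report
RUN_100K = {
    'a0 b0 x0 t0': 10.1340, 'a1 b1 x1 t1': 10.3540,
    'a0 b0 x0 t1': 11.0860, 'a1 b0 x0 t0': 11.7670,
    'a1 b1 x0 t1': 14.6310, 'a0 b0 x1 t1': 22.5230,
    'a0 b0 x1 t0': 27.2790, 'a1 b1 x1 t0': 28.9120,
    'a1 b0 x1 t0': 41.5600, 'a0 b1 x1 t1': 60.7580,
    'a1 b0 x0 t1': 63.1210, 'a0 b1 x1 t0': 65.4740,
    'a1 b0 x1 t1': 67.9580, 'a1 b1 x0 t0': 78.3630,
    'a0 b1 x0 t1': 109.0270, 'a0 b1 x0 t0': 114.1740,
}


def factor_means(results):
    """mean timing per factor level, e.g. 'b0' vs 'b1'."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, seconds in results.items():
        # each label such as 'a0 b1 x1 t1' names four factor levels
        for level in label.split():
            sums[level] += seconds
            counts[level] += 1
    return {level: sums[level] / counts[level] for level in sums}


for level, mean in sorted(factor_means(RUN_100K).items()):
    print('%s  %8.3f' % (level, mean))
```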

the only three interpretations i have to offer right now are:

1)  the testing code contains some grave blunder that mars
     the results;

2)  what is missing here are many test runs in randomly
     shuffled order -- perhaps the order in which the
     storages were produced is important (i cannot see
     how, but i'll try);

3)  the results are correct and metakit's behavior *is*
     not very predictable.
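
to follow up on interpretation 2, a sketch of how repeated, randomly shuffled runs could be organized. `run_config` is a hypothetical callable that would create a fresh storage and fill it according to the given settings; it is not shown here and nothing in it is taken from metakit. the median per configuration is used so that a single slow run does not dominate:

```python
import random
import statistics
import time
from itertools import product


def all_configs():
    # the 16 combinations of the four binary factors a, b, x, t
    return [dict(zip('abxt', bits)) for bits in product((0, 1), repeat=4)]


def label(cfg):
    return ' '.join('%s%d' % (k, cfg[k]) for k in 'abxt')


def shuffled_trials(run_config, repeats=5, seed=None):
    """run every configuration `repeats` times, in a fresh random order
    each round, and return the median wall-clock time per configuration."""
    rng = random.Random(seed)
    samples = {label(cfg): [] for cfg in all_configs()}
    for _ in range(repeats):
        order = all_configs()
        rng.shuffle(order)
        for cfg in order:
            started = time.perf_counter()
            run_config(cfg)
            samples[label(cfg)].append(time.perf_counter() - started)
    return {name: statistics.median(times) for name, times in samples.items()}
```

if the ranking stays the same across seeds, the order of storage creation is probably not the culprit.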

perhaps someone would be eager to falsify at least the
last hypothesis.


_wolf







