[Metakit] Python, Metakit, Sorting tracking data

Brian Kelley bkelley at biology.columbia.edu
Mon Jun 21 10:47:22 CEST 2004


Joel Lawhead wrote:

>
> So each time I add data to metakit I want to:
>
> 1. Throw out duplicates (done)
> 2. Get the latest timestamp (done)
> 3. Remove targets and their trails 30+ seconds older than the latest 
> timestamp. (not sure)

You can get this almost for free by keeping the view ordered on epoch.  
Epoch will have to come first in the table definition, since ordered(1) 
uses the first property as its key.

orderedTable = table.ordered(1)

Then add the data through orderedTable, so the rows stay sorted.  Now you 
can get the appropriate epochs with

lasttime = orderedTable[-1].epoch

# use locate to find the index of lasttime - 30; this is done using
# binary search
trailsMinus30index, count = orderedTable.locate(epoch=lasttime-30)
# now that we have the index, extract all rows from that index on,
# i.e. those with epoch >= lasttime - 30
trailsMinus30 = orderedTable[trailsMinus30index:]

This is fairly easy :)
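
If you don't have Metakit at hand, the same idea can be tried on a plain 
Python list.  This is only a sketch that uses the standard bisect module as 
a stand-in for locate(); the function name rows_within and the sample 
tuples are made up for illustration and are not part of Metakit.

```python
import bisect

def rows_within(rows, window=30):
    """rows: list of (epoch, id) tuples, already sorted by epoch.
    Return the rows whose epoch is >= latest epoch - window."""
    if not rows:
        return []
    lasttime = rows[-1][0]
    # binary search for the first row with epoch >= lasttime - window;
    # the 1-tuple (cutoff,) sorts before any (cutoff, id) tuple
    start = bisect.bisect_left(rows, (lasttime - window,))
    return rows[start:]

rows = sorted([(5, 1), (40, 2), (70, 1), (71, 3), (99, 2), (100, 0)])
recent = rows_within(rows)
print(recent)
```

As with the ordered view, this only works because the list is kept sorted 
on epoch; everything before the returned slice is what you would delete.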

> 4. Group the targets by id (no problem)
> 5. Access each target id by group and sort it by head (the latest 
> point) and trail (all the other track points). (not sure)

Any metakit view can be sorted, even the subview created by a groupby 
column.  I've made a little test script for you to play with.  It uses 
an in-memory storage, which is a great way to play with metakit data 
in the interpreter.

import metakit, random

st = metakit.storage()
vw = st.getas("test[epoch:I,id:I,date,latitude,longitude]")
vw = vw.ordered(1)

for i in range(100):
    r = int(100*random.random())
    id = int(4*random.random())
    vw.append((r,id))

# find the first row within 30 of the last epoch (binary search)
index, count = vw.locate(epoch=vw[-1].epoch-30)
trails = vw[index:]
# group by ids
trails = trails.groupby(trails.id, "data")

# sort the data for each id by epoch
metakit.dump(trails[0].data.sort(vw.epoch))
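
To get from a sorted group to head and trail, something along these lines 
should work.  It is a sketch in plain Python rather than Metakit calls; 
split_head_trail and the sample rows are hypothetical, and with Metakit you 
would walk the grouped subviews instead of a dictionary.

```python
from collections import defaultdict

def split_head_trail(rows):
    """rows: iterable of (epoch, id) tuples.
    Return {id: (head, trail)} where head is the latest epoch for that
    id and trail is the list of remaining epochs, newest first."""
    groups = defaultdict(list)
    for epoch, target_id in rows:
        groups[target_id].append(epoch)
    result = {}
    for target_id, epochs in groups.items():
        epochs.sort(reverse=True)          # newest point first
        result[target_id] = (epochs[0], epochs[1:])
    return result

tracks = split_head_trail(
    [(70, 1), (71, 3), (99, 2), (100, 0), (85, 1), (72, 2)])
for target_id in sorted(tracks):
    head, trail = tracks[target_id]
    print(target_id, head, trail)
```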



