Weekly Report for week ending 13 May 2011




Got the state machine generation running across the ISP traces, fixing a
few bugs that the new dataset exposed along the way. Took the machine
generated from the ISP data and ran it against the older data, which has
known spam status, to see how the two compared (quite similar). As before,
it is quite clear what is spam once the mail server has rejected it, but
the distinction is much less clear prior to that point.

Started work on reading the machine back in from the output dot graph
files, so that a pre-built machine can be run against any object trace
without having to rebuild the machine every time.
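
A minimal sketch of the read-back idea, in Python for brevity. It assumes
the dot output has one edge per line of the form `A -> B [label="event"];`,
which may not match the real files. The state names, event labels and the
function names load_machine and run_trace are all hypothetical, made up for
illustration only.

```python
import re
from collections import defaultdict

# Matches one edge per line, in the ASSUMED form:  A -> B [label="event"];
# Quoted node names ("A" -> "B") are also accepted.
EDGE_RE = re.compile(
    r'^\s*"?(?P<src>\w+)"?\s*->\s*"?(?P<dst>\w+)"?'
    r'\s*\[.*?label\s*=\s*"(?P<label>[^"]*)".*?\]\s*;?\s*$'
)


def load_machine(dot_text):
    """Build {state: {event: next_state}} from the edge lines of a dot graph."""
    transitions = defaultdict(dict)
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            transitions[match.group("src")][match.group("label")] = match.group("dst")
    return dict(transitions)


def run_trace(machine, start, events):
    """Follow events from start and return the list of states visited.

    Stops at the first event that has no transition from the current state.
    """
    state = start
    path = [state]
    for event in events:
        next_state = machine.get(state, {}).get(event)
        if next_state is None:
            break
        state = next_state
        path.append(state)
    return path


# Hypothetical example graph; real state and event names will differ.
EXAMPLE_DOT = """
digraph machine {
    START -> HELO [label="helo"];
    HELO -> MAIL [label="mail_from"];
    MAIL -> RCPT [label="rcpt_to"];
    RCPT -> REJECTED [label="reject"];
}
"""

if __name__ == "__main__":
    machine = load_machine(EXAMPLE_DOT)
    print(run_trace(machine, "START", ["helo", "mail_from", "rcpt_to", "reject"]))
```

The machine is parsed once and can then be reused for any number of traces,
which is the point of reading it back in rather than regenerating it. A
line-based regex is only good enough for simple, machine-written dot files;
anything with multi-line attributes would need a proper dot parser.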

Spent some time working on documentation about embedding R in C code, in
response to an email query I received. I've been tinkering with this off
and on for a while and should blog about it once it's more complete.