Archive for the ‘Work’ Category

Yet More Honeypot Goodness

My second success for a donated MX entry was back in early July (the first was February 1st). That was about a year for the first one, and then five months for the second. Yesterday, I had a third. While this seems like a big speed-up, this was for a different domain, so it is really that domain’s first success and not the third. Even so, that’s about a month and a half for the first success for this donated MX, compared to a year for the original donation.

Again, please consider helping out if you can.

SpamBayes 1.1a3 is out!

Thanks mostly to Skip Montanaro, SpamBayes 1.1a3 is now available, with interesting new tokenization to try.

py2exe & Grisoft’s AVG

As of today, Grisoft’s AVG anti-virus software is reporting any Windows (non-console) application built using py2exe (maybe just with Python 2.4?) as a virus.

If AVG is suddenly reporting an application you have used for a long time as a virus, don’t believe it. Unfortunately, unless you disabled AVG, it will probably have already deleted (moved to the ‘vault’) the application.

I (along with many others) have reported this to AVG, but please do so as well. They need to learn to be more careful when detecting real viruses.

eWido’s anti-virus/anti-spyware software had the same problem a couple of weeks ago (by the 24th of July they had corrected the error). I had hoped for better from AVG.

(This is all the fault of the Backdoor.Rajump Trojan.)

Corrections to JGC’s Spam and Anti-Spam Newsletter #37.

In John Graham-Cumming’s Spam and Anti-Spam Newsletter #37, John printed comments from Gordon Cormack regarding the TREC 2005 Spam Track submissions:

Several submitted by participants have their heritage in a filter product, but are experimental in the sense I mentioned above, and not available for download or purchase: […] tamSPAM (spambayes as configured by Tony Meyer)

However, this is not correct. I emailed John the comments below. At first it seemed he didn’t feel they were worthy of including in later issues (which didn’t bother me; it’s a minor enough thing), but it turns out they got lost in his mail. They’ll be in a later issue, but for the record, I’ll reproduce them here.

Another Honeypot Success

Back on February 1st, I received notification that an MX entry that I donated to Project Honeypot helped identify a previously unknown email harvester. Last Thursday, I received notification of another success.

Stop Spam Harvesters, Join Project Honey Pot

This is an interesting (although statistically insignificant) speedup (about a year for the first, and about five months for the second). I’ve donated MX entries from two other domains since then, so it’ll be interesting to see what the rate is for those (which certainly get hardly any spam compared to the original donation).

If you have the ability to, it would be great if you donated, too.

Distributed manual verification of a corpus

John Graham-Cumming (of POPFile, among other things) has set up a site for manual verification of the 2005 TREC Spam Track corpus. The idea is that as many people as possible go to the site and manually classify the messages that are presented as ham or spam.

The TREC corpus was primarily classified automatically, so it's possible that there are errors in the corpus.  It's an interesting experiment, and I look forward to reading papers about the results (and possibly using a more correct corpus). It's a shame that the TREC corpus is the best one available, since the mail is pretty old, and it's a weird collection of (Enron, I believe) mail from different people.  It will be particularly interesting to see if there are messages that many people disagree on – some messages are particularly hard to classify, since you don't know what the interests/subscriptions of the original recipient were.
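To make the "messages that many people disagree on" idea concrete, here is a minimal sketch of how a pile of manual classifications could be tallied. It assumes votes arrive as (message id, label) pairs; the function name `summarise_votes` and the 25% threshold are hypothetical, and none of this is taken from how the site itself works.

```python
from collections import Counter


def summarise_votes(votes, disagreement_threshold=0.25):
    """Tally manual classifications per message.

    votes: iterable of (message_id, label) pairs, where label is
    'ham' or 'spam' (a hypothetical input format).
    A message is marked as contested when the share of votes that
    disagree with the majority reaches disagreement_threshold.
    """
    by_message = {}
    for message_id, label in votes:
        by_message.setdefault(message_id, Counter())[label] += 1

    summary = {}
    for message_id, counts in by_message.items():
        total = sum(counts.values())
        # On a tie, most_common() returns the label that was seen first.
        label, top = counts.most_common(1)[0]
        minority_share = (total - top) / total
        summary[message_id] = {
            "label": label,
            "votes": total,
            "contested": minority_share >= disagreement_threshold,
        }
    return summary
```

Messages that come out as contested are the ones worth a second look, since they may be the hard cases where the right answer depends on what the original recipient subscribed to.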

The site itself is particularly well done, IMO. Not only do you get the raw email, but you are presented with a screenshot showing you what the message looks like in a typical mail client. This is a great idea.

I encourage everyone to go to the site and classify at least a few emails.  It doesn't take much time, and it's a great contribution. 

Honeypot Success

Almost exactly a year after I donated an MX entry to Project Honeypot, it was used to catch a previously unknown harvester (Project Honeypot sends an email out to let you know that this has happened). The MX is public, so it could have been harvested from my site (not this one, or the Massey one, or the ihug one) or from anyone else’s that participates in Project Honeypot.

It’s interesting that it took a year (and two days) for this to happen. Does that mean that Project Honeypot has a really large number of MXs compared to the number of new harvesters arriving? I like that theory more than the one that suggests that there are so few harvesters caught that it takes this long. Of course, it could just have been a fluke, and maybe other people’s MXs are successful more quickly.

The number of harvesters for the site seems pretty low, especially considering the amount of spam that must (because the addresses aren’t anywhere else) come from harvesting it. Perhaps I should adjust where the honeypot links are, to try and make them more appealing.

If you’re not already part of Project Honeypot, and you have a website (on which you can use custom CGI scripts), I strongly encourage you to join. You don’t have to donate an MX entry if you don’t have the ability to do that.


(Moved over from my Massey site. Future entries will be separate, but in the Tools category.)

These are the tools that I regularly use, and which I would obviously recommend:

After Hours Sign-in Books

A recent Massey Albany announcement:

As part of our Health & Safety requirement it has been necessary to provide after hour sign in books in all multi-storey buildings on campus. These have now been put in place and can be found either by the after hour entry doors or in the case of the Quad A and Atrium buildings close to the lift. Please make yourself familiar with the location of the books and ensure that all staff and visitors entering and leaving the building after hours sign the book.

Remember they are in place for your safety and in the unlikely event of an emergency they will assist security or emergency services in determining who is in the building.

How silly is this?

Visiting Australia (Passport Requirements)

I’m visiting Australia (just for the day) next week, and realised last night that I have foolishly let my passport reach the stage where it expires in just over two months. I know that many countries require the expiry to be a certain amount of time after arriving (e.g. six months) and wondered if I would have to speedily renew my passport (an extra $75 for the speed, and taking three days plus travel to and from the passports office, which would cut it close).

I googled for information about this for ages, and tried the sensible-seeming sites (Australian tourism, New Zealand and Australian government departments that look after passports and immigration, travel websites). Nothing helped.

Eventually my wife rang the New Zealand Department of Internal Affairs (who do the passports), who said that in their opinion (matching mine) the passport is valid until it expires. They suggested ringing the Australian consulate. The consulate couldn’t do anything but play recorded messages, but one of those messages suggested a website to look at.

The website does have the information, but what a stupid choice of URL: how could I possibly be expected to guess that? And why did Google not find this? (They have a PageRank of 0, which explains much, so someone needs to do a better job of promoting the site!)

The answer, anyway, is that (for a New Zealander at least) you only need a valid passport for the duration of the stay (although obviously the length of the visa allocated will be affected). So I can just renew my passport when I get back, and wait the 10 days, and save $75.