Re: Anti-spambot hacks? (patch)

From: Bjarni R. Einarsson <bre_at_netverjar.is_at_hypermail-project.org>
Date: Sun, 7 Nov 1999 23:38:34 +0100
Message-ID: <19991107233834.A11019_at_diskordiah.localdomain>


On 1999-11-07, 16:19:23 (+0100), Bjarni R. Einarsson wrote:
>
> There are all sorts of examples of how this could be done out there,
> and I'm considering implementing some of the following here on my own
> copy of hypermail (I'll send in a patch when I'm done):

As promised, a patch is included with this message. It's against hypermail 2b25 (my CVS snapshot won't compile, and I didn't feel like figuring out why).

This patch modifies six files:

	print.c
	proto.h
	setup.c
	setup.h
	string.c
	struct.c

The main function, spamprotect, is added to string.c. It's a simple routine which replaces any occurrence of '@' with something else guaranteed to make the email address invalid (sometimes generating new valid ones like _@domain.com instead). There are quite a few replacements to choose from, and adding new ones is simply a matter of adding strings to the lists. An adventurous soul could make this a configuration variable, I guess.

The function performs either single-character substitutions or multiple-character substitutions, depending on whether it's told that it has a fixed width to work with or not (so I don't uglify signatures). Which substitution is chosen isn't really random; it's based on a nonsensical hash of the string being "spamprotected", which means that a given email address will always get mangled the same way. I consider this a feature, as it means the mangling has a lower chance of breaking groupings based on names or email addresses. It will probably break groupings based on subjects containing '@' though, because of "Re:" prefixes: the prefix changes the string being hashed, so a reply may get a different substitution than the original. (This might be a good enough reason to disable the Subject mangling in struct.c, see below.)
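
For readers who don't want to dig through the patch, here is a rough sketch of the idea described above. It is not the code from the patch: the replacement lists, the hash and the name spamprotect_sketch are made up for illustration, and it assumes the character being replaced is '@'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement lists; the lists in the actual patch differ. */
static const char *const multi_subs[] = { "_at_", " at ", "(at)", "[at]" };
static const char single_subs[] = { '_', '#', '*', '!' };

#define N_MULTI  (sizeof multi_subs / sizeof multi_subs[0])
#define N_SINGLE (sizeof single_subs / sizeof single_subs[0])

/* Arbitrary string hash; it only has to be stable for a given input. */
static unsigned long simple_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/*
 * Returns a newly allocated copy of `in` with every '@' replaced.
 * If fixed_width is nonzero, each '@' becomes a single character, so the
 * output has the same length as the input. Otherwise a multi-character
 * string is used. The choice depends only on the hash of `in`, so the
 * same input is always mangled the same way. The caller frees the result.
 */
char *spamprotect_sketch(const char *in, int fixed_width)
{
    assert(in != NULL);

    unsigned long h = simple_hash(in);
    const char *multi = multi_subs[h % N_MULTI];
    size_t multi_len = strlen(multi);
    char single = single_subs[h % N_SINGLE];

    /* Count '@' characters to size the output buffer. */
    size_t at_count = 0;
    for (const char *p = in; *p; p++)
        if (*p == '@')
            at_count++;

    size_t extra = fixed_width ? 0 : at_count * (multi_len - 1);
    char *out = malloc(strlen(in) + extra + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    for (const char *p = in; *p; p++) {
        if (*p != '@') {
            *q++ = *p;
        } else if (fixed_width) {
            *q++ = single;
        } else {
            memcpy(q, multi, multi_len);
            q += multi_len;
        }
    }
    *q = '\0';
    return out;
}
```

Because the substitution is picked from a hash of the whole input string, "Subject" and "Re: Subject" can end up with different substitutions, which is the grouping problem mentioned above.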

In print.c modifications are made to the printbody function to pass all the lines through spamprotect.

In setup.{h,c} a "hm_spamprotect" configuration variable is created, which can be toggled on or off to enable/disable this feature. Default is off.

In struct.c I pass the subject, name and email fields of all hashed messages through my spamprotect routine. I chose this point since the change propagates from there on out to all the different routines which print out these values, and also so that hypermail's internal logic would always be working with the same (mangled) values.
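
To illustrate why mangling at storage time is convenient, here is a small sketch. The struct, the function names and the trivial mangler are all hypothetical stand-ins (hypermail's real data structures are not shown in this message), and the flag only plays the role of the hm_spamprotect setting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the hm_spamprotect setting; off by default, as in the patch. */
static int spamprotect_on = 0;

/* Hypothetical record holding the three fields mentioned above. */
struct header_fields {
    char *subject;
    char *name;
    char *email;
};

/* Trivial stand-in for spamprotect: a copy with '@' turned into '_'. */
static char *mangle_copy(const char *s)
{
    assert(s != NULL);
    char *out = malloc(strlen(s) + 1);
    if (out == NULL)
        return NULL;
    strcpy(out, s);
    if (spamprotect_on)
        for (char *p = out; *p; p++)
            if (*p == '@')
                *p = '_';
    return out;
}

/*
 * Mangle once, when the record is stored. Everything that later prints
 * or compares these fields sees the same (mangled) values.
 * Returns 0 on success, -1 on allocation failure.
 */
int store_fields(struct header_fields *dst, const char *subject,
                 const char *name, const char *email)
{
    dst->subject = mangle_copy(subject);
    dst->name = mangle_copy(name);
    dst->email = mangle_copy(email);
    if (!dst->subject || !dst->name || !dst->email) {
        free(dst->subject);
        free(dst->name);
        free(dst->email);
        dst->subject = dst->name = dst->email = NULL;
        return -1;
    }
    return 0;
}
```

With the flag off the fields are stored unchanged, so the feature costs nothing for archives that don't enable it apart from the extra copies.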

This patch increases hypermail's memory usage a little bit, but I used the Push struct carefully so I don't think I introduced any new memory leaks.

I decided not to implement WPoison-style generation of fake email addresses, since it's relatively complicated to do this well and there are specialized tools which are already quite good at this. People who want this for their archives should use server-side parsed HTML and specialized tools, I think.

Enjoy, let me know what you think! :-)

-- 
Bjarni R. Einarsson                           PGP: 02764305, B7A3AB89
 bre_at_netverjar.is           -><-           http://www.mmedia.is/~bre/

Netverjar gegn ruslpósti: http://www.netverjar.is/baratta/ruslpostur/


Received on Mon 08 Nov 1999 12:41:51 AM GMT

This archive was generated by hypermail 2.3.0 : Sat 13 Mar 2010 03:46:11 AM GMT