December 09, 2004
spamurl.pl

In the spirit of the (now dead) makelovenotspam project, i wrote a little perl script that visits and gently spiders the urls in any spam-tagged email i receive, in an effort to drive up the cost of business for spammers and those who fund them. here's the file: spamurl.pl. use it at your own risk, though i've been running it for a day and haven't seen any issues. also, my perl is a bit rusty... comments appreciated.

so here's the theory: say a spammer pays a webhost $50/month for 6gb of bandwidth to sell viagra. if 0.01% of the visitors within that 6gb of traffic buy something, the spammer makes a profit. what i'm doing is a) taking up a little of their allocated bandwidth and b) not buying anything. really, it's stupid to advertise your dumb website through spam... stop it, please.
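to put rough numbers on that theory, here is a minimal sketch of the arithmetic. the $50 and 6gb figures are from above; the page size, the pages fetched per url and the spam volume are pure assumptions picked for illustration, and it pretends every spam points at the same host, which real spam won't. `bandwidth_share` is a made-up helper name, not part of spamurl.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# figures from the post
my $monthly_cost = 50;   # dollars per month
my $quota_gb     = 6;    # bandwidth bought for that money

# assumptions, NOT from the post
my $page_kb       = 40;  # hypothetical size of one fetched page
my $pages_per_url = 5;   # hypothetical: landing page plus a few links
my $spam_per_day  = 100; # hypothetical spam count, all hitting one host

# fraction of the monthly quota eaten by the spider
sub bandwidth_share {
    my ($spam_per_day, $pages_per_url, $page_kb, $quota_gb) = @_;
    my $kb_per_month = $spam_per_day * 30 * $pages_per_url * $page_kb;
    my $quota_kb     = $quota_gb * 1024 * 1024;
    return $kb_per_month / $quota_kb;
}

my $share = bandwidth_share($spam_per_day, $pages_per_url, $page_kb, $quota_gb);
printf "share of quota used: %.1f%%\n", 100 * $share;
printf "hosting money wasted: \$%.2f of \$%d\n", $share * $monthly_cost, $monthly_cost;
```

one copy of the script only nibbles at the quota; the share grows linearly with the number of people running it, which was the idea behind makelovenotspam too.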

yes i've linked directly to the file, but i'll paste it here just because:

#!/usr/bin/perl

## spamurl.pl v1.2 by skp Dec 09 2004
## parses incoming mail through stdin and
## visits any urls it finds, dumping any
## returned data to /dev/null
##
## usage: put the following in ~/.procmailrc after spamassassin
## :0 HBc
## * ^X-Spam-Status: Yes
## | /usr/local/bin/spamurl.pl
##
## in the spirit of the makelovenotspam project, i visit and
## slightly spider the urls in any spam tagged email in an
## effort to drive up the cost of business for spammers and
## those who fund them. -skp
##

use URI::Find::Rule;
use WWW::Curl::easy;

while (<STDIN>) {
    $_ =~ s/[\015\012\032\r\n]//g;
    $_ =~ s/\s+$//;
    my($line) = $_;
    chomp($line);

    # each result is a pair: [ original text, URI object ]
    my @urls = URI::Find::Rule->scheme('http')->in($line);

    for my $url (@urls) {
        $url = $url->[1];

        # the write callback appends every chunk of the response to $body
        local $body = "";
        sub chunk { my ($data,$pointer)=@_; ${$pointer}.=$data; return length($data) }

        $url =~ s/[\<\>]//g; # remove < and > from urls
        $url =~ s/[\?\%\&\(\)\'\`\"\;].*//g; # cut the url at the first ? % & ( ) ' ` " or ;

        my $curl= WWW::Curl::easy->new() or die "curl init failed\n";
        open(DEVNULL,">>/dev/null") or die;
        $curl->setopt(CURLOPT_STDERR,*DEVNULL);
        $curl->setopt(CURLOPT_WRITEHEADER,*DEVNULL);
        $curl->setopt(CURLOPT_WRITEDATA,*DEVNULL);
        $curl->setopt(CURLOPT_ERRORBUFFER,*DEVNULL);

        $curl->setopt(CURLOPT_WRITEFUNCTION,\&chunk);
        $curl->setopt(CURLOPT_FILE,\$body);

        $curl->setopt(CURLOPT_NOSIGNAL,"1");
        $curl->setopt(CURLOPT_FRESH_CONNECT,"1");
        $curl->setopt(CURLOPT_FORBID_REUSE,"1");
        $curl->setopt(CURLOPT_CONNECTTIMEOUT,"6");
        $curl->setopt(CURLOPT_MAXCONNECTS,"18");
        $curl->setopt(CURLOPT_TIMEOUT,"30");

        $curl->setopt(CURLOPT_REFERER,"$url");
        $curl->setopt(CURLOPT_USERAGENT,'Mozilla/4.0 (compatible; Firefox; Windows NT 5.1)');
        $curl->setopt(CURLOPT_FOLLOWLOCATION,1);
        $curl->setopt(CURLOPT_URL,$url);

        $curl->perform(); # do it
        sleep int(rand(15)); # randomly sleep 0 to 14 seconds

        @urlb = URI::Find::Rule->scheme('http')->in($body);
        for $urlb (@urlb) {
            $urlb = $urlb->[1]; # here i'm visiting any urls in the html of the original url

            $urlb =~ s/[\<\>]//g; # remove < and > from urls
            $urlb =~ s/[\?\%\&\(\)\'\`\"\;].*//g;

            # skip the page we came from and the url we visited just before this one
            if ("$urlb" ne "$url" && "$urlb" ne "$urlc") {
                $curl->setopt(CURLOPT_URL,"$urlb");
                $curl->perform();
                sleep int(rand(15)); # randomly sleep 0 to 14 seconds
                $urlc = $urlb;
            }
        }

        WWW::Curl::easy::global_cleanup();
    }

}
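
the url cleanup and the "don't visit the same thing twice" check are the fiddly parts, so here is a small standalone sketch of just those two steps, using core perl only. it is a variation, not what spamurl.pl does: the script only remembers the last url it visited (`$urlc`), while this sketch assumes a `%seen` hash is acceptable so repeats further back are skipped too. `clean_url` and `urls_to_visit` are hypothetical helper names, and example.com / example.net are placeholder hosts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip angle brackets, then cut the url at the first character that
# could start a query string, an escape or a quoted html attribute
sub clean_url {
    my ($url) = @_;
    $url =~ s/[<>]//g;
    $url =~ s/[?%&()'`";].*//;
    return $url;
}

# return the cleaned urls worth visiting: skip the page we came from
# and anything already seen (a hash instead of the single $urlc)
sub urls_to_visit {
    my ($origin, @found) = @_;
    my %seen = ($origin => 1);
    my @todo;
    for my $u (map { clean_url($_) } @found) {
        next if $u eq '' or $seen{$u}++;
        push @todo, $u;
    }
    return @todo;
}

# pretend these were found in the html of http://example.com/
my @todo = urls_to_visit(
    'http://example.com/',
    '<http://example.com/>',
    'http://example.com/buy?id=123',
    'http://example.com/buy?id=456',
    'http://example.net/img.gif">',
);
print "$_\n" for @todo;
```

cutting at the first `?` means the two `buy` links collapse into one url, so the spider never sends the per-recipient id that would tell the spammer the address is live.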

Posted by skp at December 09, 2004 02:00 PM