January 01, 2017

Spamd Statistics Script


OpenBSD's spamd daemon is a great way to annoy spammers. The idea is that a spammer connects to spamd and believes it is a sendmail-style mail server. The spammer is then slowed to a crawl while trying to deliver mail, which reduces the number of servers they can connect to per hour. Because a spammer is paid by the volume of mail delivered, slowing the rate of delivery cuts the money they can make and, in turn, their profit margin.

Getting Started

This script runs through /var/log/daemon and counts each spammer's IP address and how long it has been connected. I have also added a counter for the bandwidth your connection has used annoying them, plus a few other statistics. You can find a copy of the output at the bottom of this page.
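To make the parsing step concrete, here is a minimal sketch of how a single "disconnected after" line can be picked apart. The helper name parse_spamd_line and the sample line are hypothetical and only for illustration (the address comes from a documentation range); the pattern is a slightly stricter variant of the one in the script, not the script's own code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one spamd "disconnected after" log line.
# Returns (date, host, ip, seconds), or an empty list if the line does not match.
# parse_spamd_line is a hypothetical helper name, not part of the script.
sub parse_spamd_line {
    my ($line) = @_;
    return unless $line =~ /spamd/ && $line =~ /disconnected after/;
    if ($line =~ /^(\w+\s{1,2}\d+ \d+:\d+:\d+) (\S+) \S+: (\d+\.\d+\.\d+\.\d+): disconnected after (\d+) seconds/) {
        return ($1, $2, $3, $4);
    }
    return;
}

# Illustrative sample line (made-up values, not taken from a real log).
my $sample = 'Jan  1 08:00:00 calomel spamd[1234]: 192.0.2.10: disconnected after 404 seconds.';
my ($date, $host, $ip, $seconds) = parse_spamd_line($sample);
print "$date $host $ip $seconds\n";
```

Lines that do not mention both "spamd" and "disconnected after" are skipped, which mirrors the filter the script applies before it counts a connection.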

You can get the calomel_spamd_stats.pl Perl script by copying and pasting the listing below. Before running the script, verify the variables at the top of the script.

#!/usr/bin/perl

############################################
# moneyslow.com Spamd Configuration Settings #
############################################

# Path to spamd logfile(s). No "/" after last dir name. Ex. "/var/log".
# Note: this variable is not referenced anywhere else in the script.
$spamdpath = "/var/www/htdocs";

# Spamd log file (daemon by default)
$spamdfile = "/var/log/daemon";

# Path to output html file
$spamdhtmlfile = "/var/www/htdocs/calomel_spamd_stats.html";

# Spamd delay in seconds (-s)
$spamddelay = 3;

##############################
# End Configuration Settings #
##############################

#####Begin: define dates #####
@months=("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec");
@days=("Sun","Mon","Tue","Wed","Thu","Fri","Sat");
($sec,$min,$hour,$mday,$mon,$year,$wday)=(localtime(time))[0,1,2,3,4,5,6];
$year=$year+1900;
#####End: define dates #####

#####Begin: read spamd file(s) and get totals #####
open(SOURCE, "gzcat -f $spamdfile | sort -u |") or die ("Can't open file! Permissions? UserID? gzcat?");
  while( <SOURCE> ) {
    if ((/spamd/) && (/disconnected after/)) {
      ($dates,$hostname,$daemonandid,$ip,$seconds)
       = ($_ =~ /(\w+[ ]{1,2}\d+ \d+:.\d:.\d+) (.*) (.*)\: (\d+\.\d+\.\d+\.\d+)\: disconnected after (\d+) seconds/);
      $totalseconds +=$seconds;
      $totalspammers++;
      $line{$ip}{hits}++;
      $line{$ip}{time} += $seconds;
      ($date) = ($_ =~ /(\w+[ ]{1,2}\d+ \d+:.\d:.\d+)/);
      push(@date_array, "$date");
    }
  }
close(SOURCE) or die "File didn't close: $!";
if ($totalspammers==0){$totalspammers=1;}
#####End: reading spamd file(s) and get totals #####

##### Begin: calculate, consolidate and put into array #####
my %collectem = ();
foreach my $ip (keys %line) {
  $totalips++;
  my $hits  = $line{$ip}{hits};
  my $time  = $line{$ip}{time};
  my $conn  = int(($line{$ip}{time}/$line{$ip}{hits})+0.5);
  my $phits = int((($line{$ip}{hits}/$totalspammers)*100)+0.4);
  # Guard against division by zero when every connection lasted 0 seconds
  my $ptime = $totalseconds ? int((($line{$ip}{time}/$totalseconds)*100)+0.4) : 0;
  push (@arrayem, [$hits,$ip,$time,$conn,$phits,$ptime]);
}
##### End: calculate, consolidate and put into array #####

#####Begin: order data by hits, time then ip in descending order #####
sub orderem {
  $b->[0] <=> $a->[0]  or $b->[2] <=> $a->[2]  or $b->[1] cmp $a->[1];
}
#####End: order data by hits, time then ip in descending order #####

#####Begin: output to HTML #####
open(SPAMDHTMLSTATS, ">$spamdhtmlfile") or die ("Can't create file");

print SPAMDHTMLSTATS "<html><HEAD><TITLE>moneyslow.com Spamd Stats</TITLE></HEAD><body bgcolor=\"#d0d0d0\"><div align=\"center\">\n";
print SPAMDHTMLSTATS "<b><font size=\"4\">moneyslow.com Spamd Stats on $hostname</font></b><br>\n";
# Zero-pad hour and minute so that 10:05 is not printed as 10:5
printf SPAMDHTMLSTATS ("Script run on: %s %s %d %d at %02d:%02d",$days[$wday],$months[$mon],$mday,$year,$hour,$min);
printf SPAMDHTMLSTATS (" in %4.0f second(s)<br>\n",time-$^T);
print SPAMDHTMLSTATS "Data Range: $date_array[0] to $date_array[-1]<br><br>\n";

printf SPAMDHTMLSTATS ("Time spammers wasted: %4.2f hours<br>\n",$totalseconds/3600);
printf SPAMDHTMLSTATS ("Total bandwidth used: %4.2f megabytes<br>\n",($totalseconds/$spamddelay*80.606/1000000));
printf SPAMDHTMLSTATS ("Average time per tarpit: %4.2f minutes<br>\n",($totalseconds/$totalspammers)/60);
print SPAMDHTMLSTATS "Unique ip addresses tarpitted: $totalips<br>\n";
print SPAMDHTMLSTATS "Total connections made: $totalspammers<br><br>\n";

print SPAMDHTMLSTATS "<TABLE BORDER=\"1\"> <tr><td align=\"center\"><b>Tarpits</b></td>\n\n";
print SPAMDHTMLSTATS "<td align=\"center\"><b>Source IP</b></td><td align=\"center\"><b>Time (s)</b></td>";
print SPAMDHTMLSTATS "<td align=\"center\"><b>Ave Sec/Tarpit</b></td><td align=\"center\"><b>% of Tarpits</b></td>";
print SPAMDHTMLSTATS "<td align=\"center\"><b>% of Time</b></td></tr>";

@sortem = sort orderem @arrayem;
foreach $thingy (@sortem) {
    print SPAMDHTMLSTATS "<tr><td align=\"center\"><b>$thingy->[0]</b></td><td align=\"center\"><b>$thingy->[1]</b></td><td align=\"center\"><b>$thingy->[2]</b></td><td align=\"center\"><b>$thingy->[3]</b></td><td align=\"center\"><b>$thingy->[4]</b></td><td align=\"center\"><b>$thingy->[5]</b></td></tr>";
}

print SPAMDHTMLSTATS "</TABLE></div></body></html>\n";
close(SPAMDHTMLSTATS);
#####End: output to HTML #####
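The ordering applied by the script's orderem routine (hits, then time, then IP, all descending) can be tried in isolation. This is a minimal sketch: the sub name by_hits_time_ip and the sample rows are hypothetical, and the rows keep only the three fields that take part in the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each row is [hits, ip, time], the same positions the script compares.
# Sort by hits, then time, then IP string, all in descending order.
sub by_hits_time_ip {
    return $b->[0] <=> $a->[0]
        || $b->[2] <=> $a->[2]
        || $b->[1] cmp $a->[1];
}

# Hypothetical rows using documentation-range addresses.
my @rows = (
    [2, '192.0.2.1',    6],
    [8, '198.51.100.5', 24],
    [9, '203.0.113.9',  9970],
    [8, '203.0.113.20', 24],
);

for my $row (sort by_hits_time_ip @rows) {
    printf "%d %-15s %d\n", $row->[0], $row->[1], $row->[2];
}
```

Note that the IP tie-breaker is a plain string comparison, so addresses are ordered character by character rather than numerically by octet.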

You can run the calomel_spamd_stats.pl script manually, or you can set up a CGI web page to update the spamd stats. The easiest way to keep your spamd statistics page current is to run the script from a cron job. Even on really large log files the script finishes quickly. This is an example of a cron job set to run calomel_spamd_stats.pl once an hour, at 30 minutes past the hour.

#minute (0-59)
#|   hour (0-23)
#|   |    day of the month (1-31)
#|   |    |   month of the year (1-12 or Jan-Dec)
#|   |    |   |   day of the week (0-6 with 0=Sun or Sun-Sat)
#|   |    |   |   |   commands
#|   |    |   |   |   |
#### moneyslow.com Spamd Stats
30   *    *   *   *   /tools/calomel_spamd_stats.pl

HELPFUL HINT: For an added layer of protection against spam you can use a Bayesian spam filter. Check out our Bogofilter "how to" Anti-Spam Guide. With a little time and understanding you could easily filter up to 99% of any remaining spam.

An example of the results

This is an example of the output of the calomel_spamd_stats.pl script. The script makes an HTML page that you can put in your web tree, or in any directory if you just want to read it with a local browser. The "tarpits" value is how many times we have seen the same IP address, and the next column shows the offender's IP. The following columns are the total "time" the address has been connected, the average seconds per tarpit, the percentage of all tarpits that came from this IP, and finally this IP's share of the total time all spammers were connected.

moneyslow.com Spamd Stats on calomel
Script run on: Mon Jan 10 2020 at 10:00 in 0 second(s)
Data Range: Jan 1 08:00:00 to Jan 10 10:00:00

Time spammers wasted: 8.07 hours
Total bandwidth used: 0.74 megabytes
Average time per tarpit: 8.65 minutes
Unique ip addresses tarpitted: 19
Total connections made: 56

Tarpits  Source IP        Time (s)  Ave Sec/Tarpit  % of Tarpits  % of Time
9        212.100.250.214  9970      1108            16            34
8        88.248.15.95     24        3               14            0
8        24.136.136.120   24        3               14            0
6        125.5.40.3       6644      1107            11            23
5        125.5.40.4       5540      1108            9             19
5        85.207.189.26    15        3               9             0
2        125.225.15.180   15        8               3             0
2        83.27.166.27     6         3               3             0
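
The per-address columns can be recomputed by hand. The sketch below is a minimal illustration, assuming the same rounding offsets the script uses (+0.5 for the average, +0.4 for the percentages). The sub name tarpit_columns is hypothetical, and the 29052-second total is an assumption back-computed from the rounded "8.07 hours" figure, not a value taken from a log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compute the derived columns for one IP address.
# Arguments: hits and seconds for this IP, then total connections and total seconds.
# tarpit_columns is a hypothetical helper name, not part of the script.
sub tarpit_columns {
    my ($hits, $time, $total_hits, $total_seconds) = @_;
    my $avg   = int($time / $hits + 0.5);                # average seconds per tarpit
    my $phits = int($hits / $total_hits * 100 + 0.4);    # percent of all tarpits
    my $ptime = $total_seconds                           # percent of all time, 0 if no time logged
              ? int($time / $total_seconds * 100 + 0.4)
              : 0;
    return ($avg, $phits, $ptime);
}

# First row of the sample: 9 tarpits, 9970 seconds, out of 56 connections.
# 29052 seconds is the assumed total (8.07 hours * 3600, rounded).
my ($avg, $phits, $ptime) = tarpit_columns(9, 9970, 56, 29052);
print "avg=$avg %tarpits=$phits %time=$ptime\n";   # prints avg=1108 %tarpits=16 %time=34
```

Because the percentages are truncated after adding 0.4 rather than rounded at 0.5, an address needs a little over 0.6% of the total before it shows as 1%.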