home January 01, 2017

Web Server Abuse Detection Script


What is considered web server abuse ?

Commonly, abuse of a web site is anything which the web site designer sees as unwanted requests, scanning or otherwise causing the server to act in a way the server was not designed. The more paranoid server admins see ip addresses that cause errors in general to also be abusive. The reason is a client request of a resource that does not exist is wasted time and possibly a security problem. There is little reason a client should cause errors on a web server if the server is setup correctly. The client asks for a valid HTML page by request and the server sends the response by transferring the pages, pictures and CSS files. The client should take care not to send unnecessary or illegal requests to the server. Those clients who are faulty, abusive or cause our server to do extra work to fulfill bad requests should be banned when we could have used that time to serve proper clients.

A good example of abuse is unwanted penetration scanning. This behavior can be seen by many script kiddies who are looking to find a way to break into a server. Scanning tries many different calls to the web server to see if any of them trigger a positive response. Many PHP scanners work this way. What you will see in your logs are a bunch of PHP admin requests. If you have your server setup securely, and especially if your and _not_ running any PHP code, then all of the illicit requests will trigger HTTP server errors. It would be prudent to block these abusive ips so they can not do any damage or gather any more information about your infrastructure.

It is important to remember that not all requests are abusive, but they are wasteful. The cursed favicon.ico is a good example. This ico file can be requested by any client without first asking for a valid HTML page. Requesting this file is valid, but that does not make it right.

Another topic considered abusive are clients who generate many errors and are misconfigured or just broke. Internet Explorer 6 is definitely a broke client. It ignores standards and asks for files in non-standard ways. On the other side of the spectrum are people who use download accelerators to download files from a site. These programs promise faster downloads at the cost of increase server side load. They make tens, if not hundreds of connections for the same file putting a burden on your server. We would rather not talk to clients like these if they cause issues and prefer to block them if they are found to be slowing our server down and denying access to others.

To defend our servers we setup a simple script, some what similar to Fail2Ban, to watch the logs and deny access in real time to those who are looking to do us harm or those who are using broken, misconfigured or misinformed clients.

How the web_server_abuse_detection.pl script works

The logic behind the script is: watch the web server logs and keep track of client ip addresses who cause errors. Client ips will be held in a hash for a few hours. If the client only causes a few errors they will be removed from the hash at some expire time. If the ip causes too many errors this triggers a system call to the local firewall to block the offending ip within a second or less.

The script is built with Perl and called "web_server_abuse_detection.pl". We tried to make it run as efficient as possible on any operating system supporting Perl. It tails the web server log file and looks for ip addresses who throw errors on the server. If the remote ip causes an error, like a code 404 for example then the ip address is put into a hash. The hash contains the ip address, the last time the ip triggered an error and the total number of errors that ip has triggered.

At some point an abusive ip address will hit a the user defined limit on the number errors a single host can cause. At this point the ip address will be subject to the user defined system call. In the example script below we logged the ip address to /var/log/messages and then used our firewall to block the ip address from further contact. If the ip does cause a few errors and then does not trigger any more before an expiration time then the ip is removed from the hash.

The script's USER defined variables:

Lets go through a few of the user defined variables explaining what they are for and what settings you may want to set them for. Remember that every site is going to be different so you will want to use the DEBUG option and modify them accordingly. Lets look at some options, then the script itself and then an example of how to run the script.

my $log is the full path to the web server log file you want to watch. The format of the log file is standard Apache "common" or "combined". The best part about having you server use a standard logging format is this script can be used with Apache, Nginx, Lighttpd or even thttpd. What the script is looking for is an ip address of the remote client as the first string of the line with a space separating it from the rest of the line. For example, this is Googlebot accessing a site, "66.249.72.6 moneyslow.com - ..."

my $errors_block is the number of errors a client can trigger before the system call runs to block that ip. The web server error codes 400-417, 444 if you are using Nginx and 500-505 are the conditions we are looking for. We set errors_block to 10 in the example, so once an ip causes 10 errors ($errors_block) in less than 7200 seconds ($expire_time) they get blocked.

my $expire_time is the amount of time in seconds before an ip address in the hash is considered stale and removed. In the example we have 7200 seconds or two hours. It is important to notice that an ip must not cause any errors for 7200 seconds before they can be removed from the hash. This means a malicious client who slowly scans the web server will continually time stamp their ip in the hash every time they trigger an error. For example, lets say a client scans us slowly at one line per hour looking for vulnerabilities. The first hour they are added to the hash and in the second hour they increment the error counter _and_ update the "last seen" time stamp of their ip in our hash. A normal intrusion detection system (IDS) might miss this type of behavior, but since our detector is keeping track of access times over the long term (7200 seconds) we will still block them after 10 errors. In this case 10 errors would be triggered in 10 hours.

my $cleanup_time is the number of error lines that are triggered before the script calls the function to clean up the hash. The clean up routine in a loop is slightly CPU expensive so there is no reason to run it over and over again for every error log line. The longer this script runs the less of the common abusive clients you will see. Thus, make sure your clean up counter is low enough to get old ips out at a decent rate. But, do not set the value to low as it is "expensive" CPU wise to go through the entire hash. A good number for the clean up counter is equal to the amount of error lines your server sees in five(5) minutes or less. The actual time the cleanup runs will vary though out the day as your traffic dies down at night and increases during the day, but 5 minutes is a good starting point.

my $debug_mode is the ability of the script to print out each of the steps to the console. At the end of this page we go though an example. You should turn debug on when you are first trying to figure out how the script settings work with your server environment. Just run the script and do not run it as a background process (i.e. no ampersand after the command) and the script will print to the console. You will see the time, ip address and number of errors the remote ip has triggered. When the $cleanup_time value is triggered you will see the script clean up its hash of stale ip addresses.

USER EDIT: Whitelisted ips is for adding any ip addresses you never want to be considered abusive even if they cause errors. A good example of this is Googlebot at 66.249/16. Any remote site that links back to you with an incorrect URL will cause Googlebot to produce an error. Other sub nets you may want to white list are private internal ip address like 192.168.0.0/16, 172.16.0.0/12 and 10.0.0.0/8. Perhaps your web developers are inside the firewall and they are testing out code and causing errors. We have already added the ips for Googlebot and the private subnet 192.168/16 in the script below as an example.

USER EDIT: system( ) in the middle of the script is the line the Perl script will execute against the ip address which has triggered too many errors. In the example we are using logger to print a message, "(bad ip address here) blocked by calomel abuse detection" to /var/log/messages. Then we are executing the OpenBSD or FreeBSD version of pfctl to add the offending ip address to a Pf table called "BLOCKTEMP". BTW, the BLOCKTEMP table simply drops any connections from the abusive ip address. It is the same table seen in our pf rules on the OpenBSD Pf Firewall "how to" ( pf.conf ) page. If you are running on Linux you could add an iptables block rule instead. You could also send an email to support, trigger a Nagios alert or even add the offending ip address to a MySQL table. Again, your options are endless.

Getting the script running

To get the script running you first need to copy and paste the following Perl code below. We are going to call the script web_server_abuse_detection.pl, but you can call it anything you like.

We have tried to clearly comment every line and logic path in the script. Because of this the script might look at little long. We clearly commented the script so, if you wanted to, you could modify the script to better meet the needs of your environment. For example, maybe you want to look for people trying to download *.mp3's or the latest Windows virus storm looking for *.vbs files. Just modify the regular expression conditions to suit your needs.

Once you have the script in a file and executable you need to edit the user defined options as stated in the section above. Then, take a look at the section after the code block for a discussion on how to run the script.

#!/usr/bin/perl -T

use strict;
use warnings;

## moneyslow.com .:. https://moneyslow.com/html/webconf
##   name     : web_server_abuse_detection.pl
##   version  : 0.04

# description: this script will watch the web server logs (like Apache or Nginx) and
#  count the number of http error codes an ip has triggered. At a user defined amount
#  of errors we can execute a action to block the ip using our firewall software.

## which log file do you want to watch?
  my $log = "/var/log/nginx/access.log";

## how many errors can an ip address trigger before we block them?
  my $errors_block = 10;

## how many seconds before an unseen ip is considered old and removed from the hash?
  my $expire_time = 7200;

## how many error log lines before we trigger blocking abusive ips and clean up
## of old ips in the hash? make sure this value is greater than $errors_block above.
  my $cleanup_time = 10;

## do you want to debug the scripts output ? on=1 and off=0
  my $debug_mode = 1;

## clear the environment and set our path
  $ENV{ENV} ="";
  $ENV{PATH} = "/bin:/usr/bin:/usr/local/bin";

## declare some internal variables and the hash of abusive ip addresses
  my ( $ip, $errors, $time, $newtime, $newerrors );
  my $trigger_count=1;
  my %abusive_ips = ();

## open the log file. we are using the system binary tail which is smart enough
## to follow rotating logs. We could have used File::Tail, but tail is easier.
  open(LOG,"/usr/bin/tail -F $log |") || die "ERROR: could not open log file.\n";

  while(<LOG>) {
       ## process the log line if it contains one of these error codes 
       if ($_ =~ m/( 301 | 302 | 303 | 307 | 400 | 401 | 402 | 403 | 404 | 405 | 406 | 408 | 409 | 410 | 411 | 412 | 413 | 414 | 415 | 416 | 444 | 494 | 495 | 496 | 497 | 500 | 501 | 502 | 503 | 504 | 507 )/)
         {

         ## Whitelisted ips. This is where you can whitelist ips that cause errors,
         ## but you do NOT want them to be blocked. Googlebot at 66.249/16 is a good
         ## example. We also whitelisted the private subnet 192.168/16 so web
         ## developers inside the firewall can test and never be blocked. 
         if ($_ !~ m/^(66\.249\.|192\.168\.)/)
         {

         ## extract the ip address from the log line and get the current unix time
          $time = time();
          $ip = (split ' ')[0];

         ## if an ip address has never been seen before we need
         ## to initialize the errors value to avoid warning messages.
          $abusive_ips{ $ip }{ 'errors' } = 0 if not defined $abusive_ips{ $ip }{ 'errors' };

         ## increment the error counter and update the time stamp.
          $abusive_ips{ $ip }{ 'errors' } = $abusive_ips{ $ip }->{ 'errors' } + 1;
          $abusive_ips{ $ip }{ 'time' } = $time;

         ## DEBUG: show detailed output
         if ( $debug_mode == 1 ) {
           $newerrors  = $abusive_ips{ $ip }->{ 'errors' };
           $newtime = $abusive_ips{ $ip }->{ 'time' };
           print "unix_time:  $newtime, errors:  $newerrors, ip:  $ip, cleanup_time: $trigger_count\n";
         }

         ## if an ip has triggered the $errors_block value we block them
          if ($abusive_ips{ $ip }->{ 'errors' } >= $errors_block ) {

             ## DEBUG: show detailed output
             if ( $debug_mode == 1 ) {
               print "ABUSIVE IP! unix_time:  $newtime, errors:  $newerrors, ip:  $ip, cleanup_time: $trigger_count\n";
             }

             ## Untaint the ip variable for use by the following external system() calls
             my $ip_ext = "$1" if ($ip =~ m/^([0-9\.]+)$/ or die "\nError: Illegal characters in ip\n\n" );

             ## USER EDIT: this is the system call you will set to block the abuser. You can add the command
             ##  line you want to execute on the ip address of the abuser. For example, we are using logger to
             ##  echo the line out to /var/log/messages and then we are adding the offending ip address to our
             ##  FreeBSD Pf table which we have setup to block ips at Pf firewall.
             system("/usr/bin/logger", "$ip_ext", "is", "abusive,", "sent", "to", "BLOCKTEMP");
             system("/sbin/pfctl", "-t", "BLOCKTEMP", "-T", "add", "$ip_ext");

             ## after the ip is blocked it does need to be in the hash anymore
             delete($abusive_ips{ $ip });
          }

         ## increment the trigger counter which is used for the following clean up function. 
          $trigger_count++;

         ## clean up function: when the trigger counter reaches the $cleanup_time we
         ## remove any old hash entries from the $abusive_ips hash
          if ($trigger_count >= $cleanup_time) {
             my $time_current =  time();

             ## DEBUG: show detailed output
             if ( $debug_mode == 1 ) {
               print "  Clean up... expire: $expire_time, pre-size of hash:  " . keys( %abusive_ips ) . ".\n";
             }

              ## clean up ip addresses we have not seen in a long time
               while (($ip, $time) = each(%abusive_ips)){

               ## DEBUG: show detailed output
               if ( $debug_mode == 1 ) {
                 my $total_time = $time_current - $abusive_ips{ $ip }->{ 'time' };
                 print "    ip: $ip, seconds_last_seen: $total_time, errors:  $abusive_ips{ $ip }->{ 'errors' }\n";
               }

                  if ( ($time_current - $abusive_ips{ $ip }->{ 'time' } ) >= $expire_time) {
                       delete($abusive_ips{ $ip });
                  }
               }

            ## DEBUG: show detailed output
            if ( $debug_mode == 1 ) {
               print "  Clean up... expire: $expire_time, post-size of hash:  " . keys( %abusive_ips ) . ".\n";
             }

             ## reset the trigger counter
              $trigger_count = 1;
          }
         }
       }
  }
#### EOF ####

How to run the script and test settings:

The first time you setup the script you will want to watch how it works. To do this you should set the debug variable on, "my $debug_mode = 1;" and run the script on the console (xterm). You may also want to open a second window and tail your access log file, the same file you told the script to watch. This way you can watch the debug output of script and compare it to what you see in the real logs. Lastly, edit the "system(" line to just execute logger or just comment out the line completely. We do not need to block anyone during testing.

When the script sees that an ip has tripped an error code in the log it will print out one line per error. The line will contain the following:

When the clean up routine runs after every 10 error lines the output will show the following:

This is a quick look at the output on our server. The addresses have been randomized and are just for example:

user@machine: ./web_server_abuse_detection.pl
unix_time:  1314309794, errors:  2, ip:  193.85.174.205, cleanup_time: 1
unix_time:  1314309885, errors:  3, ip:  193.85.174.205, cleanup_time: 2
unix_time:  1314309910, errors:  1, ip:  188.40.142.12, cleanup_time: 3
unix_time:  1314309910, errors:  2, ip:  188.40.142.12, cleanup_time: 4
unix_time:  1314309943, errors:  4, ip:  193.85.174.205, cleanup_time: 5
unix_time:  1314310048, errors:  1, ip:  68.77.114.76, cleanup_time: 6
unix_time:  1314310106, errors:  1, ip:  72.70.228.45, cleanup_time: 7
unix_time:  1314310198, errors:  1, ip:  71.164.218.111, cleanup_time: 8
unix_time:  1314310273, errors:  1, ip:  95.9.171.91, cleanup_time: 9
  Clean up... expire: 7200, pre-size of hash:  8.
    ip: 193.85.174.205, seconds_last_seen: 330, errors:  4
    ip: 89.134.245.4, seconds_last_seen: 504, errors:  1
    ip: 72.70.228.45, seconds_last_seen: 167, errors:  1
    ip: 91.64.156.181, seconds_last_seen: 486, errors:  7
    ip: 188.40.142.12, seconds_last_seen: 363, errors:  2
    ip: 68.77.114.76, seconds_last_seen: 225, errors:  1
    ip: 95.9.171.91, seconds_last_seen: 0, errors:  1
    ip: 71.164.218.111, seconds_last_seen: 75, errors:  1
  Clean up... expire: 7200, post-size of hash:  8.
unix_time:  1314310304, errors:  2, ip:  72.70.228.45, cleanup_time: 1
unix_time:  1314310314, errors:  3, ip:  72.70.228.45, cleanup_time: 2
unix_time:  1314310314, errors:  4, ip:  72.70.228.45, cleanup_time: 3
unix_time:  1314310314, errors:  5, ip:  72.70.228.45, cleanup_time: 4
unix_time:  1314310314, errors:  6, ip:  72.70.228.45, cleanup_time: 5
unix_time:  1314310334, errors:  1, ip:  24.115.204.182, cleanup_time: 6
unix_time:  1314310380, errors:  1, ip:  80.67.118.68, cleanup_time: 7
unix_time:  1314310380, errors:  2, ip:  80.67.118.68, cleanup_time: 8
unix_time:  1314310524, errors:  1, ip:  72.193.34.19, cleanup_time: 9
  Clean up... expire: 7200, pre-size of hash:  11.
    ip: 72.193.34.19, seconds_last_seen: 0, errors:  1
    ip: 24.115.204.182, seconds_last_seen: 190, errors:  1
    ip: 193.85.174.205, seconds_last_seen: 581, errors:  4
    ip: 89.134.245.4, seconds_last_seen: 755, errors:  1
    ip: 80.67.118.68, seconds_last_seen: 144, errors:  2
    ip: 72.70.228.45, seconds_last_seen: 210, errors:  6
    ip: 91.64.156.181, seconds_last_seen: 737, errors:  7
    ip: 188.40.142.12, seconds_last_seen: 614, errors:  2
    ip: 68.77.114.76, seconds_last_seen: 476, errors:  1
    ip: 95.9.171.91, seconds_last_seen: 251, errors:  1
    ip: 71.164.218.111, seconds_last_seen: 326, errors:  1
  Clean up... expire: 7200, post-size of hash:  11.
unix_time:  1314310550, errors:  2, ip:  72.193.34.19, cleanup_time: 1
unix_time:  1314310561, errors:  1, ip:  87.156.30.97, cleanup_time: 2
unix_time:  1314310564, errors:  1, ip:  166.137.138.16, cleanup_time: 3
unix_time:  1314310639, errors:  1, ip:  66.90.75.117, cleanup_time: 4
unix_time:  1314310739, errors:  1, ip:  201.138.135.216, cleanup_time: 5
unix_time:  1314310843, errors:  8, ip:  91.64.156.181, cleanup_time: 6
unix_time:  1314310843, errors:  9, ip:  91.64.156.181, cleanup_time: 7
ABUSIVE IP! unix_time: 1314310843, errors: 10, ip: 91.64.156.181, cleanup_time: 8

How to run the script in production:

When you have the script working like you want it to it is time to set it and forget it. Make sure to define the debug option off with "my $debug_mode = 0;". When debug is off the script will run silently. the only output you may see is if you decided to use the logger statement to print to /var/log/messages using the system call as we did on the example.

To run the script manually in the background use:

./web_server_abuse_detection.pl &

If you look in the system processes you will see the script running. That's about all you have to do. If you wanted to you could create a simple bash shell script to grep through /var/log/messages file and look for the logger lines you printed out and mail them to yourself. Since the code is open you have many options available to you.

How to run the script on boot in FreeBSD:

To run the script on reboot on FreeBSD we can use the /usr/local/etc/rc.d/ script method. Note rc.d also allows you to start, stop and query the status of the script using /usr/local/etc/rc.d/webabuse (start|stop|status) which makes the script easier to work with on the command line during testing.

Rc.d will require an us to make a rc.d script and enter an "enable" line in /etc/rc.d . Make a new file called "/usr/local/etc/rc.d/webabuse" and put in the following information. Make sure the full path to the script in the command= variable is correct. The rc.d script will run the web_server_abuse_detection.pl script as root; root is needed because we are executing firewall block rules which need root permissions. Finally, in your /etc/rc.conf file put in webabuse_enable="YES" to tell the rc.d to start this script on boot.

#### vi /etc/rc.d ####

webabuse_enable="YES"

#### vi /etc/rc.d ####


#### vi /usr/local/etc/rc.d/webabuse ####

#!/bin/sh

# PROVIDE: webabuse
# BEFORE:  LOGIN
# KEYWORD: 

. /etc/rc.subr

name=webabuse
rcvar=`set_rcvar`
command=/tools/web_server_abuse_detection.pl
command_interpreter=/usr/bin/perl
webabuse_user=root
start_cmd="/usr/sbin/daemon -u $webabuse_user $command"

load_rc_config $name
run_rc_command "$1"

#### vi /usr/local/etc/rc.d/webabuse ####

Want more speed? Make sure to also check out the Network Speed and Performance Guide. With a little time and understanding you could easily double your firewalls throughput.

Questions?

Can this script help against a DOS and DDOS attack ?The script can at least help mitigate the attack. The script is looking for clients which cause errors. If you setup your webserver to limit the amount of requests by a single client to a reasonable number, then anything over that request rate would be logged as an error. For example, our Nginx tutorial uses the directive limit_req_zone and limit_req to limit a client to a certain amount of requests per minute. When an ip goes over that limit the client is sent an error 503, meaning server overloaded. This web server abuse script will see those errors and block the client ip. So, a DDOS attack is currently going on and lets say 10000 ip addresses are hitting your index.html page over and over again. Eventually they are going to hit the request limit and be blocked by the firewall. Normal clients will always be under the request limit and never be blocked.

How much RAM and CPU time does the script use ?The script is very efficient. This script has been running for two (2) months in a configuration very close to the example above. It has used less than twenty two (22) seconds of CPU time for processing over nine hundred thousand (900,000) log lines. RAM usage is limited to the size of the hash. A single ip set uses less then forty (40) bytes of RAM so one(1) megabyte of ram would hold around twenty five thousand (25,000) ip address entries.

How many ip addresses being blocked is acceptable ?

You will want to tune the script to block as many abusive ip addresses as possible without getting too many innocent clients. This depends on your traffic and what your clients are doing. On average we see 0.6% of clients getting blocked due to abuse of some kind. This percentage is equal to sixty (60) ips blocked per ten thousand (10,0000) ips seen during a twenty four (24) hour period. You will have to decide what you see as an acceptable block rate in your environment.