alexcolson.com alexcolson.com [Printable Version]
Logging to MySQL with Apache
Home > Tech Talk > Article

[Create New Article] :: [Browse Tags]

Modified: Saturday, January 9, 2010 8:46:19 AM by Alex Colson
  [Edit Tags]

Tags: Apache, Database, MySQL, Perl, SQL, Web Development

Logging Apache access logs to a database can be very convenient for keeping track of requests.  Some advantages are:
  • Tracking data sent from Apache for a specific virtual host, very useful for hosting companies
  • Determining which pages get the most hits
  • Who, which IP address, made the most requests in a given time period
  • What browser is used most often on your website

The way this method works is by using a Piped Logger with Apache and a customized script to place the data from the server into the database.  This particular script includes a method for logging requests even when the database is not available by placing the information into a text file that can later be imported into the database.

Requirements

By using the Piped logging method, we do not need to recompile Apache, or even install any additional programs.  The only thing that is required is:
  • Perl 5 or newer
  • Perl DBI module and the driver for your database
  • A database that supports SQL queries
  • Access to a table within that database

Format the Logs

The following example configures Apache to send the following information:
  1. The Server Name, useful for multiple sites on a single server (%v)  Use varchar(255) for the database type
  2. The Remote Host (%h)  varchar(50)
  3. The Time in MySQL Format (%{%F %T}t)  datetime
  4. The first line of the request (%r)  varchar(255)
  5. The status that was returned (%s)  smallint unsigned
  6. The bytes sent (%O to include the headers, %B without headers)  int unsigned
  7. The duration of the request (%T in seconds, %D in microseconds)  smallint unsigned (up to 18 hours for %T)
  8. The referer (%{Referer}i)  varchar(255)
  9. The User-Agent, or browser (%{User-agent}i)  varchar(255)
LogFormat "%v\t%h\t%{%F %T}t\t%r\t%s\t%O\t%T\t%{Referer}i\t%{User-agent}i" sqllog

Note that each of these arguments has a \t between them, this is a tab and is used to separate the different values.  Also at the end of the string there is sqllog, this becomes the name of this particular log format so we can identify it later.

For a complete list of Apache Logging arguments, visit the Apache Mod Log Config page.
A great resource for Data Types is htmllite.com's MySQL Data Types.

Creating the Piped Logger

Now that a format for SQL and Apache has been created, it is time to specify where to send that information.  Using the CustomLog directive we use a Pipe (|) to tell Apache that it will be a seperate program.  After the pipe, specify the location and name of the script, in this case the logging script is at /etc/apache2/logger.pl.  Following the pipe and script in quotes, you will see the name of the log format that we created above.

CustomLog "|/etc/apache2/logger.pl" sqllog

This script may be located anywhere as long as the user that runs the Apache server has access.

Creating the Logging Script

Now that the information from Apache will use the above format and send the information to the specified script, lets create that script.

It is good to note that the script should run continuously as a loop, this prevents the Apache server from having to start the script every time a request is made.  This also prevents errors such as
piped log program '/etc/apache2/logger.pl' failed unexpectedly
from occurring.  The way we keep the script running without consuming the entire processor is to create a While loop that waits for standard input (<STDIN>).

Another important feature of this script is that the database server may occasionally need to be restarted without restarting Apache, if the script cannot connect to the database server, it simply logs the information in a file called notlogged.txt.  The location and file name can be customized in the first two variables.  Just be sure that the user that runs Apache server has write access to the location.

The variables under Database Information include the DBI Driver name, in this case we are using MySQL.  You can use any database as long as the Perl DateBase Driver (DBD) is installed.  For a complete list of DBD modules, visit CPAN.org's DBD List.  After the driver, enter the hostname, username, password, database, and table.  Make sure that the username provided has access to write to the table.

#!/usr/bin/perl

# Location of Notlogged
  $dir = '/etc/apache2/';
  $file = 'notlogged.txt';
# Database Information
  $driver = 'mysql';
  $host = 'localhost';
  $user = 'username';
  $pass = 'password';
  $database = 'database';
  $table = 'apachelog';


use DBI;
BEGIN {$SIG{'__WARN__'} = sub {$connect=0;}}

chdir $dir;

while (<STDIN>) {
        $insert = $_;
        chomp($insert);
        $insert =~ s/\\/\\\\/g;
        $insert =~ s/\'/\\\'/g;
        $insert =~ s/\t/\'\,\'/g;
        $insert = "insert into $table values ('$insert');";
        if (!$connect) { sqlconnect(); }
        if ($connect) {
                $db->do($insert) or sub {$connect=0};
        }
        if (!$connect) {
                open (NOTLOGGED, ">>$file") or die "Cannot open file";
                print NOTLOGGED "$insert\n";
                close (NOTLOGGED);
        }
}

sub sqlconnect {
        $connect = 1;
        $db = DBI->connect("DBI:$driver:database=$database;host=$host",$user,"$pass");
}

Make sure that the script above is in an executable file and that the Apache user has permission.

Finishing Up

Now that everything is installed, restart Apache and watch the database grow with hits.  Just remember to delete old records or the database file will become huge.

A simple way to delete old records is with:
delete from tablename where time < '2008-01-01';
where tablename is the name of the table, and 2008-01-01 is replaced with the date that you want to delete records prior to.

An easy query to check the amount of data transferred by month is:
select sum(bytes),left(time,7) as month from apache2 group by month;
Sunday, September 5, 2010 10:20:08 AM EST (0.0159 s) Copyright © 2008 Alex Colson