From ConShell


Advanced Web Statistics 6.x


There are typically three stages to implementing awstats (I have done five or six of these implementations now, so it has become clockwork).

1. Gather the logs. This step can involve some serious headaches: missing or incorrectly formatted data, huge logfiles (never rotated?), and the like. Awstats can process many different logfile formats, though, and has a flexible LogFormat directive to get that working. Another twist to this step is that you often find multiple nodes (head-ends), so you have multiple logfiles to combine. The awstats utility will handle this nicely. I usually create a folder called rollup/ as the output folder for this step.


cd rollup/
/usr/local/www/awstats/tools/ \
-showsteps \
../web1/access.log.200802.gz \
../web2/access.log.200802.gz \
| gzip --fast > access.log.200802.gz
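The merge script's filename is missing from the path above; AWStats ships logresolvemerge.pl in its tools/ directory for exactly this job, interleaving entries from several nodes by timestamp. As a self-contained sketch of the same pipeline shape, here is a stand-in using plain zcat (which merely concatenates) over two hypothetical per-node logs:

```shell
# Hypothetical per-node logs stand in for real head-end data here.
mkdir -p web1 web2 rollup
printf 'hit1\nhit2\n' | gzip > web1/access.log.200802.gz
printf 'hit3\nhit4\nhit5\n' | gzip > web2/access.log.200802.gz

# Same pipeline shape as above; logresolvemerge.pl would interleave
# entries by timestamp, while plain zcat simply concatenates.
cd rollup/
zcat ../web1/access.log.200802.gz \
     ../web2/access.log.200802.gz \
  | gzip --fast > access.log.200802.gz

# Sanity check: the rollup should contain every line from both nodes.
zcat access.log.200802.gz | wc -l   # 5
```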

2. Process the logs to date. This is what I call catch-up ("catsup", silly like ketchup) mode; you have to decide how far back you want to go date-wise. Usually 6 months or less is sufficient. This step also requires the initial setup of the awstats configuration file.


/usr/local/awstats/wwwroot/cgi-bin/ \
  -update -showsteps \
  -LogFile="zcat 2008-02.www_access_log.gz |"
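Catching up several months means repeating that invocation once per archived file. A dry-run sketch of the loop, assuming the monthly naming shown above; the awstats.pl name and its path are assumptions here, and the echo should be removed to actually process:

```shell
# List the archived months to replay, oldest first.
months="2007-09 2007-10 2007-11 2007-12 2008-01 2008-02"

# Dry run: print each catch-up command. Drop the echo (and set your
# real script path and -config name) once the commands look right.
for month in $months; do
  echo /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -showsteps \
       "-LogFile=\"zcat ${month}.www_access_log.gz |\""
done
```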

3. Process the logs ongoing. This will require a cronjob or two. The first cronjob copies logfiles down from wherever they live and puts them in a ready-to-process state (see step 1). The other cronjob then runs the actual update with the -update flag.


# Note: this works because LogFile was set appropriately in the config file
/usr/local/awstats/wwwroot/cgi-bin/ -update
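A sketch of the two cronjobs described above; the fetch script name, paths, and times are all placeholders, and awstats.pl is filled in only as an assumption about the stripped script name in a stock install:

```
# m h dom mon dow  command
# 1) Pull last night's logs down from the head-ends into a
#    ready-to-process state (placeholder script name).
15 0 * * *  /usr/local/scripts/fetch-weblogs.sh
# 2) Run the update once the logs are in place; LogFile in the
#    config file points at the rolled-up log.
45 0 * * *  /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update
```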

Generating static HTML & PDF reports

Another example, showing how to generate a set of HTML/PDF files from the metadata. This is sometimes required for high-traffic sites, where the metadata for any given month might run to hundreds of megabytes - too much to process dynamically with the CGI script.

/usr/local/www/apache22/data/tools/ \
  -month=03 \
  -year=2008 \
  -awstatsprog=/usr/local/www/awstats/cgi-bin/

You can also add -buildpdf to those arguments.

Refer to [1] and [2] for more information.

Using Maxmind GeoIP lookups


Install the GeoIP package from EPEL. This should work out of the box, as the GeoLite Country database appears to be bundled with the package in /usr/share/GeoIP/GeoIP.dat.


New way: install these packages: libgeo-ip-perl, libgeo-ipfree-perl, geoip-bin

Old way: get the code and databases from Maxmind. Specifically I use Geo-IP-1.31 (Perl), Geo-IP-1.4.4 (C), and also the free GeoIP (country) and GeoIPCity (city) databases - these get gunzipped, and the .dat files should be put in the /usr/local/share/GeoIP/ folder.


Once the packages are installed, test with a geoiplookup query; the output should resemble

GeoIP Country Edition: US, United States

If you installed the GeoLite City database, here is an alternate invocation that uses it:

geoiplookup -f /usr/local/share/GeoIP/GeoLiteCity.dat 
GeoIP City Edition, Rev 1: US, CA, San Francisco, 94109, 37.795700, -122.420898, 807, 415


Now set these values in the awstats configuration file:

LoadPlugin="geoip GEOIP_STANDARD /usr/local/share/GeoIP/GeoIP.dat"
LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /usr/local/share/GeoIP/GeoIPCity.dat"

Another alternative is simply to use the geo-ipfree library.

Then add/uncomment this line in awstats.conf
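Assuming the stock geoipfree plugin bundled with AWStats (it wraps the Geo::IPfree Perl module and needs no .dat files), the line looks like:

```
LoadPlugin="geoipfree"
```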


Using WHOIS lookups

This plugin requires the Perl module Net::XWhois. On Debian/Ubuntu this can be installed using aptitude install libnet-xwhois-perl

Then uncomment the corresponding LoadPlugin line in the config file.
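Assuming the stock hostinfo plugin (the AWStats plugin built on Net::XWhois), the line to uncomment looks like:

```
LoadPlugin="hostinfo"
```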


Processing Tips

If you are processing massive log files (>4 million hits per day), I recommend the following changes.

  • Use SkipFiles="REGEX[.*\.(com|css|eot|gif|htc|jpg|js|png|rss|svg|swf|ttf)$]". Adjust to suit.
  • Bump $LIMITFLUSH in the awstats script from 5000 to 1000000 or more.
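To check whether a site is actually in that range before tuning, a per-day hit count over the access log is enough. A sketch assuming Apache common/combined log format (the timestamp sits between square brackets), run here against a tiny hypothetical sample:

```shell
# Tiny stand-in for a real access log.
cat > sample_access.log <<'EOF'
1.2.3.4 - - [01/Feb/2008:10:00:00 -0800] "GET / HTTP/1.1" 200 512
1.2.3.4 - - [01/Feb/2008:10:00:01 -0800] "GET /a.png HTTP/1.1" 200 100
5.6.7.8 - - [02/Feb/2008:09:30:00 -0800] "GET / HTTP/1.1" 200 512
EOF

# Split each line on [ and ], take the date portion before the first
# colon, and tally hits per day.
awk -F'[][]' '{ split($2, t, ":"); hits[t[1]]++ }
              END { for (d in hits) print d, hits[d] }' sample_access.log | sort
# → 01/Feb/2008 2
#   02/Feb/2008 1
```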