AWStats logfile analyzer 3.2 Documentation

 


Frequently Asked Questions


ABOUT QUESTIONS:

SETUP or ERROR QUESTIONS:
Here you can find the most common questions and answers about installing and setting up AWStats.
COMMON SUPPORT QUESTIONS:
Here you can find the most common questions and answers about using AWStats.
SECURITY QUESTIONS:
Here you can find common questions about security problems when setting up or using AWStats.





FAQ-ABO100 : WHICH WEB SERVER OR OS ARE SUPPORTED ?
AWStats works with any web server able to write a log file in combined log format (XLF/ELF) like Apache, in common log format (CLF) like Apache or Squid, or in W3C log format like IIS 5.0 or higher (some users have reported that IIS 4.0 can be set up to log in W3C format, but this requires Service Pack 6), as well as with a lot of other web/wap/proxy servers.
Because AWStats is written in Perl, it can work on all operating systems.
Examples of platforms in use (bold means 'tested by author'; the others were reported by AWStats users to work correctly):
OS:
Windows NT 4.0, Windows 2000, Windows Me, Linux, Macintosh, Solaris, Aix, BeOS, ...
Web/Wap/Proxy servers
Apache, IIS 5.0, IceCast, IPlanet, Squid, WebStar, www4mail, ...
Perl interpreters:
ActivePerl 5.6, Perl for unix 5.0, mod_perl for Apache, ...


FAQ-ABO150 : WHICH LOG FORMAT CAN AWSTATS ANALYZE ?
AWStats setup knows predefined log formats that you can use to make AWStats configuration easier. However, you can also define your own log format, which is why AWStats can analyze the log files of nearly all web servers and proxies. The only requirement is that your log file contains the required information.
These are examples of possible log formats:
Apache combined log format (known as NCSA combined log format or XLF or ELF format)
IIS 5.0+ log format (known as W3C format)
Webstar native log format
...
Apache common log format (AWStats can now analyse such log files, but they do not contain all the information AWStats is looking for. The problem is in the content, not in the format. I think analysing common log files is not interesting because a lot of information is missing: there is no way to filter robots or to find search engines, keywords, OS or browsers. But a lot of users asked me for it, so AWStats supports it. However, a lot of interesting advanced features can't work: browsers, OS, keywords, robot detection...).

See also F.A.Q.: LOG FORMAT SETUP OR ERRORS .


FAQ-ABO200 : WHICH LANGUAGES ARE AVAILABLE ?
AWStats can make reports in 25 languages. Here is the list of all of them, for the latest version, in alphabetical order:
Bosnian=ba, Chinese (Taiwan)=tw, Chinese (Traditional)=cn, Czech=cz, Danish=dk, Dutch=nl, English=en, French=fr, German=de, Greek=gr, Hungarian=hu, Indonesian=id, Italian=it, Japanese=jp, Korean=kr, Norwegian (Nynorsk)=nn, Norwegian (Bokmål)=nb, Polish=pl, Portuguese=pt, Romanian=ro, Russian=ru, Spanish=es, Swedish=se, Turkish=tr, Ukrainian=ua
However, AWStats documentation is only provided in English.
But, you can find some documentation made by contributors:
In French: How to install AWStats and Webalizer


FAQ-ABO250 : CAN AWSTATS BE INTEGRATED WITH PHP NUKE ?
I don't know of any plan, for the moment, to make an add-on for PHPNuke that includes AWStats. But this can change. You should ask the PHPNuke authors, or the PHPNuke forums, for such an add-on.




FAQ-SET050 : ERROR "MISSING $ ON LOOP VARIABLE ..."
PROBLEM: When I run awstats.pl from command line, I get:
"Missing $ on loop variable at awstats.pl line xxx"
SOLUTION: The problem is in your Perl interpreter. Try to install or reinstall a more recent/stable Perl interpreter.
You can get a new Perl version at ActivePerl (Win32) or Perl.com (Unix/Linux/Other).


FAQ-SET100 : I SEE PERL SCRIPT'S SOURCE INSTEAD OF ITS EXECUTION
PROBLEM: When I try to execute the Perl script through the web server, I see the Perl script's source instead of the HTML result page of its execution !
SOLUTION: This is not a problem with AWStats but with your web server setup. The awstats.pl file must be in a directory defined in your web server as a "cgi" directory, that is, a directory configured to contain "executable" files and not document files. Read your web server manual to learn how to set up a directory as an "executable cgi" directory (with IIS, there are some checkboxes to check in the directory properties; with Apache, you have to use the "ExecCGI" option in the "Directory" directive).


FAQ-SET150 & FAQ-SET200 : INTERNAL SERVER ERROR IN MY BROWSER
ERROR "... COULDN'T SPAWN CHILD PROCESS..." IN APACHE ERROR LOG
PROBLEM: AWStats seems to run fine at the command prompt, but when run as a CGI from a browser, I get:
"Internal Server Error"
Sometimes I get the following message in my error log file:
[error] [client xx.xx.xx.xx] No such file or directory: couldn't spawn child process: c:/mywebroot/cgi-bin/awstats.pl
SOLUTION: This problem occurs with the Apache web server when it has no internal Perl interpreter (mod_perl not active). To solve this, you must tell Apache where your Perl interpreter is. There are 2 solutions:
1) Change the first line of the awstats.pl file to the full path of your Perl interpreter.
Example: with Windows and the ActivePerl interpreter (installed in C:\Program Files\ActivePerl), you must change the first line of the awstats.pl file to:
#!c:/program files/activeperl/bin/perl
2) Other solution: uncomment the following line in your Apache httpd.conf config file (remove the # at the beginning):
ScriptInterpreterSource registry
Then restart Apache. This tells Apache to use the program associated with the .pl extension in the Windows registry to find the Perl interpreter.


FAQ-SET300 : ERROR "COULDN'T OPEN FILE ..."
PROBLEM: I have the following error:
"Couldn't open file /workingpath/awstatsmmyyyy.tmp.9999: Permission denied."
SOLUTION: This error means that the web server failed to write the temporary working file (a file ending with .tmp.9999, where 9999 is a number) because of a permissions problem.
First check that the directory /workingpath has write permission for
user nobody (default user used by Apache on Linux systems)
or user IUSR_SERVERNAME (default user used by IIS on NT).
With Unix, try a path with no links.
With NT, you must check NTFS permissions if your directory is on an NTFS partition.
With IIS, there is also a write permission attribute, defined in the directory properties in your IIS setup, that you must check.
With IIS, if a default cgi-bin directory was created during IIS install, try to put AWStats directly in this directory.
If this still doesn't work, you can change the DirData parameter to tell AWStats to use another directory (a directory you are sure the default user, used by the web server process, can write into).
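
The permission check above can be tried out before changing any setup. The following is a minimal sketch in Python, not part of AWStats; the function name can_write is hypothetical. It only tests the account that runs the script, so to be meaningful it would have to be run as the web server user (nobody, IUSR_SERVERNAME, ...).

```python
import tempfile

def can_write(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers "Permission denied" as well as a missing directory.
        return False

# Example: test the system temporary directory.
print(can_write(tempfile.gettempdir()))
```

If the function returns False for your DirData directory when run as the web server user, AWStats will most likely fail in the same way.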


FAQ-SET350 : EMPTY OR NULL STATISTICS REPORTED
PROBLEM: AWStats seems to work but I'm not getting any results. I get a statistics page that looks like I have no hits.
SOLUTION: This is the most common problem you can have, and the reason is simple: your log file format setup is wrong.
If you use Apache web server
The best way of working is to use the "combined" log format (see the README.TXT file to learn how to change your Apache server log from "common" log format to "combined"). Don't forget to stop Apache, reset your log file and restart Apache to make the change to combined effective. Then you must set up your AWStats config file with the value LogFormat=1.
If you want to use another format, read the next FAQ for examples of the LogFormat value to use for each log file format.
If you use IIS server or Windows built-in web server
The Internet Information Server default W3C Extended Log Format will not work correctly with AWStats. To make it work correctly, start the IIS Snap-in, select the web site and look at its Properties. Choose W3C Extended Log Format, then Properties, then the Extended Properties tab, and uncheck everything under Extended Properties. Once they are all unchecked, check the fields listed in the ReadMe file, in the IIS section "With IIS Server". You can also read the next FAQ for examples of the LogFormat value to use for each log file format.


FAQ-SET400 : LOG FORMAT SETUP OR ERRORS
PROBLEM: Which value do I have to put in the LogFormat parameter to make AWStats work with my log file format ?
SOLUTION: The AWStats config file gives you all possible values for LogFormat. To help you, here are some common log file formats and the corresponding value of LogFormat you must use in your AWStats config file:
If your log records are EXACTLY like this (NCSA combined/XLF/ELF log format):
62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
You must use : LogFormat=1
If your log records are EXACTLY like this (NCSA combined with Apache using mod_gzip):
62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234 "http://www.from.com/from.htm" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" mod_gzip: DECHUNK:OK In:11393 Out:3904:66pct.
You must use : LogFormat="%host %other %other %time1 %methodurl %code %bytesd %refererquot %uaquot %other %gzipres %gzipin %gzipout"
If your log records are EXACTLY like this (NCSA common CLF log format):
62.161.78.73 - - [dd/mmm/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" 200 1234
You must use : LogFormat=4
Note: Browsers, OS's, Keywords and Referers features are not available with such a format.
If your log records are EXACTLY like this (IIS W3C log format):
yyyy-mm-dd hh:mm:ss 62.161.78.73 - GET /page.html 200 1234 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) http://www.from.com/from.htm
You must use : LogFormat=2
If your log records are EXACTLY like this (With some providers):
62.161.78.73 - - [dd/Month/yyyy:hh:mm:ss +0x00] "GET /page.html HTTP/1.1" "-" 200 1234
You must use : LogFormat="%host %logname %other %time1 %methodurl %other %code %bytesd"
Note: Browsers, OS's, Keywords and Referers features are not available with such a format.
If your log records are EXACTLY like this (Webstar native log format):
05/21/00 00:17:31 OK 200 212.242.30.6 Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt) http://www.cover.dk/ "www.cover.dk" :Documentation:graphics:starninelogo.white.gif 1133
You must use : LogFormat=3
There are a lot of other possible log formats.
To support them, you must use a personalised log format LogFormat="..." as described in the config file.
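
To see why records must match the declared format EXACTLY, here is a minimal sketch in Python of how an NCSA combined record could be split into fields. It is not the AWStats parser: the regular expression, the field names and the sample date are assumptions made for illustration only.

```python
import re

# Hypothetical pattern for one NCSA combined log record.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<code>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields, or None if the line is not in combined format."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Sample record; the date is made up for the example.
combined_line = ('62.161.78.73 - - [10/Oct/2001:13:55:36 +0200] '
                 '"GET /page.html HTTP/1.1" 200 1234 '
                 '"http://www.from.com/from.htm" '
                 '"Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"')
# The same hit in common (CLF) format: no referer, no user agent.
common_line = ('62.161.78.73 - - [10/Oct/2001:13:55:36 +0200] '
               '"GET /page.html HTTP/1.1" 200 1234')

print(parse_combined(combined_line)["url"])
print(parse_combined(common_line))
```

In this sketch the common format line is rejected by the combined pattern, which is the kind of mismatch that leads to empty statistics when the format declared in the config file does not correspond to the real log records.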


FAQ-SET450 : NO PICTURES/GRAPHICS SHOWN
PROBLEM: AWStats seems to work (all data and counters seem to be good) but no images are shown.
SOLUTION: With the Apache web server, you might have trouble (no picture shown on the stats page) if you use a directory called "icons" (because of Apache's pre-defined "icons" alias directory). Use instead, for example, a directory called "icon" with no s at the end (rename your directory physically and change the DirIcons parameter in the config file to reflect this change).


FAQ-SET550 : HOW TO RUN AWSTATS FREQUENTLY
PROBLEM: AWStats must be run frequently to update statistics. How can I do this ?
SOLUTION:
With Windows, you can use the internal task scheduler. The use of this tool is not an AWStats related problem, so please take a look at your Windows manual. Warning: if you use "awstats.pl -config=mysite -update" in your scheduled task, the data might not be updated. Try this instead:
"C:\WINNT\system32\CMD.EXE /C [awstat-path]\awstats.pl -config=mysite -update"
A lot of other schedulers (shareware/freeware) are also very good.
With unix-like operating systems, you can use the "crontab".
These are examples of lines you can add to the cron file (see your unix reference manual for cron) :
To run update every day at 04:00, use :
0 4 * * * /opt/awstats/wwwroot/cgi-bin/awstats.pl -config=mysite -update
To run update every hour, use :
0 * * * * /opt/awstats/wwwroot/cgi-bin/awstats.pl -config=mysite -update


FAQ-SET600 : HOW CAN I EXCLUDE MY IP ADDRESS (OR WHOLE SUBNET MASK) FROM STATS ?
PROBLEM: I don't want to see my own IP address in the stats or I want to exclude counting visits from a whole subnet.
SOLUTION:
You must edit the config file to change the SkipHosts parameter.
For example, to exclude:
- your own IP address 123.123.123.123, use SkipHosts="123\.123\.123\.123"
- the whole subnet 123.123.123.xxx, use SkipHosts="123\.123\.123"
- all sub hosts xxx.myintranet.com, use SkipHosts="\.myintranet\.com" (This one works only if dns lookup is already done in your log file).
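
The escaped dots in these values suggest that each value is treated as a regular expression searched inside the host field. Under that assumption (it is an assumption, not a description of the AWStats code), the matching can be sketched in Python as follows; the function name is_skipped is hypothetical.

```python
import re

def is_skipped(host, skip_patterns):
    """Return True if any pattern is found anywhere in the host string."""
    return any(re.search(pattern, host) for pattern in skip_patterns)

skip_patterns = [
    r"123\.123\.123",       # whole subnet 123.123.123.xxx
    r"\.myintranet\.com",   # all hosts xxx.myintranet.com
]

print(is_skipped("123.123.123.45", skip_patterns))      # host in the subnet
print(is_skipped("pc7.myintranet.com", skip_patterns))  # intranet host
print(is_skipped("62.161.78.73", skip_patterns))        # unrelated visitor
# Caveat of an unanchored search: this unrelated address also matches,
# because the pattern is found in the last three numbers of the address.
print(is_skipped("5.123.123.123", skip_patterns))
```

If the values are indeed searched without anchors, a stricter pattern such as ^123\.123\.123\. would avoid the false match shown in the last line.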




FAQ-COM100 : AWSTATS SPEED/TIMEOUT PROBLEMS ?
PROBLEM: When I analyze large log files, processing times are very long (the update process from a browser returns a timeout/internal error after a long wait). Is there a setup or something I can do to avoid this and increase speed ?
SOLUTION: Yes. You really need to understand how a log analyzer works to get good speed. There are also major setup changes you can make to decrease your processing time.
- Launch AWStats more often (from crontab or a scheduler). The more often you launch AWStats, the faster it runs (because there are fewer NEW lines in the log to process since the last run). See the Benchmark page for examples of launching frequency according to your web traffic.
- You can disable DNSLookup in the config file (set DNSLookup=0), but this absolutely requires that the host addresses in your log file are already resolved (you need to set up your web server to do so). Speed can increase by 2 to 50 times !
If you don't understand what an "already resolved reverse DNS lookup" is, keep this parameter set to 1.
- If you use Apache, set PurgeLogFile to 1 (by default, to avoid bad surprises, PurgeLogFile is 0 in the config file, but you can set it to 1 to ask AWStats to purge the log file after processing it. This increases speed for the next run).
- Use the latest AWStats version.


FAQ-COM150 : BENCHMARK / FREQUENCY TO LAUNCH AWSTATS TO UPDATE STATISTICS
PROBLEM: What is AWStats speed ?
PROBLEM: What is the frequency to launch AWStats process to update my statistics ?
SOLUTION: All benchmark information and advice on the frequency of the update process are given on the Benchmark page.

FAQ-COM200 : HOW REVERSE DNS LOOKUP WORKS, UNRESOLVED IP ADDRESSES
PROBLEM: The report page AWStats shows me has no hostnames, only IP addresses, and the countries reported are all "unknown".
SOLUTION: When AWStats finds an IP address in your log file, it tries a reverse DNS lookup to find the hostname and domain if the DNSLookup parameter in your AWStats config file is set to DNSLookup=1 (default value). So, first, check that you have the right value. DNSLookup=0 must be used only if your log file contains already resolved IP addresses, for example when you set up Apache with the HostnameLookups On directive. When you ask your web server to do the reverse DNS lookup itself, to log hostnames instead of IP addresses, you will still find some IP addresses in your log file because a reverse DNS lookup is not always possible. But if your web server fails at it, AWStats will also fail (all reverse DNS lookups use the same system API). So to prevent AWStats from repeating a lookup that was already done (successfully or not), you can set DNSLookup=0 in the AWStats config file. Since 2.23, because a lot of users don't know this option, when AWStats finds an already resolved IP address in your log file, it disables the reverse DNS lookup by itself, because this means the reverse lookup was already done in the log file. If IIS or Apache has done a DNS lookup for one record in your log file, it must have done it for the whole file. If you find only a few lines with hostnames and the others with IP addresses, it means your web server failed to resolve them. Check your reverse DNS system with the nslookup command (available on NT/2000 and Unix).
Apache users might be interested to know that the Apache distribution includes a tool called logresolve, which can convert a log file with IP addresses into a log file with resolved hostnames.
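
To check quickly whether a log file already contains resolved hostnames, the host fields can be inspected. The following is a minimal sketch in Python; it is not the AWStats detection code, and the function names looks_like_ip and already_resolved are hypothetical.

```python
import ipaddress

def looks_like_ip(host):
    """Return True if the host field is a raw IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def already_resolved(hosts):
    """Return True if at least one host field is a hostname, not an IP address.

    One hostname is enough to conclude that the web server did the
    reverse lookup itself; remaining IP addresses are failed lookups.
    """
    return any(not looks_like_ip(host) for host in hosts)

raw_log_hosts = ["62.161.78.73", "212.242.30.6"]
resolved_log_hosts = ["62.161.78.73", "proxy1.example.net"]

print(already_resolved(raw_log_hosts))       # only IP addresses
print(already_resolved(resolved_log_hosts))  # the server resolved what it could
```

Here proxy1.example.net is a made-up hostname. A log for which already_resolved returns True is the case in which a second lookup would bring nothing new.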


FAQ-COM250 : DIFFERENT RESULTS THAN OTHER ANALYZER
PROBLEM: I also use Webalizer (or another log analyzer) and it doesn't report the same results as AWStats. Why ?
SOLUTION: If you compare AWStats results with another log file analyzer, you will find some differences, sometimes very large ones. In fact, all analyzers (even AWStats) do some "over reporting" because of the problem of proxy servers and robots. However, AWStats is one of the most accurate and its "over reporting" is very low, whereas all other analyzers, even the most famous, have a very high error rate (10% to 2x more than reality).
These are the most important reasons why you will find differences:
- Some dynamic pages generated by CGI programs are not counted by some analyzers (e.g. Webalizer) as a "Page" (but only as a "Hit") if the CGI program does not have a .cgi extension, so they are not included correctly in their statistics. AWStats does not make this error and all CGI pages are counted as pages.
- AWStats is the only analyzer (that I know of for the moment) able to detect robot visits. All other analyzers think a robot is a human visitor. This error makes them report more visits and visitors than in reality. This does not happen with AWStats. When it says "1 visitor", it means "1 human visitor". All robot hits are reported in the "Robots/Spiders visitors" chart.
- A lot of analyzers (e.g. Webalizer) use "Hits" to count visitors. This is not a good way of working: some visitors use a lot of proxy servers to surf (e.g. AOL users), which means that several hosts (with several IP addresses) may be used to reach your site for only one visitor (e.g. one proxy server downloads the page and 2 other servers download all the images). Because of this, if unique visitor stats are based on "hits", 3 users are reported, which is wrong. So AWStats, like HitBox, considers only HTML "Pages" to count unique visitors. This reduces the error, though not completely, because it is always possible that one proxy server downloads one HTML frame and another one downloads another frame, but it makes the over-reporting of unique visitors less significant.
There are also differences in the databases and algorithms of log analyzers that make the details of the results more or less accurate:
- AWStats has a larger browser, OS and search engine database, so the reports concerning these are more accurate.
- AWStats has URL syntax rules to find the keywords or keyphrases used to find your site, but AWStats also has an algorithm to detect keywords from unknown search engines with unknown URL syntax rules.
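
The effect of counting unique visitors on "Hits" instead of "Pages" can be illustrated with a small sketch in Python. It is not the AWStats algorithm: the list of image extensions, the host names and the function names are assumptions made for the example.

```python
# Hypothetical list of extensions treated as "not a page".
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")

def is_page(url):
    """Treat every URL that is not an image as a page."""
    return not url.lower().endswith(IMAGE_EXTENSIONS)

def unique_hosts(records, pages_only):
    """Count distinct hosts over all hits, or over page hits only."""
    return len({host for host, url in records
                if not pages_only or is_page(url)})

# One visitor behind three proxy servers: one proxy fetches the page,
# the two others fetch the images.
records = [
    ("proxy1.example.net", "/index.html"),
    ("proxy2.example.net", "/logo.gif"),
    ("proxy3.example.net", "/photo.jpg"),
]

print(unique_hosts(records, pages_only=False))  # counted on hits
print(unique_hosts(records, pages_only=True))   # counted on pages
```

Counted on hits, the single visitor of this example appears as 3 hosts; counted on pages, only the host that downloaded the HTML page remains.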


FAQ-COM300 : DIFFERENCE BETWEEN LOCAL HOURS AND AWSTATS REPORTED HOURS
PROBLEM: I use IIS and there's a difference between the local hour and the hour reported by AWStats. For example, I made a hit on a page at 4:00 and AWStats reports that I hit it at 2:00.
SOLUTION: This is not a problem with the time on your local client host. AWStats uses only the time reported in the logs by your server, and all times are relative to the server hour. The problem is that IIS, in some foreign versions, puts GMT time in its log file (and not local time). So you also get GMT time in your statistics.
You can do nothing, for the moment, but wait for Microsoft to change this in the next IIS versions. However, Microsoft sheet Q271196 "IIS Log File Entries Have the Incorrect Date and Time Stamp" says:
The selected log file format is the W3C Extended Log File Format. The extended log file format is defined in the W3C Working Draft WD-logfile-960323 specification by Phillip M. Hallam-Baker and Brian Behlendorf. This document defines the Date and Time fields to always be in GMT. This behavior is by design.
So this means this way of working might never be changed.


FAQ-COM350 : HOW CAN I PROCESS OLD LOG FILE ?
PROBLEM: I want to process an old log file to include its data in my AWStats reports.
SOLUTION: You must change your LogFile parameter to point to the old log file and run the update. However, the update process can only accept files in chronological order, so if you have already processed a more recent file, you must first reset all your statistics (see next FAQ) and then rerun the update process for all past log files, in chronological order.


FAQ-COM400 : HOW CAN I RESET ALL MY STATISTICS ?
PROBLEM: I want to reset all my statistics and restart my stats from now.
SOLUTION: All analyzed data are stored by AWStats in files called awstatsMMYYYY.[site.]txt (one file for each month). You will find those files in the directory defined by the "DirData" parameter (the same directory as awstats.pl by default).
To reset your stats for a month, you just have to delete the file for the required month/year.
To reset all your stats, delete all files awstats*.txt
Warning: if you delete those data files, you won't be able to recover your stats, unless you kept your old log files somewhere. You will have to process all past log files (in chronological order) to get the old statistics back.
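
Before deleting anything, it can help to list which data files belong to a given month. The following is a minimal sketch in Python based only on the file naming awstatsMMYYYY.[site.]txt given above; the function name history_files_for and the sample file names are hypothetical, and nothing is deleted by this code.

```python
import fnmatch

def history_files_for(filenames, month, year):
    """Return the data files of one month, for every site."""
    pattern = "awstats{:02d}{:04d}*.txt".format(month, year)
    return [name for name in filenames
            if fnmatch.fnmatchcase(name, pattern)]

# Made-up content of a DirData directory.
dir_data_content = [
    "awstats092001.txt",
    "awstats092001.mysite.txt",
    "awstats102001.mysite.txt",
    "awstats.mysite.conf",
]

print(history_files_for(dir_data_content, 9, 2001))
```

In practice the list of names would come from the directory defined by DirData (for example with os.listdir), and the selected files would be reviewed before being removed.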


FAQ-COM450 : HOW CAN I UPDATE MY STATISTICS WHEN I USE A LOAD BALANCING SYSTEM THAT SPLITS MY LOGS ?
PROBLEM: How can I update my statistics when I use a load balancing system that splits my logs ?
SOLUTION: The best solution is to merge the split log files from all your load balanced servers into one. For this, you can use the logresolvemerge tool provided with AWStats since version 3.2 :
logresolvemerge.pl file1.log file2.log ... filen.log > newfiletoprocess.log
Then set the LogFile parameter in your config file to process the newfiletoprocess.log file.
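
The idea behind such a merge is that each server log is already in chronological order, so the files only need to be interleaved by timestamp. The following is a minimal sketch of that idea in Python. It is not how logresolvemerge.pl is implemented; it assumes NCSA style timestamps between square brackets, and the function names and sample records are made up.

```python
import heapq
from datetime import datetime

def timestamp(line):
    """Extract the "[dd/Mon/yyyy:hh:mm:ss +zzzz]" field of a log record."""
    start = line.index("[") + 1
    end = line.index("]")
    return datetime.strptime(line[start:end], "%d/%b/%Y:%H:%M:%S %z")

def merge_logs(*logs):
    """Interleave several chronologically sorted logs into one sorted list."""
    return list(heapq.merge(*logs, key=timestamp))

server1 = [
    '10.0.0.1 - - [10/Oct/2001:13:00:00 +0000] "GET /a.html HTTP/1.0" 200 100',
    '10.0.0.1 - - [10/Oct/2001:13:00:02 +0000] "GET /c.html HTTP/1.0" 200 100',
]
server2 = [
    '10.0.0.2 - - [10/Oct/2001:13:00:01 +0000] "GET /b.html HTTP/1.0" 200 100',
]

for record in merge_logs(server1, server2):
    print(record)
```

With real files, each argument could be an open file object instead of a list, since heapq.merge reads its inputs lazily, line by line.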




FAQ-SEC00 : CAN AWSTATS BE USED TO MAKE CROSS SITE SCRIPTING ATTACKS ?
PROBLEM: If a malicious user uses a browser to make a hit on a URL that includes a < SCRIPT > ... < /SCRIPT > section in its parameters, will the script be executed when AWStats shows the link on the report page ?
SOLUTION: No. AWStats uses a filter to remove all script code that was included in a URL to make a Cross Site Scripting attack through a log analyzer report page.
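
As an illustration of the general principle, here is a minimal sketch in Python that neutralises script code in a logged URL before writing it into an HTML report. Note that it uses HTML escaping, which is a different technique from the removal filter described above, and that it is not the AWStats code; the function name safe_link is hypothetical.

```python
import html

def safe_link(url):
    """Build an HTML link in which the logged URL cannot inject markup."""
    escaped = html.escape(url, quote=True)
    return '<a href="{0}">{0}</a>'.format(escaped)

logged_url = "/page.html?q=<script>alert(1)</script>"
print(safe_link(logged_url))
```

The browser then displays the characters < and > as text instead of interpreting them as a script tag.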


FAQ-SEC150 : HOW CAN I PREVENT SOME USERS TO SEE STATISTICS OF OTHER USERS ?
PROBLEM: I don't want a user xxx (having a site www.xxx.com) to see the statistics of user yyy (having a site www.yyy.com). How can I set up AWStats for this ?
SOLUTION: If you host different users/sites, it means you have different config files.
A common way to manage security rights is to put awstats.pl in a directory protected by authentication (.htaccess with Apache). Then, you set the file permissions on all user config files (awstats.xxx.conf, awstats.yyy.conf...) so that each is readable by its owner only (xxx for awstats.xxx.conf, yyy for awstats.yyy.conf...). With this setup, if user xxx tries to see the statistics of user yyy (using the URL http://provider.com/cgi-bin/awstats.pl?config=yyy), AWStats will be run as user xxx (because awstats.pl is in a protected directory) and won't be able to read the config file of user yyy (because only user yyy can read his config file).


FAQ-SEC200 : HOW TO MANAGE LOG FILES (AND STATISTICS) CORRUPTED BY 'CODE RED VIRUS LIKE' ATTACKS ?
PROBLEM: My site is attacked by some Code Red viruses. This makes my log file corrupted and full of 404 errors, so my statistics are also full of 404 errors. This makes AWStats slower and my history files very large. Can I do something to avoid this ?
SOLUTION: Yes.
'Code Red virus like' attacks come from infected browsers or robots that make hits on your site using a very long unknown URL like this one (hoping your server is IIS):
/default.ida?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX%40%50...%40%50
The URL is generated by the infected robot and its purpose is to exploit a vulnerability of the web server (only IIS is affected by the Code Red worm). So you will often find a 'common string' in those URLs. For example, with the Code Red worm, there is always default.ida in the URL string. So you should edit your config file and add the following value to the SkipFiles parameter:
SkipFiles="default.ida"