Monitoring visits
Create site12 by copying site11.
- /cms
- ...
- site11
- site12
In this chapter, we program the logging of every request to the site, both in a file and in the database.
To test the result online, enter http://www.frasq.org/cms/site12 in the address bar of your browser.
Add a folder called log directly under the root of the site:
- /cms/site12
- log
IMPORTANT: Make sure the Apache process is allowed to write in the folder:
$ chgrp www-data log
$ chmod 775 log
NOTE: The name of the group of the Apache process is defined by the Group directive. The configurations of the local and the remote servers can differ, depending on your hosting provider.
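Whether the permissions are right can also be checked from PHP itself. The following is a minimal sketch; the function name log_dir_is_writable is hypothetical and not part of the site's library, and the folder path is only an assumption about where the script runs.

```php
<?php
// Hypothetical helper, not part of the site's library.
function log_dir_is_writable($dir) {
    // is_dir() rules out a missing folder or a regular file with the same name;
    // is_writable() tests the permissions of the process running this script.
    return is_dir($dir) && is_writable($dir);
}

// Example: check a folder called log next to this script (assumed location).
$dir = __DIR__ . DIRECTORY_SEPARATOR . 'log';
if (!log_dir_is_writable($dir)) {
    error_log("Log folder $dir is missing or not writable");
}
```

Run through the web server, this test reflects the rights of the Apache process, which is what matters here; run from the command line, it reflects the rights of your own account.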
Define the global variable $log_dir in config.inc:
- global $log_dir;
- $log_dir = ROOT_DIR . DIRECTORY_SEPARATOR . 'log';
Add the files clientipaddress.php, validateipaddress.php and log.php in the folder library with the following contents:
- /cms/site12
- library
- clientipaddress.php
- validateipaddress.php
- log.php
- library
- function client_ip_address() {
- return $_SERVER['REMOTE_ADDR'];
- }
client_ip_address returns the PHP variable $_SERVER['REMOTE_ADDR'].
NOTE: $_SERVER['HTTP_X_FORWARDED_FOR'], whose value has been added by a proxy server, and $_SERVER['HTTP_CLIENT_IP'], which is assigned directly by the client, are not reliable.
- function validate_ip_address($ipaddress) {
- return preg_match('/^(([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/', $ipaddress);
- }
validate_ip_address returns a true value (1) if $ipaddress is a valid IPv4 address in dotted-decimal notation, a false value (0) otherwise.
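The same check can be delegated to PHP's filter extension instead of a hand-written regular expression. This is a sketch of an alternative, not the tutorial's method; the function name validate_ip_address_with_filter is hypothetical.

```php
<?php
// Hypothetical alternative to the regular expression, built on filter_var.
function validate_ip_address_with_filter($ipaddress) {
    // FILTER_FLAG_IPV4 restricts the check to dotted-decimal IPv4 addresses.
    // filter_var returns the address on success and false on failure.
    return filter_var($ipaddress, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(validate_ip_address_with_filter('192.168.0.1'));    // bool(true)
var_dump(validate_ip_address_with_filter('256.1.1.1'));      // bool(false)
var_dump(validate_ip_address_with_filter('not an address')); // bool(false)
```

Dropping FILTER_FLAG_IPV4 would also accept IPv6 addresses, which the table defined later in this chapter cannot store in a 15-character column.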
- require_once 'clientipaddress.php';
- require_once 'validateipaddress.php';
Loads the functions client_ip_address and validate_ip_address.
- function write_log($logfile, $textline=false) {
- global $log_dir;
- $ipaddress = client_ip_address();
- if (!validate_ip_address($ipaddress)) {
- return false;
- }
- $timestamp=date('Y-m-d H:i:s'); // strftime() is deprecated as of PHP 8.1
- $logmsg="$timestamp $ipaddress";
- if ($textline) {
- $logmsg .= "\t$textline";
- }
- $file = isset($log_dir) ? ($log_dir . DIRECTORY_SEPARATOR . $logfile) : $logfile;
- $r = @file_put_contents($file, array($logmsg, "\n"), FILE_APPEND);
- return $r;
- }
write_log obtains the IP address of the client and validates it; if the address is invalid, it returns false without writing anything. It then builds the message: the current date and time, a space, the IP address and, if $textline is given, a tab character followed by $textline.
Next the message is appended to the file $logfile, in the directory designated by the global variable $log_dir if it's defined.
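The layout of a log line can be sketched in isolation. The helper format_log_line is hypothetical: it only reproduces the formatting step of write_log, with sample values in place of the real date and client address. The LOCK_EX flag is an addition that the original code does not use.

```php
<?php
// Hypothetical helper: builds the line only; write_log also validates the address.
function format_log_line($timestamp, $ipaddress, $textline = false) {
    $logmsg = "$timestamp $ipaddress";
    if ($textline) {
        // The optional text is separated from the stamp by a tab.
        $logmsg .= "\t$textline";
    }
    return $logmsg;
}

$line = format_log_line('2024-01-15 10:30:00', '192.168.0.1', '/en/home');

// Append to a file in the temporary folder. LOCK_EX (not in the original code)
// keeps concurrent requests from interleaving their lines.
$file = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'track-example.log';
file_put_contents($file, $line . "\n", FILE_APPEND | LOCK_EX);
```

Because the date, the time and the address are separated by spaces while the rest is separated by tabs, the first tab-delimited field always holds the complete stamp.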
The logging parameters of a request are defined in config.inc:
- global $track_db, $track_log;
- global $track_visitor, $track_visitor_agent;
- $track_db=true;
- $track_log=true;
- $track_visitor=true;
- $track_visitor_agent=true;
$track_visitor set to true turns on the logging of requests. $track_visitor_agent adds the content of the User-Agent header to the logged data.
$track_db gives the name of the DB table which contains the log, track by default if $track_db is just true.
$track_log gives the name of the file which contains the log, track.log in the folder defined by $log_dir by default if $track_log is just true.
If $track_db and $track_log are both false, no logging is performed.
NOTE: You can easily register other pieces of information like the visitors' languages.
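The rule shared by $track_db and $track_log (true selects the default name, a string gives a custom name, false turns the mode off) can be sketched as follows. The function track_target is hypothetical; the site's code applies the same rule inline.

```php
<?php
// Hypothetical helper: resolves a tracking parameter to a table or file name.
function track_target($setting, $default) {
    if ($setting === true) {
        return $default;          // just true: use the default name
    }
    if (is_string($setting) && $setting !== '') {
        return $setting;          // a string: use it as the name
    }
    return false;                 // anything else: this mode is off
}

var_dump(track_target(true, 'track.log'));         // string(9) "track.log"
var_dump(track_target('visits.log', 'track.log')); // string(10) "visits.log"
var_dump(track_target(false, 'track.log'));        // bool(false)
```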
Logging requests is managed by the dispatch function in engine.php:
- require_once 'track.php';
Loads the track function.
- global $track_visitor, $track_visitor_agent;
- $req = $base_path ? substr(request_uri(), strlen($base_path)) : request_uri();
- if ($track_visitor) {
- track($req, $track_visitor_agent);
- }
Calls track with the client's request $req and the global option $track_visitor_agent if the global variable $track_visitor is true.
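The expression which computes $req removes the base path of the site from the requested URI, so that the log holds paths relative to the site. A stand-alone sketch, assuming the site is installed under /cms/site12 and that $base_path holds that value; the function strip_base_path is hypothetical.

```php
<?php
// Hypothetical stand-alone version of the expression used in dispatch.
function strip_base_path($uri, $base_path) {
    // With an empty base path, the URI is logged unchanged.
    return $base_path ? substr($uri, strlen($base_path)) : $uri;
}

echo strip_base_path('/cms/site12/en/home', '/cms/site12'), "\n"; // /en/home
echo strip_base_path('/en/home', ''), "\n";                       // /en/home
```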
Add the files useragent.php, validateuseragent.php and track.php in the folder library with the following contents:
- /cms/site12
- library
- useragent.php
- validateuseragent.php
- track.php
- library
- function user_agent() {
- if (isset($_SERVER['HTTP_USER_AGENT'])) {
- return $_SERVER['HTTP_USER_AGENT'];
- }
- return false;
- }
user_agent returns the PHP variable $_SERVER['HTTP_USER_AGENT'], or false if the request has no User-Agent header.
- function validate_user_agent($agent) {
- return preg_match('/^[a-zA-Z0-9 \;\:\.\-\)\(\/\@\]\[\+\~\_\,\?\=\{\}\*\|\&\#\!]+$/', $agent);
- }
validate_user_agent returns a true value (1) if $agent contains only characters accepted in an agent string, a false value (0) otherwise.
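To see which strings pass the filter, the function can be tried on a few sample values. The block below repeats validate_user_agent so that it runs on its own; the two agent strings are made-up examples.

```php
<?php
// Copy of validate_user_agent from the text, repeated so the example is self-contained.
function validate_user_agent($agent) {
    return preg_match('/^[a-zA-Z0-9 \;\:\.\-\)\(\/\@\]\[\+\~\_\,\?\=\{\}\*\|\&\#\!]+$/', $agent);
}

// Sample values: a typical browser string and an attempt to inject markup.
$agents = array(
    'Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0',
    '<script>alert(1)</script>',
);

foreach ($agents as $agent) {
    echo validate_user_agent($agent) ? 'accepted' : 'rejected', "\t", $agent, "\n";
}
```

The second value is rejected because the angle brackets are not in the list of accepted characters. Quotes and the percent sign are rejected for the same reason.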
- require_once 'clientipaddress.php';
- require_once 'validateipaddress.php';
- require_once 'requesturi.php';
- require_once 'useragent.php';
- require_once 'validateuseragent.php';
Loads the functions client_ip_address, validate_ip_address, request_uri, user_agent and validate_user_agent.
- function track($request_uri=false, $track_agent=false) {
- global $track_log, $track_db;
- if (! ($track_log or $track_db) ) {
- return true;
- }
- if (!$request_uri) {
- $request_uri=request_uri();
- }
- if (!$request_uri) {
- return false;
- }
- $user_agent=$track_agent ? user_agent() : false;
- if (!validate_user_agent($user_agent)) {
- $user_agent=false;
- }
- $r = true;
track returns immediately if the global variables $track_log and $track_db are both false.
If the parameter $request_uri isn't defined, track initializes it by calling the function request_uri.
If $request_uri is still not defined, track exits and returns false.
If the parameter $track_agent is true, the variable $user_agent is initialized by calling the function user_agent and validated. An agent which is not valid is discarded.
- if ($track_log) {
- require_once 'log.php';
- $logmsg = $request_uri;
- if ($user_agent) {
- $logmsg .= "\t" . $user_agent;
- }
- $r = write_log($track_log === true ? 'track.log' : $track_log, $logmsg);
- }
If the global variable $track_log is set, track loads the function write_log and asks it to log the request, and the agent if there is one, in the file whose name is defined by $track_log, or in track.log by default.
- if ($track_db) {
- $ip_address=client_ip_address();
- if (!validate_ip_address($ip_address)) {
- return false;
- }
- $sqlipaddress=db_sql_arg($ip_address, false);
- $sqlrequesturi=db_sql_arg($request_uri, true);
- $sqluseragent=db_sql_arg($user_agent, true, true);
- $tabtrack=db_prefix_table($track_db === true ? 'track' : $track_db);
- $sql="INSERT $tabtrack (ip_address, request_uri, user_agent) VALUES ($sqlipaddress, $sqlrequesturi, $sqluseragent)";
- $r = db_insert($sql);
- }
- return $r;
- }
If the global variable $track_db is set, track obtains the IP address of the client and validates it, then prepares and executes an SQL statement which records the parameters of the request in the table whose name is defined by $track_db, or in track by default.
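For readers who are not using the site's db_sql_arg and db_insert functions, the same insertion can be sketched with PDO and a prepared statement. This is an alternative under stated assumptions, not the tutorial's code: the function track_insert is hypothetical, and the demonstration uses an in-memory SQLite table, if that driver is available, only so that the block runs without a server.

```php
<?php
// Hypothetical alternative to db_sql_arg and db_insert, using PDO placeholders.
function track_insert(PDO $pdo, $table, $ip_address, $request_uri, $user_agent) {
    // A table name cannot be bound as a parameter: accept simple identifiers only.
    if (!preg_match('/^[a-zA-Z_][a-zA-Z0-9_]*$/', $table)) {
        return false;
    }
    $stmt = $pdo->prepare("INSERT INTO $table (ip_address, request_uri, user_agent) VALUES (?, ?, ?)");
    // A missing agent (false) is stored as NULL.
    return $stmt->execute(array($ip_address, $request_uri, $user_agent ? $user_agent : null));
}

// Demonstration on an in-memory SQLite table (assumption: pdo_sqlite is installed).
if (in_array('sqlite', PDO::getAvailableDrivers())) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE track (track_id INTEGER PRIMARY KEY, ip_address TEXT NOT NULL, request_uri TEXT NOT NULL, user_agent TEXT)');
    track_insert($pdo, 'track', '192.168.0.1', '/en/home', false);
    echo $pdo->query('SELECT COUNT(*) FROM track')->fetchColumn(), "\n";
}
```

With placeholders, the values never become part of the SQL text, so no quoting or escaping of the request or the agent is needed.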
Add the table track to the DB of the site:
$ mysql -u root -p
mysql> use frasqdb2;
mysql> CREATE TABLE track (
track_id int(10) unsigned NOT NULL AUTO_INCREMENT,
time_stamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
ip_address varchar(15) NOT NULL,
request_uri varchar(255) NOT NULL,
user_agent varchar(255) DEFAULT NULL,
PRIMARY KEY (track_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
mysql> quit
Check in config.inc that the parameters $track_log and $track_db are set to true.
Enter http://localhost/cms/site12/en/home in the address bar of your browser, visit the contact page, then change the language.
Display the content of the connection log:
$ tail track.log
To obtain the total number of visitors:
$ cut -f 1 track.log | cut -d' ' -f 3 | sort | uniq | wc -l
To list the 10 most consulted pages:
$ cut -f 2 track.log | sort | uniq -c | sort -rn | head -10
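The two pipelines above can be approximated in PHP, for example to display the figures in an administration page. A minimal sketch: the function track_log_stats is hypothetical and the three log lines are made-up sample data in the format written by write_log.

```php
<?php
// Hypothetical helper: counts distinct addresses and ranks the requested pages.
function track_log_stats(array $lines, $top = 10) {
    $visitors = array();
    $pages = array();
    foreach ($lines as $line) {
        // Tab-separated fields: "date time ip", request, optional agent.
        $fields = explode("\t", rtrim($line, "\r\n"));
        $stamp = explode(' ', $fields[0]);
        if (count($fields) < 2 || count($stamp) < 3) {
            continue; // skip lines which don't have the expected layout
        }
        $visitors[$stamp[2]] = true;
        $uri = $fields[1];
        $pages[$uri] = isset($pages[$uri]) ? $pages[$uri] + 1 : 1;
    }
    arsort($pages); // most requested pages first
    return array(count($visitors), array_slice($pages, 0, $top, true));
}

// Made-up sample data.
$lines = array(
    "2024-01-15 10:30:00 192.168.0.1\t/en/home",
    "2024-01-15 10:30:05 192.168.0.1\t/en/contact",
    "2024-01-15 10:31:00 192.168.0.2\t/en/home\tMozilla/5.0",
);
list($count, $top) = track_log_stats($lines);
echo "$count visitors\n";
print_r($top);
```

For a real log, the lines can be read with file($log_dir . DIRECTORY_SEPARATOR . 'track.log'), keeping in mind that this loads the whole file in memory.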
Check the DB:
mysql> SELECT * FROM track;
To obtain the total number of visitors:
mysql> SELECT COUNT(DISTINCT ip_address) from track;
To list the 10 most consulted pages:
mysql> SELECT request_uri, COUNT(request_uri) AS count from track GROUP BY request_uri ORDER BY count DESC LIMIT 10;
IMPORTANT: The amount of data generated can rapidly fill up the DB and the log file. Choose only one mode by setting $track_db or $track_log to false. Once a campaign for analyzing the types of clients (browsers, mobiles, robots, etc.) is over, set the parameter $track_visitor_agent back to false.