Recent Topics

1 Jul 11, 2010 17:30    

My b2evolution Version: 3.3.3

This might have made sense in another forum (like plugins & extensions), but I can't start a thread there so it's going here instead B).

So, I was somewhat frustrated by the provided stats in two respects:
1. The data is purged from the database after 15 days.
2. There aren't rich graphs/charts.

Being a former programmer turned manager, I figured I could write something up to meet my needs. This was the first time I've ever written PHP code, so I admit it's hacky. There are obvious optimizations (authentication, optimize SQL, restructure code, etc). It's quick & dirty, but if it's helpful to others then feel free to reuse.

As an example of what you can do with this, check out this excel file: http://cookingwithdoyle.com/test/Stats.xlsx

If you want to generate something like this for your site then follow these instructions:

1. Paste the code at the bottom of this post into a file called stats.php and then upload to your server somewhere. There are a few parameters that you'll need to update: $dbUserName, $dbPassword, $dbName.
2. Get the file geoiploc.php from http://chir.ag/projects/geoiploc/ (this resolves the IP address into the country, which is a cool stat to see)
3. Point your browser at stats.php. It'll dump the stats from the hit_log in your database into a file called data.txt and show you an HTML table of the data extracted. On future page hits, it'll parse the file and then append it with new hits.
4. Copy the data.txt file that was generated (will be in the same directory) to your local machine. Leave a copy on the webserver since it'll get updated the next time you do step #3.
5. Grab the excel file I wrote up (see above) or make your own. If you use mine, update the database connection to point to wherever you choose to save that data.txt file. (You may need to enable macros/add the directory to the safe list in Excel).
6. Refresh all tables and pivot charts.

Now that you've done that once, in the future you only need to do the following:
1. Hit the stats.php page with your browser.
2. Copy the data.txt file onto your local machine, overwriting the old file.
3. Open Excel and refresh all.

Since the database wipes the hitlog on a rolling 15-day cycle, you need to hit the .php file once every two weeks to avoid permanently losing data. However, if you don't want to copy the data.txt file each time then that's ok since it's getting appended each time you hit stats.php. It'll just keep growing for the next time you snag it.

As I said, there are obvious improvements that could be done. This is quick & dirty.

[u]stats.php[/u]


<html>
<body>

<?php

/****************************************
These are support functions needed by the code below
****************************************/

// http://chir.ag/projects/geoiploc/
include("geoiploc.php");

function getMonthAbbr($num)
{
	if ($num == 1)
		return "Jan";
	if ($num == 2)
		return "Feb";
	if ($num == 3)
		return "Mar";
	if ($num == 4)
		return "Apr";
	if ($num == 5)
		return "May";
	if ($num == 6)
		return "Jun";
	if ($num == 7)
		return "Jul";
	if ($num == 8)
		return "Aug";
	if ($num == 9)
		return "Sep";
	if ($num == 10)
		return "Oct";
	if ($num == 11)
		return "Nov";
	if ($num == 11)
		return "Dec";
	return "";
}

function getOS($agent)
{
	if (strpos($agent,"NT 6.1"))
	{
		$info['os']="Windows 7/Server 2008 R2";
	}
	else if(strpos($agent,"NT 6.0"))
	{
		$info['os']="Windows Vista/Server 2008";
	}
	else if(strpos($agent,"NT 5.2"))
	{
		$info['os']="Windows XP 64-bit/Server 2003";
	}
	else if(strpos($agent,"NT 5.1"))
	{
		$info['os']="Windows XP";
	}
	else if(strpos($agent,"NT 5.0"))
	{
		$info['os']="Windows 2000";
	}
	else if(strpos($agent,"NT 4.0"))
	{
		$info['os']="Windows NT 4.0";
	}
	else if(strpos($agent,"NT 3.51"))
	{
		$info['os']="Windows NT 3.51";
	}
	else if(strpos($agent,"NT 3.5"))
	{
		$info['os']="Windows NT 3.5";
	}
	else if(strpos($agent,"NT 3.1"))
	{
		$info['os']="Windows NT 3.1";
	}
	else if(strpos($agent,"Mac OS X"))
	{
		$info['os']="Mac OS X";
	}
	else if(strpos($agent,"Android"))
	{
		$info['os']="Android";
	}
	else if(strpos($agent,"Linux"))
	{
		$info['os']="Linux";
		if (strpos($agent,"Ubuntu"))
		{
			$info['os-flavor'] = "Ubuntu";
		}
		else if (strpos($agent,"Fedora"))
		{
			$info['os-flavor'] = "Fedora";
		}
		else if (strpos($agent,"SUSE"))
		{
			$info['os-flavor'] = "SUSE";
		}
		else
		{
			$info['os-flavor'] = "Unknown";
		}
	}

	return $info;
}

function getBrowserInfo($agent)
{
	// Get OS
	$info = getOS($agent);
	
	// Internet Explorer
	$pos = strpos($agent,"MSIE");
	if ($pos)
	{
		$versionEnd = strpos($agent,";",$pos);
		$info['browser'] = "MSIE";
		if ($versionEnd)
		{
			$info['version'] = substr($agent,$pos+5,$versionEnd - $pos - 5);
		}
		return $info;
	}
	
	// Opera
	$pos = strpos($agent,"pera/");
	if ($pos)
	{
		$info['browser'] = "Opera";
		$versionStart = strpos($agent,"Version/") + 8;
		if (strpos($agent,"Version/")) {
			$info['version'] = substr($agent,$versionStart,strlen($agent));
		}
		return $info;
	}
	
	// Iceweasel
	$pos = strpos($agent,"Iceweasel");
	if ($pos)
	{
		$info['browser'] = "Iceweasel";
		$versionEnd = strpos($agent," ",$pos);
		$info['version'] = substr($agent,$pos + 10,$versionEnd - $pos - 10);
		$info['os-flavor'] = "Debian";
		return $info;
	}

	// Firefox
	$pos = strpos($agent,"Firefox");
	if ($pos)
	{
		$info['browser'] = "Firefox";
		$versionEnd = strpos($agent," ",$pos);
		if ($versionEnd > $pos)
		{
			$info['version'] = substr($agent,$pos + 8, $versionEnd - $pos - 8);
		}
		else
		{
			$info['version'] = substr($agent,$pos + 8);
		}
		return $info;
	}
	
	// Google Chrome
	$pos = strpos($agent,"Chrome/");
	if ($pos)
	{
		$info['browser'] = "Chrome";
		$versionStart = $pos + 7;
		$versionEnd = strpos($agent, " ",$versionStart);
		$info['version'] = substr($agent,$versionStart,$versionEnd - $versionStart);
		return $info;
	}

	// Safari
	$pos = strpos($agent,"Safari");
	if ($pos)
	{
		$info['browser'] = "Safari";
		$versionStart = strpos($agent,"Version");
		$versionEnd = strpos($agent," ",$versionStart);
		$info['version'] = substr($agent,$versionStart+8,$versionEnd - $versionStart - 8);
		return $info;
	}

	// Unknown
	$info['browser'] = "?";
	$info['version'] = "?";
	return $info;
}

function includeURI($uri)
{
	if (substr($uri,0,28) == "/index.php?tempskin=_sitemap")
	{
		return false;
	}
	else if (strpos($uri,"disp=msgform"))
	{
		return false;
	}
	/*else if (substr($uri,0,25) == "/index.php?tempskin=_rss2")
	{
		return false;
	}
	else if (substr($uri,0,25) == "/index.php?tempskin=_atom")
	{
		return false;
	}*/
	else if (substr($uri,0,10) == "/index.php")
	{
		return true;
	}
	else
	{
		return false;
	}
}

function startDelimited($fh,$delimiter)
{
	fwrite($fh,"ID" . $delimiter . "URI" . $delimiter . "Remote Address" . $delimiter . "Year" . $delimiter . "Month" . $delimiter . "Date" . $delimiter . "Time" . $delimiter . "Type" . $delimiter . "Browser" . $delimiter . "Browser Version" . $delimiter . "Operating System" . $delimiter . "Agent" . $delimiter . "Country" . $delimiter . "Month Abbr" . $delimiter . "Referrer Type" . $delimiter . "Referrer" . $delimiter . "Search Rank" . $delimiter . "Linux Flavor");
}

function startHTML()
{
	echo "<table border=1><tr><th>ID</th><th>URI</th><th>Remote Address</th><th>year</th><th>month</th><th>Date</th><th>Time</th><th>Type</th><th>Browser</th><th><nobr>Browser Version</nobr></th><th><nobr>Operating System</nobr></th><th>Agent</th><th>Country</th><th><nobr>Month Abbr</nobr></th><th><nobr>Referrer Type</nobr></th><th>Referrer</th><th><nobr>Search Rank</nobr></th><th><nobr>Linux Flavor</nobr></th></tr>";
}

function endHTML()
{
	echo "</table>";
}

function displayHTML($data)
{
	echo "<tr>";
	displayHTMLCell($data['ID']);
	displayHTMLCell($data['uri']);
	displayHTMLCell($data['remote_addr']);
	displayHTMLCell($data['year']);
	displayHTMLCell($data['month']);
	displayHTMLCell($data['date']);
	displayHTMLCell($data['time']);
	displayHTMLCell($data['agentType']);
	displayHTMLCell($data['browser']);
	displayHTMLCell($data['version']);
	displayHTMLCell($data['os']);
	displayHTMLCell($data['agent']);
	displayHTMLCell($data['country']);
	displayHTMLCell($data['month_abbr']);
	displayHTMLCell($data['referrer_type']);
	displayHTMLCell($data['referrer']);
	displayHTMLCell($data['search_rank']);
	displayHTMLCell($data['os-flavor']);

	echo "</tr>\n";
}

function displayHTMLCell($text)
{
	echo "<td><nobr>" . $text . "</nobr></td>";
}

function writeDataLine($fh,$delimiter,$data)
{
		fwrite($fh, "\r\n" . $data['ID'] . $delimiter . $data['uri'] . $delimiter . $data['remote_addr'] . $delimiter . $data['year'] . $delimiter . $data['month'] . $delimiter . $data['date'] . $delimiter . $data['time'] . $delimiter . $data['agentType'] . $delimiter . $data['browser'] . $delimiter . $data['version'] . $delimiter . $data['os'] . $delimiter . $data['agent'] . $delimiter . $data['country'] . $delimiter . $data['month_abbr'] . $delimiter . $data['referrer_type'] . $delimiter . $data['referrer'] . $delimiter  . $data['search_rank'] . $delimiter . $data['os-flavor']);
}

function findLastIDInFile($fileName,$delimiter)
{
	try
	{
	$fh = @fopen($fileName, 'r');
	if (!$fh)
		return 0;
	do {
		$line = fgets($fh);
		if (strlen($line) > 0)
		{
			$lastID = substr($line,0,strpos($line,$delimiter));
		}
	} while ($line != "");
	fclose($fh);
}
catch (Exception $e)
{
	$lastID=0;
}

if (!$lastID)
	return 0;

	return $lastID;
}

?>

<?php
	$dbAddress = "localhost";
	$dbUserName = "cookin30_bvlt1";
	$dbPassword = "Suj7zCEBz1rO";
	$dbName = "cookin30_bvlt1";
	$fileName = "data.txt";
	$delimiter="|";
	
	// Read in the existing file.  Find the last line and get the hit_ID so we know where to start.
	$lastID = findLastIDInFile($fileName,$delimiter);
	
	echo "Adding to <a href=\"" . $fileName . "\">". $fileName . "</a> after hit_ID " . $lastID . "<BR><BR>";
	
	// connect to the database
	$con = mysql_connect($dbAddress,$dbUserName,$dbPassword);
	if (!$con)
	{
		die('Could not connect: ' . mysql_error());
	}
	mysql_select_db($dbName, $con);
	
	// Loop over agent table and build local array
	$result = mysql_query("SELECT * FROM evo_useragents");
	while ($row = mysql_fetch_array($result))
	{
		$agents[$row['agnt_ID']]=$row['agnt_signature'];
		$agentTypes[$row['agnt_ID']]=$row['agnt_type'];
	}

	// Loop over basedomains and build local array
	$result = mysql_query("SELECT * FROM evo_basedomains");
	while ($row = mysql_fetch_array($result))
	{
		$referrers[$row['dom_ID']] = $row['dom_name'];
	}


	// Loop over all rows in the hitlog
	$result = mysql_query("SELECT * FROM evo_hitlog WHERE hit_ID > " . $lastID . " ORDER BY hit_ID");
	
	// Open file in append mode
	$fh = fopen($fileName, 'a');
	
	if ($lastID == 0) 
	{
		startDelimited($fh, $delimiter);
	}

	startHTML();
	while($row = mysql_fetch_array($result))
	{
		if (includeURI($row['hit_uri']))
		{
			$data['ID']=$row['hit_ID'];
			$data['uri']=$row['hit_uri'];
			$data['remote_addr']=$row['hit_remote_addr'];
			$data['date']=substr($row['hit_datetime'],0,10);
			$data['year']=substr($row['hit_datetime'],0,4);
			$data['month']=substr($row['hit_datetime'],5,2);
			$data['month_abbr']=getMonthAbbr($data['month']);
			$data['time']=substr($row['hit_datetime'],11);
			$data['agentType']=$agentTypes[$row['hit_agnt_ID']];
			if ($data['agentType'] == 'browser')
			{
				$info = getBrowserInfo($agents[$row['hit_agnt_ID']]);
				$data['browser']=$info['browser'];
				$data['version']=$info['version'];
				$data['os']=$info['os'];
				$data['os-flavor']=$info['os-flavor'];
			}
			else
			{
				$data['browser'] = "";
				$data['version'] = "";
				$data['os'] = "";
				$data['os-flavor'] = "";
			}
			$data['agent']=$agents[$row['hit_agnt_ID']];
			$data['country'] = getCountryFromIP($data['remote_addr'],"name");
			$data['referrer_type'] = $row['hit_referer_type'];
			$data['referrer'] = $referrers[$row['hit_referer_dom_ID']];
			$data['search_rank'] = $row['hit_serprank'];

			displayHTML($data);
			writeDataLine($fh,$delimiter,$data);
		}
	}
	endHTML();
fclose($fh);
?>

</body>
</html>

http://cookingwithdoyle.com

2 Jul 11, 2010 17:55

spiceygas2 wrote:

This might have made sense in another forum (like plugins & extensions), but I can't start a thread there so it's going here instead B).

How this forum works wrote:

This forum is for the individual support threads for released plugins and hacks

...

If you're a plugin developer and you want to be able to add support threads here then just leave a comment on this thread.

....

*edit*
In a moment of obscure clarity [url=http://wonderwinds.com/index.php]EdB[/url] created an index of known support threads ( [url=http://forums.b2evolution.net/viewtopic.php?t=18528]Index of plugin support threads[/url] ), for those of you who are unsure how the search function works
This assumes that anyone, who doesn't realise that search exists, will actually read a sticky post ... including this one ;)

Guess I'm crap at writing instructions :( .. moving this to plugins and hacks for you ;)

ref @link [url=http://forums.b2evolution.net/viewtopic.php?t=17969]sticky[/url] ;)

¥

3 Jul 11, 2010 20:54

Since the database wipes the hitlog on a rolling 15-day cycle

You can change this number on Global settings > Features tab ;)

4 Jul 11, 2010 21:08

sam2kb wrote:

Since the database wipes the hitlog on a rolling 15-day cycle

You can change this number on Global settings > Features tab ;)

Still wouldn't give the flexibility that's available in excel.

(But yes, I did figure that out)

5 Jul 17, 2010 15:58

I made a few updates. The excel export now supports referrers and searches. This is probably about as far as I plan to go for now.

The code above is updated. Here are some examples of the graphs you can put together pretty quickly in excel using this extract. These are all pivot charts in excel, so you can quickly tweak filters & sort options, along with creating new charts & tables.

http://cookingwithdoyle.com/media/blogs/plugins/v1_0/01.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/02.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/03.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/04.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/05.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/06.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/07.png
http://cookingwithdoyle.com/media/blogs/plugins/v1_0/08.png

7 Jul 20, 2010 13:55

I made a few updates to the code above:

1. Identifies google chrome browser.
2. Identifies android OS.
3. Identifies iceweasel browser.
4. Records a couple flavours of linux (certainly not all of them. This will need some improvement in the future).

Also, I had to remove all my data from the excel file linked above because search engines are parsing it and fubar'ing the keywords for my website :( . So now you get a blank spreadsheet with no data and will need to run it yourself to get a sample. (I also blocked the directory in robots.txt)

8 Sep 06, 2010 19:01

Updated the code in the first post with a couple bug fixes, as well as ability to identify SUSE Linux.

9 Sep 15, 2010 04:46

Is it just me or does b2evo 4.0.0-alpha miss user-agent extension in the stats section?

8|

10 Sep 15, 2010 05:05

True.
User agent table was dropped to increase b2evo performance.

11 Sep 15, 2010 05:27

This means that spiceygas2's Stats.php data collection is not going to work, needs some moderation!?

Volunteers?

12 Sep 15, 2010 13:48

Where did the data get moved that was in the evo_useragent table? Assuming it's still relatively accessible, I can update the code.

I don't have a 4.0.0 alpha server accessible, so I'd need something to test against. I don't have the interest yet in upgrading my website, but would be willing to update the code I wrote.

13 Sep 15, 2010 16:45

The evo_useragent table table is completely removed in b2evo 4.0.0-alpha. This means that you have to reconfigure stats.php to exclude that data collection from the database.

It appears to present the following errors when applying your Stats plug-in hack:

[u]First Area[/u]

Table 'kazaaikc_pj1.evo_useragents' doesn't exist

This error can be found here....

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/somefolder/public_html/stats.php on line 328


   mysql_select_db($dbName, $con);
   // Loop over agent table and build local array
   $result = mysql_query("SELECT * FROM evo_useragents");
    while ($row = mysql_fetch_array($result))
   { $agents[$row['agnt_ID']]=$row['agnt_signature'];
      $agentTypes[$row['agnt_ID']]=$row['agnt_type'];
   }

***

[u]The second [/u]

Error report Unknown column 'ID' in 'where clause'

This error can be found here....

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/somefolder/public_html/stats.php on line 354

   startHTML();
   while($row = mysql_fetch_array($result))
   { if (includeURI($row['hit_uri']))
      {  $data['ID']=$row['hit_ID'];
         $data['uri']=$row['hit_uri'];
         $data['remote_addr']=$row['hit_remote_addr'];
         $data['date']=substr($row['hit_datetime'],0,10);
         $data['year']=substr($row['hit_datetime'],0,4);
         $data['month']=substr($row['hit_datetime'],5,2);
         $data['month_abbr']=getMonthAbbr($data['month']);
         $data['time']=substr($row['hit_datetime'],11);
         $data['agentType']=$agentTypes[$row['hit_agnt_ID']];
         if ($data['agentType'] == 'browser')
         {
            $info = getBrowserInfo($agents[$row['hit_agnt_ID']]);
            $data['browser']=$info['browser'];
            $data['version']=$info['version'];
            $data['os']=$info['os'];
            $data['os-flavor']=$info['os-flavor'];
         }

***

You need to change those two fields in order for it to work on 4.0

You will also have to amend your stats.xmls for a 4.0 version to exclude that data collection and projection.

Your code is a hectic move, but a brilliant feature, well done!

It is a excellent tool to have thanks for sharing your work.

B)

14 Sep 16, 2010 05:04

wow. It's gone entirely in 4.0.0? Thats... stupid. Can't imagine why we'd sacrifice raw data, especially since that single string holds so much information. I'm probably never going to upgrade my production website if 4.0 is missing this information.

But I would like to try and fix the PHP code for the rest of you. I guess I'll need to dust off my old linux box and get 4.0.0 running to see what the database looks like and how tough it'll be to fix this.

Work's busy right now, so it'll be a couple days before I can take a look at it.

16 Sep 17, 2010 16:25

sam2kb wrote:

True.
User agent table was dropped to increase b2evo performance.

I understand the reasoning behind removing the useragent table. The db lookup to determine whether the user agent string was unique before writing the string to the table (and the id of that unique string to the hitlog table) was always a little redundant and unnecessary.

Unfortunately, the performance you gained (on the front end) by eliminating that db lookup is cancelled by the slowness of the php code that has to determine whether the useragent was a (browser, robot, xml, other) before that hit can be logged in the hitlog table. At best a tradeoff, maybe even a step backward for those not using opcode caching.

Basically, you've thrown the baby out with the bathwater.
A better solution would have been to have the hit-logging routines simply throw the useragent string (with no unnessesary lookup) into the hitlog table with the rest of the info, and have the backoffice statistics code determine what kind of useragent it represents. Nobody cares if the backoffice stats page takes an extra second or two to display.

The decision to remove the user agent string from the log data entirely, was a poor one. Boo! ;)

17 Sep 17, 2010 16:58

I agree with you. That's the reason why I'm still running 3.3.3 on websites where I need detailed stats.

18 Sep 17, 2010 18:02

I hope it didn't seem like I was blaming you personally. The "you" in my post was more of a generic "whoever was ultimately responsible for this decision" kind of "you". ;)

19 Sep 17, 2010 18:10

I didn't take it personally :)

20 Sep 19, 2010 00:54

In this case, performance is a really stupid reason. As you pointed out, if the lookup was really that bad then you'd just move the agent string to the hitlog itself and avoid the extra table lookup. Sure, it'll take more space but most people purge the database every 14 days so the extra space is a moot point.

(actually, now that I've said that, maybe someone should hack up the code in v4.0.0 to add an additional column to the hitlog table and log the agent string. Shouldn't be too tough...)

Or you could just do the sensible thing and index the agent strings. Yep, there's a lot of them and the index will take disk space, but isn't computer science filled with space v. time tradeoffs.

Seems like two possible solutions, either of which would have been better then throwing away vital data.

21 Sep 19, 2010 01:17

Why don't you want to make a plugin out of this hack and log any stats you need to a separate plugin's table?

22 Sep 19, 2010 22:01

sam2kb wrote:

Why don't you want to make a plugin out of this hack and log any stats you need to a separate plugin's table?

That's a decent suggestion. I just installed a bunch of junk and am looking at the framework for writing a plugin. If I can figure it out then I'll do v 3.3.3 first since it's simpler and doesn't require modifying any db tables.

Fingers crossed.

23 Sep 20, 2010 12:38

Last night I got a basic plugin running on v3.3.3 and extracting the data via the back office. I need a bit more time to do some testing and make sure it is working correctly, and this week is pretty busy at work so it might take a few more days. (If only every day was a weekend)

Once I get it done I'll post it here and then set to work on a plugin for v4.0.0.

I actually didn't realize anyone was even using this hack until someone pointed out that it doesn't work on v4.0.0. I'm glad others are getting some value from the idea.


Form is loading...