Google Analytics : Real Time Overview

In the last few months, the Google Analytics team hasn’t been getting much sleep, I guess. They release feature upon feature upon feature. I usually don’t have the time to properly check every feature right when it comes out, so I keep a browser tab with the announcement open until I get to play with it. Sometimes, though, I lose the tab, or close it, or lose interest, or lose hope that I will ever get to it.

Somehow I managed to completely miss, or forget to play with, the Real Time beta functionality of Google Analytics. Today I stumbled upon this feature in my reports and I have to say it’s absolutely awesome. If you can’t find it straight away, switch your Google Analytics interface to the new version, navigate to the Home tab at the top, then choose Real Time (beta) in the menu on the left, and click on Overview. If you have any sort of traffic on your site at that moment, you’ll see a screen like this, which will update in real time.

During your regular hours, especially on small sites, it would probably be too boring to watch. But it can save you a lot of time and pulled-out hair during a traffic spike.

If I had a busy website and an office full of people, I’d probably put a big screen or a projection on a wall with a full-screen view of those stats. How cool would that be!

Urchin – Google Analytics in a box

Google Analytics has proven itself over and over again as an extremely valuable tool for pretty much everyone interested in website statistics.  But as awesome as it is, Google Analytics has a number of limitations.  These get in the way when you need to analyze non-public websites, such as intranets or web-based services behind closed firewalls.  Sure, there are plenty of alternatives to Google Analytics that you could go with.  But what if you wanted to stick to Google Analytics?  I thought you couldn’t (that is, without tricks and ugly workarounds).  Apparently, I was wrong.  You can.

Urchin is a packaged Google Analytics application that you can run on your own servers, under your full control.  There are a few features in Urchin that are not in Google Analytics (mostly due to Google Analytics not having access to your server logs).  Here are the interesting ones:

  • Process historical logs
  • Status & error codes reports
  • Individual visitor history drilldown

These are things that you probably don’t care much about if you only have a couple of personal blogs to manage.  But if you are a company with a few busy websites and large chunks of your revenue spent on online advertising, you’d want each and every bit of information, including the above.

One other reason why probably only enterprises will be interested in Urchin is the price.  While Google Analytics comes to everyone for free, you’ll have to pay USD $9,995 (yes, almost ten thousand!) for the Urchin license.

And even though the price is quite prohibitive and will leave most people still using Google Analytics, I think it’s nice to have this option.

Rebuilding – Step 2 – Monitoring

The best practices of web design and development suggest that I need to set up some goals and monitoring before I make any changes to the web site.  This way, I will be able to track how the changes affect the performance of the web site.

The problem is that right now my web site is not set up very well for the goals that I want it to achieve.  But insufficient monitoring is better than none, so I logged into my Google Analytics account and configured the following goals.

According to my own goals for this web site, as described yesterday, I want to get hired more.  The only way to track it now is to see how many people used the Contact me page.  I will eventually improve my contact details and contact form to be able to track more details.  Once that is done, I’ll reconfigure this goal.

One other thing I want people to do is read more of my movie reviews.  Again, there is no easy way of monitoring it now, so I just set up a goal for the Movies category.  This will probably get reconfigured later.

And just to have some generic goals for an overall picture, I added tracking for those who visit more than 2 pages and for those who spend more than 2 minutes on the site.  That should be enough for starters.
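
To make the four goals concrete, here is a minimal sketch of what each one checks for a single visit.  This is not how Google Analytics is configured or implemented; the visit record, the `goalsReached` function, and both paths are hypothetical names made up for illustration.

```javascript
// Hypothetical paths; the real ones depend on the site's URL layout.
const CONTACT_PATH = "/contact/";
const MOVIES_PREFIX = "/category/movies/";

// Report which of the four goals a single visit has reached.
// `visit` is an assumed record: the list of page paths viewed during
// the visit, plus the total time spent on the site in seconds.
function goalsReached(visit) {
  return {
    contact: visit.pages.includes(CONTACT_PATH),
    movies: visit.pages.some((path) => path.startsWith(MOVIES_PREFIX)),
    moreThanTwoPages: visit.pages.length > 2,
    moreThanTwoMinutes: visit.durationSeconds > 120,
  };
}

// A made-up visit: three pages in a minute and a half.
const example = goalsReached({
  pages: ["/", "/category/movies/", "/contact/"],
  durationSeconds: 90,
});
console.log(example);
```

The generic goals (pages and time) will fire for a lot of visits, which is the point: they give a baseline to compare the two specific goals against.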

How accurate is Google Analytics?

That’s the question that I was asked recently by one of my co-workers.  It is simple and not so simple at the same time.  It really depends on what you are looking for, what the acceptable accuracy is, and what it is that you are comparing Google Analytics with.

For example, if you compare the numbers from your Google Analytics reports to the summaries of the web server logs, you’ll probably find that Google Analytics reports lower numbers.  Almost like not everything is recorded.  Which is true, because Google Analytics uses JavaScript to track your visitors, so requests from clients that don’t run that JavaScript never show up in the reports.  Server logs record all hits to your web server, but the information in logs is very limited – it won’t be enough for anything but very basic tracking.

How much will the numbers differ?  Here is what the Google Analytics blog has to say:

Google Analytics uses JavaScript tags to collect data. This industry-standard method yields reliable trends and a high degree of precision, but it’s not perfect. Most of the time, if you are noticing data discrepancies greater than 10%, it’s due to an installation issue. Common problems include JavaScript errors, redirects, untagged pages and slow client-side load times.

Having used Google Analytics on a number of sites over a number of years, I’d say that that is just about right.
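
If you want to put a number on the difference for your own site, the arithmetic is simple.  Below is a minimal sketch, assuming you have already reduced the server log to something comparable (page views rather than every hit for every image and stylesheet).  The `discrepancyPercent` function and the counts are made up for illustration.

```javascript
// Relative gap between a server-log count and a Google Analytics count,
// expressed as a percentage of the server-log count.
function discrepancyPercent(serverLogCount, analyticsCount) {
  if (serverLogCount <= 0) {
    throw new RangeError("serverLogCount must be positive");
  }
  return ((serverLogCount - analyticsCount) / serverLogCount) * 100;
}

// Made-up numbers for illustration only.
const gap = discrepancyPercent(12000, 11100);
console.log(gap.toFixed(1) + "%");

// The quote above suggests looking for an installation issue
// once the gap goes beyond 10%.
const worthChecking = gap > 10;
console.log(worthChecking ? "check the installation" : "within the expected range");
```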

Web statistics and visitor tracking : things you need to know

First of all, just to make it clear, I don’t recommend writing your own web statistics / analytics / tracking application.  Google Analytics can track and report pretty much everything you will ever need. Period. If you think it can’t do it, chances are you just don’t know how.  That’s much easier to correct than to write your own tracking / reporting application.  I promise.  If, though, Google Analytics really doesn’t do something that you need, grab one of those Open Source applications and modify it to suit.  While not as easy as learning Google Analytics, that would still be much easier than doing your own thing from scratch.

However, if you still decide to roll out your own tracker, here are a few things that you need to know.

  • Use the bicycle, don’t reinvent it. Most of the tracking applications that I’ve seen use some form of JavaScript, which is appended right before the end of the page markup.  Said JavaScript collects as many statistics as you need and generates a request to an image on the remote server (your tracking application), passing the gathered statistics as parameters to the image.  On the server side, your tracking application gathers the sent parameters, merges them with whatever else you can get from the server side, and saves them in the database or in your data storage of choice.
  • Keep ad blocking applications in mind. Many ad blocking plugins for different browsers block 1×1 pixel images from remote servers.  Be a bit more creative – use a 2×1 or a 1×2 pixel image.  If it is a transparent GIF at the bottom of the page, nobody will notice it anyway.
  • Gather as much as you can from the server side. It’s simpler, and you minimize the chances of breaking things with a URL which is too long (your GET request for the image with all parameters can run pretty long, especially if you pass current page and referring page URLs).
  • Minimize the length of your parameter names and values when you pass them to image GET request. Again, this is to avoid extremely long URLs.  You can sacrifice readability in your JavaScript and instead document parameters in the server side tracker application.
  • Record both the client’s IP address and a possible proxy server’s IP address. That is available to you in the request headers ($_SERVER['HTTP_X_FORWARDED_FOR'] in PHP, for example).  Once you’ve got the IP addresses, use GeoIP to look up the country, region, city, coordinates, etc.  It’s better to do so at the time you record the data.  There is a free GeoIP service as well, but it will give you much less information.  The commercial one is not that expensive.
  • Record client’s browser information. Browsercap is very useful for that.  However, it’s better to parse user agent string with browsercap at the report / export time, not at the request recording time.  This will guarantee that you always have the most correct information about the browser in your report.  Browsercap gets updated with new signatures pretty often.
  • If you are tracking a secure site (HTTPS), chances are you won’t have referrer information available to you.  Browsers generally don’t send the Referer header when a request goes from an HTTPS page to a plain HTTP URL.  That’s a security feature.
  • If you use both JavaScript and PHP to figure out the referrer, keep in mind that JavaScript uses document.referrer, while PHP uses $_SERVER['HTTP_REFERER'].  Notice that one is spelled with two Rs in the middle, while the other has only one.  That might save you some troubleshooting time.
  • It’s better to use the same JavaScript code snippet across all your sites.  To avoid SSL-related security warnings, your JavaScript needs to figure out if it’s running on an HTTPS web site or on a plain HTTP one. See the Google Analytics example on how to actually do that.  It doesn’t hurt to have a signed SSL certificate for the HTTPS hosting of your tracker application.
  • Don’t forget about HTML and URL escaping / encoding. Check that everything works properly for you in different browsers.  JavaScript is still hard to nail right sometimes.
  • Keep the version of the tracker application in every request log entry. This will greatly simplify your migrations later.  One of the ways to keep this automated is to use tags / keyword substitutions in your version control software (here is how to do this in Subversion).
  • Make sure your tracker spits out that transparent image no matter what. Broken image icons are very visible and you don’t want those on your site just because your tracker database went down temporarily.
  • For the best cross-site tracking, start a tracker session, which will remain the same when a visitor goes from one of your tracked web sites to another.  If your tracked web sites use sessions, pass their IDs to the tracker, so that both tracked and tracker session IDs can be logged in the same request. This will help you link stats from several sites together, as well as do all sorts of drill-downs into site-specific stats straight from the bird’s-eye-view reports.
  • Don’t be evil! There is a lot that you can collect about your visitors.  Make sure that you tell them exactly what you are collecting and how you are using it.  Aggregate and anonymize your logs to prevent negative consequences.  I’m sure you know what I mean.
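
Several of the points above (the image request, short parameter names, URL encoding, HTTP vs. HTTPS, and the tracker version) meet in the one piece of JavaScript that builds the image URL.  Here is a minimal sketch of that piece, not a complete tracker.  The host name, the /t.gif path, the version string, the function names, and the parameter names (v, p, r, s) are all assumptions made up for this example.

```javascript
// Hypothetical tracker host and version; replace with your own.
const TRACKER_HOST = "tracker.example.com";
const TRACKER_VERSION = "1.0.0";

// Pick the scheme that matches the tracked page, so that an HTTPS page
// never loads the tracking image over plain HTTP.
function trackerBase(pageProtocol) {
  const scheme = pageProtocol === "https:" ? "https:" : "http:";
  return scheme + "//" + TRACKER_HOST + "/t.gif";
}

// Build the image URL. Parameter names are deliberately short to keep
// the URL compact; they are documented here instead:
//   v = tracker version, p = page URL, r = referrer, s = site session ID
function buildBeaconUrl(page) {
  const params = {
    v: TRACKER_VERSION,
    p: page.url,
    r: page.referrer,
    s: page.sessionId,
  };
  const query = Object.keys(params)
    // Skip missing or empty values instead of sending "r=" or "s=undefined".
    .filter((key) => params[key] != null && params[key] !== "")
    // encodeURIComponent escapes &, =, ?, spaces and so on in the values.
    .map((key) => key + "=" + encodeURIComponent(params[key]))
    .join("&");
  return trackerBase(page.protocol) + "?" + query;
}

// In a browser, the call would look roughly like this
// (a 2x1 image rather than 1x1, as suggested above):
//   new Image(2, 1).src = buildBeaconUrl({
//     protocol: location.protocol,
//     url: location.href,
//     referrer: document.referrer,
//     sessionId: undefined,
//   });
```

Keeping the URL building in a plain function, separate from the browser globals, also makes it easy to check the escaping outside of a browser.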

Once again, think really hard before you decide to build one yourself.  It’s not an easy job.  And even if you grab all the data you want and save it in your database, there is still an incomparably bigger issue to solve – the reports, graphs, export, and overall visualization and analytics part of that data.  Why would you even want to go into that?

A glimpse into Google Analytics power

For a few years now I have been recommending Google Analytics to anyone who is looking for a statistical / analytical package for their web site.  While there are a few alternatives, I think that almost none of them can match Google Analytics in both ease of use and analytical power.

Easy installation (a copy-paste of the provided JavaScript snippet), web-based reports available from anywhere, multi-user access, scheduled reports, exports to several formats such as CSV and PDF, customizable dashboards, multiple site and profile management, A/B testing, goal conversion tracking, and much more – and all of it for free.  That’s hard to compete with.

And if you want to see how much you can get out of it and how easy that would be to configure, consider a recent example posted on the Google Analytics blog – “Advanced: Structure Your Account With Roll Up Reporting And More”.