AVG Response
Initial thoughts and response to AVG linkscanner #wa
AVG is doing some interesting things. I think that my own perceptions of what they are up to are biased by my own interest in web analytics - after all, regular users of the software really don't care what their AV is doing.
Useragent filtering
The web analytics platform that I am most used to is Unica Affinium NetInsight, both in the on-premise (and possibly logfile based) and on-demand (hosted, and thus most likely to be JavaScript pagetag based) versions.
Due to the logfile-based nature (at least historically) of NetInsight it has always had the need to filter out Robots/Spiders, monitoring agents and all sorts of other garbage that litters the data. As such it's trivial to segment away or exclude the current AVG useragent, either in your own installation or, with a brief request to the on-demand team, from your hosted install.
Of course, this is already broken - AVG already seem to be altering the useragent string to something that looks completely real, and thus impossible to block all by itself.
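For illustration, here is a minimal Python sketch of the sort of useragent filtering described above, applied to combined-format log lines. It is not how NetInsight does it, and the "ExampleScanner" marker is a made-up placeholder rather than the real AVG string; the function names are mine too.

```python
import re

# Hypothetical marker: a placeholder, NOT the real AVG useragent string.
SCANNER_PATTERNS = [re.compile(r"ExampleScanner/\d+")]

# In the combined log format the useragent is the last quoted field.
USERAGENT_RE = re.compile(r'"([^"]*)"\s*$')


def useragent_of(log_line):
    """Return the useragent field of a log line, or '' if none is found."""
    match = USERAGENT_RE.search(log_line)
    return match.group(1) if match else ""


def is_scanner(log_line):
    """True when the useragent matches any of the known scanner patterns."""
    agent = useragent_of(log_line)
    return any(pattern.search(agent) for pattern in SCANNER_PATTERNS)


def drop_scanner_hits(log_lines):
    """Keep only the lines that do not look like scanner traffic."""
    return [line for line in log_lines if not is_scanner(line)]


sample = [
    '10.0.0.1 - - [01/Jul/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows; U) Firefox/3.0"',
    '10.0.0.2 - - [01/Jul/2008:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleScanner/1.0"',
]
kept = drop_scanner_hits(sample)
```

As soon as the scanner sends a useragent that looks like a real browser, a filter like this has nothing to match on, which is exactly the problem.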
Understanding AVG
The main thing that I would like to know about right now is the sort of environment that AVG presents to JavaScript - what sort of screen resolution, locale, plugin list, cookies etc.
If the above presents a recognisable fingerprint it would then be possible to filter based on these multiple criteria.
Of course, it may be the case that it presents the actual environment of the host, which would make things much harder to work with, although I don't think that this is likely to be the case.
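If a fingerprint did turn up, filtering on multiple criteria might look something like the sketch below. Every value in it is invented for illustration (I don't know what environment AVG really presents), and the field names are just assumptions about what a pagetag might collect.

```python
# Every value here is invented; the real scanner environment is unknown.
SUSPECT_FINGERPRINT = {
    "screen": "0x0",
    "language": "",
    "plugins": 0,
    "cookies_enabled": False,
}


def matches_fingerprint(hit, fingerprint=SUSPECT_FINGERPRINT, min_matches=None):
    """True when the hit agrees with the fingerprint on enough criteria.

    By default every criterion must agree; pass min_matches to loosen this.
    """
    if min_matches is None:
        min_matches = len(fingerprint)
    agreed = sum(1 for key, value in fingerprint.items() if hit.get(key) == value)
    return agreed >= min_matches


real_visitor = {"screen": "1280x1024", "language": "en-gb", "plugins": 7, "cookies_enabled": True}
suspect_hit = {"screen": "0x0", "language": "", "plugins": 0, "cookies_enabled": False}
partial_hit = {"screen": "0x0", "language": "en-gb", "plugins": 0, "cookies_enabled": True}

flags = [matches_fingerprint(h) for h in (real_visitor, suspect_hit, partial_hit)]
```

Requiring several criteria to agree at once keeps the risk of throwing away real visitors low; loosening min_matches trades that safety for catching more scanner hits.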
How AVG executes
JavaScript pagetags typically create the URL that they are going to request from a complex block of code. I propose (and I stand to be corrected, as AV isn't my thing) there are four main options for how AVG can function.
- Static analysis of the JavaScript
- Sandboxed execution of JavaScript
- Sandboxed execution of JavaScript that allows the tag to 'fire' to the outside world
- Actual execution of JavaScript
Now - I don't *think* it's doing static analysis, although I have colleagues that know about such things - I'll have a word on Monday.
I hope (for the sake of AVG) that it isn't executing the code for real - that would open-up the opportunity for malicious exploitation - although we may be able to exploit it ourselves. :-)
Which leaves some form of sandbox. This should be easy enough to implement as JavaScript runs in one anyway. AVG would just need a separate instance. The real question is what does the sandbox provide for an environment and how is it allowed to interact with the rest of the world - at least we know that it allows extra requests to be made.
References - further reading
http://www.grisoft.com/ww.72
http://www.grisoft.com/ww.faq.num-1066#faq_1066
http://www.grisoft.com/ww.faq.num-1188#faq_1188
Disclaimer
All this is pure speculation, but it almost makes me want to sign-up to see what it does.
All for now. Comments/thoughts via usual channels
Link Visualisation
Thoughts on a link visualisation tool
I have been having a thought - something to do with visualising the relationships between sites/blogs/posts/pages.
Clearly others have gone before me. I rather like:
http://www.touchgraph.com/TGGoogleBrowser.html, http://www.aharef.info/static/htmlgraph/ and http://home.snafu.de/tilman/xenulink.html for various reasons.
None of these quite do the job that I need - so if I'm going to create something myself I need some:
- network visualisation, including some de-cluttering algorithms
- site indexer (perhaps using web analytics data)
- source of link information for links going in the other direction
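As a rough sketch of the data side of this (not the visualisation itself), the Python below builds a link graph from (source, target) pairs, keeps the inbound direction alongside the outbound one, and applies a very crude form of de-cluttering by dropping sparsely linked pages. The URLs, function names and the degree threshold are all made up for the example.

```python
from collections import defaultdict


def build_link_graph(links):
    """Build outbound and inbound link maps from (source, target) pairs."""
    outbound = defaultdict(set)
    inbound = defaultdict(set)
    for source, target in links:
        outbound[source].add(target)
        inbound[target].add(source)
    return outbound, inbound


def declutter(outbound, inbound, min_degree=2):
    """Keep only pages with at least min_degree links (in plus out)."""
    pages = set(outbound) | set(inbound)
    return {
        page
        for page in pages
        if len(outbound.get(page, ())) + len(inbound.get(page, ())) >= min_degree
    }


# Invented example links.
links = [
    ("a.example/post1", "b.example/"),
    ("a.example/post2", "b.example/"),
    ("b.example/", "c.example/"),
]
outbound, inbound = build_link_graph(links)
busy_pages = declutter(outbound, inbound)
```

A real tool would want something smarter than a degree threshold, but even this much shows why the inbound links need their own data source: nothing on a page tells you who links to it.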
To be fancy this could all be done in 3D, but I'm not sure it would be any more useful than something in 2D.
And then I'll become fabulously rich.
Reading List
Web Analytics - blog reading list
Blogs
This is a quick list of worthwhile blogs to help in getting and keeping up-to-date in the world of web analytics.
Occam’s Razor by Avinash Kaushik
Lies, Damned Lies... (Ian Thomas)
Multichannel Marketing Metrics with Akin
Web Analytics Princess by Marianina
Chris Clapham - Marketing Demystified - but still with a healthy measurement/analytics subtext.
Official Google Analytics Blog
And not forgetting:
Books
If you want something that you can read on the train or hold in your hand then these may be of interest.
Mr Kaushik's book Web Analytics: An Hour a Day will give a good gentle introduction to Web Analytics for complete beginners, but should also have something to offer someone who is working on WA full time.
Akin's book Multichannel Marketing is a more advanced tome, best for people that are committed (one way or the other) to a multichannel customer-centric approach and need some way of figuring-out what works.
While not directly WA related, the SEO: An Hour a Day book is worth a read. Many of the things that make a site SEO-friendly also make it analytics friendly.
WAW - March
March 2008 - Web Analytics Wednesday - London
This is not really a review of the Web Analytics Wednesday (WAW) that was held in London on Monday 31st March 2008. The revised date was to accommodate our special guest speaker, Mr Eric T. Peterson.
Yes, this is stupidly late, but out of completeness I still feel compelled to post. Besides, these images have been sitting on my desktop for the past weeks and I need to do something with them.
Mr Peterson spoke on the subject of the 'Future of Web Analytics'. His presentation was both entertaining and insightful.
Here is the 'official' March WAW round-up post. Unfortunately there was one picture missing:

Here is Mr Wayne Byrne on the left with his eyes almost shut, Dr Alan Hall (with his eyes shut) in the middle and myself, sporting open eyes and my stupid attempt at a beard.
The beard competition that I was sort-of competing in was started by the web team of a customer of ours - they will be playing this until the end of May, but I had to quit early.
The next London WAW will be on the 20th of May (a Tuesday - designed to interact with e-metrics).
Web Analytics Lecture
Consumer metrics at the Uni of Southampton
It all went rather well, despite the 0500h start and the nightmare journey down to Southampton, which meant that I only arrived just before nine o'clock.
I spent two hours talking through a first introduction to Web Analytics - A 'Web Analytics 101' if you will.
Here is the presentation that I used - it has mostly made it through the conversion to Flash in one piece.
References used in the presentation :
Glossary of WA terms : http://www.sclanalytics.com/resources/glossary
Web Analytics on Wikipedia : http://en.wikipedia.org/wiki/Web_analytics (not everything is correct, but it's a reasonable read)
Web Analytics the Nokia Way : http://tinyurl.com/224szr (a guide to the use of KPIs within a large organization)
Web Analytics Princess : http://www.marianina.com (a blog, not just WA, but many insightful things.)
Avinash Kaushik : http://www.kaushik.net (another blog from the respected WA evangelist.)
Back To School
Consumer metrics at the Uni of Southampton
It looks like I'm going back to school, except this time I'll be the chap at the front of the room waving his hands around.
I have been asked to present the Web Analytics section of the 'Consumer Metrics' module that is part of a couple of the University of Southampton's School of Management MSc programmes.
More information, links, comments and stuff will follow - but at some point I need to settle down and put the slides together for the session.
The department is launching a blog :
http://thirstforknowledge.wordpress.com
Also, references :
http://www.management.soton.ac.uk/StudyOpportunities/pg-ft/marketing-analytics.php
http://www.soton.ac.uk/postgraduate/pgstudy/programmes/2007/management/msc_marketing_man.html
August WAW Review
August 2007 - Web Analytics Wednesday - London
The August Web Analytics Wednesday in London seemed to be a success - although we don't have all the feedback yet to make objective measurements.
I had missed the July session, having been in Iceland /travel/reykjavik - so this was my first time at the venue (A big thank you to the Crown and Anchor - who provided us with our own bar. Fools!)
I have been asked to publish the presentation that I used for the pre-networking session, while there isn't a lot of context on the slides, it may give you a little flavour of what happened.
You should be able to click through the slides below :
We didn't manage to cover all of the points, but here was the gist of the discussion :
- Not everyone agreed that 'mobile content' / 'mobile sites' were worth doing at all.
- Effectively measuring mobile sites is non-trivial, although it should be possible to get something of use (even if it's not 100 percent good, not that anything is).
- Some people are waiting on standards support from operators and manufacturers before attempting anything.
- I figure (maybe someone agrees) that we may need to remember what the web was like 10 - 15 years ago and just get on with it and code defensively around lack of standards / support.
- There is a greater requirement to support the mobile multi-channel mix, but having %somewhere% for an online 'campaign'/message to go back to would be a good idea.
Also - BlackBerry quirk
I have an interesting trick (noooo, not %that% one, the other one!) If you want to track the network that a mobile device belongs to then you can simply use the IP information and look it up in a sensible GeoIP database... BUT if you try to do this with enterprise BlackBerrys then it will tell you the organisation that they are attached to (useful in its own right, but still doesn't tell you the network). So, IF you get in touch with me (email address on /bob ) then maybe I'll tell you how you can add the network operator for the BlackBerry into the mix.
References :
November WAW : http://www.sclanalytics.com/resources/events/waw_november2007
The WAA : http://www.webanalyticsassociation.org/
Tags and Logs
Logfiles can be your friend
Sorry it's been a while - it's been a crazy couple of months.

I had started this entry whilst I was working in Reykjavik, I think I'm now allowed to let you know (#1) that I was spending a week with the lovely people at Landsbankinn. (#2)
While I was there we had a problem that we have addressed a number of times in slightly different ways.... "If I have a website with regular pages as well as 'resources' (pdf files, spreadsheets, whatever) how do I track the usage of these if I am using page tags for my data collection?"
Throughout this I will refer, interchangeably to resources, files or downloads.
There are two solutions that I can think of right now :
1. Track the links leading to the file in question.
2. Identify the downloads based on web (or proxy) log files.
Now, tracking the links does work. You can do this the hard way (by hand) or you can use a pagetag that auto-instruments the links in question (like ours). The problem that I see with this approach is that resources like PDF files are highly rated by search engines (#3) and some visitors are going to land directly on your site on a resource, without the chance to trigger a pagetag. This sucks, especially if you're in an SEO mood.
The log-based method works fine, but there is a problem. If you *just* use logs (does anyone still do that?) then you miss-out on all the benefits that tagging gets you.
So, there is another way (two actually) (I wouldn't be writing this otherwise).
Real solution number 1. You can use Unica NetTracker or NetInsight in its hybrid mode, where it manages tags and logs, but that can sometimes be more trouble than it's worth: it's very easy to end up double counting requests for regular pages.
Real solution number 2.
- Identify a parameter that can *only* be obtained from a page-tag based request. I extracted our 'lc' parameter.
- Identify a parameter that can *only* be obtained on the 'resources' (files, downloads, whatever) that you need to extract from just the logs. Using a regular expression on the page, I extracted anything that ended with .pdf, plus another untaggable page (could be an RSS feed, anything really).
- Create a third (are you keeping up?) parameter that joins these two together (what we call a meta-parameter)
- Finally, specify a filter, so that only when this third parameter has a value, do we bother loading the line into the database.
This may all seem like a lot of work and the product really ought to do some of these things automatically, but it isn't too much effort, is nicely maintainable and produces a lovely clean profile containing mostly pagetag-based information, with some extra requests from things that just can't be tagged.
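To make the four steps a little more concrete, here is a Python sketch of the same logic applied to request URLs. It is not NetInsight configuration, just an illustration: the '/tag.gif' and '/feed.rss' paths are invented, the function names are mine, and 'lc' is treated as an opaque value.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumption: the untaggable resources are PDFs plus one (invented) feed path.
RESOURCE_RE = re.compile(r"(\.pdf|/feed\.rss)$", re.IGNORECASE)


def tag_param(url):
    """Parameter 1: the 'lc' value, present only on pagetag requests."""
    values = parse_qs(urlsplit(url).query).get("lc")
    return values[0] if values else ""


def resource_param(url):
    """Parameter 2: the path, but only when it is an untaggable resource."""
    path = urlsplit(url).path
    return path if RESOURCE_RE.search(path) else ""


def meta_param(url):
    """Parameter 3: joins the other two; empty when neither applies."""
    return tag_param(url) or resource_param(url)


def should_load(url):
    """The filter: only load lines where the meta-parameter has a value."""
    return bool(meta_param(url))


# Invented example requests, as they might appear in a web log.
requests = [
    "/tag.gif?lc=http%3A%2F%2Fexample.com%2Fhome",  # pagetag request
    "/docs/report.pdf",                             # resource, log only
    "/index.html",                                  # regular page, log only
    "/feed.rss",                                    # untaggable feed
]
loaded = [url for url in requests if should_load(url)]
```

The raw log request for /index.html is dropped because the pagetag request already covers it, which is how this approach avoids the double counting mentioned above.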
References :
#1 http://www.sclanalytics.com/resources/news/landsbanki-announcement
#3 http://www.google.com/search?q=landsbanki.com (to see some PDF results in a search)
#4 http://members.mrtc.com/anvk/fielddaycart04/fielddaycart04.html (for the picture of the man with the beard and the log and the tags)
May WAW Review
May 2007 - Web Analytics Wednesday - London
And so another Web Analytics Wednesday passes. This is the second such event that I have assisted with and I think that it was even more successful than the first.
The most obvious change was the new venue - from some deep-underground basement bar that anybody could wander through, we have moved to the rather more up-market Royale room in the RubyBlue bar located off Leicester Square. Plenty of light and even some fresh air from the balcony overlooking the square itself.
The dedicated room made it much easier to mingle, as there was much less of a chance of wandering up to somebody at random and launching into some conversation about long tails before realising that they'd just come in for a drink.
The 'Networking' aspect was also helped by the lovely name badges that we managed to hand-out to just about everyone ... no more guessing that you already know someone and really ought to recognise them by now.
The main session (1800h onwards) was prefaced by an open discussion about the use of Web Analytics tools for SEO tasks (part led by myself and part by m'colleague Matt). This was the first time that either of us had done anything like this and I think we have learned the following lessons :
- Make sure that people are expecting to contribute with an opinion or questions.
- With the above point in mind, pre-announce the full agenda.
- Less, but better (perhaps more inflammatory) points. :-)
- Make sure that everyone can hear (duh!) and that there is a real 'circle' effect in the seating.
- Try and avoid a focus in the circle (although with a projector this can be difficult)
- ... any other ideas?
This time we (the remaining SCL mob) decanted ourselves into a nearby restaurant where we mostly had really manky salmon fishcakes. I wish I knew the name of the place so I could suggest that you avoid it.
This time we all made it home without drunkenly disgracing ourselves.
Next event : http://www.sclanalytics.com/resources/events/waw_july2007
Pretty Dashboards
How to make pretty AND useful dashboards in NetTracker and NetInsight
One of the things that I do at work is support the Unica NetTracker and Affinium NetInsight products. In the course of my work I sometimes find nice things that it would be good to share with a wider audience.
The products have had two sorts of dashboards for some time - the pretty graphical dashboard with nothing but pictures and the informative but ugly 'Executive' dashboard with nothing but numbers.


Wouldn't it be nice to combine the two? (can you tell where I'm going with this yet?)
Disclaimer :
While I have tested this myself in a few different environments and it all seems okay, neither I, my employer (SCL) nor Unica can be held responsible for any loss or damage arising from you trying out any of this. This article is written by me (Bob Mitchell) and is not produced or endorsed in any way by either my employer (SCL) or Unica.
These instructions apply to version 7.1 of NetTracker and NetInsight, I would imagine that there will be a neater, gui-driven way of doing this in future versions of the product.
1. Create a graphical dashboard containing the graphical elements that you want. Save it as a custom report.
2. Take a look at the reportxxx.xml file (where xxx is the number of the report) in inst_dir/data/profilename.

3. Now take a look at the file 'execdash.xml' (It's for the Executive Dashboard - the one with all the numbers). Look familiar? (You should notice that the 'section' is of type 'executive', but otherwise it looks a bit like the graphical dashboard.)
4. Transplant a section from execdash.xml into your reportxxx.xml

5. Force NetTracker to regenerate the report - perhaps just click on a single day and it will regenerate the report from scratch.
6. Observe the results :

Now, I think you'll agree that this looks nice while also presenting 'real' numbers.
Further options :
1. Change the 'link' attribute - this will alter, or prevent the report you get when you click on it to drill-down.
2. Change the 'label' attribute to rename an item
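If you would rather script the transplant than do it by hand, something like the Python sketch below could work. Be warned that the XML here is invented: beyond a 'section' element with 'type', 'link' and 'label' attributes I am guessing at the layout, so check it against your own reportxxx.xml and execdash.xml (and work on a copy) before trusting it.

```python
import xml.etree.ElementTree as ET

# Invented stand-ins for reportxxx.xml and execdash.xml; the real files
# will have a different (and fuller) structure.
REPORT_XML = """<report>
  <section type="graph" label="Visits by day" link="visits"/>
</report>"""

EXECDASH_XML = """<report>
  <section type="executive" label="Total visits" link="summary"/>
</report>"""


def transplant_sections(report_xml, execdash_xml, section_type="executive"):
    """Append every section of the given type from execdash to the report."""
    report = ET.fromstring(report_xml)
    execdash = ET.fromstring(execdash_xml)
    for section in execdash.iter("section"):
        if section.get("type") == section_type:
            report.append(section)
    return report


merged = transplant_sections(REPORT_XML, EXECDASH_XML)

# The 'further options': rename the transplanted item via its label attribute.
merged.findall("section")[-1].set("label", "Visits (total)")

merged_types = [section.get("type") for section in merged.iter("section")]
merged_text = ET.tostring(merged, encoding="unicode")
```

After writing the merged XML back over the custom report file you would still need to force the report to regenerate, as in step 5.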