I’ve been watching a lot of people using computers over the past few months and it struck me how many of them were using web-based email services – the more tech savvy were on Gmail, and the more casual users gravitated towards Hotmail and Yahoo Mail despite their flaws. An even smaller number used webmail interfaces from their own ISP. Like all websites and online services, they all have their own specific demographics of users.
Compete is one of several comparative ISP analytics services doing some interesting tracking of how US internet users behave on particular sites, comparing them with competitors. One of their recent reports examines how users behave once they are on Facebook.
Web Directions South 07 was lots of fun and there were some great presentations over the two days. Unfortunately conferences are always full of choices and I missed several presentations I’d been looking forward to catching. That said, overall the quality was high and there were only a handful of dull moments. Most of the presentations I saw were not on the tech-side (JS, Ajax, CSS etc) of things – Luke was there to go to those.
Here are some notes from my highlights.
Cameron Adams managed to pack out one of the smaller rooms and by the time his ‘Future of web based interfaces’ was in full flow there were about 50 people standing at the back. Adams’ presentation went through the possibilities of flexible interfaces that are both customisable by the user (much like Netvibes or iGoogle) and automatically reformat as you use them (as the BBC News pages subtly do).
After my own presentation (see below) it was on to Scott Gledhill’s ‘Is SEO evil?‘ to which the answer is, of course, no. SEO and a web standards approach should be complementary. Scott had some lovely images – the menacing gummi bears in particular – and a fascinating case study from News Digital Media around the death of Steve Irwin. In this instance, News went out with a web headline that was far more immediate and keyword-loaded (“steve irwin dead”) than their major competitor, the Sydney Morning Herald/Fairfax, who were more obscure (“crocodile man reported dead”). They tracked the story traffic and referrers by the hour and more than doubled the Fairfax traffic – even after Fairfax adjusted their headline. Scott also described how journalists are now much more SEO-savvy in their content writing and how his team gives journalists the web reporting tools to track their own stories. This, combined with the highly competitive environment, encourages journalists to further refine and re-edit their stories for performance even after initial publication.
The second day began with an edit of Scott Berkun’s famous Myths of Innovation presentation. Scott’s main message is that you can’t force ‘innovation’ and that it needs time and space to happen organically. In fact, one of the best triggers of innovation are failures and mistakes. He suggests that perhaps we should start including a ‘failures’ budget line in our organisational budgets – accept that they will happen and that we are all the better for it.
George Oates from Flickr spoke about how Flickr manages and facilitates user communities. She started out tracing Flickr back to its origins at Ludicorp as a sort-of MMORPG called Game Neverending. After GNE folded the community that had grown around it was imported directly into Flickr and they brought their experiences from the game world into the construction and design of Flickr. I found her focus on users and the real need for human-to-human communication and relationship management that Flickr does a timely reminder that in the museum world we cannot expect communities to ‘just happen’ around our content and that when the seeds of community appear they need careful nurturing. The necessary nurturing is impossible if you move immediately on to the next project.
Adrian Holovaty, the mind behind Chicago Crime and several other datamining and visualisation projects, gave a fascinating presentation about the hidden potential of structured data. Over in the museum world we are experts at structured data but we rarely make the most of it. Throughout his talk Holovaty kept coming back to the ideas of serendipity and free browsing that I’ve been working on with our OPAC. His position was to make everything hyperlinked and let users build their own paths through the data. To that end he built the Django Databrowse application, which takes a database and basically builds a simple website allowing users to link from anything to anything else.

Following Chicago Crime, which took flat datasets from the Chicago Police Department and made them navigable in ways the Chicago PD had never intended (view crimes by area, visualise hotspots, map your jogging route against reported crimes etc), Holovaty went on to do some great visualisation work at the Washington Post. There he asked journalists to enter their notes into a simple database as well as turning them into stories. This allowed him to build Faces of the Fallen, which tracks and maps every US soldier killed in Iraq. Faces not only reveals some uncomfortable patterns in the data (deaths by age of soldier, by state etc), it has also allowed linkages to family tributes and newspaper articles about the circumstances of each death. The project returns great value to reporters and the paper, who can now report ‘milestones’ and trends, but also to the community, who can make ‘more sense’ of what would otherwise simply be a list of names. It ‘humanises’ the data, giving it far greater impact. Holovaty is now working on a community news project, Every Block, which intends to harvest and aggregate content by ‘block’ from various news sources – automatically creating journalistic stories from raw data. (Reuters already does this with some financial reporting.)
There is a growing selection of presentation slides over at Slideshare.
Here’s an edited version of my own presentation slides, which use the Powerhouse Museum’s collection search and tagging implementation as a case study of a government implementation of Web 2.0 techniques. Those who have seen my presentations over recent months will recognise some re-use and re-purposing. For various reasons I have had to remove about 20-30 slides but most of it is there. There is a podcast coming, apparently.
It has been one of the worst kept secrets of web statistics – deep linked image traffic. While this has been going on for years – since the beginning of the WWW, actually – it has increased enormously in the past few years. On some cultural sector sites such traffic can be very substantial – a quick test is to look at exactly how much of your traffic is ‘referred’ from MySpace. It is also one of the main reasons why Photobucket has traditionally reported traffic so much higher than Flickr – its deep linking and cut-and-paste engagement with MySpace. With the move away from log file analysis to page tagging in web analytics, some, but not all, of this deep linking traffic is fortunately being expunged from analytics reporting.
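As a rough illustration of that quick test, here is a minimal Python sketch that scans an Apache combined-format access log and tallies image requests referred from external hosts such as myspace.com. The log path, domain and log format are assumptions – substitute your own.

```python
# Minimal sketch: tally deep-linked image requests by external referer host.
# Assumes Apache combined log format; LOG_PATH and OWN_DOMAIN are placeholders.
import re
from collections import Counter
from urllib.parse import urlparse

LOG_PATH = "access.log"            # hypothetical log location
OWN_DOMAIN = "example-museum.org"  # substitute your own domain

# Matches: "GET /path HTTP/1.x" status bytes "referer"
LINE_RE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d+ \S+ "([^"]*)"')
IMAGE_EXTS = (".jpg", ".jpeg", ".gif", ".png")

referer_hits = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = LINE_RE.search(line)
        if not match:
            continue
        path, referer = match.groups()
        if not path.lower().endswith(IMAGE_EXTS):
            continue
        host = urlparse(referer).netloc.lower()
        if host and OWN_DOMAIN not in host:  # external referers only
            referer_hits[host] += 1

for host, hits in referer_hits.most_common(20):
    print(f"{hits:8d}  {host}")
```

A large count against a single external host is a good sign your images are being embedded elsewhere.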
Two Powerhouse examples include a Chinese news/comment portal that deep linked a Mao suit image (from an educational resource on our site), sending us 51,000 visits in under 24 hours in August 2005, and an A-grade Singaporean blogger who deep linked an image of Gollum (from our archived Lord of the Rings exhibition pages) to describe an ugly celebrity, generating over 180,000 visits over 8 days in January 2007. (In both of these examples the visits were removed from the figures reported to management and funders.)
What is going on here sociologically?
At the recent ICA2007 event in San Francisco danah boyd and Dan Perkel presented an interesting look at the subcultural behaviours that are, in part, producing this effect. Although they look specifically at MySpace there are threads that can be drawn across many social sites from forums to blogs. Drawing on the work of many cultural theorists, they argue that on MySpace what is going on is a form of ‘code remix’. That is, young people’s MySpace pages are essentially ‘remixes’ of other content – but unlike a more traditional remix in audio and video cultures, these code remixes occur through the simple cut and paste of HTML snippets. By ‘producing’ both their MySpace pages as well as their online cultural identity in this way, they are reshaping concepts of ‘writing’ and digital literacy. They are also, importantly, not in control of the content they are remixing – a deep linked image can easily be changed, replaced or removed by the originating site.
There are plenty of examples – boyd and Perkel give a few – where the content owner changes the linked image to disrupt the deep linker. In the case of our Singaporean blogger we renamed the linked image to prevent it from appearing on her site (and in our statistics).
Revealingly, Perkel’s research is showing that many MySpace users have little, if any, knowledge of or interest in website production – that is, CSS and HTML. Instead, what has formed is a technically simple but sociologically complex ‘cut and paste’ culture. This is what drives the ‘easy embedding’ features found on almost any content provider site like YouTube – it is in the content providers’ interest to allow as much re-use of their content (or the content they host) as possible, because it allows for the insertion of advertising and branding, including persistent watermarking. Of course, the museum sector is not geared up for this – instead our content is being cut and pasted, often without anyone outside the web team having a deep understanding of what is actually going on. There are usually two reactions – one negative (“those kids are ‘stealing’ our content”) and the other overly positive (“those kids are using our content therefore they must be engaging with it”). Certainly Perkel’s and others’ research deeply problematises any notion that these activities are in large part about technical upskilling – they aren’t – instead those involved are learning and mastering new communication skills and emerging ways of networked life.
One approach that some in the sector have advocated is the widget approach – creating museum content widgets for embedding – to make repurposing of content (and code snippets) easier. There have been recent calls for museum Facebook apps, for example. But I’m not sure that this is going to be successful, because a great many embeds are of the LOLcats variety – perhaps trivial and superficial, but highly viral and jammed full of flexible and changing semiotic meaning – whereas our content tends to be the opposite: deep, complex and relatively fixed.
It is important to realise that to deliver more effective websites we need to move away from a one-size-fits-all approach not only when designing sites but also when evaluating and measuring their success. We know that some online projects are specifically intended to target specialist audiences – a site telling the histories of recent migrants might require translation tools, and a site aimed at teenagers might, by design, specifically discourage older and younger audiences in order to better attract teenage usage.
Remember, too, that some key museum audiences (regional, remote, socially disadvantaged) may have no representation in online visit figures, and others may have only limited and sporadic online interactions because of unequal internet access; it is important to look at the overall picture of museum service delivery. Some audiences cannot be effectively engaged online. Others still may only feel confident engaging in online conversations about the museum using non-museum services – as I’ve written before – on their own blogs, websites, and social media sites.
If we acknowledge ‘threshold fear’ in our physical institutions, then we need to realise this applies online as well. The difference is that in the online world there are many, many less ‘fearful’ options to which potential visitors and users can easily flee. The ‘back’ button is just a click away.
The measure of the ‘value’ of visitors therefore needs to differ across parts of the same website. We may need different measures for a user in the ‘visiting the museum’ part of the website and in the ‘tell us your story’ section, even though in one visit they might explore both areas. Likewise, a museum visitor who blogs about their positive experience of a real world visit on their own family blog might be considered valuable. Or a regionally-oriented microsite that gets discussed on a specialist forum might be more valuable – to that particular project – than a posting on a more diffused national discussion list.
Visit-oriented parts of the website should be designed and created with known target audiences in mind, understanding that not everyone can visit the museum, and their success measured accordingly. It might be sensible to address ‘threshold fear’ by using images of the museum that are more people-oriented than object-oriented, promoting the notion that the museum is explicitly a place for people.
When we were building our children’s website we specifically decided against creating a resource for ‘all’ children – that would have resulted in a too generic site – and targeted the pre- and post- visit needs of a known subset of visitors with children. We don’t actively exclude other visitors (other than through language choice, visual design, and bandwidth requirements), but we have actively attempted to better meet the needs of a subset of visitors. This subset will necessarily diversify over time, but we also understand that out on the internet there are plenty of other options for children.
The problem with traditional measurements is that every visitor to our online resources is homogenised into single figures – visits, time spent, pages viewed. Not only does this reduce the value of the web analytics, it does the visitor a great disservice. Good analytics is instead about segmentation – segmentation based on task completion and conversions, and on understanding visit intentions.
So who is a ‘valuable’ visitor?
It depends on context.
For our children’s site we place a greater internal value on those who complete one of two main site conversions: first, spending a particular amount of time in the visit information areas; and second, browsing, finding and, most critically, downloading an offsite activity. Focussing on these subsets of users allows us to implement evaluation and tracking. For those who complete the visit-related tasks we might offer discount coupons for visiting and track virtual to real-world conversions. What proportion of online visitors who look at visit information actually convert their online interest to a real world action? And in what time frame (today, this week, this month)? For the second group we may evaluate downloader satisfaction – did they make the craft activity they downloaded? Was it too hard, too easy? Did they enjoy the experience?
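To make this concrete, here is a toy sketch of how those two conversions might be counted. The section paths, file suffix, time threshold and data shape are all invented for illustration – they are not the Powerhouse’s actual configuration.

```python
# Toy conversion counter for the two goals described above. All paths,
# thresholds and the input shape are hypothetical examples.
from collections import defaultdict

VISIT_INFO_PREFIX = "/kids/visiting/"  # assumed visit-information section
ACTIVITY_SUFFIX = ".pdf"               # assumed offsite activity download
VISIT_TIME_THRESHOLD = 120             # seconds; arbitrary example value

def conversion_rates(rows):
    """rows: iterable of (session_id, path, seconds_on_page) tuples."""
    visit_time = defaultdict(int)  # time spent in visit-info pages per session
    downloaded = set()             # sessions that downloaded an activity
    sessions = set()
    for session_id, path, seconds in rows:
        sessions.add(session_id)
        if path.startswith(VISIT_INFO_PREFIX):
            visit_time[session_id] += seconds
        if path.endswith(ACTIVITY_SUFFIX):
            downloaded.add(session_id)
    total = len(sessions) or 1
    visit_converted = sum(1 for t in visit_time.values() if t >= VISIT_TIME_THRESHOLD)
    return visit_converted / total, len(downloaded) / total

rows = [
    ("a", "/kids/visiting/hours", 90), ("a", "/kids/visiting/map", 60),
    ("b", "/kids/activities/rocket.pdf", 5), ("c", "/kids/", 30),
]
visit_rate, download_rate = conversion_rates(rows)
print(f"visit-info conversions: {visit_rate:.0%}, downloads: {download_rate:.0%}")
```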
What of the others who visit the children’s site? They are a potential audience who have shown an interest but for many reasons haven’t ‘converted’ their online visit. We can segment this group by geography and origin – drill down deeper and really begin to examine the potential for them to ever ‘convert’.
On other parts of our website – say our SoundHouse VectorLab pages – we may see value in users who simply use and link back to our ‘tip of the day’ resources. Although those pages are primarily an advertisement for onsite courses run in the teaching labs, we do see great value in having the ‘tip of the day’ resources widely read, the RSS feed subscribed to, and articles linked back to. However, this has to be a secondary objective to actually taking online bookings for courses.
Postscript – I’d also suggest reading the 2004 Demos report ‘Capturing Cultural Value’ for some important philosophical and practical caveats.
From Akshay Java, Xiaodan Song, Tim Finin, and Belle Tseng comes an interesting academic paper titled Why We Twitter: Understanding Microblogging Usage and Communities.
Following my recent post looking at diffused brand identity in social media, this paper is a useful examination of the emergent ‘authority’ and ‘connectedness’ of users amongst a dataset of 75,000 users and 1.3 million ‘posts’.
Twitter is something that I’ve seen limited potential for in most museum applications so far, but increasingly Twitter-style communication is replacing email – see, for example, the frequent updates your friends make to Facebook’s ‘what I am doing/feeling now’ mood monitor.
Abstract:
Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. Twitter, a popular microblogging tool has seen a lot of growth since it launched in October, 2006. In this paper, we present our observations of the microblogging phenomena by studying the topological and geographical properties of Twitter’s social network. We find that people use microblogging to talk about their daily activities and to seek or share information. Finally, we analyze the user intentions associated at a community level and show how users with similar intentions connect with each other.
There has been a flurry of activity amongst web analytics companies and in the marketing world to devise complex ways of measuring social media activity. As much of this interest in devising a way of measuring and comparing social media ‘success’ comes down to monetising social media activity through the sale of advertising, these measures don’t easily translate to the cultural sector. Advertisers are after a ‘ratings’ system to compare the different ‘value’ of websites but as we know from old media (TV and radio), ratings don’t work well for public and community broadcasters who don’t sell advertising and have other charters and social obligations to meet.
We know that visits, page views and time spent aren’t the best ways of understanding our audiences or their levels of engagement with our content, and with social media it is all about engagement. If we aren’t selling advertising space to all those eyeballs focussing their attention on our rich and engaging content, then what are we trying to do?
I’d argue that it is about brand awareness. Not just brand awareness in terms of being top of mind when geographically close audiences are thinking of a cultural activity for their leisure time, but in terms of linking the perceived authenticity of the information on your website to your brand. There is a growing body of research into how museums are perceived as ‘trusted’ information sources and, importantly, as politically impartial sources. But this perception relies upon an awareness on the part of the online visitor that they are indeed on a museum website.
This user awareness is, I argue, not a given, especially now that such a large proportion of our online traffic comes via search. Looking into the future, search will be an even greater determinant of traffic, even if your real-world marketing prominently displays your URL (as it should be doing by now!). Looking at your real world marketing campaigns around your URL you will probably find a spike in direct traffic but a similarly sized spike in brand name searches – we are finding this with the Sydney Design festival at the moment. The whole of Sydney is covered with street advertising from bus shelter posters to street banners, all promoting the URL. The resulting traffic is a mix of direct and brand name search based.
The problem is that, now, the brand is no longer represented online only on our own websites.
One of the first things I talk about in my workshops and presentations is that even if your organisation is not producing social media about itself, your audiences almost certainly are. If you aren’t aware of what your audiences are saying about you, what they are taking photos of, or recording on their camera phones, then you are missing a unique opportunity to understand this generally highly engaged tip of your audience.
It is possible that those who blog about their experience in your organisation, and upload their photos and videos, are going to be your most (commercially) ‘valuable’ customers – high disposable income, high levels of interest, and a desire to participate and communicate/advocate to others about your organisation.
They are probably the most likely to climb the ‘ladder of engagement’ from potential visitors through regular visitors to members and finally donors/sponsors. They may not always have positive things to say, but by hearing their gripes and grizzles, you are able to understand and address issues that impact how your organisation is going to be promoted through word-of-mouth. And word-of-mouth is going to almost always be the most ‘trusted’ type of marketing recommendation.
So how do we track these conversations that occur publicly but not on your organisation’s website?
Mia Ridge recently pointed to a great summary of the easiest to use ‘ego search’ tools and methods by which you can easily keep track of your audience conversations. Another favourite of mine for small scale tracking is EgoSurf.
Sixty Second View has compiled an ‘index’ of how these kinds of ego search results might be combined to generate a figure to compare with competitors and other organisations. Their methodology, whilst very complex, focusses on assessing how connected the people who are talking about you actually are – this allows a determination of effective reach, and of the trust that may accrue to those in the conversation.
(The top-level summary below is mine.)
a) Blogs that are talking about you – what are their Technorati rankings, how high are their Google PageRanks, how many BlogLines subscribers do they have etc
b) Multi-Format conversations – how popular/connected are the Facebook and MySpace people who are talking about your organisation
c) Mini-Updates – frequency and reach of Twitters
d) Business Cards – LinkedIn connectedness
e) Visual – Flickr influence and popularity can be used to determine how connected and visible posters’ images of your organisation are. This can apply to YouTube as well.
f) Favourites – Digg, Del.icio.us connectedness
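To make the idea concrete, here is a deliberately simplified sketch of how such an index might be computed – weight each mention by its channel and by how connected the person talking is. The weights, channel names and 0-1 ‘connectedness’ scale are my own invention; Sixty Second View’s actual methodology is considerably more involved.

```python
# Simplified social media index: sum of mentions weighted by channel and
# by the connectedness of whoever is talking. All weights are invented.
CHANNEL_WEIGHTS = {
    "blogs": 3.0,            # (a) Technorati rank, PageRank, subscribers
    "social_profiles": 2.0,  # (b) Facebook/MySpace conversations
    "micro_updates": 1.0,    # (c) Twitter frequency and reach
    "business": 1.0,         # (d) LinkedIn connectedness
    "visual": 2.0,           # (e) Flickr/YouTube influence
    "favourites": 1.5,       # (f) Digg/Del.icio.us connectedness
}

def social_media_index(mentions):
    """mentions: dicts with 'channel' and 'connectedness' (0-1), where
    connectedness stands in for rank/subscriber/friend counts, normalised."""
    return sum(
        CHANNEL_WEIGHTS.get(m["channel"], 1.0) * m["connectedness"]
        for m in mentions
    )

mentions = [
    {"channel": "blogs", "connectedness": 0.8},          # high-authority blog
    {"channel": "micro_updates", "connectedness": 0.2},  # little-followed Twitter user
    {"channel": "visual", "connectedness": 0.5},         # moderately popular Flickr user
]
print(f"index: {social_media_index(mentions):.2f}")
```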
This approach is useful as it provides a detailed analysis of the spectrum of social media in which your organisation is probably already represented. It can reveal areas where your users aren’t talking about you, and it can illuminate areas of your own site that receive unexpected user attention. Not only that, it focuses on who is talking about you. On the downside, it is a lot of work – but undertaking even a cut-down version of this methodology will force you to examine the different impacts of different types of social media.
For example, are all blog posts about your organisation equal? When you check the Technorati rankings of the commenting blogs you will find that some have greater reach and authority than others. The real world equivalent here is the different weightings your marketing team probably already gives to print media mentions in national broadsheets versus local weeklies; or the difference between a TV editorial and a local radio mention.
Is this really the job of the web team?
Unless your organisation has a marketing team that is expert in online marketing then the answer must be yes. Web analytics in five years’ time will be all about measuring offsite activity.
Continuing the theme of web analytics, last week I took a look at the top 5000 search terms (the head of a long tail containing over 4 million terms) searched for by Australian internet users in a 4 week period through a paid analytics service. This data is generally gathered from ISP logs and is anonymous, but is revealing of trends.
This sort of data is very useful in determining the demand side of internet usage – what is the general public looking for, what are they trying to find? As such it can provide useful comparative data for examining brand awareness and potential “intent to visit” amongst the population. Much like using the various AdWords tools, it is possible to look at the relative popularity of themes and competitors, and to identify market opportunities.
Here is an example – in the 4 week period only three museums and one gallery appeared in the top 5000 search phrases: the Melbourne Museum at 1064th position, the Australian War Memorial at 1357th, the Powerhouse Museum at 2306th and the National Gallery of Victoria at 4232nd. Of the libraries and archives, only the Brisbane City Council library (2294th) and the National Archives (3606th) appeared in the top 5000.
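If you have access to such a ranked phrase list, checking where your brand terms first appear is simple to script. A minimal sketch, with a tiny invented sample standing in for the paid service’s 5000-phrase list:

```python
# Find the first rank at which each institution's brand terms appear in a
# ranked search-phrase list. Sample phrases and terms are invented.
def first_rank(phrases, terms):
    """Return the 1-based rank of the first phrase containing any term."""
    for rank, phrase in enumerate(phrases, start=1):
        if any(term in phrase.lower() for term in terms):
            return rank
    return None

ranked_phrases = [  # tiny illustrative sample, most-searched first
    "myspace", "weather sydney", "melbourne museum",
    "powerhouse museum opening hours",
]
brand_terms = {
    "Melbourne Museum": ["melbourne museum"],
    "Powerhouse Museum": ["powerhouse museum", "sydney powerhouse"],
}
for institution, terms in brand_terms.items():
    print(institution, "first appears at rank", first_rank(ranked_phrases, terms))
```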
Obviously there will be irregularities – those institutions with easy to find and remember domain names will likely not get as many searches. Also, for some institutions whose major online audience is a more web savvy demographic, search may not be such an important traffic driver. There will also be fluctuations around real world advertising campaigns and overall ‘top of mind’ concerns amongst local audiences.
Others may find, when looking at their own site analytics, that their search traffic is predominantly content-based rather than institution-based – that is, the audience is coming to them because of what they have rather than who they are. The concern is that content-based, rather than brand-based, audiences are fickle and prone, over time, to migrating to other sites and services, both commercial and non-commercial. Such a disconnect between institutional awareness (brand awareness) and user behaviour also weakens claims that institutional authority extends to online visitors.
For example, contrast two different users. The first visits the Powerhouse Museum website and searches for ‘Delta Goodrem dress’. The second searches in Google for ‘Delta Goodrem dress’ and ends up deep within the Powerhouse Museum website.
In the first case it is safe to assume, first, that the user is intentionally visiting a museum website and, second, that they will take the institution’s reputation, expertise and authority into consideration when looking at the search result – the Delta Goodrem dress. This would typically be a very small proportion of Internet users.
In the second case, however, these assumptions do not necessarily hold. Instead, the intentions of the user may have – and probably do have – very little to do with the museum or museums in general. Even if they do visit the museum’s webpage about the dress, it is difficult to assume that they are aware they are even on a museum website.
It is much more likely that the first user can be converted to a real world visitor than the second, especially if their geographic location permits.
It is this lack of ability to transparently understand user intentions that makes an increasingly large proportion of museums’ web traffic problematic. On one hand we are all very excited to know that our content is being viewed, possibly even read, by millions of users. On the other hand, as our traffic increases, we need to be very mindful of the need to ensure that this traffic is, at least proportionally, being drawn back to, or made aware of the organisation’s brand – which is its marker of authority and expertise.
Spending some time looking at the demand side of the web can be a sobering experience. Our ‘serious’ content seems to be low down on the list of public concerns – particularly at the head of the curve.
Over the past few months I have become more and more concerned about the use of web analytics in the museum sector. I’ve also been seriously re-considering the Powerhouse’s current analytics tools.
Web analytics should be used to improve the ROI of online services; improve customer/user experiences; and drive traffic to the key areas of your site. Yet almost every museum I have spoken to uses them, largely, to simply report back simple metrics to funding bodies – visits (or sessions), page views and, in some cases even hits (!). Some still count robots, spiders and other non-human visitors in the figures they report to funders, and see no way out of doing this because, as a few people overseas have put it – ‘everyone else in the sector does and we are competing against them for funding’.
The problem I think is that we haven’t, ourselves, come up with a better way of doing things. What constitutes ‘success’ for a museum website? Surely it cannot be reduced to ‘visits’. Instead we need to start thinking about what constitutes, in web analytics terminology – a ‘conversion’ (or set of conversions).
In the ecommerce world a ‘conversion’ is when a user/customer successfully browses a catalogue, chooses an item, adds it to their cart, and then checks out making a payment. Running an ecommerce site means using web analytics to keep a careful watch on where and when users are dropping out of a conversion process – do they leave before they add to their cart (if so, why?). Overall visits vs sales don’t tell the whole picture – certainly not a useful picture – other than perhaps indicating the size of the ‘potential market’.
The closest thing to ecommerce that most of us have are online surveys. I wonder how many museums have tracked the number of users that start a survey but fail to complete it?
At the Powerhouse, one of the conversion measures I’ve been exploring is how many website visitors go to our home page at some point during their ‘visit’. Currently this is sitting at around 29% – which is good, given the volume of users and most importantly, the other statistic – that only about 5% actually start at the home page. That means 24% of users coming in through other entry points (including search) are coming to the home page to find out more about where they are (or perhaps, more cynically, some are lost?).
If these figures are combined with statistics about how many people visiting the site are using brand-related search terms (eg “Powerhouse”, “Powerhouse Museum”, “Sydney Powerhouse” and variants), it is possible to get an indication of market interest in the brand. It is also possible to explore the potential size of an online audience interested in a physical visit to the Museum, and the brand-awareness rub-off that the 5% and 29% figures start to indicate.
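As a sketch of how these figures might be pulled together – assuming sessions exported with an ordered list of paths and, where available, the referring search phrase (the data shape and brand terms here are assumptions):

```python
# Sketch: proportion of sessions touching the home page, starting there,
# and arriving via brand-related searches. Input shape is assumed.
BRAND_TERMS = ("powerhouse", "sydney powerhouse")

def homepage_metrics(sessions):
    """sessions: dicts with 'paths' (ordered list) and 'search_phrase'."""
    total = len(sessions) or 1
    touched_home = sum(1 for s in sessions if "/" in s["paths"])
    started_home = sum(1 for s in sessions if s["paths"] and s["paths"][0] == "/")
    brand_search = sum(
        1 for s in sessions
        if s.get("search_phrase")
        and any(term in s["search_phrase"].lower() for term in BRAND_TERMS)
    )
    return touched_home / total, started_home / total, brand_search / total

sessions = [
    {"paths": ["/collection/123", "/"], "search_phrase": "delta goodrem dress"},
    {"paths": ["/"], "search_phrase": "powerhouse museum"},
    {"paths": ["/collection/456"], "search_phrase": None},
]
touched, started, brand = homepage_metrics(sessions)
print(f"visited home: {touched:.0%}, started at home: {started:.0%}, "
      f"brand search: {brand:.0%}")
```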
It is extremely important to know the breakdown of your online audience, even if it is reduced to something as basic as –
– online only visitors (will never visit the physical site)
– potential physical visitors (local vs overseas/interstate)
– definite physical visitors (immediate vs soon)
Then you can start to build conversion measures around each type of user and improve your site architecture to cater for each in the best possible manner.
Can you ensure your online-only visitors who have probably come in via a search engine are spending significant useful time on your site fulfilling the information seeking needs they have?
Can you ensure that those wanting to visit you with their family in the next two hours can get everything they need to know about visiting, quickly and all in one place?
And can you attract those who have come in via search but also live locally, to potentially become a physical visitor as well?
Web analytics expert Avinash Kaushik has recently written a blog entry exploring how certain elements of Google Analytics can be used to set up useful metrics on non-ecommerce websites. Even if you don’t use Google Analytics as a tool, his suggestions apply broadly.
Kaushik zeroes in on four default Google Analytics metrics – Loyalty, Recency, Length of visit, and Depth of visit – as being of greatest importance to those who don’t fit the ecommerce mould (which is much of the web analytics market).
The first two of these – Loyalty and Recency – are important because they tell you what proportion of your userbase are regulars and how frequently they might expect to see new, expanded or enhanced content on your site. Every museum should be trying to increase the number of ‘regulars’ and convert casual visitors to regulars. Regulars are far more likely to engage deeply with your content – especially interactive content.
The last two of these – Length and Depth of visit – are useful because they tell you how far users go into your site. Most of us probably already look at, and perhaps even report, ‘average time on site’ or ‘median time on site’, but breaking this down further – by user segment, area of the site, or entry point – can give you a lot more information.
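One way to do that breakdown – median time on site and pages per visit, split by entry-point section – is sketched below. The input shape is an assumption; any analytics export with entry page, duration and page count per visit would do.

```python
# Median time on site and pages per visit, broken down by the top-level
# section each visit entered through. Input shape is assumed.
from collections import defaultdict
from statistics import median

def by_entry_section(visits):
    """visits: iterable of (entry_path, seconds_on_site, pages_viewed)."""
    groups = defaultdict(lambda: ([], []))
    for entry, seconds, pages in visits:
        section = "home" if entry == "/" else "/" + entry.strip("/").split("/")[0]
        groups[section][0].append(seconds)
        groups[section][1].append(pages)
    return {
        section: (median(times), median(page_counts))
        for section, (times, page_counts) in groups.items()
    }

visits = [
    ("/", 300, 6), ("/collection/123", 45, 1),
    ("/collection/456", 120, 3), ("/visiting/hours", 90, 2),
]
for section, (seconds, pages) in by_entry_section(visits).items():
    print(f"{section}: median {seconds}s on site, {pages} pages per visit")
```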
What do you measure and report?
Smashing Magazine have put together a splendid and pretty much definitive guide to how Google’s PageRank works. It is full of links to more information. Essential reading.