Lorcan Dempsey points to a succinct PDF article comparing Opensearch 1.0, 1.1, SRU and MXG.
Category: Web 2.0
As I’ve been speaking to other institutions both here in Australia and overseas I’ve started to realise that more of us should be using Opensearch to allow others (or ourselves) to aggregate our deep content – whilst still retaining full control of said content.
I blogged about this ages ago but I think everyone was caught up in getting their collections online and searchable to begin with.
The library sector has been debating its implementation for a while, and the arguments for and against Opensearch are covered here.
OpenSearch is . . . a discovery mechanism. It allows a site to quickly expose vast amounts of data to end users in a detailed enough format that it elicits click-throughs. It is a way for end users to search a variety of sources, and source types, and to quickly grab the useful bits from each source, and to dig deeper for more detail when they find something of interest.
More to the point, though, since everyone must implement their opensearch results in exactly the same way every OpenSearch source is guaranteed to work with every OpenSearch client. Instant interoperability.
Now with both Firefox 2.0 and IE7 supporting Opensearch there really is no reason not to.
Imagine if your collection or your deep/dark web databases that you have already connected up to your website could be easily searched by a centralised search portal? And any interested searchers who clicked on a result would be redirected immediately to your site? And you didn’t need to implement anything complicated to make this possible?
Here is a very simple tutorial for a standard website.
Here is the Powerhouse Museum’s collection search for ‘chair’ delivered via the A9 portal.
And here is the raw XML result which anyone can aggregate to their site (allowing others to deliver traffic back to us).
If you have multiple databases on your site that all have their own esoteric search engines, then you could create your own cross-database search simply by creating an Opensearch feed for each and then a search page that aggregates the feeds.
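To make the cross-database idea a bit more concrete, here is a minimal sketch in Python. It assumes each database returns its results as an OpenSearch 1.1 style RSS 2.0 feed. The function names, the source labels and the interleaving strategy are my own illustration rather than anything taken from our site, and fetching the feeds over HTTP is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by OpenSearch 1.1 response elements such as totalResults.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def parse_opensearch_rss(xml_text, source):
    """Return (total_results, items) for one OpenSearch RSS response.

    `source` is a label (hypothetical, chosen by the caller) that records
    which database each result came from.
    """
    channel = ET.fromstring(xml_text).find("channel")
    total = channel.findtext(f"{{{OPENSEARCH_NS}}}totalResults")
    items = [
        {
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in channel.findall("item")
    ]
    # Fall back to the number of items when the feed omits totalResults.
    return (int(total) if total else len(items)), items


def aggregate(responses):
    """Interleave results from several feeds so no single source dominates.

    `responses` maps a source label to the raw XML text of its feed.
    """
    parsed = [parse_opensearch_rss(xml, name)[1] for name, xml in responses.items()]
    merged = []
    longest = max((len(results) for results in parsed), default=0)
    for rank in range(longest):
        for results in parsed:
            if rank < len(results):
                merged.append(results[rank])
    return merged
```

Interleaving by rank is only one choice. A real search page might instead group results by source, or weight them using each feed's totalResults.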
If you DO add Opensearch to your site then please tell us!
Peer production and the ‘laws of quality’
Interesting reading from Paul Duguid in his paper on First Monday, Limits of self-organization: Peer production & the laws of quality. (via Nicholas Carr)
First, protagonists of the sorts of peer production projects discussed here should reflect on the extent to which, explicitly or implicitly, they rely on the laws of quality. If they don’t, they should ask themselves what they do rely on. Second, projects should be mature enough now for participants to admit their limitations. Project Gutenberg and Wikipedia are tremendous achievements. That does not entitle them to a free pass. Both, because free, tend to get some of the condescending praise given a bake sale, where it’s deemed inappropriate to criticize the cakes that didn’t rise. Third, they should draw closer to their roots in Open Source software. Software projects do not generally let anyone contribute code at random. Many have an open process for bug submission, but most are wisely more cautious about code. Making a distinction between the two (diagnosis and cure) is important because it would suggest that defensive energies might be misplaced.
Synonymiser is an experimental micro-application that returns related words from search data relationships held in the Powerhouse Museum’s collection database. These ‘synonyms’ are dynamically generated from realtime user interaction with the collection database.
On the Synonymiser site you can enter any search word or phrase and it will return a list of ‘related’ words or phrases and a measure of relationship.
Of course, the results are not synonyms in the dictionary sense of the word, but instead show meaning relationships specific to the way in which users use our collection database.
The idea is that these word relationships can then be used to query other data sources. In this case we retrieve images from Flickr™ to demonstrate the concept. It is possible to merge terms and/or offer alternative terms to improve results using this.
There is a proposal to make these synonym relationships available via an API to allow other museums to use and build upon our usage data to improve their own search tools.
Is this useful? Would you like to be involved or help with this?
What is ‘synonym promiscuity’?
‘Synonym promiscuity’ is our term for describing the uniqueness of a relationship of one word to another. If the value is low (less than 10) then the synonym has a very close relationship with the word entered. If the value is high then the synonym is related to many other words (high promiscuity). We are currently refining the mathematics behind the calculation of these values – but the current figures should provide a means of comparing words.
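As a rough illustration of the idea only (this is not the calculation Synonymiser uses, which is still being refined), one could simply count how many distinct other words each word is related to. The `promiscuity` function and the input format of word pairs are assumptions made up for this sketch.

```python
from collections import defaultdict


def promiscuity(relations):
    """Map each word to the number of distinct other words it is related to.

    `relations` is an iterable of (word, related_word) pairs. This is a
    hypothetical stand-in measure: a low count means a word keeps few
    relationships, a high count means it is related to many words.
    """
    partners = defaultdict(set)
    for a, b in relations:
        if a != b:  # ignore a word paired with itself
            partners[a].add(b)
            partners[b].add(a)
    return {word: len(others) for word, others in partners.items()}
```

A word that turns up alongside almost everything would score highly under this measure, which is the sense in which it is ‘promiscuous’.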
Stutzman on YouTube
Fred Stutzman’s blog is quickly becoming a must read.
Here he writes on YouTube from the perspective of YouTube as a social networking service rather than just a video hosting site. As he says,
The social architecture that enabled conversation in YouTube was built in, perhaps subconsciously, from the beginning. The founders built a site so they could share party videos with friends. The founders, while they probably have more friends now, likely had a relatively small social network. It was the millions of users like the founders, using the service in a similar fashion, that drove the value of YouTube. The fact the site also became the perfect home for viral videos and pirated video was completely secondary – they simply had the infrastructure to support the long-tail, hence the capacity to support non-long-tail uses. Other video sites that aren’t targeting the long tail are missing out on the social forces that drove YouTube – while people like viral videos, it is the long-tail of peer-produced content that keeps people coming back. It is the peer-production that enables conversation, and the iterative process that drives value back into the site. Without this value, a video sharing site is just expensive infrastructure built on a house of cards.
He also begins to hint at the other value in YouTube – that by visiting, watching, tagging, sharing and accumulating metadata around videos, users are effectively helping to classify and categorise video, a medium for which (like any time-based media) descriptive metadata is notoriously difficult to create (anyone use SMIL?).
Continuing on the additions from yesterday.
Today we added the top three search terms for each object to the USER KEYWORDS section of an object page. This section is where folksonomy tags can be added and deleted.
Here is a 1960s raincoat for example.
Why did we add this to the USER KEYWORDS section?
What we are finding through our ongoing analysis of folksonomy tagging behaviour is that users are generally adding synonyms. What folksonomies are doing in this instance is effectively ‘crowd-sourcing’ synonym generation. Now when a user searches for a term and selects an object from a list of results they are making an association between the object and that term. In many instances these may be tentative associations and generally full of false drops, but when aggregated, patterns begin to emerge (see Chan 2006 forthcoming). Effectively on our site we are noticing that the search terms are beginning to offer the same kind of synonym behaviour that tags do.
We are presenting them together as a way of making explicit the terms that are already associated with an object (automatically). This way we hope to improve the diversity of user tags (why tag with a word/term that already appears?). This is just a trial though and if we see little or no change then we may move the search terms off to a separate section.
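For anyone wanting to try something similar, picking the top three search terms for an object can be sketched in a few lines of Python. This assumes a click log of (search term, object id) pairs, one per click-through from a results list. The log format and the `top_search_terms` name are hypothetical and not how our database actually stores things.

```python
from collections import Counter


def top_search_terms(click_log, object_id, n=3):
    """Return the n search terms most often used to reach one object.

    `click_log` is an iterable of (search_term, object_id) pairs.
    Terms are lower-cased so 'Raincoat' and 'raincoat' count together.
    """
    counts = Counter(
        term.lower() for term, clicked in click_log if clicked == object_id
    )
    return [term for term, _ in counts.most_common(n)]
```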
I’d welcome any comments or thoughts on this as we’re experimenting here.
Take a look at the new ‘related searches’ feature on our collection search.
Now a search for ‘glass’ will return these ‘similar searches’ –
glassware vase bottles bowl bowls
This result will change over time. Hopefully we will implement a timescale simulator in our upcoming ‘experimental’ browsing section which will allow users to view the changing search language over time.
How does it work?
Because we keep a large store of relationships between search terms and clicked objects, we are able to reverse query terms such as ‘glass’ and see what terms have been used to find similar objects. We currently show only the top 5 terms – aggregated – which lessens the probability of ‘false drops’. False drops are most likely to occur for uncommon search terms although this will change over time, too.
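The reverse query described above can be sketched roughly as follows. Again this assumes a simple click log of (search term, object id) pairs. The names and the data layout are illustrative assumptions, not our production code.

```python
from collections import Counter


def related_searches(click_log, query, n=5):
    """Return up to n other terms that led users to the same objects as `query`.

    `click_log` is a list of (search_term, object_id) pairs.
    """
    # Step 1: which objects were clicked after searching for the query term?
    found = {obj for term, obj in click_log if term == query}
    # Step 2: reverse the lookup and count the other terms used to reach them.
    counts = Counter(
        term for term, obj in click_log if obj in found and term != query
    )
    # Step 3: keep only the most common terms; aggregating like this is what
    # lessens the chance of a one-off 'false drop' appearing in the list.
    return [term for term, _ in counts.most_common(n)]
```

Because the counts come from the whole log, the list for a term like ‘glass’ will shift as new searches and clicks are recorded.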
Does this use folksonomy tags?
Yes. We allow user tags to be included in the search terms and from time to time certain objects will be most visited as a result of their user tag.
A lot of blogosphere energy has been spent discussing the recent figures from ComScore in the USA about a rapid aging of the average MySpace user.
Boyd and Stutzman have been doing some digging and comparing the result to their own research, which problematises the apparent age rise. As they point out, there is a vast difference between visitors (content readers) and users (content creators/social networking participants):
[ComScore] have found that the unique VISITORS have gotten older. This is _not_ the same thing as USERS. A year ago, most adults hadn’t heard about MySpace. The moral panic has made it such that many US adults have now heard of it. This means that they _visit_ the site. Do they all have accounts? Probably not. Furthermore, MySpace has attracted numerous bands in the last year. If you Google most bands, their MySpace page is either first or second; you can visit these without an account. People of all ages look for bands through search.
ABC’s Digital Futures Blog
Apologies for the lack of recent updates. Hopefully plenty more to come this week.
A local blog worth reading is from ABC Digital Futures. The ABC is our national public broadcaster, and they are in the process of migrating from an old-world media organisation to one that hopefully fully embraces the opportunities of the new digital media world. I’ve spoken quite a bit about the rather unique position organisations like the ABC are in, and their ability to leverage that position much as the BBC seems to have done so effectively.
The ABC Digital Futures blog follows and reports on an internal conference held at the ABC and covers many of the same issues and fields that are affecting museums and galleries, particularly as we (the museums and galleries) start to operate in the digital realm more like niche media organisations ourselves.
Fred Stutzman with an excellent post on monetising social networks.
Obviously, whilst there are problems with old advertising and economic models being applied directly to social networking applications, there are many new opportunities here – MySpace branching out into selling songs is a good way for them to utilise content, for example.
This is timely given the warnings over advertising.
Our first attempt at a social networking application was in 2001 when we built the now defunct Soundbyte. Soundbyte was a site for students and teenagers to create and share music that they had made in class. The revenue model was based on the site acting as an attractor for physical visitation to the museum’s Soundhouse lab – where visitors can learn computer music production. The site was quite successful although hampered by government limitations – its greatest success was in greatly increasing the profile of the Soundhouse lab.