One of the popular sessions at MW2008 in Montreal was a double header featuring Frankie Roberto and me talking about different approaches to data combining across multiple institutions.
Data combining was a bit of a theme this year with Mike Ellis, Brian Kelly and others talking mashups; Ross Parry, Eric Miller and Brian Sletten all talking ‘semantic web’; and Terry Makewell and Carolyn Royston demonstrating the early prototype of the NMOLP cross search.
Frankie’s presentation was lots of fun and very cheeky. He lodged a Freedom of Information request to all the major national museums in the UK to get access to their collection databases. Very few opened up – but with those that did he built a quick and dirty Ruby on Rails application to cross search and show patterns in their data. Frankie’s talk was refreshing in that it cut through all the decades of ‘museum protocol’ that often stops really innovative things from happening. That the UK still doesn’t have a decent federated search is astounding.
My presentation demonstrated some of the early foundational work we have been doing at the Powerhouse Museum for a cross-government project called About NSW. The project is building lightweight prototypes to enable data combining across the various State Government funded cultural agencies – the Powerhouse, State Library of NSW, Art Gallery of NSW, Australian Museum, the Historic Houses Trust, and other government bodies. I showed how data combining (or ‘smooshing’ as the Canadians kept saying) runs into problems even with apparently simple things like calendars, and then walked through the practical approaches we have been taking to solve some of them. I then showed the potential of social tagging and the search tagging we’ve been doing at the Powerhouse for the past two years, as well as the emergent possibilities that come with data-parsing tools like OpenCalais – for example, the ability to browse object catalogues by person or company name – and what that might mean for connecting to other non-cultural sector datasets. Finally I gave a few examples of the location-aware work we’ve been piloting – delivering collection content to users based on where they are – and how this might work across multiple institutions.
Here are the slides from my presentation. They should be viewed together with the paper that goes with the session. Bear in mind that all of this is very much a work in progress and is more about testing and exploring new approaches than developing a single watertight, robust method. As I hope is clear, much of what I’ve been arguing is that no such single method exists, or ever will.
Mike Ellis, Brent Gustafson, and Frankie himself blogged about the session in enough detail to give a sense of how the audience interpreted it and of the discussion that followed.
One reply on “MW2008 – Data shanty towns, cross-search and combinatory approaches”
Thanks for posting these – Brent took notes during that session so I could watch, but I still find myself forgetting half of it until I can see the slides again.
Did you get a chance to check out the Delphi Toolkit they’re building at Berkeley? (Paper here) Pretty cool, I’ve been playing with it a bit lately and it seems that with a good ontology built it’s quite good at extracting “meaning from mess”.