Month: August 2008

Vote on our next advertisement for the upcoming Star Wars exhibition!

Post author By Seb Chan
Post date August 28, 2008
1 Comment on Vote on our next advertisement for the upcoming Star Wars exhibition!

OK so here’s the deal – the Umbilical Brothers have helped us create six different short TV advertisements for the upcoming Star Wars: Where Science Meets Imagination exhibition.

One of these is already screening in cinemas (see below). But now we need your help in voting for the next advertisement. We can only make one more, but we have five different alternatives to choose from.

Which should it be?

Vote and help us decide. You can also leave comments over on YouTube if you prefer.

Here’s the current advertisement.

More powerful browsers – Mozilla Labs Ubiquity

Post author By Seb Chan
Post date August 27, 2008
1 Comment on More powerful browsers – Mozilla Labs Ubiquity

Mozilla Labs has released Aza Raskin’s Ubiquity in an early alpha form. This is a glimpse into a future world of browser technology which brings notions of the semantic web directly into the browser and connects the dots between websites – not from a provider perspective, but from a user perspective.

Ubiquity for Firefox from Aza Raskin on Vimeo.

Geotagging & mapping Imaging open content

Flickr meets Google Street View – Paul Hagon’s Then & Now (or interesting things clever people do with your data #6247)

Post author By Seb Chan
Post date August 27, 2008
6 Comments on Flickr meets Google Street View – Paul Hagon’s Then & Now (or interesting things clever people do with your data #6247)

A week or so ago Paul Hagon got in touch with me to say he’d done something really cool with our geo-coded historical images in the Commons on Flickr. In what he describes as “about 30 minutes of coding” he had taken a KML feed from our Tyrrell photos in the Commons on Flickr and combined them with the recently released (in Australia) Google Street View.

The results are stunning and another example of why geocoding and releasing your data makes a lot of sense.

Go and have a play.
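For the curious, here is a rough sketch of the first half of that kind of mashup: pulling a title and a latitude/longitude out of each Placemark in a KML feed. This is not Paul's code. The sample KML entry, the `placemarks` function and the `local` helper are all made up for illustration, and the step of handing each coordinate pair to a street-level viewer is left out.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for illustration only; a real feed would be fetched
# from the photo service and would contain many Placemark elements.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example photo (made-up entry)</name>
      <Point><coordinates>151.2000,-33.8700,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def local(tag):
    """Strip any '{namespace}' prefix so the code works across KML versions."""
    return tag.rsplit("}", 1)[-1]


def placemarks(kml_text):
    """Return a list of {'name', 'lat', 'lon'} dicts, one per located Placemark."""
    root = ET.fromstring(kml_text)
    results = []
    for element in root.iter():
        if local(element.tag) != "Placemark":
            continue
        name, coords = None, None
        for child in element.iter():
            tag = local(child.tag)
            if tag == "name" and name is None:
                name = (child.text or "").strip()
            elif tag == "coordinates" and coords is None:
                coords = (child.text or "").strip()
        if coords:
            # KML stores coordinates as longitude,latitude[,altitude].
            lon, lat = (float(v) for v in coords.split(",")[:2])
            results.append({"name": name, "lat": lat, "lon": lon})
    return results


for place in placemarks(SAMPLE_KML):
    print(place["name"], place["lat"], place["lon"])
```

Note the ordering trap: KML puts longitude first, so the values need swapping before they go to most mapping tools.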

Imaging

A Commons Slideshow

Flickr has recently enabled the embedding of slideshows which means doing something like this is now really really easy. And you can even go full screen.

Developer tools

Powerhouse releases a Python HTML Sanitiser for developers to use (BSD license)

Post author By Seb Chan
Post date August 21, 2008
2 Comments on Powerhouse releases a Python HTML Sanitiser for developers to use (BSD license)

As you’ve heard, we’ve been working on a whole lot of new projects. And with new projects comes new code. I can’t say a lot more about these projects right now, but we’ve been using Python and the Django framework to develop them. So here’s the first of the spinoff products that we’re putting out under a BSD license for everyone to benefit from.

Over to Dan MacKinlay, one of our Python gurus, to tell you all about the HTML Sanitiser and why it matters.

“So the idea with the Python HTML Sanitizer is that we are consuming data from a wide variety of client websites, and we need to get their HTML data in a form that’s useful to us. This means –

1) standards compliant XHTML
2) … bereft of formatting quirks which break our site …
3) … and free from exploits for cross-site scripting and other browser-bugs that can compromise user security.

Normally, you can sidestep the HTML sanitization process by writing your own content, or using a special markup language (say, Markdown) – but when you are consuming HTML from clients’ websites this is not an option. They simply aren’t written in Markdown.

Stripping ALL HTML tags out would be another common option. That’s not reasonable for us, however, since we are supposed to be extracting rich information from our clients’ sites, and some of it is really useful and semantic – links, citations and definitions. These are things we don’t want to filter out, or punish them for using.

Rather, we’d probably like to reward them by keeping that markup and indexing on it.

By the same token, many clients use old markup (think HTML3), invalid or badly-formed markup, or merely use types of markup which are inconvenient for us to display (br – or even td tags – instead of p). Moreover, when a site is old enough to have such ancient markup in it, it’s reasonable to think that maybe other types of maintenance have lapsed too – such as security maintenance.

We can’t blithely assume that every client site is free from malicious Javascript or whatever – that’s a one way ticket to weakest-link security hell. Already we’ve noticed that two partner sites have been hacked in the course of the project so far (these days we’d assume that a fair proportion of traffic to most dynamic websites is malicious).

Solution – the HTML Sanitiser.

This is a flexible, adaptable HTML sanitising module (both parsing and cleaning) that can be tweaked to let through rich markup from good client sites, and salvage what it can from bad client sites. This is the approach chosen by things like PHP5’s HTML Purifier and Ruby’s HTML:Sanitizer, but since our scraping code is in Python, we’ve had to build our own, leveraging the power of the awesome BeautifulSoup HTML parser.

Since a lot of people need to solve similar problems to this, and many eyes make for more secure code, we’ve open-sourced it.

Go and download it, make changes and update the codebase.”
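To give a feel for the whitelist idea Dan describes, here is a minimal sketch. It is not the Powerhouse module and does not use BeautifulSoup; it leans on Python's standard-library `html.parser` so it runs without extra installs. The names `ALLOWED_TAGS`, `DROP_WITH_CONTENT`, `SAFE_SCHEMES`, `WhitelistSanitiser` and `sanitise` are hypothetical, and the whitelist itself is just an assumption about what a site might want to keep.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist for illustration: tag name -> attributes allowed on it.
ALLOWED_TAGS = {
    "p": set(),
    "a": {"href"},
    "cite": set(),
    "dfn": set(),
    "em": set(),
    "strong": set(),
}
# Tags whose entire contents are thrown away, not just the tag itself.
DROP_WITH_CONTENT = {"script", "style"}
# Only absolute links with these prefixes survive (relative links are dropped).
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitiser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []   # whitelisted tags we have emitted and not yet closed
        self.skip_depth = 0   # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, but their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_TAGS[tag]:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return
        # Close anything left open inside this tag so the output stays nested.
        while self.open_tags:
            closing = self.open_tags.pop()
            self.out.append("</%s>" % closing)
            if closing == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitise(html_text):
    parser = WhitelistSanitiser()
    parser.feed(html_text)
    parser.close()
    while parser.open_tags:  # close tags the source never closed
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)


dirty = ('<p onclick="x()">Hi <a href="javascript:alert(1)">there</a>'
         '<script>alert(1)</script><font color="red">red</font></p>')
print(sanitise(dirty))
```

The design choice worth noting is that everything is denied by default: a tag or attribute only gets through if it is named in the whitelist, which is generally safer than trying to list every bad thing to strip.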

Interactive Media

Some NSW Baby Name Explorer easter eggs

Post author By Seb Chan
Post date August 21, 2008

Our Baby Names Explorer has been getting a lot of use and some of you might be interested in knowing about a few of the ‘hidden features’.

Here are some useful ones which use the RegEx structure:

Comparing two names – enter ‘/(name)|(name)’ to compare. For example ‘/(michael)|(john)’ will show the popularity of both michael and john on the one chart.

Show names containing a string – enter ‘/string’. For example ‘/ee’ will show you all the names containing ‘ee’ anywhere in the name.

Show names ending in a string – enter ‘/string$’. For example ‘/ae$’ will show every name ending in ‘ae’.

Compare names ending in two strings – enter ‘/(string$)|(string$)’. For example ‘/(ee$)|(ae$)’ will compare names ending in ‘ee’ and ‘ae’.

There are some more hidden tricks . . . . have fun.
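If you want to see what those patterns do before trying them in the Explorer, here is a small Python sketch. The Explorer itself is built in Flex, so this is only an approximation of the behaviour described above: the name list is made up, `find_names` is a hypothetical helper, and treating a leading ‘/’ as "the rest is a case-insensitive regular expression" is an assumption.

```python
import re

# Made-up sample names for illustration; not taken from the registry data.
NAMES = ["Michael", "John", "Aimee", "Renee", "Mae", "Rae", "Lee"]


def find_names(query, names):
    """Return the names selected by a search-box query.

    A leading '/' is assumed to switch on regular expression matching;
    anything else is treated as a plain, case-insensitive name lookup.
    """
    if query.startswith("/"):
        pattern = re.compile(query[1:], re.IGNORECASE)
        return [name for name in names if pattern.search(name)]
    return [name for name in names if name.lower() == query.lower()]


for query in ["/(michael)|(john)", "/ee", "/ae$", "/(ee$)|(ae$)"]:
    print(query, find_names(query, NAMES))
```

The ‘$’ anchors the match to the end of the name, which is why ‘/ae$’ picks up Mae and Rae but not Michael, even though Michael contains ‘ae’.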

Interactive Media

NSW Baby Names Explorer goes live

Post author By Seb Chan
Post date August 19, 2008
2 Comments on NSW Baby Names Explorer goes live

It has been quite a long time coming but finally our NSW Baby Names Explorer has gone live as part of the relaunched NSW Government portal. This is the first publicly visible application developed out of a cross-government project that parts of the Powerhouse team are working on.

The application was built in Adobe Flex to work with any data set (death name explorer, anyone?). It required significant data wrangling as the source data was extracted from different databases and provided to us by the NSW Registry of Births, Deaths and Marriages. Whilst we had data going right back to European settlement in 1788, the decision was made to launch with data only back to 1900 for performance and quality reasons. Additional data and a lot more functionality will be added over time.

Prior to this application, which is currently quite similar to others in the USA, baby name popularity in NSW was only available for the past 10 years in a very basic list form on the Births, Deaths and Marriages website.

There was some quite amusing newspaper coverage this morning.

Other parts of this cross-government project will go live later in the year . . .

Imaging Interactive Media User experience

Next generation of Photosynth-style image interaction – Bundler

Post author By Seb Chan
Post date August 18, 2008

Last year there was a lot of buzz around the first demos of Microsoft’s Seadragon and Photosynth. Now from SIGGRAPH08 comes this rather splendid update to the underlying technologies and concepts.

There is now a lot more ability for users to navigate and tweak their experience of interacting and browsing a 3D scene using miscellaneous 2D images. I was particularly impressed by the notion of using other people’s photos (from Flickr) to act as the intermediaries in scene reconstructions from your own photos; and the very final simple demo of creating a 3D model of an object by processing a series of handheld 2D images – this would greatly reduce the costs of 3D digitisation for museums.

The toolset used in this new version of the underlying technologies, Bundler, has also been released, so if you have some computer science graduates working in your team you could feasibly give it a burl.

Bundler takes a set of images, image features, and image matches as input, and produces a 3D reconstruction of camera and (sparse) scene geometry as output.

Policy Web metrics

Australian internet usage trends and statistics

Post author By Seb Chan
Post date August 11, 2008

Knowing your audience is critical, yet being outside of North America often means that we end up justifying projects, strategies and methodologies on general audience data drawn from another continent.

The CCI at QUT has just published the latest ‘Digital Futures Report – the Internet in Australia’, which is a very comprehensive look at how Australian internet users connect, what they look at, and how they behave online. With the continuing digital divide, and amongst users a usage divide, there are obvious implications for those of us whose mandate is either national or state-wide compared to city or community museums.

Read the full 65 page PDF.
