I spent the day yesterday hearing all about mashups. Last year, everyone was interested in taxonomies and folksonomies. This year, it’s mashups. Speakers in the Integrating Content track looked at mashups from a variety of viewpoints, and judging from the crowded room for most of the day, interest in them is very high.

Darlene Fichter from the University of Saskatchewan got the day rolling with an introductory presentation. The term “mashup” comes from the music industry, where people started creating their own works by combining existing recordings, such as a vocal track with an instrumental one. Web applications built the same way quickly followed, and now mashups are becoming widespread.

A mashup is a Web application that uses content from one or more sources to create a completely new application. The content is typically obtained from a third party via a download or an RSS feed, and the mashup is built using an Application Programming Interface (API). Tools for creating mashups have been developed, and with them one can create a custom Web application in five minutes.
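To make the idea concrete, here is a minimal sketch of the pattern in Python: read items from an RSS feed and combine them with a second source. The feed snippet, the identifiers, and the library holdings table are all made up for illustration, and a real mashup would fetch live data through each provider's API rather than use hard-coded samples.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 snippet standing in for a third-party feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>New Books</title>
    <item><title>Moby-Dick</title><guid>isbn:111</guid></item>
    <item><title>Walden</title><guid>isbn:222</guid></item>
  </channel>
</rss>"""

# Hypothetical second source: copies held by a library, keyed by the same id.
LIBRARY_COPIES = {"isbn:111": 3}


def parse_items(feed_xml):
    """Extract the title and id of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        {"title": item.findtext("title"), "id": item.findtext("guid")}
        for item in root.iter("item")
    ]


def mash(feed_xml, copies):
    """Join feed items with the holdings table; missing ids get 0 copies."""
    return [
        {**entry, "copies": copies.get(entry["id"], 0)}
        for entry in parse_items(feed_xml)
    ]


combined = mash(SAMPLE_FEED, LIBRARY_COPIES)
for row in combined:
    print(row["title"], row["copies"])
```

Neither source is very interesting alone; the value comes from joining them on a shared key.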

Mashups give individuals the freedom to innovate and put application creation into their hands without needing the services of a developer. They are like a piece of Lego: by itself, one block is not very useful, but when integrated into a structure, it becomes important. One can consider mashups to be lots of small pieces loosely joined.

Many mashups are trivial, but they show the potential for the technique. The vast majority of them use the Google Maps application to plot geographical data. Some applications already available include:
• Housing maps using Craigslist plus Google Maps
• Display of ZIP codes on a map using Census Bureau data
• A map of delivery routes
• Plot of the location of breaking news stories by type of story
• Location of earthquakes plotted on a map using US Geological Survey data
• Crime locations in Chicago using a Police Department database
• PlaceOpedia: Wikipedia articles and their locations
• Group maps for online communities
• WeatherBonk: Maps and weather data from personal and national weather sources
• BookBurro: Uses data from online bookstores and libraries to tell the user which libraries have a book and its cost at several bookstores
• BashR: Wikipedia articles and photos from Flickr

According to the Programmable Web, there are 1,105 mashups available today, with an average of 2.72 new ones being added every day. The Programmable Web maintains a matrix of what has been combined with what.
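A "what has been combined with what" matrix is easy to picture as pair counts. The sketch below is only an illustration of the idea in Python, not how the Programmable Web builds its matrix. The first two entries come from the list above, and the third is a hypothetical example added so that one source appears in two pairings.

```python
from collections import Counter
from itertools import combinations

# Mashup name -> sources it combines. The last entry is hypothetical.
MASHUPS = {
    "Housing maps": ["Craigslist", "Google Maps"],
    "BashR": ["Wikipedia", "Flickr"],
    "Example photo map": ["Flickr", "Google Maps"],
}


def pair_counts(mashups):
    """Count how often each unordered pair of sources is used together."""
    counts = Counter()
    for sources in mashups.values():
        # Sort so that (A, B) and (B, A) land in the same cell.
        for pair in combinations(sorted(set(sources)), 2):
            counts[pair] += 1
    return counts


matrix = pair_counts(MASHUPS)
for (first, second), n in sorted(matrix.items()):
    print(f"{first} + {second}: {n}")
```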

Creation of a mashup requires obtaining a developer token from Google or another site so that the API can be accessed. Community Walk is a simple creation tool that walks the user through the steps in creating a mashup. Use of the Google Maps application requires obtaining the latitude and longitude of the spot to be plotted, and there are databases available to facilitate this process. Other map builders include YourGMap and MapBuilder.
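The latitude and longitude step is where most of the work hides. As a rough sketch, assume a small lookup table in place of a real geocoding database; the table, the record layout, and the marker format below are all invented for illustration and are not the Google Maps API. The coordinates are approximate.

```python
# Hypothetical lookup table standing in for a geocoding database.
# Values are approximate (latitude, longitude) in decimal degrees.
GAZETTEER = {
    "Saskatoon": (52.13, -106.67),
    "Calgary": (51.05, -114.07),
}

# Hypothetical records to be plotted.
RECORDS = [
    {"place": "Saskatoon", "label": "University of Saskatchewan"},
    {"place": "Calgary", "label": "University of Calgary"},
    {"place": "Atlantis", "label": "Unknown"},
]


def to_markers(records, gazetteer):
    """Turn place-name records into map markers; report names not found."""
    markers, unmatched = [], []
    for rec in records:
        coords = gazetteer.get(rec["place"])
        if coords is None:
            unmatched.append(rec["place"])
            continue
        lat, lng = coords
        markers.append({"lat": lat, "lng": lng, "label": rec["label"]})
    return markers, unmatched


markers, unmatched = to_markers(RECORDS, GAZETTEER)
print(markers)
print("Could not place:", unmatched)
```

Keeping a list of unmatched names matters in practice, because a place that fails to geocode would otherwise silently vanish from the map.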

Darlene listed some of the technical issues with mashups. They are in their infancy, so some of the tools fall short of the ideal, and there are scale and dependency issues. The permanence of the underlying data sources can be a concern. It is important to be aware of intellectual property issues because you might not have the “right to remix”, and you might have to pay for using the data. There could be a dark side to mashups too, and if they are used to identify individuals, privacy issues come into play.

Tom Reamy of the KAPS Group asked whether mashups are a revolution and concluded that they are not, even though some people say that they are revolutionizing Web development. His opinion is that they are simply data integration. He pointed out that mapping data is an old idea and that the current focus on technology is misplaced. He also noted that mashups need taxonomies and metadata, which are certainly not new ideas. Tom feels that mashups are still in the realm of “cool” and suffer from irrational exuberance.

Mashups can be thought of as a variant of faceted navigation, or dynamically mapping two dimensions together. Reamy suggested that we need to move beyond individual mashups to a platform for integration of a variety of dynamic sources. Mashups within the enterprise can be profitably used to integrate internal content with public Internet content. Geography is an early application for mashups because there are existing standards making it easy to develop mapping applications.

Reamy’s conclusions are that we need a for mashups, or a community to provide ongoing ranking of them. We need simple APIs to enable social collaboration, content structures such as metadata and taxonomies, so that we can use and build on content aggregation and faceted navigation.

John Blyberg from the Ann Arbor District Library and winner of the Talis “Mashing Up the Library 2006” competition discussed mashup applications, particularly for libraries. Advantages of mashups include:
• No advanced coding skills are needed.
• They provide instant gratification because results appear immediately.
• The results can be striking and elegant in presentation.
• They are a more involved and enlightened use of the Internet and are therefore part of the Evolving Web. By allowing machines to swap data, the world will become a much smarter organism. The era of Web Services is really here.
• They are an Internet tool for the proletariat and shift power to the users.

Blyberg was emphatic that libraries not only can create mashups but must. Some library applications are:
• Lists of the most popular books
• Electronic signage. At the Ann Arbor library, a list of the most popular books (number of requests and copies in the system) is displayed on a large screen as people enter the library.
• Cover images of new books (see Ed Vielmetti’s Wall of Books (Superpatron))
• Creation of customized Google home pages using circulation data to show items a user has checked out, due dates, etc.

Letting the public create mashups from the library’s data creates a sense of stewardship and a potential brain trust. Innovation is encouraged, and high-quality feedback is obtained. It can be promoted as a library service and will permit people to be part of the organically growing Web.

Chris Deweese, a developer at the Lewis & Clark Library System, demonstrated how to make a mashup with Google Maps, which has one of the easiest APIs to use. A Google Maps API key is required, and the API documentation is very useful.

The day concluded with an illustration of some of the available mashup tools. Justine Wheeler, Data Librarian at the University of Calgary, reviewed some commonly used data resources. There are two different types of data: microdata is raw, unprocessed data down to the case or respondent level (mashups with only raw data are sometimes called “dashups”), while aggregated data has been summarized. Many of the available data sets are huge, and a data extractor is needed to obtain exactly the desired data. Justine’s lists of resources will be available on the conference Web site.
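The difference between the two types is easy to see in a few lines of Python. The respondent records below are invented for illustration; the sketch simply shows respondent-level microdata being summarized into aggregated data, here a mean per region.

```python
from collections import defaultdict

# Hypothetical microdata: one row per respondent.
MICRODATA = [
    {"respondent": 1, "region": "North", "income": 40000},
    {"respondent": 2, "region": "North", "income": 60000},
    {"respondent": 3, "region": "South", "income": 50000},
]


def aggregate_mean(rows, group_key, value_key):
    """Summarize microdata into one mean value per group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [running sum, row count]
    for row in rows:
        bucket = totals[row[group_key]]
        bucket[0] += row[value_key]
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


summary = aggregate_mean(MICRODATA, "region", "income")
print(summary)
```

Once the data is aggregated, the individual respondents can no longer be read off directly, which is part of why the conditions of use for microdata tend to be stricter.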

It is important to check the conditions on use of the data. Just because you can create a mashup does not mean you are allowed to. Using microdata in a mashup to try to identify someone is universally forbidden. Many data set producers provide an RSS feed to let users know when new data has been uploaded, and most of them also have a “codebook”, or documentation describing the data and how it was produced. Always read the documentation: with power comes responsibility!

Finally, Kathy Greenler Sexton, Chief Marketing Officer at HighBeam Research, illustrated how she uses mashups in her job. She builds them with Netvibes, creating a personalized home page to track news, blogs, social information, and her personal e-mail. She also noted that Google has recently built SearchMash, a search engine for mashups.

The “mashup day” was not only fascinating but also extremely educational. Mashups will certainly play a part in our Internet experiences, and even though they are still in their infancy, we can expect to see them become prominent as individuals and commercial organizations take advantage of their power.

