Archive | October, 2011

IFLA Conference Grantees Reports Available

IFLA 2011 Conference Grantees

IFLA regularly awards travel grants for attendance at its annual conference.  For the 2011 conference in San Juan, PR, 40 grants were awarded to attendees from Latin America and the Caribbean, Asia and Oceania, Africa, and developing countries.  Recipients were required to be members of their national library association and to write a report on their experiences at the conference.  Several of the reports are now available here; they give an interesting perspective on the conference from a first-time attendee’s point of view.  More reports will be added as they become available.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Workshop on Human-Computer Interaction and Information Retrieval: HCIR 2011

The HCIR workshop began in 2007 as an experiment to see whether information science researchers were interested in meeting to discuss human-computer interaction (HCI) as it applies to information retrieval (IR), and the experiment has been highly successful. The workshop has grown steadily since then; this year, about 90 information science researchers assembled at Google’s headquarters in Mountain View, CA on October 20, 2011 for the 5th HCIR workshop. (Google operates the most heavily used search engine on the Web, so there is no more appropriate organization to sponsor a workshop on human involvement in searching and information retrieval.)

According to the workshop website, “The workshop unites academic researchers and industrial practitioners working at the intersection of HCI and IR to develop more sophisticated models, tools, and evaluation metrics to support activities such as interactive information retrieval and exploratory search.”

The workshop featured a keynote address, poster sessions, presentations, and a challenge competition.

Keynote Address

Gary Marchionini

One of the highlights of the workshop was the keynote address by Gary Marchionini, Dean and Professor in the School of Information and Library Science at the University of North Carolina. He has had a long and distinguished career in the field, serving as a member of the Editorial Boards of several prominent journals, president of the American Society for Information Science & Technology (ASIST), and chair of several conferences. He is the author of Information Seeking in Electronic Environments and Information Concepts: From books to cyberspace identities. His keynote address, “HCIR: Now the Tricky Part”, began with a look back into the history of HCIR and noted that the two pioneers of the field (he called them its “father and mother”) are Nick Belkin from Rutgers University and Susan Dumais from Microsoft Research. He showed a diagram breaking down the history into 3 eras and showing some of the pioneering researchers in each (I was honored to be included as a result of some of my work in the early days of online retrieval at Bell Labs).

  • Pre-1980s: Human and machine intermediaries (human search intermediaries are now largely extinct)
  • 1980s-1990s: Networks, search algorithms, words and links, no human intermediary
  • 2000-present: UIs, facets, usage patterns, social interactions (and involvement of many more people in the search process)

He then presented 3 case studies as examples of HCIR platforms: Open Video, the Relation Browser, and Results Space. From these he assembled a list of challenges and evaluation worries: mixing various approaches to HCIR, information seeker behavior, retrieval and extraction, and individual and group interactions. Here are some pertinent questions:

  • How do we assess query quality (often the first indication of user behavior in a search)? We might think this is basic, but it is really quite difficult. Have we advanced the science to be able to say we are doing something better now? How much confidence can we put in query profiles?
  • How do we use search behavior as evidence? Can we match the behaviors to queries?
  • How do we create document surrogates and assess their effects? Surrogates tend to be active in the early stages of the search process.
  • How do we account for information seeker loads: cognitive, perceptual, and collaborative? What’s the perceptual load in an environment where everything looks the same? When 2 people work together, they are less efficient. We are not paying attention to the costs of collaboration.
  • How do we measure session quality, search quality, and solutions to problems?

Marchionini concluded that substantial progress in HCIR has been made over the last 30 years (compare today’s search experience with that of searches done on a teletype terminal by a librarian while you waited for the results), but there is still much more to learn.

Poster Sessions

Topics of the posters included collective information seeking, a high density image retrieval interface, an interactive music information retrieval system, information needs and behavior of mobile users, search quality differences in native and foreign language searching, and search interfaces in consumer health websites. Links to articles describing the research presented in each poster are available on the workshop website.

Presentations

In the presentation sessions, authors of research articles presented their work, and as with the posters, links to all of these articles are available on the website. Here are the conclusions from a few of the presentations:

  • Chang Liu

    A study of dwell time (how long a user remains on a website) as a function of task difficulty by Chang Liu and colleagues at Rutgers University found that difficult tasks result in more diverse queries and longer dwell times on search result pages. Users with low knowledge of the search subject tend to be less efficient at selecting query terms; those with high domain knowledge spend much more time on content pages than those with low knowledge.

  • Michael Cole

    Michael Cole, also from Rutgers, presented his group’s research on eye movement patterns during a search. Eye movement analysis is quite powerful; Cole et al. found a strong correlation between the level of a searcher’s domain knowledge, length of time reading words, and reading speed.

  • Luanne Freund

    Luanne Freund from the University of British Columbia analyzed document usefulness by genre. People think about genres in different ways; labeling them is difficult for searchers; and they do not always agree about usefulness. Freund identified 5 types of information tasks–fact-finding, deciding, doing, learning, and problem solving–and found that usefulness scores vary considerably by task and genre.

  • Alyona Medelyan

    Alyona Medelyan from Pingar, a New Zealand-based organization, evaluated 5 search interface features in biosciences information retrieval: query autocompletion, search expansions, faceted refinement, related searches, and search results preview. Interface features from several systems were presented to users (without identifying the service); the users were asked to rate the interface. They found facets useful as long as there were not too many of them to choose from, but they felt negatively about autocompletion, saying that it had too much of a “pigeonholing” effect on their searches. The most important thing to them was the content, not the aesthetics of the interface. Facets were useful for searching; the other features were more useful for browsing.

  • Keith Bagley

    Keith Bagley from IBM raised the interesting question of whether concepts from the travel industry could be useful in modeling searching. When we travel, milestones provide reference points along the road. Many searches end prematurely because of user frustration; perhaps searchers could share their “road maps” to success with others.

  • Gene Golovchinsky

    Gene Golovchinsky from FX Palo Alto Laboratory, Inc. studied collaboration in information seeking using the Querium system. He said that just because people talk about a document does not mean that it is useful. Systems have been developed that automatically flag documents that have been used in relevance feedback or queries that returned many useful documents. But are these enhancements useful? Is it appropriate to share results automatically? Does this kind of feedback produce better retrieval results despite users’ initial impressions?

HCIR Challenge

The workshop organizers conducted an “HCIR Challenge”, in which search engine developers were asked to use a set of over 750,000 documents in the CiteSeer digital library of scientific literature to answer several questions focusing on the problem of information availability: when the seeker is uncertain as to whether the information of interest is available at all (for example, in a patent search).  Details of the challenge and the questions are available on the workshop website.

Four teams took up the challenge:

  • The L3S team used its faceted DBLP system to solve questions 1 and 4. Their system shows facets along with the retrieved references. Clustering is based on titles of results.
  • The second team used the Querium system on questions 1 and 2.
  • The VisualPurple team used its GisterPRO system to answer questions 3 and 4. Gister does cloud-powered exploratory searches of unstructured data. It was developed for analysts who need to do hard searches in short times, and it searches databases visually. The only operation available to the user is quoting to construct phrases; there are no Boolean operators.
  • A team from Elsevier Labs demonstrated their query analytics workbench to answer questions 2 and 5.

The challenge winner was chosen by majority vote from members of the audience not involved in the challenge. The vote was very close between the Querium and Elsevier systems. Querium won by a narrow margin.

Future of the HCIR Workshop

The HCIR workshop has clearly been a successful experiment.  It provides a unique venue for researchers in the field to discuss their results in an informal setting.  As it grows, decisions will need to be made as to how to guide it in the future, and an upcoming attendee survey will provide some useful input.  Personally, having been one of those researchers in the past, I hope it will continue.  It was a highly useful and stimulating experience; research in HCIR is making great strides.  We can expect significant improvements in search engines in the future as a result.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

Internet Librarian is Next Week

I will be blogging from Internet Librarian next week, so please head on over to the LibConf blog for the latest updates.  And don’t forget the treasure hunt if you have a smartphone.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Frankfurt Book Fair App

The Frankfurt Book Fair starts tomorrow, Wednesday, October 12. It’s huge–and crowded! Finding your way around can be difficult. If you haven’t already planned out your schedule, you might be interested in an app for your smartphone developed by Publishing Technology, plc. It contains exhibitor listings, hall plans, and a full schedule of events. There is also a “Tour Planner” feature that gives you an aerial view of the halls.

Enjoy the Fair!

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Information Today’s Final 2011 Conferences

The last two conferences on ITI’s fall conference calendar are the HTML5 Video Summit and Streaming Media West, both co-located on November 8-9 in Los Angeles, CA.

The HTML5 Video Summit is an expansion of a very successful track at this year’s Streaming Media East conference last May.  The conference website gives the motivation for this Summit:

“It’s becoming increasingly important to deliver video not just on the web but to a multitude of mobile devices, set-top boxes, and connected TVs, and content providers, browser developers, and end users can no longer afford to have the primary video delivery mechanisms locked up in standards that can’t adapt to new environments. The effects of HTML5 have already had an impact throughout the industry. Major media sites such as YouTube, The New York Times, CNN, Vimeo, and more are already offering HTML5 video players, while web giants Apple, Microsoft, Google, and Mozilla are rapidly adding HTML5 features.  It’s time to consider how HTML5 can help your business move forward in these exciting times.”

The conference is focused on practical applications; several “How To” presentations are on the program, as are comparisons of HTML5 with other platforms such as Adobe Flash and Microsoft Silverlight; standards; and a look into the future.

 

The opening keynote at Streaming Media West will be presented by Michael Aragon, VP, GM, Global Digital Video and Music Services, Sony Network Entertainment.  Other presentations include those on Facebook as a platform for distributing digital media, Google TV, enterprise communications, and the business of premium online video.

Both conferences will share the keynote session and the exhibit hall.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Charleston, DataContent, and Many Other Fall Conferences

Besides the two major conferences mentioned in the title of this column, there are many others on the schedule as we approach the last month before the holiday season.

The Charleston Conference

The 31st Charleston Conference will be on November 2-5 at its usual venue in Charleston, SC.  This year’s theme is “Something’s Gotta Give!”, and that might be a widely shared sentiment in today’s continuing rapid-paced environment of change, new technologies, and economic difficulties.  On Wednesday, November 2, an all-day preconference on “Shared Print Archiving” will take place, along with several half-day sessions and the always popular Vendor’s Showcase.  The opening plenary session features Michael Keller, Stanford University Librarian, speaking on “The Semantic Web for Publishers and Libraries”; MacKenzie Smith, Research Director, MIT Libraries, speaking on “Data Papers in the Network Era” (“Data Papers” means research datasets); and speakers on Hidden Collections and the Digital Public Library of America.  This is just the opening session speaker lineup; the entire conference features similarly interesting topics and high-quality presentations. Charleston is an excellent conference and regularly draws over 1,000 attendees.

DataContent 2011

Organized by the InfoCommerce Group, the DataContent 2011 conference will be held November 3-4 in Philadelphia, PA.  The keynote speaker is Clare Hart, President and CEO, Infogroup (she formerly held several executive-level positions at Dow Jones and was CEO of Factiva from 2000 to 2006). Presentations on “The 3 Cs: Cloud, Crowd, and Curation”, “Mobile’s Second Coming”, “Strategic Makeovers”, and other topics of current relevance will follow.

Information Science: Maps, Life and Literature, Sentiment Analysis, Digital Humanities

The third in a series of conferences on the future of information sciences (INFuture) will take place November 9-11 in Zagreb, Croatia.  According to the conference website, the objective of these conferences is to “provide a platform for discussing both theoretical and practical issues in information organization and information integration.”  The program for this year’s conference was not yet available when this column was written.

Many maps are produced for general use and are not designed to be preserved.  But the “Exploring Maps: History, Fabrication, and Preservation” conference (November 2-3, Philadelphia, PA) will explore maps that have been preserved for their beauty and link to the past.  Many of the speakers are librarians and curators in map libraries at universities and archives.

The Life and Literature Conference (November 14-15, Chicago, IL) was organized by the Biodiversity Heritage Library (BHL) consortium to discuss digitizing and networking of biodiversity literature.  Topics to be covered include biodiversity informatics, publishing models, digital libraries, and humanistic and artistic intersections with biodiversity literature.  The plenary speakers are George Dyson, a technology historian, and Richard Pyle, who has developed database systems for managing biodiversity information.  Four panel discussions on information-related topics and a “code challenge” to produce new and innovative ways to disseminate and use BHL’s data are on the rest of the conference program.

Sentiment analysis deals with attitudes, emotions, and perspectives, and how these are expressed in language.  With the growth of online shopping and product reviews and the use of social media by consumers to voice their opinions, sentiment analysis has become especially important to product sellers and developers.  It is becoming an information research area in its own right.  The Sentiment Analysis Symposium (November 9, San Francisco, CA) will explore various approaches to sentiment analysis and practical uses of it in several industries.  Pre-symposium tutorial and research sessions will be on November 8.

The Supporting Digital Humanities 2011 conference (November 17-18, Copenhagen, Denmark) has not yet organized its program, but a long list of accepted papers appears on the conference website.  The two major themes of the conference are “Sound and movement – music, spoken word, dance and theatre” and “Texts and things – texts, and the relationship between texts and material artifacts, such as manuscripts, stone or other carriers of texts”.

Libraries:  Brick and Click, Library 2.011, RFID

The 11th annual Brick and Click Libraries Symposium will be on November 4 in Maryville, MO.  As in past years, it will be a series of 6 tracks with 5 concurrent presentations in each. The topics cover a wealth of subjects of interest to librarians in physical (brick) libraries as well as those who provide information services to remote users (click).  With 30 presentations to choose from, there is sure to be something of interest to each attendee; indeed, choosing which session to attend may be a challenge!

 

Library 2.011 is a global online conference organized by the School of Library and Information Science (SLIS) at San José State University to be held November 2-3.  The website says that it will be “a global conversation on the current and future state of libraries.” The conference will be arranged in 6 “strands”:


  • Libraries – The Roles of Libraries in Today’s World
  • Librarians and Information Professionals – Evolving Professional Roles in Today’s World
  • Information Organization
  • Access and Delivery
  • Learning – Digital Age Learning Cultures
  • Content and Creation – Changes in Accessing and Organizing Information

The conference website has lots of information on the technical requirements for connecting to the conference, speakers, and 36 pages (so far) of registered attendees.  All of the topics look highly interesting and relevant in today’s information environment.  If you’re not going to the Charleston Conference (see above), Library 2.011 might be a good alternative.

RFID has become popular in libraries; among other things, it allows users to check out their own books.  A new RFID standard has been issued, and it will give libraries wider freedom to choose among the various vendors of this technology.  CILIP, the Chartered Institute of Library and Information Professionals, has organized a one-day conference on RFID in Libraries to occur on November 8 in London.  Speakers will describe the current status of RFID technology, the new standard, and some practical case studies of how they have used RFID in their libraries.


Open Access (OA)

The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities was issued in 2003 and has been signed by the leaders of over 300 institutions around the world.  The 9th Berlin Open Access Conference will be held November 9-10 (pre-conference sessions are on November 8) in Washington, DC (this is the first time it has been held in North America).  A coalition of five organizations (the Max Planck Institute, Marine Biological Laboratory, Howard Hughes Medical Institute, Association of Research Libraries, and SPARC) has organized the conference, and the Program Committee has identified the following subjects for discussion:

  • Transforming Research through Open Online Access to Discovery Inputs and Outputs
  • Creation of Innovative New Opportunities for Scholarship and Business
  • The Impact of Open Access and Open Repositories on Research in the Humanities
  • Open Education: Linking Learning and Research through Open Access
  • Public Interaction: the Range and Power of Open Access for Citizen Science, Patients, and Large-scale Collaboration

The Repositories Support Project (RSP), an initiative funded by the UK organization JISC (formerly known as the Joint Information Systems Committee), will hold its “Autumn School” conference, “Bringing the Emphasis back to Open Access, and Demonstrating Value to Your Institution” on November 7-9, near Cardiff, Wales.  On the program are a keynote address by David Prosser, Executive Director of Research Libraries UK (RLUK), and technical development talks on DSpace and Eprints (OA platforms), Google Analytics, and other topics related to OA.

Digital Libraries and Preservation

The 2nd International Conference on African Digital Libraries and Archives (ICADLA-2, November 14-18, Johannesburg, South Africa) takes a broad view of the digital library world with its theme “Developing Knowledge for Economic Advancement in Africa”.  The first three days of the conference will be a training workshop for managers and library staff entitled “Digital Futures: from Digitization to Delivery”. The workshop will be conducted as a combination of presentations, discussions, and exercises. The last two days are a strategic planning conference entitled “Developing National and Institutional Digitization Strategies” for directors of libraries and museums.


The 8th International Conference on Preservation of Digital Objects (iPRES 2011, November 1-4, Singapore) will be keynoted by Professor Seamus Ross, iSchool, University of Toronto, speaking on “Digital Preservation: Why should today’s society pay for the benefit of society in future?”  The first day of the conference will consist of two tutorials, “Preservation Metadata in PREMIS” and “Archiving Websites”, and the last day will also offer tutorials: “Steps toward International Alignment in Digital Preservation” and “Web Analytics”.  As of this writing, two other keynote speakers and the session topics remain to be confirmed.

The University of London is offering a Digital Preservation Training Program on November 14-16.  The course will cover policies, planning, strategies, standards and procedures in digital preservation, and a class project will be part of it as well.

Society Meetings

Finally, here are two society meetings scheduled for November:

The Society for Scholarly Publishing will hold its Fall Seminar Series on November 8-10 in Washington, DC.  Topics are “Content and Apps for Mobile Devices: Engaging Users in the Mobile Experience” and “Moving to the Online-Only Journal: Breaking Free of Print Constraints”.

 

 

The 2011 European Summit of Strategic and Competitive Intelligence Professionals (SCIP) will be in Vienna, Austria on November 8-10.  The keynote address will be by David Frigstad, Chairman of the Board, Frost & Sullivan.

 

As always, many other conferences, including symposia and book fairs, are listed on the Information Today Conference Calendar.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Internet Librarian Begins in Just 2 Weeks!

It’s hard to believe but true–Internet Librarian 2011 is almost here!  If you’re blogging and haven’t signed up, please click here to do so now.

See you there!

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor



 

Internet Librarian, October 17-19, 2011
Internet Librarian Is In Just 2 Weeks!
Can you believe Internet Librarian is just 2 weeks away? Come to Monterey, California October 17-19 and Internet Librarian will give you the tools to meet all your challenges! Don’t miss our amazing speakers such as:

  • John Seely Brown
  • Lee Rainie
  • Elizabeth Lane Lawley
  • Jane Dysart

And so many more! For a full list of the speakers, click here

Network, learn, and find your treasure!  Register today!

Web Scale Information Discovery: The Opportunity, The Reality, the Future–An NFAIS Symposium

NFAIS held another one of its popular symposia on September 30, 2011: “Web Scale Information Discovery: the Opportunity, the Reality, the Future”.  There were about 80 attendees, 35 onsite and 45 virtually from as far away as Israel.  Generally used in academic libraries, discovery services provide users with single search box (Google-like) access to all of a library’s resources.  Using discovery services can greatly enhance access to a library’s collection.

This was an excellent overview of the current state of the discovery services market, featuring noted experts as well as representatives from the four current major players.  Not only was the current market described, but NFAIS-sponsored activities and a look into the future were also on the program.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor


The Opportunity

Judy Luther

Judy Luther, President, Informed Strategies, opened the workshop with an overview of the discovery landscape.  (She had authored an article on discovery services in Library Journal approximately a year ago, but much has changed since then.)

Online access to journal content progressed in the 1970s from printed indexes with controlled vocabularies to online services such as Dialog, which used Boolean logic to structure search queries.  Initially, only abstracts and indexes were available, but full text was soon added.  Many information providers produced CD-ROM versions of their databases for local access. With the coming of the Internet, access to the full-text became ubiquitous, many specialized services appeared, and Google entered the picture with its Google Scholar service.

End users have become used to content organized by format—with databases of books, journals, or local content, but this arrangement does not help the user.  Users want to be able to run one search across all content.  Students do not think of format at all, and some of them have lost the concept of a database of journals, thinking of information sources by the database name.  And the idea of an information intermediary is foreign to many of today’s students, who consider themselves able to use information services without help.  The proliferation of databases (for example, Stanford University has 900 of them), each of which must be searched separately, means that many databases are infrequently used, so there is a significant need for discovery services.

Students want things fast and simple; libraries want to maximize the investment they have made in acquiring databases of information; and content providers want inexperienced users to access their information.  Discovery tools, which bring the user the experience of walking through the stacks, are geared towards browsing, which is a huge step forward because it lays out the landscape of the content.

Today’s four main discovery service providers each bring a variety of skills to the market. EBSCO and ProQuest have lots of experience with content; ExLibris knows technology; and OCLC has extensive experience with metadata.  Criteria for selecting a discovery service are the same as for most information products:

  • Scope and depth of content,
  • Richness and consistency of metadata,
  • Frequency of updates,
  • Ease of incorporating local information,
  • Simplicity of the interface,
  • Ability to customize the interface,
  • Support for mobile platforms, and of course
  • Cost and availability of funding.

We are still learning the role of discovery tools in the industry and will be for the next few years.

In the question period, the important issue of “good enough” search results was raised. What happens to the discovery environment if the industry standard becomes “good enough”?  Undergraduate students are under considerable time pressure so “good enough” information may be sufficient for them.  Writing a thesis requires much more specific and in-depth information, as do disciplines such as law and medicine.

The Players

In this segment of the workshop, each of the four current discovery service providers presented an overview of their offerings.

Summon™

John Law

John Law, Vice President of Discovery Services at Serials Solutions, observed students searching for information in their native environments and found that more than 90% of them did not know the resources that were available to them in the library. One of the major functions of discovery services is to make the library a useful starting place for research.  Serials Solutions’ service, Summon, launched commercially in August 2009 and was adopted very rapidly, which validated that it filled a need.

 

Here is a breakdown of Summon’s current customers.

Law said that many customers choose the Summon service because it meets users’ expectations.  It was not built on catalogs or library databases, which have only a small portion of the content that users want and do not scale well.  Summon was designed by web application developers, so it has an interface familiar to digital natives which lets them identify with the library.  The system’s objective is to bring the user back to the library, not to bring them federated search.  Content comes from multiple sources and is de-duped: when the same article is retrieved from more than one source, the metadata from each is brought into a single unified record—a capability unique to Summon.
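The merging step Law described can be pictured with a short sketch. This is only an illustration and not Serials Solutions’ implementation: the record fields, the use of a DOI as the match key, and the `merge_records` function are all assumptions made for the example.

```python
from collections import defaultdict


def merge_records(records):
    """Merge records that share a match key into one unified record.

    Hypothetical sketch: each record is a dict with a "doi" match key,
    a "source" name, and arbitrary metadata fields. The first non-empty
    value seen for a field wins; subject terms are combined.
    """
    by_key = defaultdict(list)
    for record in records:
        by_key[record["doi"]].append(record)

    unified = []
    for doi, group in by_key.items():
        merged = {"doi": doi, "sources": [], "subjects": []}
        for record in group:
            merged["sources"].append(record["source"])
            # Combine subject terms without repeating any of them.
            for term in record.get("subjects", []):
                if term not in merged["subjects"]:
                    merged["subjects"].append(term)
            # Fill in any other field that is still empty.
            for field, value in record.items():
                if field in ("doi", "source", "subjects"):
                    continue
                if value and not merged.get(field):
                    merged[field] = value
        unified.append(merged)
    return unified


# Made-up sample data: two sources describe the same article.
records = [
    {"doi": "10.1000/example.1", "source": "Publisher feed",
     "title": "Exploratory Search", "abstract": ""},
    {"doi": "10.1000/example.1", "source": "A&I database",
     "title": "Exploratory Search", "abstract": "A survey.",
     "subjects": ["information retrieval"]},
    {"doi": "10.1000/example.2", "source": "Publisher feed",
     "title": "Faceted Browsing", "subjects": ["interfaces"]},
]

for record in merge_records(records):
    print(record["doi"], record["sources"], record.get("abstract"))
```

In this sketch the abstract missing from one source is filled in from the other, so the searcher sees one richer record instead of two partial ones.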

Summon also points users to the library as the starting place for research, thus providing an immediate ROI for the entire collection.  It works, people use it, and it has had a very significant impact on usage in libraries. Some libraries have seen as much as a 50% increase in uses of well known titles, and their database usage has also increased substantially.

Summon has over 760 million records from more than 80 types of content.  The Hathi Trust will provide over 9 million records to Summon, and book publishers are also supplying data, so for the first time, libraries can offer users the ability to search the content of the books on their shelves.  Local collections can be easily incorporated and made discoverable.  All the content is in one index, is treated equally, and can be searched at once.  An API allows libraries to build their own interface on top of Summon.  It has become a digital front door for libraries.  Publishers are also seeing the advantages of discovery services because usage of their content is increasing in libraries that are Summon subscribers.

Primo

Carl Grant

According to Carl Grant, Chief Librarian at ExLibris, their Primo discovery service was built with the needs of librarians in mind.  Primo is layered on all ExLibris products and can be locally installed or cloud-based.  Over 860 sites, 291 in North America, are using Primo. Users see their results in a single view, with links back to the original source.

Primo Central, ExLibris’s comprehensive consolidated index, can be used as a standalone hosted metadata source, or in conjunction with Primo.  Hundreds of millions of items are indexed in full text.

Primo integrates with many other systems:  ILSs, digital repositories, XML databases, link resolvers.  It is designed to work with what a library has in place and can be customized to boost local content, thus allowing libraries to emphasize “what we offer”. Open APIs allow libraries or consortia to write their own extensions to serve local end users’ needs.  The source of each displayed record is shown, maintaining the content owner’s brand identity.

Primo subscribers have found that their library’s average number of search sessions increases dramatically (NYU found that usage tripled), but average session length dropped because users are quickly finding what they want, which is good news for librarians.

Other considerations include the importance of content neutrality and non-exclusivity of content in databases.  It is important to recognize that discovery services are NOT content providers. The content provider’s branding is maintained, copyright statements are supported, and the discovery service can restrict access to the provider’s subscribers if necessary. Retrieved records in Primo are not de-duped. They are grouped but not merged, and links back to the source record are provided.
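Grouping without merging can be sketched as follows. This is an illustration only, not Primo's implementation; the `doi` and `source` field names are hypothetical. Each source's record is kept intact inside its group, so the provider's own metadata and branding survive.

```python
def group_records(records, key_field="doi"):
    """Group records that describe the same item without merging them.

    Hypothetical illustration: every original record is preserved
    unchanged, so each one can still link back to its own source.
    """
    groups = {}
    for rec in records:
        groups.setdefault(rec[key_field], []).append(rec)
    return groups


sample = [
    {"doi": "10.1/x", "title": "Article A", "source": "Vendor 1"},
    {"doi": "10.1/x", "title": "Article A (reprint)", "source": "Vendor 2"},
    {"doi": "10.2/y", "title": "Article B", "source": "Vendor 1"},
]
grouped = group_records(sample)
```

Here the two records for the first item sit side by side in one group, each still carrying its own title and source.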

WorldCat® Local

Chip Nilges

Chip Nilges, Vice President of Business Development at OCLC, noted that the mission of its WorldCat Local service is to integrate library collections for consumers.  Launched in 2008, WorldCat Local was built at the request of an OCLC member.  It provides a single search of all library collections, integrated and intuitive resource discovery, and interoperation with existing library systems.  Because its goal is comprehensive coverage of a library’s collections, it includes more than books. Over 14 million records from over 1,000 sources, amounting to more than 44 million data elements, have been integrated into the database.

When the same item is retrieved more than once, the “most robust” record is displayed.  Usage reports show statistics and traffic from WorldCat Local to publisher sites.  Full text indexing will be available in early 2012; staff and expert search views are under development; and the user interface is being enhanced.

EBSCO Discovery Service™

Sam Brooks

Sam Brooks, Sr. Vice President, Sales and Marketing at EBSCO Publishing, said that the motivation for developing the EBSCO Discovery Service (EDS) was to try to help libraries compete with Google and Wikipedia for the attention of end users.  He quoted a recent OCLC report, “College Student Perceptions of Libraries and Information Services”, that found that only 30% of college students use library resources; they are using Google instead.  A service that is based solely on full text searching is not much different from Google; the advantage of discovery services is subject indexing.

All discovery services load library catalogs and partner with non-journal vendors, but subject index providers are not working with the discovery services.  As the owner of several subject indexes, EBSCO knew why: subject index providers generally consider partnering with discovery services to be a risk to their bottom line.  There is no risk for full text publishers in working with discovery services because users do not come to full text vendors for the metadata.

A combination of subject indexing and full text searching is the best way to give users good results.  Subject indexing is crucial, and a bias toward it helps give better results.  EDS was developed not to replace subject indexes but to embrace them and gain additional value from them. EBSCO does not try to convince libraries that discovery services are a replacement for subject indexing; indeed, Brooks said that other discovery services may mislead users into thinking that subject indexes are not the best source for finding the desired information.

EBSCO and EDS will be featured on the cover of the upcoming Reference 2012 issue of Library Journal.

The Discovery Service Selection Process

Diane Bruxvoort

Discovery services represent a big commitment of people and funds for a library because they are one way of presenting a library to its community.  Diane Bruxvoort, Associate Dean for Scholarly Resources and Research Services at the University of Florida, described her experiences in selecting discovery services for an academic library and grouped the main issues into “7 C’s”:

  • Commitment can be bottom up or top down or it may be driven by a funding opportunity, but administrative support is crucial. Without it, it is impossible to move forward.
  • Costs can be met from new funds or reallocated ones.  They can also be met from the technology fee charged to students, which is a good use of those fees because discovery tools are used by many students and are very visible to them.
  • Choice: who chooses a vendor makes a difference.  Sometimes the administration just makes the choice, but usually a task force is involved.  The members of the task force must understand all aspects of the service and can include liaisons, librarians, catalogers (they understand metadata), or administrators (having an administrator on the task force is one way to sway the decision with the administration).  End users can help with the decision if they know about searching.
  • Criteria:  Lay out what matters at the beginning, but be willing to add new criteria or delete some.  Be flexible, and avoid the laundry list if possible.  Identify the top 5 criteria for your institution.
  • Coverage:  Who has what?  Much of the coverage is common to all discovery services, but it must be matched up with the library’s holdings.  Note that coverage changes rapidly.
  • Companies:  All of today’s discovery service providers have been in business for a long time, and all the products are good.  An institution’s previous experience with a company is important.
  • Calendar:  In an academic institution, new products are customarily rolled out at the beginning of the fall semester.  Be sure to allow enough time for the launch; it is a big change, and the librarians need time to prepare for a new product.  Instruction is important in libraries, and it is needed with discovery services.  For an example of how a discovery service is presented to its users, see the University of Houston’s library site.

The Reality: Implementation and Results

Even though discovery services are still in their early days, a number of libraries have acquired implementation experiences.

Demian Katz

Demian Katz, Library Technology Development Specialist at Villanova University, said that when Summon appeared as the first available discovery service, the decision was made to integrate Summon into VuFind, Villanova’s interface for discovery, which was being used to search its OPAC.  Summon’s advantages were that it exposes a wide range of library resources in a single place, it has a simple interface, users do not need instruction to use it, and it is compatible with VuFind.  Three implementation options were considered:

  1. Install Summon as a standalone service.  Although this was the simplest option, it would result in loss of functionality and might have relevance ranking problems.
  2. Dynamically merge results from Summon and VuFind.  This option would have been complex to implement and the system response would be slow.
  3. (the chosen option) Provide results from Summon and VuFind as separate lists.  This had the advantage of retaining the full functionality of both systems and was relatively simple to implement.  The results are shown side by side on the screen with many navigation options.

To see the Villanova implementation of Summon click here.

Gregg Silvis

Gregg Silvis, Assistant Director for Library Computing Systems, University of Delaware (UD) Library, noted that UD was the first production site for WorldCat Local. It went live in August 2008 after a complex installation process.  Key issues were the integration of UD’s OpenURL resolver, which was built in house, and a well-maintained list of databases. It was a major conceptual change for the library staff to search a database of external resources: suddenly, articles appeared in search results.  The analytics provided by the system were very useful for seeing how users were interacting with the site.

Scott Anderson

Scott Anderson, Associate Professor and Information Systems Librarian at Millersville University, installed EBSCO’s EDS at the library.  The Millersville library was a heavy user of EBSCO’s products, so the integration of EDS was straightforward because all the system parameters were preset.  Not only did this save considerable time (installation took only a few days), but it gave flexibility to build subsets of the content for each course.

EDS has been well accepted by the faculty and students.  It has administrative support because the staff of the Provost’s Office liked it.  Freshmen think EDS is like using Amazon, so they understand the idea of facets.  And it has resulted in increased usage of some subject databases that the library has been trying to justify.  EDS usage data was even used to avoid cancellations of some databases.  Millersville branded its implementation of EDS as “Library Search”, so in the eyes of the users, it has become the primary point of accessing all research content.

Erin Rushton

At Binghamton University, Ex Libris’s Primo service spurred the purchase of the underlying products as well because it could be used as a uniform discovery layer for the library collections and other digital collections.  The convenience of one vendor supporting all the services was seen as a plus.  Erin Rushton, Web Services Librarian, described the implementation of Primo.  Link resolver data were uploaded to provide availability information to users. System testing was done with the help of forms supplied by Ex Libris, and the system was customized for the local data that was added.  Primo has become the default search on the library home page.  After the system went live, little feedback was received from users, so it was assumed that they liked it.

Content Provider Perspective

Bonnie Lawlor

Bonnie Lawlor, NFAIS Executive Director, reported on a survey of NFAIS members (mostly content providers) conducted a year ago to find out who was working with discovery services and what they had learned.  About half of the respondents were working with discovery services, and most regarded them as an opportunity because they offer broad exposure of content, improved searching speed, and better search results for users.  Some respondents were concerned about loss of brand identification, inaccurate usage statistics, and poor rankings.  A number of specific issues were identified by the respondents; the survey results are available here.

Lawlor is leading a task force currently developing a Code of Practice (COP) for discovery services which will create an awareness and understanding of the issues, ensure full disclosure, provide guiding principles for contract negotiations, and list the rights and obligations of each player. The COP will be patterned on one derived in the 1980s for information gateway services.  So far, 16 rights and obligations have been identified.  When the draft COP is completed later this year, the providers will be invited to react to it.

The Future

Discovery services are putting themselves in the delivery chain of content, influencing what is exposed to users and which content gets used, so they are significant tools.  In the closing session of the day, the four service representatives presented their view of the future of discovery services.

Chip Nilges

Are discovery services a sustaining or disruptive innovation?  We will see lots more aggregation of content into centralized indexes, which will drive the need for more filtering and create a demand for more vertical markets.  Shrinking budgets in libraries will create opportunities for new business models.  Rights, clearance, and subscription agents will come together.  More research will migrate to these services, and interfaces to library collections will become more complex.  Bringing the library into the social web will become increasingly important.  Abstracting and indexing services should be thinking this way as well.  This is an exciting time, and exciting times are scary!

John Law

Libraries can now have tools to meet user expectations, and they will be able to close the gap between information and users.  The number of volumes in a library is no longer a measure of its size.  It is hard to do coverage assessment, but discovery services should make that transparent.  Abstracting and indexing databases are more problematic.  Including indexing from an abstracting and indexing service in a discovery service does not necessarily mean that the information will be discovered.

A recent article from the Chronicle of Higher Education noted these changes in search:

  • Discovery services need to be agile.  We are in a very early stage.  It is exciting to be able to use a whole new technology platform.
  • Discovery services will have to move past the OPAC as their center.  Be careful not to be distracted by a long list of features—only a few users will use the advanced features.
  • Discovery services will need library expertise built into them.

Carl Grant

The next steps in discovery services are:

  • Personal relevance ranking.  How do we consider what the user is looking for?  What is the context of the user?
  • Open discovery processes and workflows.
  • Improved mobile interfaces.  We are not paying enough attention to this.  The growth of mobile is tremendous and will surpass desktop usage in the near future.  We must be on those content devices.  The world of apps is out of control.  Do they add functionality?  Many of them are quite basic.
  • Address the growth in e-book usage.

Needs currently not being addressed are:

  • A clearer difference between Google and discovery interfaces.  Read The Filter Bubble, which describes how search engines tailor results to user behavior.  We do not need to sell ads; make it clear to end users that the best way to get unbiased results is by using a librarian.
  • Many people like text-based learning, but there are also other ways of learning (see Apple’s GarageBand app for the iPad for a good example of such an interface). We need better support of users with different ways of learning.

Sam Brooks

Do users know that a record comes from an abstracting and indexing database?  It is not merged, it requires authentication, and it is not free on the web.  There is a real bias toward subject indexing.  Usage statistics go up when subject indexes are introduced.

The next step is for discovery services to market themselves appropriately.  There is much confusion in the market, which needs to be addressed.  We need to refute the notion that a discovery service can replace a major subject index.

Discovery services are also competing with Google and Wikipedia.  Users think of Google News, Google Images, or Wikipedia, and look at them as legitimate places to get information.  Students like Google and Wikipedia because they provide a single search box, real-time news, and a massive encyclopedia.  Discovery services must find ways to coexist with them.  EDS will shortly offer real-time news from major wire services such as AP, UPI, and PR Newswire.  (AP, the most important newswire, is no longer available on Google News.)  It is also building an unprecedented collection of high-quality encyclopedias.  Contracts have already been signed with 9 publishers, and more are in process.