Archive | Online Information 2010

More Reviews of Online Information 2010

Several reviews of the Online Information conference have appeared, including one entitled “The One When It Snowed”.  You can also see a list of several of the reviews on Information Today Europe’s blog, including an analysis of the trends and highlights.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Next Year: Same Venue, New Hall

And so we come to the end of Online Information 2010 and this installment of “Live From London”.  While the large effects of the economic downturn on the online information industry were quite noticeable–conference delegates numbered about 600, which was a significant decline, and the exhibit hall had shrunk, as noted in an earlier post–exciting new developments and technologies are making themselves felt, and this year’s conference covered them well.  The program focused heavily on two of them:  open and linked data, and social media strategies, challenges, and integration into information products and services.  One hopes that these developments will help the industry recover from the downturn and progress to new heights.

And one also hopes that weather conditions will be better next year; London was cold and snowy this year, and particularly on the final day, attendees faced difficult and even chaotic conditions with public transportation!

Online Information 2011 will still be in London and still at the Olympia Conference Centre, but it will move to a new hall within the Centre.  Continue to follow The Conference Circuit for the latest developments on this and other industry conferences.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

The New World of Search: Closing Keynote Panel

(L-R) Stephen Arnold (Moderator), David Hawking, Chief Scientist, Funnelback Search; Andrew Kanter, COO, Autonomy, UK; Saratendu Sethi, Director, Research & Development, SAS; Jeremy Bentley, Chief Executive, Smartlogic

The conference concluded with a panel of CEOs, moderated by Stephen Arnold, President, Arnold Information Technologies, and longtime expert and conference speaker, discussing the new world of search.  Arnold listed the following trends that he believes will appear in searching for 2011 and invited panel members to agree or disagree with him.

  • In 2011, Open Source (OS) search will put proprietary software companies under great pressure and will force these companies to change their businesses. Kanter disagreed, saying that OS has its place at the commodity end of the market, which is the smallest part.  Sethi also disagreed and said that SAS provides extensive “hand holding” to its customers, and this is where the value comes.  Bentley said that search has been commoditized and his organization is agnostic to the various platforms.  Search engines do not have the metadata to provide the user with a good search experience.  Finding is what people are prepared to pay for.  Hawking said that his products are developed inside the company, but the code is available for users to modify if they wish.  Searching solves people’s problems; what combination of technology will solve problems better or more often?
  • Buyers of systems are no longer talking about search; problem-solving solutions will lead to survival.  Sethi responded that metadata is critical to the success of searching.  Search has become a decision engine to arrive at the solution of the problem.  End users should not need to deal with complex search semantics.  Bentley agreed with the trend provided that “solution” means “value”.  Users often use jargon different from that of a document’s author; work is needed to make metadata accurate.  It is extremely difficult for a machine to figure out what is in the user’s head.  Funnelback is providing an effective finding tool which is different from search.  Searching should be simply built into the problem-solving system, but it is often the weak link.  Kanter said that search is a manual and inefficient process.  Engines that watch things as they happen are much more effective.  For Autonomy, search has always been part of the problem.
  • The growth of data in organizations will cause companies’ systems to fail.  Traditional vendors will not be able to handle the new types of content (Twitter posts, blogs, etc.) or the volume of content. Sethi said that faster indexing will be needed because search engines will continue to need to handle precision and recall.  Findability will be improved by improved metadata.  Technology is already available that can handle thousands of transactions a second, so this is not a problem.  SAS can process 100 to 150 Tweets a second.  Kanter said that Autonomy can index every Bloomberg terminal transaction and e-mail of a major Wall Street bank, but the problem will only get worse.  Hawking indexed 115 million websites in the .uk domain on a laptop and received good response.  However, he thinks that the problem will get better because people will weed their repositories.  Bentley said that Smartlogic was designed before social media evolved and it is built to scale to the size of a Twitter stream.
  • In 2011 and in the markets that matter, the products sold by panelists’ companies will be under great pressure because the needs of the user are not served by them.  A new group of companies will therefore emerge. Bentley said that access to information will be through devices that do not have a screen and keyboard, and therefore he disagrees with the prediction.  He thinks that enterprise semantics is the technology that we should be considering.  Hawking said that mobile devices present presentation challenges but they offer new ways of interacting.  The nature of search will not disappear, so the system behind the scenes must be amenable to new ways of searching.  Sethi said that SAS has been profitable since its founding because it has had a visionary approach as technologies changed.  He thinks that SAS will survive because it is already actively adapting its systems for mobile devices.  Kanter said that mobile will not crush Autonomy.  The power of the devices we carry opens opportunities for new products using the new technologies now available.  They are already entering new markets using content for mobile devices.  For Funnelback, search effectiveness will be foremost.  Search has many forms in different environments.  There are many problems, which metadata often cannot solve.  The big challenge is to help people form their queries to match the data structure.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

A Course in the XML Theater

Jay ven Eman, CEO, Access Innovations, Makes a Presentation in the XML Theater

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Managing Content in a Mobile World

It is certainly not news that mobile applications have taken the world by storm, but managing content for that world continues to be a key issue.  This panel discussed many of the considerations.

Mobile Content Panel (L-R): Ed Keating (Moderator), Alan Pelz-Sharpe, Robin Neidorf, David Kellogg

Alan Pelz-Sharpe, Principal of the Real Story Group, led off the discussion and noted that mobile devices are enormously diverse and include anything except a PC.  Most organizations are trying to repurpose existing content for a mobile environment.  Will they need to create many different versions of the same content?  How will they be managed?  Devices are increasingly incorporating touch screens–how will this affect existing mobile content applications?

Key concepts for mobile:

  • The publishing model:  pages vs. components.  We need to shift to smaller chunks.
  • Low bandwidth requirements for a limited screen size. There will always be bandwidth limitations.  It may be necessary to break pages into smaller chunks.  This may not be easy.
  • We must consider device-specific capabilities and limitations by adapting content, layout, or templates.
  • Design for mobile and touch interfaces–no double-clicking, dropdown menus, etc. are available.  You will not find a one-stop shop for all of your needs.
  • Content production for a mobile environment is difficult.

Key alternatives:

  • Do nothing and take the attitude “If you can’t read it on your device, go to a PC”.
  • Simple repurposing.  Removing the images helps adapt the content.
  • Broad targeting to specific devices.  This will keep most of the users happy most of the time.
  • Fine targeting to major groups of users.  This is an expensive and complicated area to get into.

Most vendors claim to support delivery to mobile devices, but these claims must be tested.  There are multiple ways to adapt your content; all require some degree of technical and operational adaptation.  The most critical requirements are a component-oriented system and the controlled use of rich text editors.  Publishing becomes an exercise in trade-offs.  We are not creating different types of content, but a different environment.  (A free copy of Real Story’s report on mobile content is available here.)

Robin Neidorf, Director of Research at FreePint, described a study conducted for the Financial Times on mobile content in enterprises.  Twenty-seven in-depth interviews were conducted with information managers, followed by a survey of 100 knowledge workers who use mobile content.

There is a disconnect between mobile workers and mobile content, between security and workflow, and costs and device support.  These were very important to buyers but not to users.  Publishers are interested in the content, but the focus of enterprise content buyers is the worker.

Mobile workers do not want to do their own searches; instead, they want to send the request to a deskbound searcher and have the results e-mailed to them.  So it is the deliverable, not the search, that must be enhanced for a mobile platform.  Vendors must design their products to accommodate this observation.

Many companies have a major investment in BlackBerry technology and may be unwilling to shift to other platforms until their investments have paid for themselves.  At present, many users are only thinking about moving to an iPhone or iPad.  New expenditures are not being made in the current economic environment.

Data security is highly important to enterprises when they bring in mobile content.  They want to know where the data is actually stored and managed.  Many organizations have designed their systems for seamless authentication, so that a user never has to remember a password.  This is a major challenge to apply to mobile devices.  Many vendors have not recognized these security issues.

The user surveys showed that the most widely used actions on mobile devices are texting personal contacts, accessing e-mail, and searching via a major search engine.  They are still primarily behaving as consumers rather than knowledge workers in a business.  Airtime and access costs are a primary concern to these users.

David Kellogg, CEO of MarkLogic, continued looking at the mobile opportunity.  Information providers are building services that fuse content and software to help solve business problems–“content in context”.  Many of them know who the user is and where they are, so they are building products to help them become more efficient.  They used to sell a document; now they sell an app.  We need to consider mobile in conjunction with the value proposition, and it’s no secret that mobile devices are increasing and pose hard technology strategy questions.

Kellogg recommends reading Chris Anderson’s story in Wired magazine:  The Web is Dead.  The future is the Internet, and it will be multichannel and mobile.  What is your value proposition?  The bold thing to do is try to change your value proposition to understand how users are using your content, then develop new products to accommodate their behavior, which will certainly include mobile devices.  The web may be dead, but mobile gives us a second chance to develop something for the Internet.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Happy 10th Birthday Wiley Online Books!

John Wiley celebrated the 10th anniversary of the launch of its e-book product, Wiley Online Books, at its booth.

An appreciative crowd enjoyed drinks and snacks.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Learning From User Behavior


David Nicholas

In all online systems, user behavior is important because it guides the development and promotion of the system.  David Nicholas,  Director of Information Studies at University College, London, reported on a study of the impact of the Europeana system to determine whether it is meeting its objectives.  Europeana is a multilingual online collection of millions of digitized objects from about 1,500 European museums, libraries, and archives.  The launch of Europeana was seen as an excellent opportunity to study the impact of a large and ambitious public digital rollout of an online system.

The system was developed in France and was considered by some as the French answer to Google; however, one of the first results of the study was that most use of the system (over 90%) was by robots, most frequently by Google.  Those who study system usage must be aware of this effect; otherwise, their data will be distorted.  After excluding the robotic use, Europeana was visited by about 1 million users who made 9-10 million page views during the first 12 months after it was launched.  Usage was volatile and diverse; here are some results of the study, which are very interesting and revealing:

  • Hourly use peaked initially at 11 AM, then reached a daily high in the evening, suggesting a high volume of home and public use.
  • The daily peaks for different countries can vary widely; for example, Finland peaked at 10-11 AM and Portugal at 6-7 PM.
  • Weekly use peaked on Tuesday.  Sunday usage was similar to a working day suggesting a consumer/leisure profile of users.
  • December and March are the busiest months.  Summer usage is significantly lower.  The UK showed little monthly usage variation; France and Germany followed the overall pattern.
  • Overall growth was modest, equivalent to a 0.9% compound annual growth rate.  Germany and France account for about 40% of use.
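The point about robot traffic can be illustrated with a minimal sketch, which is not the method used in the Europeana study. It assumes a log reduced to (timestamp, user-agent) pairs, a short hypothetical list of user-agent markers, and invented sample records; it drops the robot hits first and only then counts page views by hour of day.

```python
from collections import Counter
from datetime import datetime

# Hypothetical user-agent fragments that mark automated traffic; a real
# study would use a maintained robot list rather than this short tuple.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent: str) -> bool:
    """Return True if the user-agent string looks like a crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def human_hits_by_hour(records):
    """Count non-robot page views per hour of day.

    `records` is an iterable of (iso_timestamp, user_agent) pairs.
    """
    counts = Counter()
    for timestamp, user_agent in records:
        if is_robot(user_agent):
            continue  # drop robot traffic before any analysis
        counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


# Invented sample log records, for illustration only.
sample_log = [
    ("2009-12-01T11:05:00", "Mozilla/5.0 (Windows NT 6.1)"),
    ("2009-12-01T11:40:00", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("2009-12-01T19:15:00", "Mozilla/5.0 (Macintosh)"),
    ("2009-12-01T19:30:00", "Mozilla/5.0 (X11; Linux)"),
    ("2009-12-01T19:45:00", "ExampleSpider/1.0"),
]

hourly = human_hits_by_hour(sample_log)
peak_hour, peak_views = hourly.most_common(1)[0]
robot_share = sum(is_robot(ua) for _, ua in sample_log) / len(sample_log)
```

Filtering before counting matters because the evening peak in this toy data would look different if the crawler hits were left in.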

One of the questions to be investigated is whether users tend to primarily access their own nation’s collections or whether they range across the entire collection (a desired goal).  88% of the material used was French.

Google was the major referrer of traffic to Europeana, accounting for half of the referrals.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Improving Search Using Taxonomies and Ontologies

Richard Padley

According to Richard Padley, Managing Director, Semantico Ltd., we are drowning in information, which forces us to look at strategies for improving search.  The benefits of a top-down approach are an improvement in user satisfaction and incremental sales from content monetization.  Search is therefore being “spiced up”.

Padley began by defining some terms:

  • Vocabulary–a body of words used in a specific discipline (like a tag cloud).  Vocabularies are unruly but can be useful.
  • Controlled vocabulary (aka keywords)–specific list of terms used to describe concepts.  They are important to resolve ambiguities in word meanings and can start to look like a dictionary.  They are used for identification of synonyms and include definitions.  Central control of controlled vocabularies provides rigor.
  • Taxonomy–hierarchical listing of terms that shows parent/child relationships (broader, narrower).  Can build polyhierarchy where a term has >1 parent.
  • Thesaurus–adds an associative relationship (See Also or Related Term).  Thesauri can have different “flavors”.  Allowing users to browse the thesaurus may not be useful for them.
  • Ontology–expression of a controlled vocabulary in a specific technical language.
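The relationships in these definitions can be sketched as a small data structure. This is an illustrative sketch, not something shown in the presentation: the class names, method names, and sample terms are all hypothetical. Synonyms stand for the controlled vocabulary, broader terms for the taxonomy (with more than one parent allowed, as in a polyhierarchy), and related terms for the thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One vocabulary entry with the relationships described above."""
    label: str
    synonyms: set = field(default_factory=set)  # controlled vocabulary
    broader: set = field(default_factory=set)   # taxonomy parents
    related: set = field(default_factory=set)   # thesaurus "See Also"


class Thesaurus:
    """Hypothetical container for terms, keyed by preferred label."""

    def __init__(self):
        self.terms = {}

    def add(self, label, synonyms=(), broader=(), related=()):
        self.terms[label] = Term(label, set(synonyms), set(broader), set(related))

    def narrower(self, label):
        """Children of a term: every term that lists it as a parent."""
        return sorted(t.label for t in self.terms.values() if label in t.broader)

    def ancestors(self, label):
        """All broader terms reachable from a term, following every parent."""
        found, stack = set(), list(self.terms[label].broader)
        while stack:
            parent = stack.pop()
            if parent not in found:
                found.add(parent)
                if parent in self.terms:
                    stack.extend(self.terms[parent].broader)
        return found


# Invented sample terms, for illustration only.
vocab = Thesaurus()
vocab.add("Science")
vocab.add("Medicine")
vocab.add("Chemistry", broader={"Science"})
vocab.add(
    "Pharmacology",
    synonyms={"drug science"},
    broader={"Chemistry", "Medicine"},  # polyhierarchy: two parents
    related={"Toxicology"},
)

pharmacology_ancestors = vocab.ancestors("Pharmacology")
science_children = vocab.narrower("Science")
```

Storing only the broader links and deriving the narrower ones keeps the two directions from drifting apart when terms are edited.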

Many searches return huge result sets, but they are in silos.  We can use taxonomies and ontologies to help us by building drill-down categories, suggesting additional search terms, and doing semantic analysis.  This is a major advance on a simple Google search which cannot be refined; you must do a new search.

Other ways to improve search include giving the user an opportunity to drill down in search results and build on taxonomies and ontologies by providing facets, and using Preferred Term displays to show the user the correct tags to broaden the search.  If you just give the user top level results, you are helping them by providing less information and then letting them easily access more if they wish.
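A minimal sketch of the drill-down idea follows, assuming a hypothetical synonym table and a handful of invented documents (none of this comes from the presentation). The query is first mapped to a preferred term, the hits are counted per facet value, and the user can then narrow the same result set rather than start a new search.

```python
from collections import Counter

# Hypothetical synonym-to-preferred-term table from a controlled vocabulary.
PREFERRED = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
}

# Invented documents, each tagged with a preferred term and two facets.
DOCS = [
    {"id": 1, "subject": "myocardial infarction", "type": "journal", "year": 2009},
    {"id": 2, "subject": "myocardial infarction", "type": "book", "year": 2010},
    {"id": 3, "subject": "myocardial infarction", "type": "journal", "year": 2010},
    {"id": 4, "subject": "stroke", "type": "journal", "year": 2010},
]


def preferred_term(query):
    """Map the user's wording to the vocabulary's preferred term."""
    q = query.strip().lower()
    return PREFERRED.get(q, q)


def search(query, **filters):
    """Return documents on the preferred term, narrowed by any facet filters."""
    term = preferred_term(query)
    hits = [d for d in DOCS if d["subject"] == term]
    return [d for d in hits if all(d[k] == v for k, v in filters.items())]


def facet_counts(hits, facet):
    """Count hits per facet value so the interface can offer drill-down links."""
    return Counter(d[facet] for d in hits)


results = search("Heart Attack")
by_type = facet_counts(results, "type")
refined = search("Heart Attack", type="journal", year=2010)
```

Here the facet counts are what a results page would display beside the hit list; selecting "journal" and "2010" narrows the three hits to one.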

Padley stressed that in improving the search process, it is important not to reinvent the wheel, but to apply new technologies such as taxonomies and ontologies.

Reinventing the Wheel

Here are his final recommendations:

Padley is the author of The Discovery Blog, which discusses these issues and contains further information.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Approaches to Social Tagging: Folksonomies

Tom Reamy

Tom Reamy, Chief Knowledge Architect, KAPS Group, gave an information-packed presentation on folksonomies and their uses in social tagging.  He began by noting that a hybrid approach to content management is necessary because search engines have been failing for years.  Folksonomies can be used as an insight into what people are thinking about in an organization; they are much better than search logs for this purpose.  The main problem with folksonomies is getting people to tag their content; true believers tend not to tag for infrequent users.

Value can be obtained from a taxonomy by combining it with other techniques such as search and content management.  This will create a framework in which taxonomies can be used.

Text analytics goes beyond folksonomies, mainly by the use of auto categorization, and can be used for summarization, fact extraction, and sentiment analysis.  Basic level categories–those which are shorter and easier to understand–are at the level at which most knowledge is organized.  They can be used to categorize not only documents but entire communities and add a new element to social media.

Reamy concluded that the best is yet to come as these techniques become widespread.

These are only a few of the topics covered by Reamy in this presentation.  All of his presentations are available on the KAPS website, and are highly recommended for an introduction to the subject.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Scenes From the Exhibit Hall

Here are some random scenes from the exhibit hall.

The Wolters Kluwer Booth Featured This Display of Subjects Relevant to Our Industry

Want the “Real Story”?  Visit Tom Underhill at Real Story Group’s Booth 342 and maybe they can help you.  (Just don’t try to use their diagram to find your way home on the Tube!)

Try Your Luck at Oxford University Press's Booth

A Service For People With Special Needs

General View of the Exhibit Hall

How has the current economic climate affected the information industry?  Not surprisingly, quite significantly.  The Exhibit Hall is noticeably smaller than in past years.  Note the large empty area at one end of it.

A Sign of the Times

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor