Search Engine Update

This session featured 3 search experts reviewing current trends and developments.  Marydee Ojala, Editor, ONLINE Magazine and long-time online searcher, led off with a presentation entitled “So Many Search Engines, So Little Time”.  Of course, the most popular search engine is still Google, but its relevancy is declining, there is no commitment to advanced search options, and it seems to be pulling back from features admired by information professionals.  Alternatives to Google are:

  • General web search engines.  Bing by Microsoft is the most familiar.  It features field searching and search refinement (i.e., advanced search).  Yahoo’s search is powered by Bing except in Japan and South Korea, and Yahoo remains a takeover target.
  • Specialty search engines concentrate on format (images, video, social media), or subject (news, science, business).  A variety of country search engines are available, such as Baidu (China), Yandex (Russia), and Naver (South Korea).  The Search Engine Colossus is an international directory of search engines.  Blekko has no spam and filters out results from content farms.  DuckDuckGo is known for its privacy because it does not save searches.  Exalead is a cloud-based site for enterprise search and has some advanced features such as soundslike and spellslike.  Topsy is now the only search engine for archival Tweets.

Many search engines feature databases of a variety of information types; for example, one can find databases of images, books, news, and maps on Google; images and finance on Yahoo; and travel, news, images, and video on Bing.  Flickr and Picasa are well-known image databases, which can be searched by image criteria such as color.  YouTube, of course, is the leading video search engine, but one can also find instructional videos from various universities as well as those from the Journal of Visualized Experiments (JoVE).

  • Paid search engines are mainly the traditional ones such as Dialog, Factiva, LexisNexis, EBSCO, and ProQuest.  Some subject-oriented paid search engines are also available, such as those from STN International, whose flagship database is Chemical Abstracts.  In contrast to Google and some other web search engines, no SEO manipulation is done by these vendors, so results are very consistent.

Innovation in search continues, but it is happening at the margins and inside the enterprise.  Search algorithms are changed frequently.  (See the closing keynote session for a discussion of the future of search.)  Information professionals must constantly keep up with changes in search engines and be ready to switch search tools quickly.  This is time consuming, but it is necessary if we are to remain relevant.

Marydee closed by urging attendees to read The Filter Bubble.

Arthur Weiss, Managing Director, AWARE, continued Marydee’s theme and reviewed some specialist search engines for people, numeric data, and news.  He noted that although search engines may claim to search the deep web, they may be only using a web crawler to find material on the visible web.  True deep web search tools typically look for information not searchable by crawlers.

Weiss showed how a Google news search returns different results depending on whether one is logged in to a Google account or not.  When you are logged in to your account, Google knows who you are, your location, and any preferences you have set.  Several news search engines cater to business users, including Northern Light, Congoo, and Newsnow.  Silobreaker and Evri aggregate news and return results on a topic.  Silobreaker has a number of innovative features, such as a summary, headlines, and trend charts showing item frequencies.  Evri has more images than Silobreaker.

People search engines are either directories of names or searches for names in the context of articles.  Some of the second type include Pipl, 123People, and Yasni.  Pipl has a US bias; the other two are based in Europe.  Yatedo allows phonetic searches, searches based on links to other people, and other advanced options.  Jigsaw is a database of online business cards and actively solicits contributions of them.  Yoname searches people who are users of any of 27 social media sites.

Numeric searches can be difficult because much numeric data is presented in graphical format.  Data from official statistical sources is available in the Offstats database, and the Open Data Directory provides links to over 400,000 databases of numeric data on a wide range of subjects.  For scientific data, Wolfram Alpha is a good source; it presents data in tabular or graphical format.  Lexxe searches data by using a “semantic key” approach and also reports results in a chart.

Karen Blakeman, Trainer and Consultant, RBA Information Systems, looked at what search engines know about us.  The answer is “a lot”, so users must be well aware of this as they do their searches.  In particular, Google knows us very well and personalizes search results based on the user’s location, browser, search history, blocked sites, “liked” sites, etc.  Searches based on the user’s location attempt to return results relevant to the country, but they may return erroneous results because, for example, a company’s switchboard may be located in a different country.  This has implications because access to some sites is blocked outside their local region.

Panopticlick will test your browser configuration and report how unique it appears to be.  (The more unique it is, the easier it is to track unique information about the user.)
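
To illustrate why a rarer configuration is easier to track, here is a minimal sketch of how uniqueness can be expressed as bits of identifying information.  It is not Panopticlick's actual code; the function name, the three attributes used as a fingerprint, and the sample of visitors are all made up for illustration.

```python
import math
from collections import Counter


def surprisal_bits(fingerprint, observed):
    """Return the bits of identifying information carried by `fingerprint`.

    `observed` is a list of fingerprints seen across visitors.  The rarer
    the fingerprint is in that list, the more bits it carries.
    """
    counts = Counter(observed)
    if counts[fingerprint] == 0:
        raise ValueError("fingerprint does not occur in the observed sample")
    share = counts[fingerprint] / len(observed)
    return -math.log2(share)


# Hypothetical sample: (browser, language, screen size) for eight visitors.
observed = [
    ("Firefox", "en-US", "1280x800"),
    ("Firefox", "en-US", "1280x800"),
    ("Chrome", "en-GB", "1920x1080"),
    ("Firefox", "en-US", "1280x800"),
    ("Safari", "fr-FR", "1440x900"),
    ("Firefox", "en-US", "1280x800"),
    ("Chrome", "en-GB", "1920x1080"),
    ("Firefox", "en-US", "1280x800"),
]

common = surprisal_bits(("Firefox", "en-US", "1280x800"), observed)
rare = surprisal_bits(("Safari", "fr-FR", "1440x900"), observed)
print(f"common configuration: {common:.2f} bits")
print(f"rare configuration:   {rare:.2f} bits")
```

In this toy sample the configuration shared by five of the eight visitors carries well under one bit, while the configuration seen only once carries three bits, enough to single that visitor out of the eight.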

Search personalization and localization may not be all bad for users; for example, it is useful if you need to quickly find a local restaurant or are researching companies in a particular country.  To explicitly search local listings, country versions of search engines are useful.  Several browsers have an anonymous searching feature that turns off saving of searches, personalization, etc.  You can also set your ad preferences in Google (www.google.com/ads/preferences).

Facebook is notorious for making it difficult to delete material, and it keeps material even when you think you have deleted it.  Europe v. Facebook is a collection of complaints against Facebook and instructions for residents of Europe to request their data from Facebook under EU privacy laws.

In the news area, Google can seriously damage search results.  Mary Ellen Bates recently did an experiment where she asked several searchers to enter the term “Israel” and send her the results.  The results were startling:  More than 25% of the stories were retrieved by only one searcher, and only 12% of the searchers saw the same 3 stories in the same order in their results.  Google’s recently introduced “Standout” feature to tag content will make the situation worse.

So what should a searcher do?  You can reject cookies, but then many searches will not run.  Active management of cookies is possible, but it is time consuming.  Scroogle.org provides an anonymized interface to Google, but it is for web search only.  DuckDuckGo and Blekko do not keep web history or personalize search results.

Here are Karen’s recommendations in this uncertain and sometimes scary search world.

  • You have some control over personalization, so damage limitation is sometimes possible.
  • Sometimes a web search history is a convenience and personalization is a good thing.  You must make this decision.
  • If you have a Google or Bing account, be sure to log out of it when not using it.
  • Regularly check your dashboard, privacy settings, and ad preferences.
  • Clear histories if you do not need them.
  • Remember that if you delete all cookies, you will lose your opt-out preferences.

This was an information-packed session and one that all information professionals should look at.  You are certain to find something of interest!

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor
