Finding the Concept, Not Just the Word

Ontologies are popular!  In fact, they are very popular!  But there is still much confusion about the differences among an ontology, a thesaurus, and a taxonomy.  In her presentation entitled "Onto-What?", Brandy King, Librarian at the Center on Media and Child Health at Harvard University, gave one of the most succinct and clear definitions of these terms that I have heard.

  • A thesaurus contains concepts and their synonymous (or opposite) relationships.
  • A taxonomy is concepts arranged in a hierarchical relationship.
  • An ontology contains defined concepts and the desirable relationships between them.
    • Example relationships are "Is_a", "Has_a", "Occurs_with", and "Result_of".
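
To make the three definitions concrete, here is a minimal Python sketch of how each structure might be represented.  The concepts and the helper names (`ancestors`, `related`) are hypothetical illustrations and are not taken from King's presentation.

```python
# All concepts below are hypothetical examples, not taken from the presentation.

# Thesaurus: each concept maps to its synonyms.
THESAURUS = {"television": ["TV", "telly"]}

# Taxonomy: each concept maps to its single parent in a hierarchy.
TAXONOMY = {"television": "electronic media", "electronic media": "media"}

# Ontology: typed relationships stored as (subject, relation, object) triples.
ONTOLOGY = [
    ("television", "Is_a", "electronic media"),
    ("television", "Has_a", "screen"),
    ("snacking", "Occurs_with", "television viewing"),
    ("sleep loss", "Result_of", "late-night viewing"),
]


def ancestors(concept, taxonomy):
    """Walk up the hierarchy from a concept and collect its broader terms."""
    path = []
    while concept in taxonomy:
        concept = taxonomy[concept]
        path.append(concept)
    return path


def related(concept, relation, ontology):
    """Return every object linked to `concept` by the named relationship."""
    return [obj for subj, rel, obj in ontology
            if subj == concept and rel == relation]


print(ancestors("television", TAXONOMY))
print(related("television", "Has_a", ONTOLOGY))
```

The taxonomy can answer only "what is broader than this?", while the ontology can answer a different question for each relationship type.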

Pictorially, an ontology looks like this.  [Diagram not reproduced here.]

Why bother making relationships?  Because relationships connect concepts rather than words, they make it possible to build a search engine that understands the core concepts of a query regardless of how it was phrased.  In fact, the search term may not even appear in a document, but a semantic search engine will still be able to find that document, provided the correct concepts have been identified.
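
As a rough illustration of matching concepts rather than words, the sketch below compares a literal keyword match with a lookup that first maps words to concepts.  The word list, the documents, and the function names are all invented for this example; a real semantic search engine would be far more elaborate.

```python
# Hypothetical mapping from surface words to a concept label.
WORD_TO_CONCEPT = {
    "tv": "television",
    "television": "television",
    "telly": "television",
    "kids": "child",
    "children": "child",
    "youngsters": "child",
}

# Hypothetical documents.
DOCS = {
    "doc1": "Children watch too much television",
    "doc2": "Youngsters and the telly",
    "doc3": "Radio listening habits of adults",
}


def tokens(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def keyword_search(term, docs):
    """Literal match: documents that contain the exact term."""
    return sorted(d for d, text in docs.items() if term.lower() in tokens(text))


def concepts(text):
    """Set of concepts whose known surface words appear in the text."""
    return {WORD_TO_CONCEPT[w] for w in tokens(text) if w in WORD_TO_CONCEPT}


def concept_search(query, docs):
    """Documents that cover every concept found in the query."""
    wanted = concepts(query)
    if not wanted:
        return []
    return sorted(d for d, text in docs.items() if wanted <= concepts(text))


print(keyword_search("television", DOCS))
print(concept_search("kids tv", DOCS))
```

Here the query "kids tv" shares no word with either matching document, yet both are found because the query and the documents resolve to the same concepts.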

King said that Boolean searching will retrieve every occurrence of the search term, whereas semantic searching provides quick, relevant answers and can make up for errors in cataloging and searching.  Researchers should therefore use a combination of both methods.

Joe Tragert of EBSCO Information Services described how EBSCO is using Latent Semantic Indexing (LSI) to enhance searching of structured and unstructured content.  Benefits of LSI include:

  • Blind search.  You cannot know all the terms in the database.  LSI allows queries to find overlooked information that may not be in the database.
  • Categorization.  Users can train the search engine using examples, and LSI can then automatically assign the terms to incoming documents.
  • Relationship discovery.  Subtle relationships that are deliberately or accidentally obscured can be found.

Similar concepts can be found by automatic association, even when the word is not in the document.  Any type of text can be searched:  free form, documents, text that has been cut and pasted from a Web site, etc. Keyword searches do not have this capability.  LSI is an extremely powerful technique that will be included in future search engines.
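
For readers curious about the mechanics, the following is a minimal textbook-style sketch of LSI using NumPy: build a term-document matrix, keep the top `k` singular values, and compare a query with documents in the reduced space.  It is not EBSCO's implementation; the tiny corpus, the choice of `k=2`, and the function names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical four-document corpus.
DOCS = [
    "car engine repair",
    "automobile engine repair",
    "cake recipe sugar flour",
    "bread recipe flour",
]


def term_document_matrix(docs):
    """Count matrix with one row per term and one column per document."""
    vocab = sorted({w for d in docs for w in d.split()})
    row = {w: i for i, w in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            matrix[row[w], j] += 1.0
    return vocab, matrix


def lsi_scores(query, docs, k=2):
    """Cosine similarity between the query and each document in a k-dimensional space."""
    vocab, a = term_document_matrix(docs)
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    u_k, s_k = u[:, :k], s[:k]
    docs_k = vt[:k, :].T                     # one row per document
    q = np.array([float(query.split().count(w)) for w in vocab])
    q_k = (q @ u_k) / s_k                    # fold the query into the reduced space
    q_norm = np.linalg.norm(q_k)
    if q_norm == 0.0:                        # query shares no terms with the corpus
        return np.zeros(len(docs))
    doc_norms = np.linalg.norm(docs_k, axis=1)
    return (docs_k @ q_k) / (doc_norms * q_norm)


scores = lsi_scores("car", DOCS)
for doc, score in zip(DOCS, scores):
    print(f"{score:6.3f}  {doc}")
```

In this toy corpus the second document never mentions "car", yet it should score about as high as the first because both share "engine" and "repair"; the two baking documents should score near zero.  A plain keyword match on "car" would return only the first document.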

Don Hawkins
Columnist, Information Today

One Response to “Finding the Concept, Not Just the Word”

  1. Brandy King, June 10, 2007 at 10:51 am

    Glad you found our session interesting. If you’d like to access our slides, they are available here:
    Brandy King, Center on Media and Child Health at Children’s Hospital Boston, Harvard University: see presentation slides at http://sla.dsoc.googlepages.com/King2007SLA-DSOC.pdf
    Joe Tragert, EBSCO: see presentation slides at http://sla.dsoc.googlepages.com/Tragert2007LSI.pdf