Archive | Charleston Conference

Hidden Collections in the 21st Century Research Library

Mark Dimunation

Half of the archival collections in the U.S. have no online presence at all, yet the appetite for digital collections is insatiable. Only 44% of all archival finding aids are online, and born-digital materials are under-collected and under-described. In response to this problem, Mark Dimunation, Chief of the Rare Book and Special Collections Division at the Library of Congress, said that we need to rethink how we process our collections and make them available to our constituencies. There are many hidden collections in libraries, caused primarily by processing backlogs. Some of the problems that have occurred are:

  • We have tended to use a project approach to a system problem, viewing hidden collections as a fixed problem.
  • The nature of descriptive data is changing regularly. We have accepted the erosion of legacy data.
  • We repeat tasks over and over, forcing new types of data into a system designed to accept something else.
  • In our optimistic rush to solve the problem, we may have obscured a new understanding of it. We need to make access happen.

In 2003, a conference on hidden collections concluded that:

  • We must think globally and not tread over familiar ground in circles. We must make efforts to know what we have and report it to others.
  • There is a need to build, assess, and report viable models for processing, cataloging, etc.
  • We do not need to invent another core standard, but rather agree on technically simple standards.
  • We need to explore and embrace the collection-level record. Greater success can be achieved by cooperative ventures. Promote a national backlog project.
  • Funds should be reallocated and committed.

Hidden collections were targeted by a Mellon-funded grant program in 2008. The program focuses on materials of wide scholarly interest and value; 47 projects at 17 institutions have received funding.

Today, many backlogs have decreased, but the number of hidden collections has grown. We need to move away from a model of special approaches and make these materials part of the overall workflow, perhaps considering digitization of the material prior to processing it.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

My thanks to Carol Tenopir, University of Tennessee, for her contributions to this posting.


Coming of Age: Strategic Directions for Digital Repositories


David Nicholas

This presentation reported on a project on digital repositories undertaken by David Nicholas and Ian Rowlands of CIBER Research as part of the Charleston Observatory program.  Repositories are now about 10 years old; this study set out to understand what library directors and researchers see as the goals of digital repositories, to identify the critical success factors behind repositories, and to assess their wider impact.


Ian Rowlands

Ian Rowlands reported on a survey of 153 library directors drawn from the 2,126 repositories listed in OpenDOAR.  73% of the responding libraries have a repository, 21% are in the planning stage, and 6% have no plans for one.  Most libraries fund their repositories out of general library budgets, and most repositories are maintained by only one or two staff members.

Although nearly 80% of the repositories contain journal articles and conference papers, they hold a huge variety of material types.  The libraries offer services to help users deposit their material, create metadata, and obtain copyright clearance.  When asked about the main advantages of maintaining a repository, respondents mentioned many, among them long-term preservation, access to “grey literature”, and better services to students and researchers.  Several libraries mentioned that repositories are helping to change the library culture.  Disadvantages of repositories included confusion from different versions of the same material, access fragmentation, and a general increase in the complexity of the information landscape.  One respondent said that sloppy repositories are harmful because they lower the standard for scholarly communication.

Library directors regard repositories as good vehicles to make the literature more openly available, preserve and curate information, and as first steps toward becoming a digital publisher.  Repositories are no longer only about open access; they have become a valuable part of a large system that includes publishers, societies, etc.

Important issues include promotion of the repository, and motivating people to not only access its material but also to contribute material.  The majority of survey respondents felt that the importance of repositories will increase in the future.

A complete set of slides describing the survey and its results is available on the CIBER Research website.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor


The Vicky Speck Award

The Vicky Speck ABC-CLIO Leadership Award is given every year to a leader who has made a lasting contribution to the mission of the Charleston Conference.  This year’s winner was Glenda Alvin, Assistant Director for Collection Management and Administration at the Brown-Daniel Library, Tennessee State University, Nashville, TN.

Glenda Alvin (L) Receives the Vicky Speck Leadership Award

Congratulations Glenda!

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Data Papers in the Network Era

MacKenzie Smith

MacKenzie Smith, Director of Research at the MIT Libraries, made a good case for the creation and publication of “data papers”: formal publications whose primary purpose is to expose and describe data, as opposed to analyzing it and drawing conclusions from it.  Data sharing is important because:

  • Many institutions and granting organizations, such as NIH and NSF, now require researchers to share their data, and require a “data management plan” as part of every grant application.
  • The underlying desire of researchers is to make their results reproducible. Most of them do not object to sharing their data, but doing so is difficult and labor-intensive. Researchers’ concerns include losing control over their data, confidentiality, privacy, and intellectual property rights.  Lack of credit for sharing data and the lack of a supporting infrastructure stop researchers from investing the effort to share their data.

In this context, “Data” means the scientific data underlying research. One key property of this type of data is that it would be prohibitively expensive or difficult to reproduce, as with time-based sampling for example. The data can be very costly to collect in the first place. Data can be in many forms, and may exist in a proprietary format from a specific instrument. It cannot be neatly packaged like a book, and the distinction between data and software is becoming quite blurred.

Interpreting the data must be part of the current research workflow. Reusable data is structured, versioned, and documented; formatted for long-term access; archived; findable and citable; and either legally unrestricted or covered by a clear usage policy.  A “data paper” is one way to help overcome these limitations. Data papers are like regular journal articles, but they describe the data itself. Recent forms of data papers support downloads from the web. NISO is developing a standard for supplemental data in journal articles.

A data publishing infrastructure must be web-based, so to achieve interoperability we must look at linked data. The web requires identifiers, in the form of URIs, but data papers need more types of identifiers, some of which have been proposed by ORCID, I2 (Institutional Identifiers), DataCite, and CrossRef. We need identifiers for people, institutions, and datasets and their subsets.
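As a rough sketch of what URI-style identifiers for datasets and their subsets might look like, the snippet below builds resolvable URIs from DOIs via the doi.org proxy. The DOI suffixes are hypothetical (10.5072 is a reserved DOI test prefix), and the helper function is illustrative only, not part of any of the services named above.

```python
# Hypothetical sketch: URI-style identifiers for a dataset and a
# subset of it, in the spirit of DataCite-registered DOIs.
# The 10.5072 prefix is a reserved DOI test prefix; the suffixes
# are made up for illustration.
def dataset_uri(doi: str) -> str:
    """Turn a DOI into a resolvable URI via the doi.org proxy."""
    return f"https://doi.org/{doi}"

parent = dataset_uri("10.5072/example-survey-2011")
subset = dataset_uri("10.5072/example-survey-2011.table3")
print(parent)
print(subset)
```

The point of the sketch is simply that the same identifier scheme can name both a whole dataset and a citable piece of it, which is what data papers need.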

Another aspect of the data publishing infrastructure is visualization. Data browsers will be key to the success of data papers, since ordinary web browsers cannot display linked data. The Exhibit browser, developed at MIT, is one example of a data browser. All data can be converted into linked data and viewed with a data browser. Ontologies are necessary but not always available; we need a registry of ontologies or schemas.

Who will do all this work to allow formal data publication on the web? Many players are involved; researchers are at the center as they always are, and their role will not change. They will need to be tapped for many of the peer review and validation functions.

Players in data publication

Publishers (and societies) can produce data journals and acquire data deposits to support the data papers. They can organize peer review and quality control as they always have, recognizing that data has a very different intellectual framework than an article (it cannot be copyrighted, for example). New mechanisms for a sustainable business model will therefore need to be developed.

Libraries are exploring data curation and ontology creation; these are excellent roles for libraries. Some organizations require libraries to sign off on any grant application, verifying that it includes a good data management plan.

We need technology companies to provide the tools for managing the data and developing uses for it.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor


Keynote Address: Semantic Web for Libraries and Publishers

Michael Keller

Michael Keller, University Librarian at Stanford University, gave an excellent review of and argument for linked data.  He said that the problem with information today is that there are too many silos, too many search engines, in too many places and looking different from one another.

Too Many Silos

We do not make it easy for users to see what is held locally and what is easy to obtain.  We give them better interfaces and show them other tools, all of them good, and suggest that our clients use them to broaden their search.  Yet we routinely do not refer clients to the web, and our OPACs and other services point to only a tiny fraction of the pages available on the web.  We therefore confuse our readership.

Some of us provide our readers with far too many secondary databases, in some cases over 1,000!  Selecting a database to search is something of an art; we have made it very difficult.  These databases expose only a minuscule portion of what is available, and our users have to search through them on their own.

A Plethora of Resources

Our users, authors, teachers, students, and consumers, need us to help them find a better way.  Ideally, librarians and publishers will make facts about what we have discoverable on the web.  But:

  • There are too many stovepipe systems.  The landscape of discovery and access is a shambles and cannot be mapped in any logical way.
  • There is too little precision with inadequate recall.  Most of this problem lies in limitations in the design and execution of the infrastructure that supports discovery and access.
  • We are too far removed from the Web. Together, our metadata collections make up a big chunk of the “dark web”, and it is clear that visibility on the web promotes dramatic increases in discovery and access.

Our working environment consists of consumers and users, publishers as intermediaries, and libraries as warehouses. Then there is the Internet, and the web of pages of information. The web of data is the next big thing to empower individuals. The next phase is the Linked Data phase, leading to the semantic web, where the machines will understand meaning and act on it.

The recipe for constructing the linked data environment includes identifying the entities embedded in our knowledge resources, tying them together with named connections, publishing the relationships as crawlable links on the web, and then building and using apps that support discovery by the web of data.  Here is a simple example.

Simple Linked Data Network

RDF triples are a way to describe objects and ideas on the web; URIs allow machine interaction among Web objects. But our metadata standards are closed. Passive metadata is searchable by word, but it sits in silos; the search results are refinable, but final, with no way to go beyond them. In contrast, semantic metadata is open, dynamic, interactive and responsive, and leads to other queries and other views.
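To make the triple idea concrete, here is a minimal sketch (plain Python, no RDF library) that serializes one subject-predicate-object statement in N-Triples syntax. The example.org URIs are invented for illustration; only the Dublin Core "creator" property URI is a real vocabulary term, and real linked-data work would use a proper RDF toolkit.

```python
# One RDF triple: (subject, predicate, object), each a URI.
# The example.org URIs are placeholders invented for this sketch;
# http://purl.org/dc/terms/creator is the Dublin Core "creator" property.
def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of URIs as an N-Triples statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."

stmt = to_ntriples(
    "http://example.org/book/moby-dick",
    "http://purl.org/dc/terms/creator",
    "http://example.org/person/herman-melville",
)
print(stmt)
```

Statements like this, published as crawlable links, are what let machines follow the named connections between entities that the recipe above describes.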

Many publishers and societies have begun to make use of linked data. They aggregate content in their own realms and beyond, and provide actionable, constantly updated links and compelling services that tie users to them. Here are a few publishers that have adopted the semantic web.

Organizations embracing the semantic web

For publishers, as for libraries, content is king, but if users cannot find it, there is a problem. Publishers must make their content visible; aggregation is very important.

The Linked Open Data Value Proposition from the Stanford/CLIR Linked Data Workshop held in June 2011 is encouraging. Google is using Stanford’s bibliographic facts and Web resources to create linked data pages using the Freebase Open Data Browser.

On Monday of this week, LC announced a bibliographic framework for the digital age.

We are headed toward the semantic web, where ubiquitous computing and mobility are essential.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Charleston Conference Opens

Here are some pictures taken just prior to the opening of the Charleston Conference.  Attendance this year has set a record–1,458, with about 400 first-time attendees, necessitating the use of 3(!) overflow rooms for the keynotes and an additional hotel for concurrent session meeting rooms.  Now there are 4!  And with nearly 200 sessions to choose from, attendees have some serious decisions to make!  (My policy, though perhaps a little limiting, is to only go to sessions in the Francis Marion Hotel because of the time it takes to get from one to another in the brief time between sessions.)

Attendees registering for the conference...

... and picking up their conference bags.

Leah Hinds (L) and Beth Bernhardt (R), who take excellent care of the behind-the-scenes work

Anthony Watkinson, moderator of the plenary sessions, and Katina Strauch, conference organizer

An overflow crowd gathers for the opening session

I hope you enjoy the conference, and if you can’t be here, be sure to watch this blog for updates.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Charleston Conference 2011 Opens with Vendor Showcase

The 31st Charleston Conference opened today with the traditional Vendor Showcase.  Here are a few scenes of that event.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor


As always, the food table was a big draw.


Coming of Age: The Charleston Observatory 2011

Where are institutional repositories going and why?  Find out at the upcoming Charleston Conference, November 2-5, where the results of a survey on this subject will be presented.  Here is an e-mail that I received with the details (thanks to the organizers for permission to reproduce it).

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Institutional and subject repositories have become an important feature of the information landscape and have arguably ‘come of age’. CIBER, with the support of the Charleston Library Conference, is carrying out a study to find out where repositories are going next and why.

We are asking library directors and researchers worldwide what they think by means of online surveys and focus groups. The aim is to find answers to questions such as:

  •  What are the priorities and goals for repository managers?
  •  What are the critical factors behind a successful repository?
  •  What impact are they having?

 We would very much welcome your views as a library professional!

Please click here to take part in the survey, or paste into your browser.

The survey should take no more than 25 minutes of your time.  The survey will close on Tuesday, October 25.

 You may wish to print out the survey before you start filling it in online because we ask a few questions about budgets that you may not have at your fingertips.  Please click here to download a PDF for reference – but please do not try to fill it in!

 You will not be asked for your name.  However, you will be invited to leave an email contact address if you wish to be entered into a prize draw for an iPhone G3.  You will not be contacted again for any other purpose.

The findings of the study will be presented at the XXXI Charleston Library Conference and will be published widely in the professional and scholarly press, so we can guarantee that your response will add to the debate and that your voice will be heard.

Please feel free to contact me if you have any questions.

Ian Rowlands
CIBER Research Limited



Charleston Conference Wrapup

I just received this e-mail with some final details about the Charleston Conference:

The Windup

And so another Charleston Conference came to a close.  Attendance surpassed all previous records–about 1,350.  You can view a timeline with all the photos shown at the opening session and high quality photos of the posters commemorating past conferences here.

Many of the speakers’ presentations will be available on Slideshare, and summaries will be published in Against the Grain in its next few issues.

The 31st Charleston Conference will be on November 3-5, 2011, preceded by the Vendor Showcase on November 2.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor