Archive | Charleston Conference 2011

Charleston Conference Wrapup

 

View of Cooper River Bridge, Charleston, SC

The 2011 Charleston Conference is over now.  As always, it was a highly worthwhile event.  This year’s attendance was about 1,450, an all-time record.  Here is a summary of some of the major points that came up in the sessions that I attended.

  • Michael Keller’s excellent keynote addressed the problem of information silos and how we can make it easier for our users to find the information they need.  Linked data, which identifies entities embedded in knowledge resources, ties them together with named connections, and publishes the relationships as crawlable links on the web, may be one solution (a minimal sketch of the idea follows this list).
  • With the increasing availability of large datasets, handling data has become a significant problem.  The concept of the “data paper”, a formal publication whose primary purpose is to expose and describe data rather than analyze it and draw conclusions from it, will help researchers share their data and make it accessible, and will also help them comply with granting organizations’ requirements that a “data management plan” be part of every application.
  • Digital repositories continue to be important, but there is considerable variation in their uses and the types of material they contain.  Repositories are no longer only about open access; they have become a valuable part of a large system that includes publishers, societies, etc.  Motivating researchers to contribute their work is a major issue.
  • One cannot go to a conference related to libraries without hearing lots about e-books, and this conference was no exception.  In an academic library’s collection, a few high-use titles tend to dominate the usage statistics, and a large number fall into the Long Tail.  A platform allowing e-books and other materials (such as journals) to be searched together is appealing.
  • The final plenary session on new directions in open research was outstanding.  The problems in today’s scholarly communication are not economic; they include scale, access, speed, and communication.  Seven platforms facilitating open research have emerged in the last 12 to 18 months; many are open source and offer an API for sharing.
  • An interesting report on the Digital Public Library of America (DPLA) provided a status update and summarized the operational plans for the coming 18 months.  A steering committee has been formed to provide guidance.  At launch, the DPLA will be a distributed system of basic materials.  It will collaborate with a similar effort in Europe and will respect copyright.  Whenever possible, free and open source code will be used.  Metadata will be freely available.
  • “The Long Arm of the Law” was a panel on current legal and copyright issues in our industry.  The doctrine of Fair Use is widely used as a justification for copying, but it is less well known that the current law places significant limitations on it.  “First sale” limitations do not apply to works produced outside the U.S., and an important consideration is whether the planned use of the material will be “transformative”, that is, whether the use will change its original purpose into something new and different.
  • Fallout from the Google Books case continues.  The settlement was recently rejected by the Court because it created rights for Google that could reduce the ability of current and future competitors to enter the market.  Negotiations are continuing.
  • Discovery systems have become prominent, but they are not a panacea.  Students still must be extensively trained to search and do research, as one university professor’s experience recently showed.  Despite detailed instructions and demonstrations of the Summon system, many students had significant problems locating a known article and finding other related articles.  Discovery systems conceal the variety in conducting research and move novice searchers away from the characteristics and context of the underlying resources.  The conclusion from this experience is that all tools used by the current generation of students require specialized instruction; without it, even smart students will struggle to use tools that may seem intuitive to many of us.
  • The closing plenary, “The Status Quo Has Got To Go!”, by Brad Eden, Dean of Library Services, Valparaiso University, was a stirring challenge to all academic librarians.  He listed some of the current problems we face, such as the disengagement of states from funding higher education, dramatic changes in information dissemination as a result of the Google book settlement, the rise of social media, and space and people issues.  He challenged the audience to embrace social media and talk the way our users talk.  The current publishing model is unsustainable, and we need to be fully aware of authors’ rights.  He urged us to stop keeping our data in expensive proprietary systems.  The entire staff must be aware of the organization’s strategic direction, think like administrators, and work as a team.  A report written for university provosts (those who fund libraries) provides excellent direction for moving libraries into the future.
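To make the linked-data point from Keller’s keynote concrete, here is a minimal sketch of the idea (my own illustration, not from the keynote): two catalog entities get URIs, are tied together with a named connection, and link out to a shared identifier so the relationships can be published as crawlable statements. It uses Python’s rdflib library, and all of the URIs are invented placeholders.

```python
# Minimal linked-data sketch: identify entities with URIs, connect them with
# named relationships, and serialize the result for publication on the web.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, OWL

g = Graph()

book = URIRef("http://example.org/catalog/record/123")    # a local catalog record (placeholder)
author = URIRef("http://example.org/catalog/agent/456")   # a local authority entry (placeholder)

g.add((book, DCTERMS.title, Literal("A Sample Monograph")))
g.add((book, DCTERMS.creator, author))                 # named connection: creator
g.add((author, OWL.sameAs,                             # link out to a shared identifier
       URIRef("http://viaf.org/viaf/000000000")))      # placeholder VIAF-style URI

# Serialized as Turtle and published on the web, these statements become crawlable links.
print(g.serialize(format="turtle"))
```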
Dates for the 2012 Charleston Conference are November 7-10.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

End User Tools for Evaluating Scholarly Content

 

Carol Anne Meyer

Carol Anne Meyer from CrossRef presented an update of her SSP 2011 presentation.  After a Harvard researcher was accused of wrongdoing, his research paper was retracted by the publisher, but it remained on his website for several months.  And it still can be retrieved on Google Scholar.  It is also on PubMed (with a note about the retraction).  There are multiple channels to access an article, which is a problem for users.

Users are not always clear which version of a document they are reading.  Librarians do not have time to track changes after publication, so users might cite incorrect versions of articles.  It is very tempting to go to the most accessible version of something.  Researchers complain that Google does not retrieve the correct versions of articles.

How should we communicate article corrections?  Sometimes journal editors issue an “Expression of Concern” or add a link to the corrections on the article’s web page.

But sometimes these links are difficult to find, at the bottom of the page for example.  And the original PDF does not contain the corrections, so if the researcher downloads the original, the correction is lost.  CrossRef has begun to address this problem by putting a mark (CrossMark) on records that identify publisher-maintained content (see my posting linked above).  A pilot test currently underway has shown that users do not know what the mark means, so CrossRef has added mouse-over information and a label explaining it.  The mark can even be shown on PDF documents.  The CrossMark service will be available in 2012.  CrossRef is working with some search vendors to label articles in search results.
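For readers who want to check a DOI programmatically, the sketch below (my own illustration, not part of CrossMark itself) queries the public CrossRef REST API for works that update a given article, such as retraction or correction notices. The “updates” filter and the “update-to” field are assumptions based on CrossRef’s documented API, and the DOI shown is a placeholder.

```python
# Hedged sketch: look up retraction/correction notices for a DOI via api.crossref.org.
import requests

def find_updates(doi: str):
    """Return works (e.g., retraction or correction notices) that update `doi`."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"filter": f"updates:{doi}", "rows": 20},  # 'updates' filter assumed from CrossRef docs
        timeout=30,
    )
    resp.raise_for_status()
    notices = resp.json()["message"]["items"]
    return [
        {
            "notice_doi": item.get("DOI"),
            # each notice lists the records it updates and the type of update (e.g., "retraction")
            "types": [u.get("type") for u in item.get("update-to", [])],
        }
        for item in notices
    ]

if __name__ == "__main__":
    print(find_updates("10.1000/placeholder-doi"))  # placeholder DOI
```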

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

The Status Quo Has Got To Go!

 

Brad Eden

Brad Eden, Dean of Library Services, Valparaiso University, closed the plenary sessions with a stirring challenge to the audience. His address, entitled “The Status Quo Has Got to Go”, was extremely information-rich, with a plethora of slides full of tiny print. They will shortly be available on the conference website and should be required reading for every academic library professional (those in other areas of librarianship will also find much that is useful in them).

Eden began by saying that he wanted to make people uncomfortable, and it was clear that he succeeded. He noted that we need to be uncomfortable because the world is changing, and we will become obsolete if we are comfortable. We are no longer technical services, public services, or collection development people; we are a team and we must work together as a team! We must be fully informed about the strategies of our organizations; it is part of our job.

It is useful to view the overall picture of the library from the administrator’s perspective. It is all about politics! Some of the major issues are:

  • States are disengaging themselves from higher education because nobody wants to pay more taxes. This trend will not change, only accelerate.
  • The Google book settlement will dramatically change how we disseminate information. Google wants to be another Elsevier. It will eventually be the largest item in library budgets.
  • We need to get involved in social networking and talk the way our users talk.
  • Space and people–distributed print repositories are the big topic. State governments will not build more buildings for storage space, so we are getting into regional repositories. It costs $4.26 per year to keep a book on the shelf. High-density storage cuts this cost to 86 cents–a huge saving. So why are we still putting books on our shelves? We need to get beyond this! (A quick cost comparison follows this list.)
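Taking Eden’s figures at face value, the arithmetic is simple; the sketch below just multiplies the per-volume difference by a hypothetical collection size (the collection size is my assumption, not a figure from the talk).

```python
# Back-of-envelope comparison using the cited figures:
# $4.26/volume/year on open shelves vs. $0.86/volume/year in high-density storage.
open_shelf_cost = 4.26     # dollars per volume per year (cited)
high_density_cost = 0.86   # dollars per volume per year (cited)
volumes = 1_000_000        # hypothetical collection size

annual_savings = volumes * (open_shelf_cost - high_density_cost)
print(f"Savings per volume per year: ${open_shelf_cost - high_density_cost:.2f}")
print(f"Annual savings for {volumes:,} volumes: ${annual_savings:,.0f}")
# -> $3.40 per volume, about $3,400,000 per year for a million-volume collection
```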

Here are some things that must occur:

  • Move the organization into the digital mentality and the digital level of collections. Dramatic change must happen.
  • Move from the local to the network level in collaboration, metadata, and resource sharing.
  • Move towards open access and scholarly communication.

The current publishing model, where we give away our research for free and then buy it back at exorbitant prices, is unsustainable. We need to start talking about digital preservation. Everyone needs to understand author rights and inform and educate the faculty.

We must shift from information literacy to media literacy and understand the power and capabilities of our devices.

People are not coming to us because we are not talking their language. In today’s 3-dimensional world of visualization, why are we serving up information in old-fashioned one-dimensional text? (See the Theban Mapping Project for an example of how to learn in 3D.)

Technical services and libraries have been a microcosm of bad PR and a lack of marketing. We must show people what we do. A recent OCLC report debunks the 80/20 myth; ARL statistics show a 50 to 60% decline in reference transactions since 1995. We are buying a lot of content that nobody is using, and people do not know what we do. Why do we continue to buy things?

Major issues for today’s library administrators include:

  • If you are still doing local copy cataloging, you are wasting your organization’s money! Stop paying a vendor to display your data in a proprietary system.
  • Users no longer think of the library or its OPAC as the first option for obtaining information; they are usually the last option, if an option at all.
  • If and when the current economic situation improves, library staffing will never return to its former levels. So be a leader by freeing up staff time. Move things around for cost-benefits. Train your current staff (you won’t get any more!) to move into the digital resources environment.
  • Get with it! Examine open cloud solutions. Why are we still keeping our data in and using proprietary systems? STOP!! Get out of them. It is uncomfortable but it is the right thing to do. Vendors are giving us parties at ALA conferences with our own money! You are part of the problem! Get smart–this is ridiculous!!
  • The next item on the chopping block will be reference services. Are you going out to the users? There is nothing more that can be cut out of back-room processing costs.

A report for provosts on academic libraries, entitled “Redefining the Academic Library”, was published by the University Leadership Council in 2011.  (Provosts are those who control the funding of academic libraries.) The report is in 2 sections: 1. Transformational Change in the Information Landscape, and 2. Managing the Migration to Digital Information Services. This is an excellent report that tells provosts how to move the library into the future. We need to help them make the right decisions. Which is more important: high profits for commercial publishers, or jobs for academic librarians?

We must think and act like an informed library activist or employee (i.e. library administrator). If we do not work as a team, we will all sink together. Find out what your university’s strategic plan is; if you are not on board, you are not on the ship. Develop goals and targets. Don’t play it safe–this fosters mediocrity which leads to decay. Leave plenty of room to take risks. There is no substitute for your talent; understand, value, and develop it.

Here are things we should stop.

  • Checking in print serials.
  • Binding print journals.
  • Maintaining serials records locally instead of centrally across the campus.
  • Local customization of bibliographic records.
  • Having a staff member distribute records for loading into local OPACs.
  • Preparing full records for everything instead of “good enough” ones.
  • Having separate local ILSs.

And here are some things we should do.

  • Spend time on collections that are uncataloged or undescribed.
  • Share responsibility for cataloging backlogs.
  • Redeploy staff to description and organization of digital resources instead of print.
  • Do all bibliographic work at the network rather than the local level.
  • Consider the life cycle of all resources and formats.

What will kill our profession is a lack of imagination. We need to be bridge builders and global thinkers. A helpful resource is the Self-Improvement Newsletter, and a new book, The Challenge of Library Management, is also excellent.

This summary only scratches the surface of Eden’s presentation, which was one of the highlights of the conference. Again, I highly recommend reading the slides.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Discovery Systems: We Still Need to Train Searchers to Become Researchers

Craig Brians (L) and Bruce Pencek (R)

Craig Brians, Professor of Political Science, and Bruce Pencek, Library Trainer, both from Virginia Tech, presented a fascinating report on their experiences with Serials Solutions’ Summon discovery service.  Two years ago, Bruce had predicted that there would be some problems with Summon, and he was right.  This presentation was a report on Craig’s experiences with it.  He told his 400 students to use Summon. (Even in a 300-student class, he assigns research projects.) Despite extensive training, many students did not find things in an optimal fashion.

“Introduction to US Government” is a basic freshman-level class with about 300 students from many departments outside political science. “Political Communication” is an advanced upper-level class with about 100 students. None of the students had previous exposure to Summon. Their level of research experience was unknown.

Students were assigned an article to find and read. A full citation was provided, and they were instructed to locate the article using Summon, read it, and summarize the key ideas. Then from the known article, they were to find 4 related articles. In great (one might say “almost excruciating”!) detail, Bruce taught the students how to find key ideas, how to search on “subject terms” in Summon, and how to log in to the Library’s website from off campus. (Summon is prominently displayed on the library’s homepage.)  They were shown how to copy and paste the article title into the Summon search box. The article should have been the first one displayed. Then they were told to use the article keywords to find related articles.  Nearly half a class period was spent on this exercise.

Many students had trouble reading the assigned article. The most common question they asked was whether they could Google Summon. Another common question was whether they had to pay for the article (even though they were told never to pay for anything). There were also challenges with off-campus sign-on and broken resolver links.  Even though the students were dismissive of the lesson when it was shown in class, they had trouble finding related articles, deciding what to put into the Summon search box, explaining how the articles were related, and reading a scholarly article.

Why do students struggle with Summon? You would have thought it would have been very easy, especially given the information that was provided to them. Searching today is different from 15 years ago. Searchers are physically away from the library, and there are no cues from the co-location of various resources in stacks. Subject headings from various sources are marginalized.

Summon appears to simplify literature research for students; however, discovery systems conceal the variety and messiness of conducting research. And when you are moved away from the messiness, you tend to forget how to be a researcher. Being frustrated is an important part of the research process; it is how you find your way through the roadblocks.

Discovery tools move novice researchers farther away from the characteristics of the underlying resources, and students do not differentiate sources; for example, the content distinctions between blogs and scholarly articles. In years past, news and editorial content were often confused with one another. Summon’s left side format distinctions may assume too much knowledge by the searcher.

The conclusions of this experience are that all tools used by the current generation of students require specialized instruction. Without this pedagogical effort, even smart students struggle to use tools that may seem intuitive to many of us. Discovery systems break down disciplinary silos, but they also burn down disciplinary scaffolding.

Tools do not substitute for instruction. Bruce and Craig recommend that students receive guided instruction and more hands-on practice, which requires class time for research instruction and assistance by librarians. The more complex the research tool gets, the more instruction is needed. To help teach Summon, they developed new learning methods utilizing clicker questions and a screen-shot tutorial.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

The Long Arm of the Law

(L-R) Ann Okerson, Bill Hannay, Lauren Schoenthaler, Jack Bernard

Ann Okerson, Sr. Advisor, Center for Research Libraries, moderated a panel on current legal issues and cases in our industry.  Panelists were Jack Bernard, Associate General Counsel, University of Michigan; William Hannay, Schiff Hardin LLP; and Lauren Schoenthaler, Sr. University Counsel, Stanford University.  Ann began by noting that lawyers discovered libraries as early as 1973.  The 1990s were a rich legal environment, and the present century has seen much action.  She then listed some of the more notable cases.

In the international arena, IFLA has an outstanding legal committee.

Jack Bernard began the discussion by reviewing U.S. copyright law and what we believe about copyright.  In a survey, only about 3% of the public knew that the purpose of copyright law, according to the U.S. Constitution, is to advance and promote progress.  We must always keep in mind the balance among all the players and focus on the work of authorship. The moment a work is authored, the author receives 5 amazing exclusive rights: reproduction, preparation of derivative works, distribution, public performance, and public display of the work.

Sometimes there is a question about who the copyright holder is.  The default is the author, but it could also be an employer or an independent contractor.  Holders are very anxious to protect their rights because they are very valuable. Authors can only authorize specific things subject to Sections 107-122 of the act. There is lots of confusion, which leads people to erroneous conclusions about copyright. You are not infringing copyright if you are the owner, you have express permission from the owner, the work is in the public domain, or your use falls within a specific statutory limitation covered in Sections 107-122 of the Copyright Act.  For example, Section 109 supersedes Section 106, so you are allowed to sell your copy of a book.

It is important to recognize the “first sale” limitations, which control the first instance in which a work enters the marketplace.  These limitations do not cover works produced outside the U.S., a point at issue in a well-known case in which a student from Thailand imported copies of textbooks published by a U.S. publisher but manufactured outside the U.S.  However, the term “manufactured” as used in the Copyright Act is ambiguous, which has led to appeals of the various court decisions in the case, and it is still not resolved.  In particular, how this will affect library lending of books is unclear.  The case has other implications as well: it encourages foreign manufacture, invites mass unwitting infringement, runs counter to common sense, and protects some markets while being anti-competitive in others.

Lauren Schoenthaler noted that in 1976, when the present copyright law was enacted, we had no digital technology or instant licensing, and 1976 is not the best judge of 2011.  She reviewed the Georgia State University litigation involving allegations of mass unauthorized copying by faculty creating electronic reserves.  There is no decision yet in this case, but we need to watch it closely. Any decision will be temporary because there will certainly be an appeal.

In analyzing Fair Use, we must consider several factors, the most important of which is whether the use will be “transformative” and will change its original purpose into something new and different.  We must use caution and recognize that:

  • Licensing terms trump copyright,
  • Privacy concerns of students persist long after they graduate, and
  • Documents can be “hacked” (make sure the copy was lawfully acquired).

Bill Hannay reviewed the status of the Google book case.  He began by noting that we are entering into a problematic phase of libraries. What if there were competition in electronic libraries of the future?  Competition is good for our country. Congress originally feared concentration of economic power not only on economic grounds but also because of its threat to democratic values.

The Google Book Case is not resolved and could have unpredictable negative impacts. The court rejected the settlement because of competition issues. It was a private arrangement about the way people would do business in the future, and it created rights for Google going forward that could reduce the ability of current and future competitors to enter the market.  It is not really a dispute about actions in the past. Google and the plaintiffs wanted a settlement outside the laws.  One objector to the settlement said that “Google pursued its copyright project in calculated disregard of authors’ rights.”

In the 7 months since the settlement was rejected, Google and the plaintiffs have continued to negotiate but have not reached any agreement.

There doesn’t seem to be any way to resolve this: at conferences, nobody agreed with anybody else. If we wait long enough, things may work out on their own.  The Internet Archive is digitizing over 1000 books a day, and the Hathi Trust includes over 6 million volumes.

Hannay concluded that more competition is better than less competition because it spurs lower prices and higher quality of service.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

 

I Hear the Train A Comin’

Greg Tannenbaum conducted his annual live version of his “I Hear the Train A Comin'” column in Against the Grain.  His guests this year were Anne Kenney, Cornell University Librarian; and Kevin Guthrie, Founder of JSTOR and ITHAKA.

(L-R) Greg Tannenbaum, Anne Kenney, Kevin Guthrie

Here is an edited transcript of their conversation:

GT: What are the biggest challenges facing the library community over the next 2-5 years? What has to give?

AK: Our materials budgets show us a path forward. The percent going to e-content has doubled from 30% in 2004 to 60% now. We have been surprised by the diversity of our holdings in the past; moving forward we will see more homogeneity. But our organizational arrangements are still heavily oriented towards physical services. Something must give as more and more electronic services become available.

KG: The notion of libraries and publishers as adversaries is not appropriate. It is really about the author and reader and a system to serve them. The allegiance to the intermediary structures must give because much restructuring is happening. It is very hard for the existing actors to give up on how they do things to allow freedom of reinvestment.

AK: The archive is moving towards being seen as a public good worthy of public support. We are moving toward a model of providing support from around the world.

KG: There is downward pressure on pricing of content. The question is what value are you adding on to that. Everybody in the space between the author and reader must figure out how they can contribute value.

GT: What aspect of the vendor-institutional relationship do publishers misunderstand?

AK: We tend to think that publishers see us as a sales channel, with less understanding of our mediation goals of providing access and preservation. There is a stronger relationship between the library and the reader than might be expected.

GT: Same question for libraries?

KG: Librarians go into that profession because they are not looking to go into business, so there is a challenge in understanding the business aspects. When you build scale, you build huge costs. Value is more challenging. How does a library value the materials it is getting? It is very difficult to measure value.

AK: We want the same rights for electronic materials as for physical ones, for example, e-lending of materials. Libraries see this as publishers trying to curb their traditional roles. The hidden environment of seeking information wherever it is found is justification for that. In the music environment, performers are using concerts for income. We have no similar process in libraries.

KG: There is a tendency to stay with the status quo. The concept of owning something has changed. Publishers sold books to libraries and they were loaned, but there was friction in that. All the former players will not necessarily survive to the new world. Collaboration between those helping the authors write and those helping them distribute is important.

GT: Circumstances and media have changed, which allows us an opportunity to revisit how we are operating.

AK: We are moving beyond the silos of publishers as well as the silos of libraries. Preservation is an area where publishers and libraries need to do much more work as we move towards licenses and not owning material. We are at real risk of losing access. We need mechanisms in place to preserve content. Publishers’ activities are insufficient so far.

KG: There is a tremendous amount to do with e-journals. New formats are coming and it is important that we are investing in those solutions.

GT: What should we take away from current legal cases?

AK: We need to understand the issues associated with what is appropriate for digital access to material. Libraries are in the business of respecting agreements, contracts, and rights. If we do not do that, we will lose our sense of trust, which we will not let go lightly. We must respect privacy of use.

KG: The old adage that “nothing in newspapers and blogs is true” is indeed true! It is amazing how far from the truth some of the published articles are. We have benefitted greatly as a country from the respect for the rule of law. If we do not like the law, we advocate changes, and that is a great thing. Libraries have been good stewards of their responsibilities, and we must respect that.

GT: What game changer will we be talking about in Charleston in 2014?

KG: Books in electronic form are the big game changer of the present moment. We do not have broad access to books yet; when we do, everything will change. The Google Books initiative signaled that it was possible to digitize 15 million books. What if everything in a library is available electronically? That changes the way we operate. It is still not here, though; those books are not yet as widely available as people want them to be.

AK: There are many different game changers. How do we manage the long continuum of scholarly communication? The outcome of the Google settlement is a main game changer, as is the development of the Hathi Trust. Over 60 institutions and consortia are participating, and with 10 million volumes, it is in the company of the elite of ALA libraries. Now we have the ability to search across the content of all those volumes. We are moving towards new forms of reading, where we can mine information in new ways. Researchers will still look at physical books as we move toward orphan works becoming available. How do we as a community keep things lightweight, work together, and not diminish the role of the individual institution but enhance it? The future will be in more pre- and post-collaborative activity.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

Executive Roundtable: The Boundaries are Getting Blurred

(L-R) T. Scott Plutchak, University of Alabama; Paul Courant, University of Michigan; H. Frederick Dylla, American Institute of Physics

The Friday plenary session opened with an executive roundtable discussion.  The panel consisted of 2 people in charge of large organizations who spent many years in research and who represent one audience that librarians want to serve.  They reported on a Scholarly Publishing Roundtable, which met to discuss issues and find common ground, then issued a report, which is on the AAU website. The issues are complicated and need to be balanced. Access alone is not worth very much.

Here is an edited transcript of the conversation:

PC: The conclusions of the Scholarly Publishing Roundtable will be useful in defining policies around public availability of publicly funded work. There is a remarkable heterogeneity in the ways things come to be published, and we must recognize that or we will be in serious trouble.

FD: For 3-1/2 centuries, publishers and librarians worked together. In the last 10 years, they had a fractious debate. The Roundtable changed the tone of the debate on public access. The President’s Science Office released some overriding principles to be followed and requested input on publishing practices.

SP: The “version of record” issue is a concern. The author’s final manuscript may be useful for immediate needs, but may not suffice as time goes on and revisions are made.

PC: That issue is complicated now as we produce everything electronically. You can lose the version of record easily in an unstable world. In the humanities, there is no version of record because of the continuing integration of multimedia, updates, etc. Versions of articles never become stable. What is the library’s role in this environment? The library and publisher begin to look very similar. Do we want to preserve the entire record? Do we want just a sample of it?

SP: This is particularly important in the healthcare field because lawyers often want a particular edition for a legal case. Traditional publishing in physics and similar disciplines is stable. Why?

FD: It goes back to the mimeograph machine, when early versions were sent around asking for comments. Now we fax documents to others. The physics community is well knit and has a half century of experience with collaboration. Some papers have hundreds of authors, so peer review is done internally.

PC: Economics is like physics and has always had a similar preprint culture of circulating papers before they are reviewed. We should not believe that a model that works in one world will work in all worlds.

SP: Science and scholarship are becoming both more siloed and more interdisciplinary.

PC: The silo to the next piece of work is extremely important, but it may be more useful when it jumps over into another discipline.

FD: PLoS ONE is an interesting example. By forming an interdisciplinary and wide open journal, it has become the largest journal in the world. Many of us just use Google and go right to the abstract, without needing the indexing and other things on top of the article. We as publishers must be working on accurate discovery tools to help users locate articles.

SP: PLoS ONE is the first real game changer in publishing because it has shifted the process of peer review. One wonders what will happen to the rest of the journal space.

PC: I expect vertical alliances of journals. BEPress has set up various categories of journals, but articles are only reviewed once and then the journal where the article will be published is selected.

FD: This is just another corner of the publishing ecosystem. The diversity of publishing is one thing I admire.

SP: Findability is the thing that worries me most about PLoS ONE. There is so much of interest that is being published that the challenge is not to separate the interesting from the uninteresting but to find the really important things among all the interesting ones.

FD: Our most important customers are the authors and readers. Everybody else serves them.

SP: The use of social networking may help researchers broaden their circle of colleagues beyond what they are aware of.

FD: Collexis, now part of Elsevier, set up something similar for biomedical researchers. We have UniPHY for physics. It is not too successful yet, but it is a good start.

SP: We are at least a generation or two removed from a true digital culture that parallels today’s print culture. The technical challenges are very solvable, but the economic and legal issues are much more difficult.

PC: The technology is actually quite good; we have become very good at transmitting large data files. But doing so requires payments and a set of arrangements different from anything we have seen.

FD: Costs and benefits must be a very important part of the equation. Someone must pay for the infrastructure.

SP: You both seem very optimistic about where we are going. Is that accurate? Where are the bright spots?

FD: I think the diverse group at the Roundtable showed the way for us to work through these problems. We all agreed on a set of principles for scholarly publishing.

PC: It’s now very inexpensive to copy and distribute work. That’s very good because new things that were formerly unimaginable are now possible.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

The Digital Public Library of America (DPLA)

 

Robert Darnton and Rachel Frick

In a session on the Digital Public Library of America (DPLA), Robert Darnton, Director of the Harvard University Library, likened the concept to Thomas Jefferson’s observation that often the use of something does not diminish its value. For example, using one candle to light another spreads light and does not diminish the value of the first candle. This idea acquired a 21st-century luster with the spread of the Internet: the use of information does not diminish its value. A public good benefits the entire citizenry, and one citizen’s benefit does not diminish another’s; it is not a zero-sum game. However, in considering these concepts, we must not lose sight of the fact that the acquisition of knowledge as a public good is not without cost. (Someone had to purchase Jefferson’s candle!)

Moving on to the present, Darnton noted that the DPLA is an opportunity to realize the Enlightenment ideals and goals upon which our country was founded. Google tried to establish a major digital library and demonstrated that today’s technology could be used to create a new kind of library which, in principle, could contain all the books in existence. But Darnton observed that Google Book Search is an example of a good idea gone bad because of copyright problems and the Authors Guild’s allegations of infringement. Google did not pursue a legal case which (had it won) would have provided a significant public benefit; instead it chose a commercial approach and negotiated a settlement with the Guild. The settlement was rejected by a Federal Court. So the time has come to create a digital library that makes our cultural heritage available to the entire world.

In 2010, a steering committee was formed to provide guidance. Working groups were set up and produced the outlines of a master plan, which was presented to the public last week.

Here are some of Darnton’s thoughts about features of the DPLA:

1. The DPLA will be a distributed system aggregating collections from research libraries and institutions. It will not be a single database but will consist primarily of books in the public domain from the Hathi Trust, the Internet Archive, and digitized collections made by large libraries independently of Google. Government sources are also rich: all 50 states have digitized their major newspapers and given them to the Library of Congress, and these can be given to the DPLA. Because of copyright law, most current literature will not be in the DPLA. The DPLA’s mission should be defined to make its service distinct from that of public libraries; Darnton suggests that the DPLA exclude anything published in the last 5-10 years.

2. At its launch in April 2013, the DPLA will probably contain a basic stock and will grow as funding permits. It will be interoperable with major digital libraries in other countries (it has already made an agreement to cooperate with Europeana). The example of Europeana suggests the bare minimum funding needed to get the DPLA going: Europeana estimates its operating costs at 5 million euros per year. Brewster Kahle of the Internet Archive estimates a cost of 30 cents to digitize a page, or $300 million to digitize the contents of a large library, but others think the cost is nearer to $1 per page. The DPLA will grow in accordance with its budget, which nobody knows yet. If a coalition contributed $100M per year, a great library could be created in a decade.
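The figures Darnton cited imply roughly a billion pages for a large library; the back-of-envelope sketch below just reproduces that arithmetic. The page count is an illustrative assumption derived from the $300 million figure, not a number given in the session.

```python
# Rough reconstruction of the digitization arithmetic: $300M at $0.30/page
# implies about 1 billion pages; at $1.00/page the same collection costs ~$1B.
pages = 1_000_000_000  # assumed: ~1 billion pages (e.g., ~3.3M volumes x ~300 pages each)

for cost_per_page in (0.30, 1.00):
    total = pages * cost_per_page
    print(f"At ${cost_per_page:.2f}/page: ${total / 1e6:,.0f} million")
# -> $300 million at 30 cents/page, $1,000 million (about $1 billion) at $1/page
```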

3. The DPLA must respect copyright. The first copyright law struck a balance between authors and publishers by providing limitations on the term of copyright. The current limit tips the balance toward private commercial interests. Every book published since 1923 is now covered by copyright, regardless of whether it has been renewed, and many owners are unknown which has led to the orphan works problem. The DPLA could try to reach an agreement between authors and publishers of books that have gone out of print.

4. The DPLA steering committee established a contest to develop a technical infrastructure. The technical subcommittee will develop a draft prototype to go into operation when the DPLA is launched.

5. A governance committee has only begun to study the administrative issues of the DPLA. The present interim leadership at Harvard will continue until the final DPLA comes into existence. The DPLA will serve a very broad and diverse community and is meant to serve the entire country, so it probably will not be housed at any elitist institution. Most people think it should not be part of the Federal Government, to keep it free from political pressures.

Rachel Frick, Director of the Digital Library Federation, summarized the operational plans of the DPLA for the next 18 months.

  • Where possible, existing free or open source code will be used.
  • The DPLA will be freely accessible for others to port or replicate.
  • Metadata is the core of the discovery framework. The DPLA will aggregate existing library data and operate in a global data environment. All metadata will be freely available except where it would violate personal privacy. (A generic harvesting sketch follows this list.)
  • Content will incorporate all formats, not just books. It will begin with already digitized works in the public domain. It will grow with orphan works.
  • Tools and services such as APIs will provide enhanced uses of the content. The platform will be open to public innovation and enable the creation of new tools and services. It will share an interoperable data model and source code with Europeana.
  • The community component will be a participatory platform that supports users and developers who wish to reuse content and metadata.
  • A discovery layer will provide access to secondary sources.
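The session did not describe the DPLA’s ingest mechanics, but metadata aggregation on this scale is commonly done with OAI-PMH harvesting of Dublin Core records. The sketch below is a generic, hypothetical illustration of that pattern, not the DPLA’s actual code; the repository endpoint is a placeholder.

```python
# Generic OAI-PMH harvesting sketch: fetch one page of Dublin Core records
# from a repository and pull out titles and identifiers for aggregation.
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"   # OAI-PMH namespace
DC = "{http://purl.org/dc/elements/1.1/}"        # Dublin Core namespace

def harvest_records(base_url: str, limit: int = 10):
    """Return (title, identifier) pairs from the first page of ListRecords."""
    resp = requests.get(
        base_url,
        params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
        timeout=60,
    )
    resp.raise_for_status()
    root = ET.fromstring(resp.content)

    records = []
    for rec in root.iter(f"{OAI}record"):
        title = rec.find(f".//{DC}title")
        ident = rec.find(f".//{DC}identifier")
        records.append({
            "title": title.text if title is not None else None,
            "identifier": ident.text if ident is not None else None,
        })
    return records[:limit]

# Example call (placeholder endpoint):
# print(harvest_records("https://repository.example.edu/oai"))
```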

More information is available on the DPLA website.

The DPLA beta sprint was an aggregation of metadata from 1,400+ collections in 44 states. Visit dpla.granger.illinois.edu to see it. 60 organizations submitted letters of intent, and several were chosen to demonstrate their systems at the DPLA plenary conference.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

My thanks to Carol Tenopir, University of Tennessee, for her contributions to this post.

New Directions in Open Research

 

Clifford Lynch (L) and Lee Dirks (R)

Clifford Lynch, Director of the Coalition for Networked Information (CNI), and Lee Dirks from Microsoft Research gave a wonderful presentation in the final plenary session on the first day of the Charleston Conference.

Lynch began by enumerating some serious problems with the present system of scholarly communication in science.  These are not economic problems. They include:

  • Scale: scientific publishing is getting bigger and bigger–a scientific paper is published every 1 or 2 minutes.
  • Speed: We are under constant pressure to make research and discovery move faster. Achieving speed often comes at high cost. We have huge problems with filtering and validating work.
  • Access: One of the hopeful possibilities for getting a handle on these problems is doing computation on the literature. But getting access to enough data is difficult because it is in silos.
  • Communication: There is a growing disconnect between practices and norms in scholarly work and how communication is operating. An excellent book about this is The Fourth Paradigm, which is available for free download.

More and more science is data- and computation-intensive and relies on communication among geographically dispersed people. Some systems are starting to address this; myExperiment, for example, lets researchers make their data accessible for sharing.

We must get past designing articles with the same old presentation of science, where there are major issues of reproducibility, building on other people’s work, and recognition of data as a primary input and output of scientific inquiry. We need to manage data and integrate it into the traditional scholarly literature.

Lee Dirks from Microsoft Research followed and enumerated 7 platforms for open research that have started to emerge in the last 12 to 18 months.  They facilitate collaborative research within academia, particularly scholarly communication. Many of them are open source and have an API for sharing.  Here are Lee’s slides describing each one (I thank him for providing me with these copies and giving me permission to post them here).

 

 

 

 


These platforms all integrate into the Scholarly communication life cycle, as shown here.

Scholarly communication life cycle


Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

Value of E-Book Collections Purchased from a Large Publisher

 

Jennifer Bazeley and Aaron Shrimplin

Aaron Shrimplin and Jennifer Bazeley from Miami University Libraries (the Miami in Ohio, not the one in Florida) have evaluated their relationship with e-books, conducting a survey of 735 users’ attitudes toward e-books. Respondents fell into 4 classes:

  • Book lovers have an inherent affinity for printed books.
  • Technophiles are interested in the possibilities of new technology for reading books.
  • Pragmatists see the pros and cons of both print and electronic forms of books.
  • Printers prefer print books and have specific difficulties with the usability  or readability of e-books.

The library is planning to ramp up its e-book collections, but there are many issues and more questions than answers.  So a preliminary study was conducted, focused on the 2008 Springer e-book collection and its use over a 3-year period.

The Springer collection is divided into 12 subject collections.  The e-books have no DRM, and the owner has perpetual access–an attractive consideration.   E-books and journals are on the same site and are searchable together.  At Miami, the collection can be accessed through the OhioLINK electronic book center (EBC) or directly through Springer’s site.  The study compiled usage of 2,529 e-books published from 2008 to 2010 on both platforms.  Only 23% of the titles had been used, and if this trend continues, 54% of Miami’s e-books will be unused after 6 years.  Usage followed the well-known 80/20 rule (Pareto Principle): 20% of the used titles accounted for 80% of the downloads.  The Long Tail effect was also observed: of the infrequently used titles, about half had 3 uses or fewer.  A few high-use titles dominated the statistics; the most-used title accounted for 28% of the total uses over 3 years.  Professional books, monographs, and especially textbooks accounted for the most usage.  Computer science was the most heavily used subject area.  Past usage predicted future usage; trends observed in 2008 continued in 2009 and 2010.
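As a rough illustration of the kind of analysis described above, the sketch below computes the share of downloads accounted for by the top 20% of used titles from a list of per-title counts. The sample numbers are invented; actual figures would come from the platforms’ usage reports.

```python
# Pareto-style check: what fraction of downloads comes from the top 20% of used titles?
def pareto_share(downloads, top_fraction=0.20):
    """Share of total downloads contributed by the top `top_fraction` of used titles."""
    used = sorted((d for d in downloads if d > 0), reverse=True)  # ignore unused titles
    if not used:
        return 0.0
    top_n = max(1, int(len(used) * top_fraction))
    return sum(used[:top_n]) / sum(used)

sample = [120, 45, 30, 9, 6, 3, 3, 2, 1, 1, 0, 0, 0]  # invented per-title download counts
print(f"Top 20% of used titles -> {pareto_share(sample):.0%} of downloads")
```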

Platform matters:  e-books that are cross-searchable with journals are appealing, especially to pragmatists and technophiles.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor