CIL 2011: Visualization Technologies

March 21, 2011

This session was presented by Emily Wheeler and Samara Omundson.

In 2009 the digital universe grew immensely.  If you picture a stack of DVDs reaching to the moon and back, that’s how much data growth we had.  By 2020 it is estimated that it will be 44 times the size it was in 2009.  We are drowning in data.  How can we make sense of it?  Apply structure to large quantities of data to help make sense of it.  Information professionals can lead the way through these piles of data, even if we’re not statistics junkies or graphic designers.  We know how to sift through vast quantities of data and pull out those few salient data points.  Data conveys a clear message, cuts through the chaos, and helps to engage and inform stakeholders.

One strategy for information visualization is using topic clusters.  As an example, they searched for “Bieber Fever” with a general search engine.  They displayed results in a hierarchical format using a single PowerPoint slide.  Another visualization was a branching choice – almost a spider web of information circulating out from a central point.  Another strategy is using time series visualizations.  These can be line graphs or bar graphs of a particular data point’s change over time.  You can highlight an intersection in searches using Search Associations.  You can do this in spreadsheet tools, but they used a great tool called TouchGraph to create some really nice relational graphics.

How do you handle text analysis differently?  Keyword frequency is very useful to identify repeated keywords – a simple word count provides this data point.  Creating a bar graph quickly with the number of mentions of various words can show the relative importance or permeation of various words and ideas.  You can use Tagxedo to create good keyword clouds.  By adding word association to simple keyword frequency you can see relationships between words and concepts.  Using different colors, sizes, and boldness of visualization elements can communicate the relative importance quickly.  Structural data, like keyword associations, focus on word order.  This helps you drill down into the context of a given word.  They used IBM’s ManyEyes tool to create a really nice looking structural chart.  Looking at social media data, Twitter and Facebook activity and followers, can tell a really compelling story about how social interaction and popularity relate to frequency of posting, where you post, etc.  They built a few visualizations in Adobe Illustrator.  Visuals tell a story and show patterns.  They touched on infographics quickly.  An infographic is a visual representation of information, but most are designed to tell a visual story of pretty complex data.  You see these in large media outlets in articles and in lead stories.  It is really easy to transform data – try it out!
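If you want to try the keyword frequency idea without any special tools, here is a minimal Python sketch of a simple word count plus a rough text-only bar graph.  This is my own illustration, not something the presenters showed: the sample text, the stopword list, and the function names (keyword_counts, text_bar_chart) are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical sample text; swap in your own search results or articles.
SAMPLE = (
    "Data visualization helps people read data. "
    "Good charts make data clear, and clear charts tell a story."
)

# A tiny, assumed stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def keyword_counts(text, stopwords=STOPWORDS):
    """Count how often each non-stopword appears, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


def text_bar_chart(counts, top=5, width=30):
    """Return one line per keyword, with a bar scaled to the top count."""
    most = counts.most_common(top)
    if not most:
        return []
    peak = most[0][1]
    lines = []
    for word, n in most:
        bar = "#" * max(1, round(n / peak * width))
        lines.append(f"{word:<14} {bar} {n}")
    return lines


for line in text_bar_chart(keyword_counts(SAMPLE)):
    print(line)
```

The same counts could be pasted into a spreadsheet or a word cloud tool; the point is just that the raw data point behind those visuals is a plain word count.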

Tips and tricks: Know your message, stay simple, and experiment with data visualizations whenever you can.

CIL 2011: Search: Quick Tips for Adding Value

March 21, 2011

This panel had a whopping 5 presenters in 60 minutes.  Wow!

Ran Hock: Many real-time search engines cease to exist just as quickly as they were created.  Bing Social Search is an interesting experiment with real-time search.  Google has several real-time projects in its databases.  Google’s wildcard words feature lets you search for words within phrases.  You can use an asterisk as a placeholder for an unknown or variable word.  You can use multiple asterisks in one search as well.  Google does some really good stuff with automatic stemming and synonyms.  But sometimes those terms are unrelated to your goal.  To get just price as in “gary price” add a plus sign: +price (just price, no stemming).  You can also precede a word with a tilde to get more synonyms.  Google Books full text indexing of the Full View Books is great.  There is a “read on your device” link that provides a mobile-friendly version of the books.  Google Language Tools include Search Across Languages, Translate Text, Translate a Web Page, and the Google interface in over 120 languages.  Google has calculators and controls.  Ask sometimes works.

Gary Price: Web Cite lets you take any URL, add it to the service, and it creates a permanent archive of the page. *nice!*   One of the great tools for PCs is called Website Watcher.  This lets you see any webpage on any website and tracks every single minuscule change.  Change Detection lets you track changes too (and works on Macs) but only checks once a day.  Fuse Labs is a Microsoft labs service.  Microsoft Academic provides a lot of scholarly information you can’t find elsewhere.  Not only do you get a citation, but you get links to others who are citing that paper as well. (pretty sweet)  Pinboard has been referred to as “ on steroids.”  You can bookmark and tag things, but also have it automatically bookmark and tag anything you Tweet with a link in it.  There’s a mobile version too.  Journal TOCS comes from the UK and is a service that provides tables of contents for free, focusing on open access publications primarily right now.  Topsy is one real-time search company that is doing well – it creates an archive of Tweets.  The archive goes back 3 or 4 years right now.  Three more: BASE, Issue Map, and Many Eyes (no time to describe, but go look at them!)

Marcy Phelps: Marcy discussed adding value to your search results.  Her presentation is at  In an age of diminishing resources, researchers need to surface their value and think: can you be replaced?  What can we do that Google, Watson, and other search tools cannot?  Information professionals are uniquely qualified to add the kind of analysis that adds value.  We can make comparisons, look at patterns, chunk content together, prove or disprove hypotheses, and answer that bottom-line question: so what?  We have to listen to our customers.  What would be valuable to them?  Would it help to have this in a certain format?  Once we ask those questions we need to shut up and listen.  We can create research products that are helpful for others, like Issues Tracker or Know Before You Go.   Here’s how to add value.  Add a table of contents.  Add an executive summary (one page, bulleted).  Add a cover memo listing the purpose of the report, methods used, and any issues raised.  Then in 25 words or less report your findings.  Add quick article summaries to the report.  Add meaning to boring numbers – add charts and infographics.  Building a dashboard with some pretty charts – just do it in Word.  Try different views of information.  Don’t give interview by interview summaries, summarize all the answers to one question in one spot.  You can also add a matrix of data, a timeline, whatever makes sense for what you’re presenting.  Also, use specialized tools to help you do your work.  Use Google Trends, pre-formatted profiles, data mining, and fee-based sources that can get you analyzed data immediately.  Consider new formats – try PowerPoint, an in-person or phone presentation, or create a video.  Finally, create your value-added toolbox…use Word Styles, a chart gallery, templates, and with your branding.

Natasha Bergson-Michelson: Her job day and night is to teach people how to search.  She talked about simple tricks like doing filetype searches in Google.  These types of tips are awesome, but our users don’t remember them.  She says someone recommended the following to her: imagine your perfect source before you start searching.  So she started teaching this method to her students.  The first part is that if you’re using a search engine, imagine the answer and not the question.  Use the search terms and phrases that would appear in the answer.  This is the big thing…just stop to think before searching.  Use quotation marks for phrase searching as well.  You can search for dates in Google too – e.g. “1995..2010” the .. looks for every number within the range (nice!).   She gave a great tip for finding books by color – do a Google Image search for the topic or title of the books, e.g. “Rosa Parks” and then go into the lower left corner and limit by color (pink) and then voila, you get possibly relevant book covers.

Tamas Doskocz: What is semantic search?  A search, a question, or an action that produces meaningful results even when the retrieved items contain none of the query terms or the search involves no query text at all.  Semantic search is “what is possible with today’s technologies for search.”  Google recipes search is one example of an attempt at this.  Link people, algorithms, the social web, information, machine understandable and processable forms, etc.  There are a number of semantic search engines that focus on different disciplines.  These specialized engines do a better job with that type of data.  An example might be HealthMash, a system that is driven by consumer health knowledge bases and performs semantic searches quite successfully.

Greg Notess presented this popular session.  We are pretty much down to Google and Bing.  Yahoo is being powered by Bing.  Ask is contracting its database out to some yet unnamed company, and has been focusing on its Q&A technologies.  Cuil is gone.  Smaller search engines like Blekko, Exalead, and Gigablast are out there but nothing is at the level, size, and scope of Google and Bing.

Death of search?  We’ve seen the behavior of searchers change over the years.  Content farming is having a detrimental effect on the accuracy and clarity of search engine results.  There is a huge economic side to this (advertising).  eHow and Wikipedia don’t have bad information necessarily, but you need to be cautious.  So who qualifies as content farmers?  Allexperts, ChaCha, Answerbag, Mahalo, eHow, 123people, FixYa, Seed, ShopWiki, and more.  Associated Content was purchased for $100 million by Yahoo.  AOL started up Seed and then bought the Huffington Post for $300 million.  Demand Media had an IPO with a valuation of over $1 billion.

What are the content farmers writing about?  When we start to recognize content farm materials, know that they were very quickly created by the writers…which definitely affects quality.  There is also a lot of screen-scraping happening – near-duplicate copies of original content on aggregator websites.

Google has had some major changes recently.  They launched the Panda Update, which was an attempt to target the content farm sites.  This has changed more than 11% of the results through Google.  How well did it work?  Many domains lost ranking: Associated Content and others.  But eHow, one of the most egregious examples of a content farm, actually gained a little bit of traction in results.  Hmmm…

Google blocking is useful: choosing to block all results from a particular domain in your search results.  But do you really benefit from this?  Sites change their content completely, which could happen on some of these content farm sites.  3 or 4 years from now will you remember which sites you blocked?

The little stars that let you “favorite/bookmark” a site in search results are now gone in Google.

The big change in this last year with Google is the sidebar.  There is a list of only a few select databases (everything, images, video, etc.) but you need to click on More to see a full list, something easily missed.  One of the new databases is “Recipes” but be aware that these only show sites that have used special Google mark-up language to be included.

Google has been working hard to have better date information about their search results (when information was first posted).  It’s an easy way to limit results to only recent resources.

There are some other options in the sidebar, like “Social” which requires you to be logged in and to have a Google profile set up.

Greg also showed the many additional options in the Advanced Search page.

Google calls the “sponsored results” “Ads” now.  Yay!  Clear language.

Greg talked about Google Instant as well…a technology that saves Google processor time and money, but not really a benefit for the user, says Greg.  With Google Instant on you only get 5 recommended search terms as you type in the search box, whereas without Instant you get 10.   Chrome now has in the Omni box (the URL bar) the ability to do instant search too.

Google Preview gives you the little magnifying glass next to the search result.  This is really a copy of something Bing was doing.  You can’t turn them on or off.  Most people in our session room dislike them.

Google Encrypted Search – if you are at a firewalled location, you can encrypt your searches so that your internet service provider can’t see what you’re searching.  Only Google can (muah ha ha ha ha).

What features did we lose from Google?  Search wiki, the little bookmark stars.  The top toolbar changed a bit.

Social searching that shows up in Google is a lot of different sites, but not Facebook.

With Bing, if you have Facebook Connect, you can see that information in your results. Blekko shows pages and sites that have been liked by friends on Facebook.

If you have a large Facebook network, this is useful.  If not, then not so much.

Bing still has the cached page link (though it moved).  It also allows you to share search results through Twitter, Facebook, etc.   They have scholarly searching as well that pulls in Microsoft Academic Search data.  The image search pulls in particularly different search results than Google Image Search does.

Greg recommends looking at your search engine preferences, including the ability to see what types of ads are being served up to you and why.  You can opt out of this so that you’re not profiled for ads.

Blekko is interesting – Greg suggests trying out /liberal and /conservative.  He also recommends looking at Qwiki.

Mary Ellen Bates started her session talking about Google’s search operator “AROUND (#)” (yes, use all caps and put in a number for the distance between the two words, e.g. cats AROUND (5) toys).

Google Books has done data mining and through the Ngram viewer you can compare word usage over time, phrases as well.  Example: comparing the usage of kindergarten, child care, and nursery school.

Google is coping with content farms.  So there’s a nice trick to block crappy sites.  The option to block a domain shows up underneath each search result on the personal search results page.  If you click on that, then every time you do a search on Google while logged in that site’s results are blocked.  They support Firefox, Chrome, and Internet Explorer (they say) but it only seems to work in Chrome, says Mary Ellen. Fishy, eh?

Using Wikipedia’s “concepts related to…” list can be helpful in pointing you to related topics, especially when beginning your research.  She points out that it’s a way of getting a bigger sense of the ecology of the information environment and finding that hidden disruptor on the periphery.

Yahoo does demographics, sort of.  Yahoo Clues shows queries by age, gender, what they searched before and after the search in question.

A nice feature in Bing is the NEAR operator between two words (autism near:5 vaccination).  Bing also lets you limit a search to sites linked to from a specific URL: trials. provides good disambiguation.  As soon as you type in your query it shows you all of the alternative uses of the word.  The page also live-loads at the bottom, so you don’t have to click to get to the next page.  And they don’t track your search results.

Blekko is a site that Mary Ellen likes.  Blekko blocks spam and content farms like eHow and allexperts.  The search results are therefore cleaner.  It also offers specialized slash tags – e.g. “/likes” which only returns pages your Facebook friends have liked (requires Facebook Connect).  /relevance does a relevance sort and /date does a date sort.  /rank gives you some additional information on why the sites are ranked the way they are.  This could be super-useful for SEO if you’re trying to raise your library’s page rank for certain searches. [is awesome!] and lets you scroll back through time and skim through the life of a website.

Mary Ellen plugged Yelp as well.  The reviews have much, much useful information on local businesses. has been around for a decade, and has been rejuvenated lately.  It’s an aggregation of what the author considers the best places to go for particular types of information.  If you’re trying to educate your users that there is life beyond Google, this is a good starting point for them.

When is good enough good enough?  She asks us: would you rather be perfect or successful?  How much is this information worth? provides a good social search.  You see aggregated and filtered results.  They associate positive and negative words with your search terms as well to rate how well the word/brand is rated.

Topsy is also a social search engine and lets you search for hashtags.  You can limit it by time and by date.  It’s searching pages that were linked to from Tweets.  Tweeps is a good way of getting senses of who follows the people you follow, and other social relationships in your extended circle.  Mary Ellen emphasized the importance of social search resources.

James Crawford, Engineering Director for Google Books, was set to present this morning’s keynote.  However, Mr. Crawford chose to fly in on the red-eye, which was of course delayed, and he didn’t make it here for his talk.  This provides a good lesson for all speakers – don’t plan to fly in a few hours before you’re supposed to speak.

Instead, Information Today quickly and fearlessly put together a panel of experts to discuss the topic: Roy Tennant, Dick Kaser, Stephen Abram, and Marshall Breeding.   I want to give kudos to the panelists for giving a salient discussion.  Good job guys!

Marshall Breeding started by talking about the (until recently remote) idea of digitizing the world’s books.  Very few libraries have the ability to digitize and provide full text discovery.  Companies like Google are the only ones digging in to this.

Stephen Abram posited “What are the unintended consequences of this Google Books project?”  There are 15 million books online now, more than any other library except the Library of Congress.  Once we separate the entertainment group of materials from the answer/research group of materials, we have a dangerous bifurcation.  What’s the difference between the chapter of a scholarly work and an article?  Are we going to start aggregating chapters together?  Library catalogs cannot handle this kind of material – how do you describe a 12-chapter book with only 3 subject headings?  The free text aspects of searching Google are going to change the dynamics of the answer space.  And those books aren’t going to be a “book database,” but rather fully integrated with websites, video, articles, etc.

Dick Kaser compared this to the digitization of journals years ago.  We do this because we can – we have the storage space, the ability to digitize rapidly, and hey – Google has the money!  There was some controversy over those first libraries that signed on with Google.  The conventional library wisdom is to never trust a commercial vendor.  If Google sits on top of the vast amount of data and information, what’s then left for libraries in this space?  Perhaps libraries helping people digitize their own collections, digitizing rare local materials.  Maybe Google holds the books and the data, but perhaps libraries help out with how to search them effectively.

Years ago, Roy Tennant debated here at CIL that the Library of Congress would never be fully digitized.  He says he’s ready to eat his hat on that one.  He then brought the Internet Archive and the Hathi Trust into the conversation as well.

Stephen Abram responded that you need to question why you digitize books.   Is it perhaps to be able to put ads into books?  The President of Demand Media said it would be brilliant to digitize books as a way to gather data in order to game the search engine results.   How do you drive search results based on things that are already written?  How many people paid some of the billions of dollars in Google’s profits?  What are the consequences of a book database that serves up answers based on the needs of the advertisers – the people paying their bills?

Marshall Breeding noted that many libraries have very small collections, and having access to nearly countless book titles is very tempting.  The initial Google Books contracts with libraries did not give enough rights to libraries, and that has been a topic of a lot of conversations.  Subsequent partners for digitizing projects have asked for more after learning those early lessons.  Internet Archive hasn’t done the quantity of books that Google has, but the IA approaches the projects in a more library-friendly way.  The library has to pay a small fee for the digitization costs, but the business model on the output is certainly more library-friendly and rights-friendly.   How do we find the right deals that give library users the best deal?

Dick Kaser, in looking at the commercial side of digitization, has seen more publishers talking about the potential of digital books. eBook standards were a  key topic years ago, but now it just seems that approaching things in HTML5 is the easier and more expedient approach.  [and tee hee! Dick mentioned the eBook Bill of Rights that Andy Woodworth and I worked on!]

At some point in 2011, the U.S. Supreme Court will make some decision about the in-copyright books in Google Books.  ALA’s concerns are about one commercial entity having control over these books, and the ridiculous nature of the “one terminal per library” set-up with no printing or downloading rights.  Are we okay with Google having control over access to this information?  How are we going to handle this as libraries? How are we going to advocate for our users’ rights?

Roy Tennant then brought up the “26 checkout eBook rental” issue with HarperCollins (see #hcod on Twitter for more on this).  Dick said that the idea of lending an eBook is “disrupting publishers” because they’re so into tangible objects.  If they allow it to be loaned, they feel that they have lost control of their product.  Marshall responded that this brings up the problem of libraries’ automation systems.  These systems are not built to deal well with digital content, only physical items.  He thinks this is going to change.  What is the library’s role when everything is streaming?  When books are published digitally only, and not in print?  We are trying to figure out what a lending model for libraries can be.  There is a real struggle between what publishers are worried about and their feeling that libraries are in the way of that.  We figured out how to do it in an age of physical bookstores and we need to figure it out in this new environment as well.  Stephen says that the HarperCollins issue is one example of playing whack-a-mole.  What would we do without Sarah Palin’s book…omg!  Why are we only going after HarperCollins, and not after Simon & Schuster and Macmillan, who won’t let us loan eBooks at all?  If we don’t participate in the discussion, there is a danger that libraries won’t have a role in eBooks in the future.

Marshall and Roy then talked about the impact on research libraries.  How do you manage large digitized collections in large research libraries?  Does it mean you can ship more books to storage or maybe even get rid of a few?  Back-files of periodicals don’t exist in many research libraries anymore, and if they do it’s only in storage.   The working collections will likely become more limited and agile.  It provides new opportunities to think about what library spaces can be.

Stephen talked a bit about eReaders.  Do we want Jeff Bezos controlling the market?  Do we want Steve Jobs’s values system controlling what type of content is allowed on the market?  Because you control the patent on an eReader or have control over market share, should you have the right to disrupt the market and disallow content and information access?  He asked: How many of you would allow a single person to ban a book in your library?  Do we need to add a Banned eBook Week through ALA?  Did we let telephone handset manufacturers tell us what we could say on the phone?  Stephen then asked the question that has been riling me up for years: Why is the library profession so silent on an issue of such critical importance to the future of information?

The American Library Association elections are afoot, and as a member of ALA and LITA (Library and Information Technology Association) I wanted to chime in with a few recommendations of people whose work I know and trust.  Voting was supposed to open today (the elections website still says so) but so far there’s nothing up on ALA’s site yet.  Stay tuned though.

Please join me in voting for the following strong-minded and brilliant people:

ALA Council: Bobbi Newman, Holly Tomren, Wendy Stephens, Martin Garnar, Matthew Ciszek, and Kate Kosturski

LITA Vice President/President-Elect: Zoe Stewart-Marshall

LITA Board: David Lee King, Lauren Pressley, and John Blyberg

As the debates rage about digital content, publishers, consumers, and libraries, I was reminded of a piece on DRM I wrote years ago.  In June of 2007 I published an article in School Library Journal entitled “Imagine No Restrictions: Digital Rights Management.” The opening line of my article is “We dream of a world with free access to content.  In the meantime, there’s DRM.”  *sigh*  Sadly, that is still true.

I wrote the article as a primer for library staff on what DRM is and why it matters for libraries.  In re-reading that article from long before eReaders became popular, I was struck by how applicable the ideas still are today.  The article covers device compatibility, DRM as a roadblock to use, and the archival issues raised for libraries.  I also provide talking points for discussing DRM and library digital content with library users…something we all end up doing, often with a frowny face and a serious sense of guilt.

If you’ll indulge me, here are two passages from the article that seemed to bear repeating, in addition to my grumpy epithet for DRM: “Despicable Rights Meddling”:

If you buy a physical version of a song or movie, you are warned about the law, but generally trusted to follow it. If you buy a digital version, however, the DRM code forces compliance.


Libraries must be part of the solution here, not the problem. With our current econtent models, we’re coming down on the wrong side of this debate—not the side of content delivery, accessibility, and customer service, like we should. Publishing companies like Springer and BWI are offering ebooks free of DRM. This is the model we should be promoting and demanding from all vendors. Otherwise we will continue to limit content to a select group of our users, and that select group will continue to get smaller as DRM becomes increasingly restrictive.

This article was a reminder to me that we’ve been discussing this issue for many years.  Librarians have cared about access to digital content since digital content was invented.  We have worked to educate staff and customers.  We have asked for leadership from our professional organizations in legislating change or working with the Librarian of Congress to make effective changes to the Digital Millennium Copyright Act.

And still, we wait.  But now we’re mad, we’re organized, and we’re pushing for change from all directions…not waiting for approval or sanctions from above.  As a friend said to me last night, “Librarians are going all Egypt on this one.”   In response to both the HarperCollins eBook licensing changes and the eBook User’s Bill of Rights we’ve seen grassroots efforts from the masses, opinions from all sides, and social media organization and information dissemination.  My favorite is the Librarians Against DRM and Readers Against DRM graphics (designed by cartoonist and artist-in-residence Nina Paley).  We’ve also seen mass media coverage on a scale I haven’t seen since the PATRIOT ACT.  And it all started with opinionated librarians blogging, tweeting, and some great investigative reporting from Josh Hadro at Library Journal.

I am very happy that American Library Association is now moving forward with a game plan for advocacy, including the work of the ALA eBooks Taskforce I’m a part of.  In the next couple of weeks I expect we will see more on these issues — from ALA and from librarians directly.

I encourage you to inform yourself, inform your co-workers, and push for change.  Call and write to publishers, authors, and your larger library organizations like consortia or regional partnerships.  Your voice matters.  Your voice can indeed create change.  Continue the revolution.

On February 17th, I posted a link to a survey that a library school student was doing about book challenges and removals in libraries.  Here are the results.  Granted, the people who will take the time to fill out this survey are likely interested in book challenges already, so we may get a higher rate of incidents reported.  But the results are still disturbing — that challenges and removals of library materials are grossly under-reported to ALA.

Book Challenges and Removals Survey Results

Of the 73 respondents: 9.6% worked at academic libraries, 67.1% worked at public libraries, and 23.3% worked at school libraries.

How many incidents of material challenges have you received in the last five years?
160 challenges reported from 73 respondents (average of 2.19)
How many incidents of material challenges have you reported to the American Library Association Office of Intellectual Freedom in the last five years?
33 challenges reported to ALA (20.6% of the 160 actual challenges)
How many incidents of material removal (following the library’s challenged material policy) are you aware of in the last five years?
23 challenges resulted in removals per the library’s policy (14.4% of the 160 actual challenges)
How many incidents of material removal (bypassing or ignoring the library’s challenged material policy) are you aware of in the last five years?
56 incidents of material removal in violation of the library’s policy (over twice the number of “per policy” removals)
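For anyone who wants to check my math, the percentages above can be recomputed with a few lines of Python.  This is only a sketch using the counts reported in the survey results; the variable and function names are mine.

```python
# Counts copied from the survey results above.
RESPONDENTS = 73
CHALLENGES = 160
REPORTED_TO_ALA = 33
REMOVED_PER_POLICY = 23
REMOVED_OUTSIDE_POLICY = 56


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Average number of challenges per survey respondent.
average_per_respondent = CHALLENGES / RESPONDENTS

# Share of all reported challenges that were passed on to ALA.
share_reported = percent(REPORTED_TO_ALA, CHALLENGES)

# Share of challenges that ended in a removal that followed policy.
share_removed_per_policy = percent(REMOVED_PER_POLICY, CHALLENGES)

# How many out-of-policy removals there were for each per-policy removal.
outside_vs_per_policy = REMOVED_OUTSIDE_POLICY / REMOVED_PER_POLICY

print(f"Challenges per respondent: {average_per_respondent:.2f}")
print(f"Reported to ALA: {share_reported:.1f}%")
print(f"Removed per policy: {share_removed_per_policy:.1f}%")
print(f"Out-of-policy vs. per-policy removals: {outside_vs_per_policy:.2f}x")
```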
The primary thing that jumps out at me is the gross under-reporting of incidents to ALA–only 20%!  Is this because libraries don’t want the professional stigma associated with challenges (especially removals or book-relocations)?  Or is it simply bad record-keeping?  Or apathy?  I don’t know.  But if I was a powerhouse in the ALA Office of Intellectual Freedom, I’d be planning a way to figure out if and why this is happening in member libraries.
The other thing that bothers me is that twice as many items are removed ignoring or violating the library’s challenged materials policy as are removed in keeping with the policy.  This says to me that people either take it upon themselves to make a decision and just remove something (perhaps to placate an angry interest group or parent) -or- librarians are just too irritated by the formality of the policy (and again, perhaps the stigma) so they just remove the items the easy way by just de-accessioning them from the catalog.
What do you think?  Have you experienced, like I have, a “hiding” of materials challenges in your library?  Have you seen them go unreported to ALA?  Have you experienced customers or library staff removing items or moving items in the collection?  Tell me your stories!

Below is the text from a letter that my library, the San Rafael Public Library (CA), sent to HarperCollins regarding the change in library licensing for their titles in digital format.  I shamelessly stole words, phrases, and concepts from the many wonderful open letters that other libraries, librarians, authors, and book-lovers have made available on the web (thanks people!).  Likewise, feel free to shamelessly steal from us.

Some libraries are boycotting HarperCollins materials (print & digital), others are boycotting digital materials only, and still others are finding new creative ways to protest HarperCollins that their libraries are more comfortable with.

If your library has not yet sent a letter to HarperCollins regarding HC-gate, it’s definitely not too late.  The more individuals and libraries they hear from, the more the conversation continues.

Dear HarperCollins:

The San Rafael Public Library is strongly opposed to your new eBook licensing policy for libraries.

The policy you propose is contradictory to the spirit of libraries and damaging to the relationship libraries have long held with publishers. You are demanding that libraries rent eBooks from your company, but if the same title were to be purchased in paper copy we would still own it. This seems to encourage paper-only purchases from HarperCollins, something that would not help your company’s brand or financials.

The arbitrarily chosen “self-destruct” number of 26 circulations is not reflective of the reasoning you gave in your public statement. Paper copies, hardback or paperback, last much longer than a year and many books see much more than 26 total circulations in their first year alone. As written, your new policy seems greedy, especially considering the low cost involved in producing an eBook “copy” as compared to a paper copy.

The larger practice of digital content licensing is profoundly destructive to the burgeoning eBook market. Libraries help authors find audiences – and therefore help your bottom line. A 2007 study done by ALA shows that 40% of adults and 36% of youth purchased a book after checking the same title out from a library. Discarded library copies find new audiences with book sales, donations to schools, and more. In addition, libraries receive donations from individual consumers who purchased a book but are done using it. With the current model of licensing, consumers cannot donate eBooks. This removes one additional way for your authors to get more exposure – and future sales – through library check-outs. Libraries raise exposure for your authors and your books. Let us continue to play that vital role for your company.

You may, if lucky, see some financial benefits in the short-term in more affluent areas of the country that can still afford to subscribe to your model of rentals. However, in the long term libraries will be forced to stop offering your eBooks and increasingly rely solely on paper books. That does not help your bottom line or public image.

Libraries want to buy your titles – in print and in digital copies. As publishers’ most loyal and long-term customers, we find it extremely confusing to be punished for wanting to give you money. If, however, HarperCollins is no longer interested in selling to libraries, libraries like ours look forward to the long list of independent publishers and self-published authors waiting to fill the gap you will be leaving in the market. We realize that these newer methods of publishing are perceived as a threat to business by traditional publishers like HarperCollins. This change in the market will only grow exponentially faster in response to decisions like your “rule of 26.”

All of the limits you have placed on library digital content are short-sighted, at best. eBook licensing, and the digital rights management that comes along with it, acts like a tariff in its inhibition of the free exchange of ideas, literature, and information – the ideas, literature, and information that your authors have worked so hard to put out into the world. We encourage you to maintain your positive relationship with libraries and sell your eBook titles to libraries outright, not rent them.


David Dodd – Director, San Rafael Public Library
Sarah Houghton-Jan – Assistant Director, San Rafael Public Library

I want to credit the cybersphere with having some real, heartfelt, and intelligent discussions about the future of eBooks — both for consumers and for libraries.  Discussions have happened in every medium with thousands upon thousands of people participating.

But who isn’t participating?  The American Library Association.

To date, a week after the HarperCollins debacle hit the press, there is still no formal statement from the American Library Association President, Roberta Stevens.  (Update: Roberta Stevens issued a statement on her personal Facebook page an hour or so after I posted this – for more on that method and the statement itself, read on). To add insult to injury, ALA’s digital magazine (American Libraries Direct) was released yesterday without this story at the top.  You have to scroll way down to the publishing section before you read word one about a story that hit the mainstream media.

Psssst, ALA! Your members are making news.  Your members are the ones who are upset, on behalf of their profession and their communities.  Your members are the ones who are making news by protesting a publisher’s short-sighted and antiquated decision that is not only anti-library but anti-consumer.  Perhaps you can listen to your membership, and cover what’s going on in a more intentional way.  Perhaps you can issue a formal statement to the press about what libraries believe in, and how publishers’ choices on how to sell or not sell digital formats to libraries are subverting a core value we hold dear.

The silence is deafening.  We’re waiting.

I originally posted the following as a comment to this post in response to others, but am moving it here for ease of following my thoughts:

I’m not asking ALA to officially endorse a boycott.  I am asking ALA to speak up on behalf of all libraries about this issue, and to do so immediately.  Roberta Stevens’s post on Facebook is something – but let’s see something official on ALA’s website too, not just on Facebook.  The decision about the forum through which the message is communicated, to me, seems like a cop-out from responsibility for the message itself.  A wink and nod that ALA isn’t endorsing this statement.

In the vacuum they’ve left, other voices are rising, and none with the authority or community trust that ALA possesses.  Other voices are absolutely a good thing, but lacking a centralized voice speaking with authority, we’re now left with disparate, chaotic communications and no official leadership being taken.  Other voices are absolutely stronger than ALA’s right now, those of individuals and groups.  ALA’s voice should be a prominent one (if not the dominant one) in these discussions.  We, long-time dues-paying members, are looking for leadership and support.  We are looking for ALA to become visible on this issue, and now.  I’ll be damned if I’m going to wait patiently and quietly in my corner until June when somebody comes out with some official statement or report.

Speak out and speak out now, ALA.  Reassert libraries’ rights to lend materials.  Reassert libraries’ responsibilities to the public good.  And reassert libraries’ roles in our communities as cultural and thought leaders.  That doesn’t require anyone to say anything specific about HarperCollins or any other publishers, or endorse any specific action.  Give voice to our professional ethics and responsibilities regarding the content we provide, regardless of format.  Please, say something to the world–or the rest of us will keep talking loudly, angrily, and unofficially.  And those are the voices the press will pick up instead.  My guess is that is not what ALA wants.