Google just cries out for Tom Lehrer

Or rather, the phenomenon that is Google cries out for Tom Lehrer. Come out of retirement, Tom! Political satire is not obsolete, notwithstanding Kissinger’s Nobel ….

A colleague posted on a listserv a brief note about an article on “Google’s planes”. I thought, no, really? Google is buying planes? for streetview, I imagine — holy cow, what’s next?

Then I clicked on the link and was relieved to see it was about Google’s plans; the colleague had merely made a typo.

Or so I thought.

Cue ominous music: dunh dunh dunh.

Because, as that same colleague informed me, Google actually IS buying unmanned drones for aerial surveillance for Street View ! ! ! ! ! (I think screeching violins a la “Psycho” would be good here.)

Well, no, not really. A Google executive is buying one “for personal use”. Google categorically denies Street View applications, which shows that its PR department definitely is on the ball.

Sometimes reality is not nearly as weird as it should be.

Google Book Search panel at ALA Midwinter

The ALA’s Copyright Subcommittee (Committee on Legislation) is hosting a panel on the Google Book Settlement at ALA Midwinter this year — Saturday at 1:30 at the Grand Hyatt. (I’m on the committee and on the panel.) Should be interesting.

Come to the Google Book Settlement Session at ALA Midwinter Conference January 24th, 2009, 1:30-3:30, Grand Hyatt, Maroon Peak Room

If you’ll be at ALA’s Midwinter Conference in Denver at the end of January, please check out the session “Google Book Search: What’s In It for Libraries?” The open forum will be hosted by the ALA Committee on Legislation’s Copyright Subcommittee to discuss the proposed Google Book Search settlement. The discussion will take place on Saturday, January 24, from 1:30 to 3:30 p.m. at the Grand Hyatt, Maroon Peak (listed as the Washington Office Breakout Session IV – Google Book Search in the program).

Panelists will include Dan Clancy, Engineering Director for the Google Book Search Project, Karen Coyle, Digital Librarian and Consultant, Paul Courant, Dean of Libraries at the University of Michigan, and Laura Quilter, Librarian and Attorney at Law. The session will be moderated by Nancy Kranich, chair of the COL Copyright Subcommittee. Following brief opening remarks by each panelist, there will be an opportunity for dialogue and questions from the audience.

Additional information about the proposed Google Book Search settlement is available at http://wo.ala.org/gbs/.

my own googlegängers

I hadn’t previously heard the word “googlegängers”, which the American Dialect Society deemed the “most creative” word last year. But I love the concept, which Stephanie Rosenbloom explored in the NYT today. Apparently lots of people follow the lives and careers of people with the same names as themselves.


friday nights are so exciting!
9th Circuit again: P10 v. Google

The Ninth Circuit has weighed in on Perfect 10 v. Google (captioned Perfect 10 v. Amazon.com on the 9th Circuit case download website). The opinion is by Ikuta, who (IMO) got it right on the Fair Housing Council decision yesterday. It’s a long opinion, and I’m still working through it. But here’s a summary of holdings from my first quick scan:

  • Liability for thumbnails — P10 made out a prima facie case of direct liability for Google’s display of thumbnails (affirming lower court) (but see fair use below)
  • No direct liability for display on linking to full-size images (affirming lower court): Specifically,

    “While in-line linking and framing may cause some computer users to believe they are viewing a single Google webpage, the Copyright Act, unlike the Trademark Act, does not protect a copyright holder against acts that cause consumer confusion.” (at 5772, pdf p.19)

  • No direct liability for display of cache (affirming lower court)
  • No direct liability for distribution of full-size image (affirming lower court) (distinguishing Hotaling v. LDS (4th 1997) & Napster)
  • Fair use for thumbnails & vacated preliminary injunction for Google’s thumbnails (reversing lower court)
    • purpose & character of the use: Google’s use was so highly transformative (“significantly transformative nature of Google’s search engine, particularly in light of its public benefit” at 5782/PDF p.29) that it outweighs superseding & commercial uses; the superseding uses were trivial because no evidence that downloads for mobile phone use had taken place. District Court’s determination that use of thumbnails in AdSense partner direction was not significant. Instead of weighing “slightly” in favor of P10 as the District Court found, this factor weighs for Google. (reversing Dist Ct)
    • nature of the copyrighted work: photos were creative but previously published; this factor weighs “only slightly” in favor of P10 (affirming Dist Ct)
    • amount & substantiality: did not weigh in favor of either party because reasonable in light of the purpose of a search engine (affirming Dist Ct)
    • effect on the market: no effect of thumbnails for full-size images (affirming Dist Ct); effect of Google’s thumbnails for P10’s cell phone market “remains hypothetical”; so this factor favors neither party (reversing Dist Ct)
  • Possibility of contributory infringement & enunciated a new test (reversing & remanding) Citing Grokster, Napster, and Netcom, the court found the Dist Ct had erred in assuming that Google did not materially contribute to infringing conduct.

    “Applying our test, Google could be held contributorily liable if it had knowledge that infringing Perfect 10 images were available using its search engine, could take simple measures to prevent further damage to Perfect 10’s copyrighted works, and failed to take such steps.” (at 5793 / PDF p.40)

    Remanded for consideration of “whether Perfect 10 would likely succeed in establishing that Google was contributorily liable for in-line linking to full-size infringing images under the test enunciated today.”

  • No vicarious liability (affirming District Court)
  • Remand to do DMCA 512 analysis: The 9th said because there is now a possibility of contributory infringement, the District Court now has to do the DMCA 512(d) analysis to see whether Google met the qualifications for takedowns. The issues are whether, as P10 alleges, Google was not expeditious in takedown; and whether, as Google alleges, P10’s notice was not sufficient and did not comply with provisions.
  • Amazon.com: No direct infringement for linking to Google’s thumbnails or P10’s fullsize images, and no vicarious liability (affirming District Court). However, the Napster “knowledge” test (“actual knowledge that specific infringing material is available using its system”) popped up here as in Google, and so the 9th remanded to consider this contributory infringement and the DMCA safe harbor.

….update 5/18: Thinking about the decision some more, I still really appreciate the “public benefit” aspect of the language that I previously highlighted. Probably not something that most artists will be able to rely on, but very helpful for information and indexing resources — so librarians can breathe a sigh of relief.

Various other scholars & interested parties have pointed out their own highlights:

  • Eric Goldman posted a brief comment on the case, pointing out that the court held that a plaintiff must disprove fair use, which Joe Gratz also pointed out. I was also amused to see his take on the case as difficult to teach.
  • Joe Gratz listed several points of interest, including the public interest point that I like.
  • John Ottaviani posted also, pointing out that the court clarified that Section 512 is available for direct as well as contributory infringement. Hmm.
  • Jason Schultz @ EFF calls the decision a “huge victory” and parses out some of his insights.
  • Rebecca Tushnet points out the possible significance of footnote 8 for the Google Booksearch lawsuit, and also speculates on the transformativeness of search engines versus parodies.
  • The Washington Post covered the case too.

WSJ editorial page embarrassment

The WSJ editorial page is not something I ordinarily frequent, but they recently wrote an editorial on the DMCA. Aside from a reflexive and simplistic “intellectual property is good so don’t bother me with nuance or details” attitude, this paragraph struck me:

Google claims “a legal safe harbor” from copyright infringement under the 1998 Digital Millennium Copyright Act, which allows Internet firms to provide a thumbnail of copyrighted material. The firm also asserts a right to reproduce and distribute intellectual property without permission as long as it promptly stops the trespass if the copyright owner objects. That’s like saying you have the legal right to hop over your neighbors’ fence and swim in their pool — unless they complain.

WSJ 2006/12/1 (it’s the editorial page so the person who actually penned this embarrassment doesn’t have to sign his or her name)

I realize that editorial pages don’t require fact-checking, but getting the law this wrong is embarrassing. Readers of this blog probably are very familiar with the DMCA, but a couple of quick pointers:

  1. The DMCA doesn’t “allow[] Internet firms to provide a thumbnail of copyrighted material.” I believe the hopelessly inept WSJ editor was probably thinking about the Kelly v. Arriba 9th Cir. decision, supported recently by the 2d Cir. decision in Dorling-Kindersley. Both of those interpreted fair use (17 USC 107) to include offering thumbnails for a different purpose.
  2. “… without permission as long as it promptly stops the trespass if the copyright owner objects.” Presumably here they’re talking about the DMCA notice-and-takedown provisions, 17 USC 512. Of course, these provisions don’t apply to original infringement — reproduction and distribution — but to the responsibility of ISPs and other intermediaries when their networks are used for reproduction and distribution. That is, at best, secondary infringement (contributing to or vicariously responsible for someone else’s infringement), and it’s really not at all clear that ISPs would be liable for it even in the absence of the safe harbor provisions. Which aren’t “claimed” by Google et al but “given” to them by Congress.

Since they can’t be bothered to do even the barest minimum of fact checking, and don’t understand what they’re talking about, it’s hard to actually take them seriously. Are they this bad all the time?

Gigi Sohn of Public Knowledge probably very wisely didn’t bother correcting their extremely shoddy fact-checking but responded to the overall tenor of their arguments; the WSJ published her letter – and because the WSJ puts their content behind passwords, the full text of the letter is available at PK’s blog by Alex Curtis.

interesting things happening but also life

so, interesting things are going on right now that I have plenty of valuable, earth-shattering comments to make about (google’s resistance of the DOJ subpoena, the new google 512 decision, siva’s latest article about google in the Chronicle, a recent discussion about personal releases & permissions culture, an exciting conference I just attended), but so is life. Blogging will have to wait.

Calling Doctor Google

As a former medical librarian I thought this editorial by a medical librarian in the BMJ was fascinating.

First this amazing information:

Within a year of its release Google Scholar has led more visitors to many biomedical journal websites than has PubMed (J Sack, personal communication, 2005).

… which certainly lends credence to the pro-tagging, anti- or indifferent-to-cataloging thinkers.

I was particularly interested to see the table from the BMJ’s web access stats, which lists Google as its number one referrer, by far, in November 2005 (345,756), and Google Scholar as its number two referrer (105,185). PubMed trailed significantly far behind — fourth place was PubMed Medline (14,522) and fifth place was PubMed Central (9,616). Of course, one shouldn’t read too much into this relatively raw access-data. A lot of factors must play into the numbers. Who are these searchers? Medical consumers typing in terms in google, hoping for consumer information? If they end up going to the BMJ, that’s probably more than most of them want to know, at least in an initial search. Or are they physicians realizing google is a shortcut to particular articles? Does this set of referrals include, for instance, academic-affiliated researchers? Many of them probably have access to their own institutional subscriptions to BMJ, and if requests are being routed through a local proxy then how is that reflected in these numbers? Still, any way you slice it, it’s obvious that Google — or maybe it’s better to describe it as “general search” — is becoming significant for medical research. And Google Scholar is more successful than I’d realized.

And then this cropped up in the editorial, too:

In a recent letter in the New England Journal of Medicine, a New York rheumatologist describes a scene at rounds where a professor asked the presenting fellow to explain how he arrived at his diagnosis.[4] Matter of factly, the reply came: “I entered the salient features into Google, and [the diagnosis] popped right up.” The attending doctor was taken aback by the Google diagnosis. “Are we physicians no longer needed? Is an observer who can accurately select the findings to be entered in a Google search all we need for a diagnosis to appear—as if by magic?”

Ten years ago librarians were all a-twitter about the fear that search engines (Yahoo! and Altavista were the big contendahs then) would displace librarians. Most librarians blustered it out: “Nothing can replace a librarian!” but there was definitely some anxiety in the ranks. Now physicians. Relax, docs. Librarians, doctors, and search engines, all will find their place in the brave new world of infinite search. And it’s important that consumers have access to as much information as possible to critically evaluate and assess all the other info streamed at us daily. For example, since the FDA has deemed it acceptable for drug companies to “inform” us about their wares via millions of dollars of direct-to-consumer advertising, consumers get barraged with info about commercial drugs provided by commercial for-profit entities. In that information environment, it’s vital for consumers to have consumer-directed diagnostic information to assess Big Pharma’s claims. Ultimately it will improve healthcare. What did you think all those consumer health awareness services were about if not, ultimately, this?

essence of library

I like the flow of the google / library discussion: what’s the essence of library? and suspect I’ll be thinking about that one for a long time to come. (It sounds like a delightful perfume: a bit musty with a sweet undernote of decaying paper and an overnote of astringent preservative, maybe.)

Just picking out a few of the responses & adding a few more comments:

Michael Madison laid out a best-case defense for google based on google’s added-value of meta-information, and then termed the discussion: “is there an ‘essence’ of library?” He also points out that we ought to focus “more what Google does than on what Google is”.

Siva Vaidhyanathan responded that Google doesn’t come close to the ‘essence of a library’.

This is the heart of the discussion that really intrigues me. Not because I truly am arguing that Google is a library, but because I suspect that the ways that information is being transmitted might start to render moot our current definitions of “library”. In my earlier post, I wasn’t really suggesting that Google take advantage of the warm feelings towards libraries; I doubt it would be a very helpful strategy, because most judges, like everyone else, would intuitively distinguish between the classical public library and Google. Rather, I was suggesting that library exceptionalism is only going to work so long as libraries are conceptually distinct.

Michael M then responded to Siva with some discussion of the essence of a library, ultimately concluding that we really have to talk about libraries in terms of information flows. And then he brings it back to Google:

Do we experience Google Print content as we experience other collections that we regard as libraries, or do we experience that content as we experience the Web — a functionally unlimited aggregation of data? Right now, the answer to that question has to rely on intuition and speculation. My money is on the second option, but in the end: who knows?

I’d like to suggest two basic functions for libraries: One is warehousing and archiving physical collections; serving effectively as a museum of information. The second function is providing information services. Storage, and access.

In the past and even today these two functions are, practically, inseparable. And each implicates a whole host of sub-functions many of which serve both masters — e.g., cataloging, which organizes the stored collections.

But these functions have been splitting and will continue to. Digitizing projects, like Google Print, will put the physical artifacts on the same plane with museum artifacts: nice if you’re a scholar and need the original, but for most people, the digitized content will suffice. [Google Print is not the only digitizing project, of course; there are plenty of others on smaller scales that have gotten less attention. I would be interested to get some examples of public-private partnerships because I suspect Google Print isn't the only one.]

As more of the information content becomes digital, the subfunctions used to service both the storage and access functions will shift. Two examples: cataloging and preservation. Electronic information needs much less in the way of cataloging; full-text searching obviates a lot of cataloging needs. (No, not all; I believe in subject headings and hierarchical thesauruses — although I’m not sure they’re ultimately scalable if we’re talking about organizing all information.) Digital media have their own preservation problems, fairly distinct from those relevant in most special collections. The central problem in preserving digital media collections is shifting formats; the central problem in preserving physical collections is preserving the original artifact.

So as these transitions within libraries move forward, the easy and obvious distinctions present today between libraries and Google Print will erode.

Now, Eric Goldman in a comment here said another of his maxims was never build a business on fair use. Google Print, of course, relies entirely on fair use (17 USC 107), so far as I can see. One way we might distinguish libraries at present is that most libraries, operating in the book-warehousing business today, rely not very much at all on fair use, and rather a lot on first sale (17 USC 109). Libraries vary with respect to the library exemptions in 108, which are used principally, so far as I know, to (a) establish reserves collections; and (b) make backups of software, videos, records, etc.

But the bedrock library provisions we rely on today, 108 and 109, won’t be enough for some collections that need to be built in the future. For instance, I don’t know what libraries are currently archiving popular digital ephemera (besides the Internet Archive). But just as libraries have begun to collect popular culture media in DVDs, CDs, comic books, and zines, so there will have to be archiving projects dedicated to archiving purely digital media, including digital media that are distributed for free via the web. I’m thinking of things like JibJab’s “This Land Is Our Land”, Mark Fiore’s shockwave commentaries, and similar such materials.

Let’s consider the Mark Fiore shockwave animated cartoons. [This is purely my example, because I love Mark Fiore; I have no idea if he has been approached by any libraries or what his response might have been.] The cartoons are distributed for free over the Internet; but they are not (so far as I know) licensed for free reproduction & distribution, and the author retains copyright. If a library wanted to begin collecting them, how would they analyze this collection & provision of access to it? 109 protects the rights of “the owner of a particular copy or phonorecord lawfully made under this title … to sell or otherwise dispose of the possession of that copy or phonorecord”. But “computer programs” are exempted. Are shockwave files “computer programs”? Maybe we have to resort, at last, to fair use. Now what do American Geophysical Union, Kelly v. Arriba, MP3.com, et al, tell us? Michael Madison talked about it, but I think it was summed up by Eric Goldman: “Don’t build a business on fair use … multi-factor tests lead to complete unpredictability.”

This is obviously not a fullbore analysis of the relevant provisions as applied to publicly distributed shockwave files, but it does make my point: digital media and new ways of distributing content are already troubling the current copyright categories that are designed around brick-and-mortar libraries and physical artifacts.

And that’s just one example looking at only one aspect of the question of collecting & providing access to Mark Fiore shockwave animations. Consider the reams of problems that digital media pose in the realm of licensing, DRM, and DMCA-type technical protection measures, notwithstanding the protections allegedly offered by 109, 108, and 1201(d). (Is there any point in even citing to 1201(c)? I feel it’s been effectively read out of the statute the same way, and perhaps for similar reasons, the 9th Amendment to the Constitution has been politely ignored.)

Libraries qua libraries — well, libraries qua public and academic libraries, anyway — will always have recourse to Congress, and I predict they will prove as popular there in the future as they have in the past: not popular enough to sway Congress from granting very broad rights to copyright holders that end up hurting libraries, but popular enough to get some limited library-specific protections.

But most librarians, myself included, want to preserve BOTH today’s model of the library: the brick-and-mortar warehouse-and-cataloger-of-physical-media (which I do think will always be around) — AND the idea of the library: the collector and provider of information. So the question is, how, or why, do we copyfighters / librarians / information activists / legal scholars distinguish Google Print in a way that doesn’t hurt Essence of Library down the line? And why, tactically, should we? Maybe, we should focus on building a more robust fair use, fixing 109 so it works with digital media, or even adding in more 108 exemptions. Or maybe on the DMCA Library of Congress anticircumvention comment rounds that are coming up again.

Further reading on this discussion at copy this blog and copy this blog again. copyfight is following the debate and a number of people are commenting: See google print is as google print does and google print library shoulda coulda woulda. More from “real librarians” and others responding on Siva’s blog: Eileen Snyder, 8/17; Siva responding to Michael Madison, and including comments from other folks too.

I’d like to link to some good discussions on 109 (I seem to recall Derek Slater recently talked about 109 and digital music files, for instance, but can’t find his post — is there a search function I’m missing? Derek?) but will need to do some more digging … later.


As I write I follow one of those social sciences rules about mobs or group discussions or something: I make myself more firm in my opinions the longer I write. This is why it would be much better if I had time to write a long post, then sit on it for a while — my tone could be measured & even the whole way through. But I was already delayed in responding, so wanted to get some thoughts out in a hurry.

update 8/18: a few last posts on this discussion: madisonian.net 8/17; siva 8/18 and siva again 8/18;

siva also posted about an aspect of this issue which i didn’t really touch at all in this discussion, which is the trustworthiness of private actors in general and google in particular. my interest was piqued by the essence-of-library question, but this was a significant thread in comments & subtexts in various discussions. See siva 8/17; copy this blog (previously cited) linked to a post & comment discussion of the google / library contract on the library law blog; and seth finkelstein wrote about what’s in it for google.

update 9/1: the best response to it all came from the onion: Google Announces Plan To Destroy All Information It Can’t Index …

The new project, dubbed Google Purge…. The company’s new directive may explain its recent acquisition of Celera Genomics, the company that mapped the human genome, and its buildup of a vast army of laser-equipped robots. ‘Google finally has what it needs to catalog the DNA of every organism on Earth,’ said analyst Imran Kahn of J.P. Morgan Chase. ‘Of course, some people might not want their DNA indexed. Hence, the robot army. It’s crazy, it’s brilliant—typical Google.’ … ‘This announcement is a red flag,’ said Daniel Brandt, founder of Google-Watch.org. ‘I certainly don’t want to accuse them of having bad intentions. But this campaign of destruction and genocide raises some potential privacy concerns.’

related posts: interesting reading early saturday morning 8/13; google & not-for-profit libraries 8/13