Guestpost: Looking beyond Google for online access to EU culture and knowledge

Last month the US Supreme Court refused to hear an appeal from US authors who attempted to overturn a prior decision that Google’s scanning of millions of in-copyright books amounted to “fair use”. This refusal marks the end of a decade-long legal fight about the Google Books project. It means that in the US Google is free to scan and index copyright-protected books in order to allow internet users to search the contents of the books.

The fact that Google is allowed to do this has received much criticism, not only from authors in the US but also from rights holders and media in Europe. Much of this criticism has been directed at the fact that the ruling allows a commercial entity to provide access to the full corpus of literature published in the US, but it misses a much more important point.

As Ellen Euler, the Deputy Managing Director for Finance, Law, Communication of the Deutsche Digitale Bibliothek points out in her guest contribution below, this means that internet users in the US have access to a much broader body of knowledge and culture than the internet users in the EU. According to Euler we should not see Google Books as a threat to culture but rather as a reminder that Europe urgently needs to create a legal framework that enables access to the collections of our libraries, archives and museums, preferably by allowing them to make their collections available via their own online platforms.

Looking beyond Google for online access to EU culture and knowledge

by Ellen Euler

In the digital and networked 21st century, cultural heritage institutions have an extended mandate: they must not only provide local access to culture and knowledge, but are also expected to make their collections available via the internet. As we spend an increasing amount of our time online, we expect to be able to view and enjoy the rich collections of our libraries, museums, and archives. And it’s important to provide online access to enable the discovery and innovative reuse of our shared cultural commons. As Tim Berners-Lee, one of the inventors of the web, sums up: “What’s not on the Net, is not in the world”.

When we digitize content from cultural heritage institutions, we begin the process of opening those materials to the world. As Armand Marie Leroi, a humanist and professor of evolutionary biology once said, “digitisation transforms them from caterpillars into butterflies”. Digitized texts allow us to pose entirely new questions and acquire new knowledge based on full-text searches and via other analytical tools and methods. This type of information mining is no longer restricted only to texts. Image recognition tools, combined with standardised metadata and geographical data, make it possible to interrogate other types of content too. We can use new quantitative research methods to test hypotheses and create linkages between bodies of knowledge. We can create virtual research environments to enable the contextualisation of collections within a broader framework.

Google Books: A blessing and a curse

Early on, Google recognized the benefits of digitisation, and tapped into the public’s interest in searching across huge textual collections. Since 2004 Google has been digitising millions of books from U.S. libraries for its Google Books product. In the U.S., the scanning triggered a backlash from authors and publishers, who felt that they were losing control over their copyrighted works. It also fuelled fears that the digitised resources within Europe’s cultural heritage institutions would be stifled by the dominance of Anglo-American digital cultural offerings.

Scanning historical books at the Bayerische Staatsbibliothek
Photo: Jürgen Keiper (CC BY)

Therefore, on 28 April 2005, seven EU countries wrote a joint letter to the President of the European Commission (PDF, in French). The letter recognised the potential benefits of availability and searchability of culture and knowledge, and proposed the creation of a virtual library that would make Europe’s cultural heritage accessible to everyone in digital form. With this proposal they wanted to combine existing initiatives, avoid redundancy, and stimulate the growth of the information society and European media industry.

The EU realised that a substantial platform needed to be created to counter the dominance of Google Books. The idea of a European cultural platform (and its national counterparts) was born. Launched in 2008, Europeana now serves as the European entry point for online collections of cultural heritage materials. In 2012 the Deutsche Digitale Bibliothek (German Digital Library) was created to serve the same function at the national level for Germany. Both of these platforms attempt to aggregate digital offerings from cultural heritage institutions, and to increase the visibility of Europe’s cultural heritage online.

As Google’s book digitisation efforts grew, it was sued for copyright infringement in a lawsuit brought by the Association of American Publishers and the Authors Guild. These organisations and Google entered into a protracted legal dispute that lasted for more than a decade. Then, in April 2016, the United States Supreme Court refused to hear an appeal of the case. As a result, the decision of the lower court stands, which means that the digitisation and indexing of copyrighted texts by Google is a fair use, and as such does not require permission from the rights holders of the books.

Google Books provides free, full-text access to books that are in the public domain in both the U.S. and Europe. Relying on fair use, the U.S. version of the Google Books product also allows users to search the full contents of books still under copyright. However, the results of these searches only display the search terms alongside short passages of text in which those terms appear. The full text of an in-copyright book is only displayed if Google has obtained permission from the rightsholders to do so.

The great benefit of Google Books is indisputable and described in detail in the now confirmed ruling:

In my view, Google Books provides significant public benefits. It advances the progress of the arts and sciences, while maintaining respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders. It has become an invaluable research tool that permits students, teachers, librarians, and others to more efficiently identify and locate books. It has given scholars the ability, for the first time, to conduct full-text searches of tens of millions of books. It preserves books, in particular out-of-print and old books that have been forgotten in the bowels of libraries, and it gives them new life. It facilitates access to books for print-disabled and remote or underserved populations. It generates new audiences and creates new sources of income for authors and publishers. Indeed, all society benefits.

Meanwhile, Google has expanded its digitisation activities to cover cultural works other than texts. In 2011 it founded the Google Cultural Institute as well as the Google Art Project, which displays digital art collections from museums around the world. Together with its partner institutions, Google is able to widely share a wealth of cultural heritage collections, including both public domain and in-copyright materials.

Google relies on fair use to be able to digitise and provide at least minimal levels of access to these resources. But Europe does not enjoy an equivalent to fair use. Europe instead has a rigid, prescribed system of exceptions and limitations to copyright. Many social media and remix uses—such as internet memes, image collages, and the sharing of content over networks—are permitted in the U.S. on the basis of fair use. But in Europe, these types of innovations operate in a semi-legal grey area.

The limits of the European framework for digitisation and access

Currently there is no legal basis that authorizes cultural heritage institutions in Europe to undertake the comprehensive digitisation and indexing of protected works in their collections without permission from copyright holders. It’s often impossible for European cultural heritage organisations to obtain permission to digitize their collections, or to make use of thumbnails to show what is contained in their collections. Rights clearance is a complicated and resource-intensive process, and most cultural heritage institutions do not have the money to make these resources available online. Therefore, many attractive cultural offerings still under copyright can only be made available by well-financed commercial players.

This situation prevents most cultural heritage institutions from developing comparable offerings, even though these would be noncommercial in nature and intended to foster the public interest goals of copyright without causing any harm to rights holders. As a result, many European cultural heritage institutions are only digitising and making available collections already in the public domain.

Over the last few years, Europeana has grown to contain almost 55 million digital objects. The Deutsche Digitale Bibliothek now offers almost 20 million digital objects. Imagine the incredible online collections these cultural heritage institutions could offer if only they were permitted to open up in-copyright works.

Why is it problematic that cultural heritage institutions in Europe are so limited in their ability to engage online? And why should these public interest organisations even attempt to provide online access to collections when commercial providers like Google can do it so much better? The answers to these questions must begin with the realisation that Google’s outsized position in the information society is accompanied by far-reaching consequences for our society.

First, Google collects information that could reveal details about a user and her interests. Google’s mission is “to organize the world’s information and make it universally accessible and useful”. But we shouldn’t assume Google will share without requiring something in return. It’s naive to assume Google is operating under any other frame than to meet its corporate responsibilities in the pursuit of growth and profit.

There is a need for platforms like Europeana and the Deutsche Digitale Bibliothek to be able to provide access to digital cultural materials based on public-focused missions not driven by commercial considerations. These types of organisations wish to provide sustainable, reliable access to our shared cultural memory in ways that do not violate the rights and expectations of their users. We need institutions to share large pools of data (“Big Cultural Data”) that can be used by anyone for new, innovative methods of analysis and cultural production. That is why we should advocate for full digital access to our shared cultural heritage.

Online public services stand to benefit greatly if they are indexed by Google’s search algorithm. “Linked Open Data” is the magic word for the greatest possible visibility and contextualisation. All resources—from commercial products to openly licensed offerings—should be able to interoperate with each other if they are to produce added value for end users. We should not entrust to Google the entire responsibility for digitising and sharing our cultural heritage materials. At the same time, cultural heritage institutions should not isolate themselves from Google or other commercial intermediaries.

Book scanners at the Bayerische Staatsbibliothek
Photo: Jürgen Keiper (CC BY)

For the time being, Google has abandoned its efforts to digitise more extensively in Europe. Historically, Google has been interested in digitisation projects that are of interest to a global public. However, this form of digitisation “cherry picking” can be problematic because it only focuses on popular content. Instead, we need to create a comprehensive online resource that provides access to the entirety of Europe’s cultural heritage. This would be the best way to represent the historical and creative diversity of Europe’s cultural heritage institutions. Developing a comprehensive digitisation and access system would support the goals outlined in the 2005 letter from the EU heads of state when they wrote that the vision and values of European culture should be visible in virtual space.

For Europe this means that it must put its cultural heritage institutions on a path to success, not only by offering financial and institutional support, but also by setting up a favorable framework for change. The greatest hurdle to supporting digitisation and access is European copyright law, which is outdated for the digital age and relatively inflexible when it comes to limitations and exceptions to copyright. Previous reform attempts did not improve the situation: they made it clear that a patchwork of remedies based on voluntary measures is not the solution.

Orphan Works Directive: Good intentions, lackluster implementation

The orphan works directive was intended to fill the 20th century content black hole by allowing institutions to digitize and make available works for which rights holders could not be found or identified. But in reality, the orphan works directive has not been very effective to this end. A glance at the Register for Orphan Works at the Office of the European Union for Intellectual Property (where the works have to be registered before use) reveals that after two years there are still no more than 1684 works registered. Nearly twenty countries—including Spain, France and Italy—have not registered a single orphaned work. Even libraries do not see the orphan works directive as a significant step forward with regard to digitisation and access to cultural heritage collections. Even worse, the orphan works directive covers only textual and audiovisual works. It cannot be relied upon when digitising photography or visual art works.

Germany has gone a step further than simply implementing the orphan works directive. As a result of intensive lobbying from library associations, the German legislator has provided a solution for out-of-print works that are no longer commercially available. Under this provision libraries are allowed to digitize and make available out-of-commerce works first published before 1966 without having to undertake a diligent search for rights holders as long as they pay a reasonable fee to a collecting society. Although this provision only came into effect in mid 2015, the register for out-of-print works maintained by the German Office for Patents and Trademarks contains 3,758 works (and counting). Given this relative success, the provision seems suitable as a model for other types of out-of-commerce works held by cultural heritage institutions. However, this setup assumes there will be productive cooperation between cultural heritage organisations and the relevant collective management organisations.

Should collective management organisations be able to collect royalties from uses of orphaned works if the uses are noncommercial in nature, respect the legitimate interests of authors, and are intended to advance the progress of culture and science? Or should we create new exceptions that permit cultural heritage institutions to digitize and make freely available the works they have in their collections?

There’s no consensus on the answers to these questions. But we do know that some copyright holders are not prepared to yield a single step to entertain a progressive change. And historically, the European legislator has supported the interests of rights holders more than the needs of cultural heritage institutions and the public. As a result, Europe will lag behind in the digitisation of, and access to, its cultural heritage materials.

The conclusion is clear: cultural heritage institutions in Europe urgently need a fair legal framework that enables them both to serve their public audiences and to preserve the rights of authors in the digital space.

Dr. Ellen Euler, LL.M.
The author is the Deputy Managing Director for Finance, Law, Communication of the Deutsche Digitale Bibliothek

One thought on “Guestpost: Looking beyond Google for online access to EU culture and knowledge”

  1. Paying royalties for orphan works is an oxymoron. Either the author is known and the work isn’t orphan, or it’s unknown and the collecting society doesn’t have any possibility to deliver the royalty, let alone represent the author.

    3700 doesn’t sound much better than 1700 and the difference may just be a matter of bigger interest/demand (reflected by the share of Germany in Europeana as well) or speedier delivery of records from the national to the European register.

    More interesting: EOD digitised over 10 thousand books in the last 3 years, despite having very low visibility and excluding potentially-copyrighted books. https://books2ebooks.eu/content/ebooks-demand-service-remains-self-sustaining-and-seeks-further-fields-activity