The right to read should be the right to mine!

We’re continuing to analyse the prospective changes to EU copyright law described in both the leaked impact assessment and last week’s leaked draft for a Directive on copyright in the Digital Single Market. In this post we take a closer look at the proposed exception for text and data mining (TDM).

The Commission recognises the incredible potential in text and data mining, writing that “TDM can be a powerful scientific research tool to analyse big corpuses of text and data such as scientific publications or research datasets.” They also note that researchers would be more likely to engage in text and data mining if it were not for the legal uncertainty that exists as a result of the current copyright rules. The draft Directive notes that there are parts of existing EU law that would already cover some TDM activities, except for the fact that these exceptions are “optional and not fully adapted to the current use of technologies in scientific research.” So, in order to overcome this legal uncertainty, the draft Directive provides for a mandatory exception for uses of text and data mining technologies in the field of scientific research.

In article 3 the Directive stipulates that member states shall provide for an exception to the exclusive rights granted in the Copyright and Database Directives and the new publishers’ right proposed further down in the Copyright in the Digital Single Market Directive…

…for reproductions and extractions made by research organizations in order to carry out text and data mining of works or other subject-matter to which they have lawful access for the purposes of scientific research. […] Any contractual provision contrary to the exception […] shall be unenforceable.

There are a few good things about this approach. First of all, making the exception mandatory will ensure that it applies uniformly across all EU member states. We also welcome the explicit clarification that the rights granted under the exception cannot be contracted away.

In addition, it is a step in the right direction that the proposed exception would now apply to all acts undertaken “for the purpose of scientific research”, whereas earlier statements by the Commission hinted at an exception that would only apply to non-commercial research purposes. Unfortunately, these steps do not fix the fatal flaw of the approach proposed by the Commission:

A privileged class of Text and Data Miners

By introducing an exception that only applies to a limited group of beneficiaries (in this case: ‘research organisations’), the Commission makes it clear that everybody else will require permission from rightsholders before they can engage in text and data mining. In other words: what the Commission is proposing here is that every act of text and data mining (that involves a reproduction or extraction) needs to be licensed unless it is undertaken by a research organisation. While this may not be a problem for commercial players from the life-sciences and pharmaceutical sectors (the Impact Assessment makes it clear that the Commission wants to preserve publishers’ TDM licensing revenues from these sectors), it will put up substantial barriers to anyone else, including journalists and advocacy groups.

Even worse, it means that certain data sources will remain off limits for text and data mining for everyone except those working for research organisations. Large parts of the public Internet and materials held in archives are effectively not available for licensing because ownership is fragmented, or because rightsholders have disappeared or lack an economic incentive to grant licenses. This means that computer analysis of such materials will carry a much bigger risk in Europe than in places where TDM is not deemed to be a copyright-relevant act. What the Commission is proposing is another ‘Orphan Works’ problem in the making.

There is an obvious solution

All of this is especially problematic as there is an obvious solution. It is important to remember that the whole discussion deals with text and data mining of materials which are lawfully accessible. This means that publishers and other rightsholders can factor in the value of text and data mining in the licenses they grant to their licensees. This would be just as effective in preserving their existing income streams. The only thing it would not enable is licensing their back-catalogues anew. Seen in this light, the Commission’s proposal seems to be crafted to allow this kind of double dipping by publishers, at the expense of limiting access to data-driven innovation for everyone else.

Instead of proposing a limited exception, Europe needs an exception that allows text and data mining of lawfully accessible materials by anyone for any purpose. As others have argued before us, this can best be achieved by including TDM in the scope of existing exceptions for temporary acts of reproduction (article 5.1 of the InfoSoc Directive), or by modifying the proposal for the Copyright in the Digital Single Market Directive in such a way that article 3.1 applies to all users and not only those who are part of research organisations.

This post is one in a series of posts analysing the EU Commission’s copyright reform proposals. See our take on the proposed exception for education here, our reaction to the proposal to introduce a new ancillary copyright for publishers here, and a high-level analysis of all issues addressed in the impact assessment here.

