From Reproduction to Licensing: Applying Article 15 CDSMD to the Process of Generative AI Training
Abstract
As Generative AI becomes central to the digital landscape, its reliance on vast datasets – often sourced from publicly available press publications – raises pressing legal questions concerning intellectual property (IP) rights. This article examines whether the use of such content for training AI systems may infringe Article 15 of the Copyright in the Digital Single Market Directive (CDSMD), a provision originally intended to regulate unlicensed uses by news aggregators and search engines. It explores the legal implications of using press content at scale for training purposes – typically without attribution, remuneration, or a clear legal basis – and identifies the specific stages of the training process where reproduction rights may be implicated. A central issue is the scope of the press publishers’ right (PPR), particularly the distinction between protected editorial content and unprotected “mere facts”. To support this analysis, the article develops a test for assessing whether a given use constitutes infringing reproduction under Article 15 CDSMD. It further argues that, where training uses qualify as infringing, collective licensing could offer a pragmatic solution: one that ensures legal certainty for developers, secures fair compensation for publishers, and fosters a sustainable and pluralistic digital information ecosystem.
License
Copyright (c) 2025 Klara Schinzler

This work is licensed under a Creative Commons Attribution 4.0 International License.




