Reproducing or data mining: The copyright law dilemma of AI training

A. A. Nikiforov

doi:10.38044/2686-9136-2025-6-6

Abstract

What constitutes “use” under copyright law? Does the exclusive right of the copyright holder encompass any interaction with a protected work? This article explores the legal dimensions of training artificial intelligence (AI) on works protected by copyright and related rights. The aim of this study is to conduct a comprehensive legal analysis of AI training on protected subject matter, focusing on the interpretation of key terms such as “use” and “reproduction”, and on the legal qualification of activities such as text and data mining, within both Russian and foreign legal systems. The article examines the relevant statutory exceptions and limitations provided under EU, U.S., and Japanese law, illustrating divergent models of legal balance between the interests of AI developers and copyright holders. Methodologically, the research adopts an interdisciplinary approach, combining a technical description of neural network training algorithms with doctrinal and comparative legal analysis of regulatory approaches to AI training and text and data mining across jurisdictions. During the editing and proofreading stages, ChatGPT was used to improve clarity and coherence. However, all ideas, reasoning, examples, and conclusions are entirely the author’s own and were not generated by AI. The article further engages with normative and policy-based arguments for and against permitting AI systems to train freely on copyrighted content. As a result of the analysis, the author concludes that the act of training an AI model, in itself, does not constitute “use” of a work within the meaning of Article 1270 of the Russian Civil Code. This is because such training does not involve reproduction of the protected expression of the work, nor does it entail perceptible access by a human or functional exploitation of the work (i.e., expressive use).
Nevertheless, it is advisable for the legal system to establish exceptions that allow the creation of temporary copies of works without the rights holder’s consent when such copying is necessary for legitimate text and data mining purposes. Additionally, the law should provide mechanisms that enable otherwise restricted data to be used for training without requiring individual negotiations with every rights holder. An exception to this rule should apply to databases that have been specifically curated, structured, and prepared by rights holders for the purpose of AI training.

Keywords

artificial intelligence, copyright, related rights, AI training, text and data mining

About the Author

A. A. Nikiforov

Moscow School of Social and Economic Sciences (MSSES); Yandex
Russian Federation

Artem A. Nikiforov — LL.M. (Russian School of Private Law), Lecturer; Senior Legal Counsel, Software, Technology, Brand, and Data Transactions Legal Support Group

3-5/1, Gazetny Lane, Moscow, Russia, 125009

16, Lev Tolstoy St., Moscow, Russia, 119021

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2686-9136 (Online)

Digital Law Journal
