“Are we failing our artists? - The copyright infringement of generative AI and its questionable training data.”
- Twisha Chand Warwade
- Oct 2, 2024
Can you spot the difference? Of course. Can you tell which one is the original artwork titled “Starry Night” by Vincent Van Gogh? Let’s assume you can. The original is considered invaluable, but to simplify things, a price of $100 million is a good place to start. The derivative, modeled after the original, is on the other hand free or of negligible cost. To a rational consumer who isn’t a connoisseur of art, the derivative will do just fine.
This seems like a win-win situation. Why pay $100 million when you can get almost the same, if not better, for free?
The operative word being “seems”. Imagine if Van Gogh were alive and well, and worked hours on end (years even) to create this masterpiece. Then, a random stranger took the essence of his creative genius, replicated it, circulated it, and argued it was for the betterment of society, all while Van Gogh himself sat there penniless and his actual work collected dust. This irony is exactly what is marring present-day artists’ livelihoods, as generative AI clones and publicizes their work without consent.
AI-generated art of any form has been a hot topic for quite a while. Artists’ ability to earn commissions is hampered as their works are replicated by generative AI tools such as DALL-E and Stable Diffusion. The legal and ethical implications of this are what sparked the debate and continue to fuel it. To understand this better, we must understand the complexities of copyright infringement and the workings of generative AI.
Whenever an artist creates an original work, whether literary, performing arts, music, or visual art, they rely on copyright to ensure they profit from their creation. Copyright guarantees exclusive rights over and use of a work for a set period (in many jurisdictions, the creator’s lifetime plus 70 years). No third party can replicate works under copyright unless consensual licensing agreements have been made or the copyright holder sells their rights. It also covers moral rights. For example, authors have a right of integrity, which means they can prevent change and distortion of their work. Thus, copyright is, in theory, an airtight legal mechanism that safeguards the creativity and commissions of an artist.
Unfortunately, being perfect on paper does not make it ideal in reality. Why? Because copyright infringement occurs. Copyright infringement is the use or reproduction of works under copyright without the explicit permission of the copyright holder. While it should be easy at this juncture to invoke copyright laws and sue the other party, proving copyright infringement is no small task, making it an air bubble or loophole in the otherwise airtight structure.
There can be, of course, reasons offered to justify copyright infringement, such as a lack of supply of authorized work or the high price of authorized work, but one must remember that it comes at the cost of the livelihood of the artists themselves.
Now that we’ve established that copyright infringement is harmful to artists, we must also see how generative AI seemingly violates copyright law.
Generative AI relies on deep learning foundation models and is developed in three stages: training; tuning; and generation, evaluation, and retuning. In the first stage, the foundation model is trained on colossal amounts (literally terabytes) of raw data, or data lakes, sourced from the internet or otherwise. It then evaluates the patterns, relationships, and sequences in that data to form a neural network of parameters. To break it down, it works on the principle of using patterns to predict the next step in a sequence, until it gets closer and closer to the accurate answer. In the second stage, the generalized foundation model is tuned to create a certain type of content. In the third stage, the AI is finely and continuously retuned on the basis of the content it is generating, as developers train it to get more accurate.
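To make the “predict the next step in a sequence” idea concrete, here is a deliberately tiny sketch in Python. It is not how DALL-E or Stable Diffusion work (those are far larger neural networks), and the function names and three-sentence corpus are made up for illustration. It only shows the basic principle: a model can only predict from patterns it has seen in its training data.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical training data, standing in for a "data lake".
corpus = [
    "the starry night glows",
    "the starry night swirls",
    "the starry sky glows",
]
model = train_bigram_model(corpus)

print(predict_next(model, "starry"))  # night
print(predict_next(model, "cubism"))  # None
```

Everything this toy model “knows” comes straight from the sentences it was trained on, and it has nothing to say about a word it never saw. Real generative models are vastly more sophisticated, but they likewise depend on what is in their training data.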
The first stage, training, is what’s causing strife for artists. The data lakes include copyrighted or otherwise protected material, and since the generative AI trains on whatever it is given, it does not understand the nature of the work and incorporates it into its own generation process regardless. It is a distorted version of the student becoming the teacher.
From here, the murky waters of copyright law engulf the issue. While it should be easy to conclude that AI is violating copyright laws, it is not so. AI developers can argue that there is no violation because a piece of art that is sufficiently different from its source material is considered “transformative,” and therefore not an infringement. They can also use the doctrine of fair use as a shield, which allows for work to be used without the owner’s permission “for purposes such as criticism (including satire), comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research.”
The issue lies in the fact that when copyright laws were created, the use of AI was not factored in. Thus, decisive judgments have not yet been passed that declare the use of protected work by AI an outright violation of copyright laws. There are multiple cases, such as Andy Warhol Foundation v. Goldsmith and Andersen v. Stability AI, that may prove critical in the future.
While I cannot comment on the legal implications, the ethical implications of generative AI’s use of protected works bother me. Oscar Wilde wrote, “Life imitates art far more than art imitates life.” This quote is near and dear to my heart and applies here as well. Art is not art for its technique. Art is a subjective interpretation. Art is expression. The cause of expression is experience, which AI will never have. At the end of the day, AI is not human. It can only predict patterns and is not capable of true creation. Thus comes the fundamental difference between inspiration and replication. Humans are capable of feeling, interpreting, and interacting with art. Of adding on to someone’s work, taking something from another’s paintings, adding their very own touch to art. Most of the time, these processes are so intrinsically human that we do them without thinking. Humans are capable of creating. AI is only capable of replicating or predicting.
Thus, I believe that generative AI does not actually generate art to begin with, but rather produces a soulless recycling or paraphrasing of ideas and creations. Art, like us, has a soul, often a reflection of the artist’s soul, and actively interacts with us the same way we interact with it. The multidimensional, almost sacrosanct ability that allows us to do so is irreplaceable, no matter how refined the algorithm. So any AI-generated work is, at best, a diluted mirage of the original and should not reap the benefits of the original. It is very much an ethical violation of art.









