FAQs on the Authors Guild’s Positions and Advocacy Around Generative AI


The Authors Guild believes that it is inherently unfair to use and incorporate books, articles, and other copyrighted works in the fabric of AI technologies without the authors’ consent, compensation, or credit. While generative AI technologies capable of generating text and other content can be useful tools for writers, guardrails around their development and use are urgently needed to protect the writing profession and literary culture. There is a serious risk of market dilution from machine-generated works that can be cheaply mass-produced and will inevitably lower the value of human-authored works. We need to safeguard the incentives that fuel the creation of a rich and diverse literary culture, so vital to our democracy that they are inscribed in the Constitution.


The Authors Guild is specifically lobbying for laws, regulations, and policies regarding:

  • Consent: Require permission for the use of writers’ works in generative AI;
  • Compensation: Compensate authors who wish to allow their works to be used in the “training” of generative AI;
  • Transparency: Create transparency obligations for AI developers to disclose what works they use to “train” their AI;
  • Use in outputs: Require permission and establish compensation for authors when their works are used in outputs, or when their names or identities or titles of their works are used in prompts—whether through adding a new economic right under copyright law or as a sui generis right, and/or through a broad, well-articulated federal right of publicity law;
  • Label AI-generated content: Require authors, publishers, platforms, and marketplaces to label AI-generated works and otherwise identify when a significant portion of a written work has been generated by AI.

In addition to lobbying for these legal and regulatory changes, the Authors Guild strongly opposes efforts to deem AI-generated content protectible under copyright, or the creation of a limited sui generis right. The Authors Guild believes that giving AI-generated content protection under existing copyright rules or under a new right exacerbates the threat of AI-generated content flooding the markets.

Note: We use the term “train” to refer to AI developers’ use of pre-existing works in developing their AI only because it has become the standard shorthand. That said, we have reservations about the semantics of the word because it makes the use of works sound like a one-time use and serves to anthropomorphize machines—as if they are simply “reading” or “observing” texts and other works. The reality is that the works actually are used to build the AI and remain part of its fabric. There is no generative AI without the material—mostly in-copyright works—that AI is so-called “trained” on.


It is not efficient or even practicable for AI companies to seek licenses from each individual author who owns the rights to their works. So, the Authors Guild is proposing to create a collective license whereby a collective management organization (CMO) would license out rights on behalf of authors, negotiate fees with the AI companies, and then distribute those payments to authors who register with the CMO. These licenses could cover past uses of books, articles, and other works in AI systems, as well as future uses. The latter would not be licensed without a specific opt-in from the author or other rights holder.

Collective licensing is an established concept and an effective means of paying creators and publishers where licensing creates market inefficiencies. For many years now, the Authors Registry and the Authors Coalition of America have distributed royalties received from foreign collective licenses to U.S. authors.


One or more collective management organizations (CMOs) will have to be established, or an existing one—e.g., the Authors Registry or the Copyright Clearance Center (CCC)—will have to be augmented, for purposes of negotiating licenses with AI companies and distributing the amounts collected to writers.

The CMO would demand fair and reasonable compensation, assessed on an annual basis, for the use of texts whose rights are owned by the writers. It would seek compensation for past use as well as future use.


The CMO would represent the interests only of “professional” writers and only for the use of books and articles—not, say, for every social media post, etc., that anyone has written. Proof of publication or membership in any professional organization for writers, for instance, could make one eligible for representation by the CMO.


Licenses for uses that have already occurred

Once works have been used to “train” AI, they become part of the fabric of the AI and cannot be effectively removed. For the past “training” of what are called the “foundational” models—such as OpenAI’s GPT, Google’s LaMDA and PaLM, and Meta’s LLaMA—the CMO will seek compensation on an annual basis going forward for as long as those foundational models are in use. “Training” AI on pre-existing works is not a one-time event; the works are constantly being used to continue to “train” the AI. As such, payment should continue as long as the AI model is in use.

A sum that represents the fair value of all of the books and articles ingested would have to be negotiated and a blanket license granted. The Guild will seek professional expertise to help determine the annual value.

Because works were indiscriminately ingested for the existing foundational models, the Guild seeks the ability for the CMO to grant licenses on an opt-out basis—meaning the CMO can grant the license and obtain compensation on behalf of all professional writers, and those who do not wish to participate can opt out. Robust notice provisions to allow authors and other rights owners to opt out will have to be established.

Extended collective licenses

Legislation would be required to allow qualifying CMOs to grant licenses on an opt-out basis. These are referred to as extended collective licenses (ECLs) and allow qualifying CMOs to negotiate licenses for a specific use (e.g., AI “training”) on behalf of a specific class of copyright owners (e.g., authors and journalists), whether or not they are existing members of the organization, but they must provide an effective mechanism for non-members to opt out of the licenses at any time.

ECLs are intended for mass use, where users cannot negotiate directly with all individual copyright holders due to their sheer numbers. CMOs permitted to use ECLs for AI “training” could be subject to authorization by the U.S. Copyright Office and would have to meet certain requirements.

For instance, the CMO would be required to show that it represents a broad group of impacted rightsholders, that its membership consents to an ECL, and that it adheres to sufficient standards of transparency, accountability, and good governance. Once authorized, a CMO would be entitled to negotiate royalty rates and terms with AI developers on behalf of the class. The benefit of an ECL is that, from the perspective of an AI company, it allows one-stop shopping for licensing books and other text-based works; AI companies would thus be incentivized to use the ECL rather than make unauthorized uses, which carry the risk of litigation and its attendant costs. Negotiating with AI companies will be much easier under an ECL system.

To be clear, an ECL is not a compulsory license like some of the licenses created by statute that have no opt-out provisions and under which rates are set by the governmental agency administering the license.

Licenses for future uses

The CMO would also offer licenses to AI companies for text-based works moving forward. This could be done on an opt-out/ECL basis, as described above, or an opt-in basis—meaning only on behalf of those who specifically authorize the CMO to offer licenses on their behalf.

Opt-in collective licenses

A CMO could provide licenses to AI companies on behalf of writers who affirmatively opt in to the collective license by authorizing the CMO to license out AI “training” rights for specific uses. The benefit of this approach is that no legislation providing for an extended collective license would be needed. The drawback is that the CMO could license only the works of those who opted in, so the license may be less attractive to AI companies seeking en masse permission and could be difficult to implement as a practical matter. To make it more attractive, the CMO conceivably could create a database of works for targeted “training.”


We believe that text-generating AI technologies would not exist without the works they were “trained on,” and we are determined to get compensation that reflects these contributions, not just pennies on the dollar. It is important to bear in mind that, unlike Spotify, the license will be for a subsidiary (not primary) use and that any fee charged to the AI companies will be divvied up among all participating authors and publishers. The amount should be significant enough that all authors whose works were used feel the benefit from it. We will hire experts to value the use of the works in the AI systems and the CMO will negotiate rates accordingly.


Whether the collective license is an opt-in or opt-out one, eligible writers who register with the CMO or a participating organization would receive a distribution based on algorithms that take into account the number of works published, the length of those works, and any available sales data. The board of the CMO would be responsible for authorizing distributions and the board or membership (which includes all authors and other rights owners who sign up) would approve the factors for allocation.

A certain amount would be set aside for those who have not yet registered but do so later. The amount set aside would be calculated based on the estimated number of eligible writers in the country and the number registered.
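By way of illustration, the pro-rata distribution and set-aside described above could be sketched as follows. This is a minimal sketch under stated assumptions: the `distribute` function, its weighting factors, and the set-aside formula are hypothetical illustrations, not the actual allocation rules a CMO board would adopt.

```python
# Hypothetical sketch of a CMO distribution: split a license pool among
# registered writers by works published, length, and sales data, while
# reserving a share for eligible writers who have not yet registered.
# All weights and formulas here are illustrative assumptions.

def distribute(pool, registered, estimated_total_writers):
    """Return (payouts, reserve) for a license pool.

    `registered` is a list of per-writer dicts with the number of works
    published ("works"), total word count ("total_words"), and, where
    available, unit sales ("sales").
    """
    # Set aside funds in proportion to eligible writers not yet registered.
    unregistered = max(estimated_total_writers - len(registered), 0)
    reserve = pool * unregistered / estimated_total_writers
    payable = pool - reserve

    # Score each writer on the factors the board might approve:
    # works published, length of those works, and any sales data.
    def score(writer):
        return (writer["works"] * 1.0
                + writer["total_words"] / 100_000 * 0.5
                + writer.get("sales", 0) / 10_000 * 0.5)

    scores = [score(w) for w in registered]
    total = sum(scores) or 1.0  # guard against an empty registry
    payouts = [payable * s / total for s in scores]
    return payouts, reserve
```

Under this sketch, a writer with more published works, longer works, or stronger sales receives a proportionally larger share, and the reserve shrinks as more eligible writers register.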


In addition to compensating writers for the use of their works in “training” AI, AI companies should prevent the use of creators’ names, their writings (or portions thereof), or the titles of their works in prompts without the creators’ express permission. And where writers do permit this use, they need to be compensated.

The CMOs could license those rights and collect and distribute the fees on behalf of writers who wish to permit the use and be compensated, giving writers a means to earn additional income.

It would be helpful to create a new economic right, whether under copyright law or as a sui generis right, to ensure that AI companies obtain permissions for these kinds of uses. A well-articulated federal right of publicity law—for which the Authors Guild is lobbying—would also help.


Under an opt-in collective licensing system, authors could simply decline to opt in. Under an ECL system, robust notice and opt-out mechanisms would allow authors to remove their works from the license at any time. Notice methods, similar to those used in class-action lawsuits, will be used to ensure that all authors covered by the license are informed and have the opportunity to opt out. Opting out should be made very simple, such as through a publicly posted and easily located online form.


Once a work is used to “train” AI, it is part of the program and cannot simply be extracted. So, opting out of training that has already occurred is not really an option, nor are courts or Congress likely to tell the foundational LLM model owners that they need to start over. That is why the Authors Guild is seeking compensation for the use of authors’ works in training existing AI. We strongly believe that permission should be obtained first, and do not want to condone the use that has already occurred, but we also know that we can’t wind back the clock. We want to make sure that authors are paid for use of their works in existing LLM models.


On June 28, 2023, authors Paul Tremblay and Mona Awad filed a class action against OpenAI, asserting direct and vicarious copyright infringement as well as other claims, including those for negligence and privacy violations. On July 7, 2023, a second class action suit was filed by Richard Kadrey, Sarah Silverman, and Christopher Golden. The law firm representing the writers in both cases also brought a class action on behalf of visual artists against Midjourney and Stability AI.

The authors’ complaints allege that OpenAI used their and others’ copyrighted works, without consent, to “train” the AI language models ChatGPT, GPT-3, and GPT-4, and that “[b]ecause the OpenAI Language Models cannot function without the expressive information extracted from [their] works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works, made without [their] permission and in violation of [the authors’] exclusive rights under the Copyright Act.”

The complaint further alleges that the data sets used in “training” the language models comprised pirated copies and were likely obtained from “shadow libraries” such as Library Genesis, Z-Library, Sci-Hub, and Bibliotik, as well as from torrent sites, and that the capacity to generate accurate summaries indicates the models retain knowledge of particular works in the “training” dataset. The plaintiffs are seeking damages and injunctive relief on behalf of themselves and all U.S.-based copyright owners whose works were “used as training data for the OpenAI Language Models.”

The copyright infringement claims—assuming the lawsuits move forward past the pleading and class certification issues—will likely come down to whether OpenAI’s uses of the works are protected by the fair use doctrine. The court will consider OpenAI’s use of the copyrighted works in light of the four fair use factors: 1) the purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes, and whether the use adds something new with a further purpose or different character without creating a substitute for the original use of the work; 2) the nature of the copyrighted works; 3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and 4) effect of the use upon the potential market for or value of the copyrighted work.


While the Authors Guild fully supports the lawsuits and authors using litigation as a means to assert their rights, the Guild is focusing its present efforts on legislative changes and direct negotiation for a couple of reasons:

  1. Class-action lawsuits can take years to be adjudicated, and there is a need for immediate relief. The procedures for simply certifying the class can take time. For example, when the Guild and publishers brought the class-action lawsuit against Google in 2005, it took seven years for the class certification to be decided (and ultimately rejected) and for the merits of the case to be heard. Moreover, even if the class wins on the merits, it can take time for damages to be distributed. The class action In re Literary Works in Electronic Databases Copyright Litigation, for instance, was initiated in 2001, but the writers did not get compensated until 2018, seventeen years later. Given the rapid pace at which generative AI technologies are proliferating and being adopted, we need a solution now, not in ten years, when non-payment has already become the de facto model.
  2. While the Authors Guild does not believe that AI companies’ use of copyrighted works is a fair use, the fair use doctrine includes a great deal of subjectivity, and there is no certainty as to the outcome of any particular case. The Authors Guild firmly believes that copying and using works without permission to create technologies that can produce commercial substitutes falls well outside of fair use, but a decision to the contrary would give AI companies carte blanche to make use of authors’ works without any legal obligation to ask for permission or offer compensation.

This does not mean that the Authors Guild has foreclosed class-action litigation as an option. The Guild’s approach complements ongoing and future litigation to assert authors’ rights against unauthorized uses of their works in AI systems and define the limits of fair use.


The risks to the writing profession from generative AI technologies require a multi-faceted response. Collective licensing does not address all of these risks, but it does give us a starting point for giving authors control over uses of their works and putting money back into their pockets.

As part of its advocacy, as noted above, the Authors Guild is also asking Congress to:

  • Require AI-generated content to be labeled as such. This will prevent AI-generated content from being passed off as human-written; consumers have the right to know.
  • Require AI companies to disclose what copyrighted materials they used to “train” their AI.
  • Create a well-articulated federal right of publicity law that would give creators the right to sue for unauthorized use of their names or other identifying information in prompts and AI outputs.

In addition to our advocacy and lobbying around AI, the Authors Guild is doing the following:

  • Contract clauses: The Authors Guild has released new contract clauses that aim to prevent the use of books in “training” generative AI without an author’s express permission. In addition, the Authors Guild’s new clauses require publishers to get an author’s written consent before using AI-generated book translations, audiobook narration, or cover art. The Guild’s Model Trade Book Contract and Literary Translation Model Contract have been updated to include the new clauses, and the Guild is encouraging authors and agents to ask for their inclusion in contracts with publishers, and urging publishers to adopt the clauses.
  • Educational programs: The Authors Guild is expanding its educational programs not only to cover issues created by generative AI technologies but also to equip authors with the skills to utilize and take advantage of these technologies. In upcoming webinars, it will show how authors can incorporate generative AI technologies into their writing process, effectively and ethically.
  • Best practices for authors and publishers: Some issues created by generative AI can’t be addressed through legislation or regulation—these require a commitment among authors and publishers to use the technology responsibly. It is incumbent on all members of the writing and publishing industry to be thoughtful and transparent about using generative AI. The Authors Guild is working on establishing ethical guidelines and best practices to help authors and publishers navigate the terrain.