Human Conversations About AI: Protecting IP in the World of Generative AI
February 22, 2024
Human Conversations About AI is an O’Melveny thought leadership series, where O’Melveny lawyers ponder hypothetical—“artificial”—scenarios likely to come before the courts. This week, we explore what happens when generative AI meets intellectual property.
Hypothetical: To increase efficiency, a software engineer for Company B uses a generative AI tool to create a portion of the back-end code for a program that Company B develops and licenses to Company A at a set price. IP ownership is obviously an important part of Company B’s business model. What are some of the risks?
Analysis: With the growth of generative AI and its prevalence in the workplace, it will become increasingly common for employees to use an AI-based tool as a starting point for their work. This hypothetical highlights several important risks, including (1) the challenge of preserving IP ownership when a software program is partially generated by AI; and (2) the risk that a generative AI tool’s training data or output may include copyrighted works.
I. Preserving IP Ownership in the Age of Generative AI
An employee who uses a generative AI tool to build the foundation of a creative work may hamper the company’s ability to obtain copyright protection for that work. Whether protection is available will likely depend on the proportion of the code that is human-generated versus AI-generated.
At least one court has already stated that a work that is entirely AI-generated is not subject to copyright protection.1 The reason: copyright protects only works of authorship, which requires that a human being created the work. It remains an open question, however, how much human input is required for a partly AI-generated work to qualify for copyright protection. In Thaler v. Perlmutter, an individual whose copyright application for AI-generated artwork (generated by an AI tool that he developed) was denied brought an action against the United States Copyright Office (“USCO”) and its director. In granting the government’s cross-motion for summary judgment, the court stated that whether the entirely AI-generated artwork (with no human input in the art itself) was subject to copyright was not a complex question, but it also noted that whether more limited AI involvement in the creation of a work might negate copyright protection is more difficult to answer:
“The increased attenuation of human creativity from the actual generation of the final work will prompt challenging questions regarding how much human input is necessary to qualify the user of an AI system as an ‘author’ of a generated work, the scope of the protection obtained over the resultant image, how to assess the originality of AI-generated works where the systems may have been trained on unknown pre-existing works, how copyright might best be used to incentivize creative works involving AI, and more.”2
The USCO has grappled with these same issues. In a March 2023 public guidance document on copyright registration of AI-generated works, the USCO explained that, in considering whether to register a work, the USCO asks “whether the ‘work’ is basically one of human authorship, with the computer [or other device] merely being an assisting instrument, or whether the traditional elements of authorship in the work (literary, artistic, or musical expression or elements of selection, arrangement, etc.) were actually conceived and executed not by [a person] but by a machine.”3 If all the work’s “traditional elements of authorship” were AI-generated, the USCO will not register it.4 If a work containing AI-generated material contains sufficient human authorship, the USCO will register the human’s contributions, but the applicant must disclose and exclude AI-generated content that is “more than de minimis” from the application.5
In September 2022, the USCO granted copyright registration for a comic book that was generated with the help of a text-to-image AI program, Midjourney.6 In late 2022, the USCO was reported to be reviewing its decision to register the copyright, and the artist who created the comic stated that the USCO had asked her “to provide details of [her] process to show that there was substantial human involvement in the process of creation of this graphic novel.”7 This suggests that while work generated entirely by AI cannot be copyrighted, works that incorporate AI-generated outputs as well as “substantial human involvement” may be subject to copyright protection.8 The USCO ultimately decided not to register the comic book because, while the human author created the comic book’s text as well as “the selection, coordination, and arrangement of the [book’s] written and visual elements,” the images themselves were “not the product of human authorship” because they were generated by AI.9 Because the comic book contained “more than a de minimis amount of content” generated by AI, and the author was unwilling to disclaim that AI-generated material, the USCO determined that the comic book could not be registered.
Similarly, on February 13, 2024, the United States Patent and Trademark Office (“USPTO”) issued guidance on AI-assisted inventions, stating that while AI-assisted inventions are “not categorically unpatentable,” patent applications must name a human who “significantly contributed” to the invention.10
Turning back to the hypothetical, what proportion of human-generated code would render a work subject to copyright protection? The more human involvement, the better the chances of copyright protection. USCO guidance suggests that the human-generated portions of the code would be protected by copyright, but any AI-generated portion of the code that is more than “de minimis” would not. Determining how much human involvement went into developing the code will require keeping thorough records of what elements of the code were generated by human beings and assessing whether those elements themselves are copyrightable.
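For illustration only, below is a minimal sketch (in Python) of the kind of provenance log a development team might keep to document which portions of a codebase were human-written and which were AI-generated. The record structure, field names, and example entries are hypothetical assumptions for the example and are not drawn from any USCO requirement.

```python
# Hypothetical sketch of a provenance log a team might keep alongside a
# codebase, recording which portions were human-written versus AI-generated.
# All names (ProvenanceRecord, summarize, file paths) are illustrative only.

from dataclasses import dataclass
from datetime import date

@dataclass
class ProvenanceRecord:
    path: str        # file or module the record covers
    lines: int       # approximate number of lines covered
    origin: str      # "human", "ai", or "ai-edited-by-human"
    author: str      # engineer responsible for the contribution
    noted_on: date   # when the record was made
    notes: str = ""  # e.g., which AI tool was used and how the output was changed

def summarize(records: list[ProvenanceRecord]) -> dict[str, float]:
    """Return the share of logged lines attributed to each origin."""
    total = sum(r.lines for r in records) or 1
    shares: dict[str, float] = {}
    for r in records:
        shares[r.origin] = shares.get(r.origin, 0) + r.lines / total
    return shares

if __name__ == "__main__":
    log = [
        ProvenanceRecord("billing/invoice.py", 320, "human", "engineer_a", date(2024, 2, 1)),
        ProvenanceRecord("billing/export.py", 110, "ai", "engineer_a", date(2024, 2, 5),
                         notes="Generated with an internal AI assistant; unmodified."),
        ProvenanceRecord("billing/validate.py", 90, "ai-edited-by-human", "engineer_b",
                         date(2024, 2, 8), notes="AI draft, substantially rewritten."),
    ]
    print(summarize(log))  # e.g., {'human': 0.61..., 'ai': 0.21..., 'ai-edited-by-human': 0.17...}
```

A log of this kind would not itself establish copyrightability, but it would give counsel a factual record of the extent of human contribution if registration or litigation later requires one.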
II. Source Material for Generative AI—Risks of Copyright Infringement
Using generative AI as the foundation for a creative work can raise copyright concerns depending on the material used to train the AI tool. For example, if copyrighted works were used to train the tool, the new creative work generated by the tool may be subject to copyright-infringement claims.
Although the caselaw in this area is developing, some early rulings suggest that the use of generative AI might lead to copyright infringement, though plaintiffs may need to show that the output contains portions of the copyrighted material or is substantially similar to the copyrighted material.11 Whether that could lead to legal action against the users of generative AI tools—and not just the companies behind the generative AI tools themselves—is an open question.
For the purposes of this hypothetical, if Company B’s engineer incorporates generative AI output that includes a third party’s source code without making any meaningful changes, Company B may be subject to direct copyright infringement claims for unauthorized use of that third party’s code. The third party could also claim that Company B’s code is a “derivative work” of its code.12 This risk is particularly challenging to address because it is difficult to determine the source of the AI-generated code or where derivative work ends and new work begins. For this reason, it is important to review the AI tool’s terms of use and whether they include an indemnification against third-party IP claims resulting from the tool’s output.
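As one illustrative mitigation, a team might run a rough similarity check comparing AI-generated code against known third-party source files before release. The sketch below is a simplified, hypothetical example: the file paths, threshold, and line-matching approach are assumptions, and a match would only flag the code for human and legal review, not establish infringement.

```python
# Hypothetical sketch of a rough similarity check between AI-generated code
# and a known third-party source file. A high overlap of normalized lines is
# only a flag for further review. All file paths are illustrative.

from pathlib import Path

def normalized_lines(path: Path) -> set[str]:
    """Return the file's non-trivial lines, stripped of surrounding whitespace."""
    lines = (line.strip() for line in path.read_text().splitlines())
    return {line for line in lines if len(line) > 10}  # skip short/boilerplate lines

def overlap_ratio(candidate: Path, reference: Path) -> float:
    """Fraction of the candidate's lines that appear verbatim in the reference."""
    cand, ref = normalized_lines(candidate), normalized_lines(reference)
    return len(cand & ref) / len(cand) if cand else 0.0

if __name__ == "__main__":
    ai_file = Path("billing/export.py")               # hypothetical AI-drafted module
    third_party = Path("vendor/licensed_library.py")  # hypothetical known third-party code
    ratio = overlap_ratio(ai_file, third_party)
    if ratio > 0.2:  # arbitrary illustrative threshold
        print(f"{ratio:.0%} of lines match known third-party code; escalate for review.")
```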
Another open question is whether copyright liability can arise merely from using a generative AI tool that was trained on copyrighted works, even if the output itself is not infringing. Copyright infringement claims have been asserted against the providers of such tools on the theory that training the AI model required making intermediate copies of the copyrighted works without permission. It is not yet clear, however, whether a user of the AI tool would be liable for copyright infringement if the output itself is not infringing.
For example, Getty Images banned the sale of AI-generated artwork created using image-synthesis models (e.g., Stable Diffusion, Midjourney, DALL-E 2) because of copyright infringement concerns.13 Getty CEO Craig Peters specifically stated that the ban was in part due to “real concerns with respect to the copyright of outputs from these models” as well as “unaddressed rights issues with respect to the imagery, the image metadata and those individuals contained within the imagery[.]”14
III. Risk of Hallucinations
One risk of using generative AI is that it may generate inaccurate results, known as “hallucinations.” Large language models (“LLMs”) can generate text that is incorrect, nonsensical, or untethered from reality. Hallucinations stem in part from limited contextual understanding: an LLM transforms a prompt into an abstraction and produces a result, but not necessarily a correct one. Because LLMs predict text based on patterns in prior text rather than by verifying facts, they may generate plausible-sounding output that is not supported by any underlying data. And training data is drawn from publicly available information, which may, of course, be false.
Turning back to the hypothetical, it is important to take steps to ensure that AI-generated material used for creative work is accurate. It is prudent to run regular audits of the data to confirm the integrity of the creative work. For an AI tool that generates computer code, this means confirming that the code used to train the tool follows sound programming practices and is free from errors and bugs. As for the AI-generated code itself, hallucinations are generally less of a risk because they tend to be more readily detectable: they often cause errors or render the program inoperable.
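For illustration, the sketch below shows one simple form such an audit of AI-generated code might take: confirming that an AI-drafted module at least parses and that the project’s test suite passes before it is merged. The file paths and the use of pytest are assumptions for the example, not a prescribed workflow.

```python
# Hypothetical sketch of a lightweight review gate for AI-generated code:
# before an AI-drafted module is merged, confirm it is syntactically valid
# and that the unit tests pass. Paths and the pytest command are illustrative.

import ast
import subprocess
import sys
from pathlib import Path

def parses_cleanly(path: Path) -> bool:
    """Return True if the file is syntactically valid Python."""
    try:
        ast.parse(path.read_text(), filename=str(path))
        return True
    except SyntaxError as err:
        print(f"Syntax error in {path}: {err}", file=sys.stderr)
        return False

def tests_pass(test_dir: str) -> bool:
    """Run the project's test suite (assumes pytest is installed)."""
    result = subprocess.run([sys.executable, "-m", "pytest", test_dir])
    return result.returncode == 0

if __name__ == "__main__":
    candidate = Path("billing/export.py")  # hypothetical AI-drafted module
    if parses_cleanly(candidate) and tests_pass("tests/"):
        print("Basic audit passed; proceed to human code review.")
    else:
        print("Audit failed; do not merge without further human revision.")
```

Automated checks of this kind supplement, rather than replace, the human review and documentation discussed above.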
Conclusion
While there are significant benefits to using generative AI tools, there are also risks. Companies should approach the use of generative AI in the workplace with caution and put policies and guardrails in place to ensure its effective use. For example, if generative AI tools are used in the workplace for creative works, including computer code, a human should always be in the loop. Such work should also be documented (including the extent of human contribution to the work).
Companies should also consider using AI tools whose providers offer warranties and indemnification against copyright infringement based on AI-generated output. Because it is difficult to confirm whether AI-generated code contains third-party source code, these contractual provisions can help mitigate risk.
1See Thaler v. Perlmutter, Case No. 1:22-cv-1564-BAH, 8-14 (D.D.C. August 18, 2023) (stating that entirely AI-generated works were not subject to copyright protection, but leaving open the question of how much human input is required in a partly AI-generated work to render it subject to copyright).
2Id. at 13.
3See Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence, 88 Fed. Reg. 16,190, 16,192 (Mar. 16, 2023) (“AI Registration Guidance”) (quoting U.S. Copyright Office, Sixty-Eighth Annual Report of the Register of Copyrights for the Fiscal Year Ending June 30, 1965, 5 (1966)).
4Id.
5Id. at 16,192–193.
6See James Vincent, “The scary truth about AI copyright is nobody knows what will happen next,” The Verge (Nov. 15, 2022), https://www.theverge.com/23444685/generative-ai-copyright-infringement-legal-fair-use-training-data; Benj Edwards, “Artist receives first known US copyright registration for latent diffusion AI art,” Ars Technica (Sept. 22, 2022), https://arstechnica.com/information-technology/2022/09/artist-receives-first-known-us-copyright-registration-for-generative-ai-art/.
7Id.
8See Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 60 (1884) (holding that photographs were subject to copyright, despite being mechanically produced, where the human creator conceived of and designed the image and then used the camera to capture the image); Urantia Found. v. Kristen Maaherra, 114 F.3d 955, 957–59 (9th Cir. 1997) (holding that “some element of human creativity must have occurred in order for the Book to be copyrightable” because “it is not creations of divine beings that the copyright laws were intended to protect”).
9See Richard Lawler, “The US Copyright Office says you can’t copyright Midjourney AI-generated images,” The Verge (Feb. 22, 2023), https://www.theverge.com/2023/2/22/23611278/midjourney-ai-copyright-office-kristina-kashtanova; Eileen McDermott, “Copyright Office Denies Registration to Award-Winning Work Made with Midjourney,” IP Watchdog (Sept. 8, 2023), https://ipwatchdog.com/2023/09/08/copyright-office-denies-registration-award-winning-work-made-midjourney/id=166498/.
10Inventorship Guidance for AI-Assisted Inventions, 89 Fed. Reg. 10,043 (Feb. 13, 2024).
11See, e.g., Doe 1 v. GitHub, Inc., Case No. 4:22-cv-06823 (N.D. Cal. May 11, 2023 & Jan. 22, 2024) (denying GitHub and OpenAI’s motion to dismiss the complaint in part on the grounds that plaintiff coders pleaded sufficient facts to support their Digital Millennium Copyright Act (“DMCA”) claim, where plaintiffs alleged that the defendants had used their copyrighted materials in their generative AI tool without proper attribution, copyright notices, or license terms; granting the motion to dismiss the amended complaint in part because the allegations showed that copyright management information was not removed or altered from an identical copy of a copyrighted work, as required for a DMCA claim); Andersen v. Stability AI, Inc., No. 23-cv-00201 (N.D. Cal. Oct. 30, 2023) (dismissing, inter alia, direct and vicarious copyright infringement claims and claims under Section 1202 of the DMCA on the grounds that (1) “it is simply not plausible” that every output constitutes a derivative work “absent ‘substantial similarity’ type allegations,” and (2) “[i]n order to state [a] claim [under Section 1202 of the DMCA], each plaintiff must identify the exact type of [copyright management information (“CMI”)] included” in their works and “allege plausible facts” on how that CMI was allegedly removed or altered); Kadrey v. Meta Platforms, Inc., No. 23-cv-03417 (N.D. Cal. Nov. 20, 2023) (dismissing, inter alia, vicarious copyright infringement and DMCA claims on the grounds that plaintiffs failed to plead that the challenged outputs were “substantially similar” to their books and no facts supported the allegation that the AI language models in question distributed their books, rejecting plaintiffs’ contention that every output is an infringing derivative work, and rejecting plaintiffs’ contention that the AI language models were themselves infringing derivative works as “nonsensical”); Tremblay v. OpenAI, Inc., Case No. 3:23-cv-03223 (N.D. Cal.) (granting in part and denying in part OpenAI’s motion to dismiss claims by authors of copyrighted books in a putative class action alleging that OpenAI unlawfully used their copyrighted books as training material for ChatGPT without their consent and profited from the use of the copyrighted materials); Silverman v. OpenAI, Inc., Case No. 3:23-cv-03416 (N.D. Cal.) (same).
12A "derivative work" is a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted. 17 U.S.C. §101.
13See James Vincent, “Getty Images bans AI-generated content over fears of legal challenges,” The Verge (Sept. 21, 2022), https://www.theverge.com/2022/9/21/23364696/getty-images-ai-ban-generated-artwork-illustration-copyright.
14Id.
This memorandum is a summary for general information and discussion only and may be considered an advertisement for certain purposes. It is not a full analysis of the matters presented, may not be relied upon as legal advice, and does not purport to represent the views of our clients or the Firm. Nexus U. Sea, an O’Melveny partner licensed to practice law in New York and New Jersey; Mark Liang, an O'Melveny partner licensed to practice law in California; Scott W. Pink, an O’Melveny special counsel licensed to practice law in California and Illinois; Amy R. Lucas, an O'Melveny partner licensed to practice law in California; Marc J. Pensabene, an O'Melveny partner licensed to practice law in New York; Megan K. Smith, an O'Melveny partner licensed to practice law in California, New York, and Massachusetts; Coke Morgan Stewart, an O'Melveny senior counsel licensed to practice law in the District of Columbia and Virginia; and Laura K. Kaufmann, an O'Melveny associate licensed to practice law in California, contributed to the content of this newsletter. The views expressed in this newsletter are the views of the authors except as otherwise noted.
© 2024 O’Melveny & Myers LLP. All Rights Reserved. Portions of this communication may contain attorney advertising. Prior results do not guarantee a similar outcome. Please direct all inquiries regarding New York’s Rules of Professional Conduct to O’Melveny & Myers LLP, Times Square Tower, 7 Times Square, New York, NY, 10036, T: +1 212 326 2000.