CS+Law
Research Workshop
When: Third Friday of each month at noon Central Time (sometimes fourth Friday; next workshop: Tuesday, September 26, 1:00 p.m.)
What: First 90 minutes: Two presentations of CS+Law works in progress or new papers with open Q&A. Last 30 minutes: Networking.
Where: Zoom
Who: CS+Law faculty, postdocs, PhD students, and other students who (1) are enrolled in or have completed a graduate degree in CS or Law and (2) engage in CS+Law research intended for publication.
A Steering Committee of CS+Law faculty from Berkeley, Boston U., U. Chicago, Cornell, Georgetown, MIT, North Carolina Central, Northwestern, Ohio State, Penn, Technion, and UCLA organizes the CS+Law Monthly Workshop. A different university serves as the chair for each monthly program and sets the agenda.
Why: The Steering Committee’s goals include building community, facilitating the exchange of ideas, and getting students involved. To accomplish this, we ask that participants commit to attending regularly.
Computer Science + Law is a rapidly growing area, and researchers in each field increasingly need to engage with the other discipline. For example, both fields produce significant research on the law and regulation of computation, the use of computation in legal systems and governments, and the representation of law and legal reasoning. Interdisciplinary collaborations between CS and Law researchers have also grown substantially. Our goal is to create a forum for the exchange of ideas in a collegial environment that builds community, fosters collaboration, and supports research that helps further develop CS+Law as a field.
Workshop 18: Tuesday, September 26, 1:00 to 3:00 p.m. Central (Chicago) Time
Please join us for our next CS+Law Research Workshop online on Tuesday, September 26, from 1:00 to 3:00 p.m. CT (Chicago time).
Workshop 18 organizer: Northwestern University (Jason Hartline and Dan Linna)
Please join us for this session if you are interested in generative AI and copyright.
Link to join on Zoom: will be circulated to the Google Group
Agenda:
5-minute presentation about ACM CS&Law Symposium - Christopher Yoo
20-minute presentation - Peter Henderson
10-minute Q&A
20-minute presentation - Rui-Jie Yew
10-minute Q&A
30-minute open Q&A about both presentations
10-minute Q&A about ACM CS&Law Symposium - Ran Canetti
15-minute open discussion about curating CS+Law resources
Presentation 1:
Foundation Models and Fair Use
Presenter: Peter Henderson, JD/PhD candidate in CS, Stanford University (incoming Assistant Professor, Princeton University)
Abstract:
Existing foundation models are trained on copyrighted material. Deploying these models can pose both legal and ethical risks when data creators fail to receive appropriate attribution or compensation. In the United States and several other countries, copyrighted content may be used to build foundation models without incurring liability due to the fair use doctrine. However, there is a caveat: If the model produces output that is similar to copyrighted data, particularly in scenarios that affect the market of that data, fair use may no longer apply to the output of the model. In this work, we emphasize that fair use is not guaranteed, and additional work may be necessary to keep model development and deployment squarely in the realm of fair use. First, we survey the potential risks of developing and deploying foundation models based on copyrighted content. We review relevant U.S. case law, drawing parallels to existing and potential applications for generating text, source code, and visual art. Experiments confirm that popular foundation models can generate content considerably similar to copyrighted material. Second, we discuss technical mitigations that can help foundation models stay in line with fair use. We argue that more research is needed to align mitigation strategies with the current state of the law. Lastly, we suggest that the law and technical mitigations should co-evolve. For example, coupled with other policy mechanisms, the law could more explicitly consider safe harbors when strong technical tools are used to mitigate infringement harms. This co-evolution may help strike a balance between intellectual property and innovation, which speaks to the original goal of fair use. But we emphasize that the strategies we describe here are not a panacea and more work is needed to develop policies that address the potential harms of foundation models.
Joint work with Xuechen Li, Dan Jurafsky, Tatsunori Hashimoto, Mark A. Lemley, and Percy Liang.
Presentation 2:
Break It Till You Make It: Limitations of Copyright Liability Under a Pretraining Paradigm of AI Development
Presenter: Rui-Jie Yew, PhD student in CS, Brown University
Abstract:
In this talk, I consider the impacts of a pre-training regime on the enforcement of copyright law for AI systems. I identify a gap between conceptualizations of the development process in the legal literature on copyright liability and the evolving landscape of deployed AI models. Specifically, proposed legal tests have assumed a tight integration between model training and model deployment: the ultimate purpose of a model plays a central role in determining whether a training procedure's use of copyrighted data infringes on the author's rights. In practice, modern systems are built and deployed under a pre-training paradigm: large models are trained for general-purpose applications and then specialized to different applications, often by third parties. This potentially creates an opportunity for developers of pre-trained models to avoid direct liability under these tests. As a result, I consider the role of copyright's secondary liability doctrine in the practical effect of copyright regulation on the development and deployment of AI systems. I draw from past secondary copyright liability litigation over other technologies to understand how AI companies may manage or attempt to limit their copyright liability in practice. Based on this, I conclude with a discussion of regulatory strategies to close these loopholes and propose duties of care for developers of ML models to evaluate and mitigate their models' present and downstream effects on the authors of copyrighted works used in training. This is joint work with Dylan Hadfield-Menell.
Join us to get meeting information
Join our group to get the agenda and Zoom information for each meeting and engage in the CS+Law discussion.
Interested in presenting?
Submit a proposed topic to present. We strongly encourage the presentation of works in progress, although we will also consider more polished and published projects.
2023-24 Series Schedule
Tuesday, September 26, 1:00 to 3:00 p.m. Central Time (Organizer: Northwestern)
Monday, October 23, 2:00 to 4:00 p.m. Central Time (Organizer: UCLA)
Friday, November 17, 1:00 to 3:00 p.m. Central Time (Organizer: Boston University)
Friday, December 15, 1:00 to 3:00 p.m. Central Time (Organizer: Penn)
Friday, January 19, 1:00 to 3:00 p.m. Central Time (Organizer: Georgetown)
Friday, February 16, 1:00 to 3:00 p.m. Central Time (Organizer: Berkeley)
Friday, March 22, 1:00 to 3:00 p.m. Central Time (Organizer: Cornell)
Friday, April 19, 1:00 to 3:00 p.m. Central Time (Organizer: Ohio State)
Friday, May 17, 1:00 to 3:00 p.m. Central Time (Organizer: Tel Aviv + Hebrew Universities)
Steering Committee
Ran Canetti (Boston U.)
Bryan Choi (Ohio State)
Aloni Cohen (U. Chicago)
April Dawson (North Carolina Central)
Dazza Greenwood (MIT)
James Grimmelmann (Cornell Tech)
Jason Hartline (Northwestern)
Dan Linna (Northwestern)
Paul Ohm (Georgetown)
Pamela Samuelson (Berkeley)
Inbal Talgam-Cohen (Technion - Israel Institute of Technology)
John Villasenor (UCLA)
Rebecca Wexler (Berkeley)
Christopher Yoo (Penn)
Background - CS+Law Monthly Workshop
Northwestern Professors Jason Hartline and Dan Linna convened an initial meeting of 21 CS+Law faculty from various universities on August 17, 2021, to propose a series of monthly CS+Law research workshops. Hartline and Linna sought volunteers to serve on a steering committee. Hartline, Linna, and their Northwestern colleagues provide the platform and administrative support for the series.