Workshop 10: Friday, October 28, 2022, Organized by Cornell University (Professor James Grimmelmann)
Lizzie Kumar (PhD candidate at Brown University), Equalizing Credit Opportunity in Algorithms: Aligning Algorithmic Fairness Research with U.S. Fair Lending Regulation: Credit is an essential component of financial wellbeing in America, and unequal access to it is a large factor in the economic disparities between demographic groups that exist today. Machine learning algorithms, sometimes trained on alternative data, are now increasingly being used to determine access to credit. Yet research has shown that machine learning can encode many different versions of "unfairness," raising the concern that banks and other financial institutions could, potentially unwittingly, engage in illegal discrimination through the use of this technology. In the US, there are laws in place to prevent discrimination in lending, as well as agencies charged with enforcing them. However, conversations around fair credit models in computer science and in policy are often misaligned: fair machine learning research often lacks legal and practical considerations specific to existing fair lending policy, and regulators have yet to issue new guidance on how, if at all, credit risk models should incorporate practices and techniques from the research community. This paper aims to better align these sides of the conversation. We describe the current state of credit discrimination regulation in the United States, contextualize results from fair ML research to identify the specific fairness concerns raised by the use of machine learning in lending, and discuss regulatory opportunities to address these concerns. Full paper available at: https://assets-global.website-files.com/6230fe4706acf355d38b2d54/62e02dd4dccb7c2ee30bfd56_Algorithmic_Fairness_and_Fair_Lending_Law.pdf
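To make concrete what "many different versions of unfairness" can mean in a lending setting, the following minimal sketch (an illustration on synthetic data, not drawn from the paper) computes two standard fairness metrics for a hypothetical credit model: the demographic parity gap and the equal opportunity (true positive rate) gap.

```python
# Minimal sketch on synthetic data (not from the paper): two common fairness
# metrics for a hypothetical credit model's decisions.
import numpy as np

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)     # hypothetical protected attribute
repaid = rng.integers(0, 2, size=1000)    # hypothetical ground truth
approved = rng.integers(0, 2, size=1000)  # hypothetical model decisions

# Demographic parity: approval rates should match across groups.
rates = [approved[group == g].mean() for g in (0, 1)]
print("demographic parity gap:", abs(rates[0] - rates[1]))

# Equal opportunity: approval rates among applicants who in fact repaid
# should match across groups.
tprs = [approved[(group == g) & (repaid == 1)].mean() for g in (0, 1)]
print("equal opportunity gap:", abs(tprs[0] - tprs[1]))
```

A model can satisfy one of these criteria while violating the other, which is one reason the research and regulatory conversations are hard to align.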
Sandra Wachter (University of Oxford), The Theory of Artificial Immutability: Protecting Algorithmic Groups under Anti-Discrimination Law: Artificial intelligence is increasingly used to make life-changing decisions, including about who is successful with their job application and who gets into university. To do this, AI often creates groups that haven’t previously been used by humans. Many of these groups are not covered by non-discrimination law (e.g., ‘dog owners’ or ‘sad teens’), and some of them are even incomprehensible to humans (e.g., people classified by how fast they scroll through a page or by which browser they use).
This is important because decisions based on algorithmic groups can be harmful. If a loan applicant scrolls through the page quickly or types only in lowercase when filling out the form, their application is more likely to be rejected. If a job applicant uses a browser such as Microsoft's Internet Explorer or Safari instead of Chrome or Firefox, they are less likely to be successful. Non-discrimination law aims to protect against harms of just this type, for example by guaranteeing equal access to employment, goods, and services, but it has never protected "fast scrollers" or "Safari users". Granting these algorithmic groups protection will be challenging because the European Court of Justice has historically been reluctant to extend the law to cover new groups.
This paper argues that algorithmic groups should be protected by non-discrimination law and shows how this could be achieved. Full paper available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4099100
Workshop 9: Friday, September 23, 2022, Organized by Northwestern University (Professors Jason Hartline and Dan Linna)
Aileen Nielsen (ETH Zurich's Center for Law and Economics; co-authors: Laura Skylaki, Milda Norkute, and Alexander Stremitzer), Building a Better Lawyer - Machine Guidance Can Make Legal Work Faster and Fairer: Making lawyers work better with technology is an important and ever-moving target in the development of legal technologies. Thanks to new digital technologies, lawyers can do legal research and writing far more effectively today than just a few decades ago. But, to date, most assistive technology has been limited to legal search capabilities, with attorney users of these technologies executing relatively confined instructions in a narrow task. Now, a new breed of legal tools offers guidance rather than information retrieval. This rapid expansion in the range of tasks for which a machine can offer competent guidance on legal work creates new opportunities for human-machine cooperation to improve the administration of law, but also new risks that machine guidance may bias the practice of law in undesirable ways. We present a randomized controlled study that tackles the question of how machine guidance influences the quality of legal work. We look both at the quality of the procedure by which work is carried out and at the quality of the outputs themselves. Our results show that a legal AI tool can make lawyers faster and fairer without otherwise influencing aggregate measures of work quality. On the other hand, we identify some distributional effects of the machine guidance that raise concerns about its impact on human work quality. We thus provide experimental evidence that legal tools can improve objectively assessed performance indicators (efficiency, fairness) but also raise questions about how the quality of legal work should be defined and regulated. In addition to these results, we furnish an example methodology for how organizations could begin to assess legal AI tools to ensure their appropriate and responsible deployment.
Liren Shan (Theory Group at Northwestern University; co-authors: Jason D. Hartline, Daniel W. Linna Jr., and Alex Tang), Algorithmic Learning Foundations for Common Law: This paper looks at a common law legal system as a learning algorithm, models specific features of legal proceedings, and asks whether this system learns efficiently. A particular feature of our model is that it explicitly views various aspects of court proceedings as learning algorithms. This viewpoint makes it possible to show directly that when the costs of going to court are not commensurate with the benefits of going to court, there is a failure of learning and inaccurate outcomes will persist in cases that settle. Specifically, cases are brought to court at an insufficient rate. On the other hand, when individuals can be compelled or incentivized to bring their cases to court, the system can learn and inaccuracy vanishes over time. (Preprint on arXiv)
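The flavor of the learning failure can be seen in a toy simulation (a sketch of the stated claim, not the authors' formal model): when litigation costs exceed the stakes, every case settles, so the court never gets the chance to correct an inaccurate rule.

```python
# Toy illustration (not the authors' formal model): an inaccurate rule is only
# corrected when a case actually reaches court; parties settle whenever the
# cost of litigating exceeds the amount at stake.
import random

def rounds_until_correction(court_cost, stakes, max_rounds=100_000):
    for t in range(max_rounds):
        if stakes > court_cost:          # otherwise both sides settle
            if random.random() < 0.01:   # chance the court fixes the error
                return t
    return None                          # error persists: learning failed

random.seed(0)
print(rounds_until_correction(court_cost=10, stakes=5))   # None, all settle
print(rounds_until_correction(court_cost=10, stakes=50))  # error corrected
```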
Workshop 8: Friday, May 20, 2022, Organized by Boston University (Professor Ran Canetti)
Sunoo Park (Cornell), The Right to Vote Securely: This Article argues that election law can, does, and should ensure that the right to vote is a right to vote securely. First, it argues that constitutional voting rights doctrines already prohibit election practices that fail to meet a bare minimum threshold of security. But the bare minimum is not enough to protect modern election infrastructure against sophisticated threats. The Article thus proposes new statutory measures to bolster election security beyond the constitutional baseline, with technical provisions designed to curb insecure election practices that have become regrettably commonplace and to standardize best practices drawn from state-of-the-art research on election security.
Sarah Scheffler (Princeton), Formalizing human ingenuity: A quantitative framework for substantial similarity in copyright: A central notion in U.S. copyright law is judging the "substantial similarity" between an original and an allegedly derived work. Capturing this notion has proven elusive, and the many approaches offered by case law and legal scholarship are often ill-defined, contradictory, or internally inconsistent. This work suggests that a key part of the substantial similarity puzzle is amenable to modeling inspired by theoretical computer science. Our proposed framework quantitatively evaluates how much "novelty" is needed to produce the derived work with access to the original work, versus reproducing it without access to the copyrighted elements of the original work. Our definition has its roots in the abstraction-filtration-comparison method of Computer Associates International, Inc. v. Altai, Inc. Our framework's output "comparison" is easy to evaluate, freeing up the court's time to focus on the more difficult "abstraction" and "filtration" steps used as input. We evaluate our framework on several pivotal cases in copyright law and observe that the results are consistent with the rulings.
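One way to write down the comparison at the heart of the framework (in rough notation of my own; the paper's formal definitions may differ):

```latex
% Sketch of the comparison, in assumed notation. Let $D$ be the allegedly
% derived work and $O$ the original. Write $\nu(D \mid X)$ for the amount of
% "novelty" (new creative input) needed to produce $D$ given access to $X$,
% and let $\mathrm{filter}(O)$ denote what remains of $O$ after the
% abstraction and filtration steps remove its copyrighted elements. The
% framework compares
\[
  \nu(D \mid O) \quad\text{versus}\quad \nu(D \mid \mathrm{filter}(O)).
\]
% If access to the copyrighted elements makes $D$ far easier to produce,
% i.e.\ $\nu(D \mid O) \ll \nu(D \mid \mathrm{filter}(O))$, that is evidence
% of substantial similarity.
```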
Azer Bestavros (Boston University), Stacey Dogan (Boston University), Paul Ohm (Georgetown), and Andrew Sellars (Boston University), Bridging the Computer Science-Law Divide (PDF available): With the generous support of the Public Interest Technology University Network (PIT-UN), researchers from the Georgetown University Institute for Technology Law and Policy and Boston University's School of Law and Faculty of Computing and Data Sciences present this report compiling practical advice for bridging Computer Science and Law in academic environments. Intended for university administrators, professors in computer science and law, and graduate and law students, this report distills advice drawn from dozens of experts who have already successfully built bridges in institutions ranging from large public research universities to small liberal arts colleges.
Workshop 7: Friday, April 15, 2022, Organized by MIT (Lecturer and Research Scientist Dazza Greenwood)
Sandy Pentland (MIT), Law was the first Artificial Intelligence: Hammurabi's Code can be described as the first formally codified distillation of expert reasoning. Today we have many systems for the codification and application of expert reasoning; these systems are often lumped together under the umbrella of Artificial Intelligence (AI). What can we learn by thinking about law as AI?
Robert Mahari (MIT & Harvard), Deriving computational insights from legal data: First, we will discuss the law as a knowledge system that grows by means of citations. We will compare the citation networks in law and science by leveraging tools from "science-of-science". We will explore how, despite the fundamental differences between the two systems, the core citation dynamics are remarkably universal, suggesting that they are largely shaped by intrinsic human constraints and robust to the numerous factors that distinguish law from science. Second, we will explore how legal citation data can be used to build sophisticated NLP models that aid in forming legal arguments by predicting relevant passages of precedent given the summary of an argument. We will discuss a state-of-the-art BERT model, trained on 530,000 examples of legal arguments made by U.S. federal judges, which predicts relevant passages from precedential court decisions given a brief legal argument. We will highlight how this model performs well on unseen examples (with a top-10 prediction accuracy of 96%) and how it handles arguments from real legal briefs.
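A minimal sketch of the retrieval setup described above, using a generic pretrained BERT encoder as a stand-in (the talk's model is trained on 530,000 judicial arguments and is not reproduced here; the texts below are invented placeholders):

```python
# Sketch of argument-to-passage retrieval with a generic BERT encoder as a
# stand-in for the trained model discussed in the talk.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        cls = enc(**batch).last_hidden_state[:, 0]  # [CLS] token embeddings
    return torch.nn.functional.normalize(cls, dim=1)

argument = "The warrantless search of historical location data was unlawful."
passages = ["Candidate precedent passage one...",
            "Candidate precedent passage two..."]
scores = (embed([argument]) @ embed(passages).T).squeeze(0)
print(scores.argsort(descending=True))  # passages ranked by cosine similarity
```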
Workshop 6: Friday, March 11, 2022, Organized by University of Pittsburgh (Professor Kevin Ashley)
Daniel E. Ho (Stanford University), Large Language Models and the Law: This talk will discuss the emergence of large language models, their applicability in law and legal research, and the legal issues raised by the use of such models. We will illustrate with the CaseHOLD dataset, comprising over 53,000 multiple choice questions that ask for the relevant holding of a cited case, and with an application to mass adjudication systems in federal agencies.
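CaseHOLD is publicly distributed as part of the LexGLUE benchmark on the Hugging Face Hub; a minimal sketch of its multiple-choice format follows (field names assumed from that release):

```python
# Minimal sketch of the CaseHOLD task format via the public LexGLUE release
# on the Hugging Face Hub (field names assumed from that release).
from datasets import load_dataset

ds = load_dataset("lex_glue", "case_hold", split="train")
example = ds[0]
print(example["context"])                        # citing text, holding masked
for i, ending in enumerate(example["endings"]):  # five candidate holdings
    print(i, ending[:80])
print("correct choice:", example["label"])
```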
Elliott Ash (ETH Zurich), Reading (Judges') Minds with Natural Language Processing: This talk will introduce some recent lines of empirical legal research that apply natural language processing to analyze beliefs and attitudes of judges and other officials. When do lawmakers use more emotion, rather than logic, in their rhetoric? When do judges use notions of economic efficiency, rather than fairness or justice, in their written opinions? What can language tell us about political views or social attitudes?
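A crude illustration of the kind of dictionary-based measurement such work often starts from (a toy of my own with invented mini-lexicons, not the speaker's method):

```python
# Toy illustration (not the speaker's method): a dictionary-based measure of
# emotional versus logical rhetoric, using invented mini-lexicons.
import re

EMOTION = {"outrage", "fear", "hope", "tragic", "shameful"}
LOGIC = {"therefore", "efficient", "evidence", "cost", "benefit"}

def emotion_share(text):
    words = re.findall(r"[a-z]+", text.lower())
    emo = sum(w in EMOTION for w in words)
    log = sum(w in LOGIC for w in words)
    return emo / (emo + log) if emo + log else None

print(emotion_share("The evidence shows the benefit outweighs the cost."))   # 0.0
print(emotion_share("It would be tragic and shameful to ignore their fear.")) # 1.0
```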
Workshop 5: Friday, February 18, 2022, Organized by University of Chicago (Professor Aloni Cohen)
Deborah Hellman (University of Virginia School of Law), What is a proxy and does it matter: A few years ago there was a controversy about the Amazon hiring tool that downgraded women applicants because the tool had learned to treat women as weaker candidates for software engineering roles. As reported in the press, the program downgraded resumes with the word "women," as in "women's volleyball team," and also the resumes of candidates from two women's colleges. If we focus in particular on the women's college example (suppose it was Smith and Wellesley), should we consider this differently than if the program had downgraded the resumes of candidates who noted "knitting" as a hobby? What about if it had downgraded resumes in which the candidate had listed him/herself as the president of a college club that, it turns out, also correlates with sex/gender (suppose women are more likely to seek these offices than men)? The question I am interested in is whether there is a meaningful, normatively significant category of a proxy. A program might use the trait itself (sex, for example) to sort. It might have a disparate impact on the basis of that trait (sex). But is there something in between, in which it uses another trait (attended Smith, likes knitting, was president of the club) that we describe as a "proxy for sex," such that this description is descriptively and normatively meaningful?
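The mechanics, if not the normative question, are easy to exhibit numerically. In the toy model below (mine, not Hellman's, with invented numbers), a screening rule that never consults sex still produces a large gender gap because it keys on a correlated trait:

```python
# Toy model with invented numbers: a screening rule that never consults sex
# but keys on a correlated trait reproduces a disparate impact.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
is_woman = rng.random(n) < 0.5
# Hypothetical correlated trait, e.g. "attended a women's college".
has_trait = rng.random(n) < np.where(is_woman, 0.30, 0.01)

hired = ~has_trait  # the rule downgrades the trait, never sex itself
print("hire rate, women:", hired[is_woman].mean())   # roughly 0.70
print("hire rate, men:  ", hired[~is_woman].mean())  # roughly 0.99
```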
Aloni Cohen (University of Chicago Computer Science), Truthtelling and compelled decryption: The Fifth Amendment to the US Constitution provides individuals a privilege against being “compelled in any criminal case to be a witness against himself.” Courts and legal scholars disagree about how this privilege applies to so-called compelled decryption cases, wherein the government seeks to compel an individual to unlock or decrypt an encrypted phone, computer, or hard drive. A core question is under what circumstances is there testimony implicit in the act of decryption. One answer is that there is no implicit testimony if “the Government is in no way relying on the ‘truthtelling’” of the respondent (Fisher v US, 1976). In ongoing work with Sarah Scheffler and Mayank Varia, we are formalizing a version of this answer and exploring what it suggests about compelled decryption and other compelled computational acts. With this audience, I'd like to discuss (and elicit feedback about) the relationship between our approach and the underlying criminal law context.
Workshop 4: Friday, January 21, 2022, Organized by UCLA (Professor John Villasenor)
Leeza Arbatman (UCLA Law) and John Villasenor (UCLA Engineering and Law), When should anonymous online speakers be unmasked?: Freedom of expression under the First Amendment includes the right to anonymous expression. However, there are many circumstances under which speakers do not have a right to anonymity, including when they engage in defamation. This sets up a complex set of tensions that raises important—and as yet unresolved—questions regarding when, and under what circumstances, online anonymous speakers should be "unmasked" so that their true identities are revealed.
Priyanka Nanayakkara (Northwestern Computer Science & Communication), The 2020 U.S. Census and Differential Privacy: Surfacing Tensions in Conceptualizations of Confidentiality Among Stakeholders: The U.S. Census Bureau is legally mandated under Title 13 to maintain confidentiality of census responses. For the 2020 Census, the bureau employed a new disclosure avoidance system (DAS) based on differential privacy (DP). The switch to the new DAS has sparked discussion among several stakeholders—including the bureau, computer science researchers, demographers, independent research organizations, and states—who have different perspectives on how confidentiality should be maintained. We draw on public-facing and scholarly reports from a variety of stakeholder perspectives to characterize discussions around the new DAS and reflect on underlying tensions around how confidentiality is conceptualized and reasoned about. We posit that these tensions pinpoint key sources of miscommunication among stakeholders that are likely to generalize to other applications of privacy-preserving approaches pioneered by computer scientists, and therefore offer important lessons about how definitions of confidentiality (as envisioned by different stakeholders) may align/misalign with one another.
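For readers unfamiliar with the formal guarantee at issue, the basic building block of differential privacy can be shown in a few lines (a minimal sketch of the Laplace mechanism for a single count; the bureau's actual TopDown algorithm is far more elaborate but rests on the same guarantee):

```python
# Minimal sketch of the Laplace mechanism for a single count; the Census
# Bureau's TopDown algorithm is far more elaborate.
import numpy as np

rng = np.random.default_rng(0)

def noisy_count(true_count, epsilon):
    # A count has sensitivity 1 (one person changes it by at most 1), so
    # adding Laplace(1/epsilon) noise gives epsilon-differential privacy.
    return true_count + rng.laplace(scale=1.0 / epsilon)

print(noisy_count(1200, epsilon=0.1))   # small epsilon: more noise, more privacy
print(noisy_count(1200, epsilon=10.0))  # large epsilon: close to the true count
```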
Workshop 3: Friday, November 19, 2021, Organized by University of Pennsylvania (Professor Christopher S. Yoo)
Christopher S. Yoo (University of Pennsylvania), Technical Questions Underlying the EU Competition Case Against Android: The EU's competition law case against Google Android turns on key factual findings about the motivations and limitations faced by the developer community. For example, how do developers decide which operating systems to launch their apps on, and how do they prioritize their efforts if they decide to create versions for more than one? How can we quantify the costs for developers of porting versions of apps for different platforms and forks of the same platform? To what extent does the Chinese market influence app development? To what extent does the Google Play Store compete with Apple's App Store or Chinese equivalents? To what extent are developers inhibited by requirements that phone manufacturers bundle certain apps? And to what extent do developers benefit from provisions guaranteeing that the platform will provide general-purpose functionality such as clocks and calendars? At the same time, the Google Android case overlooks the inherent tensions underlying open source operating systems, which simultaneously presuppose the flexibility inherent in open source and the rigid compatibility requirements of a modular platform. Is fragmentation a real threat both in terms of software development and consumer adoption, and if so, what steps are appropriate to mitigate the problems it poses? Open source app environments are sometimes criticized as cesspools of malware. Is this true, and if so, what are the appropriate responses? To what extent are operating system platforms justified in overseeing compatibility? This presentation will sketch out how the EU's case against Google Android frames and answers these questions. Full resolution depends not only on providing answers in this specific case but also on providing more general frameworks for conceptualizing how to address similar questions in the future.
David Clark and Sara Wedeman (MIT), Law and Disinformation: Do They Intersect?: Disinformation (the intentional creation and propagation of false information, as opposed to misinformation, the unintentional propagation of incorrect information) has received a great deal of attention in recent years, with strong evidence of Russian attempts to manipulate elections in the US and elsewhere. The problem has been studied for well over 10 years, with many hundreds of papers from disciplines ranging from journalism to psychology. However, the role of law in combatting disinformation is unclear. In this talk, I offer a few dimensions along which law might relate to this issue and invite discussion and clarification. My concern is specifically with online disinformation, propagated through platforms such as Twitter and Facebook. In the U.S., one law that defines the responsibilities and protections for platform providers is Section 230 of the Communications Decency Act. There are now calls for the revision or repeal of this law. However, I do not think the debate around Section 230 is well-formed, and I suggest this as a possible topic of discussion. As another dimension of the problem, while financial institutions have a regulatory obligation to Know Your Customer (KYC), platform providers have no such obligation, and unattributed speech on the Internet is the norm. But the anonymity of some forms of speech is protected, as is telling lies. Should platform providers have any responsibilities with respect to disinformation, and if so, of what sort? The solution must be much more nuanced than simple calls for filtering or digital literacy.
Workshop 2: Friday, October 22, 2021, Organized by University of California, Berkeley (Professors Rebecca Wexler and Pamela Samuelson)
Sarah Lawsky (Northwestern Law) and Liane Huttner (Sorbonne Law), in collaboration with Denis Merigoux (Inria Paris CS) and Jonathan Protzenko (Microsoft Research CS): This presentation will describe a new domain-specific programming language, Catala, that provides a tractable, functional, and transparent approach to coding tax law. It will describe the benefits of formalizing tax law using Catala, including increased government accountability and efficiency, and speculate about potential compliance issues that could arise from the formalization.
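A distinguishing feature of Catala's design is that statutes are encoded as general rules with explicit exceptions, mirroring how tax law is written. The sketch below conveys that rule-plus-exception structure in Python rather than Catala's own syntax, with an invented two-article statute:

```python
# Rule-plus-exception structure sketched in Python, not Catala syntax; the
# two-article statute and its numbers are invented for illustration.
def income_tax(income, num_children):
    rate = 0.20          # general rule: "Article 1" sets a 20% flat rate
    if num_children >= 2:
        rate = 0.15      # exception: "Article 2" lowers the rate for parents
    return income * rate

print(income_tax(40_000, 0))  # 8000.0, the general rule applies
print(income_tax(40_000, 2))  # 6000.0, the exception takes priority
```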
Catalin Voss and Jenny Hong (Stanford CS), Most applications of machine learning in criminal law focus on making predictions about people and using those predictions to guide decisions. Whereas this predictive technology analyzes people about whom decisions are made, we propose a new direction for machine learning that scrutinizes decision-making itself. Our aim is not to predict behavior, but to provide the public with data-driven opportunities to improve the fairness and consistency of human discretionary judgment. We call our approach the Recon Approach, which encompasses two functions: reconnaissance and reconsideration. Reconnaissance reveals patterns that may show systemic problems across a set of decisions; reconsideration reveals how these patterns affect individual cases that warrant review. In this talk, we describe the Recon Approach and how it applies to California's parole hearing system, the largest lifer parole system in the United States, starting with reconnaissance. We describe an analysis using natural language processing tools to extract information from 35,105 transcripts of parole hearings conducted between 2007 and 2019 for all parole-eligible candidates serving life sentences in California. We are the first to analyze all five million pages of these transcripts, providing the most comprehensive picture of a parole system studied to date through a computational lens. We identify several mechanisms that introduce significant arbitrariness into California's parole decision process. We then ask how our insights motivate structural parole reform and reconsideration efforts to identify injustices in historical cases.
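As a minimal illustration of what the reconnaissance step's information extraction might look like (a sketch on invented transcript text, not the authors' pipeline):

```python
# Illustration on invented transcript text (not the authors' pipeline):
# extracting simple structured fields before any aggregate analysis.
import re

transcript = """PRESIDING COMMISSIONER: Parole is denied for three years.
The panel cites the nature of the commitment offense."""

decision = re.search(r"Parole is (granted|denied)", transcript)
length = re.search(r"denied for (\w+) years?", transcript)
print(decision.group(1) if decision else None)  # "denied"
print(length.group(1) if length else None)      # "three"
```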
Workshop 1: Friday, September 17, 2021, Organized by Northwestern University (Professors Jason Hartline and Dan Linna)
Rebecca Wexler (Berkeley Law), Privacy Asymmetries: Access to Data in Criminal Defense Investigations: This Article introduces the phenomenon of "privacy asymmetries," which are privacy statutes that permit courts to order disclosures of sensitive information when requested by law enforcement, but not when requested by criminal defense counsel. In the United States adversarial criminal legal system, defense counsel are the sole actors tasked with investigating evidence of innocence. Law enforcement has no constitutional, statutory, or formal ethical duty to seek out evidence of innocence. Therefore, selectively suppressing defense investigations means selectively suppressing evidence of innocence. Privacy asymmetries form a recurring, albeit previously unrecognized, pattern in privacy statutes. They likely arise from legislative oversight and not reasoned deliberation. Worse, they risk unnecessary harms to criminal defendants, as well as to the truth-seeking process of the judiciary, by advantaging the search for evidence of guilt over that for evidence of innocence. These harms will only increase in the digital economy as private companies collect immense quantities of data about our heartbeats, movements, communications, consumption, and more. Much of that data will be relevant to criminal investigations, and available to the accused solely through the very defense subpoenas that privacy asymmetries block. Moreover, the introduction of artificial intelligence and machine learning tools into the criminal justice system will exacerbate the consequences of law enforcement's and defense counsel's disparate access to data. To avoid enacting privacy asymmetries by sheer accident, legislators drafting privacy statutes should include a default symmetrical savings provision for law enforcement and defense investigators alike. Full paper available on SSRN: Privacy Asymmetries: Access to Data in Criminal Defense Investigations.
Jinshuo Dong, Aravindan Vijayaraghavan, and Jason Hartline (Northwestern CS), Interactive Protocols for Automated e-Discovery: