Episode 32: Dr. Carys Craig on generative AI and the dangers of the copyright trap
Yves Faguy: Today we’re going to talk about Artificial Intelligence and Copyright. You’re listening to Modern Law, presented by the Canadian Bar Association’s National Magazine.
One of the significant controversies about AI is the impact of generative software on the use and production of cultural works. The fast-growing popularity of these tools raises big questions about the ethics of AI-generated works and whether they amount to a technologically advanced form of plagiarism. Now, lawsuits have been popping up around the world as artists as well as corporate interests – think of Getty Images, for example – claim infringement of their IP rights.
Now, I should say that a US federal court ruled back in August that art created by AI without any human input cannot be copyrighted under US law, although what human input means exactly is likely to be debated for some time. Still, we must ask ourselves whether copyright should be the appropriate regulatory tool to determine these questions. In fact, as listeners will hear from our guest today, the novelty of generative AI actually raises a whole lot of questions about the socioeconomic dynamics of cultural production, and whether it might not be time to reexamine the role of copyright law in encouraging and incentivising creativity.
My guest is Dr Carys Craig, and I’m really thrilled that I asked her to come on the show to try to talk through these issues. Dr Craig is Osgoode Hall Law School’s Associate Dean of Research, and she has recently stepped into the role of Director of IP Osgoode, which is the school’s intellectual property law and technology program. She joined the faculty at Osgoode Hall in 2002, and is the author of “Copyright, Communication and Culture: Towards a Relational Theory of Copyright”, among other writings, and in 2018 she held a MacCormick Research Fellowship at the University of Edinburgh. She teaches JD, graduate and professional courses in the areas of intellectual property, copyright and trademark law, and legal theory. Dr Carys Craig, welcome to Modern Law.
Carys Craig: Thank you. It’s a pleasure to be here.
Yves: First – I often ask this of guests who come on – just tell us a little bit about yourself, how you got to where you are today, what brought you into this world of copyright and intellectual property and artificial intelligence.
Carys: All right. Well, I’m a professor of law at Osgoode Hall Law School, at York University, and there I teach in the intellectual property area, so my focus really is on copyright as well as on trademarks, and then on law and technology and legal theory. I’ve been at Osgoode now for, well, a couple of decades, without ageing myself too much, but when I first came it was in the midst of the policy panics over the internet and Napster and copyright in that context.
So it was a great time to start thinking about copyright law, and there was a lot of demand at that point for expertise in the field, and I was, fortunately, a graduate fellow at the University of Toronto at that time. So I did my graduate thesis, my doctoral thesis, on copyright policy and copyright theory, and I got hired out of the University of Toronto into the faculty at Osgoode in the second year of my PhD, so I’ve been teaching there ever since.
Yves: So you landed in this field at a time where there was an extraordinary amount of, I’m guessing, change in the whole media field and in all these platforms and support systems of where we write things.
Carys: Yeah. Exactly that. When I first encountered intellectual property law, just as an undergrad LLB student at Edinburgh, there wasn’t much focus on it; it was just a little piece of the commercial law course, and we moved through it quite quickly, and that was that. And even within a few years this had become an area that was enormously important, and where people were beginning to move in and worry about how copyright law was going to adjust to the arrival of internet technologies, how it was going to adjust to the arrival of peer-to-peer file sharing.
And fortunately for me that meant that there was quite a lot of investment in a new generation of legal scholars who were turning their minds to this question. And so in Ontario there was a large push towards bringing in graduate students who were working in the field; the Centre for Innovation Law, which was based at the Faculty of Law in Toronto, brought in a large group of PhD students and master’s students and generously funded them. And so I was attracted to Canada and stayed here with this graduate scholarship, and then began working in this field. And suddenly, the law schools needed to hire in the area, they needed a whole new generation of IP scholars, and so I think of this really as being one of those cases where I was in the right place at the right time, and suddenly in a field that was generating all of this interest in which there was a lot of movement.
And so I do remember at one point I was also studying feminist theory and international law, and I asked some folks at the University of Toronto if I should maybe redirect to do my graduate work in that area, and I was told in no uncertain terms that if I did intellectual property I should certainly stay doing intellectual property; that that was what was going to be of interest. And you and I were both at a conference recently where it was said that copyright has not been so interesting since the Napster era, and now suddenly we have another paradigm shift and new technologies with the arrival of generative AI, and copyright is cool again.
Yves: Yes, and that was the conference held last month at the University of Ottawa on responsible AI. I’m just curious, though. So if you’re looking back at that early period in your career where so much was going on, do you think copyright law has adapted to the circumstances, to those new circumstances of that time? Or did it adapt well, is perhaps a different way of putting it?
Carys: It certainly adapted. I’m not so sure that it adapted well. I think it adapted in a way that it invariably does, which is that it expanded in order to cover and capture valuable new activities, and typically to protect many of the longstanding market incumbents in the cultural industries. And so I think what’s interesting when I look back at it, is that it certainly was a time of great disruption, but the expectation that copyright would somehow be ousted, be so disruptive that it could no longer really continue to operate as it did in the digital age, was somewhat naïve.
And I think there was a lot of excitement about the new technology and there were certainly new challenges presented, but copyright law has been with us for hundreds of years, it has evolved through all kinds of technological changes and advancements, and it continues to grow, and it continues to be strengthened by the same kind of impulse for control over cultural content. And so to me, this is part of the way that the marketplace works, is part of the way that a capitalist society works. We can expect that copyright will continue to function and to expand as technology changes.
Yves: I want to hang onto this point just a sec, because obviously you’ve been interested in the AI copyright challenge for a little while. And actually, I should correct – the conference was Shaping AI for Just Futures at the University of Ottawa last month, organized by the Centre for Law, Technology and Society. But you wrote in a paper recently, or predating that, maybe back in 2021, 2022, you called it the unfolding AI copyright drama into which governments, courts and commentators are increasingly being drawn.
We can reasonably predict – talking about AI here – that copyright will once again adapt and prevail. Whether in service of creativity and culture or simply in service of capital, the copyright system is perfectly capable of absorbing this latest innovation and continuing about its business as it has so many times before. And I think that’s a little bit what you were saying, but is that true, still, today, post-ChatGPT, for example?
Carys: So it remains to be seen, obviously, and I know at that conference there was also some talk about the end of copyright. I found that very interesting, because it does suggest that, once again, we’re imagining that copyright can’t cope with technological change. And in some ways, I would like to see copyright recede as the need for it recedes. And we’ll speak, I think, a little bit more about the purpose of copyright, but if we have a very vibrant culture in which people can create, the costs of expression, the costs of sharing, are much less than they were in the pre-digital era, then perhaps we have much less need for copyright, at least as an incentive for creativity, if that’s how we understand its role.
And so it’s nice to think that as technology, let’s say, advances for the moment that at some point we might not need to have the social costs of copyright and copyright control. We might be able to allow everything to be freely shared in a public domain, and everyone to be able to access that and to benefit from it. To my mind, that’s a lovely vision of a future that could be technologically facilitated. I don’t imagine that we’re going to reach that point any time soon. And I think we also can rightly predict, I’m sure, that the industries especially that benefit from the exploitation of copyright content are going to continue to enforce a proprietary model that creates owners and therefore creates users and audiences, and a public that is, by default, excluded from copyright content, or subject to copyright control.
Yves: And might I suggest, too, that maybe not just the corporate owners but perhaps the authors themselves or the songwriters themselves, the artists themselves, might feel a little bit nervous about your proposition.
Carys: Indeed, and you can suggest it, and there’s a reason, however, why I’m talking about corporate owners and industries. And that’s because I hesitate to lump creators and authors and artists and performers in with the content industries and corporate owners who tend to reap the benefits of copyright control. And so I think what we have today is a system of copyright that benefits a very limited group of people or persons or corporations, and tends not to really benefit the artists and creators and authors that we think of when we talk about copyright policy.
Often, I find, in copyright policy debates the artist, the creator, is held up as sort of a romantic trope, something or someone that we venerate, that we want to reward, and that has, therefore, this important cultural significance in our society. But often that person or that trope is held up as really a stalking horse for corporate interests. Basically, what you see is – and this has been the case from the very inception of copyright – that the author is held up as the would-be beneficiary of a system, where actually all that copyright does is create the proprietary right over the work, over the original expression, and then just send it out into the marketplace as something that can be commoditized and controlled.
And so usually the author, or the creator or the performer, drops very quickly out of that picture. They have to assign their rights to the publisher, they lose control over their work, and rarely for its full social or market value. And so often the very people who talk about the need to reward creators, the need to protect artists and ensure their livelihoods, are the intermediaries who in fact are reaping the benefits of the copyright system and leaving those authors and artists with very little in return.
Yves: Like artists putting their stuff up on Spotify, for example. Not to point any fingers at any one platform in particular, but they don’t reap all that much unless they have absolutely massive audiences.
Carys: Yeah, that’s exactly right, and we see this in every different cultural industry, and so people have the same experience in the music industry, or if they’re authors assigning rights to book publishers, often the author or the artist is, as I say, not the main beneficiary of the profit that comes in as a result of their work, but gets a tiny fraction of that.
Yves: It might be a good time here just to rewind a little bit and talk a little bit about what originally is the purpose of copyright, and how should we think of its purpose in the broader context of the creation of works in AI? Because copyright’s been around for a very long time now, and so it had an original purpose. And I’m just wondering at this point if that purpose is still true to itself.
Carys: Right. Yeah, well, we can take this all the way back to the 15- and 1600s, if you want. The purpose of copyright was, or the equivalent there, was control over the printing presses, and it was really censorial at that point. It was a way to control the flow of information through publication at a time when even being able to print words and share them with a literate public was a very new thing. So that’s the first technological innovation that started this whole thing off, was the printing press.
When it came to the first copyright statutes – and now we’re in the early 1700s – the purpose of the Statute of Anne was stated to be the encouragement of learning. And so one way to understand that is the idea that it was aimed at increasing or growing a literate public, sharing books, allowing the printing of books and the selling of books, and access to books, for the encouragement of learning. And so I would say that’s the original function. There’s a lot of wonderful historical work, though, on whether that was actually true, and suggestions, for example, that it was really a trade regulation device to try to break up the stationers who were the publishers at the time.
And then it continued to evolve. So you have statements like in the US Constitution, the purpose of copyright is to advance the progress of science and the useful arts by granting to authors a limited right. Meanwhile, in Continental Europe they would say, well, the purpose of copyright is first and foremost to protect the droit d'auteur, to recognize the rights of the author and understand that the author has some sort of natural entitlement to their work, and that comes first. Fast forward to now, and to Canada, we have an amalgam of all of these reasons in the mix.
We tend to talk about – and the Supreme Court of Canada has explained – that the purpose of copyright is to achieve a balance between rewarding creators and encouraging the creation and the dissemination of works of the arts and intellect. So what I like to do with that is put it all in context and say the purpose of copyright should be to function as a cultural policy tool that helps to encourage authorship, creativity, and the dissemination of works.
Yves: What do most people who are not in your field of study, what is it they don’t get about copyright that you think that they should understand about it?
Carys: To my mind, the thing that people most readily misconstrue about the copyright system is the nature of the copyright. People tend to think of it as a property right. They tend to think of it as an ownership right that functions like any other claim of ownership. And so that tends to look, first of all, to my mind, far too absolute; that you can control this thing that is the work. And you can own it, and you can exclude other people from it, and you can dictate how it’s used, and then if somebody uses it without the permission of the owner that’s misappropriation or that is theft, and that’s stealing. Stealing of ideas.
And this is not how copyright works, for a start. You don’t own the whole work as a thing, you can’t control the ideas that are within it. But also, it’s not a property right in this sense. If someone uses the work, takes it, copies it, they’re not depriving the owner of anything, the owner still has the work as well. Everybody can have the work at the same time. Everybody can sing the same song at the same time. It’s not a rivalrous possession. And so if we get rid of the property metaphor, or don’t begin, at least, with this analogy between the intangible work and a physical thing that is owned, I just think we start up in a much better place to actually create cultural policy.
Recognizing that what we’re really dealing with is an original work of expression. What we’re really dealing with is not a thing but it’s speech. It’s communication. It’s got meaning. And so if we think about it as something that regulates speech and information rather than that creates property rights over works as things, then we’re going to have, I think, a much more productive conversation about what copyright policy should achieve.
Yves: So this gets, I think, at an important question about the incentive system. How our laws now, as they are today, are actually designed versus how it might look to someone who just walks into the room from the outside or someone from Mars and who’s looking at how our copyright system incentivizes people to create. And I’m wondering, how do you measure the disconnect between those two things?
Carys: There is a significant disconnect between the way that copyright actually works and the incentive story that we tell ourselves about copyright’s purpose and what it does. So the incentive story is, in the law and economics framing, to say the act of expression, of creating a work, never mind publishing a work, comes with costs. It’s an investment of time, it’s an investment of money, and energy; those costs, that effort, could be expended elsewhere. And so if we assume that we’re all rational economic actors, why would anyone invest their time and energy and these costs in creating something if they cannot expect to recoup those costs in any way through the market?
And so as soon as they share the work, if the work can be copied by everybody for free, then they’re not going to get any economic benefits going back to them, so they won’t invest in the first place, and we’ll all be worse off, because there’ll be fewer people actually investing in creating works. And so to the extent that we want to encourage this kind of creativity, to encourage people to write, to compose, we need to make it possible for them to recoup their investment. And so what we do with the copyright system is we create that right that allows control in the marketplace, which allows, at least theoretically or notionally, for that work to be exchanged in the market for value.
Yves: So now there are technologies that are complete game changers in our economy. And I’m interested that you came into this back in the days of Napster, because Napster was a game changer in terms of the distribution of these creative works, for example. And now, today, we are facing a new game-changing technology, which is obviously Artificial Intelligence, and maybe more specifically what we’ve seen with generative AI. Because here we’re not just talking about the distribution, as I can see it, we’re actually talking about the creation, so this technology is actually changing the creation itself of the artwork. And I’m just wondering, how do we digest that to start even beginning to think about how copyright should play a role in building the legal framework around Artificial Intelligence?
Carys: Yeah, I do agree with you, I think that the arrival of generative AI is a game changer, it is a paradigm-shifting moment, probably just in our culture at large and in our copyright policy more specifically, for my interests. And so thinking about how it’s going to play out and the difference that it makes in the model and the copyright system’s methodology for creating value or protecting value, clearly we’re going to have a lot of thinking to do about how the law should respond to the arrival of this new technology. But also in the sense that this is really going to shift cultural practice and social practice. And so the law is often playing catchup with these realities anyway, and so I think we also have a larger question about whether the law should try to respond to this technology, and if so, not just how but when.
And one of the things that we saw early on with the arrival of the internet was what I think of as being really a kind of premature mobilizing to establish the rules of the game for the internet era. And of course the parties that were around the table at that time were the dominant cultural industries of the time. And ultimately, the risk is if you engage in that kind of lawmaking and are responsive to the pleas of lobbyists who are concerned about the arrival of new technology, then you risk stultifying the development of that new technology, and you risk reinforcing the current status quo. Making sure the same people stay around the table, that the rules of the game for the new technological landscape are the rules that they want to play by, that will benefit them.
And so I am just in general reluctant to assume that the law should necessarily step in and try to keep up with technological change immediately, and certainly I’m very nervous about the idea that you should do that in response to the pleas of people who might otherwise lose out in the new market that is created by the new technology. So I think that means that in this context, if we boil it right back down to what I was saying before about the cost of expression and investment in creativity and publishing work – and this is with the arrival of the internet all the way up to the arrival of generative AI – what we’ve seen is that the kinds of things that used to be very costly and involve significant investment of time and resources in creating works, in publishing works, in sharing works, in finding audiences, in distribution and dissemination, those costs have almost dissipated.
They have radically declined, in a world where you don’t need intermediaries to reach an audience, in a world where you can create virtually for free, professional-quality things that huge audiences want to access. And so we have a completely different landscape for creativity. Whether it’s authorship, whether it’s photography, whether it’s visual arts, whether it’s making music, completely different cultural landscape than we had 20 or 25 years ago.
Yves: And it’s also a completely different landscape, too, I think, in the process of creation. And so this is perhaps a good time to draw the distinction between input versus output, and I’m wondering if you can help us understand that. Because I think we think of copyright and Artificial Intelligence in terms of those two frameworks as a starting point.
Carys: Yeah, I think that’s right. If you don’t mind, what I might do is start with outputs, because it avoids some of the technical and doctrinal messiness of applying current copyright law at the input side. We’ll get to that, but I think when we’re just talking about the cultural environment that we’re in and the way that it’s changed with the arrival of generative AI, it’s worth starting off by just thinking about the outputs, thinking about what AI is capable of generating and what that means. And so here I think we’re at a time now where there’s a little bit of a backlash against this technology. Obviously, there’s a lot of concern about the way that it’s going to develop and what it’s going to do, especially to, I think, our cultural sector. Or not especially, to every sector of society, but that’s the one we’re focused on here.
And so I think it’s worth recognizing that these are extraordinary tools that can produce these amazing outputs that we have to recognize as being a significant contribution, I think, to our culture. And there are lots of artists and others who are working with these tools in an assistive capacity thinking about what it means for these new technologies to be at our disposal for the creation of new cultural artefacts, new ways of meaning making. I think it’s very real to say that they’re democratizing meaning making in important ways as well. So you don’t have to be a great artist to be able to create a great work of art if you’re using Midjourney. You don’t have to be a trained musician to be able to create a decent sounding track. And so there’s this capacity to create at a scale that I don’t think we’ve really encountered before.
And one way to look at that, just to return to this cost of expression point, is that the costs of expression, again, have gone way down. When it comes to using these tools which are available now to all of us for very little or nothing, and to be able to use them in this creative capacity, is pretty exciting. So I can imagine when people are talking about the end of copyright in this context, they might just mean we don’t need to be encouraging and rewarding creators anymore because everybody’s a creator if they’ve got an AI tool and an idea. They’re capable of writing a prompt and they can do it for virtually nothing. And so that’s, I think, one way in which we should recognize that if the purpose of copyright is encouraging this kind of exercise of creating meaning, then there’s something exciting about generative AI.
Yves: But how do the entrenched artists – let’s call them the incumbents, so to speak – and I’m not talking here about the corporate interests, I’m talking more about the people who have spent years, maybe even decades, honing their artistic talent and their artistic expression. Suddenly, they’re faced with this. Surely, there will be those who will rise to the challenge of adopting this kind of technology in their craft, others will understandably be very concerned, very worried, about this, let’s call it, democratization of creative expression. Do we need to think about their interests when considering the legal framework at play here?
Carys: Certainly. Absolutely we have to think about their interests because they are, as I’ve suggested, central to copyright policy, even if they have typically not been the beneficiaries of copyright in practice. I completely understand that there’s going to be a lot of anxiety around the way that these new tools might act as substitutes, if you will, for human creativity or works of human authorship in the cultural sphere. And to some extent, I think that’s real. If you can save the cost of hiring a graphic artist because you can use Midjourney and produce an image, then you probably will. Right? Depending on who you are and how much money you have to spend, and what it is you need.
So there’s definitely going to be a shift, and I don’t mean to underplay that at all. What I would say, though, and what I do say is that I think even if facially we’re seeing things generated by AI that look like they could’ve been authored by a human, or created by a human author, these two things are not the same. They are performing a different role in our culture, that they are going to be recognized as signifying or meaning something different. I don’t think that we’re going to see a disappearance of a desire to have a human author or a human creator or to marvel over what it is that artists and performers and creators and songwriters do. I think really our desire for these cultural works or texts is the meaning, their signification, what they tell us about the person who’s trying to communicate to us through them.
And I don’t mean to romanticize this, I think that that’s the nature of culture, that’s the value of creativity, is in this social dialogue. And that is what we should be trying to encourage in our system. And I think that people will continue to want human-authored works. One of the solutions that we might want to turn our mind to at some point here with regard to outputs is something that is more in the line of the trademark work I do, which is about information for consumers and for audiences, so that people are not being misled about what it is that they’re getting. So if consumers, if audiences, if citizens know what it is that they want, and they know that they want something that isn’t created or generated by a machine, then they shouldn’t be misled into paying for or acquiring or searching out the wrong thing.
Yves: Are you telling me I’m not going to be able to sell my Midjourney works of art that I’ve created?
Carys: Well, I don’t think anyone would, because why would they pay for it from you if they could just make their own? Right? So I don’t think there’s actually that much value there in what’s being generated in that economic sense. I think the value will still be in the human-authored works.
Yves: There is no value in it, and I think I told you this last time, but every time I play on Midjourney it always looks like it comes out of Frodo, The Hobbit or Lord of the Rings or things like that. I don’t know if it’s my prompting. I mean, I’m not even a big fan of those [unintelligible 00:36:14].
Carys: Well, that’s what you tell us, but…
Yves: I do want to flip it to the other side, though, for a second, because there is the input side that’s going on. And this is obviously an area that is upsetting creators. We’ve seen all these lawsuits and class actions by the day that are being litigated, mostly in the United States, on the use of content going into these creative AI tools.
Carys: I think often when we talk about these things or when it’s being talked about in the media, we see a convergence of these two things, or confusion between the inputs and the outputs, and so a lack of clarity over what the legal questions are. And there’s definitely an appropriate lack of clarity over what the answers are, but the questions are actually quite clear. And so I think we can and should separate these things out. We were talking before about the outputs, and I just maybe need to wrap up the thought there by saying my argument is that the outputs of Artificial Intelligence ought not to be protected by copyright at all. That they’re not works of human authorship, that copyright is there for and to encourage works of human authorship, and therefore, to the extent that a work is simply generated by a machine, no copyright attaches. And so it belongs in the public domain.
And I think even in that policy decision there are important implications for ensuring that, for example, creative employees remain employed. The Hollywood studios are going to want to keep their human scriptwriters to hand if they’re going to want to have copyright in their movies and their scripts. And so I think there’s an important reason to limit the benefits of copyright to human authors and their assignees. And so that’s the output side of things.
Yves: And presumably there would be a point where the human authors can be assisted by some of these creative AI tools. Somehow we’re going to have to figure out where the line is drawn between this was produced by a machine in output, but when does it become human?
Carys: Yeah. No, the line drawing exercise is not going to be an easy one here, especially as AI technology has become more and more embedded in the platforms that we typically use in our own creative activities. And so there will have to be some parsing or just a clear line-drawing exercise to decide when is something AI generated and when or at what point do we recognize, I’m going to say, the user of the Artificial Intelligence tool as an author. I actually think that the answer is already clear in the law. If the user of the AI tool is engaged in a process that involves skill and judgment, especially when you see the kinds of iterative creative processes where someone is working with the AI, adding prompts, choosing between different options that they’re given, selecting things, to continue to tinker with them, play with new prompts, all of that looks like something that we would typically recognize as being creativity anyway.
And so I think that there’s quite a lot of scope for recognizing authorship and allowing copyright in something that is created making some use of AI technologies. There’s still a debate about this because the lack of predictability and the lack of control that a user has over the AI means there’s still a gap between their expressive intent and the way the work ends up looking. So there’s still a debate there about how much control they need to have over the output to be able to say that their role was one of authorship and not just a user of a technological tool.
Yves: So it’s not quite Picasso stealing from Matisse, because at least Picasso had some control over what he was doing with his paintbrush.
Carys: Right.
Yves: He was doing more than just drawing inspiration from what someone else had done before.
Carys: That’s right. There’s always the idea of inspiration and using preexisting text to create new text. Human authorship has always used other preexisting inputs to create new things by recombining them in new ways, by selecting or arranging them differently. And we’ve always recognized this as being authorship, even if it’s not creation de novo, out of nothing. It’s working with the materials that are there. And so we have to continue to recognize that rather than having this romantic trope idea of authorship as being creating out of nothing, that really this recombination and working with preexisting materials is always an essential part of authorship. And so if the AI user is engaged in that and is using what the AI gives them in this expressive process, at some point they’re going to be doing everything that we need them to do to be considered an original author for the purposes of copyright. That’s already a very low threshold, by the way, and so I don’t think we’ll have trouble with that.
The thing that I said – and I think this was really at the outset of our conversation – is that the work, for copyright purposes, is not just a piece of property that’s owned in its entirety, but rather the work that’s subject to copyright contains copyrightable elements and public domain elements within it. So just because you have copyright in a work doesn’t mean you own everything between the front and the back cover of the book. Right? And you don’t own the ideas, you don’t own the information or the data, you don’t own non-original elements.
You don’t own things that were copied from elsewhere, you don’t own things that are stock or common features in our culture. And so really the same will just be true here. You can own copyright in a work that you created using an AI tool, but you can’t claim to own the pieces of that expression that were beyond your authorial control that were created by the machine and not through your own skill and judgment.
Yves: It’s often said that our laws, privacy laws but also our copyright laws, should be technologically neutral. I think this is a way of saying new technology comes to the fore, the law has to be designed in such a way as to be able to adapt to it. Am I wrong, or do you have a different definition of what is technologically neutral? I guess the larger question is, is the law technologically neutral, still, with Artificial Intelligence in mind?
Carys: The whole concept of technological neutrality is a bit of a head scratcher anyway, just trying to think about what that means, because of course technology is not neutral. And of course the law is not neutral, so when you try to write laws about technology, it’s not going to be neutral. We need to take that off the table to begin with. So we’re not really talking about neutrality in the sense that there’s a sort of technology blindness, a pure objectivity, which closes its eyes to technological affordances or that transcends our technological realities. That is fiction. But as a principle of law making or a principle of policy, I think there is a role for technological neutrality, properly understood.
Yves: What is it, properly understood, then?
Carys: Fair question.
Yves: I got a feeling I butchered it.
Carys: No, no, I think you did quite well. It’s easiest to say what it’s not, right? It’s not just that we are technology blind; that’s not going to achieve the kind of neutrality that I want. What I’m asking for is something more like technology mindful, is one way to put it. But also our norms, the goals of our law, should remain consistent as technology changes. And so really it’s a normative neutrality. If the goal of copyright is to encourage the creation and dissemination of works, then we should remember that that’s our goal as technology changes. And what we should be doing in the law is trying to adjust the equilibrium or the balance that we’re striking to make sure that the law continues to function as we want it to function, and continues to advance the objectives or meet the social values that we mean it to or that we did before this new technological paradigm shifting moment.
And so I think what that means is that we cannot just blindly say copying is wrong and copyright always prohibited unauthorized copying, and so it doesn’t matter how technology changes, every copy that’s unauthorized is a copyright infringement. That would be a technology-blind approach, which is a formalistic way of understanding technological neutrality. And so if every copy is infringing, then even if you’re browsing online, you’re making digital copies in the Random Access Memory of your computer. If you’re caching copies for internet servers, all of those things are copies. And if all those copies count, then we can’t do anything with this new technology, because the law is rigid and doesn’t recognize the way the technology has changed how we interact with or around this cultural content.
And so I’ve been trying to argue for a vision of technological neutrality that’s more like substantive equality. That while we’re trying to achieve a balance, we recognize that it has to be understood in context, and we’re trying to maintain stability for the law and its objectives as technologies are shifting. Because I am aware that the other question – I keep doing this to you – that I left hanging before was on the inputs. Let me try and bring these two things together.
Yves: It could be my interruptions, but yes.
Carys: No, I’m pretty sure it’s my fault. So let me say this; I’ll try and bring it together. In my opinion, we can’t just treat every single copy as though it’s a copy, and therefore if it’s not authorized it’s infringing. That’s not going to make any sense in the digital era. It doesn’t make any sense just for the internet, never mind when we’re talking about training Artificial Intelligence. My suggestion is, then, that we begin our analysis taking a more substantive approach to neutrality and saying, what are the purposes of the copyright system, and how is Artificial Intelligence disrupting that? What do we need to do in our copyright system to continue to advance its objectives given this new technological reality?
I think that we need to look at the training of Artificial Intelligence as something that is equivalent to what we would always have allowed in our copyright system in other contexts. So the extraction of data and information: as I have said already, there is no ownership of information contained in copyright works. We’ve always been free to read a work, to take out the ideas, to take out the information, to use those, to recombine them to create something else. And so my suggestion is that rather than focusing on the technological processes, saying there must’ve been a copy made of a work in a database somewhere in order for the AI to be trained upon it, instead we look at how the technology is functioning, and what is it doing? What is its effect?
And I’m saying that the copies that are made in the process of training an AI are not copies that reach an audience. The texts that are used to train AI are not being read in the same way by the machine as they would be read by a human. They’re not being understood or used or enjoyed for their meaning or what they communicate in any way at all. They’re really just a source of data, and that data is being extracted. It’s being turned into tokens, as I understand it, that then allow the machine, without ever going back to that original copy, to discern patterns, correlations, probabilities, and therefore to predict how text will or could look. And that’s essentially what the AI is doing.
And so I feel that in this context an obsession with the copies that go into the AI and the fact that there may be reproductions being made of works for training purposes, shouldn’t lead us to say that all of the training process is unlawful or a copyright infringement just because the owners of the copyright in all of those texts and all of that information were not given a chance to authorize the inclusion of their works in the system.
Yves: And I’m guessing there will be people trying to convince policymakers and governments to continue to protect their interests in that regard. And so there’s another line I’ve heard you talk about or use, which is interesting, where you talk about consider again the pressure that’s placed on policymakers to address this brave new world of cultural production. And obviously with AI technology being at the forefront of that. And you’re warning them, presumably, those who seek, again, this protection, against running into what you call the copyright trap. What do you mean by that?
Carys: So I am concerned that in this rush to respond to the arrival of generative AI, and in the context of an almost moral panic about what this technology can do and how it’s going to affect society and affect our creators and our cultural landscape, that we’re going to rush into a regulatory response that deploys copyright as an easy preexisting fix. And so I think there has been, at each stage of technological development, as we’ve discussed, a sense that copyright is the appropriate tool, the thing that is going to allow us to respond to the new technology, that is going to support creators, that is going to ensure the continuation of cultural production in the new technological environment.
Yves: Or sometimes inappropriately police expression in ways that they shouldn’t be?
Carys: Yeah, no, absolutely. I think that’s absolutely right. From the beginning I said copyright was originally a censorial tool, it was originally a tool of censorship. And it really must be recognized as something that controls speech and controls access to information and the flow of information. And this is where the property model, I think, leads us astray, because when we think about copyright subject matter as speech, we can already see that we shouldn’t be limiting speech and the flow of information or access to information and the exchange of ideas unless there’s a really good reason to do that. And even then, it has to be something that you can justify. So this is not even in the charter language you talk about. It’s something that can be justified in a free and democratic society. It needs to be a legitimate limitation on expression.
And so the copyright trap, I think, is the conviction that copyright should regulate all aspects of our cultural landscape. That copyright is the legal tool of choice to control the way that expression is shared. Let me put it this way. When the internet arrived, there was some debate, as I mentioned before, about whether copyright could survive, but there was also some debate about whether copyright could actually regulate the internet. Whether we just have to say, okay, you can’t control this anymore. There was a time where you could control the printed word, but in the internet age we can’t control the way that words are shared or reproduced or disseminated anymore.
And at that time, Jessica Litman wrote a book about digital copyright where she said this is the hubris of copyright lawyers that we think that copyright should set the rules of the game for the internet. That we have such a perfect system that works so well that what we should do is impose that upon this whole new technological environment where people could otherwise be free to engage and share everything and anything. But no, no, no, we’re going to impose our jurisdictionally specific copyright system on the whole world. Not just the world but the whole worldwide web. And we did. That’s exactly what we did.
And so my concern is that with each new evolution, something that could be freed from this kind of control and this kind of model, we rush to copyright to reinstate it, to make sure that it stays as the appropriate rule of the game. And the truth is that the way the technology is evolving many of the core assumptions of copyright just simply don’t fit, or they don’t fit very well. And the copyright already wasn’t working very well. And so the idea that now it’s going to solve our problems just strikes me as a mistake. So I’ve spoken a little bit about the fixation on copies and how that’s a hangover from an analogue era where it was really hard to copy things and copies circulated as physical copies rather than as digital copies. So I think that’s one element of this.
The other is that we tend, in the copyright model, to assume that if something is valuable copyright should attach to it. So that means that we assume if someone’s extracting value or getting value from something that they should have to ask for permission for that thing, and if someone’s producing something of value then they should be able to own that thing. And I think we need to escape from this notion that everything that has social, or cultural or economic value must be privately owned. That, to me, is an element of the copyright trap.
Yves: There was another related point which I read you on which I thought was interesting. I think I got the passage here. I’ll quote it: “The quality and scope of a data set has a direct bearing on the quality and operation of the resulting AI. We must be alert to the risk that copyright law unduly restricts, distorts, or otherwise determines the trajectory of AI’s technological development and operation.” I thought that was really interesting, because is that saying that if we frame the law in too rigid of terms that we risk directing the evolution of the technology itself?
Carys: Oh, absolutely. Yeah, I mean, I think if we clamp down now on the training of AI and insist that using copyright protected works to train AI without permission is infringement, and is therefore unlawful, then we will absolutely be allowing copyright law to change the way that the technology develops. And not, I think, in ways that we ultimately want to. I think there’s a tendency to still think about individual works, and lots and lots of individual works, that if you have a database you can just list all the individual works that are there. And the reality now is really that the vast quantity of data that is being used to train, I mean we’re talking about billions, in some cases, of different texts that are being used and that are being scraped from the internet to allow for the training of these systems.
There are smaller systems and smaller data sets when you can say these books were included or these particular works were included, but the reality is it’s not just about particular works, it’s vast amounts of text and images from everywhere. And so copyright imagines a particular text, a particular work, that has a particular copyright owner. We only know who the owner is because we know who the author is. We have to know if the author is alive or dead, and if they’re dead, when they died, and did they die over 70 years ago, and if not, who inherited their estate, and who, therefore, has the capacity to grant copyright control.
So anyone that’s engaged in clearing copyright knows that this is an incredibly time-consuming process. Sometimes it’s incredibly time-consuming for a single work, or just compiling a small group of works. The idea that you could identify the author and the copyright owner of every work that is included, every piece of text or data or image that is included in any training data set, in the billions, and clear those rights, is just completely unfeasible. That just can’t happen. And so there are lots of people now who are thinking, okay, but still, still we could find a system that says, in theory at least, everybody deserves that power of control, but recognizing that we can’t really operationalize that, they still deserve some sort of reimbursement for their work.
I think that’s where we’re at, that’s where the European conversation is going, certainly, and I can see why that’s tempting, especially when people feel like their livelihoods might be threatened as creatives. The idea that there should be some reimbursement for the value that’s being extracted from the work is an attractive one. But this is where I turn to the question about, first of all, how can that be managed. Who’s really going to benefit?
Yves: Yeah, that seems almost impossible.
Carys: To me it seems impossible. It actually seems not almost impossible but actually impossible. Not for everybody, but it will significantly limit what goes into the data sets, and it will significantly limit which interests have the capacity to build those lawful data sets. So either it’s going to produce a system where there’s a lack of transparency and accountability because AI creators can’t admit or can’t point to what they’ve used, and therefore there’s a sort of underground, unlawful AI, and you’re going to have the biggest, most powerful players with the best lawyers who are going to be able to buy the best data sets, and they’re going to emerge as the clear frontrunners in what will therefore be a landscape dominated by one or two big players, with lots of lawyers.
Yves: So instead of democratizing, we would end up creating an elite of owners over creative content, and I guess some sort of dark web of something else going on, dark web of creators. Which could be kind of fun, maybe, but…
Carys: Sounds intriguing. But yeah, no, exactly that, right? And so when I say that this could change the trajectory of the technological evolution, I mean this exactly. You’re distorting the development of the technology to protect existing rightsholders. And we’re talking often in this debate about the creators and the artists, but it’s not the creators and the artists, it’s the rightsholders that are going to be able to ask for that reimbursement. Plus, if you think about it in the millions, like we talked about, the works that are generated are probably not of that much economic value in their own right.
Meanwhile how do you divide that value between the billions of rightsholders whose rights may have gone into the training of the thing? Put it this way – if artists and creators are struggling right now economically, this zero point zero, zero, zero, zero payment that might come because it’s attributed to the value of their work in a giant dataset, is not going to be the thing that ensures that their livelihoods are protected in the age of AI.
Yves: Something just dawned on me, though. We talk about, again, these inputs, and then the outputs that come out of it. And it’s funny, because we have all these conversations, too. I mean, there have been these very, very, at times, controversial, but also spirited conversations about cultural appropriation, for example, and relying on cultural appropriation to produce creative works of art. How does that fit in? How do culturally created works get input into a creative generator and come out on the other end? Is this a debate that we need to have? Is this a debate about cultural diversity that we need to have?
Carys: I think that the risk here is that when we start talking about cultural appropriation and extraction of value, while there’s a real concern about fairness and inclusion, I think once we start using a proprietary rhetoric to talk about what’s happening here, we, again, risk missing the bigger picture. Actually, if we use the property model we’re thinking about exclusion, that the property can exclude others, can say no, you can’t access this, you can’t use this. For me, from a cultural perspective, the bigger concern is inclusion. We want to flip this and start thinking about inclusivity. Who gets to be in?
And the risk, for me, of relying upon copyright or falling into the copyright trap when we’re thinking about how to regulate AI is that what we’re actually doing is creating walled gardens. What we’re actually doing is excluding people whose works and whose expressions are not going to be part of the data set. If they want to opt out or they’re just never included, you’re going to end up with AI tools that exacerbate knowledge disparities, and that create further social and cultural exclusions.
So, actually, the best place to start thinking about this is with a piece that was written several years ago now that was really quite prescient, by Amanda Levendowski, where she argued about the bias problem that copyright was going to exacerbate in AI. That if people are trying to train AI not using copyright protected works, that they’re going to end up training AI on a limited data set that’s going to exclude lots of modern content, lots of content for which rights can’t be cleared, in a way that makes the tools less well trained and much more problematic. If they’re going to be trained on texts that we know belong in the public domain because they were published in the 1900s, then there go all the women. So you have to worry that what you actually do when you rely upon copyright is that you create new exclusions.
And if Artificial Intelligence is going to be as important as we think it’s going to be in terms of resetting our cultural conversation, then everybody whose world view, everybody whose expressions and whose cultural meanings don’t appear in the data sets, they’re also going to be excluded from the outputs. They’re also going to be excluded from what the AI is generating. And that’s not going to hold back the main developers of these AI tools, but it is going to distort our culture further and create, I think, even more in the way of knowledge and cultural hierarchies and marginalized people.
AI has a way, just by virtue of how it works as a prediction tool or a probability tool, really has a way of producing the most probable thing. And so what it creates is really something that reflects our dominant culture. It’s not producing or generating things that are objective, it is producing things based upon the cultural content that it is fed. And so the more limited that is, I think, the more problems we have with bias and inequality in the outputs.
Yves: I think that’s a perfect place to end the interview, and a good place to get us thinking about next steps on where, perhaps, copyright law should go. I know at the beginning you said it would probably not be completely upended, the law itself, but perhaps AI does present an opportunity to revisit the purpose of our copyright system.
Carys: I think if we can redirect copyright to worrying about human authors and creators and a vibrant, participatory culture, then that will lead to both a better cultural environment for all of us and a better copyright system.
Yves: Carys Craig, thank you so much for joining us on the show. That was a fascinating conversation.
Carys: Thank you for having me.