What Do Cakes and LLMs Have in Common?
I spoke at the recent IAPP Barcelona event and I learned so much about communicating complex ideas and the value of translation, precisely because I don't really speak Spanish!
There’s an image macro/meme out there that’s been around forever, featuring a man pushing a small domino into a line of ever-larger dominoes. It’s the perfect representation of the “domino effect”, and it usually makes me laugh, especially when people add text.
My life after I started writing this blog has been a lot like that first little domino falling. In June 2024, I started digging into a simple question: can an LLM ‘forget’ what it has learned? Over the next six months, this question consumed my life. I dug into the research, learned about the increasingly hot topic of ‘machine unlearning’, and wrote a series of posts attempting to translate the technical details into something lawyers, policymakers, and AI governance professionals could understand without having to read 50+ papers on the subject or learn linear algebra.
Since those posts, I’ve given presentations on this topic multiple times, and been on more podcasts than I remember. While the specific situations & contexts change, one thing remains constant: people really seem to appreciate my meager attempts at translating complex technical abstractions into lawyer-friendly language.
Currently, I think I’m somewhere in the middle domino-wise, but I need to explain the backstory.
In November, while visiting Madeira to avoid the depression-inducing situation that is November in Ireland, I shared a funny little quirk related to the .cat domain name I use for this blog. Namely, that by virtue of ICANN domain registration rules, I had a legal obligation to write about Catalonian culture, its people, customs, and language.
And so, partly out of my appreciation of quirky rules, and partly because I really love Barcelona, I started planning a trip to Spain and Andorra, two regions where Catalan is a native tongue, and in the case of Andorra, the official language. I’ll be writing more about Catalan culture and the weird little tax haven that is Andorra’s capital city, Andorra la Vella, in a follow-up post.
That November post led to Ramon Baradat Marí reaching out to me on LinkedIn, and suggesting that we should arrange a talk for the local IAPP Barcelona section. Ramon, an attorney with Cuatrecasas, is also the IAPP Chapter chair for Barcelona, and coincidentally, he happens to be one of the nicest, and probably most doggedly persistent individuals I’ve ever met. Over the next six months, he put the plan in place, working with Josuan Eguiluz Castañeira (of ESADE), Rodrigo Quintas Ferrín (HP), and Paloma Rodríguez Carreras and Sara Cobo Mañas (also of ESADE) to make this happen. Josuan and Rodrigo are equally fantastic, warm, and kind human beings, and I was so happy to meet them, and Paloma and Sara knocked it out of the park events-wise. I cannot say enough how much I admire them, and how truly grateful I am for the opportunity.
Translation is Tricky Business
And so, this past Wednesday, it finally happened. My co-panelists, Manel Santilari Barnach (an attorney with Clifford Chance), and Francisco Perez Bes (Deputy Commissioner, Agencia Española de Protección de Datos (AEPD)), also shared the stage with me, along with Rodrigo, Josuan, and Ramon. The subject of the evening was the distinct challenges of reconciling AI and machine learning systems with laws like the GDPR and AI Act. Manel & Francisco were brilliant, and brought refreshing practitioner & regulatory perspectives on the matter. I also learned that the AEPD has been thinking about AI for quite some time and has a fair bit to say on the subject.
But rather than summarize the talk itself, I wanted to touch on something understated, overlooked, but nonetheless extremely important that I learned: The value of good translation.
As I mentioned, this was held in Barcelona, and unsurprisingly, seeing as how I was in Spain and all, everything was en español, barring my presentation. My limited knowledge of Spanish consists of ‘dónde está el baño’ and ‘más cerveza, por favor’. I can pick out words, and generally read enough hand signals to get by and not get arrested in Spain, but everybody, including the organizers, realized that it would be cruel to subject the audience members to 20 minutes of me torturando el español in the process of explaining machine unlearning. Fortunately, the folks at ESADE provided me with a live translator for the other panelists’ presentations. This led me to two observations:
Having live translation made me feel very important, like I was speaking in front of the European Commission or the United Nations or something (the huge audience size didn’t hurt!); and
Translation is tricky business.
The gentleman who translated my colleagues’ expert insights from Spanish into discernible English had a tough job. Spanish, particularly in Spain, is spoken quickly. By some measures, it’s second only to Japanese in how many syllables the average speaker gets through per second. Additionally, Spanish has loads of idiomatic terms that don’t easily translate. Add to that the complexity of translating the distinct languages of law and technology, plus having to do all of this live, and one can see how profoundly hard his work was. Hard, but absolutely critical.
If my anonymous translator friend hadn’t been there, it would have been a much different, and far less enlightening experience for me. In fact, I felt a little bad that the simultaneous translation option wasn’t available for the small, impromptu cheering section of English-speaking friends who attended on my behalf.[1] Google Translate just doesn’t cut it when people are speaking quickly to a primarily native-speaking audience.
But the need for translation goes beyond purely linguistic barriers. It also needs to extend to sharing thoughts and ideas clearly and simply across distinct domains. Be they technical, legal, philosophical, or data-driven, so many of us are speaking incomprehensibly at one another all the time. It’s no wonder that the bilingualists of binary don’t understand lingua legis, or that we’re all a little confused when the interpreters of ethics and the decoders of data speak in their mother tongues.
Machine (Un)Learning & Cakes
One of the most consistent points of feedback I receive when I give my machine unlearning talk is that I made what was a previously opaque thing slightly less so. My personal approach is to rely on analogies and metaphors. This drives David (Husbot) nuts when we get into a fight, but it works surprisingly well in other situations.
In fact, most of the audience comments and LinkedIn praise I received after Wednesday’s talk emphasized how my very simple cake analogy for machine unlearning helped them really get their heads around how complicated machine unlearning actually is.
In my presentation, I analogize machine learning and unlearning to baking a cake. You start with a set of ingredients—flour, milk, eggs, butter, sugar, nuts, fruit—you combine them all, and then you bake the cake. In the context of training non-deterministic attention models like ChatGPT or Claude, the ‘ingredients’ include the training data itself, but more importantly, they also include the sequences of how words and phrases appear in the training data, the weights and parameters derived from those sequences, and any further refinements/fine-tuning that is done.
Depending on the model, how it was trained, what data it relied on, and what you’re trying to remove/change, untraining a model might be possible, or it might not. If you’re willing to bake a new cake from scratch (full model retraining), you’re golden. It’s expensive, but doable. If you’re trying to pick out just the nuts or fruit (e.g., the personal data of one uniquely identifiable person), it’s possible, but extremely hard to do right and not guaranteed (you might miss a walnut here, or a raisin there).
But, if you’re trying to do this in compliance with all the data subject rights (rectification, erasure, objection, etc.), for loads of data subjects, based on training data from the internet, you might as well be asking the baker to remove all the eggs from the cake after it comes out of the oven.
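For readers who like to see the idea in code, here is a minimal sketch of the cake problem. It is a toy under loud assumptions: a one-number "model" fit by least squares, nothing like how ChatGPT or Claude are actually trained, and every name in it (`train`, `dataset`, `baked_weight`) is made up for illustration. It only shows why deleting a record from the training set does nothing to the already-baked model, while retraining from scratch does.

```python
# Toy illustration only: a one-parameter "model" fit by least squares.
# All names and numbers here are hypothetical, chosen for the analogy.

def train(examples):
    """Fit y ≈ w * x by least squares and return the single weight w."""
    numerator = sum(x * y for x, y in examples)
    denominator = sum(x * x for x, _ in examples)
    return numerator / denominator

# The "ingredients": (x, y) pairs. Pretend the last pair is one person's data.
dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 20.0)]

# "Baking the cake": every example gets blended into one number.
baked_weight = train(dataset)

# Deleting the record from the dataset leaves the baked weight untouched.
remaining = dataset[:-1]
weight_after_deletion = baked_weight

# "Baking a new cake from scratch": retrain without that record.
retrained_weight = train(remaining)

print("baked:", baked_weight)
print("after deleting the record only:", weight_after_deletion)
print("after full retraining:", retrained_weight)
```

Even in this tiny case, you can't look at the baked weight and point to the part that came from the deleted record; it has been mixed in with everything else. Real models have billions of weights rather than one, which is why picking out the nuts is so much harder than it sounds.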
In addition to being simple, I think the reason this analogy works is that there’s a shared understanding about what a ‘cake’ is and how baking works. The ingredients might be a little different from place to place, but the concept of a cake, or at least a sweet, baked item is pretty damn universal.
I think we’d all do better at building bridges and breaking down complexity if the ‘experts’ spent a little more time drawing on these types of shared understandings and moved away from using big, fancy technical words and sounding important in front of everyone. Or, as Debbie Reynolds mentioned on our podcast in January, being plain-spoken and clear.
That’s not to say that there’s never a place for technical details and precise language. Obviously, experts don’t necessarily need to dumb things down when they’re speaking in a room of fellow experts in their field. But those audiences aren’t where hearts and minds are won. They aren’t the places where meaningful dialogue or change is likely to occur, because, to be honest, most experts are quite comfortable hanging out in rooms where everyone mostly agrees with them and nods appropriately at correct intervals. Not to call out any particular industry association or anything, but that’s why many of the same people keep participating at the same events, over and over and over again.
We need more plain-spoken simplifiers, and more folks willing to come up with silly cake analogies. And we also need more translators, who can help bridge the gaps between all these disparate languages. And with that, I need a nap and some tapas.
[1] Part of the reason I planned this trip for May was that I knew that Beth Hochberg, a dear friend whom I’ve known since law school (20 years ago!), would be visiting from Washington, DC. I mentioned the event to her in passing over two bottles of wine, and she, along with another DC-mutual friend, Shawn (who fled the US and has been living the life in Barcelona for the last 7 years), trekked out to ESADE, along with my forever-patient husband David. We also made a new fellow expat friend, the CPO of Archer Daniels Midland, Ashley Slavik.
PS: I will run this through Claude or ChatGPT for a reasonably approximate Catalan translation. But I ran out of steam and need food...