The Right to Data Portability Is Kind of a Joke, But It Could Be So Much More
A modest proposal for reforming the Right to Data Portability as a means to curate our digital identities.
Sometimes, the best ideas for blog topics (or anything, really) come over a good meal with an amiable companion and a few glasses of wine. A few days ago, Husbot and I had a lovely meal at Six by Nico [1] and, as one does after a few glasses, randomly ended up on the topic of data protection, and specifically on an aspect of data privacy and rights that is frequently overlooked: data portability. It all started with a rant.
David was relaying how our Netflix account was suspended after I’d canceled the credit card it’s attached to, and neither of us had remembered to add a new card. He got the warning email of course, but hadn’t gotten around to correcting it — after all, it’s summer and a time to enjoy the long days — not binge watch content. Then he got another email, which informed us that if we didn’t pay, our account data would be deleted in ten months. Uh oh.
That led to the following discussion, which I am liberally paraphrasing:
David: I mean, ten months isn’t very long. What if I fell into a coma tomorrow and then woke up 11 months later? No more Netflix! No more recommendations.
Me: You could download your data, of course.
David: But can I upload it back to Netflix? Or do I start from scratch?
I didn’t know. [NB: It turns out that you can, in fact, migrate or transfer your watch history, likes, etc. to a new account, likely a result of Netflix’s crackdown on sharing accounts].
A Glaring Port(ability)-Hole
But this is much larger than Netflix and extends to much of our digital lives. Many companies comply with the GDPR’s Data Portability obligations, at least in spirit. That is, they allow you to export or ‘Download your Data’, but they may not always bother with the other bits – for example, making the data available in “a structured, commonly used and machine-readable format” or transmitting that data to others who make use of our data. PDFs may be machine-readable, but I don’t know if I’d call them structured in a way that the drafters envisioned. Now, before I go down yet another data protection rabbit hole, it’s worth mentioning that portability rights are restricted to a few specific scenarios, [2] but a simple way to think about it is:
a) if your data lives on a computer or in a database somewhere; and
b) you provided the data directly (or through some automated means) to a person (or company) who is doing things with it; and
c) they’re doing stuff with your data based on either you telling them it’s ok (consent) or them providing a product or service to you that’s governed by a contract, then
you should be able to get your data back out again and move it around to somewhere else (like a competitor of theirs). I’d argue that something like your Netflix watch history, your contact lists, connected car data, posts on any social network, and say, places you favorite or visit on Google Maps, all meet those standards.
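For the technically inclined, the rule of thumb above can be sketched as a simple check. This is a rough illustration of conditions (a) through (c) only, not a full statement of the law; the names `DataRecord`, `is_portable`, and `PORTABLE_LEGAL_BASES` are hypothetical and invented for this example.

```python
from dataclasses import dataclass

# Assumption: the two legal bases named in condition (c) above.
PORTABLE_LEGAL_BASES = {"consent", "contract"}


@dataclass
class DataRecord:
    """Hypothetical description of one piece of personal data held by a company."""
    stored_digitally: bool   # (a) lives on a computer or in a database
    provided_by_user: bool   # (b) given directly, or collected by automated means
    legal_basis: str         # (c) the basis the company relies on to process it


def is_portable(record: DataRecord) -> bool:
    """Return True only if all three rule-of-thumb conditions hold."""
    return (
        record.stored_digitally
        and record.provided_by_user
        and record.legal_basis in PORTABLE_LEGAL_BASES
    )


watch_history = DataRecord(stored_digitally=True, provided_by_user=True, legal_basis="contract")
inferred_profile = DataRecord(stored_digitally=True, provided_by_user=False, legal_basis="contract")

print(is_portable(watch_history))     # True
print(is_portable(inferred_profile))  # False
```

The second record fails only on condition (b): it was derived by the company rather than provided by the user.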
What’s interesting about data portability is that while the drafters explicitly contemplated migrating data from one company to another, it appears that a rather obvious use case may have been missed: namely, how data portability can be a powerful privacy tool for migrating, and more importantly, curating data within a system you’re already using.
On Castaways, Comas, and Curation
Getting back to David’s original point – it would be nice to know that if I disappeared off the internet, got trapped on a desert island, or fell into a coma for a few years, I could still recreate the digital life I had – or at least some of it. It’s certainly better, from a privacy, security, and compliance perspective, to build in functionality that allows users to easily re-import saved data than to rely on the default of storing it forever. I’m sure it’s also perfectly technically possible to build in importability – Netflix can do this, and both Apple and Google can do this easily enough when it comes to migrating phones or laptops.
But there are also a few more cases where exporting and re-importing data can be valuable and offer real-world benefits to users, specifically when we consider portability from the perspective of data curation.
Few of us are static creatures. What we like and dislike may change over time. Like hairstyles, careers, relationship statuses, and flirtations with libertarianism, things change as we grow older, experience more of life, and evolve as people. Some of us go further, and change our names, our sexual orientations, or even our genders. It makes a great deal of sense then to empower users with a means to easily and selectively delete or modify records and account details that they feel no longer reflect who they are as people. Empowering people with the ability to directly control their digital stories and lives online means that social media and publishing sites won’t continue to deadname people. It means that we would have the ability to keep our online identities, without necessarily keeping every single awkward, painful, or regretted memory exposed in a database somewhere. Implementing better portability tools can make curating our digital lives easier.
Now, some of you might be thinking, “Uh, Carey, can’t you already do this via other data subject rights (like rectification and deletion)?” And the answer is, of course you can. But rectification and deletion rights are (usually) manual, and often require me as the user to either first file an access request (to uncover what information I want to correct or delete), or already know what information I want to correct or delete ahead of time.
Guys, I can’t remember what I ate for breakfast this morning. I certainly can’t remember enough details to inform Google about what restaurants I favorited in 2015 that I no longer care about, or tell Amazon that I no longer want them to keep that (hypothetical) sex toy order I made in 2005. Also, I cannot begin to explain all the terrible, terrible movies I watched in 2012 on Netflix.
But the access/rectification/deletion approach also means more work for the controller of the data – it means companies who use our data need to invest man-hours by having one (or often, multiple) employees manually doing something, in a rush, and probably doing it badly.
When Elon bought Twitter and drove the site off a cliff of his own creation, it would have been great if Twitter made it possible for me to download all my tweets, delete 90% of them, and then re-upload the 10% of data I didn’t mind memorializing on the internet. Instead, I relied on a third-party tool called Semiphemeral (that I’m pretty sure no longer works thanks to Elon’s stupid API policy that might itself break portability rules?) to kinda sorta do what I probably could have done more quickly in Excel.
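To make the download, prune, and re-upload idea concrete, here is a minimal sketch of the pruning step. Everything in it is an assumption for illustration: the export format (a JSON list of posts with `id`, `created`, and `text` fields), the `curate_export` helper, and the idea that a platform would accept the resulting file back. No real platform’s export schema or import API is being described.

```python
import json
from datetime import date


def curate_export(posts, keep_ids=frozenset(), cutoff=date.min):
    """Keep a post if it is explicitly listed in keep_ids or was created on/after cutoff."""
    kept = []
    for post in posts:
        created = date.fromisoformat(post["created"])
        if post["id"] in keep_ids or created >= cutoff:
            kept.append(post)
    return kept


# Hypothetical export file contents; a real export would be read from disk.
raw_export = """[
  {"id": 1, "created": "2012-06-01", "text": "Hot take I now regret"},
  {"id": 2, "created": "2015-09-14", "text": "Conference notes worth keeping"},
  {"id": 3, "created": "2022-11-02", "text": "Still stands"}
]"""

posts = json.loads(raw_export)

# Keep everything from 2020 onward, plus one older post chosen by hand.
curated = curate_export(posts, keep_ids={2}, cutoff=date(2020, 1, 1))

# The curated list, serialized in the same structure it arrived in,
# is what a (hypothetical) re-import feature would accept.
reimport_payload = json.dumps(curated, indent=2)
print([post["id"] for post in curated])  # [2, 3]
```

The point of the sketch is that the pruning itself is trivial once the export is structured; the missing piece is the re-import side.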
Moving Towards Trust (And Away From ‘Data as Toxic Waste’)
Data curation-through-portability also means that the companies and organizations who benefit from our data will have accurate, meaningful, quality data about us – not just all the data about us. If I can tell Google that I really dig cats, data protection, beer, and coffee, and have zero interest in that one ‘Love Island’ link I clicked on three years ago while drunk, it’s better for everyone. It means that Google only blasts ads and content that I find relevant and am more likely to click on. It means they gain insights about me that they can actually use, and that I don’t mind them having, because I’ve curated what they know. Or, as the Article 29 Working Party (the predecessor to the European Data Protection Board) stated in 2017:
By affirming individuals’ personal rights and control over the personal data concerning them, data portability also represents an opportunity to “re-balance” the relationship between data subjects and data controllers. [3]
Again, some of these features do exist, in bits, spread out in fragments that are hard for users to find, much less manage. I do a regular curation of my data across Google, various social media sites, data aggregators and the like. But it’s an incredibly time-consuming, tedious, and research-intensive process that requires a great deal of patience and persistence. If I could instead easily download, purge Google’s current memory of me, and reupload what I want Google to know every six months or every year or so, I win, and Google wins. I trust them more, not less.
It also means that we move away from polarizing conversations where you’re either resigned to letting Big Tech know everything about you or forced to exert a tremendous amount of energy trying to hide from tech companies altogether. With a ‘curation right’ incorporated into data portability, we might instead get to conversations about trust, mutually beneficial insights, and the value in selective sharing.
I honestly think that most companies want accurate, relevant, and timely data about us. Not all the data. Based on my experiences talking with actual engineers and product teams across the tech sector, the problem isn’t data greed – it’s fractal complexity. It is simply too difficult to sift out meaningful, useful insights about us while also purging the noise, so most companies default to keeping everything. Loss aversion is a thing, we’re terrible at intuiting value, and it’s easier to save than delete. It is also (still, probably) less costly when our estimates of value turn out to be wrong. [4]
The data-curation-through-portability approach meaningfully gives ‘control’ to users in a way that the traditional data ownership models usually can’t. For those not familiar, data ownership is the idea that we should be able to monetize our personal data, selling it to the companies we want for a profit. The problem with data ownership models I’ve seen in practice is that they simply don’t scale. Data ownership is difficult to execute and most of us can’t be bothered to put in the work for a few cents here and there.
And maybe that’s true for the portability/curation approach as well. I don’t know, because it’s not really a thing yet. If someone, somewhere is reading this, and they want to help make it a thing, let’s talk. To be effective though, I think it will involve both legislative and cultural shifts in how we look at, think about, regulate, and commercialize data. It will absolutely require the insight and expertise of technologists (and not just lawyers) in terms of execution. It will also require thoughtful discussions about intellectual property, and the portability of inferred data. We may also need to take this whole discussion out of the realm of data protection entirely – there are heavy consumer rights aspects to consider when we talk about access, data portability, interoperability, free movement of information and the like.
Closing Thoughts
In short, data portability isn’t just about interoperability between platforms; it can also be a powerful tool for privacy, trust, and data curation. Done right, it can liberate us to control our digital stories and move away from an ‘all-or-nothing’ approach to data sharing. To make this happen, we’ll need collaboration, legislative and cultural shifts, and new discussions about intellectual property.
But I want to hear from you. Do you have ideas about how we can build out data portability into a right that gives users power? Do you have examples you’ve seen where the data curation approach has been done well? What am I missing?
Leave your comments below, or reach out to me on social media – I’m Privacat everywhere.
[1] I highly recommend checking it out. We ate in Dublin, but they have a few different spots throughout the UK and other parts of Europe. Their schtick is a six-course menu that rotates roughly every six weeks, but oddly only includes five wines.
[2] Here’s the dorky legal summary, which I have begun to relegate to footnotes, because as my husband reminds me, “Most people don’t care” about the dorky legal summaries. Data portability only applies where a data subject (you or me) has provided information directly to a controller or processor (e.g., Google, Microsoft, your bank, etc.), or in some cases, if that data is obtained automatically – for example, by logging a user’s website activity or viewing habits, location data picked up by Google Maps, or badge reader access. Data portability also only really kicks in if the controller is relying on consent, or performance of a contract as their legal basis for processing your data. The ICO actually has a pretty good summary of the data portability right here.
Importantly, the data portability obligations don’t extend to data that’s derived or inferred by the controller (so, a profile that Google might make about you as a customer based on a collection of data), though that information does need to be provided in a data subject access request. Also, I suspect that after the Meta case, this whole inferred v. automatically discovered distinction might get messier in practice.
[3] Article 29 Working Party, “Guidelines on the right to data portability” (WP 242), revised 5th April 2017: http://ec.europa.eu/newsroom/document.cfm?doc_id=44099.
[4] Which is to say, the expense of lost valuable data is concrete and tangible. If you run Google Photos and you start randomly deleting people’s old photos, you’re going to have a lot of pissed-off people, and you’ll probably get sued and lose. But if you just keep everything and get hacked, you might get sued, or a regulator might jump up and down. But I still think the odds are lower.
"after all, it’s summer and a time to enjoy the long days", she says, while desperately trying to defeat a boss on Zelda.