The following is a guest post by Alistair Croll.
A couple of years ago, I spoke with a European Union diplomat, who shall remain nameless, about the governing body’s attitude to privacy.
“Do you know why the French hate traffic cameras?” he asked me. “It’s because it makes it hard for them to cheat on their spouses.”
He contended that while it was possible for a couple to overlook subtle signs of infidelity—a brush of lipstick on a collar, a stray hair, or the smell of a man’s cologne—the hard proof of a speeding ticket given on the way to an afternoon tryst couldn’t be ignored.
Humans live in these grey areas. A 65 mph speed limit is really a suggestion; it’s up to the officer to enforce that limit. That allows for context: a reckless teen might get pulled over for going 70, but a careful driver can go 75 without incident.
But a computer that’s programmed to issue tickets to speeders doesn’t have that ambiguity. And its accusations are hard to ignore, because they’re factual, rooted in hard data and numbers.
Did Big Data kill privacy?
With the rise of a data-driven society, it’s tempting to pronounce privacy dead. Each time we connect to a new service or network, we’re agreeing to leave a digital bread-crumb trail behind us. And increasingly, not connecting makes us social pariahs, leaving others to wonder what we have to hide.
But maybe privacy is a fiction. For millennia—before the rise of city-states—we lived in villages. Gossip, hearsay, and whisperings heard through thin-walled huts were the norm.
Shared moral values and social pressure helped groups compete better against other groups, helping to evolve the societies and religions that dominate the world today. Humans thrive in part because of our groupish nature—which is why moral psychologist Jonathan Haidt says we’re ninety percent chimp and ten percent bee. We might have evolved as selfish individuals, but we conquered the earth as selfish teams.
In other words, being private is relatively new, perhaps only transient, and gossip helped us get here.
Prediction isn’t invasion
Much of what we see as technology’s invasion of privacy is really just prediction. As we connect the world’s databases—tying together smartphones, loyalty programs, medical records, and the other constellations in the galaxy of our online lives—we’re doing something that looks a lot like invading privacy. But it’s not.
Big Data doesn’t peer into your browser history or look through your bedside table to figure out what porn you like; rather, it infers your taste in smut from the kind of music you like. Big Data doesn’t administer a pregnancy test; instead, it guesses you’re pregnant because of what you buy. Many of Big Data’s predictions are a boon, helping us to fight disease, devote resources to the right problems, and pinpoint ways to help the disadvantaged.
Is prediction an invasion of privacy? Not really. Companies will compete based on their ability to guess what’s going to happen. We’re simply taking the inefficiency out of the way we’ve dealt with risk in the past. Algorithms can be wrong, of course. Prediction only becomes a problem when we cross the moral Rubicon of prejudice: treating you differently because of those predictions, changing the starting conditions for unfair reasons.
Unfortunately, Big Data’s predictions are often frighteningly accurate, so the temptation to treat them as fact is almost overwhelming. Policing based on prediction starts to look like thoughtcrime. Tomorrow’s just society will have to be a skeptical one.
We’re leakier than we know
Long before the Internet, we left a bread-crumb trail of personal data behind us: call history, credit-card receipts, car mileage, bank records, music purchases, library check-outs, and so on.
But until Big Data, baking the breadcrumbs back into a loaf was hard. Paper records were messy, and physical copies were hard to collect. Unless you were being pursued by an army of investigators, the patterns of your life remained hidden in plain sight. We weren’t really private—we just felt like we were, and it was too hard for others to prove otherwise without a lot of work.
No more. Big Data represents a radical drop in the cost of tying together vast amounts of disparate data quickly. Digital records are clean, easy to analyze, and trivial to copy. That means the illusion of personal privacy is vanishing—but we should remember that it’s always been an illusion.
Our digital lives make this even more true. We’re probably not aware of what’s being collected as we surf the web—but it’s pretty easy to tell where someone’s been through browser trickery, cross-site advertising, and the like. So when a politician calls for your vote, they may know more about you than you want. But let’s not confound promiscuous surfing behavior—leaving more breadcrumbs—with an improved ability to bake those crumbs back into a loaf.
Big Data didn’t force us to overshare; it’s just better at noticing when we do and deriving meaning from it. And because of this, it’s back to thin-walled huts and gossip. Only this time, because it’s digital and machine-driven, there are a couple of important twists to consider.
This ain’t your ancestor’s privacy
There are two key differences, however, between our ancestors’ gossip-filled, thin-walled villages and today’s global digital village.
First, consider the two-way flow of gossip. A thousand years ago, word-of-mouth worked both ways. Someone who told tales too often risked ostracism. We could confront our accusers. Social mores were a careful balance of shame and approval, with checks and balances.
That balance is gone. We can’t confront our digital accusers. If we’re denied a loan, we lack the tools to understand why. Often, we aren’t even aware that we’ve been painted with a digital scarlet letter. As one Oxford professor put it, “nobody knows the offer they didn’t receive.”
Big Data is whispering things about us—both inferred predictions and assembled truths—and we don’t even know it.
Second, everyone knew gossip was imperfect. We’ve all played “broken telephone” and seen how easily many mouths distort a message. We’re skeptical of a single truth. We’ve learned to forgive, to question.
The same studies that show groups should ostracize those who don’t chip in also suggest that the best strategy of all is to forgive occasionally—just in case the initial failure was an honest mistake. In other words, when dealing with whispered truths, we lived life in a grey area.
Unfortunately, digital accusations—like those made by traffic cameras—leave little room for mercy and tolerance, because they lack that grey area in which much of human interaction thrives. If we’re going to build data-driven systems, then those systems need grey areas.
New rules for the new transparency
In the timeline of human history, privacy is relatively recent. It may even be that privacy was an anomaly, that our social natures rely on leakage to thrive, and that we’re nearing the end of a transient time where the walls between us gave us the illusion of secrecy.
But now that technology is tearing down those walls, we need checks and balances to ensure that we don’t let predictions become prejudices. Even when those predictions are based in fact, we must build both context and mercy into the data-driven decisions that govern our quantified future.
This post first appeared on Solve for Interesting and has been lightly edited.
Alistair has been an entrepreneur, author, and public speaker for nearly 20 years. In that time, he’s worked on a variety of topics, from web performance to big data, cloud computing, and startups. In 2001, he co-founded web performance startup Coradiant (acquired by BMC in 2011), and since that time has also launched Rednod, CloudOps, Bitcurrent, Year One Labs, the Bitnorth conference, the International Startup Festival and several other early-stage companies. Alistair is the chair of O’Reilly’s Strata conference, Techweb’s Cloud Connect, and the International Startup Festival. Lean Analytics is his fourth book on analytics, technology, and entrepreneurship. He lives in Montreal, Canada and tries to mitigate chronic ADD by writing about far too many things at Solve For Interesting.
“Treating you differently because of those predictions, changing the starting conditions for unfair reasons.” This, IMO, is itself a grey area in business practice, and it already exists in today’s business world.
Think about commonly practiced price discrimination: isn’t that based on the fact that customers have different (pre)conditions, and on an asymmetry of knowledge?
Absolutely. I think the Big Data problem is its systemic nature and scale. That presents a problem that can’t be overcome the way other forms of discrimination can. No?