Data Forecasting
How dump trucks of data are being used to predict crime.
“We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.”
— Pierre-Simon Laplace, A Philosophical Essay on Probabilities
There’s an old thought experiment called Laplace’s Demon. It posits that given a nearly infinite amount of computational power, a being able to accurately know and analyze the position and velocity of every single atom in the universe could, by all accounts, predict the future with certainty. It’s a fun if ultimately futile idea. Seth Lloyd, a professor at MIT, actually calculated the amount of information that makes up the universe. According to him, that number clocks in at around ten to the ninetieth power bits. The most powerful supercomputer to date can perform on the order of ten to the eighteenth power calculations per second. Even assuming our world is deterministic, we have quite a while to go before we’re anywhere near that level of computational power.
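To put those magnitudes side by side, here’s a quick back-of-the-envelope calculation. It’s only a sketch, assuming Lloyd’s figures above and a naive rate of one operation per bit:

```python
# Back-of-the-envelope: how far is today's hardware from "Laplace's Demon" scale?
universe_bits = 10**90        # Lloyd's estimate of the universe's information content
ops_per_second = 10**18       # roughly an exascale supercomputer
age_of_universe_s = 4.3e17    # ~13.8 billion years, in seconds

# Assume (generously) the demon needs just one operation per bit.
seconds_needed = universe_bits / ops_per_second
print(f"{seconds_needed:.1e} seconds")                            # ~1.0e+72
print(f"{seconds_needed / age_of_universe_s:.1e} universe-ages")  # ~2.3e+54
```

Even with that absurdly generous assumption, the machine would need on the order of 10^54 lifetimes of the universe to finish.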
That being said, the tech industry’s ability to make pretty good guesses is getting better by the year. Some might even argue that these companies know us better than we know ourselves, quietly filling in details that until now have been missing. Advertising is the most prominent arena, but the power of data and these opaque algorithms is already spilling over into more nefarious places.
Advertising
Recently, I went to a rare book festival. Before I went, I had no idea what to expect. I didn’t even know if I liked rare books, but the thought intrigued me. So I went, and even if I wasn’t about to drop tens of thousands of dollars on a first edition of Fahrenheit 451, I thoroughly enjoyed myself. I will probably even go back next year. I must admit, though, that I’d have never known about the festival if it weren’t for an ad I saw on Instagram.
Instagram’s ads have gotten really good, and I regularly find myself clicking on them. More often than not, I even save the ad for future perusal if I don’t buy the product outright. When I think about all of this now, part of me wouldn’t change a thing. The argument goes, “What’s so wrong with Instagram intimately knowing me? My likes and dislikes? If I’m forced to see ads, it’s better than the alternative.” What’s so bad about Instagram serving me ads for things I didn’t even know I wanted?
It’s an intriguing thought. Here’s another question. When you saw that advertisement on Instagram, did you buy that pair of sustainably made, faux leather boots, or did Instagram manipulate you into buying them? Does Instagram know from your engagement and purchase history that you usually click on ads at night on the Thursday just before payday? Did it analyze your posts to assess your interest in sustainability, as well as in looking fly as hell? And that post in your feed last week, outlining the thoughts of that same company’s CEO on the future of sustainability and how important it is: coincidence or intentional?
Let me provide some scope. In 2018, Dylan Curran, writing for the Guardian, laid it all out. Google knows where you’ve been, everything you’ve ever searched, all the apps you use, all of your YouTube history, your bookmarks, emails, contacts, your Google Drive files, the photos you’ve taken on your phone, the businesses you’ve bought from, the products you’ve bought through Google, data from your calendar, your Google Hangouts sessions, the music you listen to, the Google Books you’ve purchased, the Google Groups you’re in, the websites you’ve created, the phones you’ve owned, the pages you’ve shared, and how many steps you walk in a day. According to Dylan, the amount of data Google has on you could fill millions of Word documents.
Facebook is even worse. According to Dylan, Facebook’s data on you “includes every message you’ve ever sent or been sent, every file you’ve ever sent or been sent, all the contacts in your phone, and all the audio messages you’ve ever sent or been sent…what it thinks you might be interested in based off the things you’ve liked and what you and your friends talk about…every time you log in to Facebook, where you logged in from, what time, and from what device…all the applications you’ve ever had connected to your Facebook account…tracking where you are, what applications you have installed, when you use them, what you use them for, access to your webcam and microphone at any time, your contacts, your emails, your calendar, your call history, the messages you send and receive, the files you download, the games you play, your photos and videos, your music, your search history, your browsing history, even what radio stations you listen to.”
In the end, you might say, “It doesn’t matter! No one put a gun to my head. I put in those payment details all by myself.” Sure, you’re right! Instagram didn’t technically force you to do anything. All it did in this scenario was make the guess that, given a certain set of conditions, you were more likely than not to engage with a given ad. Benign as it may seem, corporations accurately predicting your actions is a slippery slope, to say the least.
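For illustration, here’s a minimal sketch of the kind of engagement prediction being described. Every feature name and data point is hypothetical, and real ad systems use vastly richer signals, but the core idea of scoring a “propensity to click” looks something like this:

```python
# Hypothetical click-propensity model: given conditions about a user and an
# impression, estimate the probability they engage with the ad.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [hour_of_day, days_until_payday, sustainability_affinity, past_ad_clicks]
# (invented features; the data below is fabricated for illustration)
X = np.array([
    [21, 1, 0.9, 14],   # Thursday night, payday tomorrow, high affinity -> clicked
    [9, 12, 0.1, 2],    # weekday morning, mid pay cycle, low affinity -> ignored
    [22, 2, 0.8, 11],
    [14, 20, 0.2, 1],
])
y = np.array([1, 0, 1, 0])  # 1 = engaged with the ad, 0 = did not

model = LogisticRegression().fit(X, y)

# Score a new impression: 10pm, two days to payday, strong affinity, frequent clicker.
propensity = model.predict_proba([[22, 2, 0.85, 12]])[0, 1]
print(f"Predicted chance of engagement: {propensity:.0%}")
```

Nothing in that sketch forces anyone to buy anything. It just waits for the moment the odds tip in its favor.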
Predictive Policing
In early 2019, Motherboard obtained a number of documents showing that dozens of cities around the country, if not more, had been augmenting their police work with data-driven algorithms dedicated to “predictive policing”; that is, forecasting crimes in the hopes of preventing them from ever happening in the first place. From Los Angeles to New York City to New Orleans to cities up and down the West Coast, police departments have been found trying out the practice, with more signing up every year.
One of the most prominent suppliers of this predictive policing software is a company called PredPol. According to their website, “PredPol grew out of a research project between the Los Angeles Police Department and UCLA. The chief at the time, Bill Bratton, wanted to find a way to use COMPSTAT data for more than just historical purposes. The goal was to understand if this data could provide any forward-looking recommendations as to where and when additional crimes could occur. Being able to anticipate these crime locations and times could allow officers to pre-emptively deploy officers and help prevent these crimes.”
Bill Bratton, for those who don’t know, was an early proponent of the “broken windows” theory of policing during his time as commissioner of the New York Police Department. The theory essentially states that the best way to deter serious crime is to aggressively police misdemeanors and minor infractions. It has largely been discredited, with many studies showing that the link between petty crimes and felonies is negligible at best. All it really accomplished was destroying the trust between minority communities and police.
Despite all that, the theory has guided police work for decades, acting as the foundation of Michael Bloomberg’s notorious “stop and frisk” policy and serving as the bedrock of these predictive policing algorithms. Nathan Munn, writing for Vice, says, “Police training documents from PredPol, a company that sells predictive policing software, shows that the company considers its software to be comparable to a “broken windows” policing strategy that has led to over-policing of minority communities and is widely believed to be ineffective.”
The same criticisms have been leveled at PredPol, and the evidence seems to line up. While the company claims it relies on only three data points for its analyses (when, where, and what happened), the software is generally built on historical police data gathered over decades. That data, thanks to practices like broken windows, stop and frisk, and redlining, is believed to be startlingly biased. Some police departments have wised up and at least announced reviews of the effectiveness of software like this. In Los Angeles, the LAPD announced a review of its predictive policing tactics after a scandal in which over a dozen officers intentionally mislabeled people as gang members in order to populate a statewide gang database. Beryl Lipton, writing for MuckRock, says, “Predictive policing software is being used by law enforcement agencies nationwide, yet a review suggests that almost none of its users, past or present, have clear ways to measure the effectiveness or accuracy of the tool, a lack of oversight many researchers consider irresponsible.”
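To make the bias criticism concrete, here’s a deliberately simplified, hypothetical sketch of place-based forecasting. This is not PredPol’s actual model, which is a more sophisticated self-exciting point process; it only shows how a system fed nothing but “when, where, and what” can still inherit bias from the reporting history itself:

```python
# Simplified illustration (not PredPol's real algorithm): score grid cells by
# recency-weighted counts of past incident reports.
import math
from collections import defaultdict

HALF_LIFE_DAYS = 30.0  # assumed decay rate; recent incidents count more

def hotspot_scores(incidents, today):
    """incidents: list of (grid_cell, day_reported); returns cell -> score."""
    scores = defaultdict(float)
    for cell, day in incidents:
        age = today - day
        scores[cell] += 0.5 ** (age / HALF_LIFE_DAYS)  # exponential decay
    return scores

# Only "when, where, what" ever enters the model -- but the incident log itself
# reflects where officers already patrolled. Heavily policed neighborhoods
# generate more reports, which generate more patrols: a feedback loop.
incidents = [("cell_A", 350), ("cell_A", 358), ("cell_A", 361), ("cell_B", 200)]
ranked = sorted(hotspot_scores(incidents, today=365).items(),
                key=lambda kv: kv[1], reverse=True)
print(ranked)  # cell_A dominates purely because of its reporting history
```

The model never sees race or income, yet if the historical reports are skewed, its “objective” forecasts simply launder that skew into patrol assignments.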
The troubling thing is that this type of data-driven police work extends far beyond PredPol. It’s widely known that police departments across the country have for years employed everything from automated license plate readers to gunshot detection systems to facial recognition to aid in policing. Clearview AI, a Silicon Valley facial recognition startup, has already contracted with dozens of police departments for use of its proprietary technology. Ring, a home security company owned by Amazon, has partnered with police departments all over the country as well. These few companies are just the tip of the iceberg. I haven’t even mentioned the slew of companies champing at the bit to supercharge ICE’s immigration enforcement capabilities.
Simply put, it’s not hard to imagine a future of perfect surveillance, where all a police entity needs is interest in you, for whatever reason, and with the click of a few buttons it could have every picture you’ve posted on Facebook and Instagram, your exact location dating back five years, where you work, where you live, who your family is, who your friends are, and your daily routine down to the minute.
This is the unintended consequence of data. This is what we agree to, day in and day out, when we give our most personal details away. Suddenly, we’ve gone from Instagram accurately predicting whether you’d like those shoes to the NYPD actively determining whether you’re likely to commit a crime.