I agree with the first half… It’s very easy to ingest and sift through insane amounts of data
What isn’t easy is doing so usefully. Yes, if you can link the account to a person, it’s trivial to pull up their records. But linking is easier said than done - it’s doable, but to make it scale you have to get the full records of device IDs, link them back to a phone number, then link that number to a person. At minimum, you’d need the telcos’ data
That’s a staggering amount of work - it’s much easier if the app also has phone numbers, but even then, where do you link it? The telcos have an account holder (who will often be a family member). The 50 separate DMVs might have more accurate links, but they’re largely legacy systems that will be a nightmare to work with. It’s doable, but it’s hard
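To make the linking chain concrete, here’s a minimal sketch of the device ID -> phone number -> person join. Everything in it is made up for illustration: the field names, the IDs, the numbers, and the `link` helper are all assumptions, not anyone’s real schema.

```python
# Toy data only: every ID, number, and name below is hypothetical.
app_records = [
    {"device_id": "dev-1"},
    {"device_id": "dev-2"},
    {"device_id": "dev-3"},
]

# Assumed telco-style lookups. dev-3 has no entry (a spare or prepaid phone).
device_to_number = {"dev-1": "555-0101", "dev-2": "555-0102"}

# Both numbers sit on one family plan, so both resolve to the account holder.
number_to_holder = {"555-0101": "Pat Doe", "555-0102": "Pat Doe"}


def link(records, device_to_number, number_to_holder):
    """Follow device -> number -> account holder; collect what can't be linked."""
    linked, unlinked = [], []
    for rec in records:
        number = device_to_number.get(rec["device_id"])
        holder = number_to_holder.get(number) if number else None
        if holder is None:
            unlinked.append(rec["device_id"])
        else:
            linked.append((rec["device_id"], number, holder))
    return linked, unlinked


linked, unlinked = link(app_records, device_to_number, number_to_holder)
print(linked)    # two devices, both attributed to the same account holder
print(unlinked)  # the device that never resolves to anyone
```

Even in the toy version, two different devices land on the same name (the account holder, not necessarily the user) and one never resolves at all. That’s with clean lookups. Real ones are spread across separate systems.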
Then you get to distribute this super extensive database of personal information - at this point it’s PRISM, which probably already has most of this data - they’d just have to ingest period data too
But we don’t give that kind of access to local police, because then every government would end up with it. And that’s a big and genuine security threat… But also a very unwieldy thing to work with. More data means more man hours to work with
The other direction is far more practical - if you start by looking at the data, you can tie it back to a person if they match a pattern. Then you can look at just the records you do have, and pay Amazon or the credit agencies for more. A human can easily investigate another human, because we are great with unstructured data, and computers aren’t
A chaotic data source means more bad leads to manually chase down. Man hours are limited, and people have morale - if a cop wastes an hour on a lead that ends with a spare phone or a single man, they’re going to complain and drag their feet. If productivity and morale are in the garbage, that’s going to lead to pushback. If it happens enough, the message at the top will be “this program doesn’t work”
It would be far better to find the patterns and target them methodically, but even chaotic garbage is effective - data analysis isn’t easy to automate, and it’s very expensive to do when accuracy matters and they’re poisoning the data source
Ok, let’s use your first example. Someone crosses into a neighboring state and returns the same day… I had co-workers who did that every day.
Let’s narrow that down… You cross into another state with abortion care once and return in the same day. Or maybe you’re a salesman closing a deal. Or maybe you’re visiting family and have work tomorrow… And honestly, both those situations are far more frequent. That happens every day. It happens more if you live near the border - otherwise you probably got a hotel. Unless you can’t afford a hotel. And the list goes on - all this structured data turns into stories at some point
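To put rough numbers on that, here’s a back-of-the-envelope sketch. Every figure is an assumption I picked for illustration (100,000 same-day trips, 1 in 1,000 being what you’re actually looking for, a filter that catches 90% of those and wrongly flags 5% of everyone else). None of it is real data.

```python
def flag_precision(crossings, target_rate, hit_rate, false_alarm_rate):
    """Fraction of flagged trips that are actually the thing being looked for."""
    targets = crossings * target_rate          # trips that really match
    others = crossings - targets               # commuters, salesmen, family visits
    true_flags = targets * hit_rate            # real matches the filter catches
    false_flags = others * false_alarm_rate    # innocent trips flagged anyway
    return true_flags / (true_flags + false_flags)


# All four inputs are made-up illustrative values.
precision = flag_precision(
    crossings=100_000,
    target_rate=0.001,
    hit_rate=0.90,
    false_alarm_rate=0.05,
)
print(f"{precision:.1%} of flagged trips are real leads")
```

With those assumed numbers you get 90 real hits buried in 4,995 false ones, so under 2% of the flags are worth anything. The rest are the stories above, and a person has to chase each one down.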
Here’s the thing. PRISM could handle it, because it has a ton of people on the payroll
The government is not a monolith though… 9/11 is a great example. We knew it would happen, we knew it was planned, but the right people didn’t know at the right time, because the agencies are not a monolith.
Because that is the hard part - communication is hard, and harder with security concerns. More data means more analysts reviewing it - you can collect all the data you could want (and we do), you could hire all the analysts you can afford (and we do), but you still hit severe limits
We’re actually pretty great at stopping terrorism, but we do that (in part) because we have all this data and use it for specific ends
None of this shit is easy - I used to do this, specifically. How do you take 15 data sources that sometimes conflict, and deconflict them? There’s no hierarchy of truth here. This is literally a cutting-edge problem - it’s a literal holy grail. No one can solve it in 3 weeks, or even 3 years
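For a feel of why, here’s about the simplest deconfliction rule there is: a majority vote across sources that gives up on ties. This is a toy sketch, not how any real system does it. The `deconflict` function and the sample values are made up.

```python
from collections import Counter


def deconflict(values):
    """Majority vote across sources.

    Returns (winner, share of votes). The winner is None when there is
    no data or when the top candidates tie, since no source outranks another.
    """
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None, 0.0
    ranked = counts.most_common()
    top_value, top_count = ranked[0]
    share = top_count / sum(counts.values())
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, share  # tie: a human has to look at it
    return top_value, share


# Hypothetical reports of the same person's home state from different sources.
print(deconflict(["TX", "TX", "OK"]))  # a majority exists
print(deconflict(["TX", "OK"]))        # split: unresolved
print(deconflict([None, None]))        # nobody has the field at all
```

A vote like this is the easy chunk. It says nothing about stale records, sources that copy each other’s mistakes, or two sources agreeing on the wrong person, and every unresolved tie is a manual lead.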
You want a 20% rate? I could give it to you tomorrow, poisoned data or no, I could give it to you in weeks… Maybe not 3, because that’s a shit ton of data sources, but with proper motivation I could pump it out.
You want 90%? Give me a century or two, and I’m good at this. Maybe a genius could give it to you in a lifetime of work
It’s like they say in game dev, you can do 90% in 10% of the time, but the last 10% takes 90% of the time. And that’s a solved problem.
Except this is an unsolved problem, possibly one of the most lucrative unsolved problems in history