Collating Hacked Data Sets

Two Harvard undergraduates completed a project where they went out on the dark web and found a bunch of stolen datasets. Then they correlated all the information, and combined it with additional, publicly available, information. No surprise: the result was much more detailed and personal.

“What we were able to do is alarming because we can now find vulnerabilities in people’s online presence very quickly,” Metropolitansky said. “For instance, if I can aggregate all the leaked credentials associated with you in one place, then I can see the passwords and usernames that you use over and over again.”

Of the 96,000 passwords contained in the dataset the students used, only 26,000 were unique.

“We also showed that a cyber criminal doesn’t have to have a specific victim in mind. They can now search for victims who meet a certain set of criteria,” Metropolitansky said.

For example, in less than 10 seconds she produced a dataset with more than 1,000 people who have high net worth, are married, have children, and also have a username or password on a cheating website. Another query pulled up a list of senior-level politicians, revealing the credit scores, phone numbers, and addresses of three U.S. senators, three U.S. representatives, the mayor of Washington, D.C., and a Cabinet member.

“Hopefully, this serves as a wake-up call that leaks are much more dangerous than we think they are,” Metropolitansky said. “We’re two college students. If someone really wanted to do some damage, I’m sure they could use these same techniques to do something horrible.”

That’s about right.

And you can be sure that the world’s major intelligence organizations have already done all of this.
——–
Free High Security Email from Sigma