Pentagon data leak: Massive trove of global social media data left accidentally exposed online
The data exposed included social media posts made by people across the globe, including the US.
The Pentagon accidentally exposed US Department of Defense (DoD) databases containing information that the US gathered on social media users across the world.
A security expert found three "publicly downloadable" Amazon S3 buckets, one of which contained at least 1.8 billion social media posts made by people across the globe, including Americans. The posts appear to have been collected by the DoD over nearly eight years.
The three publicly exposed S3 buckets were discovered by UpGuard security researcher Chris Vickery and were named "centcom-backup," "centcom-archive," and "pacom-archive."
According to Vickery, the databases were compiled by scraping the Internet for publicly available social media posts, hinting at the kind of content the US government is interested in surveilling. Although the exposed databases did not contain any sensitive data, they did include social media posts and information about the users who posted them.
The social media data found in the S3 buckets was written in multiple languages, primarily English, Arabic and Farsi.
According to Vickery, the collection of the data began around 2009 and lasted until August 2017.
Leaked data exposes how US collects information on "law-abiding" people across the globe
In a report, UpGuard stated that the leaked data was part of the Pentagon's intelligence-gathering operation.
"The data exposed in one of the three buckets is estimated to contain at least 1.8 billion posts of scraped internet content over the past 8 years, including content captured from news sites, comment sections, web forums, and social media sites like Facebook, featuring multiple languages and originating from countries around the world.
"Among those are many apparently benign public internet and social media posts by Americans, collected in an apparent Pentagon intelligence-gathering operation, raising serious questions of privacy and civil liberties," the report read.
"It remains unclear why and for what reasons the data was accumulated, presenting the overwhelming likelihood that the majority of posts captured originate from law-abiding civilians across the world."
The exposed data could also have been accessed by hackers, who in turn could have exploited it "perhaps against internet users in foreign countries wracked by civil violence". It remains unclear whether hackers actually accessed the exposed databases.
CNN reported that Vickery alerted the DoD about the unsecured S3 buckets in mid-September and that the databases were secured by 1 October.
"We determined that the data was accessed via unauthorised means by employing methods to circumvent security protocols," Major Josh Jacques, a spokesperson for US Central Command, told CNN. "Once alerted to the unauthorised access, Centcom implemented additional security measures to prevent unauthorised access."
Jacques told CNN that the collected data is "used for measurement and engagement activities of our online programmes on public sites," adding that the data "is not collected nor processed for any intelligence purposes."
Although the Pentagon may face some criticism about collecting social media posts, Bleeping Computer reported that scraping the Internet for data is not illegal. However, the breach raises concerns about the Pentagon's ability to ensure that its data remains secure.
This is not the first time that the Pentagon has suffered a massive breach because of unsecured S3 buckets. In June, nearly 28GB of sensitive and classified US intelligence data relating to the highly secretive National Geospatial-Intelligence Agency (NGA) was exposed online by US contractor Booz Allen Hamilton.