Before jumping into UpGuard’s findings, let’s review AggregateIQ. Whistleblower Christopher Wylie, a former Cambridge Analytica employee who has been central to information about the company coming to light, told The Observer last month that he helped get AggregateIQ up and running in order to help SCL expand its operations. “Essentially it was set up as a Canadian entity for people who wanted to work on SCL projects who didn’t want to move to London,” he said. “That’s how [AggregateIQ] got started: originally to service SCL and Cambridge Analytica projects.” Earlier this month, Facebook suspended AggregateIQ for its connections with Cambridge Analytica and the possibility that it might, therefore, have some of the data Cambridge Analytica improperly obtained.
Though AggregateIQ and SCL have tried to distance themselves from each other lately, they worked quite closely together for some time. “AggregateIQ were the ones that took a lot of data that Cambridge Analytica would acquire and the algorithms they build, and translated that into the actual physical targeting online, they [AggregateIQ] were the bit that actually disseminated stuff,” Wylie told The Observer. And AggregateIQ co-founder Jeff Silvester told Gizmodo recently, “We did some work with SCL and had a contract with them in 2014 for some custom software development. We last worked with SCL in 2016 and have not worked with them since.” AggregateIQ’s website now says it “has never been and is not a part of Cambridge Analytica or SCL” and that it “has never entered into a contract with Cambridge Analytica.”
Of course, now we know that Cambridge Analytica did improperly obtain information on 87 million Facebook users through researcher Aleksandr Kogan and that Donald Trump and Ted Cruz both used its services during their campaigns. We also now know that AggregateIQ developed some of the tools marketed by Cambridge Analytica and that it worked with a number of British political groups who campaigned in support of the UK leaving the European Union during the Brexit referendum.
But now we also have a look into the powerful tools AggregateIQ developed and how they work, and we have that information because the company left them exposed online. Last month, cybersecurity firm UpGuard released a report detailing how its researchers were able to access an AggregateIQ code repository with just an email address. The registration process didn’t even require verification of that email address.
After registering, UpGuard researchers were able to access AggregateIQ’s GitLab subdomain. “Within these repositories appear to be nothing less than mechanisms capable of organizing vast quantities of data about individuals, measuring how they are being influenced or reached by advertising and even tracking their internet browsing behavior,” said UpGuard. The repository included data management programs, advertising trackers and information databases as well as credentials, keys, hashes, usernames and passwords, which could be used to access other AIQ assets, such as databases, social media accounts and Amazon Web Services repositories.
UpGuard’s findings also indicated that AggregateIQ and SCL worked together on the Ripon platform developed for the Cruz campaign and that AggregateIQ worked with at least seven British political groups. Some of those, like Vote Leave, the Democratic Unionist Party and Veterans for Britain, were already known to have worked with AggregateIQ, but others weren’t publicly linked to the data firm prior to UpGuard’s findings. Of note, a majority of the groups with repositories in AggregateIQ’s GitLab subdomain actively campaigned for the UK to leave the European Union ahead of the Brexit referendum.
Today, UpGuard publishes the third piece of its AggregateIQ series and it’s focused on the tools the firm developed and left exposed online. As UpGuard reports, two project families dubbed Saga and Monarch “are designed to gather and use data across a number of platforms through a variety of means.” And for the first time, we’re getting a hard look at how they work and their potential applications.
The first, Saga, appears to be able to automate the creation, analysis and targeting of advertisements in a way that would make it quite easy for a small number of people to manage a large number of Facebook ad accounts. “Saga was used specifically to interface with the Facebook ad system through APIs and scraping methods and gauge response to images and messages and posts,” says Chris Vickery, UpGuard’s director of cyber risk research.
And the information Saga scripts were designed to collect was quite specific. One script suggests AggregateIQ could target political ads to individuals based on who they were friends with. Another suggests the firm’s tools could target geographically, down to the neighborhood or even the household. And of course, engagement with messages and posts could be monitored, including who liked a post, how quickly they liked it, how many people liked it, which regions the likes came from and so on.
“The capability’s all there to do highly advanced targeting not only down to latitude and longitude in a radius,” Jon Hendren, UpGuard’s director of strategy, told us. “But you could also combine that with age demographics and gender, for example, and really focus in on a specific type of individual.” And since it’s all automated, it could be done incredibly easily and at a large scale.
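To make the kind of filter Hendren describes more concrete, here is a minimal sketch in Python. It is purely illustrative and is not drawn from AggregateIQ’s repository: the `Person` record, its fields and the `select_audience` function are hypothetical names, and the distance check uses the standard haversine formula as an assumed stand-in for whatever radius logic an ad platform applies.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Person:
    # Hypothetical record; these field names are assumptions for illustration.
    name: str
    lat: float
    lon: float
    age: int
    gender: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def select_audience(people, center, radius_km, age_range, gender=None):
    """Keep people inside the radius who also match the age range and,
    if one is given, the gender."""
    lat0, lon0 = center
    min_age, max_age = age_range
    return [
        p for p in people
        if haversine_km(lat0, lon0, p.lat, p.lon) <= radius_km
        and min_age <= p.age <= max_age
        and (gender is None or p.gender == gender)
    ]
```

Calling `select_audience(people, (lat, lon), 5.0, (18, 30), gender="f")` would narrow a list down to women aged 18 to 30 within five kilometers of a point, which is the sort of stacked filter the quote refers to.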
Monarch takes over where Saga leaves off. “If Saga is a tool capable of tracking what happens when someone clicks a Facebook ad, Monarch seems designed to track what happens afterward, giving the controlling entity a more complete picture of their targets’ behavior,” says UpGuard. The sub-projects within Monarch include tools like pixel tracking and can monitor online behaviors like submitting forms, watching videos, hitting the bottom of a webpage and submitting a donation. “It’s pretty advanced for what it is,” says Hendren.
Hendren points out that nearly every company does ad-targeting, but he says usually they’re trying to get leads. “This is very plainly set up not to necessarily gather leads. They don’t want your email address so they can contact you. They already know who you are. What they’re tracking is your behavior and how you’re responding to the ad,” he says.
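As a rough illustration of behavior tracking of this kind, the following Python sketch groups post-click events by visitor. It is an assumption-laden example, not AggregateIQ’s code: the `Event` record, the event names in `TRACKED_EVENTS` and the `summarize` function are all hypothetical, chosen only to mirror the behaviors UpGuard lists (form submissions, video views, reaching the bottom of a page and donations).

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event names mirroring the behaviors described above.
TRACKED_EVENTS = {"form_submit", "video_watch", "scroll_bottom", "donation"}


@dataclass(frozen=True)
class Event:
    # Hypothetical record of one action taken after an ad click.
    visitor_id: str
    ad_id: str
    kind: str


def summarize(events):
    """Count tracked events per visitor, ignoring untracked event kinds."""
    summary = defaultdict(lambda: defaultdict(int))
    for event in events:
        if event.kind in TRACKED_EVENTS:
            summary[event.visitor_id][event.kind] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {visitor: dict(counts) for visitor, counts in summary.items()}
```

The point of the sketch is that nothing here asks for contact details. The output is a per-person record of responses to an ad, which is the distinction Hendren draws between this approach and ordinary lead generation.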
AggregateIQ didn’t respond to a request for comment.
The UpGuard team is no stranger to finding large amounts of shockingly unprotected data. They discovered exposures of 14 million Verizon customer records, personal information on nearly 200 million US citizens as well as classified US Army and NSA data. But Vickery says this finding takes the cake. “I’ve come across systems that are marked for handling top secret information and public key infrastructure data for people at the Pentagon, and all of that pales in comparison to coming across the tools that very well could have been used to manipulate the American public and swing an election,” he says.
It’s important to note that there’s no way to determine whether AggregateIQ or its customers ever put these tools to use. But that’s not the big picture here. What’s important about this finding is that these sophisticated tools, designed to gather and use data in order to target particular individuals, were left out in the open. And though UpGuard didn’t use the credentials contained in the repository to access AggregateIQ’s databases, it could have in theory, which means someone else could have too. There’s no way to know what might be contained within those databases, but if they held sensitive information, such as psychographic profiles on US, Canadian or UK citizens, it was left open to anyone willing to grab it.
And the appeal of these tools isn’t limited to political uses. “These tools are certainly designed for political purposes but I see no reason they couldn’t be applied towards criminal ends,” says UpGuard Cyber Resilience Analyst Dan O’Sullivan. “Social engineering and phishing attacks, they can be quite effective in that regard.”
UpGuard’s analysis of its AggregateIQ findings is ongoing and it will be releasing additional reports on the topic in the future. You can read the first two installments of its report here and here. You can find today’s report here.