
[AP – 21/12/2022]
In recent years, a wide range of companies has begun to monitor people in almost every aspect of their lives. Behaviors, movements, social relationships, interests, disabilities, and most private moments of billions of people are continuously recorded, evaluated, and analyzed in real time. The exploitation of personal data has become a multi-billion dollar industry. However, only the tip of the iceberg of today’s pervasive digital surveillance is visible; the larger part of it lies behind the scenes and remains opaque to most of us.
This report [1] from Cracked Labs examines the actual practices and inner workings of this personal data industry. Based on years of research and a previous 2016 report [2], this report sheds light on the hidden data flows between companies. It maps the structure and scope of today’s digital surveillance and profiling ecosystems and investigates related technologies, platforms and devices, as well as key recent developments.
In 2007, Apple introduced the smartphone, Facebook reached 30 million users, and online advertising companies began sending targeted advertisements based on data regarding individual preferences and interests. Ten years later, a massive landscape of data companies has emerged, consisting not only of major players like Facebook and Google, but also of thousands of other businesses from various sectors that continuously share and exchange digital profiles among themselves. Companies have started to combine and connect internet and smartphone data with “offline” data and information they have been collecting for decades.
The diffuse mechanics of real-time tracking developed for online advertising are rapidly expanding into other sectors, from product pricing to political communication, credit rating, and risk assessment. Large online platforms, digital advertising companies, data brokers, and businesses in many sectors can now recognize, classify, categorize, evaluate, score, and rank consumers across different platforms and devices. Every click on a website and every touch on a smartphone can activate a wide variety of hidden mechanisms of shared data that various companies use, thus directly influencing an individual’s available choices. Digital tracking and profiling, combined with personalization, are not only used for monitoring but also to influence people’s behavior.

Scientific studies show that many aspects of personality can be inferred from internet search data, browsing history, video viewing habits, social media activities, or purchasing behaviors. For example, sensitive personal characteristics such as nationality, religious and political views, relationship status, sexual orientation, and alcohol, cigarette, and drug use can be accurately inferred from someone’s likes on Facebook. Analysis of social network profiles can also predict personality traits such as emotional stability, life satisfaction, impulsivity, depression, etc.
Similarly, personality traits can be inferred from information about websites someone has visited, as well as from call logs and data from smartphone app usage. Browsing history can reveal information about one’s profession and educational level. Canadian researchers have even succeeded in inferring emotional states such as self-confidence, neuroticism, sadness, and fatigue by analyzing different typing patterns.
The results of today’s data mining and analysis methods are based on statistical correlations with limited levels of accuracy. However, such methods are already being used to classify, categorize, and evaluate individuals not only for marketing purposes, but also for decision-making in extremely important areas such as finance, insurance, and healthcare.
Companies such as Lenddo, Kreditech, Cignifi and ZestFinance already use data from social media, internet searches and mobile phones to calculate someone’s creditworthiness without using actual data related to financial transactions. Other companies also draw information about how someone fills out an electronic form or navigates a website, the grammar and punctuation of someone’s messages, and the battery status on their phone. Some companies even include data about someone’s friends in a social network to calculate creditworthiness.
Cignifi, which calculates credit scores based on the timing and frequency of phone calls, considers itself the “ultimate revenue-generating platform from data, for mobile network operators.” Large companies, including MasterCard, the mobile network provider Telefonica, credit reporting firms Experian and Equifax, as well as the Chinese search giant Baidu, have begun collaborating with such emerging companies.
Conversely, credit data also flows into online marketing. On Twitter, for example, marketers can target ads based on the predicted creditworthiness of its users, using data from the data broker Oracle. Taking it a step further, Facebook has patented a method for assessing creditworthiness based on the credit ratings of someone’s friends within a social network. No one knows whether it intends to make this full integration of social networking, marketing, and risk assessment a reality.
Data companies and insurers are collaborating on programs that use information about consumers’ daily lives to predict risks related to their health. For example, the large insurance company Aviva, in collaboration with the consulting firm Deloitte, has conducted health risk assessments related to diabetes, cancer, high blood pressure, and depression for 60,000 insurance applicants, based on data purchased from a data broker intended for marketing purposes.
The consulting firm McKinsey helped predict the hospital costs of patients based on consumer data from a “large investor” in healthcare in the US. Using information regarding demographics, family structure, purchases, car ownership and other data, McKinsey stated that “such information can help identify key patient subgroups before high-cost episodes occur”.
The health data analytics company GNS Healthcare also calculates personalized health risks for patients from a wide range of data, such as the genome, medical records, laboratory data, wearable health devices, and consumer behavior. The company collaborates with insurance companies such as Aetna and provides scores that identify “individuals who are likely to need interventions” and offers the prediction of disease progression and intervention outcomes. According to a sector report, the company “ranks patients based on the return on investment” that an insurer can expect if they target them with specific interventions.
LexisNexis Risk Solutions, a large data and risk analysis company, provides a health scoring product that calculates health risks as well as expected healthcare costs, based on massive amounts of data on individuals’ consumer habits.

Data brokers and the trade of personal data
The dominant online platforms today – primarily Google and Facebook – possess extensive information about the daily lives of billions of people around the world. They are the most visible, the most widespread and, apart from information brokers, online advertisers and digital fraud detection services, perhaps the most advanced players in the personal data and analytics sector. Many others, however, operate behind the scenes and beyond the public’s attention.
At its core, online advertising consists of an ecosystem of thousands of companies focused on continuously monitoring and creating profiles of billions of people. Every time an advertisement appears on a website or mobile app, the user’s digital profile is “sold” to the highest bidder in milliseconds. In contrast to these new practices, credit reporting companies and consumer data brokers have already spent decades in the personal data sector. In recent years, they have begun combining the extensive information they have about people’s “offline” lives with user and customer databases operated by large platforms, online advertising companies, and myriad other businesses across many industries.
Facebook uses at least 52,000 personal characteristics to classify and categorize its 1.9 billion users based on, for example, their political views, nationality, and income. To achieve this, the platform analyzes their posts, likes, shares, friends, photos, movements, and many other types of behaviors.
Moreover, Facebook obtains user data from other companies. In 2013, the platform began collaborating with the four data brokers Acxiom, Epsilon, Datalogix, and BlueKai, two of which were subsequently acquired by the IT giant Oracle. These companies help Facebook create better user profiles than it already does, by providing it with data collected outside its platform.
Data brokers play a key role in today’s personal data industry. They collect, combine, and exchange vast amounts of information gathered from various online and “offline” sources across entire populations. Data brokers gather information that is publicly available and purchase or license consumer data from other companies. Generally, their data comes from sources not directly linked to individuals themselves and is largely collected without consumers’ knowledge. They analyze data, draw conclusions, categorize people into groups, and provide thousands of attributes for individuals to their clients.
The profiles that data brokers have on individuals include not only information regarding education, profession, children, religion, nationality, political opinions, activities, interests and media consumption, but also one’s online behaviours, such as web searches. Additionally, they collect data relating to purchases, credit card usage, income and loans, banking and insurance contracts, real estate and vehicle ownership and a variety of other types of data. Data brokers also calculate scores that predict an individual’s possible future behaviour, for example in relation to someone’s financial stability or plans to have a baby or change jobs.
Acxiom, founded in 1969, manages one of the world’s largest commercial databases. The company provides up to 3,000 data points for 700 million people from thousands of sources in many countries, including the US, the United Kingdom and Germany. Originally a direct marketing company, Acxiom developed its central consumer database in the late 1990s.

With the Abilitec Link system, the company runs a kind of private population registry in which each individual, household and building receives a unique identity. The company continuously updates its database with information about births and deaths, marriages and divorces, name and address changes and any other kind of profile data. If asked about a person, Acxiom provides, for example, one of 13 religious beliefs, including “Catholic”, “Jewish” and “Muslim” and one of nearly 200 ethnic codes.
Acxiom sells access to extensive consumer profiles and helps its clients find, target, recognize, analyze, classify, and score individuals. The company also manages 15,000 customer databases with billions of consumer profiles for its clients, including major banks, insurance companies, healthcare organizations, and government agencies. In addition to data marketing services, Acxiom also provides identity verification, risk management, and fraud detection services.
With the acquisition of the web data company LiveRamp in 2014, Acxiom has made significant efforts to connect the “offline” data it has been collecting for decades with the digital world. Acxiom, for example, was among the first data brokers to provide additional information to Facebook, Google, and Twitter, in order to help the platforms better track or categorize users based on purchases and other behaviors that the platforms could not yet monitor.
Acxiom’s LiveRamp connects and combines digital profiles across hundreds of data and advertising companies. At its core is the IdentityLink system, which helps identify individuals and connect information about them to databases, platforms, and devices based on email addresses, phone numbers, smartphone identifiers, and other identifiers. While the company promises that the linking and matching are done in “anonymized” and “de-identified” ways, it also states that it is able to “connect offline and online data into a single identifier.”
Several companies have recently evolved into data providers through LiveRamp, including credit reporting giants Equifax, Experian, and TransUnion. Additionally, many digital tracking services that collect data from the internet, mobile apps, and even sensors placed throughout the physical world provide data to LiveRamp. Some companies use LiveRamp’s “data marketplace,” which allows them to “buy and sell valuable customer data,” while others provide the data they already have to Acxiom and LiveRamp, in order to identify individuals and connect their recorded information with their digital profiles from other sources. Perhaps most concerning is the collaboration between Acxiom and Crossix, a company with extensive health data for 250 million consumers in the US, which is included as one of LiveRamp’s data providers.
By acquiring many data companies such as Datalogix, BlueKai, AddThis and CrossWise, Oracle, one of the world’s largest suppliers of enterprise software and databases, has recently also become one of the largest intermediaries of consumer data. In its databases, Oracle collects 3 billion user profiles from 15 million different websites, data from 1 billion mobile phone users, billions of purchases from supermarket chains and 1,500 large retailers, as well as 700 million messages from social networks, blogs and consumer reviews per day.

Oracle lists nearly 100 data providers in its data catalog, such as Acxiom and credit reporting companies Experian and TransUnion, as well as companies that track website visits, mobile app usage, movements, or collect data from online quizzes. Visa and MasterCard are also referred to as data providers. Together with its partners, Oracle provides more than 30,000 different data categories that can be assigned to consumers. Conversely, the company shares data with Facebook and helps Twitter calculate the creditworthiness of its users.
Oracle’s ID Graph service identifies and combines user profiles from all companies. It unites all interactions between databases, services, and devices to “create a personalized consumer profile” and “locate customers and prospects everywhere.” Other companies can send matching keys based on email addresses, phone numbers, postal addresses, and other identifiers to Oracle, which will then synchronize them with the “network of users and statistics interconnected in the Oracle ID Graph.” Although the company promises to use only anonymous user identifiers and anonymous user profiles, these still refer to specific individuals and can be used to identify and “isolate” them across various environments.
In general, customers can upload their own data regarding their clients, visitors to their websites, or users of their applications to Oracle’s databases, combine it with data from many other companies, and subsequently transfer and use it across hundreds of other marketing and advertising technology platforms in real time. They can use it, for example, to find and target individuals on specific devices and platforms, personalize interactions, and ultimately measure how consumers respond once they have been reached and influenced at an individual level.
Monitoring daily behaviors in real time
Online platforms, advertising technology providers, data brokers, and businesses across all industries can now track, identify, and analyze individuals in multiple contexts. They can learn about people’s interests, what they did today, what they are likely to do tomorrow, and how much they might be worth as customers.
A wide range of companies has been collecting information about people for decades. Before the internet, both credit reporting agencies and marketing firms served as the main points of integration between data flowing from different sources. A first major step toward systematic consumer monitoring was made in the 1990s through marketing databases, loyalty programs, and credit reporting systems. After the rise of the internet and online advertising in the early 2000s, and the subsequent rise of social networks and smartphones later in the decade, we now see the traditional consumer data industry integrating into the new ecosystem of digital tracking and profiling.

Data brokers and other companies have long been acquiring information about newspaper and magazine subscribers, book and movie club members, from catalogs and mail orders, travel agencies, seminar and conference participants, and product warranty forms. Collecting purchase data from loyalty programs is also an established practice.
Beyond the data collected directly from individuals, they also gather information regarding the types of neighborhoods and buildings where people live for classification and categorization purposes. Similarly, companies increasingly create consumer profiles based on metadata concerning the kinds of websites they visit, the videos they watch, the applications they use, and the geographic locations they visit. In recent years, the scale and depth of behavioral data flows generated by all kinds of daily activities, such as internet use, social media, and electronic device usage, have increased dramatically.
A major reason why corporate surveillance and profiling have become so widespread is the fact that almost all websites, mobile app providers, and many electronic device suppliers share behavioral data with other companies.
A few years ago, most websites began to incorporate tracking services that transmit user data to third parties. Some of these services provide functionality that is visible to users. When a website includes, for example, a Facebook like button or an embedded YouTube video, user data is automatically transmitted to Facebook or Google. However, many other services related to online advertising remain hidden and serve largely only one purpose, namely the collection of user data. It is largely unknown exactly what types of data are shared and how third parties use this data. At least part of these tracking activities can be examined by anyone: by installing the Lightbeam extension, for example, one can visualize the hidden third-party trackers.
A recent study examined a million different websites and found more than 80,000 third-party services that collect data about visitors to these sites. Approximately 120 of these tracking services were found on more than 10,000 websites, and six companies track users across more than 100,000 websites, including Google, Facebook, Twitter, and Oracle’s BlueKai. A study of 200,000 users from Germany who visited 21 million websites showed that third-party trackers were present on 95% of the pages they visited. Similarly, most mobile apps share information about their users with other companies. A 2015 study of popular apps in Australia, Brazil, Germany, and the US found that between 85% and 95% of free apps, and even 60% of paid apps, connect with third parties that collect personal data.
Regarding devices, smartphones are perhaps the biggest contributors to today’s omnipresent data collection. The information recorded by mobile phones provides detailed insights into a user’s personality and daily life. Since consumers generally need to have a Google, Apple, or Microsoft account to use them, much of the information is already linked to a major platform’s identifier.
The sale of user data is not limited to website and mobile app publishers. The marketing information company SimilarWeb, for example, collects data not only from hundreds of thousands of direct measurement sources from websites and apps, but also from desktop software and browser extensions. In recent years, many other types of devices with sensors and network connections have entered daily life, from e-readers and wearable devices to smart TVs, meters, thermostats, smoke alarms, printers, refrigerators, toothbrushes, games, and cars. Like smartphones, these devices provide companies with unprecedented access to consumer behavior in many areas of daily life.
The majority of today’s digital advertising takes place in the form of highly automated real-time processes between publishers of digital content and advertisers. This is often referred to as “programmatic advertising.” When an individual visits a website, it sends data to various third-party services, which then attempt to identify the individual and retrieve available information from their profile. Advertisers interested in displaying an ad to this specific individual based on particular characteristics and behaviors submit bids. Within milliseconds, the advertiser with the highest bid wins and places the advertisement. Advertisers can similarly submit bids for user profiles and ad placements within mobile applications.
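The bidding step described above can be sketched roughly as follows. This is a minimal illustration that assumes a plain highest-bid-wins rule; the profile fields, the bidder names, and their bidding logic are hypothetical, and real ad exchanges use considerably more elaborate protocols and pricing rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Bid:
    advertiser: str
    amount: float  # bid price in an arbitrary currency unit


# A bidder looks at a user profile and returns a price, or None if not interested.
Bidder = Callable[[dict], Optional[float]]


def run_auction(profile: dict, bidders: Dict[str, Bidder]) -> Optional[Bid]:
    """Collect bids for a single ad impression and return the highest one."""
    bids = []
    for name, bidder in bidders.items():
        amount = bidder(profile)
        if amount is not None:
            bids.append(Bid(name, amount))
    # No bids at all means the impression stays unsold.
    return max(bids, key=lambda bid: bid.amount, default=None)


# Hypothetical advertisers with hypothetical targeting criteria.
def car_dealer(profile: dict) -> Optional[float]:
    return 2.5 if "cars" in profile.get("interests", []) else None


def travel_agency(profile: dict) -> Optional[float]:
    return 1.2 if profile.get("recent_location") == "airport" else 0.3


profile = {"interests": ["cars", "travel"], "recent_location": "airport"}
winner = run_auction(profile, {"car_dealer": car_dealer, "travel_agency": travel_agency})
```

In this sketch the profile is what makes the impression valuable: the same ad slot attracts different bids depending on who is looking at it.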
Most of the time, however, this process does not take place directly between publishers and advertisers. The ecosystem consists of a variety of different types of data and technology providers that interact with each other, including advertising networks, advertising exchanges, sell-side platforms, and demand-side platforms. Some of these specialize in tracking and advertising alongside search results, general online advertisements, mobile ads, video ads, social media ads, or gaming ads. Others focus on providing data, analytics, or personalization services.
To create user profiles, all involved parties have developed advanced methods for collecting and linking information from different companies, in order to track individuals across the full spectrum of their daily lives.
Most retailers sell more or less aggregated forms of purchase data to market research companies and consumer data brokers. The data company IRI, for example, has access to data from more than 85,000 retail stores, including supermarkets, mass merchandisers, pharmacies, hardware stores, beverage outlets, and pet stores. Nielsen claims to collect sales information from 900,000 stores worldwide across more than 100 countries. The large British retailer Tesco has assigned loyalty program activities to a subsidiary company, Dunnhumby, whose slogan is “turning customer data into customer delight.” When Dunnhumby acquired the German advertising technology company Sociomantic, it announced that it would “combine its extensive knowledge of the shopping preferences of 400 million consumers” with Sociomantic’s “real-time data from over 700 million online consumers” to personalize and evaluate advertising.
Large media conglomerates are also deeply embedded in today’s tracking and profiling ecosystems. For example, Time Inc. acquired Adelphic, a major tracking and advertising technology company, as well as Viant, a company that claims to have access to over 1.2 billion registered users. A prominent example of a digital content publisher selling user data is the Spotify platform. Since 2016, it has been sharing information regarding users’ mood, listening behavior and playlists, activity and location with the data division of advertising giant WPP, which now has access to the “unique listening preferences and behaviors of Spotify’s 100 million users.”
Many large telecommunications companies and internet service providers have acquired advertising and data technology companies. For example, Millennial Media, a subsidiary of Verizon’s AOL, is a mobile advertising platform that collects data from more than 65,000 mobile applications and claims to have access to approximately 1 billion active unique users worldwide. The Singapore-based telecommunications company Singtel acquired Turn, an advertising technology platform that provides merchants with access to 4.3 billion device and browser identifiers and 90,000 demographic, behavioral, and psychographic characteristics.
Just like airlines, hotels, retailers, and companies in many other sectors, the financial services industry began aggregating and using additional customer data through credit scoring programs in the 1980s and 1990s. Companies with related, complementary target groups have long shared certain customer data among themselves, a process often managed by intermediaries. Today, one such intermediary is Cardlytics, a company that runs reward programs with 1,500 financial institutions such as Bank of America and MasterCard. Cardlytics promises financial institutions that it will “create new revenue streams using the power of their purchase data.” The company also collaborates with LiveRamp, Acxiom’s subsidiary that combines online and offline consumer data.
For MasterCard, the sale of products and services created from data analysis could even become its core activity, given that information products, including data sales, already represent a significant and growing share of its revenues. Google recently stated that it records about 70% of credit and debit card transactions in the United States through “partnerships with third parties” in order to monitor markets, but did not reveal its sources.
Connection, mapping and combination of digital profiles
Until recently, advertisers using Facebook, Google, or other online advertising networks could only target individuals based on their online behavior. However, a few years ago, data companies began providing ways to combine and connect digital profiles across various platforms, customer databases, and the world of online advertising.
In 2012, Facebook began allowing companies to upload their own lists of email addresses and phone numbers to the platform. Although these addresses and numbers are converted into “pseudonymous codes,” Facebook can directly link this customer data from other companies to its users’ accounts. In this way, companies can, for example, find and precisely target those individuals on Facebook whose email addresses or phone numbers they possess. They could also selectively exclude them from targeting or let the platform find individuals with similar characteristics, interests, and behaviors.
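The list matching described above can be illustrated with a short sketch. It assumes that the "pseudonymous codes" are SHA-256 hashes of normalized email addresses, which is an assumption made for illustration and not a detail given in the text; the account data and variable names are hypothetical.

```python
import hashlib


def pseudonymize(email: str) -> str:
    """Turn an email address into a code (assumed here: SHA-256 of the normalized address)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Advertiser side: only codes are uploaded, not the addresses themselves.
customer_emails = ["Alice@example.com ", "bob@example.com"]
uploaded_codes = {pseudonymize(email) for email in customer_emails}

# Platform side: the platform knows its users' addresses and applies the same function.
platform_accounts = {"user_1": "alice@example.com", "user_2": "carol@example.com"}
platform_index = {pseudonymize(email): user_id for user_id, email in platform_accounts.items()}

# Accounts that can now be targeted, and accounts left over for exclusion.
matched_users = {platform_index[code] for code in uploaded_codes if code in platform_index}
unmatched_users = set(platform_accounts) - matched_users
```

Because both sides apply the same function to the same address, the codes line up exactly, even though no plain email address changes hands.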
This is a powerful capability, perhaps more powerful than it appears at first glance. It allows companies to systematically link their own customer data with Facebook’s data. It also enables other advertising and data providers to synchronize with the platform’s databases and leverage its capabilities, essentially providing a kind of real-time management of Facebook’s entire data universe. Companies can now record very specific behavioral data, such as a click on a website, activity in an app, or a purchase in a store, in real time, and tell Facebook to immediately find and target the individuals who performed these activities. Google and Twitter introduced similar features in 2015.
Today, most advertising technology companies continuously share various forms of codes that refer to individuals. Data management platforms allow businesses across all industries to combine and connect their own consumer data, including real-time information on purchases, website visits, app usage and email responses, with digital profiles provided by myriad third-party data providers. The combined data can then be analyzed, sorted and categorized and used to target specific individuals with specific messages on specific channels or devices. A company could, for example, target a group of existing customers who visited a specific page on its website and are predicted to become valuable customers, with personalized content or a discount – either on Facebook, in a mobile app or on its own corporate website.
The emergence of data management platforms marks a decisive moment in the development of pervasive commercial surveillance of behavior. With their help, businesses across all industries worldwide are able to seamlessly combine and connect the data they have been collecting for years about their customers and prospects with billions of profiles collected in the world of digital tracking. Companies offering such platforms include Oracle, Adobe, Salesforce (Krux), Wunderman (KBM Group/Zipline), Neustar, Lotame and Cxense.
To track people across the various situations of their lives, to combine their profiles, and to always recognize them again as the same individuals, companies collect a wide range of characteristics that define them in some way.
Due to its ambiguity, a person’s legal name has always been a poor identifier for data collection. A postal address, on the other hand, has long been, and continues to be, a key characteristic that allows the combination and linking of data relating to consumers and their families from different sources. In the digital world, the most relevant identifiers used to connect profiles and behavioral data across different databases, platforms and devices are email addresses, phone numbers and unique codes relating to smartphones or other devices.
Account identifiers on major platforms such as Google, Facebook, Apple, and Microsoft also play a significant role in tracking individuals online. Google, Apple, Microsoft, and Roku assign “advertising identifiers” to individuals, which are now widely used to match and link data from devices such as smartphones with other information from across the digital world. Verizon uses its own identifier to track users across websites and devices. Some major data companies such as Acxiom, Experian, and Oracle have introduced globally unique identifiers for individuals, which they use to connect the databases they have been updating for decades with other information from various sources in the digital world. These corporate IDs consist primarily of two or more identifiers referring to different aspects of someone’s online and “offline” life and can be combined in various ways.
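How several identifiers can be tied together into one corporate ID can be sketched with a simple grouping structure. The example below uses a union-find structure as a stand-in; it is not the method of any company named above, and the class name, identifier formats, and values are hypothetical.

```python
class IdentityGraph:
    """Groups identifiers that were observed together into one profile (union-find)."""

    def __init__(self):
        self.parent = {}

    def find(self, identifier: str) -> str:
        """Return the representative identifier of the group this identifier belongs to."""
        self.parent.setdefault(identifier, identifier)
        while self.parent[identifier] != identifier:
            # Shorten the chain while walking up to the group representative.
            self.parent[identifier] = self.parent[self.parent[identifier]]
            identifier = self.parent[identifier]
        return identifier

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers were seen together, e.g. in one login."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_person(self, a: str, b: str) -> bool:
        return self.find(a) == self.find(b)


graph = IdentityGraph()
graph.link("email:alice@example.com", "phone:+15550100")
graph.link("phone:+15550100", "ad_id:device-1")
graph.link("email:bob@example.com", "cookie:abc123")
```

Note that the email address and the advertising identifier were never observed together here; they end up in the same group only because both were linked to the same phone number.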

Tracking companies also use certain temporary identifiers, such as cookie identifiers associated with users browsing the web. Because users may block or delete cookies in their browser, these companies have developed advanced methods for calculating distinct digital “fingerprints” based on various technical characteristics of a user’s browser and computer. Similarly, companies collect “fingerprints” for devices such as smartphones. Cookie identifiers and digital “fingerprints” are constantly synchronized between different tracking services and then linked to other, more permanent identifiers.
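The idea behind such a "fingerprint" can be shown in a few lines. This is a simplified sketch: the chosen attributes, their values, and the use of a truncated SHA-256 hash are assumptions for illustration, and real fingerprinting relies on many more signals.

```python
import hashlib
import json


def browser_fingerprint(attributes: dict) -> str:
    """Derive a stable code from technical characteristics, without any cookie."""
    # Sorting the keys makes the result independent of the order in which attributes were read.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


first_visit = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "Europe/Vienna",
    "fonts": ["Arial", "DejaVu Sans"],
}
# The same browser after its cookies were deleted: the characteristics are unchanged.
later_visit = {
    "timezone": "Europe/Vienna",
    "fonts": ["Arial", "DejaVu Sans"],
    "screen": "1920x1080",
    "user_agent": "ExampleBrowser/1.0",
}
# A different machine with another screen resolution.
other_browser = dict(first_visit, screen="1366x768")

fp_first = browser_fingerprint(first_visit)
fp_later = browser_fingerprint(later_visit)
fp_other = browser_fingerprint(other_browser)
```

Deleting cookies does not change any of the inputs, so the first and later visits produce the same code, while the other browser produces a different one.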
Other companies provide monitoring services for various devices, using machine learning to analyze large amounts of data. For example, Tapad, which was acquired by Norwegian telecommunications giant Telenor, analyzes data for 2 billion devices worldwide and uses behavioral and relational patterns to find the statistical probability that certain computers, tablets, phones and other devices belong to the same person.
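A heavily simplified version of such probabilistic matching is sketched below. It replaces machine learning with a plain overlap score (Jaccard similarity) over the networks each device was seen on; the device names, network labels, and the threshold are hypothetical, and this is not a description of how Tapad or any other company actually works.

```python
def jaccard(a: set, b: set) -> float:
    """Share of observations two devices have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical observations: networks on which each device has been seen.
observations = {
    "laptop_A": {"home_wifi", "office_wifi", "cafe_wifi"},
    "phone_B": {"home_wifi", "office_wifi", "gym_wifi"},
    "tablet_C": {"library_wifi"},
}


def likely_same_owner(device_1: str, device_2: str, threshold: float = 0.4) -> bool:
    """Guess whether two devices belong to the same person, based on overlap alone."""
    return jaccard(observations[device_1], observations[device_2]) >= threshold


laptop_phone_score = jaccard(observations["laptop_A"], observations["phone_B"])
```

The laptop and the phone share two of four distinct networks, so they are guessed to belong to one person; the result is a statistical estimate, not a certainty.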
Data companies often remove names from their extensive profiles and use encoding methods to convert email addresses and phone numbers into alphanumeric codes such as “e907c95ef289”. This allows them to claim on their websites and in their privacy policies that they collect, share and use only “anonymous” or “de-identified” consumer data.
However, because most companies use the same deterministic processes to calculate these unique identifiers, they should be considered pseudonyms which are, in reality, much more suitable for identifying users across the entire digital world than real names. Even if the profiles shared between companies contain only “encoded” or “encrypted” email addresses and phone numbers, an individual can be re-identified as soon as they use another service connected to the same email address or phone number. In this way, even though each of the involved tracking services may know only part of someone’s profile information, companies can track and interact with individuals on a personalized level across various services, platforms, and devices.
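The re-identification argument above can be made concrete with a small sketch. It assumes that two unrelated services derive their codes by the same deterministic rule, here a SHA-256 hash of the normalized email address truncated to 12 characters; the rule, the services, and the profile contents are all hypothetical.

```python
import hashlib


def email_code(email: str) -> str:
    """Deterministic code for an email address (assumed rule: truncated SHA-256)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()[:12]


# Two services each hold a partial, supposedly "de-identified" profile.
retailer_profiles = {email_code("alice@example.com"): {"last_purchase": "running shoes"}}
app_profiles = {email_code("Alice@Example.com "): {"home_city": "Vienna"}}

# Anyone who receives both data sets can join them on the shared code.
merged = {}
for source in (retailer_profiles, app_profiles):
    for code, attributes in source.items():
        merged.setdefault(code, {}).update(attributes)
```

Neither service stores a name, yet the two partial profiles collapse into one record because the code works as a stable pseudonym for the same person.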
Consumer and behavior management: personalization and experiments
Based on advanced methods of connecting and combining data across different services, businesses in all sectors can leverage today’s ubiquitous behavioral data streams to monitor and analyze a wide range of consumer activities and behaviors that may be relevant to their business interests.
With the help of data providers, companies are trying to identify as many touchpoints as possible throughout the customer journey, from digital to in-store purchases, mail, television advertisements, and calls from call centers. They attempt to record and measure every interaction with a consumer, including websites, platforms, and devices they do not control themselves. They can seamlessly collect rich data about their customers and others in real-time, enhance it with information from third parties, and use these enriched profiles within the marketing and advertising technology ecosystem. Today’s data management platforms allow for setting complex rules that dictate how to automatically respond to certain criteria, such as specific activities, specific individuals, or a combination of both.
Consequently, individuals never know whether their behavior triggered a reaction from any of these continuously updated, interconnected, opaque networks of surveillance and profiling, and if so, how this affects the choices they have in communication channels and in the everyday situations they encounter.
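The rule-based automation attributed to data management platforms above might look roughly like the following sketch. The Rule class, the event fields, and the action names are invented for illustration and do not correspond to any particular vendor's product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # decides whether the rule fires for an event
    action: str                        # label of the automated response


# Hypothetical rules combining a specific activity with a specific audience segment.
RULES = [
    Rule(
        "cart_abandoned",
        lambda e: e.get("event") == "cart_abandoned",
        "send_discount_email",
    ),
    Rule(
        "high_value_visitor",
        lambda e: e.get("segment") == "high_value" and e.get("event") == "page_view",
        "show_premium_banner",
    ),
]


def respond(event: dict) -> List[str]:
    """Return the actions triggered by one incoming behavioral event."""
    return [rule.action for rule in RULES if rule.condition(event)]


print(respond({"event": "page_view", "segment": "high_value"}))
print(respond({"event": "page_view"}))
```

The person generating the event sees only the outcome, such as a banner or an email, and not the rule that produced it.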
The data streams shared between online advertisers, data brokers, and other companies are not only used to display targeted ads on websites or within mobile apps. They are increasingly used to dynamically personalize the available content and options offered to consumers. The data technology company Optimizely, for example, offers content personalization of a website for first-time visitors based on the digital profiles of these visitors provided by Oracle.
Online stores may, for example, personalize how someone is treated, which products are displayed prominently, and what discounts are offered; even the prices of products or services may vary depending on who visits a website. Online fraud detection services evaluate users in real time and decide which payment and shipping methods someone can access.

Companies have developed technologies to continuously calculate and evaluate someone’s potential long-term value, based on information regarding an individual’s browsing history, search history, and location, as well as the use of applications, product purchases, or friends on a social network. Every click, touch on a touchscreen, “like”, post, or purchase can automatically influence the way someone is treated as a customer, how long someone has to wait when calling a contact line, or whether someone is excluded from marketing efforts or services.
Three types of technological platforms play an important role for this kind of direct personalization. First, companies use advanced “customer relationship management” systems to manage their data regarding customers and potential customers. Second, they use “data management platforms” to connect their own data with the digital advertising ecosystem and to obtain additional profile information about their customers. Third, they can use “marketing prediction platforms,” which help them craft the right message to the right person at the right time, calculating how to persuade someone, exploiting personal biases and weaknesses.
The data company RocketFuel, for example, promises its clients “to collect trillions of digital and ‘offline’ signals to create individual profiles and offer personalized, continuously active, continuously relevant consumer experiences,” based on 2.7 billion unique profiles stored in its data center. RocketFuel states that “it scores every impression, in its effort to influence the consumer.”
The marketing prediction platform TellApart, which belongs to Twitter, creates a customer score for every buyer and product combination, a “purchase probability score, predicted order size, and overall value,” based on “hundreds of online and in-store signals for a specific anonymous customer.” Subsequently, TellApart assists in automatically gathering content such as “product images, logos, offers, and other metadata” for personalized advertisements, emails, websites, and offers.
Similar methods can be used to personalize pricing in online stores, for example, by predicting how valuable someone might be as a long-term customer or how much they are likely willing to pay at that moment. Strong evidence suggests that online stores already show different products, or even different prices for the same products, to different consumers based on their individual characteristics and behaviors. A related field is the use of personalization during electoral campaigns. Targeting voters with personalized messages that are tailored to their personality and political views on certain issues has already sparked widespread discussions regarding the potential for political manipulation.
Personalization based on rich profile information and real-time pervasive monitoring has become a powerful tool for influencing consumer behavior, such as visiting a website, clicking on an advertisement, subscribing to a service or newsletter, downloading an application, or purchasing a product.

To further improve this, companies have begun to constantly experiment with people. They conduct tests with different variations of functions, website designs, interface elements, titles, button texts, images or even different discounts and prices and then carefully monitor and measure how different groups of users interact with these variations. In this way, companies systematically optimize their ability to encourage people to act as they want them to act.
News organizations, including major outlets such as The Washington Post, use different versions of article headlines to test which variation performs better. Optimizely, one of the leading technology providers for such tests, offers its customers the ability “to experiment broadly across the entire customer experience, on any channel, on any device, and in any application.” Experimenting on unsuspecting users has become the new norm.
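A headline test of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions: the variant names, the hash-based assignment in assign_variant, and the sample log are hypothetical, and the sketch omits the statistical significance checks a real experiment would need.

```python
import hashlib

VARIANTS = ["headline_a", "headline_b"]


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]


def click_rates(log):
    """Compute the click-through rate per variant from (variant, clicked) records."""
    shown = {v: 0 for v in VARIANTS}
    clicked = {v: 0 for v in VARIANTS}
    for variant, did_click in log:
        shown[variant] += 1
        clicked[variant] += int(did_click)
    return {v: clicked[v] / shown[v] if shown[v] else 0.0 for v in VARIANTS}


# Hypothetical measurement log: which variant was shown and whether it was clicked.
log = [
    ("headline_a", True),
    ("headline_a", False),
    ("headline_b", True),
    ("headline_b", True),
]
rates = click_rates(log)
print(assign_variant("user-123", "headline-test"), rates)
```

Hashing the user identifier together with the experiment name keeps each user in the same group across visits without storing the assignment, while the measured rates indicate which variation "performs better".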
Facebook declared in 2014 that it conducts “over a thousand experiments every day” in order to “optimize specific outcomes” or “to inform long-term design decisions.” In 2010 and 2012, the platform carried out experiments on millions of users and concluded that the configuration of the interface, functions, and displayed content can significantly increase voter participation for certain groups of people. The platform’s notorious “emotional contagion experiment”3 on nearly 700,000 users involved covert manipulation of the volume of emotionally positive and negative posts in users’ news feeds, which ultimately influenced how many positive and negative messages the users themselves subsequently published.
After massive public criticism over Facebook’s experiments, the online dating platform OkCupid published a provocative post defending such practices, stating that “we experiment on people” and “everyone else does it too.” OkCupid referred to an experiment in which it had manipulated the percentages displayed to various user pairs regarding how much they “matched.” When pairs who were actually a poor match were shown a 90% match rate, these users exchanged significantly more messages with each other. OkCupid claimed that when “you tell people” they are “a great match,” “they behave as if they really are.”
Dragnet – daily life, marketing data and risk analysis
Data regarding people’s behaviors, social relationships, and most private moments are increasingly being applied in contexts or for purposes entirely different from those for which they were originally recorded. Specifically, they are increasingly being used for making automated decisions about individuals in critical areas of their lives, such as finance, insurance, and healthcare.
Credit rating agencies and other key players in risk assessment, in areas such as identity verification, fraud prevention, healthcare, and insurance analysis, also provide marketing solutions. Moreover, most data brokers trade in many types of sensitive information, such as data regarding an individual’s financial status, for marketing purposes. The use of credit scores for marketing purposes, whether to include or exclude vulnerable population groups, has evolved into a product that unifies marketing and risk management.
The credit reporting company TransUnion provides, for example, a service for data-driven decision-making to retail and financial services companies, allowing clients to “apply marketing and risk management strategies tailored to customers, channels, and business objectives.” The service includes credit data and promises “unique insights into consumer behavior, preferences, and potential risk.” Companies can enable consumers to “choose from a range of offers customized to their needs, preferences, and risk profile” and “evaluate a customer for multiple products across all channels and then present only the offers that are most relevant to them, and profitable for the company.”
Beyond the real-time monitoring engine developed within the framework of online advertising, other forms of pervasive surveillance and profiling have emerged in the fields of risk analysis, fraud detection, and cybersecurity.
Today’s online fraud detection services use extremely intrusive technologies to evaluate billions of digital transactions and collect vast amounts of information about devices, individuals, and behaviors. Traditional providers in the field of credit assessment, identification, and fraud prevention have begun to monitor and evaluate how people use the internet and their devices. Moreover, they have started to link digital behavioral data with the vast quantities of “offline” identification information they have been collecting for decades.
With the rise of technology-mediated services, consumer identity verification and fraud prevention have become increasingly important and challenging issues, particularly in light of cybercrime and automated fraud. At the same time, today’s risk analysis systems have amassed enormous databases containing sensitive information about entire populations. Many of these systems cover a wide range of scenarios, including identity verification for financial services, assessment of insurance claims and benefits, and analysis and evaluation of transactions.
Such risk analysis systems may need to decide whether an application or transaction is accepted or not, or which payment and delivery options are available to someone during an online transaction. Identity verification services and fraud analysis are also used in areas such as law enforcement and national security. The line between commercial identity and fraud analysis applications and those used by government intelligence services is becoming increasingly blurred.
When such opaque systems filter individuals, a person may be flagged as suspicious and subjected to special treatment or investigation, or may be rejected without explanation. They may receive an email, a phone call, or a notification, or the system may simply hide an option without the user ever knowing of its existence. Inaccurate assessments may spread from one system to another. It is often difficult or impossible to challenge such negative evaluations that exclude or deny, especially given how difficult it is to challenge mechanisms or decisions about which you know nothing at all.
The cybersecurity company ThreatMetrix processes data for 1.4 billion “unique user accounts” across “thousands of global websites.” Its “digital identity network” records “hundreds of millions of daily consumer transactions, including logins, payments, and new account creations,” and maps the “continuously changing relationships between people and their devices, locations, account credentials, and their behavior” for identity verification and fraud prevention purposes. The company collaborates with Equifax and TransUnion. Its clients include Netflix, Visa, and companies in sectors such as gaming, government services, and healthcare.
Similarly, ID Analytics, a data company recently acquired by Symantec, manages an “identity network” with “100 million identity elements collected every day from leading cross-sector organizations.” The company collects data on 300 million consumers, including information about their loans, online purchases, credit cards, and smartphone applications. With its ID Score, it evaluates digital devices, as well as names, social security numbers, and postal and email addresses.
Trustev, an Ireland-based online fraud detection company, which was acquired by credit rating company TransUnion in 2015, evaluates online transactions for clients in financial services, government, healthcare and insurance, based on the analysis of digital behaviors, identities and devices such as phones, tablets, laptops, gaming consoles, televisions, and even refrigerators. The company provides clients with the ability to analyze how visitors click and interact with websites and applications, and uses data from a broad spectrum to assess users, such as phone numbers, emails and postal addresses, browser and device “fingerprints”, credit checks, transaction histories, IP addresses, mobile data and locations. Trustev also offers technology for creating an individual’s “social fingerprint”, which analyzes social media content, including analysis of friends lists and recognition of behavioral patterns. TransUnion has incorporated Trustev’s technology into its own identity verification and fraud prevention solutions.

Similarly, credit reporting agency Equifax states that it has data on nearly 1 billion devices and can verify “where a device is actually located and whether it is associated with other devices used in known fraud.” By combining this data with billions of identity and credit events to detect suspicious activity across various industries, and with information regarding employment and interpersonal relationships among families and collaborators, Equifax claims it can “identify devices as well as individuals.”
Google’s reCaptcha product actually provides similar functionality, at least in part. It is embedded in websites and helps website providers decide whether a visitor is a legitimate human or not. Until recently, users had to solve various kinds of quick challenges, such as deciphering letters in an image, selecting objects in a grid of images, or simply clicking a checkbox titled “I’m not a robot.” In 2017, Google introduced an invisible version of reCaptcha, explaining that from then on humans would be allowed to pass without any related interaction, in contrast to “suspicious users and bots.”
The company does not disclose what types of data and behaviors it uses to recognize humans. Research suggests that Google does not only use IP addresses, browser “fingerprints”, the way the user types or moves their mouse or uses the touchscreen “before, during and after” an interaction with reCaptcha, but also several of Google’s cookies. It is unclear whether individuals without accounts face disadvantages, whether Google is able to identify specific individuals and not just “humans”, or whether Google also uses the data recorded in reCaptcha for purposes other than bot detection.
The omnipresent streams of behavioral data recorded for online advertising are increasingly flowing into fraud detection systems. The marketing data platform Segment, for example, offers its customers easy ways to send their own customers’ data from many different marketing technology services to fraud detection companies. One of these is Castle, which uses “customer behavioral data to predict which users pose a potential security or fraud risk.” Another, Smyte, helps with “preventing fraud, spam, harassment, and credit card fraud.”
Experian also offers a cross-device tracking service that provides universal device identification from mobile, web, and apps for digital marketing. The company promises to reconcile and correlate its customers’ “existing digital identifiers,” including “cookies, device IDs, IP addresses and many others,” providing merchants with an “omnichannel, consistent and stable link across all channels.”
Experian’s device recognition technology comes from 41st Parameter, an online fraud detection company that Experian acquired in 2013. Based on this technology, Experian also offers a “smart device” solution for detecting fraud in online payments, which “sets a reliable identifier for the device and collects enriched device data,” “identifies each device on every visit in milliseconds,” and “provides unparalleled visibility into the person behind the payment.” It is unclear whether Experian uses the same data for device recognition services in fraud detection and marketing.

Mapping the commercial landscape of tracking and profiling
In recent years, existing commercial surveillance practices have rapidly evolved into a vast landscape of corporate players continuously monitoring entire populations. Certain actors in today’s diffuse monitoring and profiling ecosystem, such as large platforms and other companies with a vast number of customers, hold a unique position regarding the scale and depth of their consumer profiles. However, the data used for making decisions about people across many areas of life is mostly not collected in one place, but is gathered from multiple sources in real time, as needed.
A wide range of data and analytics companies in the fields of marketing, customer management, and risk analysis collect, analyze, share, and trade consumer data seamlessly, combining them with additional information from thousands of other companies. While the data and analytics industry provides the means for developing these powerful technologies, businesses across many sectors contribute equally to both the volume and detail of the collected data and the ability to utilize them effectively.
Google and Facebook, followed by other large platforms such as Apple, Microsoft, Amazon, and Alibaba, have unprecedented access to data about the lives of billions of people. Although they have different business models and therefore play different roles in the personal data industry, they have the power to broadly dictate the fundamental parameters of overall digital markets. The large platforms mostly limit the way other companies can access their data. In this way, they force these companies to use the platforms’ user data within the platforms’ own ecosystems, while collecting additional data from beyond the platforms’ reach.
Although large multinationals in various sectors that frequently interact with consumers are in a somewhat similar position, they not only acquire consumer data collected by others, but often also provide data. While segments of financial services and telecommunications, as well as critical social sectors such as healthcare, education and employment, are subject to stricter privacy protection regulations in most jurisdictions, a wide range of companies has begun to use or contribute data to today’s commercial surveillance networks.
Retailers and other companies that sell products and services to consumers mostly also sell data related to their customers’ purchases. Media conglomerates and digital publishers sell data about their audiences, which are then used by companies in other sectors. Telecommunications and broadband providers have begun tracking their customers online. Large retail, media, and telecommunications companies have acquired or are acquiring data technology, tracking, and advertising companies. With Comcast acquiring NBC Universal and AT&T likely to acquire Time Warner, major telecommunications companies in the US are also becoming giant publishers, creating powerful portfolios of content, data, and targeting capabilities. With the acquisition of AOL and Yahoo, Verizon also became a “platform.”
Credit institutions have long used consumer data for risk management, such as creditworthiness assessment and fraud detection, as well as for marketing, customer acquisition, and customer retention. They supplement their own data with external data from credit reporting agencies, data brokers and marketing data companies. PayPal, the largest name in electronic payments, shares personal information with more than 600 third parties, including other payment providers, credit reporting agencies, identity verification and fraud detection companies, as well as the most advanced players in digital tracking ecosystems. While credit card networks and banks have been sharing their customers’ financial data with risk assessment providers for decades, they have now begun selling transaction data for marketing purposes.
A myriad of smaller and larger companies that provide websites, mobile applications, games and other applications are closely connected to the marketing data ecosystem. They use services that allow them to easily transmit data about their users to hundreds of third-party services. Many of them sell user behavioral data streams as a core part of their business model. Even more concerning is that companies providing new types of devices, such as fitness trackers, also incorporate services that transfer user data to third parties.
The diffuse real-time monitoring engine developed for online advertising is rapidly expanding into other sectors, such as politics, pricing, credit rating, and risk management. Insurers around the world have begun to offer their customers programs that include real-time monitoring of behaviors such as driving, health activities, food purchases, or gym visits. New players in insurance analytics and financial technology predict personalized health risks based on consumer data, as well as the creditworthiness of individuals based on behavioral data from phone calls or internet searches.
Data brokers, customer management companies, and advertising firms such as Acxiom, Epsilon, Merkle, and Wunderman/WPP play a significant role in combining and connecting data across platforms, multinationals, and the advertising technology world. Credit reporting agencies, such as Experian, which provide many services in highly sensitive areas such as credit reporting, identity verification, and fraud detection, also play a significant role in today’s sprawling marketing data ecosystem.
Several large companies that provide data services, analytics, and software have also been named as “platforms.” Oracle, a major database and enterprise software provider, has become a consumer data broker in recent years. Salesforce, the market leader in customer relationship management that manages customer databases, acquired Krux, a major data company that connects and combines data across the digital world. Adobe, a software company, also plays a significant role in profiling and advertising technology.
Moreover, most major business software, analytics, and consulting companies, such as IBM, Informatica, SAS, FICO, Accenture, Capgemini, Deloitte, and McKinsey, or even information and defense service companies such as Palantir, also play a significant role in managing and analyzing personal data, from customer relationship management to identity management, marketing, and risk analysis for insurers, banks, and governments.
translation/adaptation: Wintermute
- Translator’s note: The full report was published in 2017; it runs to about 100 pages and includes many references to further literature, journalism, and other sources. This text translates the summary, which is available along with the full report at https://crackedlabs.org/en/corporate-surveillance
- Networks of Control (2016): http://crackedlabs.org/en/networksofcontrol
- Experimental evidence of massive-scale emotional contagion through social networks: https://www.pnas.org/doi/full/10.1073/pnas.1320040111

