Critical Analysis

The Invisible Harvest

How Pokémon Go, Meta glasses, and your headphones became the largest AI training operation ever built.

Surveillance · Spatial Mapping · Biometric Data · Consumer Rights
The Problem

You signed up to catch Pikachu.

Nobody sits down and decides to donate their home's floor plan to a technology company. Nobody consciously agrees to let a corporation build a photorealistic 3D model of their neighborhood using their phone's camera. Nobody opts in to having their voice patterns analyzed for emotional valence by a machine learning system they have never heard of. And yet, each of these things is happening, at massive scale, every day, through applications that present themselves as entertainment, fitness trackers, and audio accessories.

This is not a conspiracy. The data collection is disclosed, technically, in terms-of-service agreements that run to tens of thousands of words and change without notice. The practices are legal in most jurisdictions. What they are not is transparent in any meaningful sense. The gap between what users understand they are agreeing to and what is actually happening to their data is the defining consumer technology story of the 2020s.

Three categories of consumer device have become especially significant in this context: augmented reality games built on crowd-sourced spatial data (Niantic's ecosystem, centered on Pokémon Go), wearable cameras disguised as eyewear (Meta's Ray-Ban smart glasses), and always-listening audio devices (earbuds, smart speakers, and everything in between). Together, they construct what the industry euphemistically calls "ambient computing" and what users might more accurately call a distributed sensor network pointed at their lives.

Case Study 01

Niantic's World Model

Pokémon Go launched in July 2016. Within a month it had been downloaded more than 100 million times. Players walked the streets pointing their phones at the physical world to capture fictional creatures overlaid on camera feeds. What those players were also doing, simultaneously and invisibly, was generating one of the most comprehensive crowd-sourced maps of human-scale physical space ever constructed.

Niantic's product was never purely a game. The company began as an internal startup at Google, built around Google Maps data and location infrastructure, before spinning out in 2015. Its first title, Ingress, asked players to physically visit real-world locations and "capture" them, generating a dense semantic map of points of interest. Pokémon Go inherited this architecture and scaled it to a global audience.

Niantic formalized what it had been building with the public launch of its Lightship developer platform in 2021, followed in 2022 by Lightship VPS (Visual Positioning System). VPS allows compatible devices to localize themselves in three-dimensional space by matching camera frames against a database of pre-scanned 3D point clouds contributed by players using Niantic applications. The more people scan a location, the more accurate the model becomes. Players contribute spatial data; Niantic retains a persistent, commercial 3D map of the world.
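
To make that localization step concrete, the sketch below shows the generic recipe a visual positioning system follows: extract features from the current camera frame, match them against descriptors stored alongside a pre-scanned 3D point cloud, and solve a perspective-n-point problem to recover the device's pose. Niantic's production pipeline is proprietary; the OpenCV-based matching, the point-cloud format, and the thresholds here are illustrative assumptions, not Lightship's actual implementation.

```python
# Minimal VPS-style localization sketch: match one camera frame against a
# pre-scanned point cloud and recover the device pose. Illustrative only.
import cv2
import numpy as np

def localize_frame(frame_gray, map_points_3d, map_descriptors, camera_matrix):
    """Estimate camera pose from one frame against a pre-built point cloud.

    map_points_3d:   (N, 3) world coordinates of previously scanned points
    map_descriptors: (N, 32) ORB descriptors stored alongside each point
    """
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints, frame_desc = orb.detectAndCompute(frame_gray, None)
    if frame_desc is None:
        return None

    # Match the frame's descriptors against the descriptors stored in the map.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(frame_desc, map_descriptors)
    if len(matches) < 12:
        return None  # not enough overlap with the scanned area

    image_pts = np.float32([keypoints[m.queryIdx].pt for m in matches])
    world_pts = np.float32([map_points_3d[m.trainIdx] for m in matches])

    # Perspective-n-Point with RANSAC: where is the camera inside the map?
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        world_pts, image_pts, camera_matrix, None)
    if not ok:
        return None
    return rvec, tvec  # rotation and translation of the device in world space
```

The key point for this article is the asymmetry the code makes visible: the player's device only needs the map for a moment, but the map itself, refined by every additional scan, persists on the platform side.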

What Niantic collects through its apps:

Spatial Data

3D point clouds of real-world locations contributed via AR scanning. Used to build Lightship VPS, a commercial real-world mapping layer sold as a developer platform.

Movement Trajectories

Precise GPS traces including speed, direction, and duration. Aggregated movement creates behavioral models: when people walk, which routes they prefer, where they linger (a toy illustration follows this list).

Visual Environment

AR camera frames used for creature overlay also capture environmental visual context. Scaniverse (Niantic's standalone 3D scanning app) requests explicit room and building scans.

Social Graphs

In-game social connections correlate real-world proximity events. Which players appear at the same location at the same time is itself meaningful behavioral data.
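
As a toy illustration of the behavioral modelling described under Movement Trajectories above, the sketch below detects "dwell points", places where a trace lingers, from timestamped GPS coordinates. The radius and time thresholds and the haversine helper are illustrative assumptions, not Niantic's actual analytics.

```python
# Toy dwell-point detection: where does a GPS trace linger, and for how long?
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))

def dwell_points(trace, radius_m=75, min_minutes=10):
    """trace: list of (timestamp_seconds, lat, lon), sorted by time.

    Returns (lat, lon, minutes) tuples for places where the trace stayed
    within radius_m for at least min_minutes. The final open segment of the
    trace is ignored for brevity.
    """
    dwells, anchor = [], 0
    for i in range(1, len(trace)):
        t0, lat0, lon0 = trace[anchor]
        ti, lati, loni = trace[i]
        if haversine_m(lat0, lon0, lati, loni) > radius_m:
            elapsed = trace[i - 1][0] - t0
            if elapsed >= min_minutes * 60:
                dwells.append((lat0, lon0, elapsed / 60))
            anchor = i
    return dwells  # home, workplace, gym, bar -- inferred from raw coordinates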

In 2022, Niantic entered a strategic partnership with satellite imaging company Planet Labs. The same year, it acquired 8th Wall, a company specializing in web-based AR that requires no dedicated app, dramatically lowering the friction for browser-based spatial data collection. Niantic has since moved to split its games division from its platform business, clarifying that its core business is Lightship: a commercial real-world AI infrastructure product, sold to developers and built on data generated by game players.

The game was the distribution mechanism. The map was always the product.

Pokémon Go built a secret 3D map: an analysis of what Niantic actually built with player-contributed spatial data.

Case Study 02

The Glasses That Remember

In October 2024, two Harvard students, AnhPhu Nguyen and Caine Ardayfio, published a demonstration that made clear what the Ray-Ban Meta smart glasses are capable of when connected to AI systems. They walked around Boston wearing the glasses, which streamed footage in real time to a computer running a simple pipeline: frame extraction, reverse face search via PimEyes, lookups against people-search databases, and an LLM that collated the results. Within seconds of the camera seeing someone's face on the street, the system returned their name, employer, address, and phone number.

The students released no code and did not identify the people who appeared in their footage. The intention was not to enable harm but to demonstrate feasibility. The demonstration required no hacking, because the glasses were functioning exactly as designed; the pipeline simply combined tools that already exist. Facial recognition is not new. LLMs capable of structured data retrieval are not new. The novelty was the form factor: inconspicuous glasses, with a recording indicator light that many bystanders did not notice, capturing continuous video.

The Ray-Ban Meta glasses record 1080p video and capture 12 MP still images. They have an LED indicator that illuminates during recording, but it is small and positioned on the frame, visible only to those who know to look for it. Surveys conducted after the Harvard demonstration found that well under half of respondents shown photographs of the glasses in use correctly identified when recording was taking place.

Whether that footage feeds AI training is governed by Meta's privacy policy, which reserves the right to use content processed by Meta AI for model improvement. Interactions with the glasses' integrated Meta AI assistant, including descriptions of scenes the AI is "seeing," are processed on Meta's servers. Meta has stated that it does not currently use footage to train its models without user consent, but the architecture exists. Consent to data collection for model improvement is requested through the companion app, and the default settings have changed across multiple software updates.

The structural problem

Consent in the glasses context is one-directional. The wearer can choose whether to use the device. Every person who appears in the camera's field of view has no choice and receives no notice. Standard consent frameworks were designed for interactions between parties who both know an interaction is occurring. Ambient always-on camera systems break this model entirely.

France 24: Harvard students demonstrate real-time facial recognition pipeline using Ray-Ban Meta glasses and off-the-shelf AI tools. (October 2024)

Case Study 03

What Your Earbuds Notice

The privacy conversation around smart speakers and earbuds has centered almost exclusively on the "always-listening" wake-word problem: devices that continuously process microphone input to detect trigger phrases. That concern is real and consequential, but it is only the most visible part of a much broader audio data infrastructure.
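
Architecturally, wake-word detection looks something like the sketch below: a short rolling buffer of microphone audio is continuously scored on-device, and only windows that cross a trigger threshold leave the device. The classifier and the uplink function here are placeholders (real products use compact neural keyword spotters and vendor-specific streaming); the point is that the buffer necessarily contains speech captured before the trigger fired.

```python
# Sketch of the always-on wake-word data flow. Placeholder classifier and
# uplink; illustrative of the architecture, not any vendor's implementation.
import collections
import numpy as np

SAMPLE_RATE = 16_000
WINDOW_SECONDS = 1.0  # how much audio the detector sees at once
ring = collections.deque(maxlen=int(SAMPLE_RATE * WINDOW_SECONDS))

def upload_for_cloud_processing(window: np.ndarray) -> None:
    """Stub for the vendor uplink; real devices stream to company servers."""
    pass

def wake_word_score(window: np.ndarray) -> float:
    """Placeholder for an on-device keyword-spotting model (returns 0..1)."""
    # A real implementation runs a small neural network over a spectrogram.
    return float(np.clip(np.abs(window).mean() * 10, 0.0, 1.0))

def on_audio_chunk(chunk: np.ndarray, threshold: float = 0.8) -> None:
    """Called for every ~20 ms slice of microphone input, indefinitely."""
    ring.extend(chunk)
    if len(ring) < ring.maxlen:
        return  # still filling the rolling window
    window = np.asarray(ring)
    if wake_word_score(window) > threshold:
        # Only now does audio leave the device -- but the window already
        # contains the second of speech captured *before* the trigger fired.
        upload_for_cloud_processing(window)
```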

Amazon acknowledged in 2019 that thousands of employees and contractors listened to and annotated customer audio recordings to improve Alexa's speech recognition. Apple's Siri grading program was exposed by a contractor the same year, revealing that Apple employees and contractors listened to "a small portion" of Siri activations to evaluate quality, including accidental activations during private conversations. Both companies modified their programs following the reporting, but both retained the underlying architecture that makes such collection possible.

The more structurally significant development is the expansion of audio processing into hearable devices. Apple's Personalized Spatial Audio builds a profile of your ear geometry, and AirPods Pro use inward-facing microphones to measure how sound actually behaves inside your ear canal and adapt playback accordingly. This data, a precise description of your ear's shape and acoustics, is synced across devices and linked to your Apple ID. Apple's privacy documentation treats it as device personalization, not as a biometric, but audiologists would identify it as exactly that.
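
The underlying signal-processing idea is simple, and seeing it helps explain why the result is biometric: play a known test signal, record it from inside the ear, and the ratio of the two spectra describes that specific ear. The sketch below is a generic illustration of that measurement, not Apple's actual measurement chain, which is not public.

```python
# Generic in-ear acoustic profiling sketch: compare a known test signal with
# what an inward-facing microphone actually recorded. Illustrative only.
import numpy as np

def in_ear_response(played: np.ndarray, recorded: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """Per-frequency magnitude of how this particular ear reshapes the signal.

    Assumes `played` and `recorded` are time-aligned arrays of equal length.
    """
    spectrum_played = np.abs(np.fft.rfft(played)) + eps
    spectrum_recorded = np.abs(np.fft.rfft(recorded)) + eps
    return spectrum_recorded / spectrum_played

def eq_correction(response: np.ndarray) -> np.ndarray:
    """Inverse curve a device could apply so playback lands as intended."""
    return 1.0 / np.clip(response, 1e-3, None)

# The response curve itself -- a compact description of one ear's geometry and
# resonances -- is the personal data the paragraph above is concerned with.
```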

A timeline of audio data incidents:

2015
Samsung SmartTV privacy policy warns users: "If your spoken words include personal or other sensitive information, that information will be among the data captured and transmitted to a third party." The third party was Nuance Communications, an AI speech company.
2019
Amazon, Google, Apple, and Facebook all confirmed, within the span of a few months, that humans listen to recordings from their respective voice assistant products. All four had disclosed the practice in privacy documentation but not in consumer-facing product descriptions.
2020
Amazon Sidewalk begins rolling out, turning Echo devices and Ring cameras into a shared neighborhood mesh network: low-bandwidth connectivity is shared between nearby devices, layering ambient sensing onto hardware sold as speakers and doorbells.
2022
OpenAI's Whisper model demonstrates a step-change in speech recognition accuracy, trained on 680,000 hours of audio collected from the internet. By dramatically reducing the cost of transcribing arbitrary audio, it makes any hoard of collected voice data significantly more valuable.
2024
Meta announces that environmental audio from Ray-Ban glasses processed by Meta AI can be used to improve Meta AI services. The policy update requires users to opt out rather than opt in.

CBS News: Amazon employees listening to Echo conversations. The 2019 Bloomberg investigation that exposed widespread human review of Alexa audio recordings.

The specific risk with audio is context reconstruction. A recorded environment contains far more information than a transcript. Acoustic properties reveal room size and surface materials. Background sounds identify the type of location: whether the speaker is in a kitchen, a car, a hospital, an office, a bar. Emotional prosody, the rhythm and texture of speech, is a direct signal of psychological state. And individual voice patterns are a biometric identifier in their own right. Audio data is cheap to store, cheap to process, and extraordinarily rich in inference potential.
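
To illustrate how much of that context falls out of off-the-shelf tools, the sketch below pulls a handful of standard audio features (using the open-source librosa library) that correlate with room acoustics, background setting, speaker identity, and prosody. The feature choices and their interpretations are illustrative, not any vendor's actual inference pipeline.

```python
# Sketch: crude context features extracted from one audio recording.
import librosa
import numpy as np

def context_features(path: str) -> dict:
    y, sr = librosa.load(path, sr=16_000, mono=True)
    return {
        # Spectral centroid: bright, reflective rooms vs. soft-furnished ones.
        "spectral_centroid_hz": float(librosa.feature.spectral_centroid(y=y, sr=sr).mean()),
        # Energy variance: steady office hum vs. a bar or a busy street.
        "rms_variance": float(librosa.feature.rms(y=y).var()),
        # MFCC summary: the same representation speaker-identification systems
        # build on, i.e. the voice-as-biometric part of the problem.
        "mfcc_mean": librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1).tolist(),
        # Pitch variability: a coarse proxy for the prosody signals above.
        "f0_std_hz": float(np.nanstd(librosa.yin(y, fmin=60, fmax=400, sr=sr))),
    }
```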

Interactive

Your Data Surface

[Interactive calculator: select the devices and applications you use regularly, and it estimates your passive contribution profile to AI training pipelines across four categories: location and movement, visual/spatial, audio and voice, and biometric signals.]

Consequences

The Infrastructure You Are Building

The individual data points discussed above become qualitatively different when they are combined and when they persist over time. A single GPS trace is a privacy concern. Twelve months of GPS traces, correlated with audio environment data, visual scans of frequented spaces, and social graph proximity events, amount to a behavioral profile with more predictive power than anything most people have ever intentionally documented about themselves.

The specific danger is not that any single company is malicious. Most data collection practices described here are implemented by engineers who genuinely believe they are building useful products. The danger is structural. Data that is collected does not stay where it was deposited. Companies merge, are acquired, and go bankrupt. Data brokers purchase anonymized datasets and de-anonymize them against other datasets. Law enforcement requests expand in scope through each court precedent. Data retained for product personalization becomes training data, then becomes the basis for automated decisions about insurance, employment, and credit.

Illinois's Biometric Information Privacy Act (BIPA), passed in 2008, is the most consequential privacy law in this domain in the United States. It requires explicit opt-in consent before collecting biometric identifiers, including face geometry, voice prints, and iris scans. Texas and Washington have similar laws. The EU's GDPR treats biometric data as a special category requiring explicit consent and significant documentary burden for processors. Despite this legal scaffolding, enforcement has been sporadic, penalties have rarely deterred companies with market capitalizations in the hundreds of billions, and most users in most jurisdictions have no legal recourse at all.

Spatial privacy, the right to move through physical space without generating persistent, commercially available records of that movement, does not currently exist as a legal concept in most countries.

VPRO Documentary: Shoshana Zuboff on surveillance capitalism and the structural transformation of personal data into AI training fuel. Essential viewing for the theoretical framework behind this article.

The longer-term consequence is the normalization of ambient data extraction as the business model for physical-world computing. Every successive generation of wearable device, from earbuds to glasses to watches to clothing, arrives with more sensor capability, more persistent connectivity, and more sophisticated on-device AI that preprocesses data before it reaches company servers, making regulation harder. Each device, in isolation, appears to offer genuine value. Each also extends the sensor network another degree. The cumulative effect is an infrastructure in which physical movement through the world generates a permanent digital record by default, with opt-out as the exception.

Practical

What Individuals Can Actually Do

Individual mitigation is meaningful at the margin, though it does not address the structural issues. The most impactful steps are the most boring ones. Reviewing and disabling location access for applications that have no functional need for it significantly reduces the resolution of behavioral profiling. Disabling camera permissions for AR games eliminates spatial scanning contributions. Smart speaker settings are worth reviewing too: the option governing whether recordings can be used for model improvement is often a single toggle that most users never see, because it defaults to participation.

For glasses specifically: the only current mitigation is not wearing devices with always-on cameras. There is no technical mechanism for bystanders to prevent their faces from appearing in footage. Legislative movement toward requiring more visible recording indicators, or toward explicit bystander consent frameworks, is active in the EU under the AI Act's provisions on biometric categorization systems, but implementation timelines run to years.

The more durable intervention is political. BIPA exists because an Illinois state legislature decided that biometric data required a different consent standard than browsing cookies. That decision came largely from advocacy by a coalition of privacy researchers, civil liberties organizations, and unions whose members were being subjected to workplace biometric monitoring. The regulatory framework for ambient device data collection is not predetermined. It will be shaped by the extent to which people treat it as a political issue rather than an individual consumer choice.

References

Sources & Further Reading

[1] Niantic, Inc. (2021). Niantic Lightship AR Developer Kit — Documentation. Niantic Developer Portal. https://lightship.dev
[2] Koebler, J. (2022). Niantic Is Building a 3D Map of the World Using Pokémon Go Players. Motherboard / Vice. Accessed March 2026. vice.com
[3] Hanke, J. (2024). Niantic Announces Spin-Off of Games Business as Separate Company. Niantic Blog. March 2024. nianticlabs.com
[4] Nguyen, A. & Ardayfio, C. (2024). I-XRAY: Real-Time Facial Recognition Using Meta Ray-Ban Glasses. Harvard University student research demonstration. October 2024. Documentation available via MIT Technology Review coverage.
[5] Burgess, M. (2024). Meta's Smart Glasses Can Be Used to Dox Anyone You Meet. Wired. October 2024. wired.com
[6] Day, M., Turner, G., & Drozdiak, N. (2019). Amazon Workers Are Listening to What You Tell Alexa. Bloomberg Businessweek. April 10, 2019. bloomberg.com
[7] Thomas, Z. & Sherwood, B. (2019). Apple Contractors "Regularly Hear Confidential Details" on Siri Recordings. The Guardian. July 26, 2019. theguardian.com
[8] Rivera, J. (2015). Samsung's Smart TV Privacy Policy Warns Customers Not to Discuss Personal Information in Front of Television. The Independent. February 9, 2015.
[9] Amazon, Inc. (2021). Amazon Sidewalk Privacy and Security Whitepaper. AWS Official Documentation. developer.amazon.com
[10] Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2022). Robust Speech Recognition via Large-Scale Weak Supervision. arXiv:2212.04356. OpenAI Technical Report. arxiv.org/abs/2212.04356
[11] Illinois General Assembly. (2008). Biometric Information Privacy Act (BIPA), 740 ILCS 14. State of Illinois Official Publications. ilga.gov
[12] European Parliament. (2024). Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act). Official Journal of the European Union. eur-lex.europa.eu
[13] Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs. ISBN 978-1-61039-569-4.
[14] Hill, K. (2020). The Secretive Company That Might End Privacy as We Know It. The New York Times. January 18, 2020. (On Clearview AI and the commercial facial recognition ecosystem.)
[15] Nissenbaum, H. (2010). Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford University Press. (Reference framework for contextual integrity applied to ambient data collection.)

Written by GenLab Editor

Creative coder, digital artist, and tech researcher analyzing the intersections of code, design, and machine logic. Exploring the philosophical implications of emerging technologies.
