Think about what this implies. If your phone is listening, it’s listening all the time, right? So like 12-18 hours of continuous audio every day. That’s a lot of ad triggers. Way too many to actually be served with ads during your browsing time, which is a strict subset of your total audible proximity to your phone (plus ad inventory is a strict subset of what you view on your phone).
So how does the phone + ad networks decide which words to prioritize to trigger which ads when?
So for this anecdote to be true, not only would the phone have to be listening, but the targeting algorithm would need to decide to actively exclude all the other audible triggers from that time period, and fill your limited ad impression inventory with the one phrase you were intentionally testing.
How would it do that? Especially if this is indeed an outlier one-off topic of conversation that you cover in a single sentence. There would not be contextual clues (like repetition over time) that might indicate you are actually “in market” for a pool fence.
To me this is the problem with these anecdotal tests. You understood that that was an important phrase in the context of ad targeting. But how did the automated ad system know it should serve you ads on that topic, and not one of the many other advertisable topics you talk about over the course of several days? Or that your phone hears over several days?
1) App stores the trailing two minutes of speech in memory.
2) If the app detects a consumption-related trigger word, the related conversation is flagged for transmission to the server.
3) The flagged audio block is converted to text. Consumption-related verbs ("buy", "purchase", etc.) are identified. The syntax of the sentence clearly indicates which noun is the target of a given consumption-related verb ("new car", "pool fencing").
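To be concrete, here's a toy sketch of that pipeline in Python. Everything in it (the class, the trigger list, the regex, the frame rate) is invented for illustration, and the real work of capturing and transcribing audio is stubbed out:

    # Toy sketch only: hypothetical names and thresholds, with the actual audio
    # capture and transcription stubbed out.
    import collections
    import re

    TRIGGER_WORDS = {"buy", "buying", "purchase", "look into"}

    class HypotheticalListener:
        def __init__(self, seconds=120, frames_per_second=50):
            # 1) Keep only the trailing two minutes of audio frames in memory.
            self.buffer = collections.deque(maxlen=seconds * frames_per_second)

        def on_audio_frame(self, frame):
            self.buffer.append(frame)

        def on_transcript(self, text):
            # 2) A consumption-related trigger word flags the buffered
            #    conversation for transmission.
            if any(word in text.lower() for word in TRIGGER_WORDS):
                self.flag_for_upload(list(self.buffer), text)

        def flag_for_upload(self, audio_frames, text):
            # 3) Identify which noun phrase is the target of the verb,
            #    e.g. "we should buy a new pool fence" -> "pool fence".
            match = re.search(
                r"\b(?:buy|purchase)\s+(?:(?:a|an|some|the|new)\s+)*([\w ]{3,40})",
                text.lower())
            topic = match.group(1).strip() if match else None
            print("would upload", len(audio_frames), "frames; topic:", topic)

The point isn't that any real app is known to ship this, only that none of the three steps requires anything exotic.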
Lots of people run network traffic sniffers to see what apps are doing. Lots of people decompile apps. Lots of people at companies leak details of bad things they are doing.
Why has nobody been able to demonstrate this beyond anecdotes about talking about swimming pools and then getting adverts for swimming pool stuff?
These are fair questions! I'm not convinced that it is happening. Nor am I convinced, as the parent seems to be, that it would be difficult to do.
edit: Having re-read my comment, I can see how it could easily be read to say "It's happening and this is how it works", whereas I intended to convey something like "It could easily be done and here's how." I have a bad habit of implying my point rather than stating it outright. I'm working on it!
I have a suspicion it’s not Facebook or Google listening in, but rather other third party apps. In fact it’s not even the third party apps but the libraries/frameworks they use to show ads.
Android shows when an app is using the microphone with a green indicator in the upper right corner -- I'm assuming iOS has something similar. How would apps get around this?
Easy: That indicator is not always on when the mic is.
Unless we're talking about an electret capsule with a physical LED wired into the supplying power rail that is switched off when the mic is not in use, you have to trust software.
And good luck with that after the Patriot Act. I am not implying the NSA has a microphone backdoor, but if they did and someone abused it, how would you know about it?
Listening for keywords and only sending text/audio when keywords are spoken isn't only good for ads; it would be the dream of any intelligence agency. And it has been a few years since Snowden.
But would you use it to show ads if you have access to such a backdoor?
That's an easy way for your backdoor to be found out. Something like this, if it exists, would be too valuable to be used en masse.
Well, if I were a secret service, I could either try to force Google to do it and risk a leak, or I could find a way to make it worth their while, which has the added benefit of plausible deniability.
Yeah but that app was just nightmarishly bad, including an absolutely terrible approach to roll-your-own push notifications. Never attribute to malice that which can be explained by incompetence.
There will be no proof until somebody inside Apple who is in on the scam decides to grow a conscience and blow the whistle. Then they will be dismissed as a "disgruntled employee". Decompiling Siri probably will get you a lot of attention from very expensive lawyers that will make your life very interesting for a long while.
I can't even begin to tell you how many times I've been randomly having a conversation with someone, only to be alerted to the sound of the Google Assistant suddenly responding to what we're saying. Something we said was interpreted as a wake word, and then from that point on, every single thing we said was transcribed via STT, sent to Google's servers, various Google search queries were run, etc, and then the assistant responded - because it thought it was responding to a valid query and had no way of knowing otherwise. This has gotten worse with Gemini but has in no way been limited to that.
In this situation, I was alerted to this because the assistant started responding. However, I've also been in situations where I tried deliberately to talk to the assistant and it failed silently. In those situations, the UI spawns the Assistant interaction dialog, listens to what I say and then just silently closes. Sometimes this happens if there's too much background noise, for instance, and it then just re-evaluates that it wasn't a valid query at all and exits. Sometimes some background process may be frozen. Who knows if this happens before or after sending the data to the server. Sometimes the dialog lingers, waiting for the next input, and sometimes it just shuts off, leaving me (annoyingly) to have to reopen the dialog.
Putting that together, I have no idea how many times the Google Assistant has activated in my pocket, gone live, recorded stuff, sent it to Google's servers, realized it wasn't a valid query, and shut off without alerting me. I've certainly seen the Assistant dialog randomly open when looking at my phone plenty of times, which is usually a good indicator that such a thing has happened. If it silently fails in such a way that the UI doesn't respawn, then I would have no idea at all.
The net effect is that Google gets a random sample from billions of random conversations from millions of people every time this thing unintentionally goes off. They have a clear explanation as to why they got it and why ads are being served in response afterward. They can even make the case that the system is functioning as intended - after all, it'd be unreasonable to expect no false positives, or program bugs, or whatever, right? They can even say it's the user's fault and that they need to tune the voice model better.
Regardless, none of this changes the net result, which is they get a random sample of your conversation from time to time and are allowed to do whatever with it that they would have done if you sent it on purpose.
"Putting that together, I have no idea how many times the Google Assistant has activated in my pocket, gone live, recorded stuff, sent it to Google's servers, realized it wasn't a valid query, and shut off without alerting me."
Have you tried requesting an export from Google Takeout to see if there are answers in that data?
You may not be able to disassemble binaries or intercept network traffic but there are plenty of privacy researchers who can, and none of them have found anything.
That said, there's a much easier way to test this. Take two identical voice-recognition smart devices (think Amazon Echos) and register each with a new, never-used Amazon account. Put a switch on each device's mic input; leave one switched off and the other on. See if the one with the mic on starts showing ads for things you've never searched for on that Amazon account. If the other one doesn't, then there's your answer.
That sounds interesting enough that I might just give it a try.
Given how widespread this phenomenon is, and that it's been a decade, it's either the most technically complex undetected global conspiracy or it's not actually real[1].
Your voice is unique and can be fingerprinted to ID you (see Alexa devices). Add in things like positive sentiment analysis, changes in vocal inflection/intonation and context surrounding spoken products like purchase inference/intent and you can probably triangulate a threshold for showing products with high likelihood of purchasing intent.
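As a purely illustrative sketch, a threshold like that could be as simple as a weighted score over those signals. The signal names, weights, and cutoff below are all invented, not anything a real ad system is known to use:

    # Entirely invented weights and signal names, purely to illustrate the idea
    # of combining signals into a purchase-intent threshold.
    def purchase_intent_score(signals):
        weights = {
            "speaker_id_confidence": 0.2,  # voiceprint matched the account holder
            "positive_sentiment":    0.2,  # sentiment / inflection analysis
            "intent_verb_present":   0.4,  # "buy", "get", "look into", ...
            "product_noun_present":  0.2,  # a recognizable product category named
        }
        return sum(weights[k] * signals.get(k, 0.0) for k in weights)

    example = {"speaker_id_confidence": 0.9, "positive_sentiment": 0.7,
               "intent_verb_present": 1.0, "product_noun_present": 1.0}
    if purchase_intent_score(example) > 0.6:   # threshold would be tuned empirically
        print("high purchase intent: eligible for a product ad")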
Really smart people have been working on these things at Google for decades, and that's barely scratching the surface of this nuanced discussion. CPUs/GPUs have only gotten faster and smaller, with more RAM available and better power management across the board for mobile devices.
Anything is possible if there is money to be made and it’s not explicitly illegal or better they can pay the fines after making their 100x ROI.
Embedded Audio ML engineer here (albeit mostly outside of speech). A modern MEMS microphone typically uses 0.8 mA in full performance mode at 1.8 V. Doing basic voice activity detection, which is the first step of a continuous listening pipeline, can be done in under 1 mA. Basic keyword spotting is likely doable in 10 mA, but this only runs on the parts that the voice activity module triggered on. Let's say that is 4 hours per day. Then basic speech recognition, for buying phrases and categorization, would maybe cost 100 mA. But say only 10% of the 4 hours = 0.4 hours have keywords triggered.
That would give a total power budget of (1.8 mA × 24 h) + (10 mA × 4 h) + (100 mA × 0.4 h) ≈ 123 mAh per day. A typical mobile phone battery is 4000 mAh, and people do not expect it to last many days anymore... So I would say that this is actually in the feasible range. And this is before considering the very latest in low-power hardware, like MEMS mics with 0.3 mA power consumption or lower, MEMS microphones with built-in voice activity detection, or the low-power neural processing units (NPUs) that some microcontrollers now have.
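Spelled out as a quick calculation with the same estimates (the duty-cycle numbers are the assumptions from above):

    # Back-of-the-envelope battery math using the estimates above.
    mic_plus_vad_ma = 1.8    # mic (0.8 mA) + voice activity detection (~1 mA), always on
    kws_ma          = 10.0   # keyword spotting, only while voice is detected
    asr_ma          = 100.0  # basic speech recognition, only on keyword hits

    vad_hours = 24.0
    kws_hours = 4.0          # assume speech present ~4 h/day
    asr_hours = 0.4          # assume ~10% of that triggers a keyword

    total_mah = mic_plus_vad_ma * vad_hours + kws_ma * kws_hours + asr_ma * asr_hours
    print(f"{total_mah:.0f} mAh/day of a ~4000 mAh battery")  # ~123 mAh, about 3%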
This is amazing, thanks for doing the math. I didn't realize the tech was already feasible off the shelf. I mean, my Apple Watch can detect me saying "Hey Siri" all day with its puny battery.
If big tech isn’t doing this then it sounds like a huge startup idea worth $$$. I hope someone on here in the spirit of HN runs with it and blows the top off this topic once and for all if it’s monetizeable or expose the FAANG patent sharks that come out to play and silence them for infringing on their shady microphone tech.
Hah, that's another great argument against this being a real thing: where are the startup pitches?
If this targeting technique works and is feasible and legal and in demand by advertisers, why isn't there a competitive group of startups all trying to do it better than each other and sell the results?
Now the conspiracy theory has grown to include "dozens of companies compete at this, all of them secretively operating in a marketplace that is entirely invisible to the outside world."
Another question that comes to mind now: would this sort of technique run afoul of wiretapping laws in various states? One is not listening for a wake word to provide a direct response but rather to... I don't know. Just a random thought.
Thank you for taking the time to post this informative response. As a sibling comment said, I didn't realize it was so feasible. When posting my original comment, I was thinking orders of magnitude more power would have been needed to facilitate this.
It could even switch on only when positive social signals imply you are around another person, using Wi-Fi, Bluetooth, GPS, IP address, etc. to ID a nearby device.
They could even pick up or recognize the second voiceprint ID, know it's your best friend, and wake up the audio recognition from ultra power-saving mode or whatever. Literally anything is possible to make this work.
My phone can listen all day, every day. It listens for "hey Google" and it can passively tell you which songs are playing. It's not outside the realm of possibility to do the same kind of audio fingerprinting on keywords and whatnot. The advertising potential makes it extremely juicy.
Your phone can listen for “hey Google” because it’s only one phrase and the model can run at very low power on specialized hardware. If you want to add 1000 keywords the battery drain would be intense.
Pixel phones run song identification constantly now. They have a local database of the top 1000 (?) most popular songs. It has negligible impact on battery life.
Not saying I agree that 'phones are listening to show us ads', but technically we have the capability for that to happen (sampling audio every X intervals and matching against a local database of keywords)
Add at least one zero to your number. Pixel phones can detect the top 11k songs while offline (it used to be more). The fingerprint database for this is around 500 MB in size.
I think it would be very easy to sneak a few (thousand) extra fingerprints into this database and do all kinds of tracking with it, all while the green microphone icon stays off.
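I don't know the actual Now Playing fingerprint format (it's far more robust than this), but as a toy sketch of why extra entries would be invisible from the outside, the lookup is basically just a table match:

    # Toy fingerprint lookup. A real system hashes spectrogram features so the
    # match survives noise; the table lookup itself would look about the same.
    import hashlib

    def fingerprint(audio_chunk: bytes) -> str:
        # Stand-in for a proper noise-robust audio fingerprint.
        return hashlib.sha1(audio_chunk).hexdigest()[:16]

    local_db = {
        "a1b2c3d4e5f60718": "song: example track A",
        "ffeeddccbbaa0099": "song: example track B",
        # "0123456789abcdef": "keyword: pool fencing",  # hypothetical extra entry
    }

    def identify(audio_chunk: bytes):
        return local_db.get(fingerprint(audio_chunk))  # None if nothing matches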
For argument’s sake, let’s be generous and stipulate your phone is listening for 11k keywords to serve you ads.
Why would “pool fencing” take up one of those valuable keyword slots on everyone’s phone?
And you’re going to see way less than 11k ads per day. Why would the ad server prioritize serving an ad for pool fencing (a phrase said once) over all the far more common topics a person talks about in a typical day, like movies, TV shows, food and drink, clothes, cars, consumer electronics, music, etc?
"look into" is a much more likely trigger, then send the 30 seconds before and after to a server for more analysis. "buying" could be another. It's not like it would be that hard. Especially with some of the pretty good vocal compression for audio. It would be a small blip on a modern connection, even wireless.
I'm not saying it is or isn't happening but it wouldn't be hard.
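As a minimal sketch of that "30 seconds before and after" idea (the frame rate, codec, and upload step are all placeholders I made up):

    # Minimal sketch: keep a rolling 30 s, and on a trigger ship that plus the
    # next 30 s. Frame rate, codec, and upload are placeholders.
    import collections

    FRAMES_PER_SEC = 50                       # e.g. 20 ms audio frames
    before = collections.deque(maxlen=30 * FRAMES_PER_SEC)
    pending, frames_left_after = [], 0

    def compress(frames):                     # stand-in for a voice codec
        return b"".join(frames)

    def upload(blob):
        print("would send", len(blob), "bytes")  # a small blip on any modern link

    def on_frame(frame, heard_trigger):
        global pending, frames_left_after
        if frames_left_after:                 # still collecting the 30 s after a hit
            pending.append(frame)
            frames_left_after -= 1
            if frames_left_after == 0:
                upload(compress(pending))
                pending = []
        elif heard_trigger:                   # e.g. local model spotted "look into"
            pending = list(before) + [frame]
            frames_left_after = 30 * FRAMES_PER_SEC
        else:
            before.append(frame)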
Your argument plays with the idea that the phone-listening stuff would be the only source of information for the ad networks. But it would be much more complex than that: it would be only one of many signals used to serve the consumer the right advertisement at the right moment. So it doesn't need to have the exact phrase "pool fencing" in the database; it just needs to detect that something about pools, swimming, etc. was talked about. Since Google has thousands of signals and statistics about this person (like browsing history, current location, the other smartphones that are nearby, and their histories), it can sell the ad space to "pool fencing" and expect a high click-through rate.
Selling ads is a bit like the current LLMs: it's just a stochastic parrot that hallucinates stuff. But what it hallucinates is often the advertisement that brings in the most money.
The self-expressed goal of this kind of test is to pick a phrase or topic that is so random that it escapes that person's existing ad data profile. As the comment above said, "He doesn't own a swimming pool, doesn't want to, and has never expressed any desire to."
So showing that person an ad for pool fencing is a complete waste; they're never going to click it. If that's what an alleged audio targeting system does, it would make the ad network less profitable than just using the data they already have. So why would anyone build it that way?
I don't know if phones listen to us to serve ads, but 11k is a decent vocabulary. Most adults have a vocabulary of around 20k words, so I could imagine it including the words "pool" and "fencing".
Now Playing only has to sample for a few seconds every few minutes when the phone is powered on for other reasons (like to participate in cellular check-ins). This is because a song is typically several minutes long and you only have to fingerprint for a few seconds. It doesn't matter which few seconds. It's not continuously listening, so it's not the same thing at all.
The system knows to serve you ads about the new topic because it's new. You're already getting ads for the stuff you're normally talking about. The new topic stands out easily.
It doesn't have to be your phone. Could be your TV or any other device.
Most importantly there's just patterns of behavior. Companies are absolutely desperate for every scrap of data they can get on you. Why would they not capture audio from your mic?
You’re so right. We should just trust the computers in our pockets, hands, and nightstands 24/7/365 running proprietary operating systems, firmware, and sensor suites phoning home as much targeting data as they can possibly collect — but not that! What could they possibly gain from harvesting that?
Companies really are using tons of highly sensitive data to target ads, even when we sleep. But they're not generally using microphones to record audio to do it. Both things can be accurate statements.
>So how does the phone + ad networks decide which words to prioritize to trigger which ads when?
The same way they analyze your email and web searches. Basically, statistics.
>To me this is the problem with these anecdotal tests. You understood that that was an important phrase in the context of ad targeting. But how did the automated ad system know it should serve you ads on that topic, and not one of the many other advertisable topics you talk about over the course of several days? Or that your phone hears over several days?
Buddy, so many people have witnessed this happening for at least 10 years and even done experiments at this point that it's common knowledge. I know for a fact that one of my friends now has a phone that is especially receptive to hearing me say things around it, because our conversation topics ALWAYS come up in my searches, ads, and feeds shortly after. Think about that. Someone else's phone sends data to a cloud that I never gave permission to. It then puts that together with data from MY phone about where I was (perhaps even the devices chirping at each other!). The aggregation happens within a week then I see relevant ads. I've seen this happen dozens of times. It's no coincidence.
As far as the article, I'm not even going to read it. It's got to be stupid. We know from leaks, reverse-engineering, and personal experience that this spying is going on. I question the source of this article, but I suppose we should never underestimate the lengths someone will go to in order to feel that they are smarter than the rest of us with our eyes open.
I would be VERY interested to hear details of those leaks and that reverse-engineering. I've only ever heard the personal anecdotes.
(If you'd read my article you would have seen this bit at the top: "Convincing people of this is basically impossible. It doesn’t matter how good your argument is, if someone has ever seen an ad that relates to their previous voice conversation they are likely convinced and there’s nothing you can do to talk them out of it.")
I truly wish I had a bibliography to give you but it has been so obviously true to me that I hadn't bothered to catalogue all of this information. I'll try to get you started though. Start by familiarizing yourself with the Snowden leaks and how the government buys data from private companies to violate the constitution. Second, look for articles like this one: https://www.pcworld.com/article/2450052/do-smartphones-liste... This kind of thing is published periodically. Apple lost a lawsuit over Siri spying "inadvertently" very recently: https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-... There is no reason to believe that your phone is ever not listening. The audio can at least be transcribed and catalogued.
If companies are willing to track your every click and mouse movement, every footstep and slight movement you make with your phone even while you are asleep, build and bundle keyboard apps to capture what you type, monitor you with AI, etc., are you seriously surprised that they would not also listen to you? None of that stuff I just described is fiction. It's established tech that has been documented over time. The only reason it's not 100% illegal is because the EULA probably covers it.
I swear people who think they aren't listening when they can seem like people who would be shocked to learn that an armed carjacker might demand your wallet in addition to your car. Unreal...
Oh yeah one more tip. Try to use the data export feature from Google or Facebook. You might just be surprised what you find. I've heard of people finding recordings of private conversations picked up by Google devices. I personally found hundreds of Facebook messages and posts that I deleted with a tool, and aren't visible to anyone (OK maybe the messages make sense but not the posts).
> Apple lost a lawsuit over Siri spying "inadvertently" very recently
That's what my article is about: it's about how I'm certain people will use this settled-out-of-court lawsuit as "evidence" that Apple are spying and targeting ads, but it's very clear that's not what was happening here.
Apple settled because they knew they would lose. Winning is good PR since they (falsely) claim to be favorable for privacy. They are only superficially better than Google in that regard. I did skim your article and it's not as bad as I thought. I think you really mean it when you say these are just coincidences. But they're not.
Another one I forgot to mention: Google explicitly tracks your location history unless you turn it off, and you'd be foolish to think that they won't (or couldn't) save the data anyway. People have done experiments showing dramatic improvements in battery life using AOSP without Google telemetry and spyware.
I don't trust takeout features completely, honestly. Takeout only gives you YOUR data and not all your acquaintances' data, which can be assembled by companies you don't even know exist to profile YOU. The companies you deal with then have no obligation to share it with you, because to them they are only leasing access to data that they sold off or some crap. It's like how the government can't collect this data but they can buy it. The same trick is everywhere.
I seriously don't trust anything on a very deep level. Like I said, I've seen too much evidence that these companies are run by snakes that can only be trusted in certain ways. You might not agree, and I'm not prepared to argue all that tonight (I keep hitting the comment rate limit anyway). Just try to remain skeptical both ways if you don't believe we're being spied on, ok?
I hate corporations as much as the next guy (probably more than the next guy) but the argument that "it wasn't proven they were doing it which proves they were doing it" is probably the worst one you could have come up with tbh.
That case dragged on for 5 years, and ended up with them paying $95 million anyway. I think if they could have proved that they weren't doing it, they would have. Maybe I didn't say that clearly but it makes a lot of sense.
Apple spends a lot of money to keep its secrets. Paying $95 million to avoid letting people snoop in exactly how Apple systems work is a bargain, and I don't even think they're using audio for ad targeting.
Eh I don't think that is what happened here. If other companies want to know how Apple did things 5 years ago, they can just hire some ex-Apple employees. I think someone could build a competitive system with current technology for something in the ballpark of $95 million lol.
Good analogy. Just like anyone with a lick of sense can see the spherical Earth from an airplane, so can anyone see the absence of this network traffic from any network analyzer. It’s not there. It does not exist.
And nevermind the conspiratorial thinking required to believe whole teams of engineers are developing and maintaining this capability across several giant companies, but nobody ever puts it on a resume. Apparently the thousands of people working on this are all personally committed to complete secrecy, forever. Uh-uh.
>And nevermind the conspiratorial thinking required to believe whole teams of engineers are developing and maintaining this capability across several giant companies, but nobody ever puts it on a resume. Apparently the thousands of people working on this are all personally committed to complete secrecy, forever. Uh-uh.
Bro it doesn't matter how much evidence you provide people with that this IS happening. They usually won't accept it. If they do accept it, half the time they shrug it off with "I've got nothing to hide anyway" kind of cope.
I seriously think I'm arguing with employees of these companies on HN because all you people do is deny everything and smear people who talk about this stuff. I hate to break it to you, but conspiracies are real. Noticing that people are conspiring to do things that nobody likes is not unreasonable in any way.
Just because most adtech is the equivalent of Internet billboards on the side of the highway doesn't mean these systems aren't in place. You don't even need a very complex system when the entire device platform is designed to spy on you.
Both things can be true. Companies vacuum up massive amounts of personal data. And then they run it through crummy algorithms that are designed to increase the number of people who fit into a given category instead of accurately finding only the people who really ought to be in that category.
I agree with most of this, but have to take note that
>thousands of people working on this all personally committed to complete secrecy
Basically describes a LOT of government spying programs or horrific abuses that have happened, for instance.
Secrets can absolutely be held, and I wouldn't be surprised by even thousands of NDA'd engineers (who already have been doing this sort of thing for a loooong time) opting not to leak anything in a way that would be credible.
I'll reiterate that I'm skeptical of the overall conspiracy claims even though I usually believe in mass spying claims or institutions/corps/etc. being awful. I just think your argument there is pretty flawed, at least that aspect of it.
In fact, on why I'm skeptical:
I just can't shake this profound sense that it's like the "Frequency Illusion" phenomenon that I've demonstrated to people while driving or walking outside.
Or more likely a mix of it with people also getting prompted with what they "want" in the first place by all the advertising and targeted media and their various search history data.
A lot of things are happening now that happened before. That doesn't mean things don't improve; increased efficiency is the issue now. In 2017, maybe some simple algorithms, or a person, were intervening to drive ads; now AI is a big step change toward better targeting.
That was one of the points in the book. In 2017 or before, a surveillance state was limited by the number of people it takes to do the actual surveillance. Now AI increases that efficiency.
He talks a lot when on a book tour, you know, promoting a book.
There are a couple of months where he is on every podcast. Then he is gone again.
You know who else does this? Every other author on the planet. When a book is coming out, suddenly the author is on every podcast, show, talk, and debate they can manage to get on. It's a blitz.
You know what else happens on a book tour? They tend to give the highlights of the book. They don't sit there reading off references and citations. They give a streamlined, high-level idea of what the book is about, but not all of it, because they want people to go buy it.