Every time I read some technical description about why this isn't happening, the technical description seems convincing.
However...
A friend tested the theory a few years ago. He doesn't own a swimming pool, doesn't want to, and has never expressed any desire to. He put his and his wife's phones on the table and said to his wife (loudly), "Why don't we look into pool fencing?" She agreed with him. Shortly after, on both of their phones, on a particular social network, they were inundated with ads for... pool fencing.
Think about what this implies. If your phone is listening, it’s listening all the time, right? So like 12-18 hours of continuous audio every day. That’s a lot of ad triggers. Way too many to actually be served with ads during your browsing time, which is a strict subset of your total audible proximity to your phone (plus ad inventory is a strict subset of what you view on your phone).
So how does the phone + ad networks decide which words to prioritize to trigger which ads when?
So for this anecdote to be true, not only would the phone have to be listening, but the targeting algorithm would need to decide to actively exclude all the other audible triggers from that time period, and fill your limited ad impression inventory with the one phrase you were intentionally testing.
How would it do that? Especially if this is indeed an outlier one-off topic of conversation that you cover in a single sentence. There would not be contextual clues (like repetition over time) that might indicate you are actually “in market” for a pool fence.
To me this is the problem with these anecdotal tests. You understood that that was an important phrase in the context of ad targeting. But how did the automated ad system know it should serve you ads on that topic, and not one of the many other advertisable topics you talk about over the course of several days? Or that your phone hears over several days?
1) App stores the trailing two minutes of speech in memory.
2) If the app detects a consumption-related trigger word, the related conversation is flagged for transmission to the server.
3) Flagged audio block is converted to text. Consumption related verbs ("buy", "purchase", etc) are identified. The syntax of the sentence clearly indicates which noun is the target of a given consumption-related verb ("new car", "pool fencing")
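For what it's worth, steps 1 and 2 above can be sketched in a few lines. This is only an illustration of the idea, not a description of any real app: the class name, the trigger word set, and the use of timestamped words as a stand-in for audio frames are all assumptions made to keep the example runnable. Step 3 (speech-to-text and parsing) is left out.

```python
from collections import deque

# Hypothetical trigger words; the comment above gives "buy" and "purchase" as examples.
TRIGGER_WORDS = {"buy", "purchase"}


class TrailingBuffer:
    """Keeps only the most recent `window_seconds` of (timestamp, word) pairs.

    A real pipeline would hold audio frames; words stand in for them here.
    """

    def __init__(self, window_seconds=120.0):
        self.window_seconds = window_seconds
        self.items = deque()

    def push(self, timestamp, word):
        self.items.append((timestamp, word))
        # Step 1: drop anything older than the trailing window.
        while self.items and timestamp - self.items[0][0] > self.window_seconds:
            self.items.popleft()

    def snapshot(self):
        return [word for _, word in self.items]


def process_stream(words_with_times, window_seconds=120.0):
    """Step 2: return the buffer snapshots that would be flagged for upload."""
    buf = TrailingBuffer(window_seconds)
    flagged = []
    for timestamp, word in words_with_times:
        buf.push(timestamp, word)
        if word.lower() in TRIGGER_WORDS:
            flagged.append(buf.snapshot())
    return flagged


if __name__ == "__main__":
    stream = [
        (0.0, "nice"), (1.0, "weather"),              # older than two minutes by the time of the trigger
        (200.0, "we"), (201.0, "should"), (202.0, "buy"),
    ]
    print(process_stream(stream))
```

Nothing leaves the buffer unless a trigger word shows up, which is the point of the design: most audio would be discarded on the device.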
Lots of people run network traffic sniffers to see what apps are doing. Lots of people decompile apps. Lots of people at companies leak details of bad things they are doing.
Why has nobody been able to demonstrate this beyond anecdotes about talking about swimming pools and then getting adverts for swimming pool stuff?
These are fair questions! I'm not convinced that it is happening. Nor am I convinced, as the parent seems to be, that it would be difficult to do.
edit: Having re-read my comment, I can see how it could easily be read to say "It's happening and this is how it works", whereas I intended to convey something like "It could easily be done and here's how." I have a bad habit of implying my point rather than stating it outright. I'm working on it!
I have a suspicion it’s not Facebook or Google listening in, but rather other third party apps. In fact it’s not even the third party apps but the libraries/frameworks they use to show ads.
Android shows when an app is using the microphone with a green indicator in the upper right corner -- I'm assuming iOS has something similar. How would apps get around this?
Easy: That indicator is not always on when the mic is.
Unless we're talking about an electret capsule with a physical LED wired into the supplying power rail that is switched off when the mic is not in use, you have to trust software.
And good luck with that after the patriot act. I am not implying the NSA has a microphone backdoor, but if they had and someone abused it, how would you know about it?
Listening for keywords and only sending text/audio when keywords are spoken isn't only good for ads; it would be the dream of any intelligence agency. And it has been a few years since Snowden.
But would you use it to show ads if you have access to such a backdoor?
That's an easy way for your backdoor to be found out. Something like this, if it exists, would be too valuable to be used en masse.
Well if I am a secret service I could either try to force google to do it and risk a leak or I could find a way that there is something in it for them that has the benefit of providing plausible deniability?
Yeah but that app was just nightmarishly bad, including an absolutely terrible approach to roll-your-own push notifications. Never attribute to malice that which can be explained by incompetence.
There will be no proof until somebody inside Apple who is in on the scam decides to grow a conscience and blow the whistle. Then they will be dismissed as a "disgruntled employee". Decompiling Siri probably will get you a lot of attention from very expensive lawyers that will make your life very interesting for a long while.
I can't even begin to tell you how many times I've been randomly having a conversation with someone, only to be alerted to the sound of the Google Assistant suddenly responding to what we're saying. Something we said was interpreted as a wake word, and then from that point on, every single thing we said was transcribed via STT, sent to Google's servers, various Google search queries were run, etc, and then the assistant responded - because it thought it was responding to a valid query and had no way of knowing otherwise. This has gotten worse with Gemini but has in no way been limited to that.
In this situation, I was alerted to this because the assistant started responding. However, I've also been in situations where I tried deliberately to talk to the assistant and it failed silently. In those situations, the UI spawns the Assistant interaction dialog, listens to what I say and then just silently closes. Sometimes this happens if there's too much background noise, for instance, and it then just re-evaluates that it wasn't a valid query at all and exits. Sometimes some background process may be frozen. Who knows if this happens before or after sending the data to the server. Sometimes the dialog lingers, waiting for the next input, and sometimes it just shuts off, leaving me (annoyingly) to have to reopen the dialog.
Putting that together, I have no idea how many times the Google Assistant has activated in my pocket, gone live, recorded stuff, sent it to Google's servers, realized it wasn't a valid query, and shut off without alerting me. I've certainly seen the Assistant dialog randomly open when looking at my phone plenty of times, which is usually a good indicator that such a thing has happened. If it silently fails in such a way that the UI doesn't respawn, then I would have no idea at all.
The net effect is that Google gets a random sample from billions of random conversations from millions of people every time this thing unintentionally goes off. They have a clear explanation as to why they got it and why ads are being served in response afterward. They can even make the case that the system is functioning as intended - after all, it'd be unreasonable to expect no false positives, or program bugs, or whatever, right? They can even say it's the user's fault and that they need to tune the voice model better.
Regardless, none of this changes the net result, which is they get a random sample of your conversation from time to time and are allowed to do whatever with it that they would have done if you sent it on purpose.
"Putting that together, I have no idea how many times the Google Assistant has activated in my pocket, gone live, recorded stuff, sent it to Google's servers, realized it wasn't a valid query, and shut off without alerting me."
Have you tried requesting an export from Google Takeout to see if there are answers in that data?
You may not be able to disassemble binaries or intercept network traffic but there are plenty of privacy researchers who can, and none of them have found anything.
That said, there's a much easier way to test this. Take two identical voice recognition smart devices (think Amazon Echos), register them each with a new never used Amazon account. Modify one of the devices to have a switch on its mic input which you leave off, and one which you leave on. See if the one with the mic on starts showing ads for things you've never searched for on that Amazon account. If the other one doesn't then there's your answer.
That sounds interesting enough that I might just give it a try.
Given how widespread this phenomenon is, and that it's been a decade, it's either the most technically complex undetected global conspiracy or it's not actually real[1].
Your voice is unique and can be fingerprinted to ID you (see Alexa devices). Add in things like positive sentiment analysis, changes in vocal inflection/intonation and context surrounding spoken products like purchase inference/intent and you can probably triangulate a threshold for showing products with high likelihood of purchasing intent.
Really smart people have been working on these things at Google for decades and that’s barely scratching the surface of this nuanced discussion. CPU/GPU has only gotten faster and smaller with more RAM available and better power management across the board for mobile devices.
Anything is possible if there is money to be made and it’s not explicitly illegal or better they can pay the fines after making their 100x ROI.
Embedded Audio ML engineer here (albeit mostly outside of speech). A modern MEMS microphone typically uses 0.8 mA in full performance mode at 1.8V. Doing basic voice activity detection, which is the first step of a continuous listening pipeline, can be done in under 1 mA. Doing basic keyword spotting is likely doable in 10 mA. But this is only done on the part that the voice activity module triggered on. Let's say that is 4 hours per day. Then basic speech recognition, for buying phrases and categorization, would maybe cost 100 mA. But say only 10% of the 4 hours = 0.4 hours have keywords triggered.
That would give a total power budget of (1.8 * 24) + (10 * 4) + (100 * 0.4) = 123 mAh per day. A typical mobile phone battery is 4000 mAh. People do not expect it to last many days anymore... So I would say that this is actually in the feasible range. And this is before considering the very latest in low power hardware. Like MEMS mics with 0.3 mA power consumption or lower, MEMS microphones with built-in voice activity detection, or low power neural processing units (NPU) that some microcontrollers now have.
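The arithmetic in this estimate is easy to reproduce. The sketch below just restates the comment's own figures (0.8 mA microphone plus roughly 1 mA voice activity detection running all day, 10 mA keyword spotting for 4 hours, 100 mA speech recognition for 0.4 hours). The variable names are mine, and like the comment it treats all currents as if drawn directly from the battery.

```python
# Figures from the comment above; rough estimates, not measurements.
MIC_MA = 0.8        # MEMS microphone, full performance mode
VAD_MA = 1.0        # voice activity detection (upper bound quoted)
KEYWORD_MA = 10.0   # keyword spotting
ASR_MA = 100.0      # basic speech recognition

HOURS_PER_DAY = 24.0
SPEECH_HOURS = 4.0               # assumed hours per day with voice activity
ASR_HOURS = 0.1 * SPEECH_HOURS   # 10% of speech time triggers recognition

BATTERY_MAH = 4000.0             # "typical" phone battery per the comment


def daily_budget_mah():
    """Sum of current (mA) times active hours for each pipeline stage."""
    always_on = (MIC_MA + VAD_MA) * HOURS_PER_DAY
    keyword = KEYWORD_MA * SPEECH_HOURS
    asr = ASR_MA * ASR_HOURS
    return always_on + keyword + asr


if __name__ == "__main__":
    total = daily_budget_mah()
    print(f"{total:.1f} mAh/day, {total / BATTERY_MAH:.1%} of a {BATTERY_MAH:.0f} mAh battery")
```

With these inputs the total is about 123 mAh, or roughly 3% of the battery per day. Changing `SPEECH_HOURS` or the recognition share shows how sensitive the result is to those two guesses.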
This is amazing thanks for doing the math. Didn’t realize the tech was feasibly there already off the shelf. I mean my Apple Watch can detect me saying “Hey Siri” all day with its puny battery.
If big tech isn’t doing this then it sounds like a huge startup idea worth $$$. I hope someone on here, in the spirit of HN, runs with it and blows the top off this topic once and for all, whether that shows it's monetizable or exposes the FAANG patent sharks that come out to play and silence them for infringing on their shady microphone tech.
Hah, that's another great argument against this being a real thing: where are the startup pitches?
If this targeting technique works and is feasible and legal and in demand by advertisers, why isn't there a competitive group of startups all trying to do it better than each other and sell the results?
Now the conspiracy theory has grown to include "dozens of companies compete at this, all of them secretively operating in a marketplace that is entirely invisible to the outside world."
Another question that comes to mind now: would this sort of technique run afoul of wiretapping laws in various states? One is not listening for a wake word to provide a direct response but rather to... idk. Just a random thought.
Thank you for taking the time to post this informative response. As a sibling comment posted, I didn't realize it was so feasible. When posting my original comment, I was thinking orders of magnitude more power would have been needed to facilitate this.
Could even switch on when you detect positive social signals implying you are around another person nearby using wifi, Bluetooth, gps, IP address, etc. to ID another device.
They could even pick up or recognize the second voiceprint ID and know it’s your best friend and wake up the audio recognition from ultra power saving mode or whatever. Literally anything is possible to make this work.
My phone can listen all day every day. It listens for "hey google" and it can listen and passively tell you songs that are playing. It's not outside the realm of possibility to do their audio fingerprinting on keywords and what not. The advertising potential makes it extremely juicy
Your phone can listen for “hey Google” because it’s only one phrase and the model can run at very low power on specialized hardware. If you want to add 1000 keywords the battery drain would be intense.
Pixel phones run song identification constantly now. They have a local database of the top 1000 (?) most popular songs. It has negligible impact on battery life.
Not saying I agree that 'phones are listening to show us ads', but technically we have the capability for that to happen (sampling audio every X intervals and matching against a local database of keywords)
Add at least two zeros to your number. Pixel phones can detect the top 11k songs while being offline (it used to be more). The fingerprint database for this is around 500 MB in size.
I think it is very easy to sneak a few (thousand) extra fingerprints in this database and do all kinds of tracking with it. All while the green microphone icon is disabled.
For argument’s sake, let’s be generous and stipulate your phone is listening for 11k keywords to serve you ads.
Why would “pool fencing” take up one of those valuable keyword slots on everyone’s phone?
And you’re going to see way less than 11k ads per day. Why would the ad server prioritize serving an ad for pool fencing (a phrase said once) over all the far more common topics a person talks about in a typical day, like movies, TV shows, food and drink, clothes, cars, consumer electronics, music, etc?
"look into" is a much more likely trigger, then send the 30 seconds before and after to a server for more analysis. "buying" could be another. It's not like it would be that hard. Especially with some of the pretty good vocal compression for audio. It would be a small blip on a modern connection, even wireless.
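To put a rough number on the "small blip": the size of a clip is just duration times bitrate. The bitrates below are my assumptions, chosen from the low range that speech codecs commonly operate in; the 30 seconds before and after come from the comment.

```python
def clip_size_bytes(seconds, bitrate_kbps):
    """Size of an audio clip in bytes at a constant bitrate (kilobits per second)."""
    return seconds * bitrate_kbps * 1000 / 8


if __name__ == "__main__":
    clip_seconds = 30 + 30  # 30 s before and 30 s after the trigger
    for kbps in (8, 16, 24):  # assumed speech-codec bitrates
        size_kb = clip_size_bytes(clip_seconds, kbps) / 1000
        print(f"{kbps} kbps -> {size_kb:.0f} kB per flagged clip")
```

At 8 kbps a one-minute clip is about 60 kB, which is smaller than many web page images. Whether such uploads would go unnoticed by people inspecting traffic is a separate question.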
I'm not saying it is or isn't happening but it wouldn't be hard.
Your argument assumes that the phone listening is the only source of information for the ad networks. But it would be much more complex than that. It would be only one of many signals used to serve the consumer the right advertisement at the right moment. So it doesn't need to have the exact phrase "pool fencing" in the database. It just needs to detect that something about pools, or swimming, etc. was talked about. Since Google has thousands of signals and statistics about this person (like browsing history, current location, the other smartphones that are nearby, and their histories, etc.), it can sell the ad space to "pool fencing" and expect a high click-through rate.
Selling ads is a bit like the current LLMs. It's just a stochastic parrot, that hallucinates stuff. But the stuff is often that advertisement that brings in the most money.
The self-expressed goal of this kind of test is to pick a phrase or topic that is so random that it escapes that person's existing ad data profile. As the comment above said, "He doesn't own a swimming pool, doesn't want to, and has never expressed any desire to."
So showing that person an ad for pool fencing is a complete waste; they're never going to click it. If that's what an alleged audio targeting system does, it would make the ad network less profitable than just using the data they already have. So why would anyone build it that way?
I don't know if phones listen to us to serve ads, but 11K is a decent vocab. Most adults have a vocab of 20K. Therefore I could imagine it including the words "pool" and "fencing".
Now Playing only has to sample for a few seconds every few minutes when the phone is powered on for other reasons (like to participate in cellular check-ins). This is because a song is typically several minutes long and you only have to fingerprint for a few seconds. It doesn't matter which few seconds. It's not continuously listening, so it's not the same thing at all.
The system knows to serve you ads about the new topic because it's new. You're already getting ads for the stuff you're normally talking about. The new topic stands out easily.
It doesn't have to be your phone. Could be your TV or any other device.
Most importantly there's just patterns of behavior. Companies are absolutely desperate for every scrap of data they can get on you. Why would they not capture audio from your mic?
You’re so right. We should just trust the computers in our pockets, hands, and nightstands 24/7/365 running proprietary operating systems, firmware, and sensor suites phoning home as much targeting data as they can possibly collect — but not that! What could they possibly gain from harvesting that?
Companies really are using tons of highly sensitive data to target ads, even when we sleep. But they're not generally using microphones to record audio to do it. Both things can be accurate statements.
>So how does the phone + ad networks decide which words to prioritize to trigger which ads when?
The same way they analyze your email and web searches. Basically, statistics.
>To me this is the problem with these anecdotal tests. You understood that that was an important phrase in the context of ad targeting. But how did the automated ad system know it should serve you ads on that topic, and not one of the many other advertisable topics you talk about over the course of several days? Or that your phone hears over several days?
Buddy, so many people have witnessed this happening for at least 10 years and even done experiments at this point that it's common knowledge. I know for a fact that one of my friends now has a phone that is especially receptive to hearing me say things around it, because our conversation topics ALWAYS come up in my searches, ads, and feeds shortly after. Think about that. Someone else's phone sends data to a cloud that I never gave permission to. It then puts that together with data from MY phone about where I was (perhaps even the devices chirping at each other!). The aggregation happens within a week then I see relevant ads. I've seen this happen dozens of times. It's no coincidence.
As far as the article, I'm not even going to read it. It's got to be stupid. We know from leaks, reverse-engineering, and personal experience that this spying is going on. I question the source of this article, but I suppose we should never underestimate the lengths someone will go to in order to feel that they are smarter than the rest of us with our eyes open.
I would be VERY interested to hear details of those leaks and that reverse-engineering. I've only ever heard the personal anecdotes.
(If you'd read my article you would have seen this bit at the top: "Convincing people of this is basically impossible. It doesn’t matter how good your argument is, if someone has ever seen an ad that relates to their previous voice conversation they are likely convinced and there’s nothing you can do to talk them out of it.")
I truly wish I had a bibliography to give you but it has been so obviously true to me that I hadn't bothered to catalogue all of this information. I'll try to get you started though. Start by familiarizing yourself with the Snowden leaks and how the government buys data from private companies to violate the constitution. Second, look for articles like this one: https://www.pcworld.com/article/2450052/do-smartphones-liste... This kind of thing is published periodically. Apple lost a lawsuit over Siri spying "inadvertently" very recently: https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-... There is no reason to believe that your phone is ever not listening. The audio can at least be transcribed and catalogued.
If companies are willing to track your every click and mouse movement, every footstep and slight movement you make with your phone even while you are asleep, build and bundle keyboard apps to capture what you type, monitor you with AI, etc., are you seriously surprised that they would not also listen to you? None of that stuff I just described is fiction. It's established tech that has been documented over time. The only reason it's not 100% illegal is because the EULA probably covers it.
I swear people who think they aren't listening when they can seem like people who would be shocked to learn that an armed carjacker might demand your wallet in addition to your car. Unreal...
Oh yeah one more tip. Try to use the data export feature from Google or Facebook. You might just be surprised what you find. I've heard of people finding recordings of private conversations picked up by Google devices. I personally found hundreds of Facebook messages and posts that I deleted with a tool, and aren't visible to anyone (OK maybe the messages make sense but not the posts).
> Apple lost a lawsuit over Siri spying "inadvertently" very recently
That's what my article is about: it's about how I'm certain people will use this settled-out-of-court lawsuit as "evidence" that Apple are spying and targeting ads, but it's very clear that's not what was happening here.
Apple settled because they knew they would lose. Winning is good PR since they (falsely) claim to be favorable for privacy. They are only superficially better than Google in that regard. I did skim your article and it's not as bad as I thought. I think you really mean it when you say these are just coincidences. But they're not.
Another one I forgot to mention: Google explicitly tracks your location history unless you turn it off, and you'd be foolish to think that they won't (or couldn't) save the data anyway. People have done experiments showing dramatic improvements in battery life using AOSP without Google telemetry and spyware.
I don't trust takeout features completely, honestly. Takeout only gives you YOUR data and not all your acquaintances' data, which can be assembled by companies you don't even know exist to profile YOU. The companies you deal with then have no obligation to share it with you, because to them they are only leasing access to data that they sold off or some crap. It's like how the government can't collect this data but they can buy it. The same trick is everywhere.
I seriously don't trust anything on a very deep level. Like I said, I've seen too much evidence that these companies are run by snakes that can only be trusted in certain ways. You might not agree, and I'm not prepared to argue all that tonight (I keep hitting the comment rate limit anyway). Just try to remain skeptical both ways if you don't believe we're being spied on, ok?
I hate corporations as much as the next guy (probably more than the next guy) but the argument that "it wasn't proven they were doing it which proves they were doing it" is probably the worst one you could have come up with tbh.
That case dragged on for 5 years, and ended up with them paying $95 million anyway. I think if they could have proved that they weren't doing it, they would have. Maybe I didn't say that clearly but it makes a lot of sense.
Apple spends a lot of money to keep its secrets. Paying $95 million to avoid letting people snoop in exactly how Apple systems work is a bargain, and I don't even think they're using audio for ad targeting.
Eh I don't think that is what happened here. If other companies want to know how Apple did things 5 years ago, they can just hire some ex-Apple employees. I think someone could build a competitive system with current technology for something in the ballpark of $95 million lol.
Good analogy. Just like anyone with a lick of sense can see the spherical Earth from an airplane, so can anyone see the absence of this network traffic from any network analyzer. It’s not there. It does not exist.
And nevermind the conspiratorial thinking required to believe whole teams of engineers are developing and maintaining this capability across several giant companies, but nobody ever puts it on a resume. Apparently the thousands of people working on this are all personally committed to complete secrecy, forever. Uh-uh.
>And nevermind the conspiratorial thinking required to believe whole teams of engineers are developing and maintaining this capability across several giant companies, but nobody ever puts it on a resume. Apparently the thousands of people working on this are all personally committed to complete secrecy, forever. Uh-uh.
Bro it doesn't matter how much evidence you provide people with that this IS happening. They usually won't accept it. If they do accept it, half the time they shrug it off with "I've got nothing to hide anyway" kind of cope.
I seriously think I'm arguing with employees of these companies on HN because all you people do is deny everything and smear people who talk about this stuff. I hate to break it to you, but conspiracies are real. Noticing that people are conspiring to do things that nobody likes is not unreasonable in any way.
Just because most adtech is the equivalent of Internet billboards on the side of the highway doesn't mean these systems aren't in place. You don't even need a very complex system when the entire device platform is designed to spy on you.
Both things can be true. Companies vacuum up massive amounts of personal data. And then they run it through crummy algorithms that are designed to increase the number of people who fit into a given category instead of accurately finding only the people who really ought to be in that category.
I agree with most of this, but have to take note that
>thousands of people working on this all personally committed to complete secrecy
Basically describes a LOT of government spying programs or horrific abuses that have happened, for instance.
Secrets can absolutely be held, and I wouldn't be surprised by even thousands of NDA'd engineers (who already have been doing this sort of thing for a loooong time) opting not to leak anything in a way that would be credible.
I'll reiterate that I'm skeptical of the overall conspiracy claims even though I usually believe in mass spying claims or institutions/corps/etc. being awful. I just think your argument there is pretty flawed, at least that aspect of it.
In fact, on why I'm skeptical:
I just can't shake this profound sense that it's like the "Frequency Illusion" phenomenon that I've demonstrated to people while driving or walking outside.
Or more likely a mix of it with people also getting prompted with what they "want" in the first place by all the advertising and targeted media and their various search history data.
A lot of things are happening now that happened before. That doesn't mean things don't improve. Increased efficiency is the issue now. In 2017, maybe some simple algorithms, or a person, were intervening to drive ads; now AI is a big step change toward better targeting.
That was one of the points in the book. In 2017 or before, a surveillance state was limited by the number of people it takes to do the actual surveillance. Now AI increases the efficiency.
He talks a lot when on a book tour, you know, promoting a book.
There are a couple of months where he is on every podcast. Then he is gone again.
You know who else does this? Every other author on the planet. When a book is coming out, the authors are suddenly on every podcast, show, talk, or debate they can manage to get on. It's a blitz.
You know what else happens on a book tour? They tend to give a highlight of the book. They don't sit there reading off references and citations. They give a streamlined high level idea of what the book is about, but not all of it, because they want people to go buy it.
Why did they pick a swimming pool? Did they see people in their area installing pools? I think that's often people's best guess, is that the "random" thing people use to test this actually isn't random and subconsciously they already had this topic seeded to them.
Something similar -- while on a family visit at my parents' house, my brother was talking about his upcoming Hawaii trip. Specifically, he was going over a snorkeling adventure he had signed up for.
For the next week or so, I got many ads on my phone about underwater packages for Hawaii, along with ads for various snorkeling and swimming gear. Now I had never researched any of that on my phone, but obviously my brother had. And the ad trackers saw that both my phone and his had communicated out over the same IP address (my parents' wifi) on other random internet connections, so that is probably why they were then targeting my tracker cookie with ads that would be related to his tracker cookie. (This is all technically "easy" for the trackers to do, and it seems logical that they would, because "why not").
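The shared-IP linking described here really is simple to express. A minimal sketch, assuming a tracker has a log of (cookie ID, IP address) sightings and a per-cookie interest profile; the function names and data shapes are made up for illustration and are not any ad network's actual system.

```python
from collections import defaultdict


def link_by_ip(sightings):
    """Map each cookie ID to the other cookie IDs seen on the same IP address.

    `sightings` is an iterable of (cookie_id, ip) pairs (hypothetical log format).
    """
    by_ip = defaultdict(set)
    for cookie_id, ip in sightings:
        by_ip[ip].add(cookie_id)

    linked = defaultdict(set)
    for cookies in by_ip.values():
        for cookie_id in cookies:
            linked[cookie_id] |= cookies - {cookie_id}
    return linked


def borrowed_interests(cookie_id, linked, interests):
    """Interests of co-located cookies that this cookie does not already have."""
    result = set()
    for other in linked.get(cookie_id, ()):
        result |= interests.get(other, set())
    return result - interests.get(cookie_id, set())


if __name__ == "__main__":
    # Documentation-range IP addresses, invented for the example.
    sightings = [
        ("me", "203.0.113.7"),
        ("brother", "203.0.113.7"),
        ("stranger", "198.51.100.2"),
    ]
    interests = {"brother": {"hawaii snorkeling"}, "stranger": {"pool fencing"}}
    linked = link_by_ip(sightings)
    print(borrowed_interests("me", linked, interests))
```

No microphone is involved anywhere in this: one shared wifi network is enough to make a sibling's searches show up as ads on your phone.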
On an unrelated note, I was making a peanut butter sandwich, started browsing some sites, and started getting ads for Skippy peanut butter. My phone must have smelled the peanut butter in the air.
Until my wife installed uBO, this was helpful in finding presents for her. Because I was using uBO, I could switch it off, browse Facebook, and all of my ads were being targeted at her. I could see everything she was considering buying.
It was impressively creepy and a good way of surprising her with something she hadn't said anything about but was considering buying.
You have hit the nail on the head, but it doesn't even need to be wifi and it also doesn't have to be that complicated. They see that two devices are in the same area for a long period of time, so they serve ads for other people's web history. Jarringly enough, ever since I had this realization I sometimes see ads that reveal something meant to be private about those I had just spent time with. Also, seeing people's ads when they make desktop recordings of their screen can be extremely telling.
Exactly. Years ago, I did due diligence on a company that targets ads using IP address. They buy IP address data from ISPs, target ads based on demographics, and then use cookies to retarget. Not that far off of what you're describing.
I know my iPhone isn’t listening to me. And I know about my friend’s activity influencing the ads I get served, and my demographic, and location, and all of that. And my random idea for a test word being predictable in a shocking way.
But, recently I started thinking about the average user, who will install anything and approve any permissions requested without reading it. And imperfect App Store reviews approving a Trojan horse accidentally.
Am I positive someone hasn’t inadvertently allowed mic access to a malicious party? I wonder if that person’s phone may, in fact, be listening to them.
No, they deliberately chose a topic they had absolutely no interest in, to try and avoid confirmation bias. It’s not impossible that what you describe is actually what happened to an extent though, a lot of the recommendations and ads on FB do seem to have a “what people around you / in your network like” factor.
You also need to do the opposite experiment
Have two people put their phones in the microwave (don't run it!) / turn them off, discuss swimming pools (or jacuzzis etc), and see if you suddenly start seeing ads for said thing. And then you would have to repeat this experiment several times to rule out outliers
Experiment design is important! I completely believe that this happened to your friends and I also don't think it means what you/they think it does
(That is: you need to completely isolate yourself; music practice room on a college campus where nobody is wearing a watch or phone and repeat the experiment. If it turns out that you still see ads for that thing, then the experiment didn't prove anything)
Confirmation bias makes it hard to extract much from these types of anecdotes. On a daily basis you might be talking about dozens of products. If your lookback period is a few days, that could be hundreds of products, and you'll get spooky coincidences popping up from time to time by pure chance alone.
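To put rough numbers on that, here is a minimal sketch. It assumes each product you mention has some small, independent chance of matching an ad you were going to see anyway; the function name and the 1% figure are made up for illustration, not measured values.

```python
def chance_of_spooky_match(products_mentioned: int, p_match: float) -> float:
    """Probability that at least one mentioned product matches an ad by chance.

    Assumes every mention is an independent trial with the same (hypothetical)
    probability p_match of coinciding with an ad shown in the lookback window.
    """
    return 1.0 - (1.0 - p_match) ** products_mentioned


if __name__ == "__main__":
    # Hypothetical inputs: 300 products mentioned over a few days, 1% chance each.
    print(f"{chance_of_spooky_match(300, 0.01):.1%}")
```

Under those assumed inputs at least one coincidence is close to certain, even though any single match looks striking on its own.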
Technological causes are much more likely than accidental causes for such effects to appear, in today's world.
Occam's Razor and the answer to the question, "What kinds of companies are at work in the environment?" push that probability in a specific way, because the motives and means are definitely there. Do you think they are the kinds of companies that would waste such an opportunity?
Their Chief Counsel's recommendation depends on how slimy they are, right?
What would happen if they got caught? Slap on the wrist would be all, if that, no?
>Technological causes are much more likely than accidental causes for such effects to appear, in today's world.
This is absurd. The chance of rolling snake eyes twice in a row is about 0.08%. However, that doesn't mean that if I do get snake eyes back to back, I should think it's caused by "Technological causes" (aliens? CIA remotely controlling the dice?). At best, it's an incomplete argument. The power of the birthday paradox, along with the factors I explained in my previous comment, means such occurrences are virtually guaranteed to occur if you're on the lookout for them. This can't be dismissed with an off-hand "Technological causes are much more likely than accidental causes for such effects to appear, in today's world".
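For reference, the snake eyes figure is (1/36)^2 = 1/1296, roughly 0.077%. A small sketch of that arithmetic, plus how many attempts it takes before a double snake eyes becomes more likely than not (the helper name is made up for illustration, and attempts are assumed independent):

```python
from fractions import Fraction

P_SNAKE_EYES = Fraction(1, 36)        # both dice show a one
P_TWICE_IN_A_ROW = P_SNAKE_EYES ** 2  # two independent rolls: 1/1296


def attempts_for_even_odds(p: Fraction) -> int:
    """Smallest number of independent attempts with at least a 50% chance of one success."""
    attempts = 0
    p_all_fail = Fraction(1)
    while p_all_fail > Fraction(1, 2):
        p_all_fail *= 1 - p
        attempts += 1
    return attempts


if __name__ == "__main__":
    print(P_TWICE_IN_A_ROW, f"{float(P_TWICE_IN_A_ROW):.3%}")
    print(attempts_for_even_odds(P_TWICE_IN_A_ROW))
```

A rare event per attempt stops being rare once enough people are making attempts, which is the point about being on the lookout.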
>Occam's Razor and the answer to the question, "What kinds of companies are at work in the environment?" push that probability in a specific way, because the motives and means are definitely there. Do you think they are the kinds of companies that would waste such an opportunity?
Apple got sued for accidentally recording Siri queries, and that cost them a class action lawsuit, along with the requisite discovery. Some company doing this intentionally, all the while actively engaging in a conspiracy, would be far harder to pull off and much more likely to fall apart.
If you read my article I quote that exact paragraph, and then call out the Ars Technica reporter for the way they misleadingly rephrased this similar paragraph from the Reuters report that Ars Technica cite as their source:
> One Siri user said his private discussions with his doctor about a “brand name surgical treatment” caused him to receive targeted ads for that treatment, while two others said their discussions about Air Jordan sneakers, Pit Viper sunglasses and “Olive Garden” caused them to receive ads for those products.
The whole point of my article is that these random claims that were part of the original lawsuit in 2021 are being mindlessly quoted as if they are proven facts, when they are not. Here's my link again: https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not...
I'm not the guy, but you never bother trying to prove anything; you just ramble about vibes most people will implicitly agree with. “Oh boy rich people sure do suck and companies sure do suck.” “Aww geez my wife got a targeted ad.” I loosely figure that they spy too, but I expected a better showing from HN in rebuttal to the article. If it’s so damn obvious then show some actual proof.
>No, more like Siri, bruh. To sell some too-expensive slippers. You used the word "absurd", didn't ya?
The dice analogy is clearly a separate scenario from your Gucci slippers story, and I'm not sure why you're trying to bring Siri into that analogy. The point is that you can't just invoke "Technological causes are much more likely than accidental causes for such effects to appear, in today's world" without justification. If there were spooky stuff happening at a rate far higher than would be expected by pure chance, and there was proper documentation of this, I might be amenable to the above argument, but you haven't shown that. In previous comments I listed multiple issues with relying on random anecdotes, but you failed to rebut them.
>Sure. Apple does things by accident. Gotcha.
Is it really so unbelievable that automatic speech recognition would have false positives? There are plenty of things to criticize about Apple's behavior in that case (e.g. not taking steps to ensure audio from accidental triggers is deleted), but the implication that they're intentionally doing it is totally unsupported.
>Motive, means, opportunity. Which is missing?
Proof that a crime has actually been committed. With that logic we should be arresting people for murder every time their ex/spouse goes missing, even if there's no evidence that foul play occurred. If it's been actually proven (ie. more rigorous evidence than random anecdotes) that people are getting targeted advertisements based on their conversations, maybe we can start assigning blame, but we haven't even established that's happening yet.
Why can't you both be right? If you talk about 100 products in a week, chances are you're conducting searches about some of them or your demographic data suggests that you might be in market for it.
We don't talk about products. We live very, very simply. We watch very little fiction. We mostly listen to music and watch a handful of soccer highlights. No Microsoft, no ads on the 'puters, no kids on phones or the net except for chess.
We did watch three Tom Papa stand-up shows on NYE, though. They were glorious; we were crying laughing. "Outrageous!" I love that guy, so brilliant, silly, and hilarious!
Something similar happened to me with backpack zippers. It convinced me the phone is listening and serving me ads despite the technical explanations that it isn't.
I was walking to work and my backpack zipper broke getting off the elevator. When I got to my cube I set my phone on the desk and said to my coworker, "damn, my backpack zipper just broke!" 45 minutes later I was in a meeting and checked my phone, and backpack zipper ads appeared. I had never googled backpack zippers before, never seen backpack zipper ads.
Literally the only thing preceding these ads was telling my coworker that mine had just broken.
But this is just selection bias. If a hundred people do that and one gets an ad, that one case gets treated as proof. Never mind the 99 others who never saw a thing and wouldn’t bother posting.
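A rough simulation of that reporting filter, as a sketch only. It assumes nobody's phone is listening, that each tester has the same independent chance of seeing a matching ad anyway, and that only testers who see the ad post about it. The function name and parameters are hypothetical.

```python
import random


def simulate_reports(num_testers: int, p_chance_ad: float, seed: int = 0) -> dict:
    """Simulate testers who say a phrase near their phone, with no listening at all.

    Each tester sees a matching ad purely by chance with probability p_chance_ad.
    Only the "hits" write a spooky story; the misses stay silent.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_testers) if rng.random() < p_chance_ad)
    return {
        "hits": hits,
        "misses": num_testers - hits,
        "posted_stories": hits,  # every visible anecdote is a hit
    }


if __name__ == "__main__":
    print(simulate_reports(num_testers=1000, p_chance_ad=0.01))
```

Whatever the split between hits and misses, every story that gets posted is a hit, so reading the posted stories alone tells you nothing about the underlying rate.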
The most striking example happened to me while watching a documentary about Siberian cats.
We were watching it in Italian, our main language, and I wanted to know more about it. As soon as I typed "g", the first result was "gatto siberiano", exactly the cat I was looking for. Way too specific.
Another time, my girlfriend said she was interested in how much a specific model of watch owned by a friend of hers cost, and the very same thing happened: as I typed the first few letters, that very watch brand and model appeared.
Since then, I just don't care about how much technical description I can read, nothing's gonna convince me of it being a coincidence.
The only way to test this would be to have your anecdote together with the complete marketing profiles of your friend and his wife. If such a profile could even be compiled in principle, from it we would be able to tell whether your friend or his wife had generated any non-audio pool-related signals, or whether they had seen other pool-related ads recently. Also, it'd be nice to know how often people in their marketing categories receive ads for pool fencing. Could be an astonishing coincidence.
It’s definitely a difficult one to test in a scientific way. But they 100% had no interest in pool fencing, living long-term in a rental townhouse. They chose the phrase specifically to be something they had no interest or search history in.
I’m fascinated that this urban legend persists among tech people because it’s so easy to disprove.
Did you know that you can set up a proxy from your phone and capture all traffic from it? It would be so trivial to find the traffic from your phone. There are ways to MITM and inspect the traffic, too.
There are also many people doing static reverse engineering of phone apps looking for security vulnerabilities. To believe this urban legend, you’d also have to believe that none of them have ever encountered this hidden voice analysis code.
If we ignore that, you know there are OS-level security controls on apps, right? iOS and Android don’t make it easy for apps to use the microphone constantly and run in the background to process it.
Finally, if we ignore all of that, how can anyone believe that these companies are recording conversations but none of their employees have ever chosen to blow the whistle? We’ve seen numerous FAANG “whistleblowers” come through with everything down to trivial or baseless complaints, but nobody has blown the whistle on these supposed widespread spying programs?
The whole urban legend is preposterous to anyone who has any experience with apps or phone security, let alone common traffic analysis or reverse engineering tools. I don’t understand why the myth is so persistent among even some technical people.
I'm not sure if the legend is true or not. But this argument doesn't really disprove it. The devices don't need to send full audio recordings. They are powerful enough these days that they can do a cheap on-device audio analysis and tagging, and then upload the (very small) tags. It doesn't need to be Siri quality analysis because it doesn't matter if the analysis is incomplete or sometimes inaccurate. They would just be scanning for certain keywords.
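To make the "small tags" idea concrete, here is a purely hypothetical sketch. It assumes some on-device speech-to-text step has already produced a transcript, and it only shows how tiny the resulting keyword payload would be. The keyword list and function name are invented for illustration; this is not code from any real app.

```python
import json
import re

# Hypothetical advertiser keyword list; a real one would be far larger.
AD_KEYWORDS = frozenset({"pool fencing", "backpack", "ramen"})


def extract_tags(transcript: str, keywords=AD_KEYWORDS) -> set:
    """Return the keywords that occur as whole words or phrases in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    padded = " " + " ".join(words) + " "
    # Padding with spaces stops "pool" from matching inside "carpool".
    return {kw for kw in keywords if f" {kw} " in padded}


if __name__ == "__main__":
    tags = extract_tags("Why don't we look into pool fencing?")
    payload = json.dumps(sorted(tags))
    print(payload, len(payload.encode("utf-8")), "bytes")
```

A payload like this is only a few dozen bytes, so it would be far harder to single out in a traffic capture than a stream of raw audio.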
As for whistleblowing... Is there really that much to whistleblow about it? We already know that ad-based companies like Google are collecting our data every chance they get, because they make billions of dollars from it. They're scraping our emails, studying our GPS location, paying attention to who we are in proximity with, etc. The level of surveillance is incredible and people don't really care. It wouldn't be headline news to find out that they are taking advantage of yet another side channel.
>Did you know that you can set up a proxy from your phone and capture all traffic from it?
The phone knows about your proxy. There are phones, actual brands, that were caught sending secret telemetry to their manufacturer, but only when nobody was listening in: definitely only on mobile data, never on wifi, and I assume with cert pinning.
I know a person who was researching this and they needed a Faraday cage and a BTS to conduct experiments. So it's not exactly trivial.
The difference is that these were small Chinese brands that were not even that popular in my country, and still someone researched this. Imagine how much research Android and iPhone get, and there's not a single proof of any wrongdoing. Now that is unlikely.
This is just flat earth for technophiles. They don't really want to know the truth, they just want to enjoy their fantasy of living a conspiracy theory.
It is interesting how people always come up with anecdotes like this but none of them try repeating the experiment multiple times.
You might think the pool fencing example might be an extreme coincidence, but far weirder things happen every day. And what made your friend consider pool fencing as an example if they don't like pools? Maybe something they saw recently gave them the idea? Hmm...
Had this happen. “Airport tier tar” was the phrase someone said near me. Saw ads on Instagram for such a niche thing the next morning. Not only did I see ads, they were insanely local. I have never needed to buy tar.
Then there's the time a friend told me about a very specific brand of ramen; I opened up Facebook, and there it was, the very first ad.
There is a video of Zuck denying that they are "recording people's microphones", but given how he said it with a smirk, I took him to mean "we do on-device transcription and only send back keywords".
You see thousands of ads of every type every day and ignore them. Now you’re doing a test and consciously looking for ads related to pools. Of course you’re going to find something.
It's called a noise gate: basic audio gear that triggers a function based on SPL (sound pressure level), which would be a reliable way to trigger a capture event and... the rest, without listening to everything. A change in tempo or pitch could also be good for an "event trigger".
Digital audio goes back to the 1980s,
and the full suite of capabilities is trivial for any phone, as these functions are already integrated extensively to cancel background noise anyway.
And with so-called digital voice assistants running, I can't be surprised.
My main point would be that everyone is convinced that their phones are spying on them.
It's one more thing to make them flinch
and grimace, and arguing about it will only draw deeper lines. And that is where we are.
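The noise-gate idea above can be sketched in a few lines. This is a simplified illustration, not how any particular phone does it: true SPL needs a calibrated microphone, so the sketch uses level relative to digital full scale (dBFS) as a stand-in, and it leaves out the attack, hold and release timing that real gates have. The frame size and threshold are arbitrary assumptions.

```python
import math


def frame_level_dbfs(frame) -> float:
    """RMS level of a frame of samples in [-1.0, 1.0], in dB relative to full scale."""
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def gate_open_frames(samples, frame_size: int = 160, threshold_db: float = -30.0):
    """Return the indices of frames loud enough to open the gate."""
    open_frames = []
    for index, start in enumerate(range(0, len(samples), frame_size)):
        frame = samples[start:start + frame_size]
        if frame_level_dbfs(frame) > threshold_db:
            open_frames.append(index)
    return open_frames


if __name__ == "__main__":
    # One silent frame, one loud frame, one very quiet frame.
    audio = [0.0] * 160 + [0.5] * 160 + [0.001] * 160
    print(gate_open_frames(audio))
```

Only frames that open the gate would need any further processing, which is what would let a trigger like this run without analysing everything.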
So...... the listening isn't very good? Because recommending a swimming pool simply based on the single word pool is just terrible.
Either they have the most technically impressive spying system that can't do anything right, or it's just not happening and people are making connections where there aren't really any.
I’m not sure what point you’re attempting to make here, but they chose the phrase “pool fencing” and were rapidly inundated with ads for pool fencing, which, in isolation, would suggest the listening is extremely accurate.
We haven't done an experiment like that, but my family has kept track of which medium certain topics were discussed over, and the topics that showed up in startlingly relevant ads were ones we had discussed only verbally, never typed into a search field or connected to any other web interaction.
That's kind of the smoking gun when you can create a disjoint set of topics and a disjoint set of mediums of communication delivery and see what shows up in the ad space from those discussion topics strictly expressed verbally.
But why did he pick pools? What if he lives in an upper middle class suburban neighborhood where everybody has pools? And what if he slowed his scrolling just a little too much on a pool ad on Instagram? What if he actually, kinda does think about getting a pool?
Who knows.
I'm just saying, the technical, ethical, and legal implications of creating an ad network that surreptitiously slurps up audio 24 hours a day in violation of the claimed terms of service without anybody leaking anything about it is a conspiracy that seems less likely than people just being more predictable than they would like to believe.
Whatever made him use pool fencing as his random example is probably also why the ad showed up. Maybe it's the season for that stuff, he saw other ads earlier, or other friends talked about it. He may not consciously remember that, but it could make him more likely to think of it again later. In other words he talked about it because of the ads, not the other way around.
...unless there were actually several thousand people who performed this experiment, got a negative result, and therefore don't remember it or post anything about it.
> Every time I read some technical description about why this isn't happening, the technical description seems convincing.
Having knowledge of the technical limitations and challenges myself, I used to be on board for this argument, but now less so.
All of the technical arguments against the listening seem to ignore the "Ok, <DEVICE>" or "Hello, <DEVICE>" initiating phrases for the voluntary surveillance devices people put in their rooms, and offer only a worst case defense ~"how could they process everything everyone is saying?!"
Why is it such a stretch to imagine these devices grab Direct Objects and Subjects and store those singular items for ad keywording?
We have cookies and know how they work, why is it difficult to extrapolate?
simonw is a breathless proselytizer of LLMs and likely is suffering from "a man's salary depending on misunderstanding" and all that.
It bears repeating, "these corpos are raising billions and hiring former alphabet heads to their boards for reasons other than just making you a better programming assistant."
> Why is it such a stretch to imagine these devices grab Direct Objects and Subjects and store those singular items for ad keywording?
How about because Apple say they don't do that, and can and do get sued if they say things like that which are not true?
(Sadly I make basically no money at all from my "breathless proselytizing" of LLMs. I hope to fix that this year, someone should pay me for this stuff!
You know I've written more negative things about LLMs than almost anyone else, right? 121 posts tagged AI ethics right here: https://simonwillison.net/tags/ai+ethics/ )
Obviously an experimental advertising system would require a special deal with the phone manufacturer and would be covered by Apple's very broad license to advertise based on your data.
That doesn't mean they're doing it, of course. I'm just not aware of technical or legal barriers.