> It's only now that everybody's used to natural language interfaces that I think we're becoming far less forgiving (...)
Sort of, kind of. Those natural language interfaces that Actually Work aren't even two years old yet, and in that time not many non-shit integrations have been released. So at best, some small subset of the population is used to the failure modes of the ChatGPT app - but at least the topic is out there, so future users will have better-aligned expectations.
Not working well to this day. Siri is mostly a joke and a meme, much like Alexa, Cortana and Google whatever-they-call-it-now.
Other stuff: yeah, you can get away with a lot using some formal grammar, a random number generator, and a lot of effort railroading the user so they won't even think to say anything that'll break the illusion. I've had people not realize for weeks that my IRC bot was a bot, even though all it did was apply a bunch of regular expressions to input (the trick was that the bot would occasionally speak unprompted, react to other people's conversations, and reply to common emoticons and words indicating emotion).
No. Only in the last two years can we say that there exists Speech-to-Text that works reliably, Text-to-Speech that sounds natural, and an ML model that can parse arbitrary natural-language text and reliably infer what you want, even if you never directly stated it and handling for it was never explicitly coded.
I think you are presenting a gradual, incremental change as some sort of binary transition, and it isn't. This is at best disingenuous and misleading, and at worst a flat-out lie.
Text-only natural language interfaces were working in the 1960s and working well by the 1980s.
Live real-time speech recognition with training was working by the turn of the century, and following a hand injury, I was dictating my work into freeware for a while by 2000 or so. It was bundled with later versions of IBM OS/2 Warp.
Real-time speaker-independent speech recognition started working usefully well some 15 years ago, and even as a sceptic who dislikes such things, I was using it and demonstrating it a decade ago. It's been a standard feature of mainstream commercial desktop OSes as well as smartphones for about 8-9 years. Windows 10 (2015) included Cortana; macOS Sierra (2016) included Siri.
In fact, after I posted my previous comment, this morning Facebook reminded me that 8 years ago today I was installing Win10 on my Core 2 Duo Thinkpad.
I don't allow any of these devices in my home, but they're a multi-billion-dollar market in domestic voice-controlled hardware.
This is mainstream, used by a double-digit percentage of humanity.
You seem to have been deceived by LLM bots' fakery of "intelligence" into believing they've achieved some quantum leap in smarts recently. This is illusory.
I must have really bad luck for voice control systems, because I've tried most of them, and there were only two that reliably worked for me:
1) Most recent breed of ML models, backed in part by LLMs;
2) A voice control system I hacked together some 17 years ago with the Microsoft Speech API and a cheap microphone I soldered to a long cable and stuck to the side of the wardrobe. It had the benefit of using a fixed grammar tree, and it was built in a saner time, when the vendor actually let you train the system on your own texts.
Everything else - especially current voice assistants and the dictation software on the couple of flagship phones I've used over the years - was garbage compared to that.
So again, I must be the unluckiest person in the world when it comes to voice recognition, because what you wrote is entirely counter to my experience.
I'm talking about what I see in the world. The people who own Apple watches and iPhones who rarely even take the phone out of their pocket/bag any more. The folks whose light switches are never used or who are even getting them removed. The people who don't use things like kitchen timers any more. The disappearance of physical or local music/video collections.
I think these trends are observable and widespread.
> The folks whose light switches are never used or who are even getting them removed.
Eh, I know an anecdote isn't a statistic, but my experience with voice controls is so bad that I've mostly gotten rid of the "smart" parts and gone back to the light switches.
Back when we had Alexa, we'd say "Alexa, Küche aus" ("kitchen off") and it would reply "Ich kann nicht 'Küche' im Spotify finden" ("I can't find 'Küche' on Spotify") - worse still, we didn't even have Spotify.
Siri is less bad, but it still just randomly fails. I've had two devices in the same room try to respond to the same voice command, one succeeds and the other spins around for a bit and responds with a spoken generic error.
For the voice input keyboard, the error rate is much worse - bad enough that, in my experience, I might as well have typed one real word and let autosuggest write the remainder of whatever I was not-typing.
I have watched my friends with this kit demonstrate it to me many times, and to me it seems clumsy, difficult, and error-prone, as well as expensive and horribly insecure - not so much violating privacy as gang-raping it.
I do not understand why they like it or think it better.
The only friend I know with a valid use case is blind, with no light perception. It's useful for him to just tell the room "turn the lights off" rather than hunt for the switch - or to ask whether they're on.