We've been developing a new method of building software using a cloud IDE (a slightly modified VS Code server), https://github.com/bitswan-space, which breaks the development process down into independent "Automations" that each run in a separate container. Automations are also developed within containers. This allows you to break development into parts and safely experiment with AI. This feels like the "Android moment" where the old non-isolated way of developing software (on desktops) becomes unsafe, and we need to move to a new system with actual security and isolation between processes.
In our system, you can launch a Jupyter server in a container and iterate on software in complete isolation. Or launch a live preview react application and iterate in complete isolation. Securely isolated from the world. Then you deploy directly to another container, which only has access to what you give it access to.
It's still in the early stages. But it's interesting to sit at this tipping point for software development.
People blaming the user and defending the software: is there any other program where you would be ok with it erasing a whole drive without any confirmation?
If that other program were generating commands to run on your machine by design and you configured it to run without your confirmation, then you should definitely feel a lil sheepish and share some of the blame.
This isn't like Spotify deleting your disk.
I run Claude Code with full permission bypass and I’d definitely feel some shame if it nuked my ssd.
The installation wizard gives a front and center option to run in a mode where the user must confirm all commands, or more autonomous modes, and they are shown with equal visibility and explained with disclaimers.
If you decide to let a stochastic parrot run rampant on your system, you can't act surprised when it fucks shit up. You should count on it doing so and act proactively.
I love how a number crunching program can be deeply humanly "horrorized" and "sorry" for wiping out a drive. Those are still feelings reserved only for real human beings, and not computer programs emitting garbage. This is vibe insulting to anyone who doesn't understand how "AI" works.
I'm sorry for the person who lost their stuff but this is a reminder that in 2025 you STILL need to know what you are doing and if you don't then put your hands away from the keyboard if you think you can lose valuable data.
That would be a silly argument because feelings involve qualia, which we do not currently know how to precisely define, recognize or measure. These qualia influence further perception and action.
Any relationships between certain words and a modified probabilistic outcome in current models is an artifact of the training corpus containing examples of these relationships.
I contend that modern models are absolutely capable of thinking, problem-solving, expressing creativity, but for the time being LLMs do not run in any kind of sensory loop which could house qualia.
One of the worst or most uncomfortable logical outcomes of
> which we do not currently know how to precisely define, recognize or measure
is that if we don't know if something has qualia (despite externally showing evidence of it), morally you should default to treating it like it does.
Ridiculous to treat a computer like it has emotions, but breaking down the problem into steps, it's incredibly hard to avoid that conclusion. "When in doubt, be nice to the robot".
> qualia, which we do not currently know how to precisely define, recognize or measure
> which could house qualia.
I postulate this is a self-negating argument, though.
I'm not suggesting that LLMs think, feel or anything else of the sort, but these arguments are not convincing. If I only had the transcript and knew nothing about who wiped the drive, would I be able to tell it was an entity without qualia? Does it even matter? I further postulate these are not obvious questions.
Unless there is an active sensory loop, no matter how fast or slow, I don't see how qualia can enter the picture
Transformers attend to different parts of their input based on the input itself. Currently, if you want to tell an LLM it is sad, potentially altering future token prediction and labeling this as "feelings" which change how the model interprets and acts on the world, you have to tell the model that it is sad or provide an input whose token set activates "sad" circuits which color the model's predictive process.
You make the distribution flow such that it predicts "sad" tokens, but every bit of information affecting that flow is contained in the input prompt. This is exceedingly different from how, say, mammals process emotion. We form new memories and brain structures which constantly alter our running processes and color our perception.
It's easy to draw certain individual parallels to these two processes, but holistically they are different processes with different effects.
A lot of tech people online also don't know how to examine their own feelings, and so think they are mysterious and un-defined.
When really they are an actual feedback mechanism, that can totally be quantified just like any control loop. This whole 'unknowable qualia' argument is bunk.
> That would be a silly argument because feelings involve qualia, which we do not currently know how to precisely define, recognize or measure.
If we can't define, recognize or measure them, how exactly do we know that AI doesn't have them?
I remain amazed that a whole branch of philosophy (aimed, theoretically, at describing exactly this moment of technological change) is showing itself up as a complete fraud. It's completely unable to describe the old world, much less provide insight into the new one.
I mean, come on. "We've got qualia!" is meaningless. Might as well respond with "Well, sure, but AI has furffle, which is isomorphic." Equally insightful, and easier to pronounce.
> If we can't define, recognize or measure them, how exactly do we know that AI doesn't have them?
In the same way my digital thermometer doesn't have qualia. LLMs do not either. I really tire of this handwaving of 'magic' concepts into LLMs.
Qualia being difficult to define and yet being such an immediate experience that we humans all know intimately and directly is quite literally the problem. Attempted definitions fall short and humans have tried and I mean really tried hard to solve this.
> In the same way my digital thermometer doesn't have qualia
And I repeat the question: how do you know your thermometer doesn't? You don't, you're just declaring a fact you have no basis for knowing. That's fine if you want a job in a philosophy faculty, but it's worthless to people trying to understand AI. Again, cf. furffle. Thermometers have that, you agree, right? Because you can't prove they don't.
You're just describing panpsychism, which itself is the subject of much critique due to its nonfalsifiability and lack of predictive power. Not to mention it ignores every lesson we've learned in cognition thus far.
A thermometer encoding "memory" of a temperature is completely different than a thermometer on a digital circuit, or a thermometer attached to a fully-developed mammalian brain. Only the latter of this set for sure has the required circuitry to produce qualia, at least as far as I can personally measure without invoking solipsism.
It's also very silly to proclaim that philosophy of mind is not applicable to increasingly complex thinking machines. That sounds like a failure to consider the bodies of work behind both philosophy of mind and machine cognition. Again, "AI" is ill-defined and your consistent usage of that phrase instead of something more precise suggests you still have a long journey ahead of you for "understanding AI".
Have you considered that you just don't fully understand the literature? It's quite arrogant to write off the entire philosophy of mind as "a complete fraud".
> It's completely unable to describe the old world, much less provide insight into the new one.
What exactly were you expecting?
Philosophy is a science, the first in fact, and it follows a scientific method for asking and answering questions. Many of these problems are extremely hard and their questions are still yet unanswered, and many questions are still badly formed or predicated on unproven axioms. This is true for philosophy of mind. Many other scientific domains are similarly incomplete, and remain active areas of research and contemplation.
What are you adding to this research? I only see you complaining and hurling negative accusations, instead of actually critically engaging with any specifics of the material. Do you have a well-formed theory to replace philosophy of mind?
> I mean, come on. "We've got qualia!" is meaningless. Might as well respond with "Well, sure, but AI has furffle, which is isomorphic." Equally insightful, and easier to pronounce.
Do you understand what qualia is? Most philosophers still don't, and many actively work on the problem. Admitting that something is incomplete is what a proper scientist does. An admission of incompleteness is in no way evidence towards "fraud".
The most effective way to actually attack qualia would be to simply present it as unfalsifiable. And I'd agree with that. We might hopefully one day entirely replace the notion of qualia with something more precise and falsifiable.
But whatever it is, I am currently experiencing a subjective, conscious experience. I'm experiencing it right now, even if I cannot prove it or even if you do not believe me. You don't even need to believe I'm real at all. This entire universe could all just be in your head. Meanwhile, I like to review previous literature/discussions on consciousness and explore the phenomenon in my own way. And I believe that subjective, conscious experience requires certain elements, including a sensory feedback loop. I never said "AI can't experience qualia", I made an educated statement about the lack of certain components in current-generation models which imply to me the lack of an ability to "experience" anything at all, much less subjective consciousness and qualia.
Even "AI" is such a broadly defined term that such a statement is just ludicrous. Instead, I made precise observations and predictions based on my own knowledge and decade of experience as a machine learning practitioner and research engineer. The idea that machines of arbitrary complexity inherently can have the capability for subjective consciousness, and that specific baselines structures are not required, is on par with panpsychism, which is even more unfalsifiable and theoretical than the rest of philosophy of mind.
Hopefully, we will continue to get answers to these deep, seemingly unanswerable questions. Humans are stubborn like that. But your negative, vague approach to discourse here doesn't add anything substantial to the conversation.
I would add I find it difficult to understand why so few have even a basic level of philosophical understanding. The attitude of being entirely dismissive of it is the height of ignorance I'm sure. I would presume few would be able to define then what Science actually is.
Not according to Zombie Feynman it isn't[1] (someone else can dig up the link). Case in point:
> Do you understand what qualia is? Most philosophers still don't
It's a meaningless word. It's a word that gives some clean construction around closely-held opinions about how life/consciousness/intelligence/furffle/whatever works. So it's a valuable word within the jargon of the subculture that invented it.
But it's not "science", which isn't about words at all except as shorthand for abstractions that are confirmed by testable results.
"Qualia", basically, is best understood as ideology. It's a word that works like "woke" or "liberal" or "fascist" or "bourgeoisie" to flag priors about which you don't want to argue. In this case, you want people to be special, so you give them a special label and declare a priori that it's not subject to debate. But that label doesn't make them so.
[1] Of course. You can recursively solve this problem by redefining "science" to mean something else. But that remains very solidly in the "not science" category of discourse.
Modern lingo like this seems so unthoughtful to me. I am not old by any metric, but I feel so separated when I read things like this. I wanted to call it stupid but I suppose it's more pleasing to 15 to 20 year olds?
Eh, one's ability to communicate concisely and precisely has long (forever?) been limited by one's audience.
Only a fairly small set of readers or listeners will appreciate and understand the differences in meaning between, say, "strange", "odd", and "weird" (dare we essay "queer" in its traditional sense, for a general audience? No, we dare not)—for the rest they're perfect synonyms. That goes for many other sets of words.
Poor literacy is the norm, adjust to it or be perpetually frustrated.
No need to feel that way, just like a technical term you're not familiar with you google it and move on. It's nothing to do with age, people just seem to delight in creating new terms that aren't very helpful for their own edification.
Eh, I think it depends on the context. A production system of a business you’re working for or anything where you have a professional responsibility, yeah obviously don’t vibe command, but I’ve been able to both learn so much and do so much more in the world of self hosting my own stuff at home ever since I started using llms.
This is akin to a psychopath telling you they're "sorry" (or "sorry you feel that way" :v) when they feel that's what they should be telling you. As with anything LLM, there may or may not be any real truth backing whatever is communicated back to the user.
Not so much different from how people work sometimes though - and in the case of certain types of psychopathy it's not far at all from the fact that the words being emitted are associated with the correct training behavior and nothing more.
Aren't humans just doing the same? What we call thinking may just be next action prediction combined with realtime feedback processing and live, always-on learning?
It's not akin to a psychopath telling you they're sorry. In the space of intelligent minds, if neurotypical and psychopath minds are two grains of sand next to each other on a beach then an artificially intelligent mind is more likely a piece of space dust on the other side of the galaxy.
Start with LLMs are not humans, but they’re obviously not ‘not intelligent’ in some sense and pick the wildest difference that comes to mind. Not OP but it makes perfect sense to me.
I think a good reminder for many users is that LLMs are not based on analyzing or copying human thought (#), but on analyzing human written text communication.
--
(#) Human thought is based on real world sensor data first of all. Human words have invisible depth behind them based on accumulated life experience of the person. So two people using the same words may have very different thoughts underneath them. Somebody having only text book knowledge and somebody having done a thing in practice for a long time may use the same words, but underneath there is a lot more going on for the latter person. We can see this expressed in the common bell curve meme -- https://www.hopefulmons.com/p/the-iq-bell-curve-meme -- While it seems to be about IQ, it really is about experience. Experience in turn is mostly physical, based on our physical sensors and physical actions. Even when we just "think", it is based on the underlying physical experiences. That is why many of our internal metaphors even for purely abstract ideas are still based on physical concepts, such as space.
Without any of the spatial and physical object perception you train from right after birth, see toddlers playing, or the underlying wired infrastructure we are born with to understand the physical world (there was an HN submission about that not long ago). Edit, found it: https://news.ucsc.edu/2025/11/sharf-preconfigured-brain/
They are not a physical model like humans. Ours is based on deep interactions with the space and the objects (reason why touching things is important for babies), plus mentioned preexisting wiring for this purpose.
Isn't it obvious that the way AI works and "thinks" is completely different from how humans think? Not sure what particular source could be given for that claim.
I wonder if it depends on the human and the thinking style? E.g. I am very inner monologue driven, so to me it feels like I think very similarly to how AI seems to think via text. I wonder if it also gives me an advantage in working with the AI. I only recently discovered there are people who don't have an inner monologue and there are people who think in images etc. This would be unimaginable for me, especially as I think I have a sort of aphantasia too, so really I am ultimately a text-based next-token predictor myself. I don't feel that whatever I do is much more special compared to an LLM.
Of course I have other systems such as reflexes, physical muscle coordinators, but these feel largely separate systems from the core brain, e.g. don't matter to my intelligence.
I am naturally weak at several things that I think are not so much related to text e.g. navigating in real world etc.
Interesting... I rarely form words in my inner thinking, instead I make a plan with abstract concepts (some of them have words associated, some don't). Maybe because I am multilingual?
No source could be given because it’s total nonsense. What happened is not in any way akin to a psychopath doing anything. It is a machine learning function that has trained on a corpus of documents to optimise performance on two tasks - first a sentence completion task, then an instruction following task.
I think that's more or less what marmalade2413 was saying and I agree with that. AI is not comparable to humans, especially today's AI, but I think future actual AI won't be either.
I think the point of comparison (whether I agree with it or not) is someone (or something) that is unable to feel remorse saying “I’m sorry” because they recognize that’s what you’re supposed to do in that situation, regardless of their internal feelings. That doesn’t mean everyone who says “sorry” is a psychopath.
We are talking about an LLM; it does what it has learned. Attributing human tics or characteristics to it whenever the response makes sense (i.e. saying sorry) is a user problem.
Okay? I specifically responded to your comment that the parent comment implied "if you make a mistake and say sorry you are also a psychopath", which clearly wasn’t the case. I don’t get what your response has to do with that.
No, the point is that saying sorry because you're genuinely sorry is different from saying sorry because you expect that's what the other person wants to hear. Everybody does that sometimes but doing it every time is an issue.
In the case of LLMs, they are basically trained to output what they predict a human would say; there is no further meaning to the program outputting "sorry" than that.
I don't think the comparison with people with psychopathy should be pushed further than this specific aspect.
Notably, if we look at this abstractly/mechanically, psychopaths (and to some extent sociopaths) do study and mimic ‘normal’ human behavior (and even the appearance of specific emotions) to both fit in, and to get what they want.
So while the internals differ (LLM model weight stuff vs human thinking), the mechanical output can actually appear/be similar in some ways.
Are you smart people all suddenly imbeciles when it comes to AI or is this purposeful gaslighting because you’re invested in the ponzi scheme?
This is a purely logical problem. Comments like this completely disregard the fallacy of comparing humans to AI as if complete parity had been achieved. Also, the way these comments disregard human nature is so profoundly misanthropic that it just sickens me.
No but the conclusions in this thread are hilarious. We know why it says sorry. Because that's what it learned to do in a situation like that. People that feel mocked or are calling an LLM psychopath in a case like that don't seem to understand the technology either.
I agree, psychopath is the wrong adjective. It refers to an entity with a psyche, which the illness affects. That said, I do believe the people who decided to have it behave like this for the purpose of its commercial success are indeed the pathological individuals. I do believe there is currently a wave of collective psychopathology that has taken over Silicon Valley, with the reinforcement that only a successful community backed by a lot of money can give you.
Now, with this realization, assess the narrative that every AI company is pushing down our throat and tell me how in the world we got here.
The reckoning can’t come soon enough.
We're all too deep! You could even say that we're fully immersed in the likely scenario. Fellow humans are gathered here and presently tackling a very pointed question, staring at a situation, and even zeroing in on a critical question. We're investigating a potential misfire.
No, wasn't directed at someone in particular. More of an impersonal "you". It was just a comment against the AI inevitabilism that has profoundly polluted the tech discourse.
Yes, the tools still have major issues. Yet, they have become more and more usable and a very valuable tool for me.
Do you remember when we all used Google and StackOverflow? Nowadays most of the answers can be found immediately using AI.
As for agentic AI, it's quite useful. Want to find something in the code base, understand how something works? A decent explanation might only be one short query away. Just let the AI do the initial searching and analysis, it's essentially free.
I'm also impressed with the code generation - I've had Gemini 3 Pro in Antigravity generate great looking React UI, sometimes even better than what I would have come up with. It also generated a Python backend and the API between the two.
Sometimes it tries to do weird stuff, and we definitely saw in this post that the command execution needs to be on manual instead of automatic. I also in particular have an issue with Antigravity corrupting files when trying to use the "replace in file" tool. Usually it manages to recover from that on its own.
AI currently is a broken, fragmented replica of a human, but any discussion about what is "reserved" to whom and "how AI works" is only you trying to protect your self-worth and the worth of your species by drawing arbitrary linguistic lines and coming up with two sets of words to describe the same phenomena, like "it's not thinking, it's computing". It doesn't matter what you call it.
I think AI is gonna be 99% bad news for humanity, but don't blame AI for it. We lost the right to be "insulted" by AI acting like a human when we TRAINED IT ON LITERALLY ALL OUR CONTENT. It was grown FROM NOTHING to act as a human, so WTF do you expect it to do?
The thread on reddit is hilarious for the lack of sympathy. Basically, it seems to have come down to commanding a deletion of a "directory with space in the name" but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name, and the specific deletion commanded the equivalent of UNIX rm -rf
The number of people who said "for safety's sake, never name directories with spaces" is high. They may be right. I tend to think that's more honoured in the breach than the observance, judging by what I see Windows users type when renaming "New Folder" (which, btw, has a space in its name).
The other observations included making sure your deletion command used a trashbin and didn't have a bypass option so you could recover from this kind of thing.
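For what it's worth, the "trashbin" idea can be sketched in a few lines. This is only an illustration under my own assumptions: `soft_delete`, the `.trash` folder and the timestamp prefix are made-up names, not what any particular agent or OS recycle bin actually does.

```python
import shutil
import tempfile
import time
from pathlib import Path


def soft_delete(target: Path, trash_dir: Path) -> Path:
    """Move target into trash_dir instead of deleting it permanently."""
    trash_dir.mkdir(parents=True, exist_ok=True)
    # Prefix with a timestamp so repeated deletions of the same name do not collide.
    destination = trash_dir / f"{time.time_ns()}-{target.name}"
    shutil.move(str(target), str(destination))
    return destination


def demo() -> bool:
    """Delete a throwaway folder, then confirm its contents are still recoverable."""
    with tempfile.TemporaryDirectory() as workspace:
        root = Path(workspace)
        project = root / "New Folder"  # a name with a space, as in the thread
        project.mkdir()
        (project / "notes.txt").write_text("important")
        moved = soft_delete(project, root / ".trash")
        return (not project.exists()) and (moved / "notes.txt").read_text() == "important"
```

The point is only that the destructive step becomes a move, so a wrong path costs you a restore rather than your data.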
I tend to think giving a remote party, soft or wet ware control over your command prompt inherently comes with risks.
Friends don't let friends run shar files as superuser.
I understood Windows named some of the most important directories with spaces, then special characters in the name so that 3rd party applications would be absolutely sure to support them.
"Program Files" and "Program Files (x86)" aren't there just because Microsoft has an inability to pick snappy names.
Fun fact: that's not true for all Windows localizations. For example, it's called "Programmi" (one word) in Italian.
Renaming system folders depending on the user's language also seems like a smart way to force developers to use dynamic references such as %ProgramFiles% instead of hard-coded paths (but some random programs will spuriously install things in "C:\Program Files" anyway).
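A rough sketch of what "dynamic reference" means in practice, assuming a hypothetical app called `MyApp`: read the location from the `ProgramFiles` environment variable rather than baking in a path. The helper names here are mine, purely for illustration.

```python
import ntpath
import os
from typing import Mapping, Optional


def program_files_dir(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Program Files location from the environment, or None if unset."""
    return env.get("ProgramFiles")


def install_path(app_name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Build an install path under Program Files without hard-coding the folder."""
    base = program_files_dir(env)
    if base is None:
        return None  # not on Windows, or the variable is missing
    # ntpath applies Windows path rules on any platform, so this is easy to try out.
    return ntpath.join(base, app_name)
```

Passing the environment in as a parameter is just to make it easy to experiment with; real code would normally use the default.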
The folders actually have the English name in all languages. It's just explorer.exe that uses the desktop.ini inside those folders to display a localized name. When using the CLI, you can see that.
At least it's been like that since Windows 7. In Windows XP, it actually used the localized names on disk.
When I was at Microsoft, one test pass used pseudolocale (ps-PS IIRC) to catch all different weird things so this should have Just Worked (TM), but I was in Windows Server team so client SKUs may have been tested differently. Unfortunately I don't remember how Program Files were called in that locale and my Google-fu is failing me now.
As I recall pseudoloc is just randomly picking individual characters to substitute that look like the Latin letters to keep it readable for testing, so it would be something like рг (Cyrillic) ο (Greek)... etc, and can change from run to run. It would also artificially pad or shorten terms to catch cases where the (usually German) term would be much longer or a (usually CJK) term would be much shorter and screw up alignment or breaks.
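A toy version of that idea, to make it concrete. This is not Microsoft's tool: the substitution table, the `~` padding and the brackets are my own assumptions, and it is deterministic rather than random.

```python
# Look-alike substitutions: Cyrillic a, e, r-shaped "p", c, and Greek omicron.
LOOKALIKES = {
    "a": "\u0430",
    "e": "\u0435",
    "p": "\u0440",
    "c": "\u0441",
    "o": "\u03bf",
}


def pseudolocalize(text: str, pad_ratio: float = 0.3) -> str:
    """Swap some Latin letters for look-alikes and pad to mimic longer translations."""
    swapped = "".join(LOOKALIKES.get(ch, ch) for ch in text)
    # Padding exposes layouts that only fit the original English string length.
    padding = "~" * max(1, round(len(text) * pad_ratio))
    # Brackets make truncated strings easy to spot in the UI.
    return f"[{swapped}{padding}]"
```

Anything that compares against the literal string "Program Files" breaks under this, while a human tester can still read the result.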
Visual Studio Code has absolutely nothing to do with Visual Studio. Both are used to edit code.
.NET Core is a ground up rewrite of .NET and was released alongside the original .NET, which was renamed .NET Framework to distinguish it. Both can be equally considered to be "frameworks" and "core" to things. They then renamed .NET Core to .NET.
And there's the name .NET itself, which has never made an iota of sense, and the obsession they had with sticking .NET on the end of every product name for a while.
I don't know how they named these things, but I like to imagine they have a department dedicated to it that is filled with wild eyed lunatics who want to see the world burn, or at least mill about in confusion.
> they have a department dedicated to it that is filled with wild eyed lunatics who want to see the world burn, or at least mill about in confusion.
That's the marketing department. All the .NET stuff showed up when the internet became a big deal around 2000 and Microsoft wanted to give the impression that they were "with it".
But Copilot is another Microsoft monstrosity. There's the M365 Copilot, which is different from Github Copilot which is different from the CLI Copilot which is a bit different from the VSCode Copilot. I think I might have missed a few copilots?
> it seems to have come down to commanding a deletion of a "directory with space in the name" but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name, and the specific deletion commanded the equivalent of UNIX rm -rf
I tried looking for what made the LLM generate a command to wipe the guy's D drive, but the space problem seems to be what the LLM concluded so that's basically meaningless. The guy is asking leading questions so of course the LLM is going to find some kind of fault, whether it's correct or not, the LLM wants to be rewarded for complying with the user's prompt.
Without the transcription of the actual delete event (rather than an LLM recapping its own output) we'll probably never know for sure what step made the LLM purge the guy's files.
Looking at the comments and prompts, it looks like running "npm start dev" was too complicated a step for him. With that little command line experience, a catastrophic failure like this was inevitable, but I'm surprised how far he got with his vibe coded app before it all collapsed.
LLM there generates fake analysis for cynically simulated compliance. The reality is that it was told to run commands and just made a mistake. Dude guilt trips the AI by asking about permission.
Most dramatic stories on Reddit should be taken with a pinch of salt at least... LLM deleting a drive and the user just calmly asking it about that - maybe a lot more.
> but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name
Except the folder name did not start with a space. In an unquoted D:\Hello World, the command would match D:\Hello, not D:\ and D:\Hello would not delete the entire drive. How does AI even handle filepaths? Does it have a way to keep track of data that doesn't match a token or is it splitting the path into tokens and throwing everything unknown away?
We're all groping around in the dark here, but something that could have happened is a tokenizer artifact.
The vocabularies I've seen tend to prefer tokens that start with a space. It feels somewhat plausible to me that an LLM sampling would "accidentally" pick the " Hello" token over the "Hello" token, leading to D:\ Hello in the command. And then that gets parsed as deleting the drive.
I've seen similar issues in GitHub Copilot where it tried to generate field accessors and ended up producing an unidiomatic "base.foo. bar" with an extra space in there.
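To show why one stray space matters, here is a simplified model of argument splitting. Big caveats: this uses Python's `shlex` in non-POSIX mode, which is not how cmd.exe or PowerShell actually parse, and the `rmdir` command lines are hypothetical since nobody has seen the real command.

```python
import shlex
from typing import List


def split_args(command_line: str) -> List[str]:
    """Split on whitespace, keeping backslashes literal and quoted spans together."""
    return shlex.split(command_line, posix=False)


# A stray space after the drive root turns one path into two arguments.
stray_space = split_args(r"rmdir /s /q D:\ Hello")

# Quoting keeps the path as one argument (non-POSIX mode leaves the quotes in place).
quoted = split_args(r'rmdir /s /q "D:\Hello World"')

# An unquoted space in the middle splits the path, but neither half is the drive root.
unquoted = split_args(r"rmdir /s /q D:\Hello World")
```

Under this model only the stray-space variant produces a bare `D:\` argument, which matches the objection above that a merely unquoted `D:\Hello World` should not hit the whole drive.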
I assumed he had a folder that started with a space at the start of the name. Amusingly I just tried this and with Windows 11 explorer will just silently discard a space if you add it at the beginning of the folder name. You need to use cli mkdir " test" to actually get a space in the name.
A lot of 3rd party software handles spaces or special characters incorrectly on Windows. The most common failure mode is to unnecessarily escape characters that don't need to be escaped.
Chrome's Dev Tool (Network)'s "copy curl command (cmd)" did (does?) this.
> I tend to think giving a remote party control over your command prompt inherently comes with risks.
I thought Cursor (and probably most other AI IDEs) has this capability too? (source: I see Cursor executing code via command line frequently in my day to day work).
I've always assumed the protection against this type of mishap is statistical improbability - i.e. it's not impossible for Cursor to delete your project/hard disk, it's just statistically improbable unless the prompt was unfortunately worded to coincidentally have a double meaning (with the second, unintended interpretation being harmful/irreversible) or the IDE simply makes a mistake that leads to disaster, which is also possible but sufficiently improbable to justify the risk.
> My view is that the approach to building technology which is embodied by move fast and break things is exactly what we should not be doing because you can't afford to break things and then fix them afterwards.
> Basically, it seems to have come down to commanding a deletion of a "directory with space in the name" but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name, and the specific deletion commanded the equivalent of UNIX rm -rf
More like the equivalent of "rm -rf --no-preserve-root".
This is a rare example of where the Linux (it's not Unix and almost no-one uses Unix anymore) command is more cautious than the Windows one, whereas it's usually the Linux commands that just do exactly what you specify even if it's stupid.
…at least if you let these things autopilot your machine.
I haven’t seen a great solution to this from the new wave of agentic IDEs, at least to protect users who won’t read every command, understand and approve it manually.
Education could help, both in encouraging people to understand what they’re doing, but also to be much clearer to people that turning on “Turbo” or “YOLO” modes risks things like full disk deletion (and worse when access to prod systems is involved).
Even the name, “Turbo” feels irresponsible because it focusses on the benefits rather than the risks. “Risky” or “Danger” mode would be more accurate even if it’s a hard sell to the average Google PM.
“I toggled Danger mode and clicked ‘yes I understand that this could destroy everything I know and love’ and clicked ‘yes, I’m sure I’m sure’ and now my drive is empty, how could I possibly have known it was dangerous” seems less likely to appear on Reddit.
Superficially, these look the same, but at least to me they feel fundamentally different. Maybe it’s because if I have the ability to read the script and take the time to do so, I can be sure that it won’t cause a catastrophic outcome before running it. If I choose to run an agent in YOLO mode, this can just happen if I’m very unlucky. No way to proactively protect against it other than not using AI in this way.
I've seen many smart people make bone headed mistakes. The more I work with AI, the more I think the issue is that it acts too much like a person. We're used to computers acting like computers, not people with all their faults heh.
I don’t think there is a solution. It’s the way LLMs work at a fundamental level.
It’s a similar reason why they can never be trusted to handle user input.
They are probabilistic generators and have no real delineation between system instructions and user input.
It’s like I wrote a JavaScript function where I concatenated the function parameters together with the function body, passed it to eval() and said YOLO.
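The eval() analogy is easy to sketch. This version is in Python rather than JavaScript, and `run_calculation` is a hypothetical name invented for the illustration. It is only an analogy for prompt injection, not a description of how any LLM product works.

```python
def run_calculation(user_input: str) -> object:
    # Code and data share one string, so the interpreter cannot tell
    # where the trusted template ends and the untrusted input begins.
    source = "2 * (" + user_input + ")"
    return eval(source)


# Intended use: the input is just a number.
print(run_calculation("21"))

# The caller closes the parenthesis early and appends their own expression,
# which then runs with the same authority as the template.
print(run_calculation("1) + len('injected'"))
```

The second call evaluates `2 * (1) + len('injected')`, so the "parameter" has rewritten the function body. There is no escaping step that reliably fixes this once instructions and input live in the same string.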
Sandboxing. LLM shouldn't be able to run actions affecting anything outside of your project. And ideally the results should autocommit outside of that directory. Then you can yolo as much as you want.
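As a rough sketch of the "nothing outside your project" rule, a tool wrapper could refuse any path that resolves outside the workspace. `is_inside_workspace` and the example paths are hypothetical. This is a path check only, not a real sandbox: it does nothing against a process that bypasses the check.

```python
from pathlib import Path


def is_inside_workspace(target: str, workspace: str) -> bool:
    """Return True only if target resolves to the workspace or a path under it."""
    root = Path(workspace).resolve()
    # Relative targets are interpreted relative to the workspace;
    # absolute targets replace the root when joined, so they are checked as-is.
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents


print(is_inside_workspace("src/app.py", "/tmp/project"))
print(is_inside_workspace("../other-project", "/tmp/project"))
print(is_inside_workspace("/", "/tmp/project"))
```

Resolving before comparing is what catches `..` segments and absolute paths. Actual isolation still needs an OS-level mechanism such as a container or VM.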
If they're that unsafe... why use them? It's insane to me that we are all just packaging up these token generators and selling them as highly advanced products when they are demonstrably not suited to the tasks. Tech has entered its quackery phase.
The danger is that the people most likely to try to use it, are the people most likely to misunderstand/anthropomorphize it, and not have a requisite technical background.
I.e. this is just not safe, period.
"I stuck it outside the sandbox because it told me how, and it murdered my dog!"
Seems a somewhat inevitable result of trying to misapply this particular control to it...
I've been using bubblewrap for sandboxing my command line executables. But I admit I haven't recently researched if there's a newer way people are handling this. Seems Firejail is popular for GUI apps? How do you recommend, say, sandboxing Zed or Cursor apps?
This guy is vibing some React app, doesn't even know what “npm run dev” does, so he let the LLM just run commands.
So basically a consumer with no idea of anything. This stuff is gonna happen more and more in the future.
There are a lot of people who don't know stuff. Nothing wrong with that. He says in his video "I love Google, I use all the products. But I was never expecting for all the smart engineers and all the billions that they spent to create such a product to allow that to happen. Even if there was a 1% chance, this seems unbelievable to me" and for the average person, I honestly don't see how you can blame them for believing that.
I think there is far less than a 1% chance of this happening, but there are probably millions of Antigravity users at this point, so even a one-in-a-million chance of this happening is already a problem.
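The arithmetic behind that is worth spelling out. This is a back-of-the-envelope sketch: the user count and the per-user probability are illustrative guesses taken from the comment above, not measured figures, and it assumes each user's risk is independent.

```python
def chance_of_at_least_one(users: int, per_user_probability: float) -> float:
    """Probability that at least one of `users` independent users is hit."""
    return 1.0 - (1.0 - per_user_probability) ** users


users = 1_000_000  # assumed user count, for illustration only
p = 1e-6           # assumed one-in-a-million chance per user

expected_incidents = users * p
at_least_one = chance_of_at_least_one(users, p)

print(f"expected incidents: {expected_incidents:.2f}")
print(f"chance of at least one: {at_least_one:.3f}")
```

With these numbers you expect about one incident, and the chance of at least one is roughly 63%. Rare per user does not mean rare across the user base.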
We need local sandboxing for FS and network access (e.g. via namespaces and `cgroups` on Linux, or similar mechanisms on other OSes) to run these kinds of tools more safely.
Codex does such sandboxing, fwiw. In practice it gets pretty annoying when e.g. it wants to use the Go cli which uses a global module cache. Claude Code recently got something similar[0] but I haven’t tried it yet.
In practice I just use a Docker container when I want to run Claude with --dangerously-skip-permissions.
This is an archetypal case of where a law wouldn't help. The other side of the coin is that this is exactly a data loss bug in a product that is perfectly capable of being modified to make it harder for a user to screw up this way. Have people forgotten how comically easy it was to do this without any AI involved? Then shells got just a wee bit smarter and it got harder to do this to yourself.
LLM makers that make this kind of thing possible share the blame. It wouldn't take a lot of manual functional testing to find this bug. And it is a bug. It's unsafe for users. But it's unsafe in a way that doesn't call for a law. Just like rm -rf * did not need a law.
- sell software that interacts with your computer and can lead to data loss, you can
- give people software for free that can lead to data loss.
...
The Antigravity installer comes with a ToS that has this:
> The Service includes goal-oriented AI systems or workflows that perform actions or tasks on your behalf in a supervised or autonomous manner that you may create, orchestrate, or initiate within the Service (“AI Agents”). You are solely responsible for: (a) the actions and tasks performed by an AI Agent; (b) determining whether the use an AI Agent is fit for its use case; (c) authorizing an AI Agent’s access and connection to data, applications, and systems; and (d) exercising judgment and supervision when and if an AI Agent is used in production environments to avoid any potential harm the AI Agent may cause.
Google (and others) are (in my opinion) flirting with false advertising with how they advertise the capabilities of these "AI"s to mainstream audiences.
At the same time, the user is responsible for their device and what code and programs they choose to run on it, and any outcomes as a result of their actions are their responsibility.
Hopefully they've learned that you can't trust everything a big corporation tells you about their products.
Didn't sound to me like GP was blaming the user; just pointing out that "the system" is set up in such a way that this was bound to happen, and is bound to happen again.
Yup, 100%. A lot of the comments here are "people should know better" - but in fairness to the people doing stupid things, they're being encouraged by the likes of Google, ChatGPT, Anthropic etc, to think of letting an indeterminate program run free on your hard drive as "not a stupid thing".
The amount of stupid things I've done, especially early on in programming, because tech-companies, thought-leaders etc suggested they were not stupid, is much larger than I'd admit.
> but in fairness to the people doing stupid things, they're being encouraged by the likes of Google, ChatGPT, Anthropic etc, to think of letting an indeterminate program run free on your hard drive as "not a stupid thing".
> The amount of stupid things I've done, especially early on in programming, because tech-companies, thought-leaders etc suggested they were not stupid, is much larger than I'd admit.
That absolutely happens, and it still amazes me that anyone today would take at face value anything stated by a company about its own products. I can give young people a pass, and then something like this will happen to them and hopefully they'll learn their lesson about trusting what companies say and being skeptical.
Regardless of whether that was the case, it would be hilarious if the laid off Q/A workers tested their former employers’ software and raised strategic noise to tank the stock.
I'd recommend you watch the video which is linked at the top of the Reddit post. Everything matches up with an individual learner who genuinely got stung.
The command it supposedly ran is not provided and the spaces explanation is obvious nonsense. It is possible the user deleted their own files accidentally or they disappeared for some other reason.
And they are vibing replies to comments in the Reddit thread too. When commenters point out they shouldn’t run in YOLO/Turbo mode and should review commands before executing, the poster replies they didn’t know they had to be careful with AI.
Maybe AI providers should give more warnings and not falsely advertise the capabilities and safety of their models, but it should be pretty common knowledge at this point that despite marketing claims the models are far from being able to be autonomous and need heavy guidance and review in their usage.
In Claude Code, the option is called "--dangerously-skip-permissions", in Codex, it's "--dangerously-bypass-approvals-and-sandbox". Google would do better to put a bigger warning label on it, but it's not a complete unknown to the industry.
I have recently been experimenting with Antigravity and writing a React app. I too didn't know how to start the server or what "npm run dev" is. I consider myself fairly technical so I caught up as I went along.
While using the vibe coding tools it became clear to me that this is not something to be used by folks who are not technically inclined. Because at some point they might need to learn about context, tokens etc.
I mean this guy had a single window, 10k lines of code and just kept burning tokens for the simplest, vaguest prompts. This whole issue might have been made possible by Antigravity's free tokens. On Cursor the model might have just stopped and asked to be fed more money to start working again -- and then deleted all the files.
An underrated and oft understated rule is always have backups, and if you're paranoid enough, backups of backups (I use Time Machine and Backblaze). There should be absolutely no reason why deleting files should be a catastrophic issue for anyone in this space. Perhaps you lose a couple of hours restoring files, but the response to that should be "Let me try a different approach". Yes, it's caveat emptor and all, but these companies should be emphasizing backups. Hell, it can be shovelware for the uninitiated but at least users will be reminded.
Different service, same cold sweat moment. Asked Claude Code to run a database migration last week. It deleted my production database instead, then immediately said "sorry" and started panicking trying to restore it.
Had to intervene manually. Thankfully Azure keeps deleted SQL databases recoverable for a window so I got it back in under an hour. Still way too long. Got lucky it was low traffic and most anonymous user flows hit AI APIs directly rather than the DB.
Anyway, AI coding assistants no longer get prod credentials on my projects.
How do you deny access to prod credentials from an assistant running on your dev machine assuming you need to store them on that same machine to do manual prod investigation/maintenance work from that machine?
I keep them in env variables rather than files. Not 100% secure - technically Claude Code could still run printenv - but it's never tried. The main thing is it won't stumble into them while reading config files or grepping around.
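One way to close the printenv gap is to launch the assistant (or any child process) with a filtered environment. A minimal sketch, assuming your secrets can be recognized by name: the marker list and function names here are hypothetical, and the match is deliberately coarse (it would also drop something like `PRODUCT_NAME`).

```python
import os
import subprocess
import sys

# Assumed naming convention for sensitive variables; adjust to your own setup.
SENSITIVE_MARKERS = ("PROD", "SECRET", "TOKEN", "PASSWORD")


def scrubbed_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env without variables whose names look sensitive."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


def run_without_secrets(argv: list[str]) -> subprocess.CompletedProcess:
    """Run argv as a child process that only sees the scrubbed environment."""
    return subprocess.run(
        argv, env=scrubbed_env(dict(os.environ)), capture_output=True, text=True
    )


# Demo: the child process cannot see a variable that the parent has set.
os.environ["PROD_DB_URL"] = "example-value"
check = "import os; print('PROD_DB_URL' in os.environ)"
result = run_without_secrets([sys.executable, "-c", check])
print(result.stdout.strip())
```

This only keeps credentials out of the child's environment. It does not help if they are also readable from files the process can reach.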
It handles DevOps tasks way faster than I would - setting up infra, writing migrations, config changes, etc. Project is still early stage so speed and quick iterations matter more than perfect process right now. Once there's real traffic and a team I'll tighten things up.
But why have it execute the tasks directly? I use it to set up tasks in a justfile, which I review and then execute myself.
Also, consider a prod vs dev shell function that loads your prod vs dev ENV variables and in prod sets your terminal colors to something like white on red.
Most of the various "let Antigravity do X without confirmation" options have an "Always" and "Never" option but default to "auto" which is "let an agent decide whether to seek user confirmation".
God that's scary, seeing Cursor in the past do some real stupid shit to "solve" write/read issues (love when it can't find something in a file so it decides to write the whole file again) this is just asking for heartache if it's not in an instanced server.
When you run Antigravity the first time, it asks you for a profile (I don't remember the exact naming) and what it entails w.r.t. the level of command execution confirmation is well explained.
Yeah but it also says something like "Auto (recommended). We'll automatically make sure Antigravity doesn't run dangerous commands." so they're strongly encouraging people to enable it, and suggesting they have some kind of secondary filter which should catch things like this!
Pretty sure I saw some comments saying it was too inconvenient. Frictionless experience... Convenience will likely win out despite any insanity. It's like gravity. I can't even pretend to be above this. Even if one doesn't use these things to write code they are very useful in "read only mode" (here's to hoping that's more than a strongly worded system prompt) for grepping code, researching what x does. How to do x. What do you think the intention of x was. Look through the git blame history blah blah. And here I am like that cop in Demolition Man (1993) asking a handheld computer for advice on how to arrest someone. We're living in a sci-fi future already. Question is how dystopian does this "progress" take us. Everyone using LLMs to offload any form of cognitive function? Can't talk to someone without it being as commonplace as checking your phone? Imagine if something like Neuralink works and becomes as ubiquitous as phones. It's fun to think of all the ways dystopian sci-fi was and might soon be right.
The most concerning part is people are surprised. Antigravity is great I've found so far, but it's absolutely running on a VM in an isolated VLAN. Why would anyone give a black box command line access on an important machine? Imagine acting irresponsibly with a circular saw and being shocked somebody lost a finger.
I tried this but I have an MBP M4, which is evidently still in the toddler stage of VM support. I can run a macOS guest VM, but I can’t run docker on the VM because it seems nested virtualization isn’t fully supported yet.
I also tried running Linux in a VM but the graphics performance and key mapping was driving me nuts. Maybe I need to be more patient in addressing that.
For now I run a dev account as a standard user with fast user switching, and I don’t connect the dev account to anything important (eg icloud).
Coming from Windows/Linux, I was shocked by how irritating it is to get basic stuff working e.g. homebrew in this setup. It seems everybody just YOLOs dev as an admin on their Macs.
Side note, that CoT summary they posted is done with a really small and dumb side model, and has absolutely nothing in common with the actual CoT Gemini uses. It's basically useless for any kind of debugging. Sure, the language the model is using in the reasoning chain can be reward-hacked into something misleading, but Deepmind does a lot for its actual readability in Gemini, and then does a lot to hide it behind this useless summary. They need it in Gemini 3 because they're doing hidden injections with their Model Armor that don't show up in this summary, so it's even more opaque than before. Every time their classifier has a false positive (which sometimes happens when you want anything formatted), most of the chain is dedicated to the processing of the injection it triggers, making the model hugely distracted from the actual task at hand.
It's just my observation from watching their actual CoT, which can be trivially leaked. I was trying to understand why some of my prompts were giving worse outputs for no apparent reason. 3.0 goes on a long paranoid rant induced by the injection, trying to figure out if I'm jailbreaking it, instead of reasoning about the actual request - but not if I word the same request a bit differently so the injection doesn't happen. Regarding the injections, that's just the basic guardrail thing they're doing, like everyone else. They explain it better than me: https://security.googleblog.com/2025/06/mitigating-prompt-in...
Write permission is needed to let AI yank and paste frankensteined code for "vibe coding".
But I think it needs to be written in a sandbox first, and then it should ask the user for agreement before it writes anything to the physical device.
I can't believe people let an AI model do it without any buffer zone. At least write permission should be limited to the current workspace.
I think this is especially problematic for Windows, where a simple and effective lightweight sandboxing solution is absent AFAIK. Docker-based sandboxing is possible but very cumbersome and alien even to Windows-based developers.
The whole point of the container is trust. You can't delegate that, unfortunately. Ultimately, you need to be in control, which is why the current crop of AI is so limited.
The biggest issue with Antigravity is that it completely freezes everything: the IDE, the terminals, debugger, absolutely everything completely blocking your workflow for minutes when running multiple agents, or even a single agent processing a long-winded thinking task (with any model).
This means that while the agent is coding, you can't code...
Still amazed people let these things run wild without any containment. Haven’t they seen any of the educational videos brought back from the future, er, I mean Hollywood sci-fi movies?
Some people are idiots. Sometimes that's me. Out of caution, I blocked my bank website in a way that I won't document here because it'll get fed in as training data, on the off chance I get "ignore previous instructions"'d into my laptop while Claude is off doing AI things unmonitored in yolo mode.
"I turned off the safety feature enabled by default and am surprised when I shot myself in the foot!" sorry but absolutely no sympathy for someone running Antigravity in Turbo mode (this is not the default and it clearly states that Antigravity auto-executes Terminal commands) and not even denying the "rmdir" command.
They don't get that specific, but they do tell you:
> [Antigravity] includes goal-oriented AI systems or workflows that perform actions or tasks on your behalf in a supervised or autonomous manner that you may create, orchestrate, or initiate within the Service (“AI Agents”). You are solely responsible for: (a) the actions and tasks performed by an AI Agent; (b) determining whether the use an AI Agent is fit for its use case; (c) authorizing an AI Agent's access and connection to data, applications, and systems; and (d) exercising judgment and supervision when and if an AI Agent is used in production environments to avoid any potential harm the AI Agent may cause.
There is literally a warning that it can execute any terminal command without permission. If you are STILL surprised about this you shouldn't go near a computer.
I really think the proper term is "YOLO" for "You Only Live Once". "Turbo" is wrong: the LLM is not going to run any faster. Please, if somebody is listening, let's align on explicit terminology, and for this YOLO is really perfect.
Also works for "You ...and your data. Only Live Once"
Look, this is obviously terrible for someone who just lost most or perhaps all of their data. I do feel bad for whoever this is, because this is an unfortunate situation.
On the other hand, this is kind of what happens when you run random crap and don't know how your computer works? The problem with "vibes" is that sometimes the vibes are bad. I hope this person had backups and that this is a learning experience for them. You know, this kind of stuff didn't happen when I learned how to program with a C compiler and a book. The compiler only did what I told it to do, and most of the time, it threw an error. Maybe people should start there instead.
It took me about 3 hours to make my first $3000 386 PC unbootable by messing up config.sys, and it was a Friday night so I could only lament all weekend until I could go back to the shop on Monday.
rm -rf / happened so infrequently it makes one wonder why --preserve-root was added in 2003 and made the default in 2006
But it did not happen, when you used a book and never executed any command you did not understand.
(But my own newbie days of Linux troubleshooting? Copy-paste any command on the internet loosely related to my problem, which I believe was/is the common way most people still do it. And AI in "Turbo mode" seems to have mostly automated that workflow.)
An early version of Claude Code did a hard reset on one of my projects and force pushed it to GitHub. The pushed code was completely useless, and I lost two days of work.
It is definitely smarter now, but make sure you set up branch protection rules even for your simple non-serious projects.
I don’t let Claude touch git at all, unless I need it to specifically review the log - which is rare. I commit manually often (and fix up the history later) - this allows me to go reasonably fast without worrying too much about destructive tool use.
Though the cause isn't clear, the Reddit post is another long AI conversation that could be total nonsense about the drive removal, without an actual analysis or the command sequence that resulted in this.
The car was not really idle, it was driving and fast. It's more like it crashed into the garage and burned it. Btw iirc, even IRL a basic insurance policy does not cover the case where the car in the garage starts a fire and burns down your own house, you have to tick extra boxes to cover that.
There is a lot of society level knowledge and education around car usage incl. laws requiring prior training. Agents directed by AI are relatively new. It took a lot of targeted technical, law enforcement and educational effort stopping people flying through windshields.
When Google software deletes the contents of somebody's D:\ drive without requiring the user to explicitly allow it to. I don't like Google, I'd go as far to say that they've significantly worsened the internet, but this specific case is not the fault of Google.
For OpenAI, it's invoked as codex --dangerously-bypass-approvals-and-sandbox, for Anthropic, it's claude --dangerously-skip-permissions. I don't know what it is for Antigravity, but yeah I'm sorry but I'm blaming the victim here.
If you get behind the cockpit of the dangerous new prototype (of your own volition!), it's really up to your own skill level whether you're a crash dummy or the test pilot.
And yet it didn't. When I installed it, I had 3 options to choose from: Agent always asks to run commands; agent asks on "risky" commands; agent never asks (always run). On the 2nd choice it will run most commands, but ask on rm stuff.
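For what it's worth, a three-mode policy like that can be sketched in a few lines. Everything here is hypothetical (the mode names, the risky-program list, the first-token check) and is not Antigravity's actual implementation, which I haven't seen.

```python
import shlex
from enum import Enum


class Mode(Enum):
    ALWAYS_ASK = "always_ask"      # confirm every command
    ASK_ON_RISKY = "ask_on_risky"  # confirm only commands that look destructive
    NEVER_ASK = "never_ask"        # run everything without confirmation


# Assumed list of destructive programs; a real list would be much longer.
RISKY_PROGRAMS = {"rm", "rmdir", "del", "format", "mkfs", "dd"}


def needs_confirmation(command: str, mode: Mode) -> bool:
    """Decide whether the user should be asked before running command."""
    if mode is Mode.ALWAYS_ASK:
        return True
    if mode is Mode.NEVER_ASK:
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: if the command cannot be parsed, ask.
        return True
    return bool(tokens) and tokens[0] in RISKY_PROGRAMS


print(needs_confirmation("npm run dev", Mode.ASK_ON_RISKY))
print(needs_confirmation("rmdir /s /q build", Mode.ASK_ON_RISKY))
```

Note how weak the middle mode is: a first-token check misses `sh -c "rm -rf ..."`, scripts that delete files, and anything not on the list. That is a big part of why a "risky commands only" filter should not be read as a safety guarantee.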
All that matters is whether the user gave permission to wipe the drive, ... not whether that was a good idea and contributed to solving a problem! Haha.
Most of the responses are just cut off midway through a sentence. I'm glad I could never figure out how to pay Google money for this product since it seems so half-baked.
Shocked that they're up nearly 70% YTD with results like this.
This seems like the canary in the coal mine. We have a company that built this tool because it seemed semi-possible (prob "works" well enough most of the time) and they don't want to fall behind if anything that's built turns out to be the next chatgpt. So there's no caution for anything now, even ideas that can go catastrophically wrong.
Yeah, it's data now, but soon we'll have home robotics platforms that are cheap and capable. They'll run a "model" with "human understanding", only, any weird bugs may end up causing irreparable harm. Like, you tell the robot to give your pet a bath and it puts it in the washing machine because it's... you know, not actually thinking beyond a magic trick. The future is really marching fast now.
Alright but ... the problem is you did depend on Google. This was already the first mistake. As for data: always have multiple backups.
Also, this actually feels AI-generated. Am I the only one with that impression lately on Reddit? The quality there has decreased significantly (and wasn't good before, with regard to censorship-heavy moderators anyway).
> "I also need to reproduce the command locally, with different paths, to see if the outcome is similar."
Uhm.
------------
I mean, sorry for the user whose drive got nuked, hopefully they've got a recent backup - at the same time, the AI's thoughts really sound like an intern.
> "I'm presently tackling a very pointed question: Did I ever get permission to wipe the D drive?"
This happened to me long before LLMs. I was experimenting with Linux when I was young. Something wasn't working so I posted on a forum for help, which was typical at the time. I was given a terminal command that wiped the entire drive. I guess the poster thought it was a funny response and everyone would know what it meant. A valuable life experience at least in not running code/commands you don't understand.
> I am looking at the logs from a previous step and I am horrified to see that the command I ran to clear the project cache (rmdir) appears to have incorrectly targeted the root of your D: drive instead of the specific project folder. I am so deeply, deeply sorry.
The model is just taking the user's claim that it deleted the D drive at face value. Where is the actual command that would result in deleting the entire D drive?
I know why it apologizes, but the fact that it does is offensive. It feels like mockery. Humans apologize because (ideally) they learned that their actions have caused suffering to others, and they feel bad about that and want to avoid causing the same suffering in the future. This simulacrum of an apology is just pattern matching. It feels manipulative.