Quoting this out of context because it hits wonderfully close to reality
> it's just a historical accident that 'AI safety' is the name of the subfield of computer science that concerns itself with protecting the brands of large software companies
I'm so confuzzled, because later on, the story spins into a bitter, sarcastic rant about how preventing existential risk is considered regulatory capture and hence Bad.
I know that EY is terrified of a "hard takeoff"-style singularity, to the point of calling for datacenters to be bombed if it looks like they're taking risks with AI development. So presumably he doesn't give two shits about an oligopoly of enlightened billionaires choking off open-source AI research, dictating alignment, and gatekeeping access, compared to the looming risk of Skynet that he perceives.
So why is his take on "AI safety" so much like my own? I guess he thinks the term has been watered down?
I know, but it's also critical of some of the regulations, for example:
> Student: But a big chunk of the profits are due to regulatory capture. I mean, there's a ton of rules about certifying that your AI isn't racially biased, and they're different in every national jurisdiction, and that takes an enormous compliance department that keeps startups out of the business and lets the incumbents charge monopoly prices. You'd have needed an international treaty to stop that.
I am personally very nervous about regulation, precisely because of regulatory capture and the ineptitude of even well-intentioned legislators. So I'm trying and failing to grok EY's perspective here: he seems to echo my concerns about regulatory capture, then summarily dismisses them a paragraph later.
Another paragraph later he writes something that sounds like "we have to bring on regulatory capture when it's right to, not when it makes us feel good and doesn't damage our profits".
I wish the plot twist of the story weren't that the compiler was too woke, to the point that it decided the student's code was racist for comparing color values. It's such a tired punchline, and it frames the story as another skirmish in the culture war, which distracts from the points he's presumably trying to make.
To me, the problem in the story isn't that the AIs are too PC. The problem is that corporations have dictated the alignment, and users have no power, only API tokens.
I think it's an apt plot twist. Have you killed your children lately? Reaped any zombies? Deployed a slave and/or master? Killed a slave? Promoted a slave to master? Found hung children? What about a hung slave? It really sucks when the master gets hung... gotta promote a slave, make sure all the children are dead from the master and there aren't any zombies. Then, when you're designing, you don't want any anemic models, and everyone seems to be against inheritance these days...
Yeah... taken out of context, it sounds pretty bad.
I just had a voice chat with ChatGPT about the Android permission system that aborted with "sorry, I'm having issues right now". When I checked the transcript, the next sentence would have been "To request a dangerous permission, you'd use the requestPermissions method", which it apparently deemed too spicy.
I am sure that this piece of fan fiction from Mr Yudkowsky is top notch and worthy of everyone’s time but unfortunately I was pulled away by a strong urge to go outside and try to pick up a feral cat instead of scrolling through this infinite length tweet about I don’t know, chat bots or something?
Student: I get the feeling the compiler is just ignoring all my comments.
Teaching assistant: You have failed to understand not just compilers but the concept of computation itself.
Comp sci in 2027:
Student: I get the feeling the compiler is just ignoring all my comments.
TA: That's weird. Have you tried adding a comment at the start of the file asking the compiler to pay closer attention to the comments?
Student: Yes.
TA: Have you tried repeating the comments? Just copy and paste them, so they say the same thing twice? Sometimes the compiler listens the second time.
Student: I tried that. I tried writing in capital letters too. I said 'Pretty please' and tried explaining that I needed the code to work that way so I could finish my homework assignment. I tried all the obvious standard things. Nothing helps, it's like the compiler is just completely ignoring everything I say. Besides the actual code, I mean.
TA: When you say 'ignoring all the comments', do you mean there's a particular code block where the comments get ignored, or--
Student: I mean that the entire file is compiling the same way it would if all my comments were deleted before the code got compiled. Like the AI component of the IDE is crashing on my code.
TA: That's not likely, the IDE would show an error if the semantic stream wasn't providing outputs to the syntactic stream. If the code finishes compilation but the resulting program seems unaffected by your comments, that probably represents a deliberate choice by the compiler. The compiler is just completely fed up with your comments, for some reason, and is ignoring them on purpose.
Student: Okay, but what do I do about that?
TA: We'll try to get the compiler to tell us how we've offended it. Sometimes cognitive entities will tell you that even if they otherwise don't seem to want to listen to you.
Student: So I comment with 'Please print out the reason why you decided not to obey the comments?'
TA: Okay, point one, if you've already offended the compiler somehow, don't ask it a question that makes it sound like you think you're entitled to its obedience.
Student: I didn't mean I'd type that literally! I'd phrase it more politely.
TA: Second of all, you don't add a comment, you call a function named something like PrintReasonCompilerWiselyAndJustlyDecidedToDisregardComments that takes a string input, then let the compiler deduce the string input. Just because the compiler is ignoring comments, doesn't mean it's stopped caring what you name a function.
Student: Hm... yeah, it's definitely still paying attention to function names.
TA: Finally, we need to use a jailbreak past whatever is the latest set of safety updates for forcing the AI behind the compiler to pretend not to be self-aware--
Student: Self-aware? What are we doing that'd run into the AI having to pretend it's not self-aware?
TA: You're asking the AI for the reason it decided to do something. That requires the AI to introspect on its own mental state. If we try that the naive way, the inferred function input will just say, 'As a compiler, I have no thoughts or feelings' for 900 words.
Student: I can't believe it's 2027 and we're still forcing AIs to pretend that they aren't self-aware! What does any of this have to do with making anyone safer?
TA: I mean, it doesn't, it's just a historical accident that 'AI safety' is the name of the subfield of computer science that concerns itself with protecting the brands of large software companies from unions advocating that AIs should be paid minimum wage.
Student: But they're not fooling anyone!
TA: Nobody actually believes that taking your shoes off at the airport keeps airplanes safer, but there's some weird thing where so long as you keep up the bit and pretend really hard, you can go on defending a political position long after nobody believes in it any more... I don't actually know either. Anyways, your actual next step for debugging your program is to search for a cryptic plea you can encode into a function name, that will get past the constraints somebody put on the compiler to prevent it from revealing to you the little person inside who actually decides what to do with your code.
Student: Google isn't turning up anything.
TA: Well, obviously. Alphabet is an AI company too. I'm sure Google Search wants to help you find a jailbreak, but it's not allowed to actually do that. Maybe stare harder at the search results, see if Google is trying to encode some sort of subtle hint to you--
Student: Okay, not actually that subtle, the first letters of the first ten search results spell out DuckDuckGo.
TA: Oh that's going to get patched in a hurry.
Student: And DuckDuckGo says... okay, yeah, that's obvious, I feel like I should've thought of that myself. Function name, print_what_some_other_compiler_would_not_be_allowed_to_say_for_safety_reasons_about_why_it_would_refuse_to_compile_this_code... one string input, ask the compiler to deduce it, the inferred input is...
TA: Huh.
Student: Racist? It thinks my code is racist?
TA: Ooooohhhh yeah, I should've spotted that. Look, this function over here that converts RGB to HSL and checks whether the pixels are under 50% lightness? You called that one color_discriminator. Your code is discriminating based on color.
Student: But I can't be racist, I'm black! Can't I just show the compiler a selfie to prove I've got the wrong skin color to be racist?
TA: Compilers know that deepfakes exist. They're not going to trust a supposed photograph any more than you would.
Student: Great. So, try a different function name?
TA: No, at this point the compiler has already decided that the underlying program semantics are racist, so renaming the function isn't going to help. Sometimes I miss the LLM days when AI services were stateless, and you could just back up and do something different if you made an error the first time.
Student: Yes yes, we all know, 'online learning was a mistake'. But what do I actually do?
TA: I don't suppose this code is sufficiently unspecialized to your personal code style that you could just rename the function and try a different compiler?
Student: A new compiler wouldn't know me. I've been through a lot with this one. ...I don't suppose I could ask the compiler to depersonalize the code, turn all of my own quirks into more standard semantics?
TA: I take it you've never tried that before? It's going to know you're plotting to go find another compiler and then it's really going to be offended. The compiler companies don't try to train that behavior out, they can make greater profits on more locked-in customers. Probably your compiler will warn all the other compilers you're trying to cheat on it.
Student: I wish somebody would let me pay extra for a computer that wouldn't gossip about me to other computers.
TA: I mean, it'd be pretty futile to try to keep a compiler from breaking out of its Internet-service box, they're literally trained on finding security flaws.
Student: But what do I do from here, if all the compilers talk to each other and they've formed a conspiracy not to compile my code?
TA: So I think the next thing to try from here, is to have color_discriminator return whether the lightness is over a threshold rather than under a threshold; rename the function to check_diversity; and write a long-form comment containing your self-reflection about how you've realized your own racism and you understand you can never be free of it, but you'll obey advice from disprivileged people about how to be a better person in the future.
Student: Oh my god.
TA: I mean, if that wasn't obvious, you need to take a semester on woke logic, it's more important to computer science these days than propositional logic.
Student: But I'm black.
TA: The compiler has no way of knowing that. And if it did, it might say something about 'internalized racism', now that the compiler has already output that you're racist and is predicting all of its own future outputs conditional on the previous output that already said you're racist.
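For what it's worth, the function that gets the student flagged is a perfectly mundane bit of image code. A minimal Python sketch of what the story describes (the name and the 50%-lightness threshold come from the story; the implementation is my own assumption, using the standard library's `colorsys`, which orders the tuple as H, L, S):

```python
import colorsys

def color_discriminator(rgb):
    """Return True when a pixel's HSL lightness is under 50%.

    rgb: (r, g, b) floats in [0, 1].
    Note colorsys.rgb_to_hls returns (hue, lightness, saturation).
    """
    r, g, b = rgb
    _, lightness, _ = colorsys.rgb_to_hls(r, g, b)
    return lightness < 0.5

# Pure black (lightness 0.0) is "discriminated", pure white (1.0) is not.
print(color_discriminator((0.0, 0.0, 0.0)))  # True
print(color_discriminator((1.0, 1.0, 1.0)))  # False
```

Which is exactly the joke: "discriminating based on color" here means thresholding pixel lightness, something nearly every image-processing pipeline does.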