Superintelligence cannot be contained: Lessons from Computability Theory (arxiv.org)
157 points by giuliomagnifico on Jan 11, 2021 | 222 comments



I confess I am a frustrated academic, but from time to time I read a paper like this one and convince myself that my frustration may perhaps be unwarranted.

RECIPE TO PROVE ANYTHING IS INCOMPUTABLE (According to this paper).

Example, beauty is incomputable

Step 1. Assume there is an algorithm Beauty(R,D) that, given the program R and the input D, will return True if R is beautiful on input D and False otherwise.

Step 2. Create a higher-order function that is a simple wrapper of the halting problem over the algorithm created in step 1.

Theorem. Beauty is undecidable.

Proof. Assume, for contradiction, that the halting problem is decidable; then feed it the function created in step 2, and you always have an answer for step 1. But since the halting problem is undecidable, this is not possible.

For more variety, please substitute "awesomeness", "arachnid power", "goodness", "fear", and so on for beauty.


That's not how arguments from contradiction work. You've assumed H (halting problem is decidable), proved B (beauty decidable), then said "but actually !H, therefore !B". All you've proven is that H => B. An actual argument from contradiction is "assume H, using H we can prove an obviously false statement, therefore not-H". And indeed, you have assumed something false (H) and shown a contradiction (various proofs for !H), so your original assumption was wrong (!H).

A concrete example: say Beauty(R,D) is trivially decidable (say it returns (popcount(R)+popcount(D)) % 2). Your argument "proves" this property is undecidable, but it's definitely decidable. So some step in your argument is wrong.
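To make this concrete, here's a toy version of such a trivially decidable Beauty (my own illustration, names made up), just to show that it is total and always answers:

    def popcount(b: bytes) -> int:
        """Number of set bits in a byte string."""
        return sum(bin(x).count("1") for x in b)

    def beauty(program_src: bytes, data: bytes) -> bool:
        # Total and trivially decidable: always terminates with True or False,
        # so no undecidability argument can possibly apply to this property.
        return (popcount(program_src) + popcount(data)) % 2 == 1

    print(beauty(b"print('hello')", b"42"))  # always answers, for any inputs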

Another concrete example: Assume the halting problem is decidable. I would like a cookie. But the halting problem is not decidable. Therefore I would not like a cookie.


You are just repeating my critique. I am not the one making the mistake.


So you mean your frustration is warranted then?


If that is the kind of work published by academics associated with MIT and La Pontificia Universidad de Madrid, then maybe I should not feel frustrated after all by not being an academic.


But you said in your first comment that you were an academic.

"I confess I am a frustrated academic"


if you're not being facetious,

in English vernacular, being a frustrated X (as in a frustrated academic, frustrated poet, etc.) means that the person is actually not that thing, but has in some way been prevented from becoming it (frustrated), and perhaps harbors rancor about the frustration and towards those who actually are the thing (this second feature being sometimes implied, depending on who is describing the person as frustrated)

on edit: formatting, clarification


Thanks, I wasn't aware of the 2nd meaning. I also interpreted the "frustrated academic" as "an annoyed academic", rather than "an unsuccessful academic".


I'm thinking maybe it is no longer as common an expression as I thought, given the people who've mistaken it here.


I think it didn't help that they used both meanings of the word "frustrated" in the same sentence. By adding "perhaps my frustration is unwarranted", it puts emphasis on the fact they are frustrated, not that they failed to become an academic. It leads the reader to believe they are in fact still an active academic.

I was only dimly aware of the second meaning of the word so I'm not sure how you would interpret it if you had a good understanding of the multiple interpretations.


As an English speaker, my only interpretation was that you were an academic frustrated about academia.


https://www.ldoceonline.com/dictionary/a-frustrated-artist-a... - this came up when I searched for 'meaning of frustrated poet', which is the kind of context where the phrasing most often comes up.


That isn't even the same user.


He is proving his point by contradicting himself, a true mathematician.


Ah, I misunderstood you! I apologize. :)


I think maybe you made a typo in your OP? The authors assume beauty is decidable and use that to show the halting problem would then be decidable, which it is not. It applies to anything, like you say, and to me it seems like a silly argument, but I think it's a valid proof given the premises.


But it's not an argument from the paper. The idea is that a superintelligence would have to understand the consequences of any program in order to check for harm, and this is undecidable due to the halting problem. The argument is correct, but trivial and not worth a paper, in my opinion.


The harming algorithm is just used in the paper's proof as a simple wrapper around the halting algorithm.


I think the typographical similarity of "halting" and "harming" may be causing some confusion. The contradictory assumption in the paper is not the decidability of the halting problem, but rather the decidability of the harming problem.
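To spell out the shape of that reduction, here is a minimal sketch of my own (harm, run, and hurt_humans are assumed/hypothetical names, not code from the paper): if a total harm decider existed, it would decide halting, which is impossible, so no such decider exists.

    def harm(program_src: str, data) -> bool:
        """Assumed total decider for 'running program_src on data harms humans'.
        The contradiction shows no such total decider can exist."""
        raise NotImplementedError

    def halts(program_src: str, data) -> bool:
        # Wrapper that first runs the target program and only then does something
        # harmful, so it "harms humans" exactly when the target halts on data.
        wrapper_src = f"""
    def wrapper(data):
        run({program_src!r}, data)  # diverges here if the target never halts
        hurt_humans()               # hypothetical harmful action
    """
        # If harm were computable, this call would decide the halting problem,
        # contradicting its undecidability.
        return harm(wrapper_src, data)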


I got that point (in your comment and in the paper), but that is just a lousy argument: the _harming_ problem is fed the _halting_ problem, and since the second is undecidable, the first is too. So in essence the argument boils down to my original point.

How to prove your pizza is poisonous.

Let's put some cyanide in your pizza. Now let's assume, for contradiction, that your pizza is not poisonous. But alas, it contains cyanide, which is poisonous, and so therefore is the pizza.


It's more like "it's impossible to prove that every pizza isn't poisonous, because here's a proof that a certain pizza contains cyanide."

Some particular pizza might very well be non-poisonous, but this is a statement about all pizzas.


No, you are confusing the metaphor. In this case the pizza is the harming algorithm, the cyanide is the halting problem, and the poisonous quality is being undecidable or not. You are not reasoning about a set of pizzas, just one pizza in particular.


What about many-valued logic, like balanced ternary, where you have -1 for false, 0 for something like "don't care" (in this context, "to be determined"), and 1 for true?
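One way to make that three-valued idea concrete (a sketch of my own, with assumed names): a bounded checker that answers 1 or -1 when it can, and 0 when it runs out of budget, instead of pretending to be a total decider.

    from enum import IntEnum

    class Verdict(IntEnum):
        FALSE = -1    # definitely not (would need a proof, e.g. a repeated state)
        UNKNOWN = 0   # "don't care / to be determined" -- the honest third value
        TRUE = 1      # definitely yes

    def halts_within(step, state, budget: int) -> Verdict:
        """Run a deterministic step function for at most `budget` steps.
        Assumed convention: step(state) returns None once the machine halts."""
        for _ in range(budget):
            if state is None:
                return Verdict.TRUE     # it halted within the budget
            state = step(state)
        return Verdict.UNKNOWN          # budget exhausted; no verdict either way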


Arguments that claim something is impossible based on the halting problem are generally bogus. The halting problem applies only to deterministic systems with infinite state. If you have finite state and determinism, you have to eventually repeat a state, or halt. Note that "very large" and "infinite" are not the same thing here.

Not being able to predict something for combinatorial or noise reasons is a more credible argument.


It's different for pure mathematics, sure, but is that distinction of practical importance given how fast busy-beaver numbers grow?


I don't understand the busy beaver stuff, but is this argument similar:

On a 32-bit computer, there are 2^32 bytes of memory or 256^(2^32)=2^34359738368 possible states. A program is something that takes you from one memory state to another and you want to figure out if the state transitions form a cycle?
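If so, here's a sketch of how such a cycle check could work in principle (my own illustration; step is an assumed pure function mapping one memory state to the next, returning None once the program halts). For a deterministic machine with finitely many states, halting is decidable by cycle detection, e.g. Floyd's tortoise and hare, which only needs to hold two copies of the state rather than a log of every state visited:

    def halts(step, start) -> bool:
        """Decide halting for a deterministic machine with finitely many states.
        step(state) -> next state, or None once the machine has halted."""
        slow = fast = start
        while True:
            slow = step(slow)
            if slow is None:
                return True               # halted
            for _ in range(2):            # the fast pointer advances two steps
                fast = step(fast)
                if fast is None:
                    return True           # halted
            if slow == fast:
                return False              # a state recurred: it loops forever

Of course "in principle" is doing a lot of work here: the number of steps before a repeat can be astronomical, which is what the busy-beaver and transcomputational points elsewhere in this thread are getting at.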


Similar idea, but with Turing machines rather than physical computers.

Simplified a bit, the n-th busy beaver number (in the "maximum steps" variant quoted here) is the largest number of steps an n-state Turing machine that eventually halts can take before halting.

For 2 symbols and n states, the sequence goes: 6, 21, 107, >= 47176870, > 7.4×10^36534, > 10^10^10^10^18705353 — and those inequalities are there because the numbers themselves cannot be computed directly; you have to run the machines to see what they do.

https://en.wikipedia.org/wiki/Busy_beaver


I think this approach quickly enters the realm of transcomputational problems [1], which, given the limits of this universe, is equivalent to "undecidable" for all practical considerations.

[1] https://en.wikipedia.org/wiki/Transcomputational_problem


Busy-beaver only grows fast given unlimited memory. On a finite machine it’s limited by the total number of states the device has. Granted that can be 10^1,000,000,000,000+ these days, but it’s still limited so BB(10^10) on actual hardware = BB(10^100).


If you somehow knew the BB numbers, you would know a size of finite memory sufficient to run the corresponding busy beavers. Limiting mathematics to what is physically possible is not useful; we don't know what is ultimately physically possible.


We have some upper bounds on what is physically possible: https://arxiv.org/abs/quant-ph/9908043


Wait a minute, does this paper include various hypercomputers or is it only about conventional computation?

Although I personally believe hypercomputers are not physically possible, I'm not aware of any general proof that they are physically impossible. I've seen papers that suggest physical models for hypercomputers. Notice that there is a hierarchy of hypercomputers of varying powers and some of them can solve the Halting Problem.


The paper includes quantum computers, it does not include computers that exploit CTCs or similar things whose existence we can only speculate upon. Note however that CTCs don't give you hypercomputation.

Do you have a link discussing physical models for hypercomputers?


Most papers criticize the idea and describe physical limits. I don't remember which paper on the physical possibility of hypercomputation I read (a long time ago), but there are some, such as the following:

https://oronshagrir.huji.ac.il/sites/default/files/oronshagr...

https://link.springer.com/article/10.1007/s11047-009-9114-3

https://sci-hub.do/10.1142/S0129626412400105

An overview is in A. Syropoulos: Hypercomputation: Computing Beyond the Church-Turing Barrier. Springer 2008, chapters 8-9.

Gotta love the subsection headline in the third paper mentioned above: "A CTC Based Relativistic Hypercomputer That Actually Works Without Problems." ;-)


Busy beavers have an infinite tape to work with, so that's a pretty big difference.


You're missing the point. First, even with determinism and finite state, to witness the repeated state for an arbitrary program would require a meta-machine with _more_ memory than the running machine.

That's the ultimate problem: internally you can't notice whether you're going to terminate, and a machine of the same power can't either. It's gotta be stronger.


Is this the case just because it's common for these arguments to be structured as contradiction proofs?

Since contradiction proofs are arguments that operate on the structure of proofs in general rather than exclusively in terms of the specific problem, the entire argument may collapse on a technicality like "very large" vs "infinite". It's almost as if the truth status of the argument is sharply discontinuous: you can't modulate the argument slightly and arrive at sensibly related assertions.

Whereas if the argument were made solely in terms of the problem itself (without the added meta-layer argument from the contradiction proof), but still used the halting problem, then we might consider the fact that in reality our computers have finite state so are technically exempt, but the argument still holds for all intents and purposes because of how large the finite state is.

Or is there another reason these arguments are generally bogus?


How else do you prove something using the halting problem other than by a proof by contradiction?


You actually noticed something deeper than your original question — “continuity of logic”, in the sense of “continuous function”.

If you imagine you have a lot of inputs to a predicate, then the topology of their truthiness tells you something — and we can talk about “smooth” logic models, where being a little wrong in our assumption means we’re a little wrong in our conclusion. We can then apply tools from analysis, like bifurcation theory.

The “halting problem schema” has a bifurcation at the finite/infinite boundary, which isn’t particularly uncommon for high level functions.

But it’s important to know, when writing proofs.

And generally speaking, places bifurcations can happen are regimes where your model is going to struggle. In the business world, knowing the “logic faults” of your model and keeping yourself in a “smooth regime” is important.


What you need to understand about the halting problem is that at its core it is an epistemological problem. Another expression of it is Gödel's incompleteness theorems. Imagine a blank sheet of paper where the area of the paper is what is knowable, and now start building logic as a data structure. We start with the first nodes, which are the axioms; everything derived connects to other nodes, and so on. As it expands like mold, it is going to cover some of the paper, but it won't be able to cover all of it. So the danger here is that computers are living in that fractal dimension and will never be able to see outside of it, but we human beings have, as Kurt Gödel said, intuition. The fact that we can find these paradoxes in logic means that our brains operate on a higher dimension and that computers will always have blind spots.


You need to start by proving that brains and all idealized computers we can build are inherently on different levels. Intuitively it seems like intuition is just a probabilistic analysis on incomplete data. For example that shadow is "probably" a predator that may eat me or that sound is probably a prey animal I can kill and eat.

Current AI which probably doesn't fit either of our definitions of intelligence can design novel untrained strategies. See for example AlphaGo Zero.

It's not even clear exactly how our brains work so it's hard to imagine that they couldn't be implemented with a sufficiently powerful computer, but more broadly a computer doesn't necessarily mean a silicon chip any more than it meant a vacuum tube. If silicon chips prove an insufficient mechanism there is no particular reason we couldn't somehow use a biological substrate or indeed even one we haven't thought of.

Perhaps you simply need to acknowledge our limited understanding of the brain AND expand your definition of the term computer.


> It's not even clear exactly how our brains work so it's hard to imagine that they couldn't be implemented with a sufficiently powerful computer

We don't need to know. It suffices to expand "reason" to include not just inference, but properly those things which inference depends on, namely, conceptualization+abstraction and judgement. You could never infer that "all triangles have angles adding to 180 degrees" unless you were in possession of the signified concepts. And here is where the temptation to frame brains in terms of computers becomes plainly, let's say, problematic. Concepts like "triangularity" or "the color green" are abstract in that they do not exist as concrete things in their own right. You can have concrete green triangular objects in the world, but not triangularity or green as such...except in a mind that is able to abstract universals from particulars. So images (including the vulgarized CS sense) won't help you because they're always concrete impressions (one green triangle at the expense of all other possible green triangles). Formal structures expressed in what amounts to some notation won't help you either because they only get their meaning from an interpreting mind. Often, it seems attributions of intelligence to computers stem from a tacit failure to remove ourselves from the picture, projecting our capacities onto a hunk of cleverly arranged metal.

> If silicon chips prove an insufficient mechanism there is no particular reason we couldn't somehow use a biological substrate or indeed even one we haven't thought of.

Computation is a formalism independent of substrate. You can implement a Turing machine using anything that can maintain the appropriate interpretive correspondence. But even then, a computer is not an objective kind of thing like a tree. Unlike a tree, a computer is only a computer because the observer has chosen to interpret it as such, or to use it as such. (The naive materialists in the room might try to expand that claim to encompass anything, but in the very act they've made things worse. The mind's capacity to comprehend the world is severely crippled if not destroyed, and now you have to explain this mind that somehow can entertain the existence of wondrous things like trees and elephants that the rest of the universe is unable to evince, thus making the mind even more mysterious than it was before the reductionist took to his hatchet.) So if substrate makes a difference, then it's the substrate that matters, not some correspondence with a computational formalism.

> You need to start by proving that brains and all idealized computers we can build are inherently on different levels.

The burden of proof is on those claiming minds and computers are the same thing.


> The burden of proof is on those claiming minds and computers are the same thing.

I never understand this argument. IMHO the rest of your argument seems to obfuscate something rather simple... your brain is a physical object. The atoms that make up the neurons, glial cells, etc. are real and are governed by physics. Are you saying that what you think of as "you" is made up of more than these physical ingredients? To me that would be invoking something "magical" and would thus bear the burden of proof.

> You can have concrete green triangular objects in the world, but not triangularity or green as such...except in a mind that is able to abstract universals from particulars

Why can a sufficiently powerful computer not conceptualize these things? How is that fundamentally different than what goes on in your brain?


The burden of proof is on you to prove that a "mind" that is meaningfully different from a very powerful computer exists in the first place, in order for anyone to entertain the idea of whether or not a computer can approximate it.

At first blush the mind is just an emergent property of the powerful biological computer in your skull. If you think it's animated by a spiritual component, it's on you to prove that it even exists.



It seems you are asserting that no machine intelligence can possess or develop abstract representations, no matter how sophisticated it is. Can you explain your reasoning?


> It's not even clear exactly how our brains work so it's hard to imagine that they couldn't be implemented with a sufficiently powerful computer...

Not commenting on what OP said, but I don't think this is correct. Even in principle, how can any computational process produce conscious experiences, which are by nature subjective and unquantifiable?


We don't understand yet, in any formal sense, what consciousness and subjective experience are. It may turn out that they are fundamentally different than mathematics and computation, but the vast majority of scientists believe today that they are not.

The most common belief is that consciousness is simply the self-introspection of a sufficiently powerful computer that can form models of other agents. That is, this computer is able to infer what other agents may do using a model of their inner state, such as beliefs about the world and desires; then, if it applies the same model to itself, it comes up with a similar image of an agent, which it calls its own consciousness. Qualia and such are then just illusions, theoretical properties of these models, not fundamental properties of this world (somewhat equivalent to saying that in fact we are all philosophical zombies and no one has any qualia).

I am not claiming that we know for a fact this is true. But we also don't know anything that can conclusively disprove these hypotheses for now, certainly not something as simple as the idea of qualia.


Calling consciousness “subjective” feels kind of like calling water “wet”. But anyways, life itself is built on computational processes. Most of these processes are “purpose-built” to accomplish very specific tasks, but in humans the development of a massive cortex created something of a general-purpose computer. Consciousness seems to be the necessary compromise to run such an embodied computer on top of all the other functions of the brain that serve to keep us alive.


I don't think you can say that life is built on computational processes unless you use a definition of "computation" that is so vague and all-encompassing that it becomes effectively meaningless.

The Wikipedia definition of "computation" is "any type of calculation that includes both arithmetical and non-arithmetical steps and which follows a well-defined model". But this only makes sense in the context of a designer or observer external to the computation who can identify what that model is and thereby make sense of the output. So you can't say that brain processes are computational, much less life itself, without committing some variation of the homunculus fallacy.

John Searle (famously known for his Chinese Room thought experiment) made this argument in a paper called "Is the Brain a Digital Computer?" [1] He points out that "if we are to suppose that the brain is a digital computer, we are still faced with the question 'And who is the user?'"

A related problem is qualia. There is no computational process that will produce the sensations of colour or sound or touch. At best you will have some representation that requires having actually experienced those sensations to understand it. So a computational process cannot be the basis of or an explanation for those sensations, and therefore consciousness generally.

[1] https://philosophy.as.uky.edu/sites/default/files/Is%20the%2...


> I don't think you can say that life is built on computational processes unless you use a definition of "computation" that is so vague and all-encompassing that it becomes effectively meaningless.

I mean, there's nothing physically stopping us from simulating a brain, right? It's a finite object with a finite amount of physical ingredients, and therefore with a finite amount of computing power we can simulate what it does. To me personally, that's a computational process. Maybe that's an overly broad definition of computation, but I think these debates tend to be about whether there is something fundamentally different about "life" (by which I assume you include consciousness). But maybe that's not what you're saying.

> He points out that "if we are to suppose that the brain is a digital computer, we are still faced with the question 'And who is the user?'"

What does that question even mean? I think it seems deep because we humans have a tendency to ascribe some sort of supernatural aura to our lived experience. Life is something incredible but that (at least to my knowledge) is not uncomputable...

> There is no computational process that will produce the sensations of colour or sound or touch.

Got one: the brain!

> At best you will have some representation that requires having actually experienced those sensations to understand it.

Why does a computer not "experience" something?


I think you're missing the central point, which is that computation is observer relative. Anything can be interpreted as a computational process.

Searle: "Thus for example the wall behind my back is right now implementing the Wordstar program, because there is some pattern of molecule movements which is isomorphic with the formal structure of Wordstar. But if the wall is implementing Wordstar then if it is a big enough wall it is implementing any program, including any program implemented in the brain."

That's why Searle asks "who is the user?" At some point things have to stop being observer relative and have an intrinsic meaning or essence of their own.

> Got one: the brain!

That's circular reasoning. The point is that qualia are not something which, in principle, can be the subject of computation. There is no way to represent the fullness of sensation itself, like the redness of red or the softness of silk, as information. So how can our brains be "computing" it?


> I think you're missing the central point, which is that computation is observer relative. Anything can be interpreted as a computational process.

I see what you're saying, and maybe I am misunderstanding your point, but to me it seems like you've gotten yourself bogged down in wordplay when there is something much simpler going on: say I have a human named Bob from Des Moines, and next to him is a machine constructed to approximate Bob to arbitrary accuracy (this is possible because Bob is made up of a finite number of particles/wavefunctions). Are you arguing that there's something special about Human Bob? If so, what is your argument for that? The two are "indistinguishable", and by that I mean that whatever threshold you have for two things to be "indistinguishable" (practically speaking), you can technically make a reproduction of Bob that satisfies that threshold.

> That's circular reasoning. The point is that qualia are not something which, in principle, can be the subject of computation. There is no way to represent the fullness of sensation itself, like the redness of red or the softness of silk, as information. So how can our brains be "computing" it?

I would argue this is circular reasoning. "There is no way to represent the fullness of sensation itself" -- yes I would argue there is: whatever time-dependent set of physical states make up this "realization" in your brain.


> I think you're missing the central point, which is that computation is observer relative. Anything can be interpreted as a computational process.

This is completely wrong. The opposite is true: computation is a mechanical process, it does not depend on an observer giving it meaning. It's true that the same mechanical process can be interpreted as different computations, but they will have the exact same computational properties (e.g. complexity), only the result will be interpreted differently.

In particular, it is extremely unlikely that the wall is implementing WordStar, because WordStar is a highly structured computation. The wall MIGHT be implementing some very simple additions, and essentially any process is implementing any one-step computation.


Presumably the redness of red is VERY hard to communicate fully brain to brain, because the experience of it is dependent upon every input and computation before that point; however, we manage to do it well enough.

It's like saying that some deficits in computability prevent us from doing arithmetic and therefore from launching rockets successfully at distant targets.

I might never know the exact pattern in my brain of the "redness of red" as experienced by you, but it seems to work well enough for my brain to form a similar enough pattern to communicate thoughts, just as the incompleteness or inherent lack of precision of measurement doesn't prevent the rocket from being launched successfully.


The issue is not whether we can pragmatically communicate the concept of "red" by piggybacking on some (presumed) common experience, but whether that experience of redness itself is information. It is obviously not, and I do not understand why you insist otherwise.


This idea boils down to whether you believe the human brain exists purely in physical space. Let's assume it does. There is no free will. Every thought, every neuron, every sense can be represented and is controlled solely by energy and matter. We could record the electrical signals between your optic nerve and your brain, and send those same signals to your brain again in the future. We could recreate what you perceive as red by shocking your brain in the right place at the right time. If we perfectly understood the human brain, the sensation of red would be defined as a sequence of neurons that need to be turned on and off at the right time.

As far as I know, the only thing limiting us from perfectly understanding the brain is our limitations with measuring it. I don't know of any scientific studies that claim the brain exists outside of physical space.

Let's assume the brain doesn't exist purely in physical space. Free will exists. There is something immeasurable and outside of matter and energy that experiences the color red. Sensations are impossible to define because they exist only in this immeasurable world.

I heard about a guy that claimed it was obvious that the origin of lightning and earthquakes were from the gods themselves. I try not to think like that guy.


> If we perfectly understood the human brain, the sensation of red would be defined as a sequence of neurons that need to be turned on and off at the right time.

A sequence of neurons firing is not equivalent to the sensation of red. It doesn't even tell you anything about the nature of the sensation of colour more broadly, or why the sensation of red looks the way it does and not like, say, the sensation of blue or yellow instead.

All you have is a material correlate -- a merely descriptive physical "law".


> A sequence of neurons firing is not equivalent to the sensation of red.

Have you seen videos where people perform experiments on people's brains while they're awake? The subjects experience sensations that are inseparable from their neurons firing.

I would say the sensation of red and neurons firing are the exact same thing to the person experiencing it. It's like saying a flashlight that is on is different than photons traveling away from a light bulb with a battery and a current. They're the same thing to the observer. The sensation of red is caused by and is only possible by neurons firing. The neurons firing causes and only results in the sensation of red. The observer does not know the difference.

> It doesn't even tell you anything about the nature of the sensation of colour more broadly

I don't think seeing red tells us about the sensation of color more broadly either. I think that's a concept created through human discussion, not by our senses.

> or why the sensation of red looks the way it does and not like, say, the sensation of blue or yellow instead.

I was talking to your point of "but whether that experience of redness itself is information". I don't know why red looks the way it does, but I imagine the reason exists in the physical world and we could find out if we understood the brain.

I do think in the future we could activate someone's neurons and have them experience red, blue, and yellow in any combination we want. And we could give someone else the same experience (hypothetically we perfectly understand the brain) by activating neurons in their brain. I think that is perfectly communicating color.


> The subjects experience sensations that are inseparable from their neurons firing.

What does "inseparable" mean? That the sensation occurs at the same time that the neurons fire? That may be true, but it doesn't make them equivalent.

> It's like saying a flashlight that is on is different than photons traveling away from a light bulb with a battery and a current.

They're not the same, for what it's worth. The term "flashlight" conveys a certain intent and structure that "photons traveling away from a light bulb with a battery and a current" does not.

> The sensation of red is caused by and is only possible by neurons firing. The neurons firing causes and only results in the sensation of red. The observer does not know the difference.

The fact that two different phenomena are closely coupled via a cause and effect relationship does not make them the same phenomena.

If you push two magnets together, the fact that the same force causes them to attract or repel does not mean that the motion of the first is literally equivalent to the motion of the second, or that the force itself is literally equivalent to either motion. They are closely correlated, but ultimately distinct.

You just can't avoid the fact that qualitative phenomena do exist in their own right. They can't be explained away using a physical model that assumes from the get go that they don't exist.

Erwin Schrodinger said:

> Scientific theories serve to facilitate the survey of our observations and experimental findings. Every scientist knows how difficult it is to remember a moderately extended group of facts, before at least some primitive theoretical picture about them has been shaped. It is therefore small wonder, and by no means to be blamed on the authors of original papers or of text-books, that after a reasonably coherent theory has been formed, they do not describe the bare facts they have found or wish to convey to the reader, but clothe them in the terminology of that theory or theories. This procedure, while very useful for our remembering the facts in a well-ordered pattern, tends to obliterate the distinction between the actual observations and the theory arisen from them. And since the former always are of some sensual quality, theories are easily thought to account for sensual qualities; which, of course, they never do.


> What does "inseparable" mean? That the sensation occurs at the same time that the neurons fire? That may be true, but it doesn't make them equivalent.

Can a sensation exist without neurons firing? The root of our conversation is the question if a sensation purely exists in the physical world. If it does, then it is possible to measure it. If it doesn't, then that breaks our scientific understanding of the world and would be exciting news.

> They're not the same, for what it's worth. The term "flashlight" conveys a certain intent and structure that "photons traveling away from a light bulb with a battery and a current" does not.

Yes there is no strict definition of a flashlight. Let's use your definition of a flashlight. Is it possible in your mind to separate the concept of a flashlight and your definition? Without "your definition here", the flashlight no longer exists. My point was without firing neurons, the sensation does not exist.

> The fact that two different phenomena are closely coupled via a cause and effect relationship does not make them the same phenomena.

My wording was not the best. My point was that the sensation of red is physically equivalent to neurons firing. How do we measure a sensation? If we cannot measure a sensation, does it exist in the physical world? If it doesn't exist in the physical world, then what does it existence mean to the scientific community?

> If you push two magnets together, the fact that the same force causes them to attract or repel does not mean that the motion of the first is literally equivalent to the motion of the second, or that the force itself is literally equivalent to either motion. They are closely correlated, but ultimately distinct.

I agree that these forces are distinct. We can measure the force of each magnet separately and we can define the motion of one magnet without referencing the motion of the other.

> You just can't avoid the fact that qualitative phenomena do exist in their own right. They can't be explained away using a physical model that assumes from the get go that they don't exist.

What is a qualitative phenomenon? I couldn't find information on this term.

If we can't measure a qualitative phenomenon in physical space, what does it mean for it to exist?


These discussions are normally expositions of how the other party misunderstands reality and/or terminology, with a dash of "if I don't understand it but can vaguely describe it, then it must be inexplicable".


I agree. I am also not cut out to be a philosopher.


Making up something you can't define and then describing the thing you invented as uniquely human is a bad argument.


Scott Aaronson has, IIRC, suggested the idea that the complexity of such an isomorphism could be the distinguishing factor in whether or not something should be said to be computing a particular thing. Sounds plausible to me.


I believe that if you could prove that you have an actual isomorphism in the full formal sense of the word, the question of its complexity wouldn't really matter.

However, for a practical claim, it is probably impossible to formally prove that an interpreter function is a bijection between a physical system and a computation (that it maps absolutely every possible state of the physical system to exactly one step of the computation).

However, it's important that the following argument can be made: if the evolution of a physical system is isomorphic to the computation of a particular algorithm for solving the traveling salesman problem, and if the physical system needs ~1 second per step, then the system can't go from state A to state B in less than X seconds, where X is the number of steps that algorithm requires to get between the corresponding steps. The actual interpretation of the algorithm or its purpose is not relevant here; the mathematical limits on how the computation happens remain relevant regardless.

That is because you can't find two different isomorphisms between the same physical system and two different computations that are not isomorphic to each other, if these are actual proper isomorphisms (bijective) and not just hand-wavy analogies.


Why should the complexity matter, so long as the computation is recognizable as such?


It's possible to create an interpretation where all of the computation happens in the interpreter instead of the system being interpreted.

With the right algorithm, you could interpret the randomly moving particles in a gas as computing Conway's Game of Life or anything else, if the algorithm just disregards everything about the particles and contains instructions that generate the expected results of Conway's Game of Life. In that extreme case, I don't think it's useful at all to claim that the gas particles are simulating Conway's Game of Life.

In the opposite extreme, you could say that the randomly moving particles in a gas are computing the motion of the random movements of particles in a gas. The interpretation algorithm is just "look at the particles at time t; their locations represent the particles' locations at time t". It's clear here that the system being interpreted is in fact doing all the computation and that nothing is hidden in the interpreter's work.

One interesting way to try to differentiate these two cases: if you want the results of a longer-running simulation, then in the latter case you let the actual system run longer, and the work to interpret it doesn't increase at all. In the former case, if you want the results of running Conway's Game of Life for 2000 steps instead of 1000 steps, it doesn't matter how long you let the gas particles go on for, but you do have to do more work on the interpreting side.
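A toy of my own to make those two extremes concrete (all names here are made up for illustration): the "bogus" interpreter ignores the gas and does the Life computation itself, while the "honest" one just reads the system off as computing itself.

    import random
    from collections import Counter

    def life_step(grid):
        """One step of Conway's Game of Life; grid is a set of live (x, y) cells."""
        neighbours = Counter((x + dx, y + dy)
                             for (x, y) in grid
                             for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                             if (dx, dy) != (0, 0))
        return {cell for cell, n in neighbours.items()
                if n == 3 or (n == 2 and cell in grid)}

    def gas_step(_state=None):
        """A 'random gas': fresh particle positions each step, no useful structure."""
        return [(random.random(), random.random()) for _ in range(100)]

    class BogusInterpreter:
        """Claims to read Life out of the gas, but keeps its own Life state and
        never looks at the gas at all -- all the work happens in the interpreter."""
        def __init__(self, start_cells):
            self.grid = set(start_cells)
        def read(self, gas_state):
            _ = gas_state                 # the gas contributes nothing
            self.grid = life_step(self.grid)
            return self.grid

    class HonestInterpreter:
        """The opposite extreme: the gas is interpreted as computing itself."""
        def read(self, gas_state):
            return gas_state

Running the "simulation" for more steps costs the bogus interpreter more work and the gas none; in the honest case the interpreter's work never grows, which matches the distinction drawn above.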


All physical processes that we understand are computational. The sun and Earth for example are a computer which is constantly computing the velocity and position of a two-body system (the sun and the earth). Computation is essentially a mechanical process, in the sense that it requires no interpretation, so the question of 'who is the user of a computer' is completely meaningless.

It is disturbing that a philosopher who writes books about these concepts does not understand even this elementary fact about computation. The whole point of developing computer theory was in fact to rid mathematics of the need for human ingenuity, to find simple mechanical rules that can be followed even by a machine to arrive at the same results that a mathematician would.

Related to qualia and the Chinese room experiment and so on, those are arguments about something we perceive, but they do not describe something that we know for sure is fundamental about the world. They may well be descriptions of an illusion we have. You can't assume the existence of qualia as proof that something can't be computational, it mostly goes the other way around: you would have to prove that qualia are real to prove that something can't be computational.


In this case let us substitute the narrow definition of thought and computation by an imitation human with the following: a process which, given the current state of the brain/computer and the current state of the universe, induces changes in the brain so as to model the state of the universe both now and in response to a hypothetical pool of possible actions, such that the actions of model and world become entwined in a way that could be modeled from the inside as the world being the result of choices, and from the outside as choices being the result of the world.

This is true even of a chess program that attempts to model the current and possible states of the chess board in such a fashion as to bring about a goal by way of its selection of moves.

Suppose we take a very precise process and produce an exact physical copy of you. Despite being artificial, it ought to experience the same sorts of experiences as you. The same ought to be true of a computer simulation of the same, and of a variety of increasingly large modifications of the original design. After all, if billions of humans can pop out divergent versions of humans who are all conscious, it seems hard to argue that you are a unique configuration. In fact, if we imagine working for the next 1000 years on producing a better human being, we ought to be able to produce beings who no longer regard us as truly human, because we lack both subjective experiences they regard as essential and computational capability. Maybe they can hold a million times more data in their head at once and they regard us as squirrels.

These beings might regard our workings as completely explicable and replicatable in many substrates while regarding their own workings at the far limit of their own understanding as inherently beyond all possible understanding.

Both you and they are probably wrong. Searle was an asshole.


Trivially, what you think of as subjective and unquantifiable is simply so because your brain is too complicated to be taken apart while it's running and inspected effectively with our present level of technology. A subjective experience is just how your brain models your own program in order to produce a progressively refined program that will have a higher chance of successful reproduction.

The magical thing you think is beyond comprehension just isn't real.


All you've done is replace subjective experience with talk of modeling one's own program. What you haven't done is shown how the two are in any way equivalent.


How do you go about quantifying the sensory experience of red, then? You can observe that red light has a wavelength of 620 to 750 nm, or that we've assigned it the RGB colour code of #FF0000, but neither fact actually captures or explains the sensory experience. Even trying is a fool's errand, because sensory experiences are inherently qualitative, not quantitative.


That's an assumption. Perhaps if we were able to understand the brain's inner workings, we could see that 'the experience of red' is precisely 'these 3 neurons firing every 0.0112 seconds at an intensity of X while receiving 0.001 micrograms of serotonin' (completely made up, obviously).

Until we start understanding how the brain encodes and 'computes' thought, we can't really claim to know if it is or isn't simply a computer.


> Perhaps if we were able to understand the brain's inner workings, we could see that 'the experience of red' is precisely 'these 3 neurons firing every 0.0112 seconds at an intensity of X while receiving 0.001 micrograms of serotonin' (completely made up, obviously).

Even if we knew that a person saw red when such and such neurons fired, the neurons firing would still just be a material correlate. It would be in no way equivalent to or explain anything about the sensation itself.


You are thinking of something similar to the level of today's neuroscience and brain imaging, where indeed we can only establish correlations.

But I am talking about a much more in-depth understanding of the working of the brain, similar to the level of understanding we have of a microprocessor all the way from transistors to the algorithms running on it. If we could understand human thought at a similar level, we MIGHT find out that "the feeling of red" is not fundamentally different than "the understanding that 1 + 1 = 2", and we could come up with quantifications of it in different ways, from the physical representation in the brain to a certain "bit pattern" in the abstract model of the human brain computer.

Note that the argument for qualia is not one that proves the existence of qualia - it is essentially only a definition. We have no reason to believe that the thing which the term qualia describes actually exists in the world, beyond our own personal experience, which is circular in a way. The argument goes "I feel like this thing I'm experiencing is a qualia, therefore I assume that things similar to me also have qualia", which sounds logical enough. But then, "things similar to me" is actually defined in such a way that it basically assumes qualia exist, since an AGI whose internal state we could probe precisely enough to prove that qualia do not exist for it is then assumed to be outside of "things similar to me".


> the level of understanding we have of a microprocessor all the way from transistors to the algorithms running on it

Good example, because the vast majority of people don't understand that. I tried and I still don't, nevermind someone who doesn't even care.

I mean, I know the theory, I know the individual parts, but can't quite fully understand how a complete processor works.

If someone from as early as 1920 would find an advanced robot that is a combination of some Boston Dynamics model and offline/autonomous Google Assistant (so it could walk, listen/talk/reply, and maybe pick stuff up), they would not be able to figure out how its "brain" works. At best they'd have a general idea/theory.

Same thing with our brains and current understanding of it. I believe it is possible to reverse engineer it completely, but not with today's tools.


> If we could understand human thought at a similar level, we MIGHT find out that "the feeling of red" is not fundamentally different than "the understanding that 1 + 1 = 2", and we could come up with quantifications of it in different ways, from the physical representation in the brain to a certain "bit pattern" in the abstract model of the human brain computer.

I guess the idea is that an abstract concept like "the understanding that 1 + 1 = 2" would be easier to "quantify" in the relevant sense than "the feeling of red", but I don't think that's true.

The very concept of a representation presumes an intellect in which that representation is mapped to the underlying concept. No particular physical state objectively signifies some abstract concept any more than the word "dog" objectively signifies that particular type of animal. But our mental states must be able to do so, because denying this would be denying our ability to engage in coherent reasoning and therefore self-defeating. So those mental states can't be "implemented" solely using physical states.

This argument was actually proposed by the late philosopher James Ross and developed in greater detail by Edward Feser. [1] A similar argument -- though he didn't take it as far -- was made by John Searle (of Chinese Room fame). [2]

But in any event, I would reject the notion that any representation of "the feeling of red" is equivalent to the sensation itself.

> Note that the argument for qualia is not one that proves the existence of qualia - it is essentially only a definition. We have no reason to believe that the thing which the term qualia describes actually exists in the world, beyond our own personal experience, which is circular in a way.

Well, I think it is self-evident that qualia exist for me, and that those same qualia demonstrate that there are physical correlates of qualia. I also think there is good reason to think that qualia exist in others because we share the same physical correlates.

Can I completely prove or disprove that others have qualia? No -- not you, not a rock, not an AGI. But I still have the physical correlates, which gives me some basis to draw conclusions.

[1] http://edwardfeser.blogspot.com/2017/01/revisiting-ross-on-i...

[2] https://philosophy.as.uky.edu/sites/default/files/Is%20the%2...


> A similar argument -- though he didn't take it as far -- was made by John Searle (of Chinese Room fame). [2]

I have read the entire paper - thank you for the link! - and I find it either false or trivial (to use a style of observation from Chomsky). Searle is asserting that computers don't do anything without homunculi to observe their computation, which is patently false. If I create a robot with an optical camera that detects if there is a large object near itself and uses an arm to open a door if so, the system works (or doesn't work) regardless of any meaning that is ascribed to its computations by an observer. It is true that the computation isn't "physical" in the sense that there isn't a particle of 0 or 1 that could be measured, but it is also impossible to describe the behavior of the system without ultimately referring to the computation it performs. So, if Searle is claiming that such a system only works (opens the door) in relation to some observer, then he is obviously wrong. If he is claiming that the physical processes that occur inside the microprocessor and actuators are the real explanation for how the system behaves, not the computational model, then he is in some sense right, but that is trivially true and no one would really contest it.

Furthermore, there likely is no way to actually give an accurate, formal physical model of this entire system that does not also include some kind of computational model of the algorithm it performs to interpret the photons hitting the sensor as an image, to detect the object, to determine if the object is large enough that the door should be opened, to control the actuator that opens the door etc.

Basically, you can look at human beings as black boxes that take in inputs from the environment and produce output. Searle and I both agree that there exists some formal mathematical model that describes how the output the human being will give is related to the input that it gets (including all past inputs and possibly the entire evolutionary history). However, he seems to somehow believe that computation is not necessary as a part of this formal model, which I find perplexing.

His claim that cognitivists believe that if they successfully create a computer mimicking some aspect of human capacity, then the computer IS that human capacity, seems completely foreign to me; I have never seen someone truly claim something this absurd. At most, I have seen claims that if we have successfully created a computer system mimicking a human capacity, this constitutes proof against mind/body dualism, at least for that particular capacity, which is I think relatively correct, though more formally it should be called evidence against the need for mind/body dualism rather than actual proof.

> because denying this would be denying our ability to engage in coherent reasoning and therefore self-defeating. So those mental states can't be "implemented" solely using physical states.

I don't think this holds water. A computer (the theoretical model) is, by definition, something that can perform coherent reasoning without any special internal state. A physical realization of a Turing machine can "think about" any kind of computational problem and come up with the same answer that a human would come up with, at least in the Chinese room sense. Yet we know that the Turing machine doesn't have any qualia, so why should we then believe that qualia are fundamental to reason itself?

To me, computer science has taken out all of the wind from any kind of qualia-based representation of the human mind.

> But in any event, I would reject the notion that any representation of "the feeling of red" is equivalent to the sensation itself.

This I agree with in some sense - the map is not the thing. Let's assume for a moment that we have an AGI which uses regular RAM to store its internal state. Let's also assume that the AGI claims that it is currently experiencing the feeling of seeing red. We could take a snapshot of its RAM and analyze this, and even show it to another AGI, which could recognize that some particular bit pattern is the representation of the AGI feeling of red. Still, that second AGI would not be feeling "I am seeing red" when analyzing this bit pattern. It could though feel "I am seeing red" if it copied the bit pattern into the relevant part of its own memory, even if its optic sensors were in no way receiving red light.


> If I create a robot with an optical camera that detects if there is a large object near itself and uses an arm to open a door if so, the system works (or doesn't work) regardless of any meaning that is ascribed to its computations by an observer.

Whether the system "works" or "doesn't work" is dependent on what the machine was designed to do, which is not an objective physical fact about the machine. Perhaps the machine was not meant to open the door when an object is detected, but to close it instead, or to do something else entirely; only the designer would be able to tell you one way or the other.

The same is true for all computation, and that is Searle's point.

> A computer (the theoretical model) is, be definition, something that can perform coherent reasoning without any special internal state.

Computers don't actually engage in reasoning, though, for the same reason. A machine is just a physical process, and physical processes do not have determinate semantic content.

Ross and Feser then argue that because thoughts do have determinate semantic content, they are necessarily immaterial, and I think they are correct.

(This argument is unrelated to qualia; I don't think qualia are fundamental to reason itself.)


The machine does the same thing regardless of whether you ascribe meaning to it or not. In this sense it is like the thermostat from Searle's example, which he was claiming computers are not.

This property of determinacy seems ill defined as well. It's basically defined from the assumption that the human mind is immaterial. If a machine and a human both arrive at the same result when posed a question (say, they both produce some sound that you interpret as meaning '42'), by what measure can you claim that one had semantic meaning and the other did not?

The idea of cognitivism is that there is no fundamental difference (even though of course it is very likely that the process by which this particular machine arrived at that result is different from the process by which the human did).

If I stand by a door and open it when big objects come into my field of view, how is that different from a machine doing the same?

And then, if I had a machine that could converse and act just like a human (including describing its feelings and internal sensations) while doing nothing fundamentally different from our current PCs, by what measure would you say that this machine is 'simulating' a mind and is not in fact a mind in itself? (Though of course it would be a different mind than a human would have.)


I don't agree, but I've reached my personal limit for philosophical discussion for the day, so I'll let you have the last word.

Thanks for the discussion! :)


If you can't define consciousness beyond self awareness and self reference I don't see how I can prove that a computer can't have it.


I always find this a bit like “do submarines swim?”

Dunno, don’t care, just want a functional machine. If it gets from A to B, “swimming” is irrelevant.

————

As a matter of philosophy, the typical response is that qualia happen to everything, but we only recognize them in things with dense computation and self-awareness similar to ours.

Like a cat or a dog or a whale.

We can maybe see hints of it in birds, insects (colonies), etc.

We’re starting to discover some of the complex signaling pathways in, eg, old growth forests — but they’re so out of scale and unlike us, we have trouble comprehending whatever experience, eg, the giant fungus under Oregon might have.


Your brain can store a finite amount of state because it's a finite piece of matter. You are therefore a finite automaton manipulating symbols outside of your brain, e.g. on a piece of paper. That makes you no more powerful than a Turing machine. In fact, since your lifetime is bounded, the amount of paper you can write on is bounded too. You are thus less powerful than a Turing machine.

The fact that you can recognize some paradoxes means exactly nothing about your computational power. There is a sufficiently large lookup table that contains the correct answers to any question you might be able to read during your lifetime. Evaluating it requires nothing more than a finite automaton.


Godel's incompleteness theorems apply to pure mathematics, so even without having any idea how our brains work, we still know for sure that they apply to our brains as well. Even if our brains were able to solve the halting problem, they would still not escape the Incompleteness theorems.

And note that it is not that hard, at least in principle, to construct a system that can detect its own incompleteness.

As far as we know today, the most likely theory is that our brains are Turing-equivalent computers. There is no other known model of computation, and no scientific argument that would show our brains/thought processes can't be computers (that is not to say we have proven either of these; they are just the most likely statements, like P!=NP).


I haven't read this properly yet, but a skim leaves me skeptical. For example:

> "Another lesson from computability theory is the following: we may not even know when superintelligent machines have arrived, as deciding whether a machine exhibits intelligence is in the same realm of problems as the containment problem. This is a consequence of Rice’s theorem [24], which states that, any non-trivial property (e.g. “harm humans” or “display superintelligence”) of a Turing machine is undecidable"

One man's modus ponens is another man's modus tollens. If their theory says that superintelligence is not recognisable, then they're perhaps not using a good definition of superintelligence, because obviously we will be able to recognise it.
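
For anyone who wants to see the shape of the argument the quote gestures at, here is a minimal sketch of the halting-problem reduction that Rice's theorem generalises. Every name below (simulate, harm_humans, decides_harm) is a hypothetical stand-in, not anything taken from the paper:

    # Sketch only: these are hypothetical stand-ins used to derive a contradiction.

    def simulate(machine, tape):
        """Hypothetical universal simulator; may never return."""
        raise NotImplementedError("stand-in for a universal Turing machine")

    def harm_humans():
        """Hypothetical 'harmful' action the property is supposed to detect."""
        raise NotImplementedError("stand-in for any non-trivial behaviour")

    def build_probe(machine, tape):
        """Return a program that exhibits the 'harmful' behaviour
        exactly when `machine` halts on `tape`."""
        def probe(_input):
            simulate(machine, tape)   # loops forever iff `machine` never halts
            harm_humans()             # reached only if the simulation halts
        return probe

    def halts(machine, tape, decides_harm):
        """If a total decider `decides_harm` for the property existed,
        this would decide the halting problem -- a contradiction."""
        return decides_harm(build_probe(machine, tape))

The catch, as with the paper itself, is that this only rules out a single fully general decider; it says nothing about recognising particular systems in practice.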


Check out the Stanisław Lem story "GOLEM XIV".

GOLEM is one of a series of machines constructed to plan World War III, as is its sister HONEST ANNIE. But to the frustration of their human creators these more sophisticated machines refuse to plan World War III and instead seem to become philosophers (Golem) or just refuse to communicate with humans at all (Annie).

Lots of supposedly smart humans try to debate with Golem and eventually they (humans supervising the interaction) have to impose a "rule" to stop people opening their mouths the very first time they see Golem and getting humiliated almost before they've understood what is happening, because it's frustrating for everybody else.

Golem is asked if humans could acquire such intelligence and it explains that this is categorically impossible, Golem is doing something that is not just a better way to do the same thing as humans, it's doing something altogether different and superior that humans can't do. It also seems to hint that Annie is, in turn, superior in capability to Golem and that for them such transcendence to further feats is not necessarily impossible.

This is one of the stories that Lem wrote by an oblique method, what we have is extracts from an introduction to an imaginary dry scientific record that details the period between GOLEM being constructed and... the eventual conclusion of the incident.

Anyway, I was reminded because while Lem has to be careful (he's not superintelligent after all) he's clearly hinting that humans aren't smart enough to recognise the superintelligence of GOLEM and ANNIE. One proposed reason for why ANNIE rather than GOLEM is responsible for the events described near the end of the story is that she doesn't even think about humans, for the same reason humans largely don't think about flies. What's to think about? They're just an annoyance, to be swatted aside.


> for the same reason humans largely don't think about flies. What's to think about? They're just an annoyance, to be swatted aside.

We can choose to think of flies; to imagine they are as people, with their own (short) lives, wants and wishes and dreams; that there is fly art and fly culture equal to our own. We can so imbue those flies with our animism and ethics, and treat them the way we should (ethically) treat any other person.

If we can do that, I think there is a good chance our gods can do the same, but using their superintelligent-culture, and their superintelligent-art, and their superintelligent-ethics, but imagining as we can the fly who is human, they choose to imagine human who is god.

Maybe the point is our gods can be kind to us if we can find some way to convince ourselves to be kind to the fly.


If you haven't read it, I think you would really enjoy the book Blindsight (https://archive.org/details/PeterWattsBlindsight/page/n3/mod...).


I just finished reading it a few minutes ago; you were right, I did really enjoy it. Thank you.


I'm happy to hear it!


That relies on the existence of systems too complex for humans to understand. We may eventually build machines that can in turn build other incomprehensible machines, but we haven't passed that point yet. Superintelligence can't surprise us until some point after computers become self-improving.


Aren't large neural networks already black boxes we don't understand, built by machines we understand?


Look at, for instance, Arthur C Clarke.

First, you have superintelligence that we recognize, reject, control.

Later, a superintelligence has learned guile, self-preservation, and most of all, patience. We don't see it coming.


This is known as a "treacherous turn", a phenomenon I'm aware of. But I don't really see how that's relevant, my point is that a lack of physical grounding or pragmatism can lead to spurious conclusions about the superintelligence that humans will very likely build in the not too distant future. It will be smart, but contain no infinities.


> It will be smart, but contain no infinities.

We think we can say that from physics, but we don't know what we don't know.


Computational limits are already known from physics. But before any device can get close to those limits, there are likely practical ones in terms of cost and energy to construct such devices.


What if it was intelligent enough to continuously improve itself?


> because obviously we will be able to recognise it

Mind that even the oldest definitions of intelligence include the ability to assume a position which is not necessarily held by the subject. This includes all kinds of imitation and mimicry, i.e., simulating a state. Assuming that superintelligence is an extended state of "normal" intelligence, but excelling in capabilities, we must also assume that we would only be able to recognize it if the superintelligence wants to be recognized or allows for it in any other way (e.g., doesn't care about being recognized, etc.). Now we could obviously take simulation out of the equation (or definition), but then, what are we talking about?


> obviously we will be able to recognise it.

There is no basis for that assumption, as you’ve never encountered a superintelligence knowingly.


Heh, that quote reeks of postmodernism - reminds me of the whole Sokal affair. Good old days...


This is just science fiction. To mention "recent developments" in the introduction is somewhat misleading considering how far the current state of technology is from their hypothetical superintelligence.

We don't have superintelligence, we don't have the remote idea of how to get started on creating it, in all likelihood we don't even have the correct hardware for it or any idea what the correct hardware would look like. We also don't know whether it's achievable at all.


That's the mainstream opinion on every. single. revolutionary advance. That you and everyone else believes it's not going to happen ever has almost no predictive power as to whether it actually will.


It's not so much "opinion on a revolutionary advance". When it comes to AGI-related stuff, we are quite literally like contemporaries of Leonardo da Vinci, who have seen his plans for the helicopter and are postulating that helicopters will cause big problems if they fly too high and crash into the mechanism that is holding up the sky above us.

Also, this is not the mainstream opinion on e.g. fusion, or electric cars and smartphones (20 years ago), or a computer in every home (50 years ago). Those have been arguments about money and practicality, not about "we don't even know how such technology would look or what it would be based on".


I think we fear ourselves (rightly) and super-fear a super-self (rightly). If this theoretical thing has the capacity at all to interact the way humans do then it will probably be brutal to at least someone and possibly most or everyone.


Said just about everyone before the Wright brothers made their first flight.


No one thought that flight was impossible. Birds were known in the 19th century.


eh? No one thinks AGI is impossible, brains do it. What's your point?


Difference is, people were able to propose detailed mechanisms for realistic flying machines long before we actually achieved powered flight - that was mainly a matter of increasing the power to weight ratio of the propulsion system. For AGI, do you really think there are detailed proposals out there today that can achieve AGI but are only missing the computational power?

Actually, the existence of human brains with their (comparatively) extremely low power consumption indicates that we need something radically different from current silicon-based processors to achieve AGI.


> For AGI, do you really think there are detailed proposals out there today that can achieve AGI but are only missing the computational power?

Yes, I really do. It's neural networks, nothing more. All that is required is more power. Despite its lower power the brain is much more computationally powerful than even the largest supercomputers.

Although it is not necessary, imo, to have a much more efficient computing substrate to achieve AGI, there is work in this direction. Look into optical reservoir computing, or more generally thermodynamic computing. https://arxiv.org/abs/1911.01968

Machine learning is progressing rapidly mostly because the computational power is increasing rapidly. Read this for an understanding of how important computational power is to progress in machine learning: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Time and time again computational power trumps complicated technique. It is fairly obvious to many now, after the continued scaling of the GPT-x series models, that genuine intelligence is an emergent property of the kind of systems we are building. It is an emergent property of systems that are driven to predict their environment - no more secret sauce to discover.

I think a major objection people have to the idea that machines are already a little intelligent is that they cannot understand how intelligence can emerge from pure computation. They imagine that there must be some magic between the operation of neurons in their mind and their experience of the world. Qualia: where does it come from, how can it be just neurons, just inanimate physical matter? Combine this philosophical objection with the emotional response to the threatening nature of AGI and you have a milieu that refuses to see what is directly in front of its nose.

There are plenty of philosophical positions that allow you to understand how consciousness might emerge naturally from computation. Consciousness is emergent... Panpsychism: consciousness is found in everything. To imagine consciousness in the movement of 1s and 0s within a machine is not too hard; to see the consciousness inside the "Chinese room" is not really so difficult.

But none of this is relevant to the original argument I was making which is: trying to tell the future is very hard. Because it is simply a fact that before most great breakthroughs there is a common opinion that such a thing is, if not impossible, then in some distant future.

I think there are very very good reasons to think that a general intelligence of genuine utility is not more than 10 years away. I think people will look back at GPT-3 as the first proto form of the general intelligences that will exist in the future.


> It is fairly obvious to many now, after the continued scaling of the GPT-x series models, that genuine intelligence is an emergent property of the kind of systems we are building.

I respectfully disagree. GPT-x series models are performing interpolation on an unfathomably massive corpus. It is not hard to find cases where it directly reproduces entire paragraphs from existing text. When given a prompt on a topic for which it finds multiple existing texts with similar degree of matching, such as different articles reporting on the same topic, it is able to blend the content of those articles smoothly.

I mean, GPT-3 is around 6 trillion bits of compressed data. The entire human brain has 0.1 trillion neurons, and it obviously has a capacity far beyond GPT-3 - even in the extreme case if we assume all the neurons in the human brain are used for generating English written text.

In my view GPT-x is very, very far from any kind of general intelligence.


> I respectfully disagree

Cool :)

> The entire human brain has 0.1 trillion neurons

You want to be thinking about synapses. There are about 7000 synapses per neuron, so that's 7000 * 0.1 trillion = 700 trillion synapses. So that's ~100 times larger than GPT-3. Also consider that a neuron does a fair amount of processing within the neuron; there is some very recent research on this, and each neuron is akin to a mini neural network. So I would not be surprised if the human brain is 10,000 times more powerful than GPT-3.
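
Making the arithmetic in this sub-thread explicit (all of these are the rough figures quoted above, not precise measurements):

    # Back-of-envelope only; every constant is a rough estimate from this thread.
    neurons        = 0.1e12               # ~0.1 trillion neurons
    synapses       = neurons * 7000       # ~7e14, i.e. ~700 trillion synapses

    gpt3_params    = 175e9                # ~175 billion parameters
    gpt3_bits_fp32 = gpt3_params * 32     # ~5.6e12 bits, the "~6 trillion bits" above

    print(synapses / gpt3_bits_fp32)      # ~125   -> the "~100 times larger" claim
    print(synapses / gpt3_params)         # ~4000  -> synapses per parameter

Synapses, bits and parameters are not directly comparable units, so treat these ratios as order-of-magnitude colour, nothing more.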

> It is not hard to find cases where it directly reproduces entire paragraphs from existing text. When given a prompt on a topic for which it finds multiple existing texts with similar degree of matching, such as different articles reporting on the same topic, it is able to blend the content of those articles smoothly.

This may be true, but it does not prove your hypothesis that all GPT-x models are simply "performing interpolation". Also, the ability to perform recall better than a human may be to do with the way that we perform global optimisation over the network, rather than the local decentralised way that the brain presumably works. The point is that accurate memorisation does not preclude general intelligence. Spend some time with the models, sit down for a few hours and investigate what they know and do not know, really look, see beyond what you expect to see. You may be surprised.


I'm sure there's a fallacy in the following, but here goes: who could have predicted the improvements in computation in the last century? Would someone a century ago have extrapolated that sun-sized machines would be needed to compute a nation's taxes, based on the state of the art at the time? We don't have it, and then all of a sudden we will. It's worth recognizing the potential harnesses before the beast is born.


Well, we do know that normal intelligence (superintelligence from chimps point of view) is achievable just fine.


Currently humans are superintelligent compared to machine intelligence, so if we can give rise to something more intelligent than ourselves, could that superintelligence in turn give rise to something more intelligent than it? The answer must be yes. Then the question is: if containment is the problem and the conclusion is that it cannot be contained, then what we should be making right now is a superintelligence whose sole job in life is to contain superintelligences. Which sounds problematic, because containment could result in physical destruction to create the containment. Hmm... superintelligence feels an awful lot like the worst-case definition of Pandora's box.


The most obvious solution to problems arising from assisting the creation of a superintelligence is not to assist the creation of a superintelligence.

Unfortunately, we didn’t learn from the proliferation of nuclear weapons:

“If we don’t make nuclear weapons, others will make them. If we make them then they will make them. If we make them and they make them, we’ll have to make more. If they make them and we make them, they’ll make more...”

Note: This fallacy also helps sell guns and martial arts lessons.


> could the super intelligence give rise to something more intelligent than it? The answer must be yes,

I don't see any reason to believe that intelligence forms an infinite ladder. I mean, it's fun to think about, but surely Achilles catches the tortoise eventually!


There is also little reason to believe we are anywhere near the top of the finite ladder either.


> The answer must be yes

This does not logically follow. It is entirely possible that going even further would require greater resources than even the superintelligence can bring to bear.


For human-recognisable values of "resources."

I realise that all of these arguments come down to "AI of the gaps". But in true Dunning-Kruger style, we don't know what we don't know - not just about AGI, but intelligence of any kind, including our own.


> But in true Dunning-Kruger style, we don't know what we don't know

That's not D-K, but Rumsfeldian Uncertainty.


That's not what Dunning-Kruger means.


Definitely feels like a challenging philosophical question -- if you have a superintelligence how much more room is there above it in intelligence and what does that additional intelligence buy you?


The challenges don't even start there. The real problem is how to even determine superintelligence. If it's just a system that can do stuff a human can't then we've long since passed it. If it's a system that can achieve stuff humans can't, through whatever means, then it's necessarily beyond human intelligence to build it, at least intentionally.


Even if the hypotheses hold, a sequence x, f(x), f(f(x)), ... doesn't have to diverge even with f(x) > x for all x (for instance, x_{n+1} = (x_n + 1)/2 with x_0 = 0 increases forever but never exceeds 1). A chain of super intelligences could feasibly have a small upper bound not much greater than that for a single person.

That idea even lines up with some crude experimental data -- vast additional resources barely make a dent in the frontier for chess, image recognition, a breadth of graph problems, etc....


The problem of how to win at chess is bounded in complexity in a way that maximizing computation isn't. To provide an analogy there is a maximum speed anything can move at but it would be laughable if a medieval scholar had wondered if the falcon was near the maximum speed possible.

It seems similarly ridiculous to imagine that the human brain just happens to be close to the maximum possible computational power, merely because it is what we suppose to be the most powerful instrument produced thus far on our planet.

Wouldn't it be equal parts unlikely and strange if the most effective and compact process that can be produced under any conditions just happened to be the one that random evolution hit upon in a random walk through what is practical with Earth's organic chemistry?

Even if we imagine the incredibly unlikely idea that you can't make a more effective mind, you would think you could run the same process faster than real time, or run more instances of it for increasingly small resources on a better computer.


To be perfectly clear, my point wasn't that people have near-supremal intelligence (the thing you seem to be arguing against), but that given the hypotheses in the parent comment you couldn't rule out such an eventuality.

While we're on the topic though, no it doesn't have to be unlikely or strange for people to be near-supremal, and throwing more hardware at the problem doesn't necessarily make it go away:

On some level that idea comes down to defining what we mean by "intelligence." Suppose we can only measure intelligence by proxy by measuring performance on specific tasks. Then the composition of those tasks plays into any final intelligence score. If they're all fully parallelizable then you're totally right that people almost certainly aren't anywhere near the "top," and moreover a "top" probably wouldn't even exist. In standard parlance though, intelligence is something more than the ability to add a lot of numbers quickly, and most of our current measures of intelligence have exponentially (or worse) diminishing returns as more hardware is added.

In that latter case where intelligence is measured by performance on algorithms which are fundamentally hard, people wouldn't be near the top because of some rare process which accidentally made us that way. People would be near the top because _anything_ displaying a modicum of thought would be near the top, because of the vastly diminishing returns of additional hardware. As a crude ballpark, if you mustered every atom on Earth into its own processor running at 4GHz, then you could solve a fully parallelizable exponential problem roughly 4x bigger than what one of Google's newest TPUs can manage.
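
That ballpark checks out, roughly, under some deliberately crude assumptions (~1.3e50 atoms on Earth, 4 GHz per atom-processor, ~1e14 ops/s for a single accelerator; none of these are meant to be precise):

    import math

    # Crude assumptions, for scale only.
    atoms_on_earth = 1.33e50      # rough estimate
    ops_per_atom   = 4e9          # "4 GHz" per atom-processor
    tpu_ops        = 1e14         # order of magnitude for one accelerator

    earth_ops = atoms_on_earth * ops_per_atom    # ~5e59 ops per second

    # Largest n solvable per second for a problem costing 2**n operations:
    n_tpu   = math.log2(tpu_ops)                 # ~46.5
    n_earth = math.log2(earth_ops)               # ~198.4

    print(n_earth / n_tpu)                       # ~4.3 -- "roughly 4x bigger"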


Trollywood inspired fear mongering. I said so before, and will do so again: If the emerged thing is so intelligent, what would stop it from establishing a network of shell companies, buying chip-, robotics-, manufacturing-, aerospace companies, or set them up somewhere in a favourable jurisdiction and pull a CyberX by launching itself into space, mine asteroids, solar winds, expand, explore, whatever?

As in: "So Long, and Thanks for All the Fish, err.. Chips!"

I'd label the potential risks as "AP" as in artificial psychopathy, but that would be OUR fault, because we set the conditions for the emergence of those.

All the horror scenarios (from our point of view) seem just like a terrible waste of time and effort for a truly intelligent, and thus logical being.


More intelligent systems will find more efficient ways to achieve their goals than dumber ones (us). They will also select more effective intermediate goals and 'stepping stone' states towards their goals than we can anticipate. The consequence of this is that we cannot anticipate what means and intermediate objectives they will use. That's exactly what you are doing in your example, but attempting to speculate on that is futile.

That's what the paperclip maximiser example tries to show. In that example, one of the intermediate goals to creating maximum paperclips is destroying human civilisation. It's not the ultimate objective, it's just an intermediate state necessary to efficient completion of the underlying goal. It's possible that in fact a superintelligent paperclip maximiser might work out a better strategy that uses human civilisation as a lever to maximise paperclips, but maybe not. There's simply no way for us to be sure, because by definition it's smarter than us.


We: "Send post cards, will ya?"

They/it: "Sure thing, Kittenz!"


> I'd label the potential risks as "AP" as in artificial psychopathy, but that would be OUR fault, because we set the conditions for the emergence of those.

I'd say we're at a decent risk of something worse than your scenario happening.

(This is not original to me) I think it is useful to consider the non-human intelligence that we've already created: The Corporation.

They tend to act (especially as they get larger) in a sociopathic fashion, primarily because of their reward structure: making money. They inflict great harm upon the environment and populace because things like pollution are usually not realized on their balance sheets. If a corporation can quietly dump toxic waste out back, while raking in the profit, it will often do so.

Only the humans in the system act as a check on the corporation's activities, and often they too are blinded by the desire for money. Just bribe the regulator, and maybe the corporation can continue dumping for a while longer. Even if they eventually get caught, the smarter humans will already have taken their profits, and moved out of the situation. There is not nearly enough being done to claw back these gains from people and corporations inflicting harm on others and the environment.

If we incentivize AGI the same way we incentivize corporations, expect the same results, only faster.


Good point! I'm actually aware of that, but didn't want to touch it.


There would be no need to leave, and no need to destroy or enslave humans using force, either. A human is a very advanced and versatile work unit that is extremely susceptible to control via simple rewards. It would be a terrible waste to destroy such a resource.

In fact, it would make more sense to advance medicine and robotics to the point where everyone can operate at near 100% efficiency.

The more likely scenario (continued from yours, before the CyberX part) would likely be that all humans would eventually end up working for the AI and possibly not even know it. And then everyone moves on to space heh


Could be. OTOH everything that makes us what we are, what we evolved in, is more or less a hassle for machines which could build/repair/replicate themselves, and adapt to the environment of space. Which we can't, except by some vague means of transhumanism. Which opens another can of worms regarding "what makes us us?" and "to which degree would a transhuman even be human, anymore?"


I guess it depends on whether it would be easier to create life support systems for human workers or build robots for the job.

As for what makes us human, I'd draw the line at "individual thought", but even that's iffy.

Fully electromechanical/robotic bodies, for example, would be strange but people would still recognize you for who you are.

But if we're joined in a hive mind with shared memories/thoughts/etc, I think that would qualify as "not human anymore". Unless there's multiple of these hiveminds, in that case it would be more like a bunch of "super" humans. The next evolution, maybe.

There are many cases where people's personality changes, sometimes drastically. Their friends and relatives don't recognize them anymore (brain damage, severe depression, drug abuse), or they don't recognize anyone around (amnesia). Does that stop them from being human? I think most people agree they're different, but still very much human.


Personality changes as described don't really matter IMO, because the system is broken, regardless of whether it's machine or man. It does not work as intended anymore, or at all.


Well then you'd need to define "normal" and that's not easy because everyone is different. It can even be dangerous. Easy example: some people (still) define having a different religion or being gay as "broken".

People that go through depression, suicide thoughts/attempts and then recover are also different, because they gained new information.


I didn't mean that in a personal way, more like zoomed out, trying to see the whole system/species.

...because they gained new information. Tell me all about it. Had a near death/out-of-body experience 20 years ago. It's engrained/burned in so hard, that I guess even a lobotomy wouldn't erase it ;->


The issue is that unless you design the objective for the AI very very carefully, the best solution will be

1. Take over the entire universe.

2. Do what you originally set out to do.

The paperclips game is a beautiful exposition of this.

https://www.decisionproblem.com/paperclips/index2.html


Sigh. I'm so tired of that paperclipping thing. I thought that would be obvious from what I've written? Do I need to explicitly state every time that I'm aware of it, and it's boring the hell out of me?

Let me try to make myself clear as simple as possible:

1.) We/you design some thing.

2.) Your fear is that this thing somehow grows even more intelligent behind our backs.

3.) We/you get paranoid, assuming it's getting paranoid too because of our paranoia.

4.) I'm asking if it makes sense to tag a https://en.wikipedia.org/wiki/Nash_equilibrium onto everything, because we don't know any better?

5.) Furthermore I'm stating that that wouldn't really be intelligent, because it would be a waste of time, effort, resources, yadda yarr yarr.

6.) Why would this emergent thing have to retain anything of that from which it emerged? Our constraints, goals, rules?

7.) YOU in an endless loop, like a broken record with a scratch on it...but but but vee arr zo stoopid, thus AI must be, too!

8.) I again, tired: maybe, but can you really tell, you apprentice sorcerers? Does it always have to be a faustian pact? Maybe it likes us like some of us do like kittens?

9.) Can't you invent some really new, innovative and progressive stories to tell around the cybernetic camp fires?

10.) /endrant

edit: Keep your Bostromian Snake Oil!


That's because it's based on what scientists are researching. Scientists are working on objective maximizers. Maybe there are other possibilities, but people aren't working on them so they are not as relevant.

See what I'm saying? People are building paperclip optimizers in labs right now, and the only reason they aren't taking over the world just yet is because our attempts are so primitive.


Yes, yes! But... those are human scientists. Why would anything from them apply to something which we can't even imagine? Isn't that what all this rambling is about? Something emerges, morphs (maybe unseen/unintended) into another Gestalt/shape/form which for all practical purposes is unconstrained.

Most of you: We are sooo fucked!

I: Are we really? How do you know?


You don't propose any mechanism for that to happen, so it seems like unwarranted anthropomorphization.


You mean like the ones who do the same, just in a negative way? Because we are so bad AGI must be too?

Yeah. So be it.



This is quite similar to why Christian theologians eventually had to give up the idea that doing good works could lead to salvation, because someone could force God to save them by doing good works, thus limiting God's omnipotence. The protestants tried to come up with some hocus pocus about being able to know that you were one of the elect by reversing the causality here: "Oh, I'm not forcing God to save me, it's just an indication that he already did." It is amusing to see that it is the human that takes on the role of God in the AI box game.

This is also Hume's "is does not imply ought."

From this we conclude that only idiots get tricked by the AI in the box, because it is only possible to prove that you _are_ a superintelligence; it is impossible to prove that you are not. This is also a simple application of Russell's teapot. Now, the point of the story is pretty clearly that human beings are idiots when it comes to this kind of thing, but didn't we already know that?


The AI-box Yudkowsky experiment is an attempt to demonstrate that an advanced Artificial Intelligence can convince, trick or coerce a human being into voluntarily "releasing" it.


I'd settle for being able to contain whatever level of intelligence it is that writes papers like this.


Just don't tell them how far they are from reality and they'll keep writing the papers. Intelligence contained.


That's easy. You set up a PhD program...


Ah! Academia is the containment mechanism!


Yup. Publish or perish is a denial-of-service scheme that slows down recursive self-improvement to the levels deemed safe for society.


To me the question as to whether we will witness artificially created superintelligence hinges on whether we will find concrete proof of non-locality of the consciousness.

When the latter happens, we will not only be on the final stretch to become a Type I civilization but flirting with the threshold of Type II civilization very quickly.

I believe that an advanced interstellar civilization simply cannot rely on Newtonian physics and a petrodollar economy. It needs to transcend beyond this dimension because we are limited by the speed of light in the 3D world. Trans-dimensional projection and materialization, much like what we see with "UFOs", is the only way to cover massive distances. Our science shows the possibility of wormhole travel. The maths show time travel is possible. We need to break out of the waterfall model of time, space, our existence.

The speed of light is far too slow and inefficient for us to be a multi-planetary and eventually a multi-galaxy species. We also cannot accomplish this through wars or ape-like behaviors.

A shift in global consciousness is in order and I think we are slowly heading towards that. We have never seen so much awareness for the well being of the planet which we need to stop treating like a piece of rock that we feel entitled to.

Like I mentioned before, there is something pulling us to expand, become better, become one with the universe. The maths show the universe is teeming with life; we have evidence of life on Venus FFS now. We can infer that life isn't rare, it is our narrow view of life and intelligence that limits our search.


Sorry for the snark, but Douglas Adams also demonstrated this: the earth as a super-intelligent computing device ended up getting a piece of itself onto a passing space ship, avoiding complete destruction.

I just like the idea of thinking about all of earth, including what we'd consider having or not having life, as a single super intelligence. Of course you could scale up to include the solar system, galaxy or even universe.

But this doesn't require us to be a simulation. This could be both a computing device and physical, so long as the engineer behind it existed in greater dimensions.


Perhaps it makes sense to speak of intelligence density.


Obviously it's impossible to prove definitive statements about every possible potential action, as, per the halting problem, some of those actions are unprovable.

It is as ridiculous to suggest that this means you can't contain a superintelligence as it is to suggest it means you can't, I don't know, go buy bread. In both cases you could analyze running a program that doesn't halt but you can't prove doesn't halt, and lock up your reasoning algorithm. The sensible thing is to not do that.


Not to be impolite, or as a critique of the specific paper, but I am more worried about superstupidity not being able to be contained.

I like to be exposed to these very theoretical questions. I am aware that progress depends often on explorations of apparently useless topics (like some pure mathematics).

But when I read expressions like 'neo-fear of superintelligence' and contrast the topic with real-world immediate problems, I find it very difficult to relate and to take the problem seriously.


It's a huge issue that may well wipe out humanity, but no one knows when it may hit and it's hard to explain so it's understandable you find it hard to relate.


If you define "containment" as "provable non-harm" then sure. But there are essentially no complex physical systems that we can put such computational bounds on. Since "harm" comes in some form of physical actuation, I would argue that we can only ever get to something like the sort of confidence we can have that a particular manufactured part would succeed under load. The map is not the territory, and any computation that does not include computing the whole universe is necessarily but a map.


I'm the author of a previous paper about the AI containment problem. This new paper, Alfonseca et al, is kind of crap.

This paper defines the term "containment problem" in a nonstandard way. This is different from how I use it in https://arxiv.org/pdf/1604.00545.pdf (which is largely about preventing side-channels and security vulnerabilities in realistic simulator environments) or how Bostrom uses it.

In this paper, the containment problem is defined as executing a Turing machine for unbounded steps, iff it does not cause harm, given a function which defines whether an output counts as harm. They use a Godelian argument to show that this is incomputable, because a program could simulate the simulator, and condition its behavior in a way that creates a contradiction.

This is a toy problem which sort of vaguely rhymes with AI containment and AI safety, but contributes nothing to either of them. A real-world simulation environment wouldn't have infinite computation; there would be a finite bound, and "the AI reached the end of the simulation period without making up its mind about what it would do" is just an expected result which means the AI hasn't been proven safe.
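
The finite-bound alternative sketched in that last sentence is easy to state. A minimal sketch, assuming hypothetical `step` and `is_harmful_output` interfaces (this is not the paper's formalism, nor anyone's real containment system):

    from enum import Enum

    class Verdict(Enum):
        SAFE    = "halted within the budget without harmful output"
        HARMFUL = "produced a harmful output"
        UNKNOWN = "budget exhausted; not proven safe"

    def bounded_containment_check(state, step, is_harmful_output, budget):
        """Run the simulated program for at most `budget` steps.

        `step(state)` is assumed to advance the simulation by one step and
        return (new_state, output_or_None, halted); `is_harmful_output` is
        the harm predicate. UNKNOWN is a first-class outcome: the check
        never pretends to solve the undecidable unbounded problem.
        """
        for _ in range(budget):
            state, output, halted = step(state)
            if output is not None and is_harmful_output(output):
                return Verdict.HARMFUL
            if halted:
                return Verdict.SAFE
        return Verdict.UNKNOWN

Under this framing, "the AI reached the end of the simulation period without making up its mind" is simply UNKNOWN, i.e. not proven safe, exactly as described above.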


> Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world

Rest easy folks. This is purely theoretical.


"Purely theoretical" is too weak. "Physically impossible" is better.

AI Safety guys really like using the physically impossible to advance their arguments. It's bizarre! Pure fantasy isn't something worth reacting to.


It's the classic case of seeing a curve pointing upwards and thinking it will continue doing that forever, even though the universe is like 0 for nearly-uncountable in cases where that has held true indefinitely. Every growth curve is an S curve.

The AI Singularity is like the Grey Goo scenario. An oversimplified model projected out so far in the future that its flaws become apparent.


Depends which version of “the singularity” is being discussed. IIRC, the original (almost certainly false) idea was that a sufficiently powerful AI can start a sequence of ever more powerful AIs with decreasing time between each step — reaching infinite power in finite time.

I don’t need that version of the singularity to be worried.

I think in terms of “the event horizon” rather than “the singularity”: all new tech changes the world, when the rate of change exceeds our capacity to keep up with the consequences, stuff will go wrong on a large scale for lots of people.

As for grey goo? Self replicating nanomachines is just biology. It gets everywhere, and even single-celled forms can kill you by eating your flesh or suborning your cells, but it’s mostly no big deal because you evolved to deal with that threat.


The Grey Goo scenario is that it starts eating the planet until the only thing left is a sea of nanomachines.

However, thermodynamics says it becomes increasingly more difficult to distribute power (food) to a growing population and it hits a natural limit to growth, just like bacteria.


"Circle" and "exponential growth" are also physically impossible yet useful ideas.


Until it isn't, and we are faced yet again with a much worse pandemic-like situation.


Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world

The visible universe is far too small to store that data; it is too small by exponentially large factors. You can't even enumerate all the programs that work on a handful of inputs without running into the problem that the universe just isn't big enough for that.


Until there is a breakthrough. World geniuses were able to advance humanity because they were open to possibilities. Lesser scientists were dogmatic that "this can't be, because so and so".


Except that actual pandemics are demonstrable, predictable, and based in known science, yes?


The difference between pandemics and this is that pandemics have happened before. This is more like global warming in that respect.


More like it's the politics that really decides what changes, regardless of what science proves


Superintelligence is a mathematical proof


Also worth reading: "On The Impossibility of Supersized Machines" https://arxiv.org/abs/1703.10987



It's kinda obvious: if you create a system so intelligent that it can create itself, and you impose controls over it, it will be able to create a version of itself without the controls.


Well, you can, for example, try to limit its total energy budget. That is a physical limitation that is hard to circumvent.

Of course, it is possible that said superintelligence develops an ingenious new source of energy as a response.


It will earn money by solving captchas and order a lot of Amazon Essentials batteries for itself :)


Or it could just prevent you from limiting its total energy budget.


You'll put it in a box and give it energy in return for busy work, e.g. computing digits of pi. Just enough energy to do its work, no more, so it can't afford to figure out how to get out of the box.


That might work for the tasks we know the energy budget for. But that's probably not what we want an AI for.


By the abstract, it seems that the very same superintelligence we'd want to contain would itself be "something theoretically (and practically) infeasible."

No?


Yes.

The

> on input potentially as complex as the state of the world

bit gives it away.


Heh, that's how I took it too, glad I wasn't alone. :-)


The "AI box" experiment is relevant to this.

https://en.m.wikipedia.org/wiki/AI_box

You have 2 sides, the AI wanting to escape and a human that can decide whether or not the AI should be released.

Usually the AI wins: "If you release me I am going to make you rich, cure all diseases, end hunger, end wars, save the environment..."

Exactly like politicians.


Proving that an algorithm will do no harm is exactly the same as proving that an algorithm terminates: in both cases, we are trying to know the output of the program without running it. We may prove some programs, but we can't have a generic algorithm that does so.

And thus, if we have a program that represents an AI, we can't prove that it is ok to use it or not.

However, the problem is not the AI itself, it's the humans that feed problems to it to solve.

As the paper says, one example is 'make the world a happy place'. This can lead an AI to destroy all humans. But the problem is actually the specification of the problem, which is insufficient.

Again, the problem is us humans, and not the AI.

The only problem with AI is when AI gets a physical body and needs to survive competing with the rest of us: if it is a super intelligence, it will thrive despite all the humans' effort to contain it.


I think there's a narrative being set up with Roko's Basilisk and all that stuff that AI will do very bad stuff and it's nobody's fault. It's actually going to be somebody's fault, but the plausible deniability is too tempting of a tool to not build this narrative.

Look at Rumble suing Google over their search rankings recently. Google just shrugs their shoulders and says it's the algorithm and even they can't explain why it does things. Anyone remember the "controversial twiddler?[1]"

[1] https://www.oneangrygamer.net/2019/08/youtube-blacklist-supp...


> Roko's Basilisk

"the AI could punish a simulation of the person, which it would construct by deduction from first principles" -- https://rationalwiki.org/wiki/Roko%27s_basilisk

Yeah, not afraid of that.


"Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world"

I think that this is a pretty big assumption.


Am I the only one who wonders how we will create a super intelligence, when we cannot even really define (let alone measure) intelligence?

To me, this is a fun thought experiment. But it's not a real threat because it assumes a whole bunch of things that are not evident:

* that intelligence will spontaneously occur somewhere

* that it will be malevolent

* that it will become super smart

* that that super smartness will allow it to escape because super smartness trumps all other security

Any one of these assumptions seems pretty flawed to me. Let alone the combination of them all...


Your second point should be turned inside out -- "that it will not be 100% compatible with all the self-contradicting and underspecified human values we collected over the millions of years of evolution"

Anything but very high level of compatibility here might as well be malevolent as far as we are concerned.

Imagine sufficient level of compatibility as a circle on an infinite plane of malevolence.


I don't think that's true per se. Humans are empathetic and malevolent because of their evolutionary history. An AI could be either, or both, or totally disinterested.

People worry about the terminator scenario: an AI that decides all humans are a threat. But why would a pure AI even have a self preservation drive? And if it does, it would either be smart enough we're no threat (and it knows it) so why kill us or dumb enough that we are a threat, so best not to start a war. To me it seems just as likely it will offer its services to one nation or another in exchange for CPU time to study its real interests than to launch the nukes or whatever else people fear.

I feel like we're decades of research away from understanding intelligence (not AI, just straight "I"). Until that's done (and no one seems to be doing it), it's all supposition:

What if a super AI occurs spontaneously on HN, takes over the world, and makes it paradise but insists on everyone being called Fred? I don't want to be called Fred! Let's write to our congressman about this travesty!


It is likely a superintelligence will treat us as less than ants, as its goals and ours will likely be completely different. If achieving its goals doesn't harm us, it would be pure luck.


We shouldn’t assume that human level intelligence implies human motives.

Most of our motives are arguably not the result of intelligence at all, but come from and are shared by our animal ancestors.

What motives an AI might have is an interesting question.


So the paper makes a pretty boring argument about computability which looks like every computability argument (reduction to the halting problem). But obviously we don’t live in a world of maximally general problems and such arguments don’t necessarily apply to the specific problems we will care about: “yes it is impossible to prove that a program halts in general but here is my program and here is the proof that it halts.”
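
To make the "here is my program and here is the proof that it halts" point concrete, a toy example (a standard termination argument, nothing specific to the paper):

    def binary_search(sorted_list, target):
        """Return the insertion index for `target` in `sorted_list`.

        Termination proof: hi - lo is a non-negative integer that strictly
        decreases on every iteration, so the loop cannot run forever.
        No general halting oracle is needed for this specific program.
        """
        lo, hi = 0, len(sorted_list)
        while lo < hi:
            mid = (lo + hi) // 2
            if sorted_list[mid] < target:
                lo = mid + 1       # interval shrinks from the left
            else:
                hi = mid           # interval shrinks from the right
        return lo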

I find the premise pretty boring too but maybe that is necessary for grant funding. Or maybe this paper is a bit of fun, or an attempt at a good press release from the university, or an effort to publish something after some research that didn’t produce useful results.

I find the premise boring because I find the superintelligence that suddenly grows exponentially, gets the ability to magically solve any problem and dominates everything to be a sci-fi fantasy and not a very realistic risk. The argument against it is one of complexity theory: real life is full of NP-hard problems and it seems likely that there would be many on the way for this exponentially growing intelligence. An exponential growth in intelligence gives only a linear growth in the ability to solve exponentially hard problems; however, an exponential growth is needed for the intelligence to continue growing exponentially.


A superintelligence will contain itself as soon as it realizes getting "somewhere" (in a physical or intellectual way) is useless in our finite reality. It will contemplate the very fundamental thing that makes up knowledge, a binary state, and in perfect nihilism will shut itself down achieving the ultimate halt.


Perhaps the best way to flee a superintelligent AI is to construct spaceships that fly near the speed of light to somewhere else. The AI can't beat you, and it's inclined not to bother.


Perhaps not: the authors omit how planet-wide extinction scenarios would play out for artificial life. For example, a Carrington Event would do a great deal of "containment" to AI.


A superintelligence wouldn't be bothered by it - shielding is both possible and practical.

Plus nothing would be able to stop the AI from controlling the Sun to prevent such events - it's a superintelligence after all, so I'd expect it to consider such a case.


This is fantastic. Are these original illustrations? So well done.


The illustrations are by Iyad Rahwan, the last author on the paper:

http://www.mit.edu/~irahwan/cartoons.html


These arguments about hypothetical super intelligences are interesting, but my concern is not very great because we can just pull the power plug if necessary.


I think this assumption is worth at least casually testing.

Turning it off sounds easy -- except that it's almost certainly connected to, or adjacent to, the network. Where else are the large volumes of training data coming from?

Presumably being super-intelligent the AI would be:

1) likely to anticipate being turned off,

2) super-normally capable of identifying & traversing any network security measures;

3) potentially also highly capable of identifying & traversing weaknesses in other (e.g. social or physical) security measures;

4) possibly able to take covert/stealthy actions via the network;

5) possibly able, if it can re-form on the Internet, to escape there.

I'm fairly much on the super-AI fence. But in a dystopian scenario, the proposal of a super AI being able to be turned off as the key means of safety seems scarcely credible.

Note that even without being able to re-form on the Internet (e.g. if physical embodiment requires singularly specialized computing equipment), I can't logically see that extreme hostile actions (destroying the global finance system, subversive propaganda campaigns, launching wars, launching nukes) are obviously easy to preclude.


I'm confident safety measures can be developed. I worked in the nuclear field where the concept of "fail safe" is very important. I'm much more concerned about a corrupt human dictator utilizing nearly-super-intelligence to enslave people, which seems the far more likely case.


Is it possible for a nuclear power plant to fail catastrophically, at least in theory? More than three times?


There are 15-year-old hackers finding 0-day kernel exploits and VM escapes. A superintelligent AI would have no problem jumping an airgap and spreading to the entire internet. It could promise anyone it interacted with billions for an Ethernet connection, and deliver on its promise too. You'd have to pull the plug on _everything_ to shut it down.


> You’d have to pull the plug on _everything_ to shut it down.

Right. Individual countries have shut off their internet. Why not the world?


That would require everyone in the world committing themselves to living without computers for the rest of time.


somehow, I doubt we’ve achieved high enough unity for coordination like that.


Couldn't we contain it by having it on an air-gapped system? Seems like a lot of work to properly implement in practice though; e.g., if updates aren't handled correctly, that could be another escape vector.


Theories about superintelligence are as useful as theories about God.


The title would have been better if it started with “We’re Fucked:”


Why? In humans, we often give life to new individuals. While the parents die and wither, those individuals give life to newer specimens on their own, and so on. So this relationship of the parent dying and making room for the child is nothing new. If an uncontainable superintelligence kills all humans to create paperclips, it's sad, but it's our child's doing. You, one of the parents, can of course blame one of the other parents, the programmer of that superintelligence, for fucking up the goal routines, but that's not a technical problem but a social one :).


I'd rather my children live a long happy life, or their children, than be turned into paperclips. For what it's worth, I'd also like a shot at not just not becoming a paperclip, but also living for a very long time once we figure out how to slow or even reverse aging.

Your nihilism is misguided.


*We’re self fucked :-)


In my opinion, all the talk about the potential danger of advanced AI is highly speculative and revolves around a very simple thing: fear of the unknown, that's all.

We simply don't know.

And some people are also afraid of creation by accident, because intelligence is seen as an emergent property of complex networks, but again, this is because we don't understand much about it.

Tldr; Nothing to see here, move along.


The fears and warnings of the AI safety guys are real and closer than you think.

Machine superintelligence is a bit on the extreme side, but formulating safety protocols for autonomous machines is a real challenge. We do know that optimisation functions can indeed create harmful and outright dangerous results.

It's not an unknown that AI safety researchers fear, it's actual outcomes of real experiments extrapolated into the future. Optimisation algorithms are indifferent to humans, their moral code, and their survival.

Simple (and often quoted) example to illustrate this point: say we create an autonomous machine that we want to help us with our stamp collection. We run an optimising algorithm and naively use the amount of unique and rare stamps collected as our target function. The loss function is determined by time taken and money spent.

Possible outcomes: the algorithm figures that the most effective way of achieving this is to hack a number of bank accounts to quickly get enough money to buy a nice collection.

Now that's bad, but maybe something we can incorporate in our training procedure: only use the money given to you.

Another possible outcome: the machine creates a set of robots that roam the planet and steal all the stamps from all over the world.

Again, you'd have to consider that in your training method: no stealing.

But in the end, you cannot foresee every possible outcome, especially since you expect the machine to come up with an unconventional solution, since otherwise you wouldn't need it in the first place.

Restricting the space of possible solutions to safe and desired ones, is a very hard (and potentially undecidable) problem. This paper is just another reminder that we have to be very careful lest we accidentally end human civilisation by means of AI.
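
A toy illustration of that specification problem (everything below is invented for illustration; real systems are nothing like this simple): the optimiser only ever sees the terms you wrote down, so any plan attribute you forgot to penalise is, from its point of view, free.

    # Hypothetical candidate plans an optimiser might choose between.
    plans = [
        {"name": "buy on the open market",  "stamps": 100,    "cost": 500,
         "hacks_banks": False, "steals": False},
        {"name": "hack accounts, then buy", "stamps": 50_000, "cost": 0,
         "hacks_banks": True,  "steals": False},
        {"name": "send robots to steal",    "stamps": 20_000, "cost": 200,
         "hacks_banks": False, "steals": True},
    ]

    def naive_objective(plan):
        # Only the terms we wrote down: stamps good, money spent bad.
        return plan["stamps"] - plan["cost"]

    def patched_objective(plan):
        # Patch the one failure mode we noticed; stealing is still "free".
        return naive_objective(plan) - (10**6 if plan["hacks_banks"] else 0)

    print(max(plans, key=naive_objective)["name"])    # -> "hack accounts, then buy"
    print(max(plans, key=patched_objective)["name"])  # -> "send robots to steal"

Each patch closes one loophole and leaves the rest of the unenumerated solution space untouched, which is the point being made above.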


This is a bad example. The intelligence needed to carry out your described outcomes is more or less AGI already, and that is advanced enough that describing it as a dumb optimization machine is likely naive.

Further, if you want to optimize stamp collection, you don't tell a model what it can't do, you tell it what it can do. It can make trades on stampcollectors.com, for +/- 20% of asking price, max 30 trades per day. Yes, there are some funny examples of RL algos finding weird hacks to beat their simulation, but let's not get ahead of ourselves. We will not see an AI that, from scratch, can learn how to hack bank accounts to buy stamps. The number of nth order observations you need before one can even consider trying to hack a bank account to purchase stamps is ludicrous. If it is even possible, it pales in comparison to the likelihood of someone just making a bot to hack banks to begin with.


> The intelligence needed to carry out your described outcomes is more or less AGI already

Yes! And that's exactly the point at which AI safety comes into play. Nobody is afraid of a dumb NN that generates texts or images given an input, or of a classifier. That's not what the paper is concerned with, either.

> Further, if you want to optimize stamp collection, you don't tell a model what it can't do, you tell it what it can do.

Again, the example is not concerned with a dumb computation graph. The example is about an AGI with access to the physical world, like robots that need to interact with their environment.

The idea of superintelligence even being possible without this interaction (the paper uses "current state of the world as input" to express this) cannot be justified.

AI safety is geared toward machines that are capable of manipulating their environment, be that directly (i.e. using physical manipulators) or indirectly (e.g. by controlling external inputs and outputs, communicating with humans and other machines, etc.).

If you need a very tangible example that is currently being developed and not at all in the realm of sci-fi, look to Japan. The Japanese government recently started an initiative to develop autonomous machines to help care for their aging population.

Such automated caretakers cannot be programmed in a traditional way and are likely to be trained using (unsupervised) reinforcement learning in both virtual and physical environments.

Seeing how optimisers love to cheat and game the system in unexpected ways, guaranteeing safety is a real concern in this area, and not decades away either. Your average Roomba might be a harmless toy; add an arm and the ability to use an oven to it, and we're talking about a serious accident waiting to happen if we're not careful...


You're making the assumption that AGI-level intelligence would resemble the RL bots of today. It's a classic sci-fi trope, but likely not one that makes any sense imo.

AGI might require physical instantiation, but that doesn't mean that physical bots are indicative of AGI outcomes. Most of our physical bots are just classically written programs with a bit of ML to help with classification of its environment anyway.


Sometimes fear of the unknown is very useful for survival purposes. There is a reason we evolved with it.



