Von Neumann’s critique of automata theory and logic in computer science (1947) (yodaiken.com)
141 points by adamnemecek on May 25, 2019 | 31 comments



Controls engineer here.

I've often wondered what a theory of computation based in differential equations would look like. We desperately need one. Control theory gives us some tools, but not nearly powerful enough ones.

It's fairly clear by now that AI, at least as applied to domains immersed in the natural world, such as robotics and machine vision, is better thought of as analog in nature. Ten years ago I'd go to robotics conferences and it would not be unusual for me to be the only person in the room who knew that control theory was a thing. Now it's nearly taken over the field. But the kinds of developments we see in the field, both in terms of controls and in terms of machine learning, are essentially experimental in nature. We try some stuff, and see what works. Mathematical proofs, or even a mathematical understanding, of these algorithms comes later, if at all.

To this day we have intuition about why deep networks work, but backing that intuition up with rigorous mathematical analysis has been very challenging and is quite incomplete. But when it works, it works far better than ideas rooted in automata. It would be a tremendous benefit to the field if we could somehow merge the practical power of differential equations and function approximators with the theoretical power of the theory of computation.


This is slightly tangential, but if you haven’t already seen it, you should check out Danny Hillis’ story of Richard Feynman working at Thinking Machines. He solved how big the buffers needed to be for the communications between processors by modeling the bit streams with partial differential equations. http://longnow.org/essays/richard-feynman-connection-machine...


Fantastic read


Lenore Blum has worked on some things in this direction.

http://www.cs.cmu.edu/~lblum/PAPERS/TuringMeetsNewton.pdf

I think most computer scientists have a natural and justified distrust of the real numbers, though, because when you get down to it they are too bizarre to actually be real (e.g. the set of computable reals has measure 0; see Chaitin).

https://arxiv.org/pdf/math/0411418.pdf


I had that distrust of the real numbers too. It looks like it's pretty common among CS folks.

However, one of my professors once said: "There is a beauty in real numbers. While they cannot be as precise as decimals, they allow us to see the nature of things by exploring and discovering the laws of the universe."

Time has passed, and I could not agree more. It's a quote that should be given to all CS students.


> It would be a tremendous benefit to the field if we could somehow merge the practical power of differential equations and function approximators with the theoretical power of the theory of computation.

Leslie Lamport's TLA+ [0], which ties logical specifications to software, comes close to what you are looking for IMHO. Code the differential equations in your program, but use TLA+ for the mathematics and theory of computation.

[0] http://lamport.azurewebsites.net/tla/high-level-view.html


That is a deep philosophical quest, and I have found myself attached to it my whole life. Let me share some recent observations.

A LISP program with a single entry function and no side effects is the closest bridge between conventional computation and conventional math.

If such a program consisted only of well-known math functions (sin, cos, etc.), it would be immediately amenable to the usual mathematical instruments like differentiation and integration. Let's say such malleable functions are in _analytical_ form.

Now add some Boolean logic. It's binary, so you cannot apply the "analog" analytical math rules directly. But there is a trick: Rvachev functions (R-functions), which represent Boolean operations in malleable analytical form.
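
For the curious, here is a minimal sketch (my own illustration, not from the thread) of one common form of Rvachev functions, where a value >= 0 encodes "true" and a negative value encodes "false":

    import math

    # Rvachev (R-)functions: smooth real-valued analogues of Boolean connectives.
    # Convention: a value >= 0 encodes "true", a negative value encodes "false".
    # These forms are differentiable everywhere except the origin.

    def r_and(x, y):
        return x + y - math.sqrt(x * x + y * y)

    def r_or(x, y):
        return x + y + math.sqrt(x * x + y * y)

    def r_not(x):
        return -x

    # Sign check against the Boolean truth tables, using +1 for true and -1 for false.
    for a in (1.0, -1.0):
        for b in (1.0, -1.0):
            print(a, b, r_and(a, b) >= 0, r_or(a, b) >= 0)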

Having math and logic is very close to generic computation, but not enough. You also need a notion of conditional jump (sometimes called goto or branch) for it to be Turing-complete.

But functional languages like LISP provide a treasure trove here too. They often use jumps/gotos and functional recursion interchangeably. So, if a program is represented in malleable analytical form and uses recursion for control flow without side effects... it is no different from your typical math formula. You can differentiate, integrate, and do whatever conventional math can do. Really intriguing.

>To this day we have intuition about why deep networks work, but backing that intuition up with rigorous mathematical analysis has been very challenging and is quite incomplete. But when it works, it works far better than ideas rooted in automata.

There were folks who translated trained networks to automata in order to squeeze them into smaller microcontrollers/FPGAs. Quite a result if you ask me. It shows that trained networks and automata are interchangeable forms of the very same thing: computation.


That arbitrary computation can be described with differential equations is an immediate corollary of Turing's 1936 paper: because a Turing machine can be constructed in the physical world, its operation can be described with differential equations. An immediate corollary of that is that questions about differential equations are undecidable (i.e. differentiability does not give us more power).

For a specific representation of TMs with ODEs see: https://link.springer.com/chapter/10.1007%2F11494645_21


> I've often wondered what a theory of computation based in differential equations would look like.

The theory of computation is not based on anything; it is fundamental in the sense that it exposes laws of nature (it is heavily based on physical limitations) independent of the means of talking about them. In a very strong sense it is more fundamental than mathematics itself as it is concerned with how hard it is to obtain the answer to a mathematical question without regard to the process of obtaining it. Different representations may, of course, assist us in proving certain theorems in the theory of computation or other areas of computer science.


The theory of computation is based on mathematics/logic. It is a part of mathematics, just like computer science is a field of mathematics. I'm not sure I agree that it is more fundamental than mathematics, since it is a part of mathematics. You can't have a theory of computation without mathematics, just like you can't have the field of cryptography without mathematics. The theory of computation and cryptography are built on top of fundamental ideas of mathematics. How can that translate into the theory of computation being more fundamental? It's like saying a molecule is more fundamental than an electron or proton.


> You can't have a theory of computation without mathematics.

You're right that you can't have a theory without some language to talk about it (and it also requires people to come up with it, so is psychology more fundamental than physics?), but the theory of computation is about the laws that govern the power of mathematics. I.e. mathematics (and the universe) was constrained by computation long before anyone knew that. On the other hand, Turing was very careful not to rely on any mathematical and even logical results in his 1936 paper, not even on the principle of explosion (which is used in modern formulations of the halting theorem), precisely because he knew he wanted to get at the fundamental limitations of the very process of deduction itself (he does say that the theorem can be proven using the principle of explosion, but that would be "unsatisfying"). Turing showed that if a premise can be written as some string of symbols and so can the conclusion, then, due to constraints on the mind and body of the mathematician writing them, the process of deduction is subject to the laws of computation. This applies without any need to describe what the symbols represent, if they represent anything at all.

While the theory of computation does use concepts such as functions or sets, it knowingly treats them as higher level ideas than computation. I.e. a number or a set or a function is a name given by humans to something that can be computed or an abstraction of such a thing (so we can talk of non-computable things).

And if you want to include TOC as part of mathematics (some do, some don't), then it is its most fundamental part, more fundamental than formal logic (some people include that in mathematics, but most don't), which is subject to the laws of computation but not vice-versa.


Psychology and physics are two independent and different fields. Physics wasn't built on top of psychology, and vice versa. What language of psychology do you need to talk about physics, and vice versa? Newtonian physics and Einsteinian or quantum physics may be better examples? Or psychology with developmental psychology, behavioral psychology, etc.?

Are you saying Turing, the world-famous mathematician, didn't use any mathematical ideas? Are you talking about "On Computable Numbers, with an Application to the Entscheidungsproblem", where he laid out the Turing machine? Are you saying there was no logic behind the halting problem? You do realize that his 1936 paper was filled with mathematical proofs? The paper you referenced is one of the world's most famous mathematical papers. Just because it isn't full of numbers doesn't mean it's not mathematics. Do you think Euclid's Elements is part of mathematics?


> Are you saying Turing, the world famous mathematician, didn't use any mathematical ideas?

Of course he did, but his ideas also came to him through psychology and he expressed them in English. That doesn't make psychology or English more fundamental than computation. Because humans create theories and humans are very complex, almost everything human is involved in the construction of theories, but when we talk about something being more fundamental than another we're not talking about the human process of the theory's construction but about the subject matter of the theory. The theory of computation is not only concerned with matters at a "lower-level" than mathematics, but also lower than logic.

In fact, that computation can be described using mathematics (or English) is precisely because of Turing's discovery of universal computation, which means that any system of symbols that's "rich enough" can describe any other.

> You do realize that his 1936 was filled with mathematical proofs?

Ah, but you should take a closer look at them. His proof of what we today know as the halting theorem goes to great lengths to avoid using any logical axioms. There's a great paper about that by the Turing scholar, Juliet Floyd (https://mdetlefsen.nd.edu/assets/201037/jf.turing.pdf esp. §4.5). He did that because he was trying to get to an idea that's even more fundamental than logic.

> Just because it isn't full of numbers doesn't mean it's not mathematics.

As I said, some people do consider formal logic, and even computation as branches of mathematics (though others don't), but if so you can think of computation as more fundamental than any other branch. Let me put this more precisely instead of speaking in the abstract: computation is more fundamental than the natural numbers (the axiom of infinity, often considered the most basic mathematical axiom) and even the most basic axioms of logic, such as the principle of explosion, in the sense that neither can be given a precise sense without computation, but computation is described without them.


From a modern physics perspective, pron is correct in some sense. To a physicist, the theory of computation and complexity are built on top of the laws of the universe that you live in. You change the laws of the universe and the difficulty of computation changes. In fact, more generally, which information processing tasks (such as cryptography) are possible, and how difficult they are, is determined by the laws of physics. For example, it's possible to copy unknown information in the Newtonian universe, but generally impossible in the quantum universe. Or, for example, quantum computers are said to have different complexity than classical computers.

In fact, significant progress in physics in recent times has come about because people ask what sort of information processing tasks should be possible/easy in our universe and which should not. This allows us to reject physical theories that don't respect our intuitions about information processing. What I am getting at is that the theory of information processing (computation included) is not divorced from reality, but it can be divorced from mathematics. Even if we completely changed our mathematical systems (started from different axioms than ZFC, for example), the types of computation possible in our universe would not change. In other words, if we wanted to use the new mathematics to model computers/information processing systems in our universe, that mathematics would have to respect the information processing results we already know about our universe.


The theory of computation doesn't really have anything to do with math. It's based on logic, machines, automata, etc. You can use it for math, but in principle it has nothing to do with math. For example, an automaton may not be calculating anything; it may be describing a process of doing something more abstract.

Solving DEs in computation is an interesting point to make, though. I've seen analysis-style arguments, like real analysis or calculus, in symbolic languages like Lisp.

Actually, there's a really interesting set of papers very much related to this topic. Link below.

http://strictlypositive.org/calculus/


> Ten years ago I'd go to robotics conferences and it would not be unusual for me to be the only person in the room who knew that control theory was a thing.

This is mind-blowing to me. When I thought I might have a reason to develop some robotics technology, the very first thing I did was to get a control theory book, on the off chance that I was really going to do this in the future. In fact, I think this explains why the product I think could be made does not exist yet.


If you literally want a visual representation, there are systems of differential equations that model biological/chemical reaction-diffusion. Roughly speaking, this gives you a bridge between cellular automata, reaction-diffusion and morphogenesis by defining chemicals, states, species - whatever - as a system of differential equations. These produce organic and natural looking patterns!
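
If you want to play with this, here is a minimal sketch of the Gray-Scott reaction-diffusion model (my own illustration; the parameter values Du, Dv, F, k are just one classic pattern-forming choice):

    import numpy as np

    # Gray-Scott reaction-diffusion: concentrations U and V evolve under diffusion
    # plus the reaction U + 2V -> 3V, with feed rate F and kill rate k.
    n = 128
    U = np.ones((n, n))
    V = np.zeros((n, n))
    U[n//2-5:n//2+5, n//2-5:n//2+5] = 0.50   # seed a small square perturbation
    V[n//2-5:n//2+5, n//2-5:n//2+5] = 0.25

    Du, Dv, F, k = 0.16, 0.08, 0.035, 0.065

    def laplacian(Z):
        # 5-point stencil with periodic boundaries
        return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0) +
                np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

    for _ in range(5000):
        uvv = U * V * V
        U += Du * laplacian(U) - uvv + F * (1 - U)
        V += Dv * laplacian(V) + uvv - (F + k) * V

    # V now holds an organic-looking spot/stripe pattern; view it with matplotlib's imshow(V).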


You could take a CPU design at the transistor level and use the techniques from SPICE to turn it into a system of ODEs. But that seems like a massive complication that would make it hard to reason about high-level concepts.


It seems to me like I am missing something: In most practical cases, the differential equations are solved numerically anyway. Why add the extra step?


> In studying the functioning of automata, it is clearly necessary to pay attention to a circumstance which has never before made its appearance in formal logic. Throughout all modern logic, the only thing that is important is whether a result can be achieved in a finite number of elementary steps or not. The size of the number of steps which are required, on the other hand, is hardly ever a concern of formal logic. Any finite sequence of correct steps is, as a matter of principle, as good as any other. [...] Thus the logic of automata will differ from the present system of formal logic in two relevant respects. 1. The actual length of “chains of reasoning,” that is, of the chains of operations, will have to be considered.

Interestingly, according to Wikipedia [0], Gabriel Lamé published a running-time analysis of Euclid's algorithm [1] in 1844, more than a century before von Neumann suggested this (a quick sketch follows the links below).

[0] https://en.wikipedia.org/wiki/Computational_complexity_theor...

[1] http://archive.lib.msu.edu/crcmath/math/math/l/l064.htm
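
Here's what Lamé showed, as my own illustration: the number of division steps in Euclid's algorithm is at most about five times the number of decimal digits of the smaller argument, and consecutive Fibonacci numbers are the worst case.

    # Euclid's algorithm with a step counter, exercised on consecutive Fibonacci
    # numbers (the worst case identified by Lamé).

    def gcd_steps(a, b):
        steps = 0
        while b:
            a, b = b, a % b
            steps += 1
        return a, steps

    fib = [1, 1]
    while len(fib) < 40:
        fib.append(fib[-1] + fib[-2])

    g, steps = gcd_steps(fib[-1], fib[-2])
    print(g, steps, 5 * len(str(fib[-2])))  # gcd is 1, and steps stays below 5 * (decimal digits)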


If I understand this correctly, Von Neumann is saying that, in the real world, computation will have to be efficient and robust, so techniques will have to be developed to ensure that efficiency and robustness, and that such techniques will necessarily have to be continuous in nature, because combinatorics is too hard to be used for the job.

It's important to remember that this was written in 1947. So, a year before Shannon's paper on information theory and several years before the beginnings of complexity theory [1]. These are both discrete forms of analysis and they both go a long way towards addressing Von Neumann's 1947 concerns.

I don't know to what extent Von Neumann considered his 1947 criticism addressed by the subsequent advances. I'm going to go out on a limb, though, and say that he would probably have found at least some satisfaction in them.

______________

[1] Shannon's A Mathematical Theory of Communication paper was published in 1948. Wikipedia tells me that the foundations of complexity theory were laid down in 1965 in a paper titled On the Computational Complexity of Algorithms, by Juris Hartmanis and Richard E. Stearns.

https://en.wikipedia.org/wiki/Computational_complexity_theor...


Yes, I think that at least some of what Von Neumann is calling for is complexity theory.

It's easy to forget how recent complexity theory is. While Hartmanis and Stearns laid the foundations in 1965, its first concrete results came only in '71 (Cook) and '72 (Karp), and it took much longer for more results, and, more importantly, their meaning, to sink in (at least until the '80s if not later) among computer scientists who are not themselves computational complexity researchers. In fact, some of the most basic results complexity researchers rely on are from the mid '80s (circuit complexity), which makes complexity theory one of the youngest fields in computer science. It is about half the age of neural networks, and even younger compared to programming language theory.

Some famous assertions/hypotheses/aspirational programs in computer science, including by people like Dijkstra, Tony Hoare and Robin Milner, predate complexity theory, and should be seen in a different light because of that. And because it is so young compared to other branches of computer science, it is sometimes ignored by researchers working in older disciplines, if only because their own basic results predate complexity theory.


Prescient. Nowadays, most computer scientists don't explicitly worry about hardware failures, but we still love randomized algorithms.

Often it's easy to prove that a randomized algorithm has a desirable property with probability 1 asymptotically, or with high probability in finite time. In contrast, it's often hard to prove that the analogous desirable property always holds for a deterministic algorithm. In other words, we use a trick to make a combinatorial problem more like an analysis problem, because analysis problems are easier to handle -- exactly what Von Neumann was writing about!
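
A classic example of this flavor of argument (my own sketch, not from the parent) is Freivalds' check of a matrix product: each trial costs only matrix-vector work, a wrong product slips past a single trial with probability at most 1/2, so k trials give error probability at most 2^-k -- while a deterministic check that fast isn't known.

    import random

    # Freivalds' algorithm: randomized check of whether A * B == C for n x n integer
    # matrices, using only matrix-vector products (O(n^2) work per trial).
    def freivalds(A, B, C, trials=20):
        n = len(A)
        for _ in range(trials):
            r = [random.randint(0, 1) for _ in range(n)]
            Br  = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
            ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
            Cr  = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
            if ABr != Cr:
                return False   # definitely A * B != C
        return True            # A * B == C with probability >= 1 - 2**-trials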

Turns out, the power of analytical methods was its own motivation, and we didn't need unreliable hardware to push us in that direction.


Today, there are at least two fields that address just these issues. The field of computational complexity investigates how many operations are required, depending on the problem.

I'm not too familiar with it, but I think control theory addresses the issue of accounting for the possibility of error.

As for the other field: when von Neumann mentions analytical tools, he is actually thinking of topological tools. To that end, type theory investigates formal logic with much inspiration from topology.


Related:

* Kurt Gödel's Letter to John von Neumann (1956)

https://news.ycombinator.com/item?id=19281633

Gödel was asking John von Neumann about the number of steps needed to solve the Entscheidungsproblem. I asked on HN for its background, and I was told it was one of the earliest discussions of computational complexity theory; unfortunately, John von Neumann did not have a chance to reply.

Thanks for posting it, now I see the bigger picture: even without seeing Gödel's letter, John von Neumann had already grasped the idea of computational complexity independently around the same time.


I think that Von Neumann raises three important issues, two of which are largely solved and one of which is not yet solved:

- Hardware failures

- Complexity theory

- CS and formal logic are combinatorial rather than analytical, so we can't use the powerful tools of mathematical analysis on those combinatorial problems.

This reminds me of a remark of Paul Erdős about the Collatz conjecture: "Mathematics may not be ready for such problems."


"formal logic ... deals with rigid, all-or-none concepts"

Von Neumann only had knowledge of classical logic, so he was not aware of fuzzy logic, logics that deal with infinite truth values, and other non-binary logics.

By recognizing the above-named inadequacy in the formal logic that he knew, von Neumann was unwittingly standing on the threshold of a new field. If only he had asked the right question (i.e. what could a non-binary logic look like?), a first-rate mind like his surely could have made some significant progress.


Von Neumann was thoroughly familiar with non-classical logic. He himself came up with several variants (including quantum logic) in the 1930s.


can someone provide an ELI5 for this critique?


The "normal" physical systems we build (say houses and hydraulic systems) are mostly "continuous". This means small changes in the inputs generally result in small changes in the behaviour of the system. So, for example, if the screws on your kitchen cabinet doors are not tightened quite enough, the cabinet doors might wobble a little -- that is, a small change resulted in a small change.

On the other hand, systems based on digital logic are very "discontinuous". A small change in an input can result in the system behaving vastly differently. A single-bit error in a computer program can result in an if-then-else switching from the true branch to the false branch, and thereby running an entirely different piece of code. So your computer is a single bit flip away from fintech entrepreneurs using your entire bank account to rent AWS time to mine Bitcoin.

In general, we are drawn to discrete logic because it's easy to write programs to do different things based on different inputs. However, von Neumann was worried about this, because no machine works perfectly -- every computer runs with a certain probability of error, and computers run fast enough that even a low probability of error can compound very quickly. And if a single small error can result in radically different results, how can you trust the result of a computer program?

In this note, written in 1947, he expressed the hope that if you could make computers behave continuously, you would not have to worry about small errors, because they would only have a small impact on the result.

However, a few years later von Neumann ALSO invented the alternative, modern way of handling these issues! In his 1956 paper _Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components_, he showed that it is possible to use redundancy to drive the error probability arbitrarily low, enough that you COULD trust the output of a computation.
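
To put rough numbers on the redundancy idea (my own illustration, not the construction in the paper): with three independent copies and a majority vote, a per-component error probability p drops to roughly 3p^2, and repeating the trick drives it down as far as you like.

    # Triple modular redundancy with majority voting: the voted result is wrong only
    # if at least two of the three independent copies fail.
    def majority_error(p):
        return 3 * p**2 * (1 - p) + p**3

    for p in (1e-2, 1e-4, 1e-6):
        print(p, majority_error(p))  # 1e-2 -> ~3e-4, 1e-4 -> ~3e-8, 1e-6 -> ~3e-12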


Great summary.



