How deep is the brain? The shallow brain hypothesis

audunw · on Oct 30, 2023

I seem to remember research stating that an individual neuron has very complex behaviour that requires several ML “neurons” / nodes to simulate. So if you do a comparison, perhaps the brain is deeper than you’d think by just looking at the graph of neurons and their synapses.

Could we construct a neutral net from nodes with more complex behaviour? Probably, but in computing we’ve generally found that it’s best to build up a system from simple building blocks. So what if it takes many ML nodes to simulate a neuron? That’s probably an efficient way to do it. Especially in the early phase where we’re not quite sure which architecture is the best. It’s easier to experiment with various neural net architectures when the building blocks are simple.

rmorey · on Oct 30, 2023

> I seem to remember research stating that an individual neuron has very complex behaviour that requires several ML “neurons” / nodes to simulate.

This is probably what you're remembering: https://www.sciencedirect.com/science/article/pii/S089662732...

titzer · on Oct 30, 2023

Yeah, biological brains could be remarkably more powerful than digital neural networks if the have primitive functions that we haven't accounted for. For example, some networks seem to encode information in the firing rate, rather than just the presence of a signal. If neurons could, e.g. do frequency-based calculations (and not just threshold-based, like spiking neural nets), they could be orders of magnitude more powerful and efficient. I am thinking particularly about neurons involved in, e.g. audio processing.

andbberger · on Oct 30, 2023

the entropy rate goes way up if you consider spike timing dependent signals as well. but the difference in computational capacity between the brain and ML lies less in the brain's inherently time-dependent dynamics and more in the impressive computational capacity of single neurons. Dendrites compute, electrochemical dynamics during action potentials compute, synapses compute. All in complex time-dependent ways. check out izhikevich's dynamical systems in neuroscience for a taste of the computational capacity of the electrochemical dynamical system alone

jacobsimon · on Oct 31, 2023

My guess is that the firing rate of biological neurons more or less simplifies to the activation in an artificial neuron. Higher firing rate = higher activation.

magicalhippo · on Oct 30, 2023

> Could we construct a neutral net from nodes with more complex behaviour?

Well there's spiking neural networks (SNN)[1], which are modeled more closely to how neurons actually work.

Main obstacle is still, as far as I know, that there's no way to train a SSN as efficiently as a "regular" neural network, which lends itself very nicely to gradient descent and similar[2].

[1]: https://en.wikipedia.org/wiki/Spiking_neural_network

[2]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9313413/

rsrsrs86 · on Oct 30, 2023

The brain backprops??????

andbberger · on Oct 30, 2023

there is no evidence to support this

chriskanan · on Oct 30, 2023

The brain has a lot of skip connections and is massively recurrent. In a sense, the brain can be thought of as having infinite depth due to recurrent thalamno-cortical loops. They do mention thalamno-cortical loops in the paper, so I think a more concrete definition of what is meant by "depth" would be helpful.

lausbub · on Oct 30, 2023

The "infinite depth" seems to be a matter of definition. It's practically infinite if you include feedback loops via learning. If you exclude learning, then it's far from "infinite". Activations linger for up to 15-30 seconds, so at oscillations of around 30 Hz that would result in about 450-900 loops (times an unknown small multiplier for the actual number of layers). But the brain presumably only backprops/optimizes a few layers at a time and not much "through" time.

sudosysgen · on Oct 30, 2023

There's also evidence that the brain does optimize through time and might be implementing, at least in some places, algorithms close to LSTD.

sheeshkebab · on Oct 30, 2023

It’s indeed odd that current dnn’s require massive amount of energy to retrain and lack any kind of practical continuous adaptation and learning.

quickthrower2 · on Oct 30, 2023

With computer-based intelligence we have the overhead of computing every bit though (probably) inefficient silicon and direct electric currents. The brain leverages the properties of chemicals, though millions of years of evolution.

jakobson14 · on Oct 30, 2023

The brain isn't a faster computer.

An infinitely-fast computer wouldn't meaningfully change the "expensive training vs fast, static inference" workflow that neural networks have always been developed around (except in the most brute force-y "retrain on the entire world, every single nanosecond" sense).

quickthrower2 · on Oct 30, 2023

I think we agree? I am talking to the efficiency of the brain. Not processing speed. Efficiency of the brain to do things advantageous to the selfish genes I guess.

The brain is supremely efficient at what the brain has evolved to do. It is almost tautological! Because if it wasn't, it wouldn't have evolved to that.

Silicon comes from an alien land, and is emulating. Even with the best algorithms there has to be a limit on how efficient a computer-based intelligence can be without changing how the chips work.

You could spin it around and say, well computers are better at many things than humans, and there is no way you could get a biological brain to be as good for the same amount of power (e.g. a raspberry pi can do calculations our brain couldn't possibly do).

DiggyJohnson · on Oct 30, 2023

Really well said, I think this is an excellent way to frame the dichotomy (comparison?).

Much of these threads make the binary mistake: can these systems be compared, or are they fundamentally different? A bit of both, almost certainly.

PaulDavisThe1st · on Oct 30, 2023

> The brain is supremely efficient at what the brain has evolved to do. It is almost tautological! Because if it wasn't, it wouldn't have evolved to that.

This echoes an extremely naive view of evolution.

There are many phenotypes in the living world which have evolved but for which there is no reason to believe that the phenotype is either (a) supremely efficient and/or (b) under selection pressure (the two are obviously related).

Evolution has no tautology. Brains do not evolve to be supremely efficient, just like humans do not evolve to be supremely efficient.

What exists today is that which has survived, for whatever reason. It's not even possible to say something as apparently simplistic as "the only purpose evolution respects is leaving behind more copies" because that ignores (a) group selection (b) changing ecosystems that favor plasticity in the long run.

cycomanic · on Oct 30, 2023

> There are many phenotypes in the living world which have evolved but for which there is no reason to believe that the phenotype is either (a) supremely efficient and/or (b) under selection pressure (the two are obviously related).

> Evolution has no tautology. Brains do not evolve to be supremely efficient, just like humans do not evolve to be supremely efficient.

> What exists today is that which has survived, for whatever reason. It's not even possible to say something as apparently simplistic as "the only purpose evolution respects is leaving behind more copies" because that ignores (a) group selection (b) changing ecosystems that favor plasticity in the long run.

A primary example of this are our legs, they would be much more efficient if the knees pointed backwards. They are not the most efficient design, but simply good enough.

m-i-l · on Oct 30, 2023

> "our legs, they would be much more efficient if the knees pointed backwards. They are not the most efficient design, but simply good enough."

I don't think you can say one leg type is better than another without reference to the intended use of the leg - plantigrade legs have better "stability and weight-bearing ability"[0], whereas digitigrade legs (like those of cats and most birds, which BTW appear to have a reverse knee but don't because it is the ankle working like a second backwards knee) "move more quickly and quietly"[1].

Tying this back to the original point, the same is true for brains and computers - they are each better in very specialist cases within specific constraints.

[0] https://en.wikipedia.org/wiki/Plantigrade

[1] https://en.wikipedia.org/wiki/Digitigrade

cortesoft · on Oct 30, 2023

> The brain is supremely efficient at what the brain has evolved to do. It is almost tautological! Because if it wasn't, it wouldn't have evolved to that.

Not really, evolution doesn't guarantee the brain will be supremely efficient. It just guarantees that it will be efficient ENOUGH.

edgyquant · on Oct 30, 2023

Again, it is efficient at what it does.

cma · on Oct 30, 2023

You're talking about something orthogonal, how efficient it is. He's talking about something different:

https://en.m.wikipedia.org/wiki/Catastrophic_interference

Which practically requires full retraining at every step to integrate new knowledge. I think we have some partial solutions like learning to select between finetunings, but not if the task needs to crosscut between them.

The human brain doesn't seem to suffer with catastrophic interference to nearly the same degree, independent of its computational efficiency, though there are possibly related things like developmental stages that if they are delayed may never be able to take place.

uoaei · on Oct 30, 2023

It's an apples-to-oranges comparison. They're both fruit that grow on trees, but that's where the similarities end.

The primary difference, and likely the reason that brains are unreasonably effective, is the specifics of the architecture and internal representations (in the rigorous, information-theoretic sense) of its computational systems. It's not quite analog but it uses analog means. It's not quite digital but it does process via abstractions.

You can still reasonably call the brain a "computer" if you decide it can shed the laden history of that word and its close association with binary operations using transistors. You can do so because it uses internal structures to process inputs and emit outputs. But like I said above, it requires a generalized interpretation of the word to start to understand where and how the two fields of study may be unified.

jakobson14 · on Oct 30, 2023

Yes, it's odd that sled dogs make terrible housepets. /s

Neural networks fundamentally aren't designed to be otherwise. The workflow that has guided their entire development for over a decade is based around expensive training and static inference.

sheeshkebab · on Oct 30, 2023

Why then all the talk about AGI when fundamentals don’t even allow for it to emerge.

ben_w · on Oct 30, 2023

Because "AGI" is very poorly defined, and ChatGPT is very "general" (compared to everything before it) and matches some (but not all) definitions of "intelligent".

jakobson14 · on Oct 30, 2023

Because drumming up talk about AGI is a really great way to get funding for your startup. The tech industry sustains itself on hype.

ben_w · on Oct 30, 2023

Not only but also.

anon291 · on Oct 30, 2023

Because transformers et al have gotten us the closest we've ever been to any system that can even claim to be AGI.

FeepingCreature · on Oct 30, 2023

First make it work; then make it efficient.

jakobson14 · on Oct 30, 2023

Your scientists were so preoccupied with whether or not they should that they didn't stop to think if they could.

drdeca · on Oct 30, 2023

... seems potentially better than the other way around? Well, I suppose it depends.

4death4 · on Oct 30, 2023

When you say “massive amount of energy” are you comparing the energy requirements to a single human or to the billions of years of solar and geothermal energy that went into producing the human species?

edmundsauto · on Oct 30, 2023

I don’t think this is an apt comparison, but I do think the amount of energy it takes to grow a human into brain maturity in adulthood is an interesting one. Brains + bodies over a 20 year development cycle is still probably much less than training even a low quality Llm.

4death4 · on Oct 30, 2023

Let’s say a human needs an average of 2000 calories a day. A calorie is roughly equivalent to 1 Watt hour, so over 20 years, it takes about 15 MWh to sustain a human.

Let’s say a single A100 has a peak power draw of 250W, and you need 100 to train an LLM. So each hour of training consumes 25,000 Wh of energy. 15 MWh / 25,000 W = 600 hours, or 25 days, which is probably pretty close to the true training time.

So the numbers are actually pretty close. But a human brain doesn’t start out as a set of random weights like an LLM. The human brain has predefined structure that’s the result of an extremely long evolutionary process.

edmundsauto · on Nov 3, 2023

It's probably taking the analogy too far, but perhaps the brain's predefined structure is akin to the original LLM training and our "life" is the fine tuning.

I wonder how many MWh the entirety of evolution represents.

Affric · on Oct 30, 2023

By that token the amount of energy for neural networks will be bound to some extent by the development of the biosphere and the creators of neural networks.

fastball · on Oct 30, 2023

Not really? The point is that most artificial neural networks are started from basically zero (random noisy weights), where as a human neural network is jump-started with an overall neural structure that has been shaped by millions of years of evolution. Sure, it's not fair to compare the overall energy required to get there, but the point is just that a biological neural network starts with a huge headstart that is frequently forgotten when talking about efficiency.

nomel · on Nov 1, 2023

See “The last question” for some sci-fi solutions to this.

phkahler · on Oct 30, 2023

>> It’s indeed odd that current dnn’s require massive amount of energy to retrain and lack any kind of practical continuous adaptation and learning.

To me that just means nobody has figured out how to do that effectively. The majority will simply make use of what's been done and proven, so we got a plateau at object recognition, and again at generative AI (with applications in several domains). One problem with continuous adaptation and learning is providing an "entity" and "environment" for it to "live" in which doing the adaptive learning. There are some researchers doing that either with robots, or simulations. That's much harder to set up than a lot of cloud compute resources. I do agree with you that these aspects are missing and things will be much more interesting when they get addressed.

golol · on Oct 30, 2023

In-comtext learning exists though.

beaugunderson · on Oct 30, 2023

https://anonymfile.com/dR8a/s41583-023-00756-z.pdf

hliyan · on Oct 30, 2023

"brain seems shallow and neural networks are deep, ergo neural networks are doing it wrong"

Please don't claim things the author didn't. What I read was "ergo (artificial) neural networks may be missing a trick"

hliyan · on Oct 30, 2023

Ignore. Reposted this under correct parent comment

rsrsrs86 · on Oct 30, 2023

Beyond the mere topological metaphor of neural networks there is almost nothing in common between brains and widigital computation. This is a widespread fallacy of category.

naasking · on Oct 30, 2023

> Beyond the mere topological metaphor of neural networks there is almost nothing in common between brains and widigital computation.

I mean, sure, but the topology is exactly what makes both work, so we only really care about the topology.

rando_dfad · on Oct 30, 2023

and more specifically, between chemical-based information processing systems and Von Neumann architectures for binary information processing.

Agreed, a widespread fallacy of category.

But computers still do some pretty cool things. Powerful tools.

epgui · on Oct 30, 2023

I completely disagree, and I think this is an example of human-exceptionalism bias.

lawrenceyan · on Oct 30, 2023

We have skip connections and recurrent neural networks at home.

jakobson14 · on Oct 30, 2023

If I had a nickel for every time some neurologist tried to compare brains to neural networks. It's a surefire way to tell someone is either desperate for grant money or has been smoking crack. (previously: comparing brains and "electronic computers")

Their entire article hinges on the complaint "brain seems shallow and neural networks are deep, ergo neural networks are doing it wrong."

Neurologists seem to have a really hard time comprehending that researchers working on neural networks aren't as clueless about computers as neurology is about the brain. They also vastly overestimate how much engineers working on neural networks even care about how biological brains work.

Virtually every attempt at making neural networks mimic biological neurons has been a miserable failure. Neural networks, despite their name, don't work anything like biological neurons and their development is guided by a combination of

A) practical experimentation and refinement, and

B) real, actual understanding about how they work.

The concept of resnets didn't come from biology. It came from observations about the flow of gradients between nodes in the computational graph. The concept of CNNs didn't come from biology, it came from old knowledge of convolutional filters. The current form and function of neural networks is grounded in repeated practical experimentation, not an attempt to mimic the slabs of meat that we place on pedestals. Neural networks are deep because it turns out hierarchical feature detectors work really well, and it doesn't really matter if the brain doesn't do things that way.

And then you have the nitwits searching the brain for transformer networks. Might as well look for mercury delay line memory while you're at it. Quantum entanglement too.

robbrown451 · on Oct 30, 2023

I can't agree with the dismissiveness of this comment, and frankly I find its tone out of line and not with the spirit of Hacker News.

There are insights that can come from studying the brain, that do indeed apply. Some researchers may not glean anything from such studies, and some may. I have no doubt that as neural networks get more an more powerful, we will continue to find more ways they are similar to the brain, and apply things we've learned about the brain to them.

I certainly prefer to see people making comparisons of neural networks to the brain, that the old "it's just a glorified autocomplete" and the like.

Relax.

ramraj07 · on Oct 30, 2023

No one disagrees we might be able to discern insights if we understand how our brain is wired. The problem is the current state of neuroscience is so flawed in its approach it’s not looking like they’re of any use. They don’t even understand how a 900 neuron worms system works but are more than happy to tap half a billion dollars from unsuspecting politicians saying they’ll map the human connectome. Go read the brain initiative proposal [1] to see how out of touch with reality the scientists in this field are. I agree with OP that sharp criticism of the entire field is fully warranted.

1. https://braininitiative.nih.gov/sites/default/files/document...

andbberger · on Oct 30, 2023

what are you talking about is this konrad kording's shitposting alt??? this reeks of naivety

I certainly have many critiques of methods used in neuroscience rn (as a working neuroscientist) but to reduce those to the conclusion that the entire project of neuroscience is hopeless is absurd. We understand certain things quite well actually, and it's not at all obvious what "understanding" at a larger scale would look like. It is very possible that the brain is irreducibly complex, and that the model you would need to construct to describe it would itself be so complex as to be useless in providing insight. Considering that the brain is by far the most complex object in the universe I think we're doing pretty well.

Furthermore, there are quite a lot of disagreements about the utility of connectomics. Outside of the extremists (Sebastian Seung and his ilk) no one thinks that connectomics is going to be the key that brings earth shattering insight. It's just another tool. There is a complete connectome for part of the drosophila brain already (privately funded btw), which is in daily use in many fly labs. It tells you what other neurons are connected to. Incredibly useful. Not earth shattering.

also you might want to measure the neuroscience funding you deem wasteful up against the tens of billions NASA is spending to send humans (and not robots) back to the moon for "the spirit of adventure". cold war's over. robots will do just fine for the moon.

ramraj07 · on Oct 30, 2023

Can you please elaborate what great strides the field of neuroscience has made in the past 30 years?

From where I stand I can’t see anyone giving a clear explanation of anything our brain does or does not do in a disease. The only novel treatment that has come out seems to have been stick a rod into the brain and zap it and it just magically cures a lot of diseases we still don’t understand even a bit.

This is not even starting to discuss what little we have learned about how brains algorithms work. I’m still waiting to understand why pyramidal neurons were somehow groundbreaking. We found some neuron that fires when you walk to a place, why wouldn’t we find one?

And what are you saying about the fly connectome again? Do we have exact names for every neuron in the fly brain and its verified connectome for every neuron?

Last I checked the worm connectome has been available in intricate detail for decades and the scientists still haven’t had any proper decoding of the algorithms in that system. In fact I know every lab trying to figure that out now, I wrote proposals in the topic myself. Everyone else has apparently decided it’s not sexy enough to work with worms so they have just leaped to more complex systems with no basic understanding. I’m not the only one saying this. Sydney Brenner said as much in an editorial. But the field was too busy doing I don’t know what to listen.

Sydney, B. & Sejnowski, T. J. Understanding the human brain. Science 334, 567 (2011).

I remember sauntering to the occasional neuroscience talk during my ut southwestern PhD and occasionally hearing some professor brag about how the majority of one of their PhD’s jobs was to segmenting a single neuron in the thousand EM images or something. Surely that’s a sign this field needs revision?

andbberger · on Oct 30, 2023

> And what are you saying about the fly connectome again? Do we have exact names for every neuron in the fly brain and its verified connectome for every neuron?

onus isn't on me to justify the existence of an entire field to you. the claim that neuroscience has not made great strides in the last 30 years is an extraordinary one, and that's all on you. but it especially doesn't help your case that if you had googled "fly connectome " you would have seen that the first result is a complete connectome of a larvae and the third result is the tour de force from Janelia that produced an adult connectome. With names and verified connections. there is even a wikipedia article for the drosophila connectome!

> I remember sauntering to the occasional neuroscience talk during my ut southwestern PhD and occasionally hearing some professor brag about how the majority of one of their PhD’s jobs was to segmenting a single neuron in the thousand EM images or something. Surely that’s a sign this field needs revision?

and if you had gone on to actually read the hemibrain connectome paper you would have gained some appreciation for the gargantuan achievement that it was. it took hundreds of person years to generate ground truth segmenting neurons by hand, to develop the ML techniques required to automatically segment the rest (extremely difficult problem) and to then validate the automatic segmentations. not to mention the insane effort it was to acquire a half petabyte EM image of a single fly at sub-synaptic resolution in the first place.

I gotta hand it to you though, the position of naivety you've delivered your middlebrow dismissal from is truly impressive in magnitude.

civilitty · on Oct 30, 2023

Agreed. Reading the GP’s comment it feels like it’s from bizzaro world. It’s the computer scientists who have been claiming that neural networks resemble the human brain - they even fucking named them neural networks for christ’s sake! That could be excused as naive hubris in the 1980s, it’s utter delusion now.

A surface review of neuroplasticity literature alone should free anyone of the illusion that “neural networks” have even a passing resemblance to biological neurons, something covered in neuroscience 101 and is widely internalized by its practitioners. The BS grant writing and PR scientists have to participate in is hardly reflect of state of the art science itself.

The irony is that machine learning methods are a perfect fit for neuroscience and biology in general which generates reams of data that is largely so multidimensional that manual analysis is intractable. What we’re seeing now is the crest of the academic hype cycle which - if the history of bioinformatics is anything to go by - means that ML will take years if not decades for the field to understand and filly utilize.

bjourne · on Oct 30, 2023

Actually it was neuroscientists that developed the models nowadays used for machine learning. The McCulloch-Pitts neuron model introduced in 1943 which lead to Frank Rosenblatt's perceptron introduced in 1958. Machine learning algorithms mostly still use those models but computational neuroscience has progressed towards much more complicated neuronal models.

anonymousDan · on Oct 30, 2023

It's typical of the arrogant, borderline anti-scientific attitude of a non-negligible fraction of the HN hive mind, i.e. if it came out of academia it must be a waste of time.

SubiculumCode · on Oct 30, 2023

As another working neuroscientist, thank you. And cheers.

__loam · on Oct 30, 2023

No I think these comments are quite necessary. People need to stop making these comparisons because they have absolutely no grounding in how brains actually work. There are bad ideas that should be dismissed.

fastball · on Oct 30, 2023

Neural networks are absolutely based on a very simplified model of how brains work. Specific NN architectures are in turn based on specific parts of the brain (e.g. Convolution Neural Networks are based on the visual cortices of cats/frogs).

andbberger · on Oct 30, 2023

nah, they're arbitrary function approximators that caught a lucky break. CNNs rose to prominence because natural scene statistics are translation invariant and convolutions can be efficiently computed on GPUs. and now that we have whole warehouses of GPUs, the current mood in DL is to stop building the symmetries of your dataset into the model (which is insane btw) and use brute force.

the tenuous connection DL once had to neuroscience (perceptrons) is a distant memory

fastball · on Oct 30, 2023

A fabricated re-telling of the past, given that we didn't start using GPUs for this type of compute until the turn of the millenium.

__loam · on Oct 30, 2023

If you want to talk about history, these things were invented using a 1950's understanding of neuroscience then promptly discarded until the ml people figured out how to make them useful.

andbberger · on Oct 30, 2023

AlexNet was the turning point for DL.

fastball · on Oct 30, 2023

Why do you say that? Deep Learning was accelerating well before that (I would argue it has been accelerating for its entire existence).

AlexNet was a state-of-the-art image recognition net for a (relatively) brief amount of time. It wasn't the first CNN to use GPU acceleration, and it was quickly eclipsed in terms of ImageNet performance.

Regardless, I think bringing up AlexNet kinda invalidates your initial point. Although yes, it turns out that the two were a great match, CNNs and modern GPUs were clearly developed independently of each other, as evidenced by the many, many iterations of both before they were combined.

andbberger · on Oct 30, 2023

is this schmidhuber's alt? sure they existed before AlexNet was where it really took off. just look at the number of citations. right paper, right time. CNNs were uniquely suited to the hardware at the time. because of their efficiency due to symmetry and suitability to GPGPU computing. not because of their history.

robbrown451 · on Oct 30, 2023

You're saying the study has no grounding in how brains work? I'd think a more reasonable conclusion would be that the neuroscientists involved have no grounding in how artificial neural networks work.

It seems the whole point is to bring in additional details of how brains work, that the think may be relevant to artificial NNs.

p1esk · on Oct 30, 2023

Artificial neural networks are the closest working model of a brain we have today.

Lots of graph nodes, with weighted connections, performing distributed computation (mainly hierarchical pattern matching), learning from data by gradually updating weights, using selective attention (and/or recurrence, and/or convolutional filters).

Which of the above is not happening in our brains? Which of the above is not biologically inspired?

In fact this description equally applies to both a brain and GPT4.

aeternum · on Oct 30, 2023

Many organisms have just a handful of neurons yet exhibit complex behavior that would be impossible given the weighted connections model. Not to mention single-celled organisms that exhibit ability to navigate.

The model can be the closest working model but that doesn't mean it is complete. It's very likely that cells can store memories/information independent from weights.

p1esk · on Oct 30, 2023

We can’t do that not because our mathematical neurons are too simple. We can’t do that because we don’t know the algorithms those biological neurons are running.

Do you see the difference?

__loam · on Oct 30, 2023

There is of course a difference between the two things you say. They're both the reason we can't recreate the brain in software though.

p1esk · on Oct 30, 2023

There are two separate goals: to simulate the brain in software, and to understand brain algorithms. They overlap, but they are still distinct, and appeal to different groups of people. Neuroscientists want to understand detailed brain operations. They are primarily interested in the brain itself. AI researchers want to understand intelligence, they are primarily interested in higher brain functions (e.g. reasoning, attention, short/long memory, emotions, motivations, goal setting, etc).

We can't (fully) recreate the brain in software partly because we don't know enough, and partly because it's too computationally complex - for example, we can't simulate an entire modern CPU at the transistor level - even though we know how each transistor works, and what each transistor does in the CPU - because each transistor requires a detailed physical model with hundreds of parameters. It's simply not computationally feasible using current supercomputers. Brain is even less feasible to simulate if we want to accurately simulate each individual neuron in it - even if we knew exactly how it works.

But the second goal is much more feasible, and we have made great progress simply by scaling up simple known algorithms which approximate some information processing functions in the brain (mainly pattern matching/prediction and attention). I can talk to GPT4 today just like I talk to other humans, and by the way, this is only possible because out of all AI/ML algorithms people have tried over the last 70 years, the most brain-like one have won (ANNs). If we want to make further progress in AI or if we want to make GPT5 to be more human-like (not sure we do), we don't necessarily need to simulate brain at a neuronal level, we simply need to understand a little bit more about higher level brain functions. Today, we (ML researchers) might actually benefit more from studying psychology than neuroscience.

ben_w · on Oct 30, 2023

> Many organisms have just a handful of neurons yet exhibit complex behavior that would be impossible given the weighted connections model.

That's rather a bold claim given that artificial neural networks are universal function approximators.

agalunar · on Oct 30, 2023

Impossible given that number of neurons.

It's perhaps not terribly surprising that it becomes possible with unlimited width or depth (or an arbitrarily complex activation function).

https://en.wikipedia.org/wiki/Universal_approximation_theore...

simiones · on Oct 30, 2023

It's incredible to me how widely this is misunderstood.

The universal function approximator theorem only applies for continuous functions. Non-continuous functions can only be approximated to the extent that they are of the same "class" as the activation function.

Additionally, the theorem only proves that for any given continuous function, there exists a particular NN with particular weight that can approximate that function to a given precision. Training is not necessarily possible, and the same NN isn't guaranteed to approximate any other function to some desired precision.

It seems pretty obvious to me that most interesting behaviors in the real world can't be modelled by a mathematical function at all (that is, for each input having a single output); if we further restrict to continuous functions, or step functions, or whatever restriction we get from our chosen activation function.

ben_w · on Oct 30, 2023

> The universal function approximator theorem only applies for continuous functions. Non-continuous functions can only be approximated to the extent that they are of the same "class" as the activation function.

Yes, and?

> Training is not necessarily possible

That would be surprising, do you have any examples?

> and the same NN isn't guaranteed to approximate any other function to some desired precision.

Well duh. Me speaking English doesn't mean I can tell 你好[0] from 泥壕[1] when spoken.

> It seems pretty obvious to me that most interesting behaviours in the real world can't be modelled by a mathematical function at all (that is, for each input having a single output)

I think all of physics would disagree with you there, what with it being built up from functions where each input has a single output. Even Heisenberg uncertainty and quantised results from the Stern-Gerlach setup can be modelled that way in silico to high correspondence with reality, despite the result of testing the Bell inequality meaning there can't be a hidden variable.

[0] Nǐ hǎo, meaning "hello"

[1] Ní háo, which google says is "mud trench", but I wouldn't know

simiones · on Oct 30, 2023

> Yes, and?

It means that there is no guarantee that, given a non-continuous function function f(x), there exists an NN that approximates it over its entire domain withing some precision p.

> That would be surprising, do you have any examples?

Do you know of a universal algorithm that can take a continuous function and a target precision, and return an NN architecture (number of layers, number of neurons per layer) and a starting set of weights for an NN, and a training set, such that training the NN will reach the final state?

All I'm claiming is that there is no known algorithm of this kind, and also that the existence of such an algorithm is not guaranteed by any known theorem.

> Well duh. Me speaking English doesn't mean I can tell 你好[0] from 泥壕[1] when spoken.

My point was relevant because we are discussing whether an NN might be equivalent to the human brain, and using the Universal Approximation Theorem to try to decide this. So what I'm saying is that even if "knowning English" were a continuous function and "knowing French" were a continuous function, so by the theorem we know there are NNs that can approximate either one, there is no guarantee that there exists a single NN which can approximate both. There might or might not be one, but the theorem doesn't promise one must exist.

> I think all of physics would disagree with you there, what with it being built up from functions where each input has a single output.

It is built up of them, but there doesn't exist a single function that represents all of physics. You have different functions for different parts of physics. I'm not saying it's not possible a single function could be defined, but I also don't think it's proven that all of physics could be represented by a single function.

ben_w · on Oct 30, 2023

> It means that there is no guarantee that, given a non-continuous function function f(x), there exists an NN that approximates it over its entire domain withing some precision p.

And why is this important?

> Do you know of a universal algorithm that can take a continuous function and a target precision, and return an NN architecture (number of layers, number of neurons per layer) and a starting set of weights for an NN, and a training set, such that training the NN will reach the final state?

> All I'm claiming is that there is no known algorithm of this kind, and also that the existence of such an algorithm is not guaranteed by any known theorem.

I think so: the construction proof of the claim that they are universal function approximators seems to meet those requirements.

Even better: it just goes direct to giving you the weights and biases.

> My point was relevant because we are discussing whether an NN might be equivalent to the human brain, and using the Universal Approximation Theorem to try to decide this. So what I'm saying is that even if "knowning English" were a continuous function and "knowing French" were a continuous function, so by the theorem we know there are NNs that can approximate either one, there is no guarantee that there exists a single NN which can approximate both. There might or might not be one, but the theorem doesn't promise one must exist.

I still don't understand your point. It still doesn't seem to matter?

If any organic brain can't do $thing, surely it makes no difference either way whether or not that $thing can or can't be done by whatever function is used by an ANN?

> It is built up of them, but there doesn't exist a single function that represents all of physics. You have different functions for different parts of physics. I'm not saying it's not possible a single function could be defined, but I also don't think it's proven that all of physics could be represented by a single function.

I could point you to this: https://www.youtube.com/watch?v=PHiyQID7SBs

But that would be unfair, given the QM/GR incompatibility.

That said, ultimately I think the onus is on you to demonstrate that it can't be done when all the (known) parts not only already exist separately in such a form, but also, AFAICT, we don't even have a way to describe any possible alternative that wouldn't be made of functions.

simiones · on Oct 30, 2023

> And why is this important?

Since we know non-continuous functions are used in describing various physical phenomena, it opens the gate to the possibility that there are physical phenomena that NNs might not be able to learn.

And while piece-wise continuous functions may still be ok, fully discontinuous functions are much harder.

> I think so: the construction proof of the claim that they are universal function approximators seems to meet those requirements.

Oops, you're right, I was too generous. If we know the function, we can easily create the NN, no learning step needed.

The actual challenge I had in mind was to construct an NN for a function which we do not know, but can only sample, such as the "understand English" function. Since we don't know the exact function, we can't use the method from the proof to even construct the network architecture (since we don't know ahead of time how many bumps there are are, we don't know how many hidden neurons to add).

And note that this is an extremely important limitation. After all, if the UAF was good enough, we wouldn't need DL or different network architectures for different domains at all: a single hidden layer is all you need to approximate any continuous function, right?

> If any organic brain can't do $thing, surely it makes no difference either way whether or not that $thing can or can't be done by whatever function is used by an ANN?

Organic brains can obviously learn both English and French. Arguably GPT-4 can too, so maybe this is not the best example.

But the general doubt remains: we know humans express knowledge in a way that doesn't seem contingent upon that knowledge being a single continuous mathematical function. Since the universal function approximator theorem only proves that for each continuous function there exists an NN which approximates it, this theorem doesn't prove that NNs are equivalent to human brains, even in principle.

> That said, ultimately I think the onus is on you to demonstrate that it can't be done when all the (known) parts not only already exist separately in such a form, but also, AFAICT, we don't even have a way to describe any possible alternative that wouldn't be made of functions.

The way physical theories are normally defined is as a set of equations that model a particular process. QM has the Schrodinger equation or its more advanced forms. Classical mechanics has Newton's laws of motion. GR has the Einstein equations. Fluid dynamics has the Navier-Stokes equations. Each of these is defined in terms of mathematical functions: but they are different functions. And yet many humans know all of them.

As we established earlier, the UFA theorem proves that some NN can approximate one function. For 5 functions you can use 5 NNs. But you can't necessarily always combine these into a single NN that can approximate all 5 functions at once. It's trivial if they are simply 5 easily distinguishable inputs which you can combine into a single 5-input function, but not as easy if they are harder to distinguish, or if you don't know that you should model them as different inputs ahead of time.

By the way, there is also an example of a pretty well known mathematical object used in physics that is not actually a proper function - the so-called Dirac delta function. It's not hard to approximate this with an NN at all, but it does show that physics is not strictly speaking limited to functions.

Edit to add: I agree with you that the GP is wrong to claim that the behavior exhibited by some organisms is impossible to explain if we assumed that the brain was equivalent to an (artificial) neural network.

I'm only trying to argue that the reverse is also not proven: that we don't have any proof that an ANN must be equivalent to a human/animal brain in computational power.

Overall, my position is that we just don't know to what extent brains and ANNs correspond to each other.

simiones · on Oct 30, 2023

> Lots of graph nodes

Neurons are not connected by a simple graph, there are plenty of neurons which affect all the neurons physically close to them. There are also many components in the body which demonstrably affect brain activity but are not neurons (hormone glands being among the most obvious).

> with weighted connections

Probably, though we don't fully understand how synapses work

> performing distributed computation (mainly hierarchical pattern matching)

This is a description of purpose, not form, so it's irrelevant.

> learning from data by gradually updating weights

We have exactly 0 idea how biological neural nets learn at the moment. What we do know for sure is that a single neuron when alone can adjust its behavior based on previous inputs, so the only thing that is really clear is that individual neurons learn as well, it's not just the synapses with their weights which modifies behavior. Even more, non-neuron cells also learn, as is obvious from the complex behaviors of many single-cell organisms, but also some non-neuron cells in multicellular organisms. So potentially, learning in a human is not completely limited to the brain's neural net, but it could include certain other parts of the body (again, glands come to mind).

> using selective attention (and/or recurrence, and/or convolutional filters).

This is completely unknown.

So no, overall, there is almost no similarity between (artificial) neural nets and brains, at least none profound enough that they wouldn't share with a GPU.

krainboltgreene · on Oct 30, 2023

What does this comment add to the discussion?

robbrown451 · on Oct 30, 2023

I dunno. My comment complained about the parent comment not adding positively to the discussion. And gave at least a bit of support for that complaint.

Would you have preferred I emulate your style, and complain while providing no support for my complaint?

Ok.

krainboltgreene · on Oct 30, 2023

Being positive is not a requirement of commenting on HN, but you should comment with something that is substantive, so yes I do think you shouldn't have commented at all. Tone policing is cringe.

JKCalhoun · on Oct 30, 2023

I don't like tone-policing in general. But when I opened this post the negative comment we're talking about was the top comment. That's makes me much more sympathetic to someone calling out the cynicism.

robbrown451 · on Oct 30, 2023

Exactly what are you doing here then?

But hey I guess I can do this too. How's this? Using cringe as an adjective is cringe.

krainboltgreene · on Oct 30, 2023

> But hey I guess I can do this too.

It sucks, doesn't it?

jacobsimon · on Oct 30, 2023

This is a really weird take. There is such a long history of shared insights between biology and neural network research, and to say they’re unrelated or can’t take inspiration from one another is bizarre.

> The concept of CNNs didn't come from biology

I just opened a survey paper on CNNs and literally the first sentence of the paper reads:

> “Convolutional Neural Network (CNN) is a well-known deep learning architecture inspired by the natural visual perception mechanism of the living creatures. In 1959, Hubel & Wiesel [1] found that cells in animal visual cortex are responsible for detecting light in receptive fields. Inspired by this discovery…”

Source: https://arxiv.org/pdf/1512.07108.pdf%C3%A3%E2%82%AC%E2%80%9A

jakobson14 · on Oct 30, 2023

That's later backfill, a retroactive change to give a manufactured "biological" origin story. Whether they're real or not, researchers love a good "we took this from nature, isn't nature wonderful!" explanation.

The C in CNN isn't "Convolution" for no reason. It came from work with convolutional filters (yay Sobel kernels!) which at it's height became filter banks and gabor filters and so on before neural networks pretty much killed off handcrafted feature development. Every explanation of how CNNs work still falls back to the original convolutional kernel intuition.

rerdavies · on Oct 30, 2023

> The C in CNN isn't "Convolution" for no reason.

The first N in CNN is "Neural" for a reason.

mjburgess · on Oct 30, 2023

Can you explain that reason?

Decision trees are called 'trees' for, more or less, the same reason.

ie., the diagrammed shape of a decision tree looks a little like the branches of a real one.

likewise, in the 50s where diagramming the earliest networks they were aiming to immitate a similar real-world structure.

Better that they had called them 'Variable Activation Networks' or some such, and none of this superstition would have started

TeMPOraL · on Oct 30, 2023

> Better that they had called them 'Variable Activation Networks' or some such

But that's the thing: they didn't. Instead, they called them "neural networks". It wasn't random.

It feels like part of the field now wants to pretend it was never about how to make a machine think. "No, we're only doing abstract maths, only going on self-contained explorations of CS theory." Yeah, right. That feels like a reaction to the new wave of AI hype in business. Now that the rubes are talking about thinking machines again, better distance themselves from them, lest we be confused for those loonies.

Thing is, the field was always driven in big part by trying to catch up with nature. It took inspiration from neuroscience, much like neuroscience borrowed some language from CS, both for legitimate reasons. A brain is a computer. It's precisely where the CS and neuroscience have an overlap - they're studying the same thing, just from opposite directions. It's just silly to play the "oh my field is better and your field doesn't know shit" game.

> Decision trees are called 'trees' for, more or less, the same reason.

Decision trees are called after the data structure, which is a way to express a mathematical object, which is older than CS and got that name from... who knows, but my money is on "genealogical tree", which itself is called a "tree" because people back then liked to tie everything to trees (symbol of growth) and flowers and cute animals (symbols of making babies).

The field inherited "trees" from the past. "Networks", too. But "neural" - that was a modern analogy the field itself is responsible for.

jacobsimon · on Oct 30, 2023

Yep! Trees, tree structures, tree diagrams have been regularly in use since the 1700s as a way of defining relationships. https://en.wikipedia.org/wiki/Tree_structure

There’s also a pretty large link between the formal representation of language using syntax trees, which was being formalized by linguists and by programming language developers around the same time: https://en.wikipedia.org/wiki/Formal_language?wprov=sfti1

dartos · on Oct 30, 2023

You can use that argument for anything you disagree with. Do you have a source or anything?

jakobson14 · on Oct 30, 2023

Have a read through the first paper describing a convolutional neural network, from 1998: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

There's absolutely no mention of biological inspiration whatsoever. At the same time, one can point to a long and rich history of convolutional filters being used in signal processing. And then there's the name, Convolutional Neural Network. The entire concept of a CNN is framed as a series of learned filters.

fastball · on Oct 30, 2023

That is definitely not the first paper describing a CNN. That is not even the first paper by Le Cun describing CNNs (he was already on them as early as 1989[1]).

Regardless, Le Cun is not the first to describe CNNs, merely one of the first to use them for OCR (specifically for hand-written text).

The first neural network arch to use convolutions instead of matmuls was this[2], from the year of our lord 1988. This in turn is based on Fukushima's "neocognitron"[3] (1980), which is based on the visual cortex of felines (from work done by Hubel and Wiesel in the 50s/60s).

I guess it is not super surprising you might be confused – Le Cun seems a bit more reticent than average to cite the work he's building on top of, and when he does it is frequently in reference to his own prior work. So if that is where you're getting your picture of artificial neural network history, your skewed perception makes sense.

[1] https://ieeexplore.ieee.org/abstract/document/41400

[2] https://proceedings.neurips.cc/paper/1987/file/98f1370821019...

[3] https://www.cs.princeton.edu/courses/archive/spr08/cos598B/R...

tudorw · on Oct 30, 2023

Thanks, I was looking for something to do with early work and saccades, didn't find that, but found this;

"The most influential of these early discussions was probably the 1943 paper of Warren McCulloch and Walter Pitts in which activity in neuronal* networks was identified with the operations of the propositional calculus. Actual simulations of recognition automata based on networks were carried out by Frank Rosenblatt before 1958 but the theoretical limitations of his "perceptrons" were soon pointed out by Marvin Minsky and Seymour Papert"

excerpt from a 1998 paper, "Real Brains and Artificial Intelligence" (https://www.jstor.org/stable/20025142)

"Walter Harry Pitts, Jr. (23 April 1923 – 14 May 1969) was an American logician who worked in the field of computational neuroscience.[1]"

'https://en.wikipedia.org/wiki/Walter_Pitts'

jacobsimon · on Oct 31, 2023

I don't know why I'm still responding to this thread 24 hours later, but just thought I'd add this tweet from Le Cun: "Neuroscience greatly influenced me (there is a direct line from Hubel & Wiesel to ConvNets) and Geoff Hinton. And the whole idea of neural nets and learning by adjusting synaptic weights clearly comes from neuroscience."

https://x.com/ylecun/status/1583872918634655744?s=20

jacobsimon · on Oct 30, 2023

Surely you are trolling me now. There is a very clear biological inspiration mentioned in this paper: they literally define a CNN as having “receptive fields” and then they cite the same Hubel & Wiesel research mentioned before multiple times. LeCun mentions their research in papers even earlier in the 80s as well, during which they were awarded the Nobel prize for their research on the visual system. Of course there is also a lot of computational and mathematical research that was ongoing simultaneously, but to say that there is “no inspiration whatsoever” is pretty far from the truth.

jacobsimon · on Oct 31, 2023

Plus: https://x.com/ylecun/status/1583872918634655744?s=20

albertzeyer · on Oct 30, 2023

Some time around mid 1995 until basically now, it became out of fashion to explain your motivations of some new modeling as inspired from biology, as that was often handwaving with only little understanding of the actual neuroscience. So that is why people stopped writing that in papers. Just let the actual performance numbers speak for themselves. Either you get good performance, then it doesn't really matter where this was inspired from, or it does not work well, then it also does not matter where this was inspired from. In machine learning, it mostly matters whether it works well or not.

readthenotes1 · on Oct 30, 2023

That's funny. I had a book on "neural nets" in the 1980s, and it mentioned the analog to brain neurons.

two_in_one · on Oct 30, 2023

While I agree with this emotional post there is one nuance. Neural networks aren't intelligent, brain is. And that's where we want to be. Checking gradients and studying filters can get us only this far. So, using brain as inspiration looks like a good option. There are other, but nobody knows where next breakthrough will be. Like nobody knew five years back that transformers are so powerful. My guess next step to AGI will be a complex modular multi-modal system. With hierarchy, workers and controllers, complex signals.. Sound familiar? Brain is sort of it. This is need for embodied AI, obviously. But, interesting thing, it's needed even for body-less AGI too. I.e. AGI is not a big calculator (!), it's more like real-time system. One reason is that full search is impossible. So, in many cases requests will be like 'give the best answer you can find in 4 seconds'. 'and keep looking'. So far we have only real-time dumb robots and NN big calculators. And brains, of course.

dilawar · on Oct 30, 2023

> previously: comparing brains and "electronic computers")

Before that: comparing brain with hydraulic machines. There has been tendency to compare brain with most complex machine known to us at that particular time.

"Descartes was impressed by the hydraulic figures in the royal gardens, and developed a hydraulic theory of the action of the brain. We have since had telephone theories, electrical field theories, and now theories based on computing machines… . We are more likely to find out how the brain works by studying the brain itself, and the phenomenon of behavior, than by indulging in far-fetched physical analogies." -- Karl Lashley 1951

bondarchuk · on Oct 30, 2023

Electronic computers, artificial neural networks, hydraulic machines, clockworks etc... are all computationally equivalent to the brain. Anyone making such comparisons is grasping at the fact that the brain can be understood computationally. To complain that there are no pressure-driven pistons, rotating gears or whatever in the brain is missing the point of the analogy, IMHO, which is: all these systems perform computation on top of a physical substrate, and what we actually (should) care about is the computation itself and not the mechanical workings of the substrate.

mjburgess · on Oct 30, 2023

I cannot agree enough with Karl here. What is the brain? An organic system with deep roots in the organic body, with deep causal connections with its environment.

There's little sense in ignoring the whole basic mode of operation, physics, chemistry and biology of the brain in order to analogise it to another system without any of those properties.

This, at best, provides a set of inspirations for engineers -- it does nothing for science.

TeMPOraL · on Oct 30, 2023

> There's little sense in ignoring the whole basic mode of operation, physics, chemistry and biology of the brain in order to analogise it to another system without any of those properties.

Sure there is. People had a feel for it back in "clockworks" times, nowadays we have a much better grasp because of progress of physics and math, particularly CS - mode of operation is an implementation detail. Whatever the mode, once you understand the behavior enough to model it in computational terms, you can implement it in anything you like - gears and levers, pistons, water flowing between buckets, electrons in silicon, photons going through lenses, photons diffusing through metamaterials, sound waves diffusing through metamaterials - and yes, also via a person locked in a room full of books telling them what to draw in response to a drawing they receive, and also via a billion kids following a game to the letter, via corporate bureaucracy, via board game rules, etc.

Substrate. Does. Not. Matter.

The only thing limiting your choice here is practical one. Humanity is getting a good mileage out of electrons in silicon, so that's the way to go for now. Gears would work too, they're just too annoying to handle at scale.

Of course, today we don't have a full understanding of biological substrate - we can't model it fully in terms of computation, because it's a piece of spontaneously evolved nanotech and we barely begun being able to observe things at those scales. We have a lot of studying in front of us - but this is about learning how the gooey stuff ticks, what does it compute and how. But it's not about some new dimension of computation.

mjburgess · on Oct 30, 2023

> Substrate. Does. Not. Matter.

It only doesnt matter for counting a system as implementing a pure algorithm, ie., one with no device access. This is an irrelevant theoretical curiosity.

Electronic computers are useful because they're electronic -- they can power devices, and modulate devices using that power. This cannot be done with wood, or most anything else.

"Substrate doesnt matter" is, as a scientific doctrine pseudoscience, and as a philosophical one, theological.

The causal properties of matter are essential to any really-existing system. Non-causal, purely formal properties of systems which can be modelled as functions from the naturals to the naturals (ie., those which are computable) are useless.

TeMPOraL · on Oct 30, 2023

> Electronic computers are useful because they're electronic -- they can power devices, and modulate devices using that power. This cannot be done with wood, or most anything else.

On the contrary. That's an implementation detail. You can "power devices, and modulate devices" by having a clockwork computer with transducers at the I/O boundary, converting between electricity and mechanical energy at the edge. It would work exactly like a fully electronic computer, if built to implement the same abstract computations - and as long as you use it within its operational envelope[0], you wouldn't be able to tell the difference (except for the ticking noise).

> The causal properties of matter are essential to any really-existing system. Non-causal, purely formal properties of systems which can be modelled as functions from the naturals to the naturals (ie., those which are computable) are useless.

Yes and no. Of course the causal properties of matter... matter. But the breakthrough in understanding, that came with development of computer science and information theory, is that you can take the "non-casual, purely formal" mathematical models of computation, and define some bounds on them (no infinite tapes), you can then use the real-world matter to construct a physical system following that mathematical model within the bounds, and any such system is equivalent to any other one, within those bounds. The choice of what to use for actual implementation is done on practical grounds - i.e. engineering constraints and economics.

It's how my comment reached your screen, despite being sent through some combination of electrons in wires, photons down a glass fibre, radio signals at various frequencies - hell, maybe even audio signals through the air, or printouts carried by pidgeons[1]. Computer networks are a living proof that substrate doesn't matter - as long as you stick to the abstract models and bounds described in the specs for the first three layers of ISO/OSI model, you can hook up absolutely anything whatsoever to the Internet and run TCP/IP over it, and it will work.

I bet there's at least one node on the Internet somewhere whose substantial compute is done in a purely mechanical fashion. And even if not, it could be done if someone wanted - figuring out how to implement a minimal TCP/IP stack using gears and switches is something a computer can do for you, because it's literally just a case of cross-compilation.

--

[0] - As opposed to e.g. plugging 230V AC to its GPIO port; the failure modes will be different, but that has no bearing on either machine being equivalent within the operational bounds they were designed for.

[1] - https://datatracker.ietf.org/doc/html/rfc1149

mjburgess · on Oct 30, 2023

> matter to construct a physical system following that mathematical model within the bounds, and any such system is equivalent to any other one, within those bounds

No. This wasnt discovered.

Nearly every physical system is implementing nearly every pure algorithm, ie., every computable function.

The particles of gas in the air in my room form a neural network, with the right choice of activation function.

Turing-equivalence is a property of formal models with no spatio-temporal properteis. Physical systems are not equivalent because they both implement a pure algorithm

Pure algorithms are useless, and of interest only in very abstract csci. All actual algorithms, when specified, have massive non-computational holes in them called 'i/o', device access etc.

If your two systems of cogs wants to communiate over a network of cogs, the Send() 'function' (which is not a function!) has to have a highly specific causal semantics which cannot be specified computationally.

These systems only have 'equivalent functions', as seen from a human point-of-view, if their non-computational parts serve equivalent functions. This has nothing to do with any pure algorithm.

You cannot implement a web browser on 'gears' in any useful sense, in any sense in which the partices of their air arent already implementing the web browser. That a physical system can-be-so-described is irrelevant.

Computers are useful not because theyre computers. Theyre useful because they are electrical devices whose physical state can be modulated with hyper-fine detail by macroscope devices (eg., keyboards). We have rigged a system of electrical signals to immitate a formal programming langauge -- but this is an illusion.

Reduce the system down to just want can be specified formally, and it disappears.

TeMPOraL · on Oct 30, 2023

> Nearly every physical system is implementing nearly every pure algorithm, ie., every computable function.

Sure. And also about the air and neural network. This is all irrelevant, for the same reason that every possible program and every possible copyrighted work being contained in the base-10 expansion of the number PI is irrelevant. Or that a photo of every event that ever happened anywhere is contained in the space of all possible (say) 1024x1024 24-bit-per-pixel bitmaps. It's all in there, but it's irrelevant, because you have no way of determining which combinations of pixels are photos of real events. And any random sample you take is most certainly not it.

> All actual algorithms, when specified, have massive non-computational holes in them called 'i/o', device access etc.

Only if you stick to a subset of maths you use for algorithms, and forget about everything else. The only actual hole there would be in your memory, or knowledge.

Sure, I/O doesn't play nice with functional programming. It doesn't stop functional programming from being useful with real computers in the real world. We have other mathematical frameworks to describe things that timeless, stateless computation formalisms can't. You are allowed to use more than one at the same time!

> You cannot implement a web browser on 'gears' in any useful sense, in any sense in which the partices of their air arent already implementing the web browser.

Of course I can. Here is the dumb approach for the sake of proof (one can do better with more effort):

1. Find a reference for how to make a NAND gate with gears. Maybe other logic gates too, but it's not strictly necessary.

2. Find the simplest CPU architecture someone made a browser for, for which you can find or get connection-level schematics of the chip; repeat for memory and other relevant components, up to the I/O boundary. Make sure to have some storage in there as well.

3. Build electricity/rotational motion transducers, wire them to COTS display, keyboard, mouse and Ethernet ports.

4. Mechanically translate all the logic gates and connections from point 2. to their gear equivalents using table 1., and hook up to 3.

5. Set the contents of the storage to be the same as a reference computer with a web browser on it.

6. Run the machine.

Of course, this would be a huge engineering challenge - making that many gears work together, in spite of gravity, inertia, tension and wear, and building it in under a lifetime and without bankrupting the world. Might be helpful to start by building tools to make tools to make tools, etc.

But the point is, it's a dumb mechanical process, trivially doable in principle. May be difficult with physical gears, but hey, it worked in Minecraft. People literally built CPUs inside a videogame this way.

> We have rigged a system of electrical signals to immitate a formal programming langauge -- but this is an illusion.

It's the other way around: we've rigged a system of electrical signals to make physical a formal theoretical program. We can also rig a system of optical signals, or hydraulic signals, or pidgeon-delivered paper signals, to "immitate a formal programming language" and implement a formal theoretical program - and as long as those systems immitate/implement the same formal mathematical model, they're functionally equivalent and interchangeable.

mjburgess · on Oct 30, 2023

I think you aren't following the defintion of 'computer' or 'computable', you seem to have a mixed physical notion of what a 'computer' is.

A computer, from a formal pov, is just an abstract mathematical object (like a shape) which has abstract properties (eg., like being a circle) that are computable, ie., are functions from integers to integers.

The physical devices we call 'computers', in many ways, arent. They exist in space and time and hence have non-computable properties, like their (continuous) extension in space and time.

See Turing's own paper where he makes this point himself, ie., that physical machines arent computers in his sense because they're continuous in time.

Insofar as you appeal to any causal aspects of a physical system you arent talking about a computer in turing's sense, and nothing like a turing equivalence would apply.

We already know that all computable functions can be implemented by 'arbitary substrates' -- this is just the same as saying that you can 'make a circle out of any material'.

In exactly the same sense as gears can be networked, sand dunes already are. You can just go around labelling particles of sand with 0s and 1s, and for a subset, there you have it: the computable aspects of the TCP/IP protocol.

But this is irrelevant. TCP/IP isnt useful because of its computable aspects. It's useful as a design sheet for humans to rig systems of electrical devices with highly specific causal properties.

The system we call 'the internet' is useful because it connects keyboards, screens, mice, microphones, webcams, SSDs, RAM, etc. together -- and because these devices are provide for human interaction.

The sand dune is likewise already implementing arbitary computable functions, so is the sun, so is the air, and any arbitary part of the universe you care to choose.

But the sand dune lacks all the properties the internet has: there's no webcam, no keybaord, no screen, etc.

What we actually use are physical properties. Talk of algorithms is just a design tool for people to build stuff

ben_w · on Oct 30, 2023

I mildly disagree (although your final conclusion is correct: it indeed does nothing for science).

The deepest fundamental structures in the brain[0] are quantum fields, which are also the deepest fundamental structures in everything else.

There is no known quantum field of "soul" or "intelligence".

The right abstraction is higher, and could still be a whole lot of things; but as maths can be implemented in logic, which can be implemented in electronics or clockwork or hydraulics, it doesn't matter what analogy is used — and my mild disagreement here is that such inspiration has been useful and gotten us this far.

[0] that we know of

mjburgess · on Oct 30, 2023

The process of evolution acts on organic systems, it doesn't act on quantum fields.

I appreciate there's some (imv strange) sense of 'intelligence' where 'finding the right puzzle piece' counts. I cannot fathom why we care about such a notion, and it seems to have almost nothing to do with what we do care about re 'intelligence'.

We care about that thing animals do, that thing which some do better than others. That thing which evolution brought about for (rapid) adaptive fitness to one's environment.

'Everything else is stamp collecting'

We already have a perfectly good understanding of puzzles and their solutions -- animals are their inventors

Intelligence isnt in the solution to a puzzle it's in its design, and especially, in what one does when one cannot solve it -- ie., how one adapts

The csci view of 'intelligence' is an act of self-aggrandising, it turns out to be: csci!

This is none-sense.

ben_w · on Oct 30, 2023

We can simulate evolution in a computer, and this is used as a form of AI directly.

That said, the way you're using biological evolution in your comment sounds as much like a strange analogy as all of the others: we may have some genetically programmed responses to snakes (bad) and potential mates (good), but we can also say that a loss of hydraulic pressure in our brain is a stroke, and use electrical signals to both read from and write to the brain.

What we evolved to think, while interesting from a social perspective, seems to me like the least interesting part of our brains from an AI perspective — it's the bit that looks like a hard-coded computer program, not learning, on the scale of a human life and seen from within.

mjburgess · on Oct 30, 2023

i'm referring to evolution as the process by which animals were built

if aliens had come down and given us laptops, rather we invented digital machines, then likewise i'd be talking about the relevant materials science, physics etc.

reverse engineering a laptop to figure out how it works would require extremely little computer science, and 'only at the end'

the reason digital computers are interesting and useful is that they route electricity around devices which are designed to be responsive to one another. the patterns of activation, as managed by the CPU, are weakly describable by abstract algorithms like sorting

starting with a laptop, and no further information, we'd be 100(s)+ years of research away from needing to understand that CPUs were implementing a sorting algorithm

and importantly, that it is doing so has almost nothing to do with the value of the device -- which lies in its ability to provide 'dynamical power and modulation of operation' using electricity

we're in the same situation with animals and people think that, what, understanding gradient descent or backprop is helpful? this is just some csci bs

ben_w · on Oct 30, 2023

I'm not really following you, sorry; this is all too disjointed.

> we're in the same situation with animals and people think that, what, understanding gradient descent or backprop is helpful? this is just some csci bs

Assuming I've actually got your point for this (and I'm not sure I have):

The backpropagation algorithm itself might be "just some csci bs" (it sure has vibes of "let us shortcut the maths rather than find out how our brains did it"), but gradient descent is nice and general-purpose — much like how evolution is both good for biology and in simulation for everything else.

mjburgess · on Oct 30, 2023

To get my point, imagine a laptop was delivered by an alien in the year 1900.

Now, try to take that seriously and think about the laptop as an actual object of experimental curiosity -- what exactly does science need to invent, discover, describe etc. to understand the operation of that laptop?

99.999% of that new knowledge has to be in physics and chemistry, before the tiny 0.0001% of theoretical csci knowlegde is brought to bare.

Consider how impossible it would be to apply any csci knowledge first: we do not even have the ability to measure the cpu state! So we could not even identify any part of the system with 0s, 1s, etc.

Now: that's a laptop!

Imagine now you're dealing with an animal.

Hopefully its now clear how ridiculous it is to describe basically any aspect of our mode of operation by starting with trivial little csci algorithms. It would be insane even with an actual electronic computer, let alone an organic system.

A system whereby clearly our organic properties are radically fundamental to our mode of operation

TeMPOraL · on Oct 30, 2023

Wrong.

Consider two hypothetical versions of this. One, the exact scenario as you described - history unfolded like it did, until the 1900 alien incident. CS and information theory is in its infancy. You're correct that most of the necessary work would first go to physics and chemistry and their various spin-off fields, because that's what's needed to build tools necessary to inspect the machine in full detail. The math would develop along the way, and eventually enough CS to make sense of the observations made before.

Now for an alternate scenario: it's the 1900 again, with the twist that CS is already well-developed theoretical field of mathematics (IDK, perhaps the same aliens dropped us a mechanical computer in year 1800). We'd still need to push physics and chemistry (and spin-offs) forward, but this time, we would know what we're looking for. We'd know the thing does computation, we'd be able to model what kind of computation it does. The question would change from "what does this thing do" to "how exactly does it compute the specific things we know it does". I imagine this would speed up the process of getting a complete picture, because it's easier to understand a specific solution to a problem once you know the answer, than it is to figure out the answer along with the solution.

In terms of understanding the brain, we are in the second situation. We may still know little about how the gooey thing ticks, but we have a growing understanding of what comes out of all that ticking, and a very good understanding of the fundamental rules of ticking.

mjburgess · on Oct 30, 2023

Nearly every physical system implements every algorithm -- if you wanted to find what in a laptop was 'sorting numbers' that would every part.

The light emitted by the screen is being 'sorted' as it is scanned out, the heat air by the fan is being 'sorted' as it swirls around, etc.

You cannot ask, "what physical system implements this algorithm?" as an investigative question, the answer is: nearly all of them.

This is why computable functions, ie., pure algorithms, are explanatorily useless. They play only a (observer-relative) 'design role' in creating real programs.

ben_w · on Oct 30, 2023

You're normally a lot more coherent than you have been in this thread, so… are you feeling alright? Getting enough sleep?

> The light emitted by the screen is being 'sorted' as it is scanned out, the heat air by the fan is being 'sorted' as it swirls around, etc.

This reads like either you're trolling, or that was written by an LLM, or English isn't your native language, or don't know what 'sorting' is, or you don't know what screens and fans do.

It's so fundamentally wrong I was actually tempted to get ChatGPT to respond to it, but that would be a bit mean and add little.

You're better than this. What's wrong?

mjburgess · on Oct 30, 2023

there's nothing garbled about this idea -- not sure about my messaging in this thread, maybe the explanations are a bit looser today

A computable function is a function from naturals to the naturals typically specified as an algorithm: a sequence of steps by which input numbers are transformed into output numbers.

Eg., consider sorting: 101, 001, 111, etc.

Now any physical system can have any component part associated with 0 or 1. There is no reason, a priori, to suppose that voltage flux on a CPU is a "1" or a "0" any more than to associate a photon emission.

If one associates a photon emission at some location with a 0, and another with a 1, then displaying content on a screen is a form of sorting.

Likewise a planet orbiting the sun is implementing a while(true) i = -1*i, if one associates -1/1 with position of the planet orbiting the sun. This is the heart of 'reversible computing'.

The only reason we associate some microscopic part of a CPU with 0, 1, etc. is by design it is something we as observers bring to bare on our interpretation of the physical system. But there's an infinite number of such attributions. We would only ever come to conclude that voltage flux across transitiors was relevant to the operation of a laptop via physics experiments --- no hope via computer science.

This is very important for understanding why csci is presently useless and misinformative as far as the brain is concerned. There are an infinite number of 0/1 attributions to make, and infinite number of algorithms being implemented etc. almost all of those are irrelevant.

Just, as you detect the absurdity, of using sorting algorithms to understand how an LCD works. This is presently less absurd than people talking about neural networks and equivocating with brain structures

TeMPOraL · on Oct 30, 2023

> This is very important for understanding why csci is presently useless and misinformative as far as the brain is concerned. There are an infinite number of 0/1 attributions to make, and infinite number of algorithms being implemented etc. almost all of those are irrelevant.

What makes brain a computer, and the air molecules in your room not a computer, is entropy. The behavior of air molecules is effectively random, the behavior of a brain very much not so.

Also, the universe isn't an uniform temperature soup where everything is equally random. There's energy cost to complexity, and there's a likelihood penalty to complexity. This gives us good confidence that the brain isn't doing something absurdly incomprehensible: it was made by evolution, which is a dumb, brute-force, short-term process. It didn't go out of its way to make things complex - it went with the first random thing that improved survival, which, being random, means generally the simplest thing that could work well enough.

Whatever trickery made brains tick, it must be something that's a) dumb enough for evolution to stumble on it, b) generic enough to scale up by steps small enough for evolution to find, all the way to human level, while c) conferring a survival advantage at every step of the way. Sure, the brain design isn't optimal or made in ways we'd consider elegant, but it's also not actively trying to be confusing. There's literally a survival penalty to being confusing (by means of metabolic cost)!

All to say, we're not dealing with a high-entropy blob of pure randomness. We're dealing with a messy and unusual system, but one that was strongly optimized to be as simple as one could get away with. This narrows down the problem space considerably, and CS is our helpful guide, at the very least by putting lower bounds on complexity of specific computations.

mjburgess · on Oct 30, 2023

As soon as you add these physical constraints on what counts as a 'computer' you're no longer talking about computers as specified by turing, nor computer science -- which is better called Discrete Mathematics.

You're conflating the lay sense of the term meaning 'that device that i use' with the technical sense. You cannot attribute properties of one to the other. This is the heart of this AI pseudoscience business.

All circles are topologically equivalent to all squares. That does not mean a square table is 'equivalent' to a circular table in any relevant sense.

If you want to start listing physical constraints: the physical state can be causally set deterministically, the physical state evolves causally, the input and output states are measurable, and so on -- then you end up with a 'physical computer'.

Fine, in doing so you can exclude the air. But you cannot exclude systems incapable of transfering power to devices (ie., useless systems).

So now you add that: a device which, through its operation, powers other devices. You keep doing that and you end up with 'electrical computers' or a very close set of physical objects with physical propeties.

By the time you've enumerated all these physical properties, none of your formal magical 'substrates dont matter' things apply. Indeed, you've just shown how radically the properties of the substrate do apply -- so many properties end up being required.

Now, as far as brains go -- the properties of 'physical computers' do not apply to them: their input/output states may be unmeasurable (eg., if QM is involved); they are not programmable (ie., there is no deterministic way to set their output state); they do not evolve in a causally deterministic way (sensitive to biochemical variation, randomness, etc.).

Either you speak in terms of formalism, in which case you're speaking in applicable non-explanaotry toys of discrete mathematicans'; or you start trying to explain actual physical computers and end up excluding the brain.

All this is to avoid the overwhelmingly obvious point: the study of biological organisms is biology.

spindle · on Oct 30, 2023

And also comparing brains to clockwork.

crustacean111 · on Oct 30, 2023

CNNs actually are biologically inspired. The receptive field in a CNN mimics the way that cortical neurons only respond to stimuli in a restricted region of the visual field. Different cortical neurons have receptive fields that partially overlap to cover the whole visual field [1].

[1] - https://en.wikipedia.org/wiki/Convolutional_neural_network

jakobson14 · on Oct 30, 2023

You're going to have to dig deeper. The concept of a receptive field goes all the way back to convolutional filters.

It's not surprising that we found out later the brain also uses such a fundamental element of signal theory.

SubiculumCode · on Oct 30, 2023

Oh good. So you do admit that there are useful parallels between signal processing, statistical processing, and the brain.

vkou · on Oct 30, 2023

Sure, and airplanes are inspired by birds. That doesn't mean that detailed studies of the Boeing 747 are going to unlock a lot of hitherto unknown mysteries of heron behaviour.

jacobsimon · on Oct 30, 2023

I mean, I know you’re just providing an analogy, but people are still studying the physics of bird flight and we’re nowhere close to building machines yet that can maneuver the way birds can. https://www.quantamagazine.org/geometric-analysis-reveals-ho...

ben_w · on Oct 30, 2023

I could believe "we have more to learn", but not "we're nowhere close":

https://youtu.be/w6VLzKACnS8?si=DZgOPuBRG4Vt98su

jacobsimon · on Oct 30, 2023

wslh · on Oct 30, 2023

Only an observer of the topic but I think it is good to review Koch's book about the real complexity of a single neuron [1].

[1] https://www.amazon.com/Biophysics-Computation-Information-Co...

mrstone · on Oct 30, 2023

A neurologist is a medical doctor. Neuroscientists are the PhDs who do the actual research.

blovescoffee · on Oct 30, 2023

Dude. What holy and special work do you do? There's nothing dumb or dull in searching for analogous structure between two effective machines, neither of which we understand.

hliyan · on Oct 30, 2023

"brain seems shallow and neural networks are deep, ergo neural networks are doing it wrong"

Please don't claim things the author didn't. What I read was "ergo (artificial) neural networks may be missing a trick"

b33j0r · on Oct 30, 2023

Agreed, but I do also think that order emerged from chaos. It’s an easy claim when order is defined by itself!

But in reality, we’re equipped exactly to exist, and we still wonder why in a backwards way, even with education (guilty!)

AI is the task of playing God like toddlers at recess, and LLMs the tower of babel. I still wanna play, it’s fun

bjourne · on Oct 30, 2023

First, I wonder how you got access to the article? It is behind a paywall and not yet uploaded to the sites I usually find paywalled articles on.

Second, there is no need to compare brains to neural networks because brains are neural networks. Neurons form vertices and axons edges connecting the aforementioned. What you are perhaps thinking of are artificial neural networks - most of which are very dissimilar to brains. But even then you are wrong. Artificial Izhikevich and Hodgkin-Huxley neural networks attempts to closely mimic the behavior of real neurons.

While deep, hierarchical artificial neural networks have been more successful than biologically plausible ones, that may be because the technology isn't ready yet. After all, the perceptron was invented in the 1950's but didn't become prominent until the 2010's (or so). Perhaps we need new memories that better map to (real) neural network topologies, or perhaps 3d chips that can pack transistors in the same way brains pack neurons.

mjan22640 · on Oct 30, 2023

A neuron is analogous to a 3d integrated circuit rather to a transistor. A molecule acts like a transistor https://medium.com/the-physics-arxiv-blog/the-origin-of-life...

Changes in mechanical pressure, electric field, other molecules attachment, photon absorption, can control the conductivity.

Organic semiconductors designed to fit like lego bricks to naturally build the desired structure are IMHO the way to go to produce 3d circuits, rather than layered silicone litography.

ben_w · on Oct 30, 2023

> silicone

I've seen this particular mistake a lot recently. New and exciting auto-corrupt from the latest version of iOS?

Given that our brains rewire themselves live, which ANNs can only do by being excessively connected and updating weights to/from zero, silicone (I'm thinking mainly the oil form) may be a better inspiration than lego.

https://en.wikipedia.org/wiki/Silicone

mjan22640 · on Oct 30, 2023

Yes, puzzle pieces would be more accurate than lego.

The bonds that silicone forms do not AFAIK allow as rich variety of polymers as carbon.

ben_w · on Oct 30, 2023

Silicone, with the e on the end, is one of the main polymers.

SubiculumCode · on Oct 30, 2023

If you read this article, I think most would understand that it is primarily aimed at other neuroscientists, and only using ML structures an an analogy only, and I think a somewhat useful one to boot. The real point of the article was to propose a general hierarchy for how information flows in the brain, to emphasize the importance of subcortical brain even in higher order cognition, and proposes how simultaneous processing of multiple levels of representation can inform action and thought.

As a developmental neuroscientist, I found the article insightful and thought provoking. Further, it is quite consistent with major hypotheses in psychology, how the hippocampus works (a subcortical structure) and combines information into memories: See fuzzy trace theory [1], for example.

Your dismissive tone is unappreciated, ill-informed, and crass.

[1] https://en.wikipedia.org/wiki/Fuzzy-trace_theory

visitor4711 · on Oct 30, 2023

fully agree

radarsat1 · on Oct 30, 2023

> every time some neurologist tried to compare brains to neural networks

Value of this comment aside, it kind of makes me chuckle how casually it (and other comments in this thread) just drops the word "artificial" from neural networks here, specifically when comparing with neurology. The irony is funny. Like, somehow we've forgotten why we call them that in the first place, exactly when talking about the thing that inspired the approach.

NoToP · on Oct 30, 2023

I disagree profoundly.

There are things the brain does we have not yet been able to reproduce with a neural network, or to the extent we have seemingly with excessive resources of training and network size. Therefore there is some salient feature of neurology which has been overlooked. I don't think it is necessary to mimic biology down to the exact function of real neurons, but there must in fact be something we are neglecting to mimic.

ben_w · on Oct 30, 2023

Possibly, but it may also be that we're training them wrong.

"Book smart, not street smart" (to use a catchphrase) would apply perfectly to GPT models: brain the size of a rodent's, with 50,000 year's experience of reading Reddit, Wikipedia, and StackOverflow, but no "real life" experiences of its own.

andromaton · on Oct 30, 2023

Books and articles I was reading in the 80s (eg Minsky and Papert, Byte magazine) were referring to Rosenblatt and retinas.

nathias · on Oct 30, 2023

Metaphores and analogies are important tools of thinking, even in science, some bear fruits some lead to errors, but we can't know in advance.

peyton · on Oct 30, 2023

I dunno, failure seems okay. Wouldn’t expect a better paradigm to beat SOTA at first. It’s totally plausible that neurons use eg. transposons in a way we don’t yet have the instrument resolution to characterize, which would suggest that you don’t need 1000 layers, but a lookup table or something.

svara · on Oct 30, 2023

Doesn't know what a neurologist is, knows they do shit work.

__loam · on Oct 30, 2023

As a biomedical engineer who went into software, thank you for this comment lol. So tired of rehashing this.

phlogisticfugu · on Oct 30, 2023

deep learning models have already been permitting "shallow signals" for a while. see "skip connections"

https://theaisummer.com/skip-connections/

_p9wz · on Oct 30, 2023

Who finally cares how exactly an ANN matches a human brain? Is such ANN smarter than ChatGPT?

It is more useful to use AI to develop more ecologically valid measurement methods for biology.

MagicMoonlight · on Oct 30, 2023

If it was shallow then it wouldn’t take 25 years for a human brain to fully train. The fact that some parts of it need that much data mean they must be way up the hierarchy.

GranularRecipe · on Oct 30, 2023

The reason for deep learning is that shallow networks are very hard (or impossible) to train. In that sense, long time of training is evidence for shallow networks.

IshKebab · on Oct 30, 2023

No it's because shallow networks can't express complex functions. If you think about it the shallowest network is pretty much a lookup table. They can theoretically model any function, but the number of parameters needed means in practice they can't. Deep networks can learn much more complex functions for the same number of parameters.

simiones · on Oct 30, 2023

> They can theoretically model any function, but the number of parameters needed means in practice they can't.

Even theoretically, no they can't. They can theoretically model any continuos function.

Plus, even for continuous functions, the theorem only proves that, for any function, there exists some NN that approximates it to arbitrary precision. It is not known whether there is some base NN + finite training set that could be used to arrive at that target NN using some algorithm in a finite number of steps.

nightski · on Oct 30, 2023

I'm not sure it is all that interesting of a distinction seeing as non-continuous functions can be approximated by continuous ones (basically the entire premise of a digital computer).

simiones · on Oct 30, 2023

I don't think this is right at all. Digital computers express non-continuous functions, and they sometimes use those to approximate continuous functions.

For example, for a function f(x) defined on R with f(x) = -x if x < 0 and f(x) = 7+x if x >= 0, how would you approximate it by a continuous function g(x) with precision lower than, say, 1 (i.e. |f(x) - g(x)| < 1 for any x in R)?

And of course, there are functions with much worse discontinuities than this.

IshKebab · on Oct 30, 2023

He was talking about the actual signals in chips being continuous I believe.

GranularRecipe · on Oct 30, 2023

This is not only an issue for shallow networks. As far I know, both points apply to all feed-forward networks regardless of depth.

simiones · on Oct 30, 2023

Yes, both apply regardless of depth (as long as it is finite, I imagine).

PartiallyTyped · on Oct 30, 2023

I mean… a 3 layer network is a Universal approximator… and you can very much do network distillation… it’s just that getting them wide enough to learn whatever we want them to isn’t computationally efficient. You end up with much larger matmuls which let’s say for simplicity exhibit cubic scaling in the dim. In contrast, you can stack layers and that comes much more computationally friendly because your matmuls are smaller.

Of course you then need to compensate with residuals, initialisation, normalisation, and all that, but it’s a small price to pay for scaling much much better with compute.

IshKebab · on Oct 30, 2023

> a Universal approximator

Yes that's exactly my point. A lookup table is a universal approximator. Good luck making AI with LUTs.

It's kind of like the halting problem or the no-free-lunch theorem. Interesting academic properties but they don't really have any practical significance and often confuse people into thinking that things like formal verification and lossless compression are impossible.

GranularRecipe · on Oct 30, 2023

What the ratio for the number of parameters required to learn some complex function between a shallow network and a deep network (preferably as a function of the complexity)?

numpad0 · on Oct 30, 2023

Is that to say something along, flapping wings is overcomplicated and stupid?

gwern · on Oct 30, 2023

That doesn't follow. Shallow networks can be harder to train than a deep one, which is one of the old arguments for why you should train a deep NN despite its many disadvantages (like latency - often a matter of life and death for biological organisms!). The depth allows easier learning.

This is why today, if you need a low-latency NN, which means a shallow one, often your best bet is to train a deep one first and then distill or prune it down into a shallow one. Because the deep one is so much easier, while training a shallow one from scratch without relying on depth may be an open research question and effectively impossible.

ralfd · on Oct 30, 2023

> If it was shallow then it wouldn’t take 25 years for a human brain to fully train.

It doesn't. You can speak perfectly fine with children. And in fact some teenagers think they know everything.

Salgat · on Oct 30, 2023

The brain communicates with itself, so deep layers are equivalent to sections of the brain talking to each other. The only relevance white matter depth has is with regard to how it's trained, and since it doesn't use gradient descent, it's irrelevant to neural networks in that regard.

blovescoffee · on Oct 30, 2023

Intercommunication does not equal layer depth.

Salgat · on Oct 30, 2023

Why not? All a deep neural network is doing is progressive data transformations into something more abstract and meaningful to later layers.

lawlessone · on Oct 30, 2023

So does this mean DNN are in some ways deeper than human brains?

bjornsing · on Oct 30, 2023

> This shallow architecture exploits the computational capacity of cortical microcircuits and thalamo-cortical loops that are not included in typical hierarchical deep learning and predictive coding networks.

As I understand it the thalamus is basically a giant switchboard though. I see no reason to believe that it never connects the output of one cortical area to the input of another, thus doubling the effective depth of the neural network. (I haven’t read this paper though, as it was behind a paywall.)

Simon_ORourke · on Oct 30, 2023

Judging by some of the levels of driving around these parts, the brain may be very shallow indeed.

low_tech_punk · on Oct 30, 2023

Replay of Jeff Hawkins group’s A Thousand Brains theory?

SubiculumCode · on Oct 30, 2023

"his theory" lol. Jeff Hawkins is a bit player

rando_dfad · on Oct 30, 2023

Original to Jeff or not, "A Thousand Brains" does a decent job presenting an interesting and highly plausible model of how the neocortext may function.

Your comment would be very valuable to me if it included pointers to better sources. I have sufficient background to see gaps in Jeff's book, and would be interested in exploring these, perhaps through the references you seem to be aware of.