We know how neural nets work. We don't know how a specific combination of weights in the net is capable of coherently answering questions asked in a natural language, though. If we did, we could replicate what it does without training it.
> We know how neural nets work. We don't know how a specific combination of weights in the net is capable of coherently answering questions asked in a natural language, though.
these are the same thing. the neural network is trained to predict the most likely next word (or rather, token): that's how it works. that's it. you train a neural network on data, it learns the function you trained it to learn, it "acts" like the data. have you actually studied neural networks? do you know how they work? I'm confused why you and so many others are seemingly so confused by this. what, fundamentally, are you asking for to meet the criteria of knowing how LLMs work? some algorithm that can look at the weights and predict whether the net will output "coherent" text?
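to make that concrete, here's a minimal sketch of the objective as a toy character-level bigram model in PyTorch (the text, model, and hyperparameters are all made up for illustration, but the loss is the same next-token cross-entropy a real LLM trains on):

```python
import torch
import torch.nn.functional as F

text = "the cat sat on the mat. " * 8
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in text])

# a bigram "model": logits for the next character given only the current one
logits_table = torch.zeros(len(vocab), len(vocab), requires_grad=True)
opt = torch.optim.SGD([logits_table], lr=1.0)

for step in range(300):
    xs, ys = ids[:-1], ids[1:]  # (current token, next token) pairs
    # predict the next token; this is the entire training objective
    loss = F.cross_entropy(logits_table[xs], ys)
    opt.zero_grad()
    loss.backward()
    opt.step()

# sampling from the trained table: it "acts" like the data
with torch.no_grad():
    tok, out = stoi["t"], "t"
    for _ in range(24):
        probs = F.softmax(logits_table[tok], dim=0)
        tok = torch.multinomial(probs, 1).item()
        out += vocab[tok]
print(out)
```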
> If we did, we could replicate what it does without training it.
It's like you're describing a compression program as "it takes a big file and returns a smaller file by exploiting regularities in the data." Like, you have accurately described what it does, but you have in no way answered the question of how it does that.
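To make the analogy concrete, here is the "what" of a real compressor via Python's zlib (toy data, illustrative only). Nothing about this interface tells you how DEFLATE works internally (LZ77 matching, Huffman coding), even though the description of its behavior is perfectly accurate:

```python
import zlib

data = b"the cat sat on the mat. " * 100
small = zlib.compress(data)

# big file in, smaller file out, by exploiting regularities in the data
print(len(data), "->", len(small))
assert zlib.decompress(small) == data
```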
If you then explain the function of a CPU and how ELF binaries work (which is the equivalent of trying to answer the question by explaining how neural networks work), you still have not answered the actually important question, which is: "what are the algorithms that LLMs have learnt that allow them to (apparently) converse and somewhat reason like humans?"
Yes, in the analogy it's equivalent to saying you know "what" every instruction in the compression program is doing: `push` decrements `rsp`, `xor rax, rax` zeroes out the register. You know every step. But you don't know the algorithm those instructions are implementing, and that's the same situation we're in with LLMs. We can describe their actions numerically, but we cannot describe them behaviorally, and they're doing things that we don't know how to otherwise do with numerical methods. They've clearly learnt algorithms, but we cannot yet formalize what they are. The universal approximation theorem actually works against your argument here, because it's too powerful: they could be implementing anything.
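Here's a toy version of that situation in Python/NumPy: every multiply-add below is fully visible, and the weights (hand-picked here for illustration) happen to implement XOR, but nothing in the raw numbers announces that behavior:

```python
import numpy as np

# weights chosen by hand so the net computes XOR; illustrative values only
W1 = np.array([[20.0, 20.0], [-20.0, -20.0]])
b1 = np.array([-10.0, 30.0])
W2 = np.array([20.0, 20.0])
b2 = -30.0

def net(x):
    # every multiply-add is numerically transparent...
    h = 1 / (1 + np.exp(-(x @ W1.T + b1)))
    # ...but the algorithm (hidden OR and NAND gates, ANDed together) is not
    return 1 / (1 + np.exp(-(h @ W2 + b2)))

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(float(net(np.array(x)))))  # prints the XOR truth table
```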
edit: We know the data that their function outputs; it's a "blurry jpeg of the internet" because that's what they're trained on. But we do not know what the function is, and being able to blurrily compress the internet into a TB or whatever is utterly beyond any other compression algorithm known to man.
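A back-of-envelope sketch of that compression claim (the parameter count and corpus size below are assumed, illustrative figures, not any particular model's):

```python
# rough arithmetic only; both figures are assumptions for illustration
params = 500e9               # hypothetical 500B-parameter model
weights_bytes = params * 2   # fp16: 2 bytes per parameter -> ~1 TB
corpus_bytes = 10e15         # hypothetical ~10 PB of raw training text

print(f"weights: ~{weights_bytes / 1e12:.0f} TB, corpus: ~{corpus_bytes / 1e15:.0f} PB")
print(f"lossy 'compression ratio': ~{corpus_bytes / weights_bytes:,.0f}:1")
```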