That might be a good criterion, but it still depends on your application. Most web apps have hardly any overhead under load, so it's essentially just as efficient to load the whole codebase as a monolith into each node as you scale up.
Correct. The hierarchical breakdown of services is orthogonal to the scaling unit of code. If every node in the cluster can execute every function, there is no need to split things out.
It is only when deployment and coordination become an issue that _deployment_ needs to be split up. But with our current RPC mechanisms, deployment and invocation are over-coupled, so we have to make these decisions consciously when the runtime could be making them for us.
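
To make the decoupling concrete, here is a rough Python sketch of what "the runtime decides" could look like. Everything in it is hypothetical and made up for illustration: the `REGISTRY`, the `service` decorator, the `Runtime` class, and the `placement` table are not from any real framework, and the "remote" hop is just an in-process method call standing in for a real RPC. The point is only that the call site stays the same whether the function runs locally or on another node.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: every node loads the whole codebase, so every
# registered function is available locally on every node.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def service(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name (illustrative helper)."""
    REGISTRY[fn.__name__] = fn
    return fn


class Runtime:
    """One node. Callers use invoke(); placement decides where it runs."""

    def __init__(self, node: str, placement: Dict[str, str],
                 nodes: Dict[str, "Runtime"]) -> None:
        self.node = node
        self.placement = placement  # function name -> node pinned to run it
        self.nodes = nodes          # stand-in for a real network transport
        self.log: List[Tuple[str, ...]] = []

    def invoke(self, name: str, *args: Any) -> Any:
        # Functions with no placement entry run wherever the call happens.
        target = self.placement.get(name, self.node)
        if target == self.node:
            self.log.append(("local", name))
            return REGISTRY[name](*args)
        # Simulated remote call; a real system would serialize and send it.
        self.log.append(("remote", name, target))
        return self.nodes[target].invoke(name, *args)


def make_cluster(names: List[str],
                 placement: Dict[str, str]) -> Dict[str, Runtime]:
    """Build nodes that share one placement table (illustrative helper)."""
    nodes: Dict[str, Runtime] = {}
    for n in names:
        nodes[n] = Runtime(n, placement, nodes)
    return nodes


@service
def price(quantity: int, unit_cents: int) -> int:
    return quantity * unit_cents


# Monolith on every node: no placement entries, so every call stays local.
monolith = make_cluster(["a", "b"], {})
monolith_total = monolith["a"].invoke("price", 3, 250)

# Split deployment: same call site, but "price" is now pinned to node "b".
split = make_cluster(["a", "b"], {"price": "b"})
split_total = split["a"].invoke("price", 3, 250)
```

In this sketch, moving `price` to its own node is a change to the `placement` table and nothing else. That is roughly the kind of decision that could be left to the runtime instead of being baked into how the code is written and invoked.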