Everything that's heavily vectorized in the Python ecosystem, including NumPy, achieves it using optimized backend code written in other languages, Fortran in particular. Python is only a thin veneer over those backends. In fact, you're constantly reminded to offload as much control flow as possible to those backends for the sake of performance, instead of doing things like looping in Python. If that's enough to consider Python good at vectorization, I can just link high-performance Fortran libraries with C, handle the non-vector control flow from there, and call it a day. I guarantee you that arrangement will be far more performant than anything Python can ever achieve. I have to strongly agree with the other commenter's observation that the memory model is the key to vector performance.
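(As a rough sketch of the "don't loop in Python" point, here's the usual comparison between a pure-Python loop and a single NumPy call that pushes the whole loop into compiled code. The sizes and timing scaffolding are arbitrary, just for illustration.)

    import time
    import numpy as np

    n = 10_000_000
    a = np.random.rand(n)
    b = np.random.rand(n)

    # Pure-Python loop: the interpreter executes every iteration itself.
    t0 = time.perf_counter()
    out = [a[i] * b[i] for i in range(n)]
    t_loop = time.perf_counter() - t0

    # Vectorized form: one Python-level call, the loop runs in compiled code.
    t0 = time.perf_counter()
    out_vec = a * b
    t_vec = time.perf_counter() - t0

    print(f"python loop: {t_loop:.2f}s, numpy: {t_vec:.4f}s")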
And of course Python has a memory model. While that model is not as well understood as C's, it is the key to Python's success and popularity as a general-purpose programming language and as a numeric/scientific programming language. Python's memory model, unlike C's or Fortran's, isn't designed for high performance. It's designed for rich abstractions, high ergonomics, and high interoperability with those performant languages. For most people, the processing time lost executing Python code is an acceptable trade-off for the highly expressive control that Python gives them over the scheduling of lower-level operations.
NumPy has had no Fortran code for quite a long time now. SciPy still does, and it is being rewritten. What you're describing is the ufunc machinery underneath, which is all C. NumPy also has SIMD support (albeit limited to certain functions). BLAS implementations are also C/assembly; only LAPACK is still F77 (and that is too much code to be rewritten).
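(If you want to see what your own NumPy build links against, np.show_config() prints the detected BLAS/LAPACK backend and related build details; the exact output varies by version and installation.)

    import numpy as np

    # Prints build information, including which BLAS/LAPACK implementation
    # (e.g. OpenBLAS or MKL) this particular NumPy build was linked against.
    np.show_config()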
This does not mean Fortran is bad (obligatory disclaimer for Fortran fans).
Also, Fortran has progressed immensely since F77 days. (I actually wrote some F77 code back in the day, when it was already ancient.) C has also progressed quite a bit since K&R days, but, to my mind, not nearly as much.
Right, the problem for SciPy developers, I believe, is that not enough of them know Fortran, whereas C knowledge is necessary to hack on CPython native extensions.
I wouldn't say Python is good _at_ vectorization so much as good _with_ it. It's also good with AI, web systems, etc., as a result of its "play well with others" philosophy, which pragmatically accepted from the start that no one language could be best for all purposes.
One of the best improvements to Python's memory model was extending it to allow sensible access to blobs of memory containing regular structures such as vectors and matrices, making it relatively easy to interface with and control already-available, well-tried, highly-optimised algorithms.
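(That improvement is presumably the buffer protocol and memoryview. A minimal sketch of the idea, assuming array.array as the owner of the raw memory and NumPy as the consumer; no data is copied.)

    import array
    import numpy as np

    # A plain array.array owns a contiguous C buffer of doubles.
    buf = array.array('d', [1.0, 2.0, 3.0, 4.0])

    # memoryview exposes that blob through the buffer protocol:
    # format, itemsize and total size are visible without copying anything.
    view = memoryview(buf)
    print(view.format, view.itemsize, view.nbytes)   # d 8 32

    # NumPy wraps the same memory as a view and can hand it to its
    # optimized routines.
    arr = np.frombuffer(buf, dtype=np.float64)
    buf[0] = 10.0
    print(arr[0])   # 10.0 -- the array sees the change, same memory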