But the majority of AI R&D may end up in China, with a high barrier to participation for outsiders, leading to a widening gap. Whether this is so is not obvious.
Not AI proper, but the need for additional AI hardware down the line. Especially the super-expensive, high-margin, huge AI hardware that DeepSeek seems not to require.
Similarly, microcomputers led to an explosion of the computer market, but definitely limited the market for mainframe behemoths.
Such consequences might be undesired, but hardly unanticipated or surprising. Not to the smart folks who came up with LLMs.
(A lot of what we have around us now was anticipated by sci-fi, but earlier sci-fi largely failed to predict the internet, so we have fewer predictions about this kind of thing. Lem's "Futurological Congress" could be seen as a weird anticipation of a deeply faked world.)
It's like saying that a diesel engine is 6x more efficient than a steam engine, so the guys who spent time working on steam engines just wasted their time and money.
The thing is that the steam engine guys researched thermodynamics and developed the mechanics and tooling which allowed the diesel engine to be invented and built.
Also, for every breakthrough like DeepSeek which is highly publicized, there are dozens of fizzled attempts to explore new ideas which mostly go unnoticed. Are these wasted resources, too?
When I tested this before (some years ago), adding "restrict" to the C pointers resulted in the same behavior, so the aliasing default seems to be the key issue.
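For reference, a minimal sketch of the difference in question (the function names are made up, and whether either loop actually gets vectorized depends on the compiler and flags):

    /* Without restrict the compiler must assume dst and src may overlap,
       which can block auto-vectorization. */
    void scale_add(float *dst, const float *src, float k, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] += k * src[i];
    }

    /* With restrict the programmer promises the pointers don't alias,
       so the loop is easier to vectorize. */
    void scale_add_restrict(float *restrict dst, const float *restrict src,
                            float k, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] += k * src[i];
    }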
I believe it's the ecosystem. For example, Python is a very high-level language; I'm not sure it even has a memory model. But it has libraries like NumPy which support all these vectorized exponents when processing long arrays of numbers.
Everything that's heavily vectorized in the Python ecosystem, including NumPy, achieves it using optimized backend code written in other languages, Fortran in particular. Python is only a thin veneer over those backends. In fact, you're constantly reminded to offload control flow as much as possible to those backends for the sake of performance, instead of doing things like looping in Python. If that's enough to consider Python good at vectorization, I can just link high-performance Fortran libraries with C, handle the non-vector control flow from there, and call it a day. I guarantee you that this arrangement will be far more performant than anything Python can ever achieve. I have to strongly agree with the other commenter's observation that the memory model is the key to vector performance.
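A rough sketch of that arrangement, assuming a CBLAS implementation such as OpenBLAS is installed (link with something like -lopenblas):

    #include <stdio.h>
    #include <cblas.h>

    int main(void)
    {
        double x[4] = {1, 2, 3, 4};
        double y[4] = {10, 20, 30, 40};

        /* y := 2*x + y, computed by the tuned BLAS kernel,
           while the surrounding control flow stays in C */
        cblas_daxpy(4, 2.0, x, 1, y, 1);

        for (int i = 0; i < 4; i++)
            printf("%g ", y[i]);   /* prints: 12 24 36 48 */
        printf("\n");
        return 0;
    }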
And of course Python has a memory model. While that model is not as well understood as C's, it is the key to Python's success and popularity as a generic programming language and as a numeric/scientific programming language. Python's memory model, unlike C's or Fortran's, isn't designed for high performance. It's designed for rich abstractions, high ergonomics, and high interoperability with those performant languages. For most people, the processing time lost executing Python code is an acceptable tradeoff for the highly expressive control that Python gives them over the scheduling of lower-level operations.
NumPy has had no Fortran code for quite a long time now. SciPy has some, and it is being rewritten. What you mention is the ufunc machinery underneath, which is all C. NumPy also has SIMD support (albeit limited to certain functions). BLAS is also C/assembly; only LAPACK is F77 (which is too much code to be rewritten).
This does not mean Fortran is bad (obligatory disclaimer for Fortran fans).
There's no language-level difference between int[] and int* in C. Indexing is just pointer arithmetic. No support for (non-jagged) multidimensional arrays either.
A proper array would at least know its length. That does not require much runtime support, and most C code requires libc anyway for basic stuff like malloc().
This isn't true. int[] decays into an int* but is a different type.
An array member of a struct will have its data allocated contiguously with other struct members (subject to compiler padding). An int* would point outside the struct. This is possible even with variable length arrays. Specifically, you can declare the final member of a struct as an int[] which makes it a "flexible array member". Then you have to malloc the struct to be the sizeof the struct + whatever size you want the array to be.
This is rather pedantic, but the memory model for arrays is important here. If you're iterating over multiple struct members that are arrays, the fact that the arrays are stored contiguously in memory is going to matter for SIMD.
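A minimal sketch of the flexible-array-member pattern described above (struct and function names are made up):

    #include <stdlib.h>

    struct samples {
        size_t len;
        int data[];   /* flexible array member: must be the last member */
    };

    struct samples *samples_new(size_t n)
    {
        /* one contiguous block: the struct header plus n ints */
        struct samples *s = malloc(sizeof *s + n * sizeof s->data[0]);
        if (s)
            s->len = n;
        return s;
    }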
If using an int* over an int[] changes the memory layout of a struct, that necessarily implies a difference in the ABI.
As an example, C++'s std::array can be implemented as a struct containing a fixed-size C-style array. This can be passed by value. This means that returning a std::array can be more performant than returning a std::vector (which might be implemented as an int* that is reallocated when you add too many elements), because a std::array is returned on the stack while a std::vector's data lives on the heap.
I was bitten by this once when returning a std::unordered_map, because I was doing a heap allocation in a performance-critical section.
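Roughly the same distinction can be shown in plain C, for what it's worth (names made up): a fixed-size array wrapped in a struct can be returned by value, while a dynamically sized result forces a heap allocation.

    #include <stdlib.h>

    struct vec4 { int v[4]; };          /* analogous to std::array<int, 4> */

    struct vec4 make_vec4(void)         /* returned by value, no allocation */
    {
        struct vec4 a = { {1, 2, 3, 4} };
        return a;
    }

    int *make_dynamic(size_t n)         /* analogous to a vector: heap-backed */
    {
        return calloc(n, sizeof(int));  /* caller must free() */
    }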
> There's no language-level difference between int[] and int* in C
sizeof(int[n]) vs sizeof(int*). Then there's 'int getint(int *i){ return *i; }', which won't let me pass '&arr' for an 'int arr[n]', but I can do 'int *p = arr;' and pass that. It's not so much a hack as a very primitive implementation on top of pointers (alright, maybe it is a bit hacky.)
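Spelled out (hypothetical snippet):

    #include <stdio.h>

    static int getint(int *p) { return *p; }

    int main(void)
    {
        int arr[8] = {42};
        int *ptr = arr;                  /* the array decays to a pointer */

        printf("%zu\n", sizeof arr);     /* 8 * sizeof(int): the whole array */
        printf("%zu\n", sizeof ptr);     /* just the size of a pointer */

        /* getint(&arr) would not compile: &arr has type int (*)[8], not int*. */
        printf("%d\n", getint(arr));     /* fine: arr decays to int* */
        printf("%d\n", getint(ptr));     /* also fine */
        return 0;
    }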
> A proper array would at least know its length. That does not require much runtime support, and most C code requires libc anyway for basic stuff like malloc().
Something like a fat pointer, 'typedef struct { long len; void *data; } MyArray;', then make a libMyArray to operate on MyArray objects. But now you have a libMyArray dependency. You also lose type information unless you add fields to MyArray with an enum, or have a MyArray for each type, which gets messy. Then you can't do shit like just use it in a conditional, e.g. 'while (arr[n++])', unless you change how C works, which makes it notC (uh, maybe pick another name.)
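A minimal sketch of that fat-pointer idea (the helper names are made up here):

    #include <stdlib.h>

    typedef struct {
        long len;
        void *data;                      /* untyped storage: type info is lost */
    } MyArray;

    MyArray MyArray_new(long len, size_t elem_size)
    {
        MyArray a = { len, calloc((size_t)len, elem_size) };
        return a;
    }

    /* element access goes through a helper instead of plain [] */
    int MyArray_get_int(const MyArray *a, long i)
    {
        return ((const int *)a->data)[i];
    }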
I feel that C is fine as it's an old, machine-oriented language. Leave it alone. I would look to reach for another tool when the primitive design of C makes life difficult.
This is exactly what has changed [1]: R&D costs used to be an immediate tax deduction, but since 2022 they have become an expenditure that must be amortized over a five-year period.
That change was expected to be repealed before coming into force, but it was not repealed in time.
Hence the wave of layoffs in 2022, as companies urgently tried to improve their balance sheets, as investors and Wall Street requested, AFAICT.
I care relatively little because I cut ads using browser plugins and suchlike.
Allowing anything shady, let alone incriminating, into your email would be insane, whether it's a mailbox at Google or at Proton. Transactional emails from shady websites, like password resets, are best handled with services like Mailinator, which offer zero access protection and automatically destroy received email within a few minutes.
This is cool. This is also massive. I wonder how much e.g. the arcade machine housing costs just in the filament used, and how many hours it took to print.
There's a relatively well-travelled path of filling hollow prints with something to get strength back, which does mitigate that. What that something is will vary per application, but I've previously used premixed post-hole concrete where I've just wanted to add weight.
This is quite an assumption.