> If we use the TX2 figures we have at hand, this would mean the new chip would land slightly ahead of Neoverse-N1 systems such as the Graviton2, and match more aggressively clocked designs such as the Ampere Altra.
Without HBM or more memory channels, the top SKUs will be rather hard to feed considering the (claimed) at least ~3-5x increase in instructions/s/socket while only increasing memory bandwidth by ~20%.
Maybe they mitigated that by increasing cache sizes, but there is no information available about that yet. I would expect larger caches, since they support 4 threads per core and that may drive up cache usage (even though one thread may be able to do useful work while another is waiting on a cache miss).
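To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the claimed 3-5x throughput gain and the ~20% bandwidth gain both hold, and it ignores caches entirely; `relative_bandwidth_per_instruction` is just an illustrative helper name.

```python
def relative_bandwidth_per_instruction(compute_gain, bandwidth_gain=1.2):
    """Memory bandwidth available per instruction, relative to the old chip.

    compute_gain: claimed increase in instructions/s/socket (e.g. 3 or 5).
    bandwidth_gain: increase in memory bandwidth (1.2 means +20%).
    """
    return bandwidth_gain / compute_gain


if __name__ == "__main__":
    for gain in (3, 5):
        share = relative_bandwidth_per_instruction(gain)
        print(f"{gain}x compute: {share:.0%} of the old bandwidth per instruction")
```

Under those assumptions each instruction gets roughly 24-40% of the DRAM bandwidth it had before, which is the gap larger caches or SMT latency hiding would have to cover.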
Has the ThunderX family shipped in "mere mortal" hardware, or is it all supercomputers and custom FANG servers and the like? I seem to remember that when ThunderX first launched, there was some noise about "ARM servers" being a market that exists, and the company I worked for at the time was looking into using it for a new product but gave up on it for some reason.
What's the clock speed, and what's the IPC compared to, say, a 64-core, 128-thread Threadripper? You'll find it won't compare favourably for 99.9% of workloads.
That sounds like a shrink of a single ThunderX2 socket.
I'd say, expect two of these in a 1U system, or maybe rather 2 separate 2S boards in a 2U case with shared power/cooling.
Geeeeee! They said in the article that the previous gen chip was, but didn't explicitly confirm for the new one, so I didn't dare to hope. Thanks for the confirmation!
> The Triton chip will have eight memory controllers supporting memory running at 3.2 GHz, which is the same number of controllers in the Vulcan chip, which maxxed out at 2.67 GHz memory speeds. That’s a 20 percent increase in memory bandwidth, and the question is how that will balance out against the high core counts in some of the Triton SKUs.
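The 20 percent figure in the quote can be checked with a quick sketch. It assumes the "GHz" numbers are really transfer rates (3.2 and 2.67 GT/s) and that each controller drives a standard 64-bit DDR4 channel, i.e. 8 bytes per transfer; neither detail is spelled out in the article, and `peak_bandwidth_gbs` is an illustrative name.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit DDR4 channel per controller


def peak_bandwidth_gbs(channels, gigatransfers_per_s):
    """Theoretical peak memory bandwidth in GB/s for one socket."""
    return channels * gigatransfers_per_s * BYTES_PER_TRANSFER


triton = peak_bandwidth_gbs(8, 3.2)   # figures from the quoted article
vulcan = peak_bandwidth_gbs(8, 2.67)  # figures from the quoted article

if __name__ == "__main__":
    print(f"Triton: {triton:.1f} GB/s, Vulcan: {vulcan:.1f} GB/s")
    print(f"Increase: {triton / vulcan - 1:.0%}")
```

Since the controller count is unchanged, the ratio reduces to 3.2 / 2.67, which is where the roughly 20 percent comes from.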
With 96 cores, it seems like it would be tricky to keep them busy given such a modest memory throughput increase.
That is 384 threads (vCPUs) in a single socket, or 768 vCPUs per 1U in a dual-socket system. That is ~$4K per month in revenue for a cloud vendor.
Unfortunately, DRAM, NAND and bandwidth unit costs haven't dropped nearly as much as vCPU unit cost. Add the baseline price of rent and electricity, and cloud vendors' overall unit costs aren't that much better.
I expect this would only come to a 10 to 20% price drop.
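For reference, the arithmetic behind those numbers as a small sketch. The 96 cores and 4 threads per core come from the thread; the ~$4K/month figure and the 10-20% drop are my own estimates above, and the per-vCPU price is only what those estimates imply, not a real price list.

```python
CORES_PER_SOCKET = 96
THREADS_PER_CORE = 4
SOCKETS_PER_1U = 2
MONTHLY_REVENUE_USD = 4000  # rough estimate from the comment, not vendor data

vcpus_per_socket = CORES_PER_SOCKET * THREADS_PER_CORE
vcpus_per_1u = vcpus_per_socket * SOCKETS_PER_1U
usd_per_vcpu_month = MONTHLY_REVENUE_USD / vcpus_per_1u


def price_after_drop(price, drop_fraction):
    """Price after a fractional price drop (0.1 means 10% cheaper)."""
    return price * (1 - drop_fraction)


if __name__ == "__main__":
    print(f"{vcpus_per_socket} vCPUs per socket, {vcpus_per_1u} per 1U")
    print(f"Implied price: ${usd_per_vcpu_month:.2f} per vCPU per month")
    for drop in (0.10, 0.20):
        new_price = price_after_drop(usd_per_vcpu_month, drop)
        print(f"After a {drop:.0%} drop: ${new_price:.2f}")
```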
The chip has a CCPI (Cache Coherent Processor Interface) with 24 lanes @ 25Gb/s for 2-socket NUMA interconnect.
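As a rough sketch of what that adds up to: the lane count and per-lane rate are the figures above, while treating the result as a raw signalling rate (no encoding or protocol overhead, and no statement about whether it is per direction) is my assumption.

```python
LANES = 24
GBIT_PER_LANE = 25  # Gb/s per lane, raw signalling rate

raw_gbit_per_s = LANES * GBIT_PER_LANE
raw_gbyte_per_s = raw_gbit_per_s / 8  # 8 bits per byte, overhead ignored

if __name__ == "__main__":
    print(f"Raw CCPI rate: {raw_gbit_per_s} Gb/s (~{raw_gbyte_per_s:.0f} GB/s)")
```

Usable bandwidth between the two sockets will be lower once line encoding and coherence traffic are accounted for.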
This technology came from Cavium, who bought the XLS/XLR/XLP/Vulcan designs (originally from Raza, then NetLogic, then Broadcom, and very briefly Avago) from Avago before being bought themselves by Marvell. I don't recall it being in their MIPS64-based Octeon II or III designs, and I thought the XLP had some minimal-glue NUMA support, but I can't find anything with a quick search.
TNP always annoys me by its wordiness and lack of tables and numbers. It's incredibly low signal to noise compared to Ars and Anandtech.