Marvell Cranks Up Cores and Clocks with “Triton” ThunderX3 (nextplatform.com)
65 points by rbanffy on March 18, 2020 | 28 comments


This announcement is lacking a lot of specifics, which suggests it's worse than N1 / Ampere and the rest of the competition.

TNP always annoys me with its wordiness and lack of tables and numbers. Its signal-to-noise ratio is incredibly low compared to Ars and AnandTech.


Here's the Anandtech version: https://www.anandtech.com/show/15621/marvell-announces-thund...

> If we use the TX2 figures we have at hand, this would mean the new chip would land slightly ahead of Neoverse-N1 systems such as the Graviton2, and match more aggressively clocked designs such as the Ampere Altra.


Without HBM or more memory channels, the top SKUs will be rather hard to feed, considering the claimed ~3-5x increase in instructions/s/socket while memory bandwidth only increases by ~20%.


Maybe they mitigated that by increasing cache sizes, but there's no information about that yet. I'd expect so, since they support 4 threads per core, which may drive up cache pressure (even though one thread can do useful work while another waits on a cache miss).


Has there been any mention of MSRPs for these new ARM chips? As a home server enthusiast, I'd love to have a serious ARM-driven system to play with...


The ThunderX2 model isn't in stock, but the Ampere model is.

A base config with the Ampere eMAG 8180 (32 cores, 2.8 GHz, 3 MB L3) is about $3,000:

https://store.avantek.co.uk/ampere-emag-64bit-arm-workstatio...


(That's not the new generation, which isn't released yet; there's no price info for it yet.)


Has the ThunderX family shipped in "mere mortal" hardware, or is it all supercomputers and custom FANG servers and the like? I seem to remember that when ThunderX first launched, there was some noise about "ARM servers" being a market that exists, and the company I worked for at the time was looking into using it for a new product but gave up on it for some reason.



I think I've used one of these Gigabyte rack servers:

https://www.gigabyte.com/us/ARM-Server

I'm not sure offhand what the pricing is, but my guess would be less than $10,000, depending on the configuration.


So much more than an equivalent x86 box.


There aren't many 384-thread x86 boxes around, certainly none below US$ 10K


384 threads at which SMT level again?

What's the clock speed, and what's the IPC compared to, say, a 64-core/128-thread Threadripper? You'll find it won't compare favourably for 99.9% of workloads.


240W, 96 quite beefy ARM cores, I'm not sure about SMT. All that in 1U. The density is becoming mad.


That sounds like a shrink of a single ThunderX2 socket. I'd say expect two of these in a 1U system, or maybe rather two separate 2S boards in a 2U case with shared power/cooling.


It’s SMT4, so up to 384 threads per socket.


We'll need to update htop to use the Unicode 2x2 mosaics so we can cram 2 vCores per line and use foreground and background colors cleverly...
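A toy sketch of the idea (not htop's actual code): the upper-half block character '▀' takes its foreground color from one core and its background color from another, so a single terminal cell can display two per-core load levels. The three-step green/yellow/red color mapping here is a hypothetical choice.

```python
# Pack two vCore load readings into one terminal cell using U+2580 '▀':
# foreground color = top core's load, background color = bottom core's load.
def cell(load_top, load_bottom):
    """Map two loads in [0, 1] to a single ANSI 256-color half-block cell."""
    def shade(load):
        # Hypothetical 3-step ramp: green (46), yellow (226), red (196).
        return 46 if load < 0.5 else (226 if load < 0.8 else 196)
    return (f"\x1b[38;5;{shade(load_top)}m"    # foreground: top core
            f"\x1b[48;5;{shade(load_bottom)}m" # background: bottom core
            "\u2580\x1b[0m")                   # half block, then reset

# Two cells show four cores' loads on one line.
row = "".join(cell(t, b) for t, b in [(0.2, 0.9), (0.6, 0.1)])
print(row)
```

On a 256-color terminal this renders 384 vCores in 192 columns, one row.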


Geeeeee! They said in the article that the previous-gen chip was, but didn't explicitly confirm it for the new one, so I didn't dare hope. Thanks for the confirmation!


> The Triton chip will have eight memory controllers supporting memory running at 3.2 GHz, which is the same number of controllers in the Vulcan chip, which maxxed out at 2.67 GHz memory speeds. That’s a 20 percent increase in memory bandwidth, and the question is how that will balance out against the high core counts in some of the Triton SKUs.

With 96 cores, it seems like it would be tricky to keep them all busy given such a modest memory-throughput increase.
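A back-of-envelope sketch of the feeding problem, assuming the standard 64-bit (8-byte) DDR4 channel width and the figures quoted above (8 controllers at 3200 MT/s, 96 cores):

```python
# Rough per-socket and per-core memory bandwidth for the assumed top SKU.
channels = 8                 # memory controllers (from the article)
transfer_rate_mts = 3200     # DDR4-3200: mega-transfers per second
bytes_per_transfer = 8       # 64-bit channel width (standard DDR4, assumed)
cores = 96

socket_bw_gbs = channels * transfer_rate_mts * bytes_per_transfer / 1000
per_core_gbs = socket_bw_gbs / cores

print(f"socket: {socket_bw_gbs:.1f} GB/s, per core: {per_core_gbs:.2f} GB/s")
```

Roughly 205 GB/s per socket works out to only about 2 GB/s per core, before the 4 SMT threads per core even enter the picture.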


For certain workloads, having SMT4 will keep the cores busy while they're waiting on main memory.


Do they really mean memory running at 3.2 GHz? What memory would be running that fast? Isn't this HBM2E and its 3.2 Gb/s per-pin rate?


DDR4 easily runs at 3200 MT/s (a 1600 MHz clock, double data rate), though I don't know if that's common yet for ECC RDIMMs.
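The "3.2 GHz" in the article is really the transfer rate, not the clock. A quick sketch of the DDR4-3200 numbers (assuming the standard 64-bit DIMM data bus):

```python
# DDR4-3200: the I/O bus is clocked at 1600 MHz, but data moves on both
# clock edges ("double data rate"), giving 3200 mega-transfers per second.
io_clock_mhz = 1600
mts = io_clock_mhz * 2            # transfers per second, in millions
per_dimm_gbs = mts * 8 / 1000     # 64-bit (8-byte) data bus per channel

print(f"{mts} MT/s, {per_dimm_gbs} GB/s per channel")
```

So each of the eight channels contributes about 25.6 GB/s of peak bandwidth.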


Doh! yes, thanks, that makes a LOT more sense!


That's a comparable configuration to an AMD EPYC "Rome" with 48c/96t and 8 memory controllers.


384 threads (vCPUs) in a single socket, or 768 vCPUs per 1U in a dual-socket config. That's roughly $4K per month in revenue for a cloud vendor.

Unfortunately, DRAM, NAND, and bandwidth unit costs haven't dropped at all compared to vCPU unit cost. Along with the baseline price of rent and electricity, that means cloud vendors' unit costs aren't that much better.

I expect this would only translate to a 10 to 20% price drop.
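A sketch of where a ~$4K/month figure could come from. The per-vCPU-hour rate below is an assumption for illustration (roughly in line with entry-level ARM instance pricing at the time), not a number from the thread:

```python
# Hypothetical revenue estimate for one dual-socket 1U box sold as vCPUs.
cores_per_socket = 96
smt = 4
sockets = 2
vcpus = cores_per_socket * smt * sockets   # 768 vCPUs per 1U

rate_per_vcpu_hour = 0.0072   # assumed on-demand $/vCPU-hour
hours_per_month = 730

revenue = vcpus * rate_per_vcpu_hour * hours_per_month
print(f"~${revenue:,.0f}/month")
```

That lands near $4K/month at full, continuous utilization, which real fleets never achieve, so the commenter's figure is an upper bound.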


If they continue what they were doing for TX2 [1], you might get dual sockets in 1U.

Meaning 192 cores per U.

[1]: https://www.gigabyte.com/ARM-Server/R181-T90-rev-100


The chip has a CCPI (Cache Coherent Processor Interface) with 24 lanes @ 25Gb/s for 2-socket NUMA interconnect.

This technology came from Cavium (which bought the XLS/XLR/XLP/Vulcan designs from Avago; they had passed from Raza to NetLogic to Broadcom, and very briefly to Avago, before Cavium itself was bought by Marvell). I don't recall it being in their MIPS64-based Octeon II or III designs; I thought the XLP had some minimal-glue NUMA support, but I can't find anything with a quick search.

Minimal info at Wikichip: https://en.wikichip.org/wiki/cavium/ccpi


Reminds me of this article from 2011 that blew my mind: 12,500 cores for Pixar's Cars 2.

https://www.cnet.com/news/new-technology-revs-up-pixars-cars...



