The 96GB (HBM2e) SKU is the PPU from T-Head Semiconductor (basically a subsidiary of Alibaba). Its specs are very similar to the H20's. Other chips they were using include the Huawei Ascend 910B (64GB) and maybe other domestically designed chips.
They've shared some interesting optimization techniques for bigger LLMs, but that's all; these aren't exactly low-powered devices in terms of power consumption. Still a good read.
Table 1 is the closest thing. It lists specs for six devices: 120-989 TFLOPS and 64-96 GB of RAM.
For comparison, an RTX 5090 is about 105 TFLOPS (FP32).
https://www.techpowerup.com/gpu-specs/geforce-rtx-5090.c4216
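To put those numbers side by side, here's a quick back-of-the-envelope sketch. It only uses the figures quoted above (120-989 TFLOPS for Table 1, about 105 TFLOPS for the RTX 5090) and assumes they are measured at comparable precision, which may well not be the case. The function name `relative_throughput` is made up for illustration.

```python
# Figures as quoted in this comment; the precision (FP32 vs FP16/BF16) behind
# each number is ASSUMED comparable here, which may not hold in practice.
RTX_5090_TFLOPS = 105.0
TABLE1_TFLOPS_RANGE = (120.0, 989.0)  # slowest and fastest of the six devices


def relative_throughput(device_tflops: float, baseline_tflops: float) -> float:
    """Return how many times faster a device is than the baseline (peak TFLOPS only)."""
    return device_tflops / baseline_tflops


low = relative_throughput(TABLE1_TFLOPS_RANGE[0], RTX_5090_TFLOPS)
high = relative_throughput(TABLE1_TFLOPS_RANGE[1], RTX_5090_TFLOPS)

print(f"Table 1 devices: {low:.2f}x to {high:.2f}x the RTX 5090's peak TFLOPS")
```

So on paper the slowest device in the table is only a little ahead of a 5090, while the fastest is roughly nine times ahead. Peak TFLOPS says nothing about memory capacity or bandwidth, so treat it as a rough comparison only.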