You misunderstand the difficulty in the software side.
Developing a competitive CUDA equivalent is a much harder task than you assume. It takes a lot of tooling and software-level optimization to make these low-level libraries fast, and the testing and benchmarking needed to make the right choices can keep many people busy full time. Implementing the functionality is not enough.
AMD has had APIs for a long time and has an open source deep learning stack, but it's not good enough. AMD also has a CUDA-to-HIP converter, but the results are not competitive and it misses features.
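To illustrate why a converter alone doesn't close the gap: the easy part of CUDA-to-HIP translation is mechanical API renaming, sketched below in Python (this is an illustration only, not AMD's actual hipify-perl tool, and the renaming table is just a few obvious runtime calls). The hard part is everything a renamer can't fix: missing library features and performance tuning.

```python
# Illustration only, not AMD's real hipify tool: the mechanical core of
# CUDA -> HIP conversion is renaming runtime API calls one-for-one.
RENAMES = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
}

def hipify_line(line: str) -> str:
    """Rewrite CUDA runtime calls in one source line to their HIP names."""
    for cuda_name, hip_name in RENAMES.items():
        line = line.replace(cuda_name, hip_name)
    return line

print(hipify_line("cudaMalloc(&d_a, n * sizeof(float));"))
# -> hipMalloc(&d_a, n * sizeof(float));
```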
Both AMD and Intel will probably get there eventually, and Nvidia is cashing in on the monopoly phase while it lasts.
I don't know that it has to be competitive in speed right away, as long as it works and is easy to install.
I'd be thrilled to have the option of doing deep learning on an AMD card, even if it ran at 1/3 the speed of a comparable (but probably somewhat more expensive) NVIDIA card. It would open up a lot of options even if it were still less economical in throughput per dollar.
If nothing else, how many machine learning researchers would like to prototype things on their MacBook Pros?
For prototyping (as in making sure you have your matrix shapes right), the CPU versions of TensorFlow and PyTorch are fine.
They are even OK for some kinds of training (e.g., if you are doing transfer learning with a fixed embedding/feature representation and have a pretty small parameter space to learn).
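That kind of shape-checking needs nothing but a CPU. A minimal PyTorch sketch (the toy model and sizes here are made up purely for illustration) might look like:

```python
import torch
import torch.nn as nn

# Toy model and dimensions chosen only to sanity-check tensor shapes on CPU.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)

batch = torch.randn(8, 64)  # batch of 8 examples, 64 features each
logits = model(batch)       # runs entirely on CPU, no GPU required
print(tuple(logits.shape))  # confirm the output shape is (8, 10)
```

If the shapes line up here, the same code can later be moved to a GPU instance for real training.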
But it's so easy to fire up a cloud instance at Paperspace or somewhere and push some code across.
That's fair enough, and I was being a little flippant about a very difficult problem. I guess I was more curious about whether anyone has any insight into their possible upside (i.e., whether they can scale production if demand grows, how big a share of Nvidia's revenue comes from scientific computing, etc.).