I am not sure what the current status of AMD ROCm is, but I guess it will take a while before it becomes a viable alternative to CUDA.
Here is the FFT example in ROCm: https://github.com/ROCm-Developer-Tools/HCC-Example-Applicat...
Compared to FFT in CUDA: https://github.com/drufat/cuda-examples/blob/master/cuda/fft...
That is 1163 lines of code for ROCm vs 216 for CUDA! CUDA is really straightforward; ROCm is still low-level and incomprehensible (unless you are a C++ expert who happens to love low-level details).