
Hey I am the first author of one of the "abstractions", so I guess my words would more or less reflect my personal daily experience dealing with those lovely kernels. Well, I don't have 100 engineers working for me, unfortunately :-(

Let's instead constructively talk about techniques in concrete items. If you look at OpenAI's Triton (which also has a small team of < 5 core contributors), what is its abstraction, and what is its key to high performance? It's a tile-based programming model, where a tile can be conveniently lowered to vector instructions and coalesced memory accesses, and transformed into permuted layouts. Its `dot` on tiles can be directly lowered to Tensor Core-specific instructions. With those in the design, and without a huge team painfully maintaining the system, critical kernels like FlashAttention can be developed quickly in, say, 30 lines of code.
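To make the tile idea concrete, here is a minimal sketch in plain Python. It is not Triton's API: `TILE`, `load_tile`, `dot`, `add_tiles`, and `tiled_matmul` are hypothetical names chosen for illustration, and nothing here runs on a GPU. It only shows the shape of the programming model, where the program is written over whole tiles and the per-tile `dot` is the single place a compiler could swap in a hardware matrix instruction.

```python
# Hypothetical illustration of a tile-based matmul; plain Python, not Triton.
TILE = 2  # tile edge length; assumed to divide every matrix dimension


def load_tile(mat, row0, col0, rows=TILE, cols=TILE):
    """Copy the rows x cols tile whose top-left corner is (row0, col0)."""
    return [mat[r][col0:col0 + cols] for r in range(row0, row0 + rows)]


def dot(a_tile, b_tile):
    """Multiply two tiles. In a tile compiler, this is the op that would be
    lowered to a matrix-multiply instruction instead of scalar loops."""
    inner = len(b_tile)
    cols = len(b_tile[0])
    return [
        [sum(a_row[k] * b_tile[k][j] for k in range(inner)) for j in range(cols)]
        for a_row in a_tile
    ]


def add_tiles(x, y):
    """Elementwise sum of two tiles of the same shape."""
    return [[p + q for p, q in zip(x_row, y_row)] for x_row, y_row in zip(x, y)]


def tiled_matmul(a, b):
    """Compute a @ b one output tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    assert m % TILE == 0 and k % TILE == 0 and n % TILE == 0
    c = [[0] * n for _ in range(m)]
    for i in range(0, m, TILE):          # each (i, j) pair is one output tile,
        for j in range(0, n, TILE):      # i.e. one independent unit of work
            acc = [[0] * TILE for _ in range(TILE)]
            for p in range(0, k, TILE):  # march along the reduction dimension
                acc = add_tiles(acc, dot(load_tile(a, i, p), load_tile(b, p, j)))
            for r in range(TILE):        # store the finished tile back into c
                c[i + r][j:j + TILE] = acc[r]
    return c
```

The two outer loops are the part a tile-based system would distribute across the machine, since each output tile is computed independently of the others.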




>Hey I am the first author of one of the "abstractions"

I know who you are and you should probably be out in the open with the fact that you have a conflict of interest in working at octo, a company that sells a very specific type of ML compiler.

>Let's instead constructively talk about techniques in concrete items. If you look at OpenAI's Triton

Pretty ironic that you would call out Triton as being the right abstraction, because while it is true that Philippe did a very good thing by moving things from warp level to block level, absolutely no one (myself included) thinks that Triton is an abstraction.


I’m using my real name, so it’s not hard to know who I am.

Unfortunately, I don’t know much about you, and actually I don’t really think there is a conflict of interest if you work at Modular, because Modular is also developing compiler abstractions, which is something I like and agree with. Let’s discuss techniques, and it doesn’t have to be that heated :-)

To clarify, my point is that matmuls can be solved with proper compiler abstractions, and it’s not that hard. If you are working on a compiler, I believe you would more or less agree with that point, wouldn’t you?

Liking Triton or not is a personal preference. I use it as an example only because it’s gaining a lot of momentum at the moment; I’m not saying it’s a perfect abstraction. If you personally don’t like it, I could also discuss exo-compilation or Tensor Comprehensions, but let’s always focus on concrete technical items :-)



