
The Nano only has 4GB VRAM and DS-R1 has 671B FP8 parameters (at one byte per parameter, that is roughly 671GB of weights).

You need something with about 800GB to run the full model with context. You'd still need 400GB to even run a half-sized Q4 quant of R1, so there is no reasonable way that it would work.
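For anyone who wants to redo the arithmetic, here is a rough back-of-the-envelope sketch. The function names and the 1.2x overhead factor for context/KV cache are my own assumptions, picked so the totals land near the figures above; real overhead depends on context length and the runtime you use.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = GB
    return params_billion * bits_per_weight / 8


def total_gb(params_billion: float, bits_per_weight: float,
             overhead: float = 1.2) -> float:
    """Weights plus an ASSUMED multiplier for context and runtime buffers."""
    return weight_gb(params_billion, bits_per_weight) * overhead


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """True if the estimated total fits in the given memory."""
    return total_gb(params_billion, bits_per_weight, overhead) <= memory_gb


if __name__ == "__main__":
    for name, params, bits in [("R1 FP8", 671, 8), ("R1 Q4", 671, 4)]:
        print(f"{name}: weights {weight_gb(params, bits):.1f} GB, "
              f"with overhead {total_gb(params, bits):.0f} GB, "
              f"fits in 4 GB: {fits(params, bits, 4)}")
```

With those assumptions, FP8 comes out to 671 GB of weights and about 805 GB total, and Q4 to 335.5 GB of weights and about 403 GB total, so neither is anywhere near 4GB.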




Just curious, what do you think is the cheapest card that could run this model, or something like Llama 3.3-70B?

Are only NVIDIA cards compatible, or could AMD ones also work?


OK, I understand the flagship model is huge; it seems to be far from local use.

Has anyone run a smaller/distilled version and gotten good performance?



