
Curious how big your dataset was if you used $1000 of GPU credits on DistilBERT. I've run BERT on CPU on moderate cloud instances without trouble for the datasets I've worked with, though admittedly they aren't huge.
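
For context, here's a minimal sketch of what CPU-only DistilBERT inference looks like with Hugging Face transformers. The checkpoint name and classification task are assumptions for illustration, not details from the original post:

    # Minimal CPU-only DistilBERT inference sketch (hypothetical setup,
    # not the original poster's pipeline).
    from transformers import pipeline

    # device=-1 pins the pipeline to CPU; no GPU credits required.
    classifier = pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",
        device=-1,
    )

    texts = [
        "This ran fine on a moderate cloud instance.",
        "No GPU needed for a dataset of this size.",
    ]
    print(classifier(texts, batch_size=32))

For moderate dataset sizes, batching like this on a multi-core CPU instance is often fast enough that GPU spend is hard to justify.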


If I'm reading correctly, they used $1000 running a Llama model, not DistilBERT.


You read it correctly. I obviously didn't explain myself well.



