
After turning on compression I was able to fit the whole thing in GPU memory and then it became much faster. Not ChatGPT speeds or anything, but under a minute for a response in their chatbot demo. A few seconds in some cases.

