
Given the way LLMs are billed now, if you can densely pack the context with relevant information, you come out ahead commercially. I don't see this changing, given how LLM inference works.

Really? Because to my understanding the compute needed to generate each token grows linearly with context length, and doesn't OpenAI's billing already reflect that by separating prompt and output tokens?
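
For a rough sense of why that split matters, here is a minimal sketch of the two-rate billing model being discussed. The per-token prices are made up for illustration, not actual OpenAI rates:

    # Toy cost model: prompt (input) and output tokens are billed at
    # different rates. Prices below are hypothetical placeholders.
    PROMPT_PRICE_PER_1K = 0.003   # $ per 1,000 prompt tokens (assumed)
    OUTPUT_PRICE_PER_1K = 0.006   # $ per 1,000 output tokens (assumed)

    def request_cost(prompt_tokens: int, output_tokens: int) -> float:
        """Cost of one API call under a two-rate billing model."""
        return (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K \
             + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

    # Packing the same facts into 2,000 prompt tokens instead of 8,000
    # shrinks the prompt side of the bill 4x for the same 500-token answer.
    print(request_cost(8000, 500))  # 0.027
    print(request_cost(2000, 500))  # 0.009

Under those assumed numbers, a densely packed prompt is cheaper per call, which is the commercial advantage the parent comment points to; the question above is whether the separate prompt rate already prices in the extra compute a long context costs the provider.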


