Given the way LLMs are billed now, if you can densely pack the context with relevant information, you come out ahead commercially. I don't see this changing given how LLM inference works.
Really? Because to my understanding the compute needed to generate a token grows roughly linearly with the context length, and doesn't OpenAI's billing reflect that by separating prompt and output tokens?
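To make the billing point concrete, here's a rough sketch of how a split input/output price plays out. The rates and token counts below are made up for illustration, not actual OpenAI pricing:

    # Rough cost comparison under separate per-token rates for prompt
    # (input) and output tokens. Prices are hypothetical, not real rates.

    PRICE_PER_INPUT_TOKEN = 0.000003   # e.g. $3 per 1M input tokens (made up)
    PRICE_PER_OUTPUT_TOKEN = 0.000015  # e.g. $15 per 1M output tokens (made up)

    def request_cost(input_tokens: int, output_tokens: int) -> float:
        """Total cost of one request when input and output are priced separately."""
        return (input_tokens * PRICE_PER_INPUT_TOKEN
                + output_tokens * PRICE_PER_OUTPUT_TOKEN)

    # A densely packed prompt with a short answer: cost is dominated by input tokens.
    print(request_cost(input_tokens=50_000, output_tokens=500))   # 0.1575

    # A much smaller prompt with a longer answer: cheaper overall despite the
    # higher per-token output rate.
    print(request_cost(input_tokens=5_000, output_tokens=5_000))  # 0.09

Under that kind of scheme, a longer prompt does cost more, which is what the per-input-token price is capturing.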