For the most part, LLMs choose the "most common" tokens, so regardless of whether the content was actually AI-generated, maybe you are getting tired of mediocrity.

And of course, that mediocrity has now become so cheap that it makes up the overwhelming majority of what gets published.
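To make the "most common token" point concrete, here's a toy sketch in plain numpy (the logits and candidate words are made up, not taken from any real model): greedy decoding always emits the modal word, while sampling at a non-zero temperature occasionally surfaces something rarer.

    # Toy illustration: greedy decoding vs. temperature sampling.
    # The logits and words below are invented for illustration only.
    import numpy as np

    logits = np.array([2.0, 1.5, 0.3, -1.0])       # scores for 4 candidate tokens
    tokens = ["however", "moreover", "notably", "alas"]

    def softmax(x, temperature=1.0):
        z = (x - x.max()) / temperature
        e = np.exp(z)
        return e / e.sum()

    # Greedy decoding: always picks the single most probable token ("however"),
    # which is why the output gravitates toward the statistical mode of the corpus.
    probs = softmax(logits)
    print(tokens[int(np.argmax(probs))])

    # Sampling at temperature > 0 occasionally picks the rarer words instead.
    rng = np.random.default_rng(0)
    sampled = rng.choice(len(tokens), p=softmax(logits, temperature=1.5))
    print(tokens[sampled])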

This is similar to how the average number of children per household is 2.5, yet no household actually has 2.5 children. Stringing together the most common tokens yields patterns that no individual writer actually uses in practice.

LLMs have a tendency to really like comparisons/contrasts between things, which is likely due to the nature of neural networks (e.g. "Paris - France + Italy" ≈ "Rome"). When these concepts are represented as embeddings, such relations can be computed very straightforwardly in vector space.
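That vector arithmetic is easy to reproduce with a toy example. The 3-dimensional vectors below are invented for illustration (real embeddings come from a trained model and have hundreds of dimensions), but the mechanism is the same: subtract, add, then take the nearest neighbour by cosine similarity.

    # Toy sketch of "Paris - France + Italy ≈ Rome" with made-up 3-d vectors.
    import numpy as np

    emb = {
        "Paris":  np.array([0.9, 0.1, 0.8]),
        "France": np.array([0.1, 0.1, 0.8]),
        "Italy":  np.array([0.1, 0.1, 0.2]),
        "Rome":   np.array([0.9, 0.1, 0.2]),
        "Berlin": np.array([0.9, 0.9, 0.5]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # capital(Paris) - country(France) + country(Italy) lands near capital(Rome)
    query = emb["Paris"] - emb["France"] + emb["Italy"]
    best = max((w for w in emb if w not in ("Paris", "France", "Italy")),
               key=lambda w: cosine(query, emb[w]))
    print(best)   # -> Rome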

So no, it's not all just human language; LLMs really do write content in a specific style.

One recent study also showed something interesting: AIs aren't very good at recognizing AI-generated content either, which is likely related; they're unaware of these patterns.

https://www.sciencedirect.com/science/article/pii/S147738802...



