For the most part, LLMs choose the "most common" tokens, so regardless of whether the content was actually AI-generated, maybe you are getting tired of mediocrity.

And of course, that mediocrity has now become so cheap that it makes up the overwhelming majority of what gets published.
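To make the "most common token" point concrete, here's a toy sketch in plain numpy (the logits and candidate words are made up, not taken from any real model): greedy decoding always emits the modal word, while sampling at a non-zero temperature occasionally surfaces something rarer.

    # Toy illustration: greedy decoding vs. temperature sampling.
    # The logits and words below are invented for illustration only.
    import numpy as np

    logits = np.array([2.0, 1.5, 0.3, -1.0])       # scores for 4 candidate tokens
    tokens = ["however", "moreover", "notably", "alas"]

    def softmax(x, temperature=1.0):
        z = (x - x.max()) / temperature
        e = np.exp(z)
        return e / e.sum()

    # Greedy decoding: always picks the single most probable token ("however"),
    # which is why the output gravitates toward the statistical mode of the corpus.
    probs = softmax(logits)
    print(tokens[int(np.argmax(probs))])

    # Sampling at temperature > 0 occasionally picks the rarer words instead.
    rng = np.random.default_rng(0)
    sampled = rng.choice(len(tokens), p=softmax(logits, temperature=1.5))
    print(tokens[sampled])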

This is similar to how the average number of children per household is 2.5, yet no household actually has 2.5 children. Stringing together the most common tokens yields patterns that no individual writer actually uses in practice.

LLMs have a tendency to really like comparisons/contrasts between things, which is likely due to the nature of neural networks (e.g. "Paris - France + Italy" ≈ "Rome"). When these concepts are represented as embeddings, such relations can be computed very straightforwardly in vector space.
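That vector arithmetic is easy to reproduce with a toy example. The 3-dimensional vectors below are invented for illustration (real embeddings come from a trained model and have hundreds of dimensions), but the mechanism is the same: subtract, add, then take the nearest neighbour by cosine similarity.

    # Toy sketch of "Paris - France + Italy ≈ Rome" with made-up 3-d vectors.
    import numpy as np

    emb = {
        "Paris":  np.array([0.9, 0.1, 0.8]),
        "France": np.array([0.1, 0.1, 0.8]),
        "Italy":  np.array([0.1, 0.1, 0.2]),
        "Rome":   np.array([0.9, 0.1, 0.2]),
        "Berlin": np.array([0.9, 0.9, 0.5]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # capital(Paris) - country(France) + country(Italy) lands near capital(Rome)
    query = emb["Paris"] - emb["France"] + emb["Italy"]
    best = max((w for w in emb if w not in ("Paris", "France", "Italy")),
               key=lambda w: cosine(query, emb[w]))
    print(best)   # -> Rome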

So no, it's not all just human language; LLMs really do write content in a specific style.

One recent study also showed something interesting: AIs aren't very good at recognizing AI-generated content either, which is likely related; they're unaware of these patterns.

https://www.sciencedirect.com/science/article/pii/S147738802...



