The context is the explicit tagging in this case. You don't need to understand language to detect English-as-a-second-language speakers. (Indeed, Markov chains will happily solve this problem for you.)
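For anyone who wants to see what the Markov chain approach could look like, here is a minimal sketch, not a real detector. It assumes you have one pile of text per class, trains a first-order word-level chain with add-one smoothing on each, and picks whichever chain gives the input the higher log-likelihood. The class name `WordMarkov`, the `classify` helper, and the two tiny corpora are all made up for illustration; a real attempt would need a large labeled corpus.

```python
import math
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical sentence-boundary markers


def tokenize(text):
    """Lowercase, split on whitespace, and add boundary markers."""
    return [START] + text.lower().split() + [END]


class WordMarkov:
    """First-order word-level Markov chain with add-one smoothing."""

    def __init__(self, texts):
        self.counts = defaultdict(Counter)  # counts[prev][next]
        self.vocab = set()
        for text in texts:
            tokens = tokenize(text)
            self.vocab.update(tokens)
            for prev, nxt in zip(tokens, tokens[1:]):
                self.counts[prev][nxt] += 1

    def log_likelihood(self, text, vocab_size):
        """Sum of smoothed log transition probabilities for the text."""
        tokens = tokenize(text)
        total = 0.0
        for prev, nxt in zip(tokens, tokens[1:]):
            row = self.counts.get(prev, Counter())
            total += math.log((row[nxt] + 1) / (sum(row.values()) + vocab_size))
        return total


def classify(text, models):
    """Return the label whose chain assigns the text the highest likelihood."""
    vocab = set(tokenize(text))
    for model in models.values():
        vocab |= model.vocab
    return max(models, key=lambda label: models[label].log_likelihood(text, len(vocab)))


# Toy training data, invented for this example only.
models = {
    "native": WordMarkov([
        "i agree with you",
        "he went to school yesterday",
        "she has lived here for years",
    ]),
    "esl": WordMarkov([
        "i am agree with you",
        "he go to school yesterday",
        "she is living here since years",
    ]),
}

print(classify("i am agree with him", models))  # esl
print(classify("i agree with him", models))     # native
```

Nothing here knows what any word means. It only counts which word tends to follow which, and that is the point.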
> they automatically model relations
No, they do not model anything at all. If you follow the tech bubble turtles all the way down, you find a maximum likelihood logistic approximation.
I know, I know - then you'll do a sleight of hand and claim that all intelligence and modeling is also just maximum likelihood, even though it's patently and obviously untrue.
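To make "maximum likelihood logistic approximation" concrete, here is a rough sketch of the objective at the bottom of the stack, assuming the usual setup where the network emits one logit per vocabulary token and training minimizes cross-entropy against the observed next token. Minimizing that loss is the same thing as maximizing the likelihood of the training text under a softmax (multinomial logistic) output. The function names and the toy logits are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over tokens."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(logits, target):
    """Cross-entropy loss for a single observed next token."""
    return -math.log(softmax(logits)[target])


# Toy example: a three-token vocabulary where token 0 was actually observed.
before = negative_log_likelihood([0.0, 0.0, 0.0], target=0)
after = negative_log_likelihood([2.0, 0.0, 0.0], target=0)
print(before, after)  # the loss drops as the observed token's logit rises
```

Training pushes the logit of whatever token actually came next upward, over and over, across the whole corpus.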
LLMs output the statistically most average sequence of tokens (there's no intelligence there, "artificial" or otherwise), so yeah, that's by design.