
> certain types of non-native english always have a feel to them which reflects the mismatch between two languages

This. My partner always speaks Frenglish (French English) after talking to her parents. You have to know a little French to understand her sentences. They’re all English words, but the phraseology is all French.

I do the same with Slovenian. The words are all English, but the shape is Slovenian. It adds a lot of soul to your words.

It can also be topic dependent. When I describe memories from home in English, the language sounds more Slovenian. Likewise when I talk about American stuff to my parents, my Slovenian sounds more English.

ChatGPT would lose all that color.

Read Man In The High Castle to see this for yourself. Whole book is English but you can tell the different nationalities of each character because the shape of their English changes. Philip K Dick used this masterfully.




> Whole book is English

Amusingly, I think this phrase illustrates your point. To the best of my knowledge, a native speaker (which I'm not) would always say "The whole book is (in?) English". Leaving off articles seems to be very common for Slavic people (since I believe you don't really have them in your languages).


> leaving off articles seems to be very common for Slavic people

Whenever I come across text that has a lot of missing articles, the voice inside my head automatically changes to a Russian accent; and in the instances where I've bothered to find out the author, it was always someone from Russia or some other ex-USSR country, so it seems I've already ingrained this characteristic at a subconscious level.


Poles, Czechs etc. also do this and IMHO, their accent sounds quite different from the Russian one.


I think this is more about formality and modern usage. I'm nearly 50 and am British. I sometimes write in this abbreviated form, omitting things like articles when they are unnecessary. Especially in text messages, social media posts, etc.


I used to work in academia with a Chilean guy who added extra articles where they weren’t needed and a Slovakian guy who didn’t put any in at all. I had fun editing the papers we wrote!


Spanish has definite and indefinite articles like English, so at least the concept is not unknown. However, even then, the correct usage is sometimes really arbitrary and varies across languages, e.g. why is it typically "mankind" and not "the mankind" (by contrast, in German it's "die Menschheit", with an article)?


It also helps refute the point because you could certainly ask an LLM to speak as though they’re a character from the book.

And if what it does now is unimpressive, it might be a good thing to use to monitor the rapid progress of LLMs.


Just to corroborate as a native English speaker, yes, in my experience the "the" would only be left off in quite informal registers or in haste.


There is sure to be lots of training data from people with French as a first language and English as a second language that can be pulled up with some prompting.



