Hacker News new | past | comments | ask | show | jobs | submit login

The example story is interesting.

I have made my own implementation from scratch with my own multi-channel tokeniser, each channel gets its own embedding table 32768, 256,256, 64, and 4. Which are summed along with the position encoding.

Yet with all of those differences, my stories have Lily as a protagonist often enough that I thought I had a bug somewhere.

Might have to check tinystories for name distribution.

Most questionable output from mine so far:

"one day, a naughty man and a little boy went to the park place to find some new things."




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: