The Unreasonable Effectiveness of Character-Level Language Models (ipython.org)
74 points by woodson on May 26, 2015 | 12 comments



This is exactly the letter-based version of the "Dissociated Press" Markov-chain algorithm, right?

I suspect some of the perceived quality at higher orders (particularly in the source-code example) comes from the transition graph becoming sparse, so the model deterministically reproduces long stretches of the input text verbatim.
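
You can see the effect with a few lines of Python. This is a toy sketch of an unsmoothed character n-gram model in the same spirit as the article, not the article's actual code; `order` is the history length:

    from collections import Counter, defaultdict
    import random

    def train(text, order):
        # Map each length-`order` history to counts of the next character.
        lm = defaultdict(Counter)
        padded = "~" * order + text
        for i in range(len(text)):
            lm[padded[i:i + order]][padded[i + order]] += 1
        return lm

    def generate(lm, order, n_chars):
        history, out = "~" * order, []
        for _ in range(n_chars):
            counts = lm.get(history)
            if not counts:  # dead-end history (only seen at the end of the text)
                break
            # Sample the next character in proportion to its training count.
            nxt = random.choices(list(counts), weights=list(counts.values()))[0]
            out.append(nxt)
            history = history[1:] + nxt
        return "".join(out)

At high orders most histories have exactly one continuation, so generation degenerates into copying the training text, which is the memorisation effect I mean.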


Goldberg says parenthesis balancing, indentation, etc. would take a lot of non-trivial human reasoning to implement, but I found the example in https://news.ycombinator.com/item?id=9585080 even more amazing: the NN actually learns meter and parts of speech. I don't find it very likely that a character-based Markov chain could generate completely novel, grammatically correct constructions (with correct meter!) from such a small corpus, since you always have to weigh correctness (high order) against the ability to generalise (low order).


It's not true that it learns the meter. If you look through the large generated example you'll see that there's not much consistency in the number of syllables per line.


That same article (the one you linked) showed how the RNN learned to balance parentheses.


? I tried linking to a comment …

EDIT: Oh, I misunderstood. I know the RNN can do paren balancing, and that's why Goldberg said it was impressive: with his method you'd need to add extra hacks around it.


HN discussion of the blog post this is a response to (890 points, 204 comments): https://news.ycombinator.com/item?id=9584325.


I did a project that used a similar technique, but mapping only a single state transition (order 1). For its simplicity it was very effective.

https://github.com/rectangletangle/atypical


I also did a project that used a similar technique: it scrapes text from 4chan and maps each sequence of N words to the possible next words.

https://github.com/skphilipp/humanity
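
The core mapping is roughly this (a toy sketch of the idea, not the actual repo code; `n` is the number of context words):

    from collections import defaultdict
    import random

    def build_chain(words, n):
        # Map each n-word context to the words observed to follow it.
        # Keeping duplicates means random.choice samples by frequency.
        chain = defaultdict(list)
        for i in range(len(words) - n):
            chain[tuple(words[i:i + n])].append(words[i + n])
        return chain

    def babble(chain, n, length):
        context = random.choice(list(chain))
        out = list(context)
        for _ in range(length):
            followers = chain.get(tuple(out[-n:]))
            if not followers:  # context never seen mid-text
                break
            out.append(random.choice(followers))
        return " ".join(out)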


Why 4chan?


I browse it myself; it has a really simple API, and the content refreshes quickly on active boards.


People have been working for ages trying to generate passable text by looking at n-grams of whole words. It is quite surprising to see that all along we could have done better with a simpler model!


Yes, I guess computers are better than us at deciding how to organize characters into words, judging by the author's typos: 'liklihood', 'Mathematiacally', 'langauge', 'immitate', 'somehwat', 'characer', 'commma', 'Shakespear', 'characteters', 'Shakepearan'.



