Python code to generate text using a pretrained character-based RNN (github.com/minimaxir)
75 points by jjwiseman on Aug 18, 2017 | 10 comments



Very simple interface, but even the cherry-picked examples in the readme are quite poor in terms of meaning/grammatical correctness. It even uses pretty state-of-the-art techniques such as LSTMs and word embeddings, but the results just aren't there.

Can someone point me to some of the best examples of coherent and long (as in many words/sentences) automatic text generation? Or an explanation as to why we're so far from being able to tackle this problem? Because I haven't seen anything in text generation to write home about.


Repo author here:

Unfortunately, a lot of the examples people give for text generation in general are cherry-picked, which creates a selection bias that overstates the robustness of text generation (see my previous rant: https://news.ycombinator.com/item?id=14949220)

In terms of size vs. performance (including retraining), the 128-cell LSTM was the best balance for a repo like this. (The 3-layer 512-cell networks used in the original Karpathy examples are hundreds of MB)

I recommend looking at the examples in the /output folder for more robust examples than the ones in the README.


You could host the bigger model elsewhere and let users download it if they want to.

Why don't you have a priming ability? That is, generating text starting from some context. It might be useful for a lot of applications.


As noted in the README, supporting bigger models is an option for future development. (I would still need to train/optimize the model, which takes time/money. Additionally, file hosting is not free, and I am not currently making any revenue off of this project.)

The generate() function does have priming with a `prefix` parameter; see the demo.
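A minimal sketch of how that looks (assuming the bundled pretrained weights load by default; only the `prefix` parameter is the one discussed here, the other arguments are illustrative):

    from textgenrnn import textgenrnn

    # load the bundled pretrained character-level weights
    textgen = textgenrnn()

    # prime generation with a context string via the prefix parameter
    textgen.generate(3, prefix='The meaning of life is', temperature=0.5)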


Most examples I've seen try to produce text that makes sense based on nothing, which seems kind of pointless. You can't create a message from nothing.

What we really need is a way to take a bunch of facts, a key message, a tone, and produce text that represents those inputs.

The text-generation/grammar side of language is basically solved, imo. The problem is taking one thought and expanding on it based on other relevant thoughts.

But we're essentially talking general AI at that point.


I know next to nothing in this space, but wouldn't a Markov chain be better than an RNN for something like text generation?

Also, things like SubredditSimulator seem to excel at generating grammatically and thematically correct nonsense.
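For reference, roughly what I mean by a character-level Markov chain is just a lookup table of which characters tend to follow each short context (a toy sketch, not from the repo):

    import random
    from collections import defaultdict

    def build_chain(text, order=4):
        # map each `order`-character context to the characters observed after it
        chain = defaultdict(list)
        for i in range(len(text) - order):
            chain[text[i:i + order]].append(text[i + order])
        return chain

    def generate(chain, seed, order=4, length=200):
        out = seed
        for _ in range(length):
            followers = chain.get(out[-order:])
            if not followers:
                break  # unseen context: stop rather than invent one
            out += random.choice(followers)
        return out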


I've read that forming sentences from characters is competitive with forming them from words, with evidence that character-based RNNs can learn certain grammar rules such as singular/plural agreement and matching braces.


Could this be used to generate place names like the ones in this https://medium.com/@hondanhon/i-trained-a-neural-net-to-gene... but with a different language? Or would that need a different training set?


The pretrained network is calibrated for English, but if you trained it on non-English data it should still work fine since the entire network is retrained (assuming there is enough overlap with characters in the vocabulary; e.g. Spanish would work fine but CJK languages would not).
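Roughly like this (a sketch based on the package's training API; the training call may differ in this early version, and the file of Spanish place names is hypothetical):

    from textgenrnn import textgenrnn

    textgen = textgenrnn()  # starts from the English-pretrained weights

    # retrain the whole network on Spanish lines; the character
    # vocabulary overlaps enough with English for this to work
    with open('toponimos_es.txt', encoding='utf-8') as f:
        textos = [line.strip() for line in f if line.strip()]

    textgen.train_on_texts(textos, num_epochs=10)
    textgen.generate(5)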


Thanks! I love the textgenrnn_vocab.json easter egg by the way :)



