
One of the biggest problems with hands off LLM writing (for long horizon stuff like novels) is that you can't really give them any details of your story because they get absolutely neurotic with it.

Imagine for instance you give the LLM the profile of the love interest for your epic fantasy, it will almost always have the main character meeting them within 3 pages (usually page 1) which is of course absolutely nonsensical pacing. No attempt to tell it otherwise changes anything.

This is the first model that after 19 pages generated so far resembles anything like normal pacing even with a TON of details. I've never felt the need to generate anywhere near this much. Extremely impressed.

Edit: Sharing it - https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

with pastebin - https://pastebin.com/aiWuYcrF






I like how critique of LLMs evolved on this site over the last few years.

We are currently at nonsensical pacing while writing novels.


The most straightforward way to measure the pace of AI progress is by attaching a speedometer to the goalposts.

Oh, that's a good one. And it's true. There seems to be a massive inability for most people to admit the building impact of modern AI development on society.

Oh, we do admit impact and even have a name for it: AI slop. (Speaking on LLMs now since AI is a broad term and it has many extremely useful applications in various areas)

AI slop is soon to be "AI output that no one wanted to take credit for".

They certainly seem to have moved from "it is literally skynet" and "FSD is just around the corner" in 2016 to "look how well it paces my first lady Trump/Musk slashfic" in 2025. Truly world changing.

Haha, so that's the first derivative of goalpost position. You could take the derivative of that to see if the rate of change is speeding up or slowing.

I've asked claude to explain what you meant... https://claude.ai/share/391160c5-d74d-47e9-a963-0c19a9c7489a

I’m not sure outsourcing even the comprehension of HN comments to an LLM is going to work out well for your mind.

I’m not sure lacking comprehension of a comment and choosing to ignore that lack is better. Or worse: asking everyone to manually explain every reference they make. The LLM seems a good choice when comprehension is lacking.

I love this comment.

It's not really passing the Turing Test until it outsells Harry Potter.

> It's not really passing the Turing Test until it outsells Harry Potter.

Most human-written books don't do that, so that seems to be a criterion for a very different test than a Turing test.


Both books that have outsold the Harry Potter series claim divine authorship, not purely human. I am prepared to bet quite a lot that the next isn't human-written, either.

The joke is that the goalpost is constantly moving.

This sub-goalpost can't move much further after it passes the "outsells the Bible" mark.

Why would the book be worth buying, though, if AI can generate a fresh new one just for you?

I don't know. It's a question relevant to all generative AI applications in entertainment - whether books, art, music, film or videogames. To the extent the value of these works is mostly in being social objects (i.e. shared experience to talk about with other people), being able to generate clones and personalized variants freely via GenAI destroys that value.

You may be right, on the other hand it always feels like the next goalpost is the final one.

I'm pretty sure if something like this happens some dude will show up from nowhere and claim that it's just parroting what other, real people have written, just blended together and randomly spat out – "real AI would come up with original ideas like a cure for cancer," he'll say.

After some form of that arrives, another dude will show up and say that this "alphafold while-loop" is not real AI, because he just went for lunch and there was a guy flipping burgers – and that "AI" can't do that, so it's shit.

https://areweagiyet.com should plot those future points as well, with all those funky goals like "if Einstein had access to the Internet, Wolfram etc. he could have come up with it anyway, so no better than humans per se", or "had to be prompted and guided by a human to find this answer, so didn't really do it by itself" etc.


From Gary Marcus' (notable AI skeptic) predictions of what AI won't do in 2027:

> With little or no human involvement, write Pulitzer-caliber books, fiction and non-fiction.

So, yeah. I know you made a joke, but you have the same issue as the Onion I guess.


Let me toss a grenade in here.

What if we didn’t measure success by sales, but impact to the industry (or society), or value to peoples’ lives?

Zooming out to AI broadly: what if we didn’t measure intelligence by (game-able, arguably meaningless) benchmarks, but real world use cases, adaptability, etc?


I recently watched some Claude Plays Pokemon and believe it's a better measure than all those AI benchmarks. The game could be beaten by an 8-year-old, who obviously doesn't have all the knowledge that even small local LLMs possess, but has actual intelligence and could figure out the game within < 100h. So far Claude can't even get past the first half, and I doubt any other AI could get much further.

Now I want to watch Claude play Pokemon Go, hitching a ride on self-driving cars to random destinations and then trying to autonomously interpret a live video feed to spin the ball at the right pixels...

2026 news feed: Anthropic cited as AI agents simultaneously block traffic across 42 major cities while trying to capture a not-even-that-rare pokemon


the true measure of AI: does it have fun playing pokemon? did it make friends along the way?

We humans love quantifiability. Since you used the word "measure", do you believe the measurement you're aspiring for is quantifiable?

I currently assert that it's not, but I would also say that trying to follow your suggestion is better than our current approach of measuring everything by money.


> We humans love quantifiability.

No. Screw quantifiability. I don't want "we've improved the sota by 1.931%" on basically anything that matters. Show me improvements that are obvious, improvements that stand out.

Claude Plays Pokemon is one of the few really important "benchmarks". No numbers, just the progress and the mood.


This is difficult to do because one of the juiciest parts of AI is being able to take credit for its work.

The goalposts will be moved again. Tons of people will claim the book is stupid and vapid and that only idiots bought it. When AI starts taking over jobs, which it already has, you'll get tons of people claiming the same thing.

Well, strictly speaking, outselling Harry Potter would fail the Turing test: the Turing test is about passing for human (in an adversarial setting), not surpassing humans.

Of course, this is just some pedantry.

I for one love that AI is progressing so quickly, that we _can_ move the goalposts like this.


To be fair, pacing as a big flaw of LLMs has been a constant complaint from writers for a long time.

There were popular writeups about this from the Deepseek-R1 era: https://www.tumblr.com/nostalgebraist/778041178124926976/hyd...


This was written on March 15. Deepseek came out in January. "Era" is not language I would use for something that happened a few days ago.

This either ends at "better than 50% of human novels" garbage or at unimaginably compelling works of art that completely obsolete fiction writing.

Not sure which is better for humanity in the long term.


That could only obsolete fiction-writing if you take a very narrow, essentially commercial view of what fiction-writing is for.

I could build a machine that phones my mother and tells her I love her, but it wouldn't obsolete me doing it.


Ahh, now this would be a great premise for a short story (from the mom's POV).

We are, if this comment is the standard for all criticism on this site. Your comment seems harsh. Perhaps novel writing is too low-brow of a standard for LLM critique?

I didn't quite read parent's comment like that. I think it's more about how we keep moving the goalposts or, less cynically, how the models keep getting better and better.

I am amazed at the progress that we are _still_ making on an almost monthly basis. It is unbelievable. Mind-boggling, to be honest.

I am certain that the issue of pacing will be solved soon enough. I'd give 99% probability of it being solved in 3 years and 50% probability in 1.


In my consulting career I sometimes get to tune database servers for performance. I have a bag of tricks that yield about +10-20% performance each. I get arguments about this from customers, typically along the lines of "that doesn't seem worth it."

Yeah, but 10% plus 20% plus 20%... next thing you know you're at +100% and your server is literally double the speed!

AI progress feels the same. Each little incremental improvement alone doesn't blow my skirt up, but we've had years of nearly monthly advances that have added up to something quite substantial.


Yes, if you are Mary Poppins, each individual trick in your bag doesn't have to be large.

(For those too young or unfamiliar: Mary Poppins famously had a bag that she could keep pulling things out of.)


Except at some point the low hanging fruit is gone and it becomes +1%, +3% in some benchmarked use case and -1% in the general case, etc. and then come the benchmarking lies that we are seeing right now, where everyone picks a benchmark that makes them look good and its correlation to real world performance is questionable.

What exactly is the problem with moving the goalposts? Who is trying to win arguments over this stuff?

Yes, Z is indeed a big advance over Y was a big advance over X. Also yes, Z is just as underwhelming.

Are customers hurting the AI companies' feelings?


> Are customers hurting the AI companies' feelings?

No. It's the critics' feelings that are being hurt by continued advances, so they keep moving goalposts so they can keep believing they're right.


The goalposts should keep moving. That's called progress. Like you, I'm not sure why it seems to irritate or even amuse people.

I don’t know why I keep submitting myself to Hacker News, but every few months I get the itch, and it only takes a few minutes to be turned off by the cynicism. I get that it’s from potentially wizened tech heads who have been in the trenches and are being realistic. It’s great for that, but any new bright-eyed and bushy-tailed dev/techy, whatever, should stay far away until much later in their journey.

People are trying to use gen AI in more and more use cases. It used to fall flat on its face at trivial stuff; now it has gotten past the trivial stuff but is still scratching at the boundaries of being useful. And that is not an attempt to make gen AI look bad (it is really amazing what it can do), but it is far from delivering on the hype, and that is why people are providing critical evaluations.

Let's not forget the OpenAI benchmarks saying 4.0 could do better at college exams and such than most students. Yet real-world performance was laughable on real tasks.


> Let's not forget the OpenAI benchmarks saying 4.0 could do better at college exams and such than most students. Yet real-world performance was laughable on real tasks.

That's a better criticism of college exams than of the benchmarks, and/or those exams likely have either the exact questions or very similar ones in the training data.

The list of things that LLMs do better than the average human tends to rest squarely in the "problems already solved by above average humans" realm.


Do we have any simple benchmarks (and I know benchmarks are not everything) that test all the LLMs?

The pace is moving so fast I simply can't keep up. Or is there an ELI5 page that gives a 5-minute explanation of LLMs from 2020 to this moment?


It’s more a bellwether or symptom of a flaw where the context becomes poisoned and continually regurgitates the same thought over and over.

Not really new is it? First cars just had to be approaching horse and cart levels of speed. Comfort, ease of use etc. were non-factors as this was "cool new technology".

In that light, even a 20 year old almost broken down crappy dinger is amazing: it has a radio, heating, shock absorbers, it can go over 500km on a tank of fuel! But are we fawning over it? No, because the goalposts have moved. Now we are disappointed that it takes 5 seconds for the Bluetooth to connect and the seats to auto-adjust to our preferred seating and heating setting in our new car.


lol wouldn’t that be great to read this comment in 2022

I have actually read it and agree it is impressive. I will not comment much on the style of the writing, since this is very subjective, but I would rate it as the "typical" modern fantasy style, which aims at filling as many pages as possible: very "flowery" language, lots of adjectives/adverbs, lots of details, lots of high-school prose ("Panic was a luxury they couldn't afford"). Not a big fan of that, since I really miss the time when authors could write single, self-contained books instead of sprawling series over thousands of pages, but I know of course that this kind of thing is very successful and people seem to enjoy it. If someone gave me this, I would advise them to get a good copy editor.

There are some logical inconsistencies, though. For instance, when they both enter the cellar through a trapdoor, Kael goes first, but the innkeeper instructs him to close the trapdoor behind them, which makes no sense. Also, Kael goes down the stairs and "risks a quick look back up" and can somehow see the front door bulging and the chaos outside through the windows, which is obviously impossible when you look up through a trapdoor, not to mention that previously it was said this entry is behind the bar counter, surely blocking the sight. Kael lights an oily rag which somehow becomes a torch. There are more generic issues, like these Eldertides somehow being mythical things no one has ever seen, yet apparently a pretty common occurrence. The dimensions of the cellar are completely unclear: at first it seems to be very small, yet they move around it quite a bit. There are other issues, like characters using the same words as the narrator ("the ooze"), as if they were listening to him, and the innkeeper suddenly calling Kael by his name as if they already knew each other.

Anyway, I would rate it "first draft". Of course, it is unclear whether the LLM would manage to write a consistent book, but I can fully believe that it would manage. I probably wouldn't want to read it.


Thank you for taking the time to do a thorough read, I just skimmed it, and the prose is certainly not for me. To me it lacks focus, but as you say, this may be the style the readers enjoy.

And it also, as you say, really reuses words. Just reading I notice "phosphorescence" 4 times for example in this chapter, or "ooze" 17 times (!).

It is very impressive though that it can create a somewhat cohesive storyline, and certainly an improvement over previous models.


Regarding your last sentence, I agree. My stance is this: If you didn't bother to write it, why should I bother to read it?

From a technical standpoint, this is incredible. A few years ago, computers had problems creating grammatically correct sentences. Producing a consistent narrative like this was science fiction.

From an artistic standpoint, the result is... I'd say: incredibly mediocre, with some glaring errors in between. This does not mean that an average person could produce a similar chapter. Gemini can clearly produce better prose than the vast majority of people. However, the vast majority of people do not publish books. Gemini would have to be on par with the best professional writers, and it clearly isn't. Why would you read this when there is no shortage of great books out there? It's the same with music, movies, paintings, etc. There is more great art than you could ever consume in your lifetime. All LLMs/GenAI do in art is pollute everything with their incredible mediocrity. For art (and artists), these are sad times.


It's more nuanced than that. There is certain material/content that it is mandatory or necessary to read.

Ideally I'd prefer to read material written by a top-1%ile expert in that field, but due to constraints you almost always end up reading material written by a midwit, intern, or junior associate. In which case AI-written content is much better, especially as I can interrogate the material and approach top-1%ile quality.


Quality is its own property, separate from its creator. If a machine writing something bothers you irrespective of quality, then don't read it. You think I would care? I would not.

If this ever gets good enough to write your next bestseller or award winner, I might not even share it, and if I did, I wouldn't care if some stranger read it or not, because it was created entirely for my pleasure.


Yeah I just focused on how well it was paced and didn't give any instructions on style or try a second pass to spot any inconsistencies.

That would be the next step but I'd previously never thought going any further might be worth it.


> Not a big fan of that, since I really miss the time when authors could write single, self-contained books instead of sprawling series over thousands of pages, but I know of course that this kind of thing is very successful and people seem to enjoy it.

When was this time you speak of?


Using the AI in multiple phases is the approach that can handle this. Similarly to "Deep Research" approach - you can tell it to first generate a storyline with multiple twists and turns. Then ask the model to take this storyline and generate prompts for individual chapters. Then ask it to generate the individual chapters based on the prompts, etc.

Yup -- asking a chatbot to create a novel in one shot is very similar to asking a human to improvise a novel in one shot.

But a future chatbot would be able to internally project manage itself through that process, of first emitting an outline, then producing draft chapters, then going back and critiquing itself and finally rewriting the whole thing.

Yes, and that's why many people in the discussion here are very optimistic that chatbots will have solved this problem very soon. Either with the approach you suggest, or with something else (perhaps more general, and less directly programmed in).

It's not a problem of one-shotting it. It's that the details cause a collapse. Even if you tried breaking it down, which I have, you'd run into the same problem unless you held its hand for every single page, and then what's the point? I want to read the story, not co-author it.

I dunno, there's a certain amount of fun in "writing" a book with ChatGPT. Like playing a video game with a bunch of different endings instead of watching a movie with only one. Does the hero save the day? Or turn into a villain! You decide!

Doesn't novel literally mean something new? Can we really expect an LLM to produce a novel?

The etymology is pretty much irrelevant. In German, for example, the word for novel is 'Roman'. But German readers don't expect their novels to be any more romantic, nor do English readers expect their novels to be more novel.

LLMs have been producing new things all the time. The question was always about quality of output, never about being able to produce anything new.


Yes

I think you would be better off having the LLM help you build up the plot with high level chapter descriptions and then have it dig into each chapter or arc. Or start by giving it the beats before you ask it for help with specifics. That'd be better at keeping it on rails.

I don't disagree. Like with almost anything else involving LLMs, getting hands-on produces better results, but because in this instance I much prefer to be the reader rather than the author or editor, it's really important to me that an LLM is capable of pacing long-form writing properly on its own.

Random question, if you don't care about being a creator yourself, why do you even want to read long form writing written by an LLM? There are literally 10000s of actual human written books out there all of them better than anything an LLM can write, why not read them?

> There are literally 10000s of actual human written books out there all of them better than anything an LLM can write, why not read them?

10000s is still much smaller than the space of possibilities for even a short prompt.

You might be right that good human novels are better than what LLMs can manage today. But that's rapidly changing.

And if you really need that Harry Potter / Superman / Three Musketeers crossover fan fiction itch scratched, you might not care that some other existing novel is 'better' in some abstract sense.


Authors tell stories they want to tell, and readers read stories they want to read. The two don't necessarily overlap, or don't overlap strongly enough. If you're even a little bit specific (nowhere near as specific as the above prompt; even just something like the dynamic between the protagonists), then you don't actually have 10,000s of human-written books. Not even close. Maybe it exists, and maybe you'll find it good enough, but if it's only been read by a few hundred or thousand people? Good luck getting it recommended.

I've read a LOT of fiction. I love reading. And if it's good enough, the idea of reading something created by a machine does not bother me at all. So of course I will continue to check whether the machine is finally good enough that I can be a bit more specific.


Usually porn and fan fiction.

> There are literally 10000s of actual human written books out there

Tens-of-thousands is probably low by something in the neighborhood of four orders of magnitude.


It's very hard to find good books written by humans. GoodReads is okay, but you quickly run out of high-end recommendations. I read mostly sci-fi, and the books that everyone recommends rarely end up being 10/10. But then I see some random recommendation on Reddit or HN, and it ends up being amazing.

Human-generated slop is real.


You could ask your LLM for a recommendation.

That was what I tried on the train [0] a few weeks ago. I used Groq to get something very fast, to see if it would work at least somewhat. It gives you a PDF in the end. Plugging in a better model gave much better results (still not really readable if you actually try; at a glance it's convincing, though); however, it was so slow that testing was kind of impossible. You can't really parallelize it either, because it needs to know what it pushed out before, or at least a summary of it.

[0] https://github.com/tluyben/bad-writer


My prompt is nowhere near yours.

Just for fun: asked it to rewrite the first page of ‘The Fountainhead’ where Howard is a computer engineer; the rewrite is hilarious lol.

https://gist.github.com/sagarspatil/e0b5443132501a3596c3a9a2...


Give it time, this will be solved.

I envision that one day a framework will be created that can persist an LLM's current state to disk so that "fragments of memories" can be paged in and out of memory.

When that happens, LLMs will be able to remember everything.


I had Grok summarize + evaluate the first chapter with thinking mode enabled. The output was actually pretty solid: https://pastebin.com/pLjHJF8E.

I wouldn't be surprised if someone figured out a solid mixture of models working as a writer (team of writers?) + editor(s) and managed to generate a full book from it.

Maybe some mixture of general outlining + maintaining a wiki with a basic writing and editing flow would be enough. I think you could probably find a way to maintain plot consistency, but I'm not so sure about maintaining writing style.


I have never used an LLM for fiction writing, but I have been writing large amounts of code with them for years. What I'd recommend: when you're defining your plan up front for the sections of the content, simply state in which phase/chapter the characters should meet.

Planning generated content is often more important to invest in than the writing of it.

Looking at your paste, your prompt is short and basic; it should probably be broken up into clear, formatted sections (try directives inside XML-style tags). For such a large output as you're expecting, I'd expect a considerable prompt of rules and context setting (maybe a page or two).
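Something in this spirit, for example; every tag name and rule below is invented for illustration, not a known schema:

```python
# Illustrative prompt builder: the <role>/<world>/<pacing>/<beats> tags
# are made-up section names, not part of any model's required format.

def build_prompt(world: str, beats: list[str], meet_chapter: int) -> str:
    rules = "\n".join(f"- {b}" for b in beats)
    return (
        "<role>You are drafting one chapter of a long fantasy novel.</role>\n"
        f"<world>{world}</world>\n"
        f"<pacing>The love interest must not appear before chapter {meet_chapter}.</pacing>\n"
        f"<beats>\n{rules}\n</beats>"
    )
```

In my experience models tend to follow an explicitly delimited constraint like the `<pacing>` section more reliably than the same instruction buried in a paragraph, though that's an observation, not a guarantee.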


Opening with "like a struck flint carried on a wind that wasn’t blowing." <chuckles>

I don't know why, but that is just such a literal thing to say that it seems almost random.


Why would you ever want to write a novel with AI? That is human stuff, right? :)

I'm terrible at writing, but I love reading. I've got ideas for novels, but I struggle to put them down.

What I have found that works is to give the LLM the "world" outline at the beginning and then just feed it one line summary of each chapter and get it to write a chapter at a time.

The problem is that the quality of results drastically decreases as the context length increases. After about 10 chapters the dialogue will start to get real snippy. I've tried getting it to summarize all the previous chapters and feed that back in, but it never includes enough detail.


The only way to get better at something is to do it. Start writing short stories or small novels, and you will get there over time. You don't even have to be a great writer to write a great book as well :). It helps, but readers will forgive a lot along your journey.

Brandon Sanderson has a great series of lectures on how he approaches it that are awesome ->

https://www.youtube.com/playlist?list=PLSH_xM-KC3ZvzkfVo_Dls...

You will get so many mental benefits from writing, too. I promise it is worth it. AI is a great tool if you hit a block and need to brainstorm.


No, you are absolutely right. A lot of the things people think they can't do are literally just lack of practice.

My other problem is... lack of time :)


ack, I also have this problem :)

I am working on some world-building for something I want to write one day, but I am trying just to write little things to help. I write a lot of nonfiction stuff for work, but I am worried that it might not translate as well to characters...


I don't want to write a novel with AI. I want to read them (when they're good enough) because I love reading. Sometimes I want to read something with a certain dynamic, and it gets difficult finding human-written recommendations.

I run Shepherd.com, and hopefully, it helps :). Feel free to email me at ben@shepherd.com if you need any help with book ideas. I'm working to add more book DNA breakdowns later this year to help tap into certain themes, tropes, moods, etc.

For example, with filters right now you can do things like show me hard sci fi with AI: https://shepherd.com/bookshelf/hard-science-fiction?topics=Q...

Reddit is also a great source for recommendations: https://www.reddit.com/r/booksuggestions/ https://www.reddit.com/r/fantasybooks/ https://www.reddit.com/r/scifi/

Humans write books, AI is for doing the dishes or laundry :)


>Reddit is also a great source for recommendations: https://www.reddit.com/r/booksuggestions/ https://www.reddit.com/r/fantasybooks/ https://www.reddit.com/r/scifi/

Not really. Everyone recommends the same 20 books that most have read or at least considered.

Let me give you an example that is real to me. I'd like to:

1. Read a fantasy series that pairs a human male and an elf female romantically over the course of the series.
2. Read about the challenges faced by two fantasy races that aren't on very good terms, so just being an elf won't really cut it.
3. Have a love interest who is a big, active character in the story, not just a dozen mentions in a book.
4. Obviously, like the book(s).

It doesn't even have to be elves, it's just much harder trying to find such recs from a bespoke species.

You would think this would be an easy enough recommendation. Elves are the fantasy race, after all, and they usually aren't on the best of terms with humans. But it's not... and at this point, I could give you more obscure recommendations that meet at least requirement 1 than you'd get in the vast majority of Reddit threads. I spent months going through general Amazon/Goodreads recs and Goodreads shelves with elves and still came out wanting.

Once you are even a little bit specific, options decay and if they exist, they are hard to find.

Shepherd looks good though


This seems like something that planning would fix. I wonder if that's how it's doing it:

like, if it decides to <think> a table of contents, or chapter summaries, rather than just diving in at page 1.


That is mind blowing. To this fantasy reader that’s pure magic.

Can you share it on a text sharing site? It seems you hit your share quota


19 pages?! Am I the only one who prefers an AI that jumps straight to the point?

- Buildup and happy background world-building

- Subtle foreshadowing

- Orcs attack

- Hero is saved by unlikely warrior of astounding beauty

- Evil is defeated until sales justify unnecessary sequel

That's the kind of story fit for the modern attention span...



