
Is anyone letting an LLM code and run its code by itself, then iteratively fix any bugs in it without human intervention until it e.g. passes some black box tests?

Would it be possible to significantly improve an LLM using such unsupervised sessions?
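Concretely, I mean a loop like this (a toy sketch, assuming the openai Python package and pytest; the prompts, file names, and retry cap are all made up):

    # Toy sketch of an unsupervised write-run-fix loop. Assumes the
    # openai package (pre-1.0 API, OPENAI_API_KEY in the environment)
    # and pytest; prompts and file names are illustrative only.
    import subprocess
    import openai

    def run_tests():
        # Black-box check: exit code 0 means every test passed.
        result = subprocess.run(["pytest", "tests/"], capture_output=True, text=True)
        return result.returncode == 0, result.stdout + result.stderr

    def ask(messages):
        resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
        return resp["choices"][0]["message"]["content"]

    messages = [{"role": "user", "content":
        "Write solution.py so that tests/ passes. Reply with only the file contents."}]
    for attempt in range(10):  # hard cap so it can't loop forever
        code = ask(messages)
        with open("solution.py", "w") as f:
            f.write(code)
        passed, output = run_tests()
        if passed:
            break
        # Feed the failure back and try again.
        messages.append({"role": "assistant", "content": code})
        messages.append({"role": "user", "content":
            "The tests failed:\n" + output + "\nFix the code. Reply with only the file contents."})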




Yeah, at some point GPT-4 loses track and is just consistently wrong.

Lately I can't feed it too much info; the longer the context, the more issues.

With your suggestion, it doesn't know which part of the iteration is correct at any given moment. For us, iteration is a logical process, but for ChatGPT I think it's just more variables that increase the chance of being wrong. So you'd need to build in some way for it to iteratively filter its own context and prompts.


Isn't the issue that things fall out of context and then it starts hallucinating a lot more? Sometimes it helps to just start a new prompt.


That's been my experience. At some point it can't "un-learn" its mistakes because it keeps including the "wrong" bits in scope.

I've had some success saying "no, undo that," waiting for it to return the corrected version, and only then continuing.

Oobabooga's UI is better at this, since you can remove erroneous outputs from the context and edit your previous input to steer it in the right direction.
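The same trick works against the API: you never have to resend the failed back-and-forth at all. A rough sketch (function and argument names invented for illustration):

    # Sketch of pruning against the API: drop the whole failed
    # back-and-forth and resend only the task, the latest attempt,
    # and its error, so the "wrong" bits never stay in scope.
    def build_context(task, latest_code, latest_error):
        return [
            {"role": "user", "content": task},
            {"role": "assistant", "content": latest_code},
            {"role": "user", "content": "This failed with:\n" + latest_error + "\nFix it."},
        ]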

Given that OpenAI mines conversations for training data, it seems to align with their interests to make you give up and start a new prompt. More abandoned prompts = more training data.


I don't know what the issue is. It's been happening more lately. Before, I could ask a lot and it would manage; now, often 4-5 prompts in, it seems to answer without the previous context.


I did. It works fine for short pieces of code, but the context window size quickly becomes prohibitive with a naive approach.

Also, gpt-4 does this much better than gpt-3.5, but gpt-4 is really slow, so the iteration process can take tens of minutes.
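One crude way to keep a naive loop under the limit is to drop the oldest attempt/error pairs once a rough token budget is exceeded. A sketch (the 4-characters-per-token estimate is a rule of thumb, not the real tokenizer):

    # Trim the oldest attempt/error pairs when over budget, keeping the
    # first message (the task description) intact. The budget is a guess.
    def trim(messages, budget_tokens=6000):
        def est(m):
            return len(m["content"]) // 4  # crude token estimate
        while len(messages) > 3 and sum(est(m) for m in messages) > budget_tokens:
            del messages[1:3]  # drop the oldest attempt + its error report
        return messages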


Exactly! During the process, it seemed like two GPTs self-playing, one iteratively generating the proper prompts and the other generating the output, all triggered by one concise command from a human (say, "write tests and don't stop iterating till the tests pass"), could automate the human out of the loop. That would get rid of the manual test-fixing loops, but it would also take away control.
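Something like this, sketched as two message histories over the same API (the prompts and the role split are invented; assumes the pre-1.0 openai package with OPENAI_API_KEY set):

    # Two-model self-play sketch: a "critic" reads the test output and
    # writes the next instruction; a "coder" acts on it.
    import openai

    def ask(messages):
        resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
        return resp["choices"][0]["message"]["content"]

    def step(coder_history, critic_history, test_output):
        # Critic turns the failure into the next instruction.
        critic_history.append({"role": "user", "content":
            "The tests produced:\n" + test_output +
            "\nWrite the next instruction for the coding model."})
        instruction = ask(critic_history)
        critic_history.append({"role": "assistant", "content": instruction})
        # Coder acts on the instruction.
        coder_history.append({"role": "user", "content": instruction})
        code = ask(coder_history)
        coder_history.append({"role": "assistant", "content": code})
        return code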


You might be interested in Auto-GPT: https://agpt.co/, an attempt at an autonomous AI based on GPT.

It doesn't feed information back into GPT, so the underlying LLM isn't improving. Such a system would require both guts (insanity?) _and_ money to pull off.


I haven't had much experience with Auto-GPT, but all the "AGI is here" and "people use this to make money" posts on their news feed make my "meat brain" suspicious, ha.



