Add to your prompt: "For every factual statement, assign a certainty float 0..1, where 0 means you're very uncertain, and 1 means you're absolutely certain it is true".
Specific example:
"why do we have first-person subjective experiences? List current theories. For every theory, assign a truthiness float 0..1, where 0 means you're sure it is wrong, and 1 means you're absolutely sure it is true"
From experimenting with this, it will shift the output, sometimes drastically so, since the model now has to reason about its own certainty; it tends to make up significantly less shit (for example, the non-truth-marked version of the output for the query above also listed panpsychism, whereas the truth-marked version listed only scientific hypotheses).
So the model _can_ reason about its certainty and truth-value; and I strongly suspect it was just not rewarded during RLHF for omitting things it knew to be false (basically percolating the social lies people tell each other), which seems to show up in coding as well.
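If you want to script the wrapping rather than paste it by hand, here's a minimal sketch; the openai>=1.0 Python client, the model name, and the ask_with_certainty helper are my own assumptions for illustration, not part of the original experiment:

    # Append the certainty-scoring instruction to any question before sending it.
    from openai import OpenAI

    CERTAINTY_SUFFIX = (
        "For every factual statement, assign a certainty float 0..1, where 0 means "
        "you're very uncertain, and 1 means you're absolutely certain it is true."
    )

    def ask_with_certainty(question: str, model: str = "gpt-4") -> str:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        prompt = f"{question}\n\n{CERTAINTY_SUFFIX}"
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    print(ask_with_certainty(
        "Why do we have first-person subjective experiences? List current theories."
    ))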
Specific example: "why do we have first-person subjective experiences? List current theories. For every theory, assign a truthiness float 0..1, where 0 means you're sure it is wrong, and 1 means you're absolutely sure it is true"
From experimenting with this, it will shift the output, sometimes drastically so, as the model now has to reason about it's own certainty; it tends to make significantly less shit up (for example, the non-truth-marked version of the output for the query above also listed panpsychism; whereas the truth-marked version listed only scientific hypotheses).
So the model _can_ reason about it's certainty, and truth-value; and I strongly suspect it was just not rewarded during RLHF for omitting things it knew to be false -basically, percolating the social lies people tell to eachother- which seems to show up in coding as well.
Edit: see https://twitter.com/sdrinf/status/1629084909422931969 for results