Genericness is overwhelmingly a product of RLHF rather than an innate property of LLMs. A lot of manual fine-tuning has gone into ChatGPT and Gemini to make them capable of churning out homework and marketing blogs without ever saying anything offensive.
If you make requests to the Sonnet 3.5 or DeepSeek-R1 APIs and turn up the temperature a little bit, you will get radically more interesting outputs.
Isn’t that still pulling from the same distribution, just with a larger standard deviation? I think the problem is that generators aren’t drawing from novel distributions; they’re still sampling from the same population (existing written works), so raising the temperature only covers a small part of the search space.
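To make that concrete, here is a minimal sketch of temperature-scaled sampling (a hypothetical helper, not any vendor's actual API): temperature divides the logits before the softmax, which flattens or sharpens the distribution but never changes its support. The model can only ever emit tokens it already assigns probability to.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature).

    Higher temperature flattens the distribution, making rare
    tokens more likely -- but the support (which tokens are
    possible at all) never changes.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the categorical distribution
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

With logits like `[5.0, 1.0, 1.0]`, a low temperature picks the top token almost every time, while a high temperature spreads the samples out; either way, only those three tokens can ever appear.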