Hacker News

Is there any place where people doing this kind of grafting are congregating?

I've often wondered whether taking some random chunk of weights from the middle of a trained model and dumping it into some totally different model might perform better than random initialization once the scale gets big enough.
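
To make the idea concrete, here's a rough sketch of what I mean, using plain Python lists as stand-ins for weight matrices. The helper name graft_init, the shapes, and the 0.02 init scale are all made up for illustration, and this says nothing about whether grafting actually helps.

```python
import random

def graft_init(donor, rows, cols, scale=0.02, seed=0):
    """Hypothetical helper: random-init a rows x cols matrix, then
    overwrite the overlapping top-left block with donor weights."""
    rng = random.Random(seed)
    # Baseline: small Gaussian random initialization.
    target = [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]
    # Graft: copy whatever part of the donor fits into the target.
    for i in range(min(rows, len(donor))):
        for j in range(min(cols, len(donor[i]))):
            target[i][j] = donor[i][j]
    return target

# Stand-in for a chunk of weights taken from a trained model.
donor = [[1.0, 2.0], [3.0, 4.0]]
grafted = graft_init(donor, rows=3, cols=4)
```

The rest of the matrix stays randomly initialized, so the comparison would be this against the same model with no graft at all.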



I dunno. Probably the Kobold, Pygmalion, or AI Collective Discord, if I were to guess.

The first effort I am aware of is here: https://huggingface.co/chargoddard/llama2-22b

Since we're in the Discord age, a lot of the discussion about cool LLM stuff is fragmented and buried. I am in these Discords, and I only know about this effort because I ran into it on HuggingFace.



