Is there any place where people doing this kind of grafting congregate?
I've often wondered whether taking some random chunk of weights from the middle of a trained model and dumping it into some totally different model might perform better than random initialization once the scale gets big enough.
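To make the idea concrete, here's a rough sketch of just the mechanical part: copying a chunk of one weight matrix into a differently shaped, randomly initialized one. Everything in it is made up for illustration (the `graft` and `random_matrix` helpers, the shapes, the offsets), and the "trained" donor is only random numbers standing in for real weights. It says nothing about whether the result actually trains better than plain random init; you'd have to run that experiment.

```python
import random

def random_matrix(rows, cols, scale=0.02, seed=0):
    """Gaussian random initialization, standing in for a fresh layer."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def graft(donor, target, donor_row=0, donor_col=0):
    """Return a copy of `target` whose top-left corner is overwritten with a
    chunk of `donor` starting at (donor_row, donor_col).

    The chunk is clipped to whatever fits in both matrices; everything
    outside it keeps the target's random initialization.
    """
    rows = min(len(donor) - donor_row, len(target))
    cols = min(len(donor[0]) - donor_col, len(target[0]))
    grafted = [row[:] for row in target]  # copy so the target is untouched
    for i in range(rows):
        for j in range(cols):
            grafted[i][j] = donor[donor_row + i][donor_col + j]
    return grafted

# Hypothetical stand-in for a layer from the middle of a trained model.
donor_layer = random_matrix(8, 8, scale=1.0, seed=1)
# Freshly initialized layer in a differently shaped model.
fresh_layer = random_matrix(10, 6, seed=2)

# Take the chunk starting at row 2, column 1 of the donor.
grafted_layer = graft(donor_layer, fresh_layer, donor_row=2, donor_col=1)
```

Here the grafted chunk covers the first 6 rows of the new layer and the remaining 4 rows stay at their random init. Which chunk to take and where to put it are exactly the open questions.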
In the Discord age, a lot of the discussion about cool LLM stuff is fragmented and buried. I'm in these Discords, and I still only know about this because I ran into it on HuggingFace.