Out of curiosity, why aren't we crowdsourcing distributed training of LLMs, where anyone can join by bringing their own hardware or data? Moreover, could we find a way to incorporate this into a blockchain so there is full transparency, and also add differential privacy to protect every participant?
You can finetune with it. If you want a more generic framework, you can use hivemind[1], which is what Petals is built on, but you'll have to create your own community for whatever model you're trying to train.
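For concreteness, here's a minimal sketch of what joining such a run looks like with hivemind, based on its collaborative-optimizer quickstart. The model, the random stand-in data, and the run_id "my-community-run" are all placeholders; in practice you'd pass initial_peers to join an existing swarm rather than bootstrapping a new one.

```python
import torch
import torch.nn.functional as F
import hivemind

# Stand-in model and data; replace with whatever the community is training.
model = torch.nn.Linear(784, 10)
local_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batches = [(torch.randn(32, 784), torch.randint(0, 10, (32,))) for _ in range(100)]

# Start a DHT node; pass initial_peers=[...] to join an existing swarm.
dht = hivemind.DHT(start=True)

# Wrap the local optimizer: updates are averaged with every peer
# training under the same run_id.
opt = hivemind.Optimizer(
    dht=dht,
    run_id="my-community-run",   # hypothetical name; all peers must agree on it
    batch_size_per_step=32,      # samples this peer contributes per step
    target_batch_size=10_000,    # global samples accumulated before averaging
    optimizer=local_opt,
    use_local_updates=True,
    verbose=True,
)

for x, y in batches:
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    opt.zero_grad()
```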
The problem here is that most people just don't have suitable hardware. Ideally, you'd want to load the entire model into a GPU, and most consumer-grade GPUs have nowhere near enough video memory. You'd need something like an A100 80GB to run a node in the potential blockchain, and one of those cards costs about 15k USD. Admittedly, that's not too far off from the price of a modern bitcoin ASIC miner, but it's still a healthy chunk of change.
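To make the memory point concrete, here's a rough back-of-envelope calculation. The 70B parameter count is a hypothetical example, and the 16 bytes per parameter is the usual rule of thumb for mixed-precision Adam (fp16 weights and gradients plus fp32 master weights, momentum, and variance):

```python
params = 70e9  # hypothetical 70B-parameter model

# Inference: fp16 weights alone.
weights_gb = params * 2 / 1e9      # ~140 GB

# Training: ~16 bytes/param for mixed-precision Adam state.
training_gb = params * 16 / 1e9    # ~1,120 GB

print(f"inference weights: ~{weights_gb:.0f} GB")
print(f"training state:    ~{training_gb:.0f} GB")
print(f"A100 80GB cards for training state alone: {training_gb / 80:.0f}")
```

So even before activations, a model of that size doesn't come close to fitting on one card, which forces the multi-GPU splitting discussed below.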
And if you try to split the model across several GPUs, then you run into a bandwidth problem, as the model's parts need to talk to each other (on the order of a terabyte per second). At the moment, the only realistic way to contribute is to provide feedback data for RLHF training.
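As a rough illustration of that gap (all figures are ballpark): datacenter interconnects are within an order of magnitude of the requirement, while home internet is several orders of magnitude short.

```python
# Ballpark comparison of available link bandwidths vs. the
# ~1 TB/s figure cited above (all values approximate).
required_gb_s = 1000

links_gb_s = {
    "A100 NVLink (intra-node)": 600,
    "PCIe 4.0 x16": 32,
    "1 Gbit/s home broadband": 0.125,
}

for name, bw in links_gb_s.items():
    print(f"{name}: {bw:>7.3f} GB/s ({required_gb_s / bw:,.0f}x short of 1 TB/s)")
```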
Am I being too crazy here?