> The “source code” for a work means the preferred form of the work for making modifications to it.
That's the definition in the GPL. That it is text or binary doesn't matter.
So are the weights the preferred form for making modifications? Partly yes, because of fine tuning, but also no, because you are limited in what you can do with fine tuning. If Mistral had to make major changes to their model, they would probably start with the dataset and code they have but you don't, the one that created the weights file.
So I wouldn't call it "open source", just "open". You can do whatever you want with what you have, but you don't have the same abilities as Mistral to modify the model because you lack some data.
Still, it is a bit of an unusual situation since even with the "real sources", i.e. training data and code, most people wouldn't have the resources to retrain the model, and a big part of the value in these models is the computing resources that were invested in training them.
That's the definition in the GPL. That it is text or binary doesn't matter.
So are the weights the preferred form for making modifications? Partly yes, because of fine tuning, but also no, because you are limited in what you can do with fine tuning. If Mistral had to make major changes to their model, they would probably start with the dataset and code they have but you don't, the one that created the weights file.
So I wouldn't call it "open source", just "open". You can do whatever you want with what you have, but you don't have the same abilities as Mistral to modify the model because you lack some data.
Still, it is a bit of an unusual situation since even with the "real sources", i.e. training data and code, most people wouldn't have the resources to retrain the model, and a big part of the value in these models is the computing resources that were invested in training them.