Brian Wilson from Backblaze here (not the "Brian Beach" who is the author of the blog post) - this was not a case of NIH. We use lots and lots and lots of existing software like Debian, Java, Ext4. We use tools like Ansible and Zabbix. But this one thing just didn't exist for us in the form we needed it. We looked, we really did.
We did write the Reed-Solomon code ourselves in a "clean room" so we did not have to pay any licensing fees and we clearly didn't steal anybody else's source code, but that is a very small amount of code. Like 80 lines of Java. Seriously. In that blog post we referenced the technical papers we read to implement it, but here is the link again: http://www.cs.cmu.edu/~guyb/realworld/reedsolomon/reed_solom... And we unit tested the living heck out of that code, plus we mathematically verified various parts.
But I'm open to an alternative solution if you can suggest one? Remember our three highest priorities are: 1) reliable, 2) low cost, 3) simple. The "low cost" includes things like we do not want to pay ongoing licensing fees to other companies.
Thanks, that's basically what I wanted to hear. :)
My first thought was that you could've reused the R-S code from mdraid or dm-raid or ZFS, but on second thought 1) it may be too specialized to be reusable, and 2) it's GPL (or CDDL), so you can't just plonk it into your own code.
And yeah, if it's just 80 lines of Java, I'm worrying about the wrong things.
Backblaze is a web service, so for better or worse, the GPL's source-sharing obligations aren't triggered here, since they never distribute binaries to us. The AGPL would apply, but that's not the license used.