Hacker News
Irmin: Git-like distributed DB (mirage.io)
121 points by mortimerwax on Nov 3, 2015 | 25 comments



There was some discussion about Irmin just over a year ago [1] and it's been steadily improving ever since. The Readme on the repo [2] lists the bleeding-edge users too, so you can see examples of how it's been applied -- (note to self: we should probably write an update about the progress to date).

[1] https://news.ycombinator.com/item?id=8053687

[2] https://github.com/mirage/irmin/blob/master/README.md#use-ca...


If anyone wants to see a talk about this with a few demos of various use-cases (such as the JavaScript transpilation backend, and log servers), I gave a talk at QCon NYC that just went live on InfoQ yesterday: http://www.infoq.com/presentations/irmin


I'd like to see more information about transactions. Is it possible to update several keys atomically?


Yes, you can easily add things to a "staging area" of your database (which is held in memory), and then "commit" your changes to update multiple keys in one go. You can also use this mechanism to keep track of reads (which Git doesn't do) to detect read/write conflicts.
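
A rough sketch of that staging idea in plain Python, to make it concrete. This is not Irmin's API: `Store`, `Staging`, `Conflict` and the per-key version counters are all hypothetical names invented for illustration. Writes are buffered in memory, reads record the version they observed, and commit applies everything or nothing.

```python
class Conflict(Exception):
    """Raised when a key read through the staging area changed before commit."""


class Store:
    """Toy key-value store (hypothetical, not Irmin's API)."""

    def __init__(self):
        self.data = {}      # key -> committed value
        self.versions = {}  # key -> counter bumped on every committed write

    def stage(self):
        return Staging(self)


class Staging:
    """In-memory staging area: buffers writes and tracks reads."""

    def __init__(self, store):
        self.store = store
        self.writes = {}  # key -> new value, invisible to the store until commit
        self.reads = {}   # key -> version observed at first read

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        # Remember which version we saw so commit can detect stale reads.
        self.reads.setdefault(key, self.store.versions.get(key, 0))
        return self.store.data.get(key)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validate first: nothing we read may have changed since we read it.
        for key, seen in self.reads.items():
            if self.store.versions.get(key, 0) != seen:
                raise Conflict(key)
        # Only then apply every buffered write, so a failed commit changes nothing.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1
        self.writes.clear()
        self.reads.clear()
```

With this sketch, a staging area that sets "a" and "b" leaves the store untouched until `commit()`, and a second staging area that read "a" before someone else committed a new "a" gets a `Conflict` instead of silently writing a stale result.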


See also an early paper on "mergeable persistent data structures" that shows how to do this for Irmin-based Rope and Queue data structures. http://anil.recoil.org/papers/2015-jfla-irmin.pdf


The Github repo[1] says "This repository is currently offline". I've never seen this message on github before, I'm not really sure what it means. Github then refers me to the "working when Github goes down" article, but Github hasn't really gone down.. I can access every other repo I try to access. Just not this one.

[1] https://github.com/mirage/irmin


https://status.github.com/

20:34 MST We are continuing to investigate a fileserver outage. Some repositories may be temporarily unavailable.


Github is having issues at the moment https://status.github.com/


If github went down I would seek the nearest bomb shelter.


It's down pretty often.


From your description I'm not quite sure where in the stack Irmin belongs. Is this to be used by web application developers? I assume not, as this looks like it is targeting more OS-level development work?

Pretty cool stuff. I am also working on a distributed database, https://github.com/amark/gun , that operates at the high level (web/javascript) rather than the low level. Although it looks like Irmin can be used in the browser? https://github.com/talex5/irmin-js ? Would love to hear some clarification.


As with the rest of MirageOS, Irmin is a "library" database, which means that you have a bunch of components that you can re-use in different contexts. Two interesting contexts are:

- the browser, where some components of Irmin are transpiled to JavaScript using js_of_ocaml (http://ocsigen.org/js_of_ocaml/). Cuekeeper (http://roscidus.com/blog/blog/2015/04/28/cuekeeper-gitting-t...) is an interesting use-case for that.

- the kernel, where some components of Irmin can be compiled into a unikernel and be run on top of Xen/baremetal, bypassing the OS completely. Irmin-ARP (http://somerandomidiot.com/blog/2015/04/24/what-a-distribute...) is an interesting step in the direction of exposing kernel data and doing interesting stuff with it.


[ To avoid any confusion, the submitter is not the author of the post :) ]

I don't really understand the question about the stack. Where it belongs should only be limited by its features.

It certainly can be used by people developing browser-based apps. For example, Cuekeeper [1] is a version-controlled TODO manager which uses Irmin. It's also been used with Xenstore [2], which is a different use-case (there are other examples in the readme of the Irmin repo).

[1] https://github.com/talex5/cuekeeper

[2] https://github.com/djs55/ocaml-xenstore/tree/irminsule


Like most OCaml libraries, Irmin can be used in the browser by compiling your program with js_of_ocaml.

My `irmin-js` experiment provides a Javascript API to Irmin, so you can write your application in Javascript too, rather than OCaml. My JS isn't very good though; here's what my example code ended up looking like:

https://github.com/talex5/irmin-js/blob/master/examples/test...


How does this compare / contrast with IPFS?


IPFS isn't designed as a database; it's more of a lookup layer for files.


You'll soon be able to store arbitrary data structures on IPFS [1], and we're discussing how to perform merging in a decentralised manner [2]. I'm actually quite interested in having a persistent data storage system (like Irmin) backed by IPFS in the future.

[1] https://github.com/ipfs/go-ipld [2] https://github.com/ipfs/notes/issues/40


Nice, but there is a fundamental problem with the three-way merge: the guarantees about the result are very weak, and it may require special attention to resolve merge conflicts.


That's sort of the entire point. The idea is that you build data structures using this library that operate over the three-way merge, and provide their own guarantees about merge conflicts.

For instance, a weakly consistent data structure could promise never to raise a merge conflict, and therefore be safe to compose. A stronger one could raise more precise merge exceptions depending on the exact error, which ripple up to the application.
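
To make the two flavours concrete, here is a minimal sketch in plain Python rather than OCaml, and not using Irmin's real merge API. The function names and the `MergeConflict` exception are hypothetical. Each merge function takes the common ancestor plus the two branch values: the counter never conflicts, the register raises when both sides changed the value differently, and `merge_record` composes per-field merges so that any conflict ripples up to the caller.

```python
class MergeConflict(Exception):
    """Raised when a three-way merge cannot pick a result on its own."""


def merge_counter(base, ours, theirs):
    """Never conflicts: apply both branches' deltas relative to the ancestor."""
    return base + (ours - base) + (theirs - base)


def merge_register(base, ours, theirs):
    """Conflicts only when both branches changed the value in different ways."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    raise MergeConflict(f"both sides changed {base!r}: {ours!r} vs {theirs!r}")


def merge_record(mergers):
    """Build a merge function for a dict out of per-field merge functions."""
    def merge(base, ours, theirs):
        # A MergeConflict raised by any field propagates to the application.
        return {k: m(base[k], ours[k], theirs[k]) for k, m in mergers.items()}
    return merge
```

For example, `merge_record({"hits": merge_counter, "title": merge_register})` merges concurrent increments of `hits` without complaint, but raises `MergeConflict` if both branches retitled the record differently.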

An example of an application handling this sort of merge error is the Cuekeeper TODO manager (which is a pure JavaScript Irmin app that uses HTML5 Localstorage/IndexedDB). Try opening http://test.roscidus.com/CueKeeper/ in two tabs and creating conflicting changes, and see the Irmin merge error ripple up to the UI.


Note: CueKeeper is a bit unusual here. Merges always succeed, but it adds a note to the item saying what it did to resolve the conflict.

e.g. if you rename an action "orig" to "a" and to "b" and then merge, you'll end up with an action called "a" with a note saying the change to "b" was discarded.
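
A small Python sketch of that "always succeed, but leave a note" behaviour. This is not CueKeeper's actual code: the function name, the note wording, and the rule that the first branch wins are assumptions made for illustration.

```python
def merge_name(base, ours, theirs):
    """Three-way merge of a name that never fails.

    Returns (merged_name, note), where note is None unless a change was discarded.
    """
    if ours == theirs or theirs == base:
        return ours, None
    if ours == base:
        return theirs, None
    # Both sides renamed differently: keep ours (an arbitrary choice in this
    # sketch) and record what was thrown away instead of raising an error.
    return ours, f'Conflicting rename: the change to "{theirs}" was discarded.'
```

Here `merge_name("orig", "a", "b")` keeps "a" and returns a note about the discarded "b", while a rename on only one side merges cleanly with no note.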

BTW: it's easier to generate merge conflicts using this web interface, which lets you run two instances in one window:

http://roscidus.com/blog/blog/2015/04/28/cuekeeper-gitting-t...


Can you expand on the weak guarantees?

Sure, conflicts need to be handled. That's a fundamental problem with reality!


Well, the problem with a merge is that you lose information about the "intention" of the operation that resulted in those specific versions to be merged. All you've got is the end result of those operations. This becomes especially troublesome if there are invariants that need to hold over multiple data-items that have changed in a 3-way merge.
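
One hypothetical illustration of the multi-item invariant problem, sketched in Python with made-up account names and a naive per-key three-way merge (not how Irmin necessarily merges). Each branch respects the invariant "checking + savings >= 0" on its own, and the merge reports no conflict because the branches touched different keys, yet the merged state breaks the invariant.

```python
def merge_value(base, ours, theirs):
    """Naive per-key three-way merge: take whichever side changed."""
    if ours == theirs or theirs == base:
        return ours
    if ours == base:
        return theirs
    raise ValueError("conflict")


def merge_state(base, ours, theirs):
    """Merge two dicts key by key against their common ancestor."""
    return {k: merge_value(base[k], ours[k], theirs[k]) for k in base}


def invariant(state):
    """The combined balance must never go negative."""
    return state["checking"] + state["savings"] >= 0


base = {"checking": 50, "savings": 50}
# Branch 1 withdraws 100 from checking; the total is still 0, so it is allowed.
ours = dict(base, checking=base["checking"] - 100)
# Branch 2 withdraws 100 from savings; also fine when viewed in isolation.
theirs = dict(base, savings=base["savings"] - 100)

# No key was changed on both sides, so the merge raises no conflict.
merged = merge_state(base, ours, theirs)
```

The merge only sees the end states, not that each withdrawal was checked against the total, so `invariant(ours)` and `invariant(theirs)` both hold while `invariant(merged)` does not.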


I would probably understand better with a concrete example. It's not clear what "intention" means, and why it can't be encoded as part of whatever objects are being merged.


Heads up to the Irmin folks, looks like your website is down.


Seems to be fine now.



