CouchOne + Membase merge

Andrex · on Feb 8, 2011

I commented on the TC article, but I'll repeat: CouchDB and CouchOne are awesome, and anything that makes them more awesome is very welcome.

I do sometimes feel like I'm the only one using CouchDB, though... Hopefully this gets more devs on board with it.

rdtsc · on Feb 8, 2011

I use it.

At first I was in love with it and now I just like it. It is a good tool for some things but it is not a universal replacement for everything database-like.

I like:

* futon (yes, I said it, I like couch because of a pretty GUI). I can quickly look at the database and debug what the problem is. Visibility into the database, the serialization format and the protocol are important to me.

* HTTP interface. Access it with curl or any other tools that speak HTTP.

* Documents are json. I like json.

* Can attach binary blobs to documents, with MIME content-type markers. Can serve this data to a web browser.

* Durable -- when my request finished, I know the data is on the disk.

* Map/Reduce -- I just like it. It makes sense to me.

Don't like:

* Not very good for high-rate updates. Need to compact it periodically.

* Not that fast. I know I can tweak it here and there, but by default it is on the slow end of the NoSQL DB speed range. I can live with this.

* MVCC. For some people this is a GoodThing; for me it is a "Meh". Sometimes I would like to just update the document in place and overwrite the old one. Perhaps Redis or Mongo would be a better tool here. For now I just do it in a loop: read the document to get the current _rev, apply the change, write it back, and repeat if the write fails with a conflict.

* Map/Reduce -- Am I the only one who likes this? It is hard to get others onboard and start thinking of this instead of traditional queries.
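
For anyone wondering what the read, update _rev, write-back loop from the MVCC point looks like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `FakeCouch` is a hypothetical in-memory stand-in (not a real client), `ConflictError` plays the role of CouchDB's HTTP 409 response, and revisions are plain integers rather than real `_rev` strings.

```python
import copy


class ConflictError(Exception):
    """Stands in for the HTTP 409 CouchDB returns on a stale _rev."""


class FakeCouch:
    """Hypothetical in-memory database that only mimics _rev checking."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # Return a copy so callers cannot modify stored data in place.
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        # Reject the write if the caller's _rev is not the latest one.
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise ConflictError(doc["_id"])
        stored = copy.deepcopy(doc)
        stored["_rev"] = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def update_with_retry(db, doc_id, mutate, max_attempts=5):
    """Re-read and re-apply `mutate` until the write is accepted."""
    for _ in range(max_attempts):
        doc = db.get(doc_id)      # picks up the latest _rev
        mutate(doc)               # apply the change to the fresh copy
        try:
            return db.put(doc)    # succeeds only if _rev is still current
        except ConflictError:
            continue              # someone else wrote first; try again
    raise ConflictError(doc_id)
```

The `mutate` callback is re-run on every attempt, so it should be safe to apply more than once to freshly read data.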

Things I don't care about but others like:

* Offline replication. Others rave about it. I used it only a couple of times. Always afraid to end up with conflicts silently buried in the DB history.

* Javascript -- don't really like it that much. I use the Python view server to write views. It is actually faster! See:

http://packages.python.org/CouchDB/views.html
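
A rough idea of what view functions look like in Python, assuming the convention from the docs linked above where a map function is a generator that yields key/value pairs. The `rereduce` parameter on the reduce function and the `run_view` helper are assumptions for this sketch: `run_view` is a hypothetical local driver for trying functions out, not part of couchdb-python, and the document fields (`type`, `author`) are made up.

```python
from itertools import groupby


def map_fun(doc):
    # Emit one (key, value) row per comment document.
    if doc.get("type") == "comment":
        yield doc["author"], 1


def reduce_fun(keys, values, rereduce=False):
    # Summing works for both the first pass and any re-reduce pass.
    return sum(values)


def run_view(docs, map_fun, reduce_fun):
    """Hypothetical local driver: map every doc, group by key, reduce."""
    rows = sorted(row for doc in docs for row in map_fun(doc))
    result = {}
    for key, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        result[key] = reduce_fun([k for k, _ in group],
                                 [v for _, v in group])
    return result
```

Running the functions over a plain list of dicts like this is a handy way to test them before installing them in a design document.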

chapel · on Feb 8, 2011

I agree on most of your points, but I use Node.js so the Javascript stuff is just great. I also like Couchapps so in any case, CouchDB is just fantastic for those.

I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes. The fact that CouchDB stores the views the way it does makes complex queries fast and easy to use. Maybe there is a little initial overhead in writing the views, but after that it is efficient and leaves a lot of code out of the program, so you are getting the data you want in the way you want it.

arst · on Feb 8, 2011

> I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes

Maybe I misunderstand what you mean by stored indexes, but MongoDB supports indexes (http://www.mongodb.org/display/DOCS/Indexes), it just doesn't require them so you can perform exploratory ad-hoc queries when you need to.

chapel · on Feb 8, 2011

I used the wrong term, I apologize. From what I understand, MongoDB doesn't store the results of a Map/Reduce calculation; it runs each Map/Reduce query separately and on demand. That means that if you have a large data set, your M/R might be too computationally demanding to run every time.

Whereas with CouchDB, every time the view (the M/R results) is accessed, it gets incrementally updated and stored. So even complex M/R queries on large datasets will be fast on subsequent reads.
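
To make the "incrementally updated and stored" part concrete, here is a small Python sketch of the idea. It is an illustration under stated assumptions, not how CouchDB is implemented: `IncrementalView` is a hypothetical class, the change feed is a plain list of `(seq, doc)` pairs, and the stored rows live in a dict instead of an on-disk index.

```python
class IncrementalView:
    """Hypothetical view index that only maps documents it has not seen."""

    def __init__(self, map_fun):
        self.map_fun = map_fun
        self.last_seq = 0        # highest change sequence already indexed
        self.rows_by_doc = {}    # doc id -> rows emitted for that doc
        self.docs_mapped = 0     # counts map calls, for illustration only

    def query(self, changes):
        """`changes` is a list of (seq, doc) pairs in increasing seq order."""
        for seq, doc in changes:
            if seq <= self.last_seq:
                continue         # already indexed on an earlier query
            # Replace whatever this document emitted previously.
            self.rows_by_doc[doc["_id"]] = list(self.map_fun(doc))
            self.docs_mapped += 1
            self.last_seq = seq
        # Return all stored rows in key order.
        return sorted(row
                      for rows in self.rows_by_doc.values()
                      for row in rows)
```

The first query pays for mapping everything; later queries only pay for documents changed since `last_seq`.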

rit · on Feb 8, 2011

Just some clarification here as comparing MongoDB and CouchDB's Map/Reduce is a bit of "Apples and Oranges" as they are designed for different purposes.

While in CouchDB, all queries are created with Map/Reduce, in MongoDB Map/Reduce is designed for aggregation. There is a separate system for standard querying which performs much better than Map/Reduce. (Before people jump on me here: my performance comparison is Mongo queries to MONGO Map/Reduce, not Couch Map/Reduce or Map/Reduce in general. I'm not looking to get into a benchmark discussion here, just feature clarification.)

The two differ greatly---while Couch requires precalculation of "Views", MongoDB focuses on dynamic querying. Given CouchDB's use of Map/Reduce for these static views, the way they do the iterative addition of new data without re-reducing the entire dataset makes sense.
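
A toy Python sketch of adding new data without re-reducing the entire dataset: reduce each new batch once, keep the partial result, and combine partials in a second "rereduce" pass. All names here (`CachedReduction`, `reduce_count`) are hypothetical, and batches in a list are a simplification of whatever structure a real view index uses to cache reductions.

```python
def reduce_count(keys, values, rereduce):
    # First pass: count the raw rows. Rereduce pass: add up earlier counts.
    return sum(values) if rereduce else len(values)


class CachedReduction:
    """Hypothetical cache of one partial reduction per batch of rows."""

    def __init__(self, reduce_fun):
        self.reduce_fun = reduce_fun
        self.partials = []        # stored partial results, one per batch
        self.rows_reduced = 0     # raw rows ever passed to reduce

    def add_batch(self, rows):
        """Reduce only the new (key, value) rows and remember the result."""
        keys = [k for k, _ in rows]
        values = [v for _, v in rows]
        self.partials.append(self.reduce_fun(keys, values, False))
        self.rows_reduced += len(rows)

    def result(self):
        # Combine cached partials; raw rows are never touched again.
        return self.reduce_fun(None, self.partials, True)
```

A count is a handy example because the two passes differ: the first counts rows, the second sums counts.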

However, MongoDB doesn't require you to use Map/Reduce to run queries, and its Map/Reduce is not designed for day-to-day querying. Rather, one should use MongoDB Map/Reduce for data aggregation tasks.

Also notable is that in 1.8, MongoDB has added some additional functionality for those who are using Map/Reduce which allows you to merge output and build data across jobs. I did a write up on the changes a few weeks ago: http://blog.evilmonkeylabs.com/2011/01/27/MongoDB-1_8-MapRed...

(In the interest of Full Disclosure: I work for 10gen, the company behind MongoDB; I'm also working on a book about MongoDB.)

cdavid · on Feb 8, 2011

I think he meant that couchdb keeps stuff on disk (through btrees to be more precise) for the views so that when the corresponding database is updated, the map reduce does not run over the whole dataset, only on the new document.

As far as I know, this does not work on MongoDB.

Personally, I simply do not see the point of map reduce for something like couch: the whole point is to be able to parallelize jobs across multiple machines so that data does not need to move back and forth through the network. Debugging map reduce jobs is particularly painful, especially if you need to support non-programmers (SQL is used by many non-programmers in my experience).

rit · on Feb 8, 2011

MongoDB uses btree indexes as well, it just doesn't use Map/Reduce for its regular querying like CouchDB does. See my above comment on differences as well as a writeup I did on new output options in MongoDB's M/R system.

cdavid · on Feb 9, 2011

The original part of CouchDB's design is using btrees for views to hold temporary data for M/R jobs; btrees themselves are indeed used by MongoDB (and most DB engines).

The new stuff for M/R in mongodb looks interesting, thanks for the update.

arethuza · on Feb 8, 2011

You're definitely not the only one who likes Map/Reduce - storing arbitrarily shaped json docs and then having a set of fixed views over these seems to be a very good map for how I like to think about application data. Having to rigidly decompose structures into a relational data model always gave me indigestion, which I used to feel guilty about, whereas I'm much happier using a document oriented database for my own projects.

tony_landis · on Feb 8, 2011

Couchdb is awesome, and you are not alone.

Really, my only complaint is the inability to query and update in one command for quick document updates. SQL has the leg up here.

> UPDATE blah WHERE blah = blah;

Please, if someone from couchone is reading this, a similar feature would be killer.
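
In the meantime, one common workaround is to do it in two steps on the client: fetch the matching documents, then send the modified copies back in a single bulk request. The Python sketch below only builds the request body and makes no network calls. `build_bulk_update` is a hypothetical helper; the `{"docs": [...]}` shape is the body format of CouchDB's `_bulk_docs` endpoint, and each document needs its current `_id` and `_rev` for the write to be accepted.

```python
import json


def build_bulk_update(docs, where, changes):
    """Return a JSON body updating every doc for which `where(doc)` is true.

    `docs` must include `_id` and `_rev` so stale writes can be rejected.
    """
    updated = []
    for doc in docs:
        if where(doc):
            new_doc = dict(doc)      # copy; leave the caller's doc untouched
            new_doc.update(changes)  # apply the "SET" part
            updated.append(new_doc)
    return json.dumps({"docs": updated})
```

This is still not atomic: a document that changed between the read and the bulk write comes back as a conflict and has to be retried.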

dlsspy · on Feb 8, 2011

We know about this and we also want it. We had some preliminary planning last week on how, exactly, we want to approach this.

We can't commit to when we'd be able to deliver it just yet, but it's definitely high up on our lists.

jchrisa · on Feb 8, 2011

also that was about the best meeting I've had in weeks!

tony_landis · on Feb 9, 2011

I hope it can be worked in, thanks for the reply

necubi · on Feb 8, 2011

I love it. There is nothing else that makes it so easy to replicate between various systems. And with its HTTP API, it's possible to write javascript apps that talk directly to the db, without a need for any server-side code.

chapel · on Feb 8, 2011

You're welcome, and not alone. CouchDB is awesome, but it is a matter of educating people about its strengths. I think there are a lot of people using MongoDB when they would be better off using CouchDB for their data.

Andrex · on Feb 8, 2011

Ha! Chapel. What a serendipitous meaning.

Yeah, thanks for introducing me to CouchDB.

mckoss · on Feb 8, 2011

We had the Membase team visit the Seattle Google Tech User's group last month. We had about 100 folks in attendance - and I would say we were pretty impressed with the demos we saw of Membase - especially the very nice administration dashboard and the ease of administering a large cluster.

BTW - the official merger announcement just dropped in my inbox:

http://www.membase.com/merger

kordless · on Feb 8, 2011

Loggly has been using Membase for a good while now. We're seriously excited about this merger - CouchDB is a great choice for the persistent data store side of things...and fast as hell.

js4all · on Feb 8, 2011

Great to see CouchOne getting momentum. Their commitment to mobile solutions is a great direction.

Additionally there are so many use cases in the fields of data warehousing and analytics. I am sure that CouchDB will replace many cube-based solutions in the future. There is great potential for companies focusing on CouchDB, like CouchOne and also Cloudant.

benblack · on Feb 8, 2011

Was this comment produced by a Markov generator? No part of it makes any sense.

js4all · on Feb 10, 2011

No, I tried to include every aspect into a short statement. Which part needs clarification? I'll be happy to elaborate.

rch · on Feb 8, 2011

Apache CouchDB will become best-of-breed software sometime after couchio/couchone/couchbase is out of the picture. The technical foundation is there, but the culture is just 'off' somehow.

smoody · on Feb 8, 2011

I'm glad to hear you'll be supporting both interfaces. And I'm looking forward to hearing details about how the two engines will be integrated going forward.

donpark · on Feb 8, 2011

All great except I can't fathom the decision to go with a wacky name like Couchbase when they already have a great name: Membase.

janl · on Feb 8, 2011

We believe the Couch-part is worth mentioning given its unique capabilities (real-time Map/Reduce, Sync) and that the new name should reflect that it is a merger of the two previous companies CouchOne and Membase.

velly · on Feb 8, 2011

Map/Reduce is a key feature that is required by the "one true datastore" along with the elegance of on the fly indexed views ;)

daleharvey · on Feb 8, 2011

and a complimentary folk song courtesy of Claire

http://www.youtube.com/watch?v=C4mHHuc4NPc

KevBurnsJr · on Feb 8, 2011

Will it be free?

dlsspy · on Feb 8, 2011

Everything remains open source and community editions remain freely available and fully featured. From that perspective, nothing really changes except each will get the best of what the other has to offer.

patrickaljord · on Feb 8, 2011

Even the new mobile stuff?

jchrisa · on Feb 8, 2011

my next step is to clean up the mobile build so we can get it in a proper public repo.

patrickaljord · on Feb 8, 2011

How are you going to make it work on iphone by the way? I doubt spidermonkey and erlang will be accepted there. What's the plan?

janl · on Feb 8, 2011

We're waiting for the App Store Review team to get back to us before we're announcing things. I hope it'll be soon, can't wait!

jchrisa · on Feb 8, 2011

(from Jan's mail to the Apache dev@ list)

It is simple, really: at CouchOne we were 100% committed on the Open Source side of things and at Couchbase we will continue to do so at the same degree. In terms of organisation, Couchbase will be its own independent Open Source project that has Apache CouchDB and memcached as dependencies, but adds a few things of its own that warrant being its own project. Our combined engineering team, led by Damien, will continue to contribute to Apache CouchDB in the same way as we've been to date, only more.

phlux · on Feb 8, 2011

Anyone running FusionIO products on membase or couchDB layers?

If not - take a look at them.