At first I was in love with it and now I just like it. It is a good tool for some things but it is not a universal replacement for everything database-like.
I like:
* futon (yes, I said it, I like couch because of a pretty GUI). I can quickly look at the database and debug what the problem is. Visibility into the database, the serialization format and the protocol are important to me.
* HTTP interface. Access it with curl or any other tools that speak HTTP.
* Documents are json. I like json.
* Can attach binary blobs to documents and tag them with MIME content types. Can serve this data to a web browser.
* Durable -- when my request finished, I know the data is on the disk.
* Map/Reduce -- I just like it. It makes sense to me.
Don't like:
* Not very good for high-rate updates. Need to compact it periodically.
* Not that fast. I know I can tweak it here and there, but by default it is on the slow end of the NoSQL DB speed range. I can live with this.
* MVCC. For some people this is a GoodThing; for me it is a "Meh". Sometimes I would like to just update the document in place and overwrite the old one. Perhaps Redis or Mongo would be a better tool here. For now I just do it in a loop: read, update, write back, and on a failed write re-read to get the current _rev and try again.
* Map/Reduce -- Am I the only one who likes this? It is hard to get others onboard and start thinking of this instead of traditional queries.
Things I don't care about but others like:
* Offline replication. Others rave about it. I used it only a couple of times. Always afraid to end up with conflicts silently buried in the DB history.
* Javascript -- don't really like it that much. I use the Python view server to write views. It is actually faster!
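For what it's worth, the read / update / write-back loop from the MVCC bullet can be sketched roughly like this. It is a minimal sketch against an in-memory stand-in, not the real CouchDB API: `FakeStore`, `Conflict` and `update_with_retry` are hypothetical names, revisions are plain integers rather than CouchDB's revision strings, and a real client would do GET/PUT over HTTP and treat a 409 response as the conflict.

```python
import copy


class Conflict(Exception):
    """Stand-in for the HTTP 409 CouchDB returns when _rev is stale."""


class FakeStore:
    """Hypothetical in-memory store that mimics _rev checking on write."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # Return a copy so callers cannot mutate stored state directly.
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        # Reject the write if the caller's _rev is not the latest one.
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        stored = copy.deepcopy(doc)
        stored["_rev"] = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def update_with_retry(store, doc_id, change, max_tries=5):
    """Read, apply `change`, write back; on conflict re-read and try again."""
    for _ in range(max_tries):
        doc = store.get(doc_id)   # picks up the current _rev
        change(doc)               # mutate the fresh copy in place
        try:
            return store.put(doc)
        except Conflict:
            continue              # someone else wrote first; start over
    raise Conflict(doc_id)
```

The `change` callback gets re-applied to a fresh copy on every attempt, so it should be safe to run more than once.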
I agree on most of your points, but I use Node.js so the Javascript stuff is just great. I also like Couchapps so in any case, CouchDB is just fantastic for those.
I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes. The fact that CouchDB stores the views the way it does makes complex queries fast and easy to use. Maybe there is a little initial overhead in writing the views, but after that it is efficient and leaves a lot of code out of the program, so you are getting the data you want in the way you want it.
"I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes"
Maybe I misunderstand what you mean by stored indexes, but MongoDB supports indexes (http://www.mongodb.org/display/DOCS/Indexes), it just doesn't require them so you can perform exploratory ad-hoc queries when you need to.
I used the wrong term, I apologize. From what I understand MongoDB doesn't store the results from Map/Reduce calculation and treats each query using Map/Reduce separately and on demand. Which means that if you have a large data set, your M/R might be too computationally demanding to run every time.
Whereas with CouchDB, every time the view (the M/R results) is accessed, it gets incrementally updated and stored. So even complex M/R queries on large datasets will be fast on subsequent accesses.
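A toy sketch of that difference, in Python: the view keeps its mapped rows plus the last sequence number it has seen, and on access it only maps documents that changed since then. `ToyDB`, `ToyView` and the sequence bookkeeping are made up for illustration. As far as I understand it, the real thing keeps this in B-trees on disk and also stores intermediate reduce values, which this sketch does not attempt (it re-reduces the stored rows on every query).

```python
class ToyDB:
    """Hypothetical database that records every write in a change log."""

    def __init__(self):
        self.seq = 0
        self.docs = {}      # doc_id -> latest document
        self.changes = []   # (seq, doc_id), one entry per write, in order

    def save(self, doc_id, doc):
        self.seq += 1
        self.docs[doc_id] = doc
        self.changes.append((self.seq, doc_id))


class ToyView:
    """Stored view: remembers mapped rows and how far it has read."""

    def __init__(self, db, map_fn, reduce_fn):
        self.db = db
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn
        self.rows = {}      # doc_id -> list of (key, value) rows
        self.last_seq = 0   # highest sequence already indexed
        self.mapped = 0     # how many times map_fn has been run

    def _refresh(self):
        # seq starts at 1 and grows by 1, so this slice is "everything newer".
        changed = {doc_id for _, doc_id in self.db.changes[self.last_seq:]}
        for doc_id in changed:
            # Re-map only the changed documents, replacing their old rows.
            self.rows[doc_id] = list(self.map_fn(self.db.docs[doc_id]))
            self.mapped += 1
        self.last_seq = self.db.seq

    def query(self):
        self._refresh()
        values = [value for rows in self.rows.values() for _, value in rows]
        return self.reduce_fn(values)
```

The `mapped` counter is only there so you can see that a second query after one new write runs the map function once, not once per document.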
Just some clarification here as comparing MongoDB and CouchDB's Map/Reduce is a bit of "Apples and Oranges" as they are designed for different purposes.
While in CouchDB all queries are created with Map/Reduce, in MongoDB Map/Reduce is designed for aggregation. There is a separate system for standard querying which performs much better than Map/Reduce. (Before people jump on me here: my performance comparison is Mongo queries to MONGO Map/Reduce, not Couch Map/Reduce or Map/Reduce in general. Not looking to get into a benchmark discussion here, just feature clarification.)
The two differ greatly---while Couch requires precalculation of "Views", MongoDB focuses on dynamic querying. Given CouchDB's use of Map/Reduce for these static views, the way they do the iterative addition of new data without re-reducing the entire dataset makes sense.
However, MongoDB doesn't require you to use Map/Reduce to run queries, and its Map/Reduce is not designed for day-to-day querying. Rather, one should use MongoDB Map/Reduce for data aggregation tasks.
Also notable is that in 1.8, MongoDB has added some additional functionality for those who are using Map/Reduce which allows you to merge output and build data across jobs. I did a write up on the changes a few weeks ago: http://blog.evilmonkeylabs.com/2011/01/27/MongoDB-1_8-MapRed...
(In the interest of Full Disclosure: I work for 10gen, the company behind MongoDB; I'm also working on a book about MongoDB.)
I think he meant that couchdb keeps stuff on disk (through btrees to be more precise) for the views so that when the corresponding database is updated, the map reduce does not run over the whole dataset, only on the new document.
As far as I know, this does not work on MongoDB.
Personally, I simply do not see the point of map reduce for something like couch: the whole point is to be able to parallelize jobs on multiple machines so that data does not need to move back and forth through the network. Debugging them is particularly painful, especially if you need to support non-programmers (SQL is used by many non-programmers in my experience).
MongoDB uses btree indexes as well, it just doesn't use Map/Reduce for its regular querying like CouchDB does. See my above comment on differences as well as a writeup I did on new output options in MongoDB's M/R system.
The original part of CouchDB's design is using B-trees for views to hold temporary data for M/R jobs; B-trees themselves are indeed used by MongoDB (and most DB engines).
The new stuff for M/R in mongodb looks interesting, thanks for the update.
You're definitely not the only one who likes Map/Reduce - storing arbitrarily shaped json docs and then having a set of fixed views over these seems to be a very good map for how I like to think about application data. Having to rigidly decompose structures into a relational data model always gave me indigestion, which I used to feel guilty about, whereas I'm much happier using a document oriented database for my own projects.
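To make "arbitrarily shaped docs, fixed views" a bit more concrete, here is a rough sketch in plain Python. CouchDB's default views are JavaScript functions that call `emit(key, value)`; the generator below only imitates that shape, and `posts_by_tag`, `build_view` and the document fields are hypothetical.

```python
def posts_by_tag(doc):
    """Map function: one (tag, title) row per tag on a post.

    Documents of other types, or posts without tags, emit nothing.
    """
    if doc.get("type") != "post":
        return
    for tag in doc.get("tags", []):
        yield tag, doc.get("title", "(untitled)")


def build_view(docs, map_fn):
    """Run the map function over every document and sort rows by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows)
```

The point is that the map function decides which documents matter and tolerates missing fields, so nothing has to be decomposed into a rigid schema up front.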
I love it. There is nothing else that makes it so easy to replicate between various systems. And with its HTTP API, it's possible to write javascript apps that talk directly to the db, without a need for any server-side code.
You're welcome, and not alone. CouchDB is awesome but it is a matter of educating people about its strengths. I think there are a lot of people using MongoDB when they would be better off using CouchDB for their data.
We had the Membase team visit the Seattle Google Tech User's group last month. We had about 100 folks in attendance, and I would say we were pretty impressed with the demos we saw of Membase, especially the very nice administration dashboard and the ease of administering a large cluster.
BTW - the official merger announcement just dropped in my inbox:
Loggly has been using Membase for a good while now. We're seriously excited about this merger - CouchDB is a great choice for the persistent data store side of things...and fast as hell.
Great to see CouchOne getting momentum. Their commitment to mobile solutions is a great direction.
Additionally there are so many use cases in the fields of data warehousing and analytics. I am sure that CouchDB will replace many cube-based solutions in the future. There is great potential for companies focusing on CouchDB, like CouchOne and also Cloudant.
Apache CouchDB will become best-of-breed software sometime after couchio/couchone/couchbase is out of the picture. The technical foundation is there, but the culture is just 'off' somehow.
I'm glad to hear you'll be supporting both interfaces. And I'm looking forward to hearing details about how the two engines will be integrated going forward.
We believe the Couch-part is worth mentioning given its unique capabilities (real-time Map/Reduce, Sync) and that the new name should reflect that it is a merger of the two previous companies CouchOne and Membase.
Everything remains open source and community editions remain freely available and fully featured. From that perspective, nothing really changes except each will get the best of what the other has to offer.
It is simple, really: at CouchOne we were 100% committed on the Open Source side of things and at Couchbase we will continue to be, to the same degree. In terms of organisation, Couchbase will be its own independent Open Source project that has Apache CouchDB and memcached as dependencies, but adds a few things of its own that warrant being its own project. Our combined engineering team, led by Damien, will continue to contribute to Apache CouchDB in the same way as we have to date, only more.
I do sometimes feel like I'm the only one using CouchDB, though... Hopefully this gets more devs on board with it.