MongoDB 1.6 stable released (mongodb.org)
107 points by pierrefar on Aug 5, 2010 | 58 comments



Many people expect MongoDB to be the standard datastore for high-performance Drupal sites come the Drupal 7 release. See this session at the last drupalcon: http://sf2010.drupal.org/conference/sessions/mongodb-humongo...

examiner.com is switching to Drupal soonish and has funded most of the work to write the Drupal/MongoDB integration.


Mongo looks like it's emerging as the preferred NoSQL platform on Rails, at least.

As an admittedly rough measure, check out the number of "watchers" on github for each project's leading plugin:

1106 - MongoMapper http://github.com/jnunemaker/mongomapper

536 - CouchREST http://github.com/jchris/couchrest

312 - Cassandra http://github.com/fauna/cassandra

I've only built one Rails app with NoSQL in it, but at first glance at least, Mongo seemed the most fully baked of the group.


Don't forget:

695 - Mongoid http://github.com/durran/mongoid

It seems to be the Rails/Mongo project to watch. They've got a slick-looking homepage too: http://mongoid.org

Hoping to try using it on my next project.


We've used both MongoMapper and Mongoid, and are very happy with Mongoid. It has nice ActiveModel integration, so it works quite well with Rails 3.


Seconded, using Mongoid over MM here too. It rocks.


Currently the choice between MongoMapper and Mongoid is pretty simple. If you're using Rails 2.3.x, MongoMapper. If you're using Rails 3, Mongoid. The ActiveModel integration is key for Rails 3. Mongoid also has great documentation; MongoMapper does not have as much consolidated documentation.


I'd also recommend checking out my MongoModel project, especially if you have been dissatisfied with the way MongoMapper or Mongoid do things.

http://github.com/spohlenz/mongomodel http://www.mongomodel.org

It's not quite as mature, but more documentation updates are currently in progress.


Not sure I totally understand, but it looks like it's just a rewrite of MongoMapper for Rails 3? Seems like the structure is very similar.

What didn't you like about MongoMapper and/or Mongoid? Starting from scratch is a big job.


Firstly, MongoModel was started back in November '09, when both MongoMapper and Mongoid were much less mature than they are now. I spent some time contributing patches to MongoMapper but stumbled as I was working on edge Rails.

MongoMapper in particular has come a long way since then but still has a few issues in my mind. For example, attributes are converted to/from their mongo representation every single time they are accessed, rather than only when the model is saved or loaded. I also believe MongoModel has a nicer model for typecasting property values (see http://gist.github.com/287379 for an example), and I disagree with the use of has_many associations for embedded collections.
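To make that distinction concrete, here's a toy Python sketch (not MongoMapper's or MongoModel's actual code) of per-access casting versus casting only at the load/save boundary:

    # Toy sketch only -- not actual MongoMapper/MongoModel code.
    from datetime import date

    class PerAccessModel:
        """Re-casts the raw Mongo value on every attribute read."""
        def __init__(self, raw):
            self._raw = raw  # e.g. {"born": "1990-01-02"}

        @property
        def born(self):
            return date.fromisoformat(self._raw["born"])  # cast per access

    class BoundaryModel:
        """Casts once on load and once on save."""
        def __init__(self, raw):
            self.born = date.fromisoformat(raw["born"])  # cast on load

        def to_mongo(self):
            return {"born": self.born.isoformat()}  # cast back on save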

MongoModel isn't perfect either, but it has been the path of least resistance for me in getting my app running on MongoDB.


Cool, thanks for the explanation. Btw, Mongoid has a new association DSL that I think makes way more sense: embedded_in, embeds_many, referenced_in, references_many. I bet the project would love to see you as a contributor.


To say 'preferred NoSQL' is far too generic; it is as if users have a choice between SQL and NoSQL. I would say that MongoDB is becoming the preferred document-based data store, but that has nothing to do with the key-value, graph, column-oriented, etc. stores that make up what people refer to as 'NoSQL' (which is a loaded term).


I've used MongoMapper for a couple projects and been really happy with it. John Nunemaker is doing a great job with it.


Does anyone use Riak? How does it compare with Mongo?


We host both for our users, and use Riak for our own datastore. They're both awesome, but quite different.

Here's a pretty thorough comparison: https://wiki.basho.com/display/RIAK/Riak+Compared+to+MongoDB

Note: written by the authors of Riak, but MongoDB contributors chimed in on the comments.


MongoEngine is pretty tight for Python/Mongo.
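For anyone curious what that looks like, a minimal MongoEngine sketch (the model and fields are made up for illustration):

    from mongoengine import Document, IntField, StringField, connect

    connect("blog")  # mongodb://localhost:27017/blog by default

    class Post(Document):
        title = StringField(required=True, max_length=200)
        votes = IntField(default=0)

    Post(title="MongoDB 1.6 released").save()
    print(Post.objects(votes__gte=0).count())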


As soon as they feature document-level locking (maybe MVCC), I'll happily use it. But locking all databases when a write is occurring is a bit of a downer for large apps that require concurrency.


http://www.mongodb.org/display/DOCS/Atomic+Operations says:

"MongoDB supports atomic operations on single documents. MongoDB does not support traditional locking and complex transactions for a number of reasons. . ."

What made you think that it was "locking all databases when a write is occurring"?


MongoDB has a single, global read/write lock for the entire database, so when a write is actually processing, nothing else can happen. In a moderately write-heavy environment, a single slow write can result in pretty bad performance, in my experience. Reads can happen at the same time, although there have been problems with not yielding the lock on scans; there are changes in 1.6 to address that.


This used to be the case in earlier versions, but now it only takes a server-level write lock for atomic updates; reads can still happen.

It's not ideal, but it is an improvement.


Basically what robotadam said (http://news.ycombinator.com/item?id=1579212). To clarify: the database lock is actually for the whole process (i.e. all databases). I had an especially bad experience with it before they introduced yielding for long-running operations. But even with the yielding, write-heavy parallel workloads still suffer, especially if your indexes ever get too big and the system decides to swap part of the index in or out. I've seen 5-6 second inserts, with hundreds of writes and reads piling up in the meantime (even without the swap part).

If only they had an equivalent of InnoDB as a storage engine, it would be awesome.


This is also pertinent:

http://www.mongodb.org/display/DOCS/How+does+concurrency+wor...

Note that whilst a write blocks all other writes, the time you need to wait is the time it takes for the in-memory data structure to be modified; you don't have to wait for the write to hit disk. (Writes are only persisted to disk every so often.)
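A client that does need the write on disk can ask for that explicitly. A rough pymongo sketch (modern API; in the 1.6 era the same idea was a getLastError call with the fsync option):

    from pymongo import MongoClient, WriteConcern

    db = MongoClient().test
    fast = db.events  # default: don't wait for the flush to disk
    durable = db.events.with_options(
        write_concern=WriteConcern(w=1, fsync=True))  # wait for disk

    fast.insert_one({"kind": "click"})
    durable.insert_one({"kind": "payment"})  # slower, survives a crash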


Unless you actually don't have enough RAM and that data structure has to be paged in... that's a painfully long wait


Interesting! Did not know that.

Any idea how replica sets / shards might fit into this picture?


It would help keep the index size down (well... distribute it, at least), and a lock on one shard will not influence the other shards. I asked a few months back if I could just run several shards on a single machine and keep the locked area smaller that way, but the general consensus was that it wasn't a good idea because of the overhead. Other than that, the lock is still there...


Unlikely IMO. Update-in-place is one major feature of MongoDB. If you need MVCC you may be better off using CouchDB or Riak.


Nobody seems to like my opinion, but MongoDB kept eating my data regularly and silently. My writeups are here:

http://www.korokithakis.net/node/116 http://www.korokithakis.net/node/119

I would love to answer questions if I could figure out how to easily see my replies on HN :/


1.3.3 was a development version, as is every odd-numbered release: http://www.mongodb.org/display/DOCS/Version+Numbers


I know that now :P Regardless, the problems continued with 1.4.


My understanding is that you used v1.3.3 then switched to v1.4.0, is that right?

Are you experiencing data loss with 1.4.0?


Yes, I mark the point where I switched to 1.4 in my posts (and the same about the 64-bit version). In one case (recently, with 1.4.1 64-bit) I had lost half my data in a crash, and restored from week-old backups only to discover that the data was missing back then too. It turned out to be silent corruption from days earlier...


Are you using the stable MongoDB?


I started with the unstable version and then switched to the stable one, but the problems continued. If I remember correctly, the problems with 1.3 were not significantly more than 1.4.


I and many other people are using MongoDB just fine, with no data loss and way more data and way more queries per second than you. If you are going to make big accusations like this, you need to back them up with a test case that reproduces your error. Otherwise it's just yet another useless rant/bug report. I too could write a "Cassandra lost my data" blog post with no proof whatsoever; it wouldn't help. Also, the 32-bit argument is ridiculous; it is not supposed to store more than 2GB anyway, so don't use it for that.


See if you can make SQLite lose your data, I'd like to see that...


You can see replies by hitting the "threads" link at the top of the page.

Reasons (from my understanding) that you had a bad experience:

    1. You used a development, unstable version of software in production.
    2. 32-bit Mongo cannot store more than 2GB of data; this is a very public, known admission.
The reason that 'nobody seems to like your opinion' is that your post got real famous, and while mistakes happen, the internet was in a bit of a frenzy at the time over the whole NoSQL thing, and so you became sort of a poster child for various sides that people had chosen. Sucks, but it happens. I had a similar experience with my first blog post ever, so I feel your pain...


Thanks for the "threads" tip, it's really handy. About your points:

1. This is true (there was no warning that I saw when I was downloading it, although one was added later, or maybe I didn't notice it). However, I upgraded to the stable version when it came out to give MongoDB a second chance, because it sounds very good in theory, and I had the same (if not bigger) problems.

2. I hadn't known about it, and, no matter how public it is, the server could just refuse to store more data. Silently corrupting two documents for every document inserted is inexcusable, even if your database was forged in the pits of hell.

I think your point about the poster child thing is true. While I meant my post as a sort of "MongoDB didn't look too production-ready to me, but I hope it gets there eventually", people became really polarized and took it either as "this guy is right, MongoDB sucks" or "this guy is an idiot, silently corrupting data is perfectly acceptable if there's a notice on the website"...


In retrospect, I have these observations:

* The corruption of data might be excused in the development branch.

* The silent corruption of the data when the server goes past its limit for a 32-bit DB cannot be excused, since the server could die, at the very least.

* The corruption of data due to the process being killed because the connection dropped wasn't MongoDB's fault.

* Requiring 9 GB of RAM for 5 indexes doesn't sit very well with me...

* Silently corrupting data for me to find out days later is not something a stable, unstable or toy database should do...


Some great improvements and one step closer to full-text search, which is slated for 1.7.x.

http://jira.mongodb.org/browse/SERVER-380


You can do pretty neat things with indexes and regexp queries. The mongo-search project is also pretty cool (does Porter stemming, indexing, etc).
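For example, a left-anchored regex can use an ordinary index, which is what makes those neat things feasible (a pymongo sketch; collection and field names are invented):

    from pymongo import MongoClient

    posts = MongoClient().blog.posts
    posts.create_index("title")

    # A prefix match like ^mongo can walk the title index; an unanchored
    # or case-insensitive regex falls back to scanning every document.
    for post in posts.find({"title": {"$regex": "^mongo"}}):
        print(post["title"])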

That said, if you need powerful "full-text searching" I would look to something like Lucene or Xapian.


Thanks, just saw this at the bottom, which looks interesting:

http://github.com/glamkit/mongo-search


Replica sets look awesome. Can't wait to get them set up in production. Replacing a master/slave setup with this means automatic failover.

Looking forward to single server durability in the next one (1.8). Should enable me to convince more clients to add mongo as part of the deployment stack.
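For what it's worth, the client side of that failover is mostly just listing the set members in the connection URI; the driver tracks which member is primary (a pymongo sketch with modern syntax; hosts and set name are invented):

    from pymongo import MongoClient

    client = MongoClient(
        "mongodb://db1.example.com,db2.example.com,db3.example.com"
        "/?replicaSet=rs0")

    # After a failover the driver discovers the new primary and routes
    # writes there; no manual master/slave promotion scripts needed.
    client.app.events.insert_one({"ok": True})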


I've also got a master/slave setup and am looking forward to replica sets. It's great to see sharding is production ready. It's now there when I need it.


I've got a replica pair. I wonder how easily I can migrate it to a replica set.


Upgrading From Replica Pairs or Master/Slave:

http://www.mongodb.org/display/DOCS/Upgrading+to+Replica+Set...


I installed 1.6 on my laptop today, but I will wait a week before starting to update servers. Anyway, I have never had data collections large enough to need sharding, but replica sets look like a much better alternative to frequent snapshots to S3.

A year ago I was equally enthusiastic about MongoDB, CouchDB, and Cassandra. However, at least for the modest scale work that I do I don't really need Cassandra, and MongoDB is so easy to work with. I still really like CouchDB but I have never had a customer request its use, so my experience is limited to just using it for my own stuff.


Yay! Great news, just in time for my startup launch. Kudos to MongoDB.


Hi Pune :) I was in I2IT some years ago... nice place!


Definitely getting huMongous - looks like it's used by justin.tv and foursquare - good enough for me. By the way, there is a MongoDB shell on the homepage with tryruby.org-like mini tutorials => http://www.mongodb.org/ It convinced me enough to get interested and download Mongoid as well :-)

Hmm, now all that's left is an online hosting solution like Couch.io


Also Mongo Machine: http://mongomachine.com/



And there's a Heroku add-on that supports it as well.


You can also use MongoHQ directly from Heroku if you want.

I described the steps here:

http://blog.logeek.fr/2010/6/29/sinatra-heroku-mongodb-mongo...
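The core of the pattern is just reading the connection URI the add-on provides; a sketch (assuming the add-on exposes it in a MONGOHQ_URL environment variable):

    import os
    from pymongo import MongoClient

    client = MongoClient(os.environ["MONGOHQ_URL"])  # assumed env var name
    db = client.get_default_database()  # the database named in the URI
    db.visits.insert_one({"path": "/"})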



I was excited to see a new movement in database systems that provide a different way of storing data outside the relational model; however, a couple of things bother me. Firstly, calling it NoSQL is just complete disrespect of SQL and relational databases. Could there not have been a more mature response, terming it as something relevant?

Secondly, what scares me (and I find hilarious) is people who so quickly jump onto this movement, moving the entirety of their critical data without understanding the potential downfalls, such as data loss with no warning. As a key-value store for non-critical data, this kind of thing is brilliant, and data loss can be managed; it's maybe not tolerated in a high-throughput environment where "cache misses" are a concern, but otherwise, yeah, it's great. Still, look at Facebook, Twitter, and FriendFeed, who are all still using MySQL and scaling out in their own way.


Already commented about Debian here: http://news.ycombinator.com/item?id=1581231


I don't get why they still stick with the AGPL. There are a lot more _companies_ built on products with licenses other than the AGPL, such as the wildly popular Apache projects. Even Microsoft releases code under the Apache license, but not the MongoDB guys!


There are successful companies built around all sorts of licensing schemes, but that doesn't mean that you are in a position to say what's best for 10gen's specific situation.

Also, all of the official client libraries for mongo are Apache licensed, it's only the core server that is AGPL (which means that you can use mongo in closed source applications - only if you make changes to the actual core database do you have to give those changes back to the community, and even then you still don't have to open source the rest of your application).


You're in no position to tell me what I say! It's simply awkward to use a license like that. Also, you yourself admitted the conflict of client vs. server licensing. Why is this so? Because they want the widest usage. They release open-source software not because they love developers but because they have to. Returning to the point: Apache and GPL projects have strong commercial backing, from big companies to start-ups, and knowing this, I'm asking because there must be a real reason why they chose the AGPL over the GPL or Apache. BTW, it's their choice; I respect that as well.



