MongoDB 1.6 stable released (mongodb.org)
107 points by pierrefar on Aug 5, 2010 | 58 comments



Many people expect MongoDB to be the standard datastore for high-performance Drupal sites come the Drupal 7 release. See this session at the last drupalcon: http://sf2010.drupal.org/conference/sessions/mongodb-humongo...

examiner.com is switching to Drupal soonish and has funded most of the work to write the Drupal/MongoDB integration.


Mongo looks like it's emerging as the preferred NoSQL platform on Rails, at least.

As an admittedly rough measure, check out the number of "watchers" on github for each project's leading plugin:

1106 - MongoMapper http://github.com/jnunemaker/mongomapper

536 - CouchREST http://github.com/jchris/couchrest

312 - Cassandra http://github.com/fauna/cassandra

I've only built one Rails app with NoSQL in it, but at first glance at least, Mongo seemed the most fully baked of the group.


Don't forget:

695 - Mongoid http://github.com/durran/mongoid

It seems to be the Rails/Mongo project to watch. They've got a slick-looking homepage too: http://mongoid.org

Hoping to try using it on my next project.


We've used both MongoMapper and Mongoid, and are very happy with Mongoid. It has nice ActiveModel integration, so it works quite well with Rails 3.


Seconded, using Mongoid over MM here too. It rocks.


Currently the choice between MongoMapper and Mongoid is pretty simple. If you're using Rails 2.3.x, MongoMapper. If you're using Rails 3, Mongoid. The ActiveModel integration is key for Rails 3. Mongoid also has great documentation; MongoMapper does not have as much consolidated documentation.


I'd also recommend checking out my MongoModel project, especially if you have been dissatisfied with the way MongoMapper or Mongoid do things.

http://github.com/spohlenz/mongomodel http://www.mongomodel.org

It's not quite as mature, but more documentation updates are currently in progress.


Not sure I totally understand, but it looks like it's just a rewrite of MongoMapper for Rails 3? Seems like the structure is very similar.

What didn't you like about MongoMapper and/or Mongoid? Starting from scratch is a big job.


Firstly, MongoModel was started back in November '09, when both MongoMapper and Mongoid were much less mature than they are now. I spent some time contributing patches to MongoMapper but stumbled as I was working on edge Rails.

MongoMapper in particular has come a long way since then but still has a few issues in my mind. For example, attributes are converted to/from their mongo representation every single time they are accessed, rather than only when the model is saved or loaded. I also believe MongoModel has a nicer model for typecasting property values (see http://gist.github.com/287379 for an example), and I disagree with the use of has_many associations for embedded collections.
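To make that distinction concrete, here's a toy Python sketch (not MongoMapper's or MongoModel's actual code) of per-access casting versus casting only at the load/save boundary:

    # Toy sketch only -- not actual MongoMapper/MongoModel code.
    from datetime import date

    class PerAccessModel:
        """Re-casts the raw Mongo value on every attribute read."""
        def __init__(self, raw):
            self._raw = raw  # e.g. {"born": "1990-01-02"}

        @property
        def born(self):
            return date.fromisoformat(self._raw["born"])  # cast per access

    class BoundaryModel:
        """Casts once on load and once on save."""
        def __init__(self, raw):
            self.born = date.fromisoformat(raw["born"])  # cast on load

        def to_mongo(self):
            return {"born": self.born.isoformat()}  # cast back on save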

MongoModel isn't perfect either, but it has been the path of least resistance for me in getting my app running on MongoDB.


Cool, thanks for the explanation. Btw, Mongoid has a new association DSL that I think makes way more sense: embedded_in, embeds_many, referenced_in, references_many. I bet the project would love to see you as a contributor.


To say 'preferred NoSQL' is far too generic; it is as if users have a choice between SQL and NoSQL. I would say that MongoDB is becoming the preferred document-based data store, but that has nothing to do with the key-value, graph, column-oriented, etc. stores that make up what people refer to as 'NoSQL' (which is a loaded term).


I've used MongoMapper for a couple projects and been really happy with it. John Nunemaker is doing a great job with it.


Does anyone use Riak? How does it compare with Mongo?


We host both for our users, and use Riak for our own datastore. They're both awesome, but quite different.

Here's a pretty thorough comparison: https://wiki.basho.com/display/RIAK/Riak+Compared+to+MongoDB

Note: written by the authors of Riak, but MongoDB contributors chimed in on the comments.


MongoEngine is pretty tight for Python/Mongo.
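For anyone curious what that looks like, a minimal MongoEngine sketch (the model and fields are made up for illustration):

    from mongoengine import Document, IntField, StringField, connect

    connect("blog")  # mongodb://localhost:27017/blog by default

    class Post(Document):
        title = StringField(required=True, max_length=200)
        votes = IntField(default=0)

    Post(title="MongoDB 1.6 released").save()
    print(Post.objects(votes__gte=0).count())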


As soon as they feature document-level locking (maybe MVCC), I'll happily use it. But locking all databases when a write is occurring is a bit of a downer for large apps that require concurrency.


http://www.mongodb.org/display/DOCS/Atomic+Operations says:

"MongoDB supports atomic operations on single documents. MongoDB does not support traditional locking and complex transactions for a number of reasons. . ."

What made you think that it was "locking all databases when a write is occurring"?


MongoDB has a single, global read/write lock for the entire database, so when a write is actually processing, nothing else can happen. In a moderately write-heavy environment, a single slow write can result in pretty bad performance, in my experience. Reads can happen at the same time, although there have been problems with not yielding the lock on scans; there are changes in 1.6 to address that.


This used to be the case in earlier versions, but now it only takes a server-level write lock for atomic updates; reads can still happen.

It's not ideal, but it is an improvement.


Basically what robotadam said (http://news.ycombinator.com/item?id=1579212). To clarify: the database lock is actually for the whole process (i.e. all databases). I had an especially bad experience with it before they introduced yielding for long-running operations. But even with the yielding, write-heavy parallel workloads still suffer, especially if your indexes ever get too big and the system decides to swap part of the index in or out. I've seen 5-6 second inserts, with hundreds of writes and reads piling up in the meantime (even without the swap part).

If only they had an equivalent of InnoDB as a storage engine, it would be awesome.


This is also pertinent:

http://www.mongodb.org/display/DOCS/How+does+concurrency+wor...

Note that whilst a write blocks all other writes, the time you need to wait is the time it takes for the in-memory data structure to be modified; you don't have to wait for the write to hit disk. (Writes are only persisted to disk every so often.)
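A client that does need the write on disk can ask for that explicitly. A rough pymongo sketch (modern API; in the 1.6 era the same idea was a getLastError call with the fsync option):

    from pymongo import MongoClient, WriteConcern

    db = MongoClient().test
    fast = db.events  # default: don't wait for the flush to disk
    durable = db.events.with_options(
        write_concern=WriteConcern(w=1, fsync=True))  # wait for disk

    fast.insert_one({"kind": "click"})
    durable.insert_one({"kind": "payment"})  # slower, survives a crash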


Unless you actually don't have enough RAM and that data structure has to be paged in... that's a painfully long wait


Interesting! Did not know that.

Any idea how replica sets / shards might fit into this picture?


It would help keep the index size down (well... distribute it, at least), and a lock on one shard will not influence the other shards. I asked a few months back if I could just run several shards on a single machine and keep the locked area smaller that way, but the general consensus was that it wasn't a good idea because of the overhead. Other than that, the lock is still there...


Unlikely IMO. Update-in-place is one major feature of MongoDB. If you need MVCC you may be better off using CouchDB or Riak.


Nobody seems to like my opinion, but MongoDB kept eating my data regularly and silently. My writeups are here:

http://www.korokithakis.net/node/116 http://www.korokithakis.net/node/119

I would love to answer questions if I could figure out how to easily see my replies on HN :/


1.3.3 was a development version, as is every odd-numbered release: http://www.mongodb.org/display/DOCS/Version+Numbers


I know that now :P Regardless, the problems continued with 1.4.


My understanding is that you used v1.3.3 then switched to v1.4.0, is that right?

Are you experiencing data loss with 1.4.0?


Yes, I mark the point where I switched to 1.4 in my posts (and the same about the 64-bit version). In one case (recently, with 1.4.1 64-bit) I had lost half my data in a crash, and restored from week-old backups only to discover that the data was missing back then too. It turned out to be silent corruption from days earlier...


Are you using the stable MongoDB?


I started with the unstable version and then switched to the stable one, but the problems continued. If I remember correctly, the problems with 1.3 were not significantly more than 1.4.


I and many other people are using MongoDB just fine, with no data loss and way more data and way more queries per second than you. If you are going to make big accusations like this, you need to back them up with a test case that reproduces your error. Otherwise it's just yet another useless rant/bug report. I too could write a "Cassandra lost my data" blog post with no proof whatsoever; it wouldn't help. Also, the 32-bit argument is ridiculous; it is not supposed to store more than 2GB anyway, so don't use it for that.


See if you can make SQLite lose your data, I'd like to see that...


You can see replies by hitting the "threads" link at the top of the page.

Reasons (from my understanding) that you had a bad experience:

    1. You used a development, unstable version of software in production.
    2. 32-bit Mongo cannot store more than 2GB of data; this is a very public, known admission.
The reason that 'nobody seems to like your opinion' is that your post got real famous, and while mistakes happen, the internet was in a bit of a frenzy at the time over the whole NoSQL thing, and so you became sort of a poster child for various sides that people had chosen. Sucks, but it happens. I had a similar experience with my first blog post ever, so I feel your pain...


Thanks for the "threads" tip, it's really handy. About your points:

1. This is true (there was no warning that I saw when I was downloading it, although one was added later, or maybe I didn't notice it). However, I upgraded to the stable version when it came out to give MongoDB a second chance, because it sounds very good in theory, and I had the same (if not bigger) problems.

2. I hadn't known about it, and, no matter how public it is, the server could just refuse to store more data. Silently corrupting two documents for every document inserted is inexcusable, even if your database was forged in the pits of hell.

I think your point about the poster child thing is true. While I meant my post as a sort of "MongoDB didn't look too production-ready to me, but I hope it gets there eventually", people became really polarized and took it either as "this guy is right, MongoDB sucks" or "this guy is an idiot, silently corrupting data is perfectly acceptable if there's a notice on the website"...


In retrospect, I have these observations:

* The corruption of data might be excused in the development branch.

* The silent corruption of the data when the server goes past its limit for a 32-bit DB cannot be excused, since the server could die, at the very least.

* The corruption of data due to the process being killed because the connection dropped wasn't MongoDB's fault.

* Requiring 9 GB of RAM for 5 indexes doesn't sit very well with me...

* Silently corrupting data for me to find out days later is not something a stable, unstable or toy database should do...


Some great improvements and one step closer to full-text search, which is slated for 1.7.x.

http://jira.mongodb.org/browse/SERVER-380


You can do pretty neat things with indexes and regexp queries. The mongo-search project is also pretty cool (does Porter stemming, indexing, etc).
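For example, a left-anchored regex can use an ordinary index, which is what makes those neat things feasible (a pymongo sketch; collection and field names are invented):

    from pymongo import MongoClient

    posts = MongoClient().blog.posts
    posts.create_index("title")

    # A prefix match like ^mongo can walk the title index; an unanchored
    # or case-insensitive regex falls back to scanning every document.
    for post in posts.find({"title": {"$regex": "^mongo"}}):
        print(post["title"])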

That said, if you need powerful "full-text searching" I would look to something like Lucene or Xapian.


Thanks, just saw this at the bottom, which looks interesting:

http://github.com/glamkit/mongo-search


Replica sets look awesome. Can't wait to get them set up in production. Replacing a master/slave setup with this means automatic failover.

Looking forward to single server durability in the next one (1.8). Should enable me to convince more clients to add mongo as part of the deployment stack.
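For what it's worth, the client side of that failover is mostly just listing the set members in the connection URI; the driver tracks which member is primary (a pymongo sketch with modern syntax; hosts and set name are invented):

    from pymongo import MongoClient

    client = MongoClient(
        "mongodb://db1.example.com,db2.example.com,db3.example.com"
        "/?replicaSet=rs0")

    # After a failover the driver discovers the new primary and routes
    # writes there; no manual master/slave promotion scripts needed.
    client.app.events.insert_one({"ok": True})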


I've also got a master/slave setup and am looking forward to replica sets. It's great to see sharding is production ready. It's now there when I need it.


I've got a replica pair. I wonder how easily I can migrate it to a replica set.


Upgrading From Replica Pairs or Master/Slave:

http://www.mongodb.org/display/DOCS/Upgrading+to+Replica+Set...


I installed 1.6 on my laptop today, but I will wait a week before starting to update servers. Anyway, I have never had data collections large enough to need sharding, but replica sets look like a much better alternative to frequent snapshots to S3.

A year ago I was equally enthusiastic about MongoDB, CouchDB, and Cassandra. However, at least for the modest scale work that I do I don't really need Cassandra, and MongoDB is so easy to work with. I still really like CouchDB but I have never had a customer request its use, so my experience is limited to just using it for my own stuff.


Yay! Great news, just in time for my startup launch. Kudos to MongoDB.


Hi Pune :) I was in I2IT some years ago... nice place!


Definitely getting huMongous - looks like it's used by justin.tv and foursquare - good enough for me. By the way, there is a MongoDB shell on the homepage with tryruby.org-like mini tutorials => http://www.mongodb.org/ It convinced me enough to get interested and download Mongoid as well :-)

Hmm, now all that's left is an online hosting solution like Couch.io


Also Mongo Machine: http://mongomachine.com/



And there's a Heroku add-on that supports it as well.


You can also use MongoHQ directly from Heroku if you want.

I described the steps here:

http://blog.logeek.fr/2010/6/29/sinatra-heroku-mongodb-mongo...
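The core of the pattern is just reading the connection URI the add-on provides; a sketch (assuming the add-on exposes it in a MONGOHQ_URL environment variable):

    import os
    from pymongo import MongoClient

    client = MongoClient(os.environ["MONGOHQ_URL"])  # assumed env var name
    db = client.get_default_database()  # the database named in the URI
    db.visits.insert_one({"path": "/"})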



I was excited to see a new movement in database systems that provide a different way of storing data outside the relational model; however, a couple of things bother me. Firstly, calling it NoSQL is just complete disrespect of SQL and relational databases. Could there not have been a more mature response, terming it as something relevant?

Secondly, what scares me (and I find hilarious) is people who so quickly jump onto this movement, moving the entirety of their critical data without understanding the potential downfalls, such as data loss with no warning. As a key-value store for non-critical data, this kind of thing is brilliant, and data loss can be managed; it's maybe not tolerated in a high-throughput environment where "cache misses" are a concern, but otherwise, yeah, it's great. Still, look at Facebook, Twitter, and FriendFeed, who are all still using MySQL and scaling out in their own way.


Already commented about Debian here: http://news.ycombinator.com/item?id=1581231


I don't get why they still stick with the AGPL. There are a lot more _companies_ built on products with licenses other than the AGPL, such as the wildly popular Apache projects. Even Microsoft releases code under the Apache license, but not the MongoDB guys!


There are successful companies built around all sorts of licensing schemes, but that doesn't mean that you are in a position to say what's best for 10gen's specific situation.

Also, all of the official client libraries for mongo are Apache licensed, it's only the core server that is AGPL (which means that you can use mongo in closed source applications - only if you make changes to the actual core database do you have to give those changes back to the community, and even then you still don't have to open source the rest of your application).


You're in no position to tell me what I say! It's simply awkward to use a license like that. Also, you yourself admitted the conflict of client vs. server licensing. Why is this so? Because they want the widest usage. They release open-source software not because they love developers but because they have to. Returning to the point: Apache and GPL projects have strong commercial backing, from big companies to start-ups, and knowing this, I'm asking because there must be a real reason why they chose the AGPL over the GPL or Apache. BTW, it's their choice; I respect that as well.



