Currently the choice between MongoMapper and Mongoid is pretty simple. If you're using Rails 2.3.x, MongoMapper. If you're using Rails 3, Mongoid. The ActiveModel integration is key for Rails 3. Mongoid also has great documentation. MongoMapper does not have as much consolidated documentation.
Firstly, MongoModel was started back in November '09 when both MongoMapper and Mongoid were much less mature than they are now. I spent some time contributing patches to MongoMapper but stumbled as I was working on edge Rails.
MongoMapper in particular has come a long way since then but still has a few issues in my mind. For example, attributes are converted to/from their mongo representation every single time they are accessed, rather than only when the model is saved or loaded. I also believe MongoModel has a nicer model for typecasting property values (see http://gist.github.com/287379 for an example), and I disagree with the use of has_many associations for embedded collections.
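To make the per-access conversion point concrete, here is a minimal plain-Ruby sketch. It does not use MongoMapper or MongoModel; the names `Converter`, `ConvertOnAccess`, and `ConvertOnLoad` are hypothetical, invented only to contrast converting a stored value on every read with converting it once when the document is loaded.

```ruby
require "time"

# Hypothetical converter: turns a stored string into a Time and counts calls.
module Converter
  @calls = 0

  class << self
    attr_accessor :calls

    def from_mongo(raw)
      self.calls += 1
      Time.parse(raw)
    end
  end
end

# Converts the raw stored value every time the attribute is read.
class ConvertOnAccess
  def initialize(doc)
    @doc = doc
  end

  def created_at
    Converter.from_mongo(@doc["created_at"])
  end
end

# Converts once when the document is loaded and keeps the typed value.
class ConvertOnLoad
  attr_reader :created_at

  def initialize(doc)
    @created_at = Converter.from_mongo(doc["created_at"])
  end
end

doc = { "created_at" => "2010-08-05 12:00:00 UTC" }

Converter.calls = 0
on_access = ConvertOnAccess.new(doc)
3.times { on_access.created_at }
access_calls = Converter.calls # one conversion per read

Converter.calls = 0
on_load = ConvertOnLoad.new(doc)
3.times { on_load.created_at }
load_calls = Converter.calls # one conversion in total

puts "convert on access: #{access_calls}, convert on load: #{load_calls}"
```

With three reads, the first style converts three times and the second only once, which is the overhead being described.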
MongoModel isn't perfect either, but it has been the path of least resistance for me in getting my app running on MongoDB.
Cool, thanks for the explanation. Btw, mongoid has a new association dsl that I think makes way more sense. embedded_in, embeds_many, referenced_in, references_many. I bet the project would love to see you as a contributor.
To say 'preferred NoSQL' is far too generic, it is as if users have an option between SQL or NoSQL. I would say that MongoDB is becoming the preferred document-based data store - but that has nothing to do with key value, graph, column-oriented, etc. stores that make up what people refer to as 'NoSQL' (which is a loaded term)
As soon as they feature document level locking (maybe MVCC), I'll happily use it. But locking all databases when a write is occurring is a bit of a downer for large apps that require concurrency.
"MongoDB supports atomic operations on single documents. MongoDB does not support traditional locking and complex transactions for a number of reasons. . ."
What made you think that it was "locking all databases when a write is occurring"?
MongoDB has a single, global read/write lock for the entire database. So when a write is actually processing, nothing else can happen. In a moderately write-heavy environment, a single slow write can result in pretty bad performance in my experience. Reads can happen at the same time, though there have been problems with scans not yielding the lock; there are changes in 1.6 to address that.
basically what robotadam said (http://news.ycombinator.com/item?id=1579212).
To clarify: the database lock is actually for the whole process (-> all databases).
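As a rough illustration of why a process-wide lock hurts concurrent writes, here is a toy Ruby sketch. It is not MongoDB code: `GLOBAL_LOCK`, `write`, and the database names are made up, and it models writes only (not the shared read side of the lock). Two threads write to different "databases" but share one mutex, so the second write cannot start until the slow first write releases it.

```ruby
# Toy model only: one mutex shared by every "database" in the process.
GLOBAL_LOCK = Mutex.new
EVENTS = Queue.new # thread-safe log of what happened, in order

def write(db, seconds, started = nil)
  GLOBAL_LOCK.synchronize do
    EVENTS << "#{db} start"
    started&.push(true) # tell the other thread the lock is now held
    sleep(seconds)      # stand-in for a slow write
    EVENTS << "#{db} end"
  end
end

lock_held = Queue.new

slow = Thread.new { write("blog", 0.2, lock_held) }
fast = Thread.new do
  lock_held.pop      # wait until the slow write holds the lock
  write("shop", 0.0) # different database, same global lock
end

[slow, fast].each(&:join)

log = []
log << EVENTS.pop until EVENTS.empty?
puts log.join(", ")
```

The "shop" write is trivial on its own, yet it only starts after the "blog" write has finished, because both go through the same lock.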
I had an especially bad experience with it before they introduced the yielding for long-running operations. But even with the yielding, write-heavy parallel things still suffer. Especially if your indexes ever get too big and the system decides to swap in/out part of the index. I've seen 5-6 second inserts with hundreds of writes and reads piling up in the meantime (even without the swap part).
If only they had an equivalent of innodb as a storage engine, it would be awesome
Note that whilst a write blocks all other writes, the time you need to wait is the time it takes for the in-memory data structure to be modified; you don't have to wait for the write to hit disk. (Writes are only persisted to disk every so often.)
It would help keeping the index size down (well... distribute it at least) and a lock on a shard will not influence the other shards. I asked a few months back if I could just run several shards on a single machine and keep the locked area smaller that way, but the general consensus was that it wasn't a good idea because of the overhead. Other than that, the lock is still there...
Yes, I mark the point where I switched to 1.4 in my posts (and the same about the 64-bit version). In one case (recently, with 1.4.1 64-bit) I had lost half my data in a crash, and restored from week-old backups only to discover that they were missing back then too. Turns out it was a silent corruption days ago...
I started with the unstable version and then switched to the stable one, but the problems continued. If I remember correctly, the problems with 1.3 were not significantly more than 1.4.
I and many people are using mongodb just fine with no data-loss and way more data and way more queries per second than you. If you are going to make big accusations like this you need to back them up with a test-case to reproduce your error. Otherwise it's just yet another useless rant/bug report. I too could write a "cassandra lost my data" blog post with no proof whatsoever, wouldn't help. Also the 32bits argument is ridiculous, it is not supposed to store more than 2geebees anyway, so don't use it for that.
You can see replies by hitting the "threads" link at the top of the page.
Reasons (from my understanding) that you had a bad experience:
1. You used a development, unstable version of software in production.
2. 32-bit Mongo cannot store more than 2GB of data; this is a very public, known limitation.
The reason that 'nobody seems to like your opinion' is that your post got real famous, and while mistakes happen, the internet was in a bit of a frenzy at the time over the whole NoSQL thing, and so you became sort of a poster child for various sides that people had chosen. Sucks, but it happens. I had a similar experience with my first blog post ever, so I feel your pain...
Thanks for the "threads" tip, it's really handy. About your points:
1. This is true (there was no warning that I saw when I was downloading it, although one was added later, or maybe I didn't notice it). However, I upgraded to the stable version when it came out to give MongoDB a second chance, because it sounds very good in theory, and I had the same (if not bigger) problems.
2. I hadn't known about it, and, no matter how public it is, the server could just refuse to store more data. Silently corrupting two documents for every document inserted is inexcusable, even if your database was forged in the pits of hell.
I think your point about the poster child thing is true. While I meant my post as a sort of "MongoDB didn't look too production-ready to me, but I hope it gets there eventually", people became really polarized and took it either as "this guy is right, MongoDB sucks" or "this guy is an idiot, silently corrupting data is perfectly acceptable if there's a notice on the website"...
* The corruption of data might be excused in the development branch.
* The silent corruption of the data when the server goes past its limit for a 32-bit DB cannot be excused, since the server could die, at the very least.
* The corruption of data due to the process being killed because the connection dropped wasn't MongoDB's fault.
* Requiring 9 GB of RAM for 5 indexes doesn't sit very well with me...
* Silently corrupting data for me to find out days later is not something a stable, unstable or toy database should do...
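The "refuse rather than corrupt" behaviour argued for above can be sketched in a few lines of Ruby. This is a hypothetical toy store, not how MongoDB is implemented; `BoundedStore`, `StoreFullError`, and the byte limit are invented purely for illustration.

```ruby
# Hypothetical error raised when the store cannot accept more data.
class StoreFullError < StandardError; end

# Toy append-only store with a hard size limit (stand-in for an address-space cap).
class BoundedStore
  attr_reader :used_bytes

  def initialize(limit_bytes)
    @limit_bytes = limit_bytes
    @used_bytes = 0
    @docs = []
  end

  # Rejects the write up front instead of overwriting existing documents.
  def insert(doc)
    size = doc.bytesize
    if @used_bytes + size > @limit_bytes
      raise StoreFullError, "insert of #{size} bytes would exceed the #{@limit_bytes}-byte limit"
    end
    @docs << doc
    @used_bytes += size
    self
  end

  def count
    @docs.size
  end
end

store = BoundedStore.new(10)
store.insert("12345").insert("678")

begin
  store.insert("too large")
  rejected = false
rescue StoreFullError => e
  rejected = true
  puts "write refused: #{e.message}"
end
```

The failed insert leaves the two existing documents untouched, and the caller finds out immediately instead of days later.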
Replica sets look awesome. Can't wait to get them set up in production. Replacing a master/slave setup with this means automatic failover.
Looking forward to single server durability in the next one (1.8). Should enable me to convince more clients to add mongo as part of the deployment stack.
I've also got a master/slave setup and am looking forward to replica sets.
It's great to see sharding is production ready. It's now there when I need it.
I installed 1.6 on my laptop today, but I will wait a week before starting to update servers. Anyway, I have never had data collections large enough to need sharding, but replica sets look like a much better alternative to frequent snapshots to S3.
A year ago I was equally enthusiastic about MongoDB, CouchDB, and Cassandra. However, at least for the modest scale work that I do I don't really need Cassandra, and MongoDB is so easy to work with. I still really like CouchDB but I have never had a customer request its use, so my experience is limited to just using it for my own stuff.
Definitely getting huMongous - looks like it's used by justin.tv and foursquare - good enough for me.
By the way, there is a MongoDB shell on the homepage for tryruby.org-style mini tutorials => http://www.mongodb.org/ Convinced me enough to get interested and download mongoid as well :-)
Hmm, now all there's left is an online hosting solution like Couch.io
I was excited to see a new movement in database schemes that provides a different way of storing data outside the relational database model; however, a couple of things bother me. Firstly, calling it nosql is just complete disrespect of SQL and relational databases. Could there not have been a more mature response, with a more relevant name?
Secondly, what scares me, and what I find hilarious, is people who so quickly jump onto this movement, moving the entirety of their critical data without understanding the potential downfalls, such as data loss with no warning. As a key value store for non critical data, this kind of thing is brilliant, and data loss can be managed (maybe not tolerated in a high throughput environment where "cache misses" are a concern), but otherwise yeah, it's great. Still, look at Facebook, Twitter, and FriendFeed, who are all still using MySQL and scaling out in their own way.
I don't get why they still stick with the AGPL. There are a lot more _companies_ built on products with licenses other than the AGPL, such as the wildly popular Apache projects. Even Microsoft is releasing code under the Apache license, but not the MongoDB guys!
There are successful companies built around all sorts of licensing schemes, but that doesn't mean that you are in a position to say what's best for 10gen's specific situation.
Also, all of the official client libraries for mongo are Apache licensed, it's only the core server that is AGPL (which means that you can use mongo in closed source applications - only if you make changes to the actual core database do you have to give those changes back to the community, and even then you still don't have to open source the rest of your application).
You're in no position to tell me what I say! It's simply awkward to use a license like that. Also you, yourself, admitted the conflict of client vs. server licensing. Why is this so? Because they want the widest usage. They release open source software not because they love developers but because they have to. Returning to the point: Apache or GPL projects have strong commercial backing, from big companies to start-ups, and knowing this I'm asking, since there must be a real reason why they chose AGPL over GPL or Apache. Btw, it's their choice and I respect that as well.
examiner.com is switching to Drupal soonish and has funded most of the work to write the Drupal/MongoDB integration.