Finding MongoDB instances without any authentication (shodan.io)
131 points by yammesicka on July 19, 2015 | 42 comments



This information is dangerous :)

A while back I published research on open, unauthenticated ICA (Citrix) instances that could be found with basic Google queries. I was able to find a lot of interesting targets, including some belonging to military and government organisations. I published my findings regarding the discovery without including any details; the blog post was very vague. Anyway, it doesn't take a rocket scientist to figure out what's going on once you know the basics. Someone did exactly this and wrecked a few systems. I was later contacted by the affected organisations, who held me directly responsible for the damage that was inflicted. I had no involvement whatsoever, but the information I provided was crucial for the discovery of these targets. That was when I realised that regardless of how cool it is to publish security research, you should always take the necessary steps to ensure that no one is harmed.


I've tried to reach out for a few years now and have spoken about it at every opportunity, but for some reason people just aren't interested in looking at databases. Most of these MongoDB instances are running old versions, and given the popularity of the project I suspect this is a tiny fraction of deployments. Btw, I'm still trying to find contacts at some of the affected organizations, but it's surprisingly difficult to reach somebody who is in charge of security :-/ Especially with servers hosted in the cloud, attribution is difficult!

Edit: Btw, I've also repeatedly tried to get press coverage of this issue, but no reporter was interested in covering it.


They're entirely responsible for the damage that was inflicted. Attempting to shift the blame to you is childish at best.


Maybe, but there is such a thing as responsible disclosure.


This is a victim-blaming myth. The prime responsibility for the damage is the person who did the damage. Not the discloser, and not the victim.


Maybe, but running unauthenticated databases on the public internet is negligent at best.


No. It could be simple ignorance. Or an accident.

In your world, what is it at worst? Criminal? Capital?


Agreed that it could be either of those things. I'm not trying to excuse criminal behavior at all, rather stating that if one puts an unauthenticated database on the internet, it's going to be compromised. For software professionals, my opinion is that to do so would be negligent.


Ignorance is an excuse for compromising your company's or customers' data in exactly what situations? Let's just all cover our eyes and not look; then the data will be safe, I'm sure.


Of course it depends on the context. I don't know if it's reasonable to expect a small family clinic, therapist, or dental office to secure their client information. It seems that people just mass scan the internet looking for already known vulnerabilities.

However, if it's a mid-sized business handling important information, like payment information, then I do think there ought to be a standard of dutiful behavior, because otherwise who pays for the externalities?


It could also be leftover testing systems that haven't been torn down yet, with nothing interesting in them. The internet is full of them.


Why shouldn't we blame victims too? Should we not blame victims of shark attacks that swim in shark-infested waters?


Have another go at interpreting my comment.


I remember (maybe it was the same service, Shodan) a year ago, looking at its scanning interface and seeing my toy database instance listed as open. I didn't think anything of it, and was too lazy to do anything about it, but I've been aware for quite some time now that people will do IP scans against common ports.

This later came back to bite me in the ass when I started messing around with Elasticsearch. Elasticsearch had a pretty nasty default that let you run arbitrary code from its query API, and within hours my box was compromised.
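For anyone in the same boat, the standard mitigation for that 1.x-era dynamic-scripting hole was a two-line config change. This is only a sketch; the config path assumes a Debian/Ubuntu-style package install:

```shell
# Hardening for the 1.x-era Elasticsearch dynamic-scripting RCE
# (CVE-2014-3120): turn off dynamic scripts and listen on loopback only.
# The config path is an assumption; adjust for your install.
echo 'script.disable_dynamic: true' | sudo tee -a /etc/elasticsearch/elasticsearch.yml
echo 'network.host: 127.0.0.1'      | sudo tee -a /etc/elasticsearch/elasticsearch.yml
sudo service elasticsearch restart
```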


That's exactly the reason why I'm not reporting anything anymore. The law is just really badly designed for this kind of situation. If I find something open or unsecured by mistake, I just go somewhere else and don't report anything; that way I don't risk anything. It's sad because it's not helping anyone, but I just don't want to deal with any legal trouble.


Sure, that's the legal environment we find ourselves in, and in many respects it's short-sighted and counterproductive, i.e. it actually fosters weak security in critical settings and increases the chance of those weaknesses being exploited by malicious entities.


I did the same thing, spidering for open rsync shares:

http://blog.steve.org.uk/secure_your_rsync_shares__please_.h...

Some scary stuff out there, freely available with minimal effort. Of course, with rsync things were generally read-only, but even so there was lots of family financial data and pictures.


blekko's search engine received a lot of automated queries looking for that kind of info -- that and SEO ranking research were our top two types of automated search.


>always take the necessary steps to ensure that no one is harmed.

That is, always publish anonymously.


This has been known for many years, so the author is not the first to point out this problem (just do an online search). It's a known operational pitfall, and securing the bind address is a documented best practice for Mongo deployments.


FYI: if you don't want to pay Shodan for search results, you can run your own port scan using masscan (https://github.com/robertdavidgraham/masscan) with the command

  masscan -p27017 0.0.0.0/0 --excludefile data/exclude.conf

Be warned that this will scan the entire IPv4 address space.


If somebody wants to give obblekk's suggestion a try, you can use my docker masscan container right away [0]

[0] https://registry.hub.docker.com/u/dordoka/masscan/


I honestly blame DigitalOcean a bit for not providing a VPC and/or a centralized firewall. It is tedious to configure iptables rules on each server and easy to overlook and make mistakes.

Furthermore, it should be the job of the firewall to limit access to server interfaces/ports, not the job of the services inside the servers. Binding to 0.0.0.0 seems perfectly acceptable, especially for cluster/distributed services that talk among themselves.


Holy crap, TWO YEARS to patch an insecure default?

Sorry, but if you're using MongoDB in production, this is the point where you should start reconsidering that. Two years to patch such a gaping security hole, regardless of any 'breakage', is completely unacceptable.


memcached has the same default to this day - listen on all interfaces, no auth.

These things are designed for use by people running them on servers that are not directly exposed to the internet. If you're running it in a dev VM with no public address, it's fine. If you're running it on a database-optimized server in your datacenter/cloud which has a firewall only allowing connections from your web-application servers to particular ports, it's fine.
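If memcached does have to run on a host with a public address, one hedge is to bind it explicitly to loopback or an internal interface. The flags below are standard memcached options; the cache size is arbitrary:

```shell
# Bind memcached to loopback so it is unreachable from the public internet.
# -d daemonizes, -l sets the listen address, -p the port, -m the cache size in MB.
memcached -d -l 127.0.0.1 -p 11211 -m 64
```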

In fact, I wouldn't trust MongoDB auth anyway; that's not its focus, much less its strength. Leave auth to other mechanisms designed for it.

I try not to worry about the infinite multitude of idiots who can follow some bad advice and get some software running. No matter what you do to make things foolproof, human ingenuity comes up with better fools, and in the process you make things more complicated for people who know what they're doing.

"completely unacceptable"? no, reasonable.


> These things are designed for use by people running them on servers that are not directly exposed to the internet.

Just as well internal attacks and fraud are never a thing, and pivoting attacks up a chain of successively less secure components never happens.


For any given system you can show me, I can show you 100 things that could be extra hardened.

At some point you stop caring about security and start caring about convenience / practicality.


Yeah... memcached is one of the other ones that has the same problem, but with around 100,000 public instances... Maybe I shouldn't have even mentioned MongoDB, since everybody's so focused on it now, but this sort of configuration issue affects a ton of database products.


It actually affects a tonne of products, period.

It's quite popular in enterprise software to listen on all interfaces.


Regardless of my thoughts on MongoDB in particular, if you are relying upon authentication mechanisms built into infrastructure software so that you can put, say, MongoDB on a public IP address and communicate in the open, you are operating your infrastructure completely unacceptably. There is absolutely zero excuse for not doing this right, and if MongoDB's default fucked you here, you're not doing it right in the first place. You wouldn't put your Nest thermostat on the Internet, so why is your primary data store? And, even worse, if you've "done it right" and thrown HTTP basic authentication in front of it, you get to put your hands on your hips and say "ha! we're secure!" but you're one bypass or weak password away from losing your entire database.

I agree with you on reconsidering MongoDB in production, but administrators failing to secure their systems is not why. Authenticating to a database is an antipattern. Stop making database vendors add this shit: HTTP basic authentication against your NoSQL hotness in prod, on a public IP, is a complete waste of time. Put it in RFC 1918 space or lock down security groups like everybody else and stop losing databases like this. That's what's unacceptable.

Seriously, this stuff is bananas. Now you have to ship passwords around in your automation because you can't be bothered to deploy a real DMZ and private network. Then you have to use Ansible Vault or whatever, and the complexity just rapidly multiplies.


You're missing the point. Listening on all interfaces was the default. Defaults should be secure and fail-closed, and that it takes them two years to patch that is extremely worrying.
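For what it's worth, the operator-side fix for that default is one flag. This is a sketch using the 2.x-era flag syntax (newer releases use net.bindIp in the YAML config), and the dbpath is just the common Debian default:

```shell
# Make mongod listen on loopback only instead of all interfaces.
mongod --bind_ip 127.0.0.1 --port 27017 --dbpath /var/lib/mongodb
```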


I'm not missing your point. Your point is wrong. Listening on all interfaces is acceptable because this is infrastructure software, and wiring up MongoDB on a publicly accessible and unfiltered endpoint is an antipattern, authentication scheme or not. If you choose the shitty deployment, like public IPv4, it's on you to actively configure the software to support your shitty deployment.

The default that isn't secure is your expectation of secure software regardless of the accessibility of the endpoint. You don't get to punt those assurances to MongoDB and file "unacceptable" JIRAs to add some other lightly-reviewed authentication scheme to software that doesn't need it. It's on you as an administrator to secure your database, and step one is not default permit to an endpoint on which you can find your entire database. Let me guess, you want authentication via HTTP basic for all of your backend services but rolling a CA and doing TLS client auth is outside your budget and time?

I have been doing production operations for a while. There are two schools of thought: "the defaults are unacceptable," and "I should really be applying defense in depth to protect my infrastructure and own the responsibility," and I actively hire the latter. There are a lot of the former, and we're seeing their databases in this post.


I wouldn't worry so much. What can you infer from that? It's completely normal that people working on database engines don't know much about operational matters like which network interfaces to listen on. The product I used to work on, for example, had this exact same misfeature. The real motivation for fixing it was that developer machines would be listening to the outside world whenever we or our users ran tests and such. The idea that people would run it this way in production was completely foreign to me, personally.


You're talking nonsense.

Most enterprise systems, e.g. Hadoop, listen on all interfaces as well, since we route traffic over multiple cards, e.g. InfiniBand. Does that mean we should be reconsidering those choices too?

And it's not a gaping security hole. It's a poorly chosen default. And most normal people read and understand the configuration files before they deploy anything into production.


Can't malware/bots use these databases for communication?


Yes, but then you build your own layer of auth on top of it. Signed messages should work fine, though. It's an interesting idea, for sure.
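A minimal sketch of the signed-messages idea, using openssl's HMAC support. The secret and payload are placeholders, and in practice you'd also want a timestamp or nonce in the payload to prevent replay:

```shell
# Sign a message with a shared secret before storing it in an open/untrusted
# datastore; readers recompute the HMAC and reject anything that doesn't match.
payload='{"cmd":"ping"}'
sig=$(printf '%s' "$payload" | openssl dgst -sha256 -hmac 's3cret' | awk '{print $NF}')
echo "$payload $sig"   # store both; verification is recompute-and-compare
```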


Hehe, for that matter it'd make a hell of a distribution channel for illicit materials (pirated software/movies)...

Think DB nodes to find/query for torrents...


Open ftp uploads all over again...


Am I right that HackedDB could exist because someone who noticed the lack of authentication created such a database?

If I can connect to an instance without auth, I can also create a DB and collections etc.


Yes, it could be that somebody before me already noticed this issue and decided to exploit it :-/ I saw on Twitter that there was actually a talk at DEF CON in 2013 about these sorts of problems in NoSQL, so in certain circles it's been known for a while, just not acted upon.


It's still surprising... I've used MongoDB a few times, but I was always well aware that I should put it behind a firewall and set up basic auth.

I'm not really one for super fine grained security at the database level, but you should at least have some level of connection controls in place.

iptables isn't that hard.
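As a sketch, assuming a single app server at a placeholder address, locking down the MongoDB port takes two rules:

```shell
# Allow MongoDB (27017) only from the app server, drop everyone else.
# 10.0.0.5 is a placeholder for your application host's address.
iptables -A INPUT -p tcp --dport 27017 -s 10.0.0.5 -j ACCEPT
iptables -A INPUT -p tcp --dport 27017 -j DROP
```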


I just reached out to a very big "startup" that had many, many GeeBees publicly available. Such a rookie mistake!



