Can anyone recommend a good "Consultant's", "Solutions Architect's", or "Top-Right Quadrant on a Silly Gartner Chart" overview of how Kubernetes competes or cooperates with Mesos? It feels like there's quite a bit of dense conceptual reading you have to plow through before you can even start to talk about what these things do.
- Mesos is a generalized, low level framework for distributing workloads over multiple nodes. It provides mechanism not policy. Therefore it requires quite a bit of up-front work to build something usable for any given application.
- Kubernetes is an opinionated cluster execution tool. It provides tools and a curated workflow for running distributed containerized applications. It's generally pretty quick and easy to get running.
- Mesos has a rich, resource-aware task scheduler. You can specify that your application requires X CPU units and Y RAM units and it will find the optimum node to run the task on.
- By contrast, the Kubernetes scheduler currently is rather dumb[1]. There's no way to specify the expected resource utilization for pods, and the scheduler simply tries to spread out replicas as much as possible throughout the available nodes.
People are (rightly) excited about things like Mesosphere which could allow the best of both worlds: the ease and API of Kubernetes with a powerful Mesos resource scheduler, not to mention nice-to-haves like a Web UI with pretty visualizations.
You can now cut me a check for 50% of the consulting revenue you get from this information. :)
1. The scheduler is intentionally simple and pluggable, to allow improvements easily in the future. My statements only apply to the current state of Kubernetes as deployed today.
The Kubernetes scheduler also does resource-aware scheduling. You're correct that it tries to spread replicas across nodes, but it only spreads them across the nodes that have enough free resources for the container (more precisely, Pod) that it's scheduling.
Currently resource requirements are specified only on containers, not Pods. The requirements for the Pod are computed by adding up the requirements of the containers within the Pod.
To be more concrete: Within the PodSpec type that you linked to, there is a field of type []Container. Within the Container type there is a field called Resources which is of type ResourceRequirements. ResourceRequirements lets you specify resource requirements of the container. The resource requirements of the Pod are computed by adding up the resource requirements of the containers that run within the Pod.
In addition to resource-based scheduling, we also support "label selectors", which allow you to label nodes with key/value pairs and then say that a Pod should only run on nodes with particular labels. That's specified in the NodeSelector field of the PodSpec (which you linked to).
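To make that concrete, here's a rough sketch of a Pod manifest, written as a plain Python dict mirroring the v1 API JSON (the names, images, label, and numbers are purely illustrative), showing per-container limits, a nodeSelector, and how the per-container numbers roll up to the Pod:

    # Rough sketch of a Pod manifest as the JSON the API server accepts,
    # written as a Python dict. Names, images, and values are illustrative;
    # field names follow the v1 API fields discussed above.
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "web"},
        "spec": {
            # Only run on nodes previously labeled disk=ssd,
            # e.g. with `kubectl label nodes node-3 disk=ssd` (node name made up).
            "nodeSelector": {"disk": "ssd"},
            "containers": [
                {"name": "app", "image": "nginx",
                 "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
                {"name": "sidecar", "image": "busybox",
                 "resources": {"limits": {"cpu": "100m", "memory": "64Mi"}}},
            ],
        },
    }

    def millicores(cpu):
        # "500m" -> 500 millicores, "2" -> 2000 millicores
        return int(cpu[:-1]) if cpu.endswith("m") else int(cpu) * 1000

    # The scheduler treats the Pod as needing the sum of its containers:
    pod_cpu = sum(millicores(c["resources"]["limits"]["cpu"])
                  for c in pod["spec"]["containers"])
    print(pod_cpu, "millicores")  # 600 millicores

You'd normally write this as YAML and feed it to `kubectl create -f`; the dict form just makes the "add up the containers" point explicit.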
The Kubernetes scheduler looks at "fit" and "resource availability" to determine the node a pod will run on. Nodes can also have labels like "high-mem" or "ssd", so you can request a particular type of server (via the nodeSelector field). More details are in the link above.
The page you linked to describes a slightly different feature, namely the ability to restrict and override the resource requirements of Pods at the time they are submitted to the system. So it's part of the admission control system, not part of the scheduling.
Thanks davidooo - I was specifically referring to the section on "limits at the point of creation" which gives a practical example of using limits in a multi-namespace (multi-tenant) environment. (https://github.com/GoogleCloudPlatform/kubernetes/blob/maste...).
The new documentation you linked to has good explanations in it as well.
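For anyone following along, the object that "limits at the point of creation" section is describing is a per-namespace LimitRange. Sketched as a Python dict of the API JSON (field names are from memory, so treat this as illustrative and check the docs linked above), it looks roughly like:

    # Rough sketch of a LimitRange: per-namespace admission control that
    # caps and defaults container resources when objects are created.
    # Exact field names and values are illustrative.
    limit_range = {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "team-a-limits", "namespace": "team-a"},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    "max": {"cpu": "2", "memory": "1Gi"},           # reject anything bigger
                    "default": {"cpu": "200m", "memory": "128Mi"},  # applied if unspecified
                }
            ]
        },
    }

Each tenant namespace gets its own LimitRange, which is roughly how the multi-tenant example in that doc hangs together.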
Mesosphere is a company, not a piece of software. I think you're referring to Mesosphere's DCOS[1].
DCOS is some really nice packaging for Marathon, Chronos, etc., with a nice CLI tool for downloading and installing new frameworks onto Mesos. Personally, I find using Aurora a lot nicer.
That being said, I do think kubernetes-mesos gives you far and away the best of both worlds. You get the developer story of Kubernetes with the ops story of Mesos.
From an ops perspective, k8s is a bit clunky. I was actually shocked when I found out that after bringing a new kubelet (worker node) online, you also have to update a service on the master. This came up in a power training class on Kubernetes at this year's Red Hat Summit in Boston. I was really underwhelmed by k8s given its complexity compared to Mesos, but they aren't an apples-to-apples comparison.
Can't speak for the cloud. When running on bare metal, in high availability mode, you need to edit the controller manager config, /etc/kubernetes/controller-manager, and update a line called KUBELET_ADDRESSES. If you don't, they won't be in the output of:
What you forgot is that it's not only easy to get started, you can also start REALLY REALLY small, which doesn't work with all the other solutions. The Kubernetes master can run on a node with 512 MB of memory and it still works great. Your cluster doesn't die if the master dies; it just won't reschedule anything, which is mostly OK. The only thing you need is enough etcd nodes to keep it up and running.
Docker Swarm seems to be the only one that supports one-off interactive containers that bind a TTY, like a Rails console (i.e. does the cluster support `docker run -it --rm busybox sh`). But its scheduling strategies[1] aren't as sophisticated as the others.
Marathon doesn't support linked containers[2], so if you're using Mesos and need linked containers, you probably will want to run Kubernetes on it and use pods.
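Roughly speaking, the pod replacement for a docker link is just co-locating the containers so they share a network namespace and talk over localhost. A minimal sketch (the image name "myorg/app", the env var, and the port are made up for illustration):

    # Sketch: two containers in one Pod share a network namespace, so the
    # app reaches redis on localhost instead of via a docker link.
    # "myorg/app" and REDIS_URL are hypothetical.
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-with-redis"},
        "spec": {
            "containers": [
                {"name": "app", "image": "myorg/app",
                 "env": [{"name": "REDIS_URL",
                          "value": "redis://localhost:6379"}]},
                {"name": "redis", "image": "redis"},
            ]
        },
    }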
Networking on AWS works fine with no additional overlay as well.
Overlay networking is not required if you're running within a bunch of nodes that can see each other. Only if you get more complex will you require something, and there are quite a few solutions (Flannel, Weave, Calico, etc)
But most of them suck, and they suck even more when you configure them badly (badly as in sticking with an option that happens to be the default). With wider VXLAN adoption most of the performance issues are fixed, though there's still room for improvement. I also think that IPv6 could fix a lot of these things...
As others have mentioned, there are multiple options. If you want micro-segmentation (one network per app tier) with fine-grained access control you can use the OpenContrail plugin https://github.com/Juniper/contrail-kubernetes. It has the added advantage that a tenant network can span k8s, OpenStack, VLANs, or anything else you can plug into a reasonable mid-tier router.
[Disclosure: I'm currently working on this project]
I think you just need to provide a flat networking infrastructure where all nodes get their own IP and can reach each other. So you can swap Flannel for Project Calico or your own setup if you like.
Weave and Open vSwitch are two other options. Many IaaS providers (GCP, AWS) give you the required knobs and dials to configure this "natively" using their APIs, so no extra SDN is required if you are already running on a compatible cloud provider.
Kubernetes can run on Mesos as a Mesos service, giving you a more opinionated layout for your compute cluster while still running other peer Mesos services such as Hadoop, Chronos, and Marathon.
I'm afraid answers like this actually make the confusion problem worse (nothing against your comment, just an observation in general.)
If you're confused about the differences between similar-sounding products X and Y, the fact that "X runs on Y" or "Y supports X" has never made the situation any better, it only makes the line between X and Y even more blurred.
I think this is especially true of Mesos, because people have a tendency to attribute qualities to Mesos that are actually qualities of a particular Mesos framework like Marathon or Aurora. As it is, Mesos is more of an SDK than anything, giving you the tools to write an orchestration system. It comes with built-in ways to communicate with nodes over a message bus, the ability to look at your nodes as resources, etc... but all of the scheduling logic is up to the frameworks themselves.
I think Mesos has a perception problem because of this. They want to build up hype about what mesos is and can do, so they claim things like Mesos being able to schedule docker images and keep them running, etc... but that's really the job of something like Marathon that runs as a Mesos framework. But if they didn't claim such things, Mesos wouldn't seem very compelling.
To me, the biggest benefit of Mesos is what the gain would be if every datacenter scheduler was a Mesos framework (yarn/spark/kubernetes/marathon/fleet/swarm/aurora, etc), and Mesos was only used to maintain multitenancy on the same hardware. That's where the real advantages come from... if you want to try Kubernetes you shouldn't have to dedicate hardware to it, you should just install it on your existing mesos cluster that is already running the rest of your stuff. In this respect, mesos is only useful insofar as all the big cluster managers use it as their underlying substrate.
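To make that division of labor concrete: the "run this Docker image and keep N copies alive" request goes to Marathon (a framework), not to Mesos itself. A rough sketch using Python's requests against Marathon's v2 API (the host and app id are made up):

    # Sketch: asking Marathon (not Mesos) to keep 3 copies of an image
    # running. The Marathon host and app id are hypothetical; the payload
    # follows Marathon's v2 app format.
    import requests

    app = {
        "id": "/hello",
        "cpus": 0.1,
        "mem": 32,
        "instances": 3,
        "container": {
            "type": "DOCKER",
            "docker": {"image": "nginx", "network": "BRIDGE"},
        },
    }
    requests.post("http://marathon.example.com:8080/v2/apps", json=app)

Marathon turns that into accepted Mesos resource offers and task launches, and restarts tasks when they die; the "keep exactly three running" reconciliation lives in Marathon, not in Mesos.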
"X runs on Y" at least implies Y is "lower level" than X.
As I understand it, Mesos is analogous to an operating system kernel for your cluster, while Kubernetes is a CaaS (containers-as-a-service) layer on top.
>>> Mesos is more of an SDK than anything, giving you the tools to write an orchestration system
This is why Mesosphere built their DCOS; it recognizes that Mesos is a sharp-edged distributed systems kernel and needs to be packaged with convenience layers like Marathon and Chronos and "userland" tools (CLI, graphical UI, packaging system, etc) that make it a complete OS.
At their most basic level, Kubernetes [1] and Mesos [2] both use a client/server type architecture, where you install a client on many compute nodes, then the server part farms out jobs (in the form of containers) to these client compute nodes. Mesos does not do scheduling, and is pretty low-level in terms of interfaces, so you would typically run some type of software which talks to it via API, like Marathon [3], Aurora [4], or Hadoop [5]. Then Marathon/Aurora/Hadoop/etc tells Mesos to farm out compute jobs to these end client nodes (i.e., it does the scheduling). Complexity can quickly go up depending on your hosting environment, scale, and HA requirements. There is actually a really good high-level overview diagram of what a Mesos/Hadoop/ZooKeeper setup looks like here [6].
The stack looks something like this for Kubernetes/Mesos (top down):
- Kubernetes and Mesos (client/server packages depending on node type)
- Docker (container engine)
- OS (Ubuntu/RHEL/etc)
What are some use-cases?
- you have more work than can fit into one server
- need to distribute load across N+ nodes
- Google heavily uses containers (not k8s, but that inspired these patterns)
- Gmail/Search/etc all run in containers [7]
- Apple, Twitter, and Airbnb are running Mesos today [8, 9]
But, to answer your question, the main difference between Kubernetes and Mesos is that Kubernetes offers an opinionated workflow, built-in scheduler, and patterns for how containers are deployed into this cluster of compute nodes. The pattern is baked in from the start via Pods, Labels, Services, and Replication Controllers. It also helps to know that Kubernetes comes from Google, where they have been running containers in-house, so much of this workflow (pods, services, replication controllers, etc) comes from their internal use-cases. That's the 10,000 foot view.
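As a concrete example of that baked-in pattern, the "keep N replicas of this pod running, findable by label" idea is a single object. Sketched here as a Python dict of the v1 API JSON (the name, label, and image are illustrative):

    # Sketch of a v1 ReplicationController: a label selector plus a pod
    # template, and a target replica count the controller keeps satisfied.
    # Names, labels, and images are illustrative.
    rc = {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": "web"},
        "spec": {
            "replicas": 3,
            "selector": {"app": "web"},          # which pods this RC owns
            "template": {                        # how to stamp out new ones
                "metadata": {"labels": {"app": "web"}},
                "spec": {"containers": [{"name": "web", "image": "nginx"}]},
            },
        },
    }

A Service with the same label selector then gives you a stable endpoint in front of whatever pods the controller is currently running.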
One small correction: Mesos is a scheduler. It doesn't natively ship with any end-user framework to access the scheduling though (you are supposed to write your own framework which uses the Mesos API). Marathon is a generic end-user framework for accessing those functions and runs on top of Mesos.
I think it's also interesting to note that Mesos can be used as the Kubernetes minion scheduler backend. And for very large collections of compute nodes, this is reputedly a good choice (though I don't have any personal experience to back that assessment up).
In the parlance of Mesos, the Mesos kernel is an "allocator" (in that it assembles and allocates all of the compute resources) and the frameworks are the "schedulers" in that they request and consume "resource offers" (made by the allocator) and then schedule tasks to be run on those accepted resources.
Actually that isn't entirely true. mesos-execute will run any shell command, and it shows up as a framework in the frameworks API/UI tab on the Mesos master.
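If it helps to see the allocator/framework split in code, here is roughly what a custom framework looks like from the scheduler side, sketched with the old Python bindings (mesos.interface / mesos.native). Treat the details (master address, task name, command, resource amounts) as illustrative rather than copy-paste ready:

    # Sketch of a minimal Mesos framework using the classic Python bindings.
    # Mesos only *offers* resources; deciding what to launch on them is
    # entirely the framework's job.
    from mesos.interface import Scheduler, mesos_pb2
    from mesos.native import MesosSchedulerDriver

    class HelloScheduler(Scheduler):
        def resourceOffers(self, driver, offers):
            for offer in offers:
                # Build one task and charge it against this offer.
                task = mesos_pb2.TaskInfo()
                task.task_id.value = "hello-1"
                task.slave_id.value = offer.slave_id.value
                task.name = "hello"
                task.command.value = "echo hello && sleep 30"

                cpus = task.resources.add()
                cpus.name = "cpus"
                cpus.type = mesos_pb2.Value.SCALAR
                cpus.scalar.value = 0.1

                mem = task.resources.add()
                mem.name = "mem"
                mem.type = mesos_pb2.Value.SCALAR
                mem.scalar.value = 32

                # Accept the offer by launching the task on it.
                driver.launchTasks(offer.id, [task])

    if __name__ == "__main__":
        framework = mesos_pb2.FrameworkInfo()
        framework.user = ""          # let Mesos fill in the current user
        framework.name = "hello-framework"
        driver = MesosSchedulerDriver(HelloScheduler(), framework,
                                      "zk://localhost:2181/mesos")
        driver.run()

Everything interesting (which offers to accept, how many tasks to run, what to do when they fail) lives in that Scheduler subclass, which is why "Mesos schedules my containers" is really shorthand for "my framework schedules them on resources Mesos offered".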
I know being the edgy you-don't-have-big-data guy is all the rage right now, but seriously? Most people can fit all of their work on one server? What kind of mom-and-pop shops are you working for?
I started JohnCompanies, the first VPS provider, in summer/fall 2001. That was a fairly significant financial bet that most people can fit all their work on even less than one server.