
Wilson Mar


SMACK = Spark, Mesos, Akka, Cassandra, Kafka, etc.


Overview

Here are my succinct notes about using Mesosphere DC/OS (Data Center Operating System) and Marathon within its “SMACK stack” to create a Software Defined Data Center: composable infrastructure that manages bare-metal resources just like the cloud.

SMACK refers to this set of open-source software:

  • Spark big-data micro-batching with Zeppelin integration
  • Mesos DC/OS (Data Center Operating System) multi-tenant Cluster Management
  • Akka message streams for data ingestion
  • Cassandra cache NoSQL time-series database, whose design builds on Amazon's Dynamo and Google's BigTable (commercial support from Datastax)
  • Kafka buffer (message broker) for stream processing (commercial support from Confluent)

Additionally for metrics:

  • Elastic for data
  • Kibana to display
  • Zeppelin web-based Notebook for interactive data analytics with SparkSQL and Python Conda support

Mesos to Marathon to DC/OS

Initially written as a research project at Berkeley, the Mesos Apache project runs both containerized and non-containerized workloads in a distributed manner.

Mesos was adopted by Twitter as an answer to Google’s Borg (Kubernetes’ predecessor).

“Mesosphere is democratizing the modern infrastructure we used at Twitter, AirBnB, and other webscale companies to quickly deliver data-driven services on any datacenter or cloud.” -Florian Leibert, Co-Founder & CEO

Customers include Yelp.

Mesosphere as an organization aims to make container orchestration simpler for regular human beings by supplying Marathon, a “plugin” (framework) for Mesos that handles container scheduling.

Mesos became nearly synonymous with DC/OS in mid-2016, when Mesosphere introduced DC/OS (Distributed Cloud Operating System, aka Data Center Operating System). DC/OS simplifies Mesos to the point where a Mesos cluster with the Marathon scheduler can be deployed in a few minutes.

UCR (Universal Container Runtime) runs containers without Docker. It can launch a container with no image at all (from a zip, tar, or JAR archive), or from an appc image, an OCI image, or a Docker image file. It can also nest containers, for example to run builds under Jenkins.

What’s the big deal?

[Figure: over-provisioned capacity]

Mesosphere DC/OS reduces AWS bills by dynamically automating where services run on hardware across the datacenter. DC/OS achieves higher utilization by pooling unused capacity and moving apps onto it. It goes beyond what VMware does by load balancing across several clouds and on-premises hardware as a single set of resources to allocate as needed. That enables handling a massive amount of data and processing.

Mesos handles storage volumes as well as CPU, memory, and network resources.
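The utilization argument above can be illustrated with simple arithmetic. This is a minimal sketch, not a measurement: the workload names and CPU figures are hypothetical, and it assumes the workloads' peaks do not coincide, so a shared pool can be smaller than the sum of the individual peak-sized silos.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    peak_cpus: float     # capacity a dedicated silo must reserve
    average_cpus: float  # what the workload actually uses on average


def siloed_utilization(workloads):
    """Utilization when every workload gets its own peak-sized partition."""
    reserved = sum(w.peak_cpus for w in workloads)
    used = sum(w.average_cpus for w in workloads)
    return used / reserved


def pooled_utilization(workloads, pooled_capacity):
    """Utilization when all workloads share one pool of the given size."""
    used = sum(w.average_cpus for w in workloads)
    return used / pooled_capacity


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
workloads = [
    Workload("web", peak_cpus=40, average_cpus=10),
    Workload("spark", peak_cpus=60, average_cpus=15),
    Workload("kafka", peak_cpus=20, average_cpus=5),
]

siloed = siloed_utilization(workloads)                       # 30 / 120
pooled = pooled_utilization(workloads, pooled_capacity=70)   # 30 / 70
print(f"siloed: {siloed:.0%}, pooled: {pooled:.0%}")
```

The pool size of 70 CPUs is itself an assumption. In practice it depends on how much the peaks overlap.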

The ability to scale up and out impacts availability.

VIDEO: “SMACK is the New LAMP” Codemotion Milan 2017 by Mario Cartia

Mesos architecture

[Figure: Mesos architecture diagram]

From Stratoscale (AWS-compatible on-premises K8S)

K8S nodes are analogous to Mesos’ agents. But Mesos adds a scheduler layer that doesn’t exist in K8S. Although Hadoop (big data) and MPI (messaging) schedulers are shown above, dozens of schedulers are available, including the Marathon container scheduler and Jenkins.

Service packages are added in a couple of clicks from the DC/OS Universe app store. Alternatively, use the command line:

dcos package install kafka

https://www.youtube.com/watch?time_continue=179&v=VdhJ_Fm3_mk

You can create your own scheduler.

Akka supports multiple programming models for concurrency, but it emphasizes actor-based concurrency, with inspiration from Erlang. It’s from Lightbend.

Competition

  • https://www.nomadproject.io/
  • https://docs.docker.com/engine/swarm/
  • https://aws.amazon.com/ecs/
  • Kubernetes from Google

As its name suggests, DC/OS is more than a container orchestration framework like Kubernetes.

In fact, Kubernetes can run on top of DC/OS and schedule containers with it instead of using Marathon.

But DC/OS is “less opinionated”. It can run non-containerized, stateful workloads.

Marathon’s APIs are more straightforward than Kubernetes’. Marathon aggregates its APIs into a relatively small number of API resources, while Kubernetes provides a larger variety of resources and ties them together with label selectors.
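To give a feel for how small a Marathon resource is, here is a sketch that builds a minimal app definition of the kind Marathon accepts at its /v2/apps endpoint. The app id, image name, and sizing values are hypothetical, and the helper function marathon_app is my own, not part of any Marathon client library.

```python
import json


def marathon_app(app_id, image, cpus=0.5, mem=256, instances=2):
    """Build a minimal Marathon app definition as a plain dict."""
    return {
        "id": app_id,            # path-like identifier of the app
        "cpus": cpus,            # CPU shares per instance
        "mem": mem,              # memory per instance, in MiB
        "instances": instances,  # how many copies Marathon keeps running
        "container": {
            "type": "DOCKER",
            "docker": {"image": image},
        },
    }


# Hypothetical app: two instances of a stock web server image.
app = marathon_app("/demo/web", "nginx:1.25")
payload = json.dumps(app, indent=2)
print(payload)
```

The resulting JSON could be saved to a file and submitted with dcos marathon app add, or sent with an HTTP POST to /v2/apps on the Marathon host. Neither step is shown here because both need a running cluster.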

Kubernetes has almost 10 times as many commits and GitHub stars as Marathon.

While Kubernetes is completely open source, DC/OS is controlled by a commercial company and comes with a “Premium” subscription for extra features. So “seemingly simple features needed to automate the deployment process is only included in the enterprise version.”

  • Kafka from Confluent
  • Datastax

Docker Swarm or Kubernetes or Mesos - pick your framework! May 17, 2017 [45:37] by Arun Gupta

Installation

Florian Troßbach in his 2016 blog describes this sequence of installation:

Step 0: Prerequisites

If you want to try the examples yourself, you’ll need the following:

  1. Vagrant and Virtualbox
  2. Ansible
  3. An AWS Account. The AWS resources used in this example exceed the free tier, so regular AWS fees apply. You have been warned!
  4. An EC2 key pair called dcos-intro
  5. Twitter access key

https://github.com/ftrossbach/intro-to-dcos

https://github.com/zutherb/terraform-dcos

36 people maintain 240 repositories in the Mesosphere GitHub account https://github.com/mesosphere

69 people maintain 70 repositories in the DC/OS (Datacenter Operating System) account: https://github.com/dcos https://dcos.io/ help@dcos.io

Many of the people are associated with both accounts.

https://github.com/dcos/dcos-launch Turn-key deployments of DC/OS on AWS (template and onprem), Azure, and GCE by Charles Provencher

https://github.com/dcos/dcos-launch/issues/187 In the README at https://github.com/dcos/dcos-launch, the “maws” link requests https://github.com/dcos/dcos-launch/blob/master, which returns a 404. Was https://github.com/mesosphere/aws-cli meant?

https://github.com/alejandroEsc/kraken Deploy a Kubernetes cluster using Terraform and Ansible on top of CoreOS.

Deployment

Mesos has its minimesos tool for testing and development. It is specially designed for continuous integration. To install it, run the following command on a machine with Docker installed:

curl -sSL https://minimesos.org/install | sh

Mesos (as DC/OS) is, as of this writing, available natively through Azure Container Service. See https://azuremarketplace.microsoft.com/en-us/marketplace/apps/mesosphere.dcos?tab=Overview

Installation on AWS & GCP is manual.

Monitoring

One of the selling points of Mesos is that monitoring configuration is included.

Enabling monitoring costs 3-20% of CPU. For example, to have Marathon report its metrics to Graphite:

--reporter_graphite "tcp://localhost:2003?prefix=marathon-test&interval=10"
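Port 2003 in the reporter setting above is the default port of Graphite's plaintext listener, which accepts one metric per line in the form "path value timestamp". The sketch below only formats such a line. The metric name apps.active is a hypothetical example, not necessarily a name Marathon emits, and the helper graphite_line is my own.

```python
import time


def graphite_line(prefix, name, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol.

    Each line is "<metric path> <value> <unix timestamp>" ending in a newline.
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{prefix}.{name} {value} {ts}\n"


# Fixed timestamp so the formatted line is reproducible.
line = graphite_line("marathon-test", "apps.active", 3, timestamp=1500000000)
print(line, end="")
```

The prefix matches the prefix=marathon-test part of the reporter setting, which is why every metric from that reporter would show up under the marathon-test branch in Graphite.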

Flame graphs can be generated to show CPU time by the functions called.

A Mesos Master, which knows about the available computing resources, makes offers to a Scheduler. The Scheduler accepts or declines each offer. Tasks for accepted offers are sent to the agents that own the offered resources, which launch the executors/tasks.
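The offer cycle described above can be sketched as a toy simulation. This is not the Mesos API: the class names, agent names, and task sizes are all hypothetical, and the sketch runs a single offer round without tracking resources that come back when tasks finish.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem: int  # MiB


@dataclass
class Task:
    name: str
    cpus: float
    mem: int  # MiB


class Scheduler:
    """Toy framework scheduler: accepts an offer if pending tasks fit in it."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offer(self, offer):
        """Return the tasks to launch on this offer; an empty list declines it."""
        accepted = []
        cpus, mem = offer.cpus, offer.mem
        for task in list(self.pending):  # iterate over a copy while removing
            if task.cpus <= cpus and task.mem <= mem:
                accepted.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem
        return accepted


class Master:
    """Toy master: knows each agent's free resources and offers them in turn."""

    def __init__(self, agents):
        self.agents = agents  # agent name -> (free cpus, free mem in MiB)

    def offer_round(self, scheduler):
        launched = {}
        for agent, (cpus, mem) in self.agents.items():
            tasks = scheduler.resource_offer(Offer(agent, cpus, mem))
            if tasks:  # accepted offers become task launches on that agent
                launched[agent] = [t.name for t in tasks]
        return launched


master = Master({"agent-1": (2.0, 2048), "agent-2": (4.0, 8192)})
scheduler = Scheduler([
    Task("web", 1.0, 512),
    Task("spark-job", 3.0, 4096),
    Task("cache", 1.0, 1024),
])
launched = master.offer_round(scheduler)
print(launched)
```

The design point this illustrates is that the master decides what to offer, while the framework's scheduler decides what to run, which is the two-level split that lets many different schedulers share one cluster.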

https://medium.com/apache-mesos/performance-improvements-in-mesos-1-7-0-50c195033c5d

Social Media

https://dcos.io/

https://twitter.com/dcos

https://www.youtube.com/channel/UCUECX_bIZBgaw_rAaCoA39Q

https://medium.com/apache-mesos

MesosCon 2018 is held in New York City on November 5th-7th, 2018.

slack?

References

VIDEOS: MesosCon North America (Los Angeles, Sep 26, 2017) was keynoted by Ben Hindman, co-creator of Apache Mesos and founder of Mesosphere. MesosCon was first held in 2014.

There are also MesosCon Europe and MesosCon Asia events.

