Wilson Mar

SMACK = Spark, Mesos, Akka, Cassandra, Kafka, etc.

Overview

Here are my succinct notes about using Mesosphere DC/OS (Data Center Operating System) and Marathon within its “SMACK stack” to create a Software Defined Data Center with Composable Infrastructure: managing bare-metal resources just like the cloud.

SMACK refers to this set of open-source software:

  • Spark big-data micro-batching with Zeppelin integration
  • Mesos DC/OS (Data Center Operating System) multi-tenant Cluster Management
  • Akka message streams for data ingestion
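  • Cassandra NoSQL wide-column database, often used for time-series data, whose design draws on Amazon’s Dynamo and Google’s Bigtable (commercially supported by Datastax)
  • Kafka buffer (message broker) for stream processing (commercially supported by Confluent)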

Additionally for metrics:

  • Elasticsearch (from Elastic) to store and search the data
  • Kibana to display
  • Zeppelin web-based Notebook for interactive data analytics with SparkSQL and Python Conda support

Mesos to Marathon to DC/OS

Initially written as a research project at Berkeley, the Mesos Apache project runs both containerized and non-containerized workloads in a distributed manner.

Mesos was adopted by Twitter as an answer to Google’s Borg (Kubernetes’ predecessor).

“Mesosphere is democratizing the modern infrastructure we used at Twitter, AirBnB, and other webscale companies to quickly deliver data-driven services on any datacenter or cloud.” -Florian Leibert, Co-Founder & CEO

Customers include Yelp.

Mesosphere as an organization aims to make container orchestration less complex for regular human beings by supplying Marathon, a “plugin” to Mesos that handles container scheduling.

Mesos became synonymous with DC/OS when, in mid-2016, Mesosphere introduced DC/OS (Distributed Cloud Operating System, aka Data Center Operating System) to simplify Mesos to the point where a Mesos cluster can be deployed with the Marathon scheduler in a few minutes.

UCR (Universal Container Runtime) runs containers without Docker. It can launch a container with no image at all (from a zip, tar, or JAR), or from an appc image, an OCI image, or a Docker image file. It can also nest containers, as builds under Jenkins do.

What’s the big deal?

Figure: over-provisioned capacity

Mesosphere DC/OS reduces AWS bills by automatically and dynamically deciding where services run on hardware across the datacenter. Higher utilization is achieved by DC/OS pooling unused capacity and moving apps onto it. It goes beyond what VMware does, load balancing across several clouds and on-premises hardware as a single set of resources to allocate as needed. That enables the handling of a massive amount of data and processing.
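The utilization argument can be sketched with a little arithmetic. The snippet below is only an illustration: the workload names and hourly CPU figures are made up, and it assumes any workload can run anywhere in the pool. It compares three separate clusters, each sized for its own peak, against one shared pool sized for the combined peak.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    hourly_cpus: list  # CPU demand per hour (hypothetical values)


def static_capacity(workloads):
    # Each workload gets its own cluster, sized for its own peak.
    return sum(max(w.hourly_cpus) for w in workloads)


def pooled_capacity(workloads):
    # One shared pool only has to cover the peak of the summed demand.
    per_hour = zip(*(w.hourly_cpus for w in workloads))
    return max(sum(hour) for hour in per_hour)


def utilization(workloads, capacity):
    # Fraction of provisioned CPU-hours that is actually used.
    used = sum(sum(w.hourly_cpus) for w in workloads)
    hours = len(workloads[0].hourly_cpus)
    return used / (capacity * hours)


workloads = [
    Workload("web", [8, 2, 2, 6]),
    Workload("batch", [1, 9, 3, 1]),
    Workload("analytics", [2, 2, 8, 2]),
]

static = static_capacity(workloads)
pooled = pooled_capacity(workloads)
print(f"static: {static} CPUs, {utilization(workloads, static):.0%} utilized")
print(f"pooled: {pooled} CPUs, {utilization(workloads, pooled):.0%} utilized")
```

Because the hypothetical peaks fall in different hours, the pool needs fewer CPUs than the three separate clusters combined. Real savings depend on how much the actual workloads' peaks overlap.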

Mesos handles storage volumes as well as CPU, memory, and network resources.

The ability to scale up and out improves availability.

VIDEO: “SMACK is the New LAMP” Codemotion Milan 2017 by Mario Cartia

Mesos architecture

Figure: Mesos architecture, from Stratoscale (AWS-compatible on-premises K8S)

K8S nodes are analogous to Mesos’ agents. But Mesos adds a scheduler layer that doesn’t exist in K8S. Although Hadoop (big data) and MPI (message passing) schedulers are shown above, dozens of schedulers are available, including the Marathon container scheduler and Jenkins.

Service packages are added in a couple of clicks from the DC/OS Universe “app store”. Alternatively, use the command line:

dcos package install kafka

https://www.youtube.com/watch?time_continue=179&v=VdhJ_Fm3_mk

You can create your own scheduler.

Akka supports multiple programming models for concurrency, but it emphasizes actor-based concurrency, with inspiration from Erlang. It’s from Lightbend.

Competition

  • https://www.nomadproject.io/
  • https://docs.docker.com/engine/swarm/
  • https://aws.amazon.com/ecs/
  • Kubernetes from Google

As its name suggests, DC/OS is more than a container orchestration framework like Kubernetes.

In fact, Kubernetes can run on top of DC/OS and schedule containers with it instead of using Marathon.

But DC/OS is “less opinionated”. It can run non-containerized, stateful workloads.

Marathon’s APIs are more straightforward than Kubernetes’ APIs. Marathon aggregates APIs and provides a relatively small number of API resources, while Kubernetes provides a larger variety of resources and is based on label selectors.
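To give a feel for how small a Marathon request can be, here is a minimal sketch that builds (but does not send) a POST to Marathon’s /v2/apps endpoint. The host name, app id, and command are hypothetical placeholders. The fields shown (id, cmd, cpus, mem, instances) are the basic ones for a command-based app.

```python
import json
import urllib.request

# Hypothetical Marathon address; replace with your own cluster's endpoint.
MARATHON_URL = "http://marathon.example.com:8080"


def build_app_request(app_id, cmd, cpus=0.1, mem=32, instances=1):
    """Build the HTTP request that would create a Marathon app."""
    app = {
        "id": app_id,            # unique path-like identifier
        "cmd": cmd,              # shell command each instance runs
        "cpus": cpus,            # CPU shares per instance
        "mem": mem,              # memory per instance, in MB
        "instances": instances,  # number of copies to keep running
    }
    return urllib.request.Request(
        url=f"{MARATHON_URL}/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_app_request("/hello", "while true; do echo hello; sleep 5; done")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# urllib.request.urlopen(req) would submit it; omitted here because the
# address above is a placeholder.
```

One JSON document describes the whole app. The equivalent in Kubernetes is usually spread over several resource types tied together by labels.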

Kubernetes has almost 10x as many commits and GitHub stars as Marathon.

While Kubernetes is completely open source, DC/OS is controlled by a commercial company and comes with a “Premium” subscription for extra features, so “seemingly simple features needed to automate the deployment process is only included in the enterprise version.”

  • Kafka from Confluent
  • Datastax

Docker Swarm or Kubernetes or Mesos - pick your framework! May 17, 2017 [45:37] by Arun Gupta

Installation

Florian Troßbach in his 2016 blog describes this sequence of installation:

Step 0: Prerequisites

If you want to try the examples yourself, you’ll need the following:

  1. Vagrant and Virtualbox
  2. Ansible
  3. An AWS Account. The AWS resources used in this example exceed the free tier, regular AWS fees apply. You have been warned!
  4. An EC2 key pair called dcos-intro
  5. Twitter access key

https://github.com/ftrossbach/intro-to-dcos

https://github.com/zutherb/terraform-dcos

36 people maintain 240 repositories in the Mesosphere GitHub account https://github.com/mesosphere

69 people maintain 70 repositories in the DC/OS (Datacenter Operating System) account: https://github.com/dcos https://dcos.io/ help@dcos.io

Many of the people are associated with both accounts.

https://github.com/dcos/dcos-launch Turn-key deployments of DC/OS on AWS (template and onprem), Azure, and GCE by Charles Provencher

https://github.com/dcos/dcos-launch/issues/187 In the README at https://github.com/dcos/dcos-launch, the “maws” link, which requests https://github.com/dcos/dcos-launch/blob/master, returns a 404. Do you mean https://github.com/mesosphere/aws-cli?

https://github.com/alejandroEsc/kraken Deploy a Kubernetes cluster using Terraform and Ansible on top of CoreOS.

Deployment

Mesos has its minimesos tool for testing and development. It is specially designed for continuous integration. To install it, run the following command on a machine with Docker installed:

curl -sSL https://minimesos.org/install | sh

Mesos is, as of this writing, available natively through Azure Container Service. See https://azuremarketplace.microsoft.com/en-us/marketplace/apps/mesosphere.dcos?tab=Overview

Installation on AWS & GCP is manual.

Monitoring

One of the selling points of Mesos is that monitoring configuration is included.

Enabling monitoring takes 3-20% of CPU. For example, this Marathon flag reports metrics to Graphite every 10 seconds:

--reporter_graphite tcp://localhost:2003?prefix=marathon-test&interval=10

Flame graphs can be generated to show CPU time by the functions called.

A Mesos Master, which knows about the computing resources available on each agent, makes resource offers to a Scheduler. The Scheduler accepts or declines each offer. For accepted offers, the Master sends the tasks to the applicable agents, which launch the executors/tasks.
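The offer cycle can be illustrated with a toy simulation. This is not the Mesos API: the class names, method names, and resource figures are all hypothetical, and real offers carry more resource types and can launch several tasks per offer. It only shows the accept-or-decline decision living in the scheduler while the master tracks what is free.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem: int  # MB


@dataclass
class Task:
    name: str
    cpus: float
    mem: int  # MB


class ToyScheduler:
    """Accepts an offer if some pending task fits in it, else declines."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offer(self, offer):
        # Return the task to launch on this offer, or None to decline.
        for task in self.pending:
            if task.cpus <= offer.cpus and task.mem <= offer.mem:
                self.pending.remove(task)
                return task
        return None


class ToyMaster:
    """Tracks free resources per agent and offers them to a scheduler."""

    def __init__(self, agents):
        self.agents = dict(agents)  # agent name -> (free cpus, free mem)
        self.launched = []          # (agent, task name) pairs

    def offer_round(self, scheduler):
        for agent, (cpus, mem) in list(self.agents.items()):
            task = scheduler.resource_offer(Offer(agent, cpus, mem))
            if task is None:
                continue  # declined: the resources stay free
            # Accepted: deduct resources and launch the task on that agent.
            self.agents[agent] = (cpus - task.cpus, mem - task.mem)
            self.launched.append((agent, task.name))


master = ToyMaster({"agent-1": (2.0, 4096), "agent-2": (8.0, 16384)})
scheduler = ToyScheduler([Task("spark-driver", 4.0, 8192), Task("web", 1.0, 1024)])
master.offer_round(scheduler)
print(master.launched)
```

Here the small agent’s offer is too small for the larger task, so the scheduler uses it for the smaller one and places the larger task on the second agent. Writing your own scheduler amounts to supplying your own version of that decision.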

https://medium.com/apache-mesos/performance-improvements-in-mesos-1-7-0-50c195033c5d

Social Media

https://dcos.io/

https://twitter.com/dcos

https://www.youtube.com/channel/UCUECX_bIZBgaw_rAaCoA39Q

https://medium.com/apache-mesos

MesosCon 2018 is held in New York City on November 5th-7th, 2018.

References

VIDEOS: MesosCon North America in Los Angeles on Sep 26, 2017 was keynoted by Ben Hindman, Co-Creator of Apache Mesos and Founder of Mesosphere. MesosCon was first held in 2014.

There are also MesosCon Europe and MesosCon Asia.

  58. Chaos Engineering