Elastic Efficient Execution of Varied Containers Sharma Podila Nov - - PowerPoint PPT Presentation


Elastic Efficient Execution of Varied Containers Sharma Podila Nov 7th 2016, QCon San Francisco In other words... How do we efficiently run heterogeneous workloads on an elastic pool of heterogeneous resources, with capacity guarantees?


slide-1
SLIDE 1

Elastic Efficient Execution of Varied Containers

Sharma Podila Nov 7th 2016, QCon San Francisco

slide-2
SLIDE 2

In other words...

How do we efficiently run heterogeneous workloads on an elastic pool of heterogeneous resources, with capacity guarantees?

slide-3
SLIDE 3

Topics

  • Containers, Mesos, Fenzo - where are we today?
  • Modeling an elastic Mesos cluster
  • Capacity guarantees for varied applications
  • Network resource and security groups
  • Ongoing and future work
slide-4
SLIDE 4

About Me

  • Software engineer

○ Resource scheduling, stream processing, distributed systems
○ Netflix Edge Engineering
○ Sun Microsystems + Oracle Corp.

  • Author of Fenzo scheduling library

https://github.com/Netflix/Fenzo

slide-5
SLIDE 5

Source: https://www.sandvine.com/news/global_broadband_trends.asp

81 Million subscribers worldwide and growing!

slide-6
SLIDE 6

Microservices architecture on AWS EC2

slide-7
SLIDE 7

Containers, Apache Mesos, Fenzo - where are we today?

slide-8
SLIDE 8

Reactive stream processing: Mantis

Zuul Cluster API Cluster

Mantis

Stream processing Cloud native service

  • Configurable message delivery guarantees
  • Heterogeneous workloads

○ Real-time dashboarding, alerting
○ Anomaly detection, metric generation
○ Interactive exploration of streaming data

Anomaly Detection

slide-9
SLIDE 9

Current Mantis usage

  • Peak of 1,800 EC2 instances

○ m3.2xlarge instances

  • Peak of 3,700 concurrent containers

○ Trough of 2,700 containers

  • Mix of perpetual and interactive exploratory jobs
  • Peak of 11 Million events / sec
slide-10
SLIDE 10

Container deployment: Titus

Diagram components: EC2 VPC; Titus Job Control; VMs running app containers and batch containers; Cloud Platform (metrics, IPC, health); Eureka; Edda; Atlas & Insight

slide-11
SLIDE 11

Current Titus usage

#Containers (tasks) for the week of 10/24 in one of the regions

  • Peak of ~1,800 instances

○ Mix of m4.4xl, r3.8xl, g2.8xl
○ ~800 instances at trough

  • Mix of batch, stream processing, and some microservices

slide-12
SLIDE 12

Core architectural components

AWS EC2 Apache Mesos Titus/Mantis Framework Fenzo

Fenzo at https://github.com/Netflix/Fenzo

Apache Mesos at http://mesos.apache.org/

slide-13
SLIDE 13

Jobs, tasks, instances, containers

A job can be one of three types: batch, service, or stream processing

A job has one or more tasks to run

An instance is equivalent to a task

A task runs one container

slide-14
SLIDE 14

A few common themes

Heterogeneous mix of jobs and resources

Resource            Task request      Agent sizes
CPU                 1 - 32 CPUs       8 - 32 CPUs
Memory              2 - 200+ GB       32 - 244 GB
Network bandwidth   10 - 2000 Mbps    1024 - 10240

Resource affinity based on task type

Task locality

slide-15
SLIDE 15

A few common themes

Large variation in peak to trough resource requirements

Mantis events/sec

11M 2M

Titus concurrent containers

1000s 10s

slide-16
SLIDE 16

Modeling an elastic Mesos cluster

Can we resize agent cluster based on demand?

slide-17
SLIDE 17

Task assignments in a cluster

Consider a cluster with 4-slot hosts

slide-18
SLIDE 18

“Random” assignments in a cluster

Legend: an EC2 instance with 4 slots; used slot; idle slot

Cluster starts random assignments of resources to tasks

slide-19
SLIDE 19

“Random” assignments in a cluster

Cluster starts to fill up...

slide-20
SLIDE 20

“Random” assignments in a cluster

Cluster somewhat full. But, only 1 agent can be terminated for scale down without losing jobs

About 50% utilized

slide-21
SLIDE 21

“Random” assignments in a cluster

Cluster is now full

100% utilized

slide-22
SLIDE 22

“Random” assignments in a cluster

Cluster partially used as jobs finish...

About 65% utilized

slide-23
SLIDE 23

“Random” assignments in a cluster

Cluster partially used, but, can’t terminate any instance without losing jobs

About 25% utilized

slide-24
SLIDE 24

Ideal assignments in a cluster

Cluster utilized to the same level as previous, but, can now terminate 9 of the 12 instances!

Similarly, 25% utilized

slide-25
SLIDE 25

Ideal assignments in a cluster

Cluster scaled down easily due to “bin packing”
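The difference between the "random" and "ideal" slides can be sketched in a few lines of Python. This is only an illustration under the slides' simplified model (12 hosts, 4 equal slots each); the function names are hypothetical and this is not Fenzo's API. The bin-packed variant always picks the fullest host that still has a free slot, so idle hosts stay completely empty and can be terminated.

```python
import random

SLOTS_PER_HOST = 4  # the slides assume 4-slot hosts


def place_random(num_hosts, num_tasks, rng):
    """Assign each task to a randomly chosen host that has a free slot."""
    hosts = [0] * num_hosts  # used slots per host
    for _ in range(num_tasks):
        candidates = [i for i, used in enumerate(hosts) if used < SLOTS_PER_HOST]
        hosts[rng.choice(candidates)] += 1
    return hosts


def place_bin_packed(num_hosts, num_tasks):
    """Assign each task to the fullest host that still has a free slot."""
    hosts = [0] * num_hosts
    for _ in range(num_tasks):
        candidates = [i for i, used in enumerate(hosts) if used < SLOTS_PER_HOST]
        best = max(candidates, key=lambda i: hosts[i])
        hosts[best] += 1
    return hosts


def idle_hosts(hosts):
    """Hosts with no tasks can be terminated without losing jobs."""
    return sum(1 for used in hosts if used == 0)


if __name__ == "__main__":
    # 12 tasks on 12 hosts is the 25% utilization case from the slides.
    print("random:    ", idle_hosts(place_random(12, 12, random.Random(0))), "idle hosts")
    print("bin packed:", idle_hosts(place_bin_packed(12, 12)), "idle hosts")
```

With 12 tasks on 12 hosts, the bin-packed placement fills 3 hosts and leaves 9 idle, which matches the "terminate 9 of the 12 instances" slide; random placement can never leave more than 9 idle and typically leaves far fewer.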

slide-26
SLIDE 26

EC2 ASG attributes for setting number of servers in cluster

EC2 AutoScalingGroups have three attributes to set

  • Min - minimum number of instances to have
  • Max - maximum number of instances
  • Desired - current number of instances to have

Fenzo sets the “Desired” count based on demand
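As a rough illustration of deriving a "Desired" count from demand, the sketch below sizes the group to cover used plus pending slots and clamps the result to the Min and Max attributes. The function and its parameters are hypothetical; the slides do not describe the rules Fenzo actually applies, so treat this as one plausible policy rather than the real one.

```python
import math


def desired_count(used_slots, pending_slots, slots_per_host, asg_min, asg_max):
    """Hosts needed for current plus pending demand, clamped to [asg_min, asg_max]."""
    needed_hosts = math.ceil((used_slots + pending_slots) / slots_per_host)
    return max(asg_min, min(asg_max, needed_hosts))


if __name__ == "__main__":
    # 12 slots in use, 5 tasks waiting, 4-slot hosts, ASG bounded to 2..10 hosts.
    print(desired_count(12, 5, 4, asg_min=2, asg_max=10))
```

Clamping keeps the scheduler from shrinking below the floor when the cluster is empty or growing past the ceiling during a demand spike.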

slide-27
SLIDE 27

EC2 AutoScalingGroup for Mesos agents Min Desired Max

slide-28
SLIDE 28

Min Desired Max EC2 AutoScalingGroup for Mesos agents

slide-29
SLIDE 29

Min Desired Max EC2 AutoScalingGroup for Mesos agents

slide-30
SLIDE 30

Using multiple instance types

slide-31
SLIDE 31

Amazon EC2 provides a variety of servers a.k.a “instance types”

https://aws.amazon.com/ec2/instance-types/

Algorithm model training jobs run well on memory optimized instances of R3 type

Typical services run well on balanced compute instances of M4 type

Using multiple instance types

slide-32
SLIDE 32

How do we use multiple EC2 instance types in the same Mesos agent cluster?

Using multiple instance types

slide-33
SLIDE 33

Using multiple EC2 instance types

m4.4xlarge agent ASG r3.8xlarge agent ASG Titus

Grouping agents by instance type lets us autoscale them independently

slide-34
SLIDE 34

Using multiple EC2 instance types

m4.4xlarge agent ASG r3.8xlarge agent ASG Titus

User job: 2 CPUs, 5GB memory
User job: 8 CPUs, 8GB memory
User job: 1 CPU, 1GB memory

slide-35
SLIDE 35

Continuous deployment of agents

slide-36
SLIDE 36

Continuous deployment of agents

m4.4xlarge agent ASG v1

A new version of agent introduces a new ASG

slide-37
SLIDE 37

Continuous deployment of agents

m4.4xlarge agent ASG v1 m4.4xlarge agent ASG v2

A new version of agent introduces a new ASG

slide-38
SLIDE 38

Continuous deployment of agents

m4.4xlarge agent ASG v1 m4.4xlarge agent ASG v2

Disable

A new version of agent introduces a new ASG

slide-39
SLIDE 39

Continuous deployment of agents

m4.4xlarge agent ASG v1 m4.4xlarge agent ASG v2

Disable

Migrate tasks

A new version of agent introduces a new ASG

slide-40
SLIDE 40

Continuous deployment of agents

m4.4xlarge agent ASG v1 m4.4xlarge agent ASG v2

Disable

A new version of agent introduces a new ASG

slide-41
SLIDE 41

Continuous deployment of agents

m4.4xlarge agent ASG v2

Old agent ASG removed

A new version of agent introduces a new ASG

slide-42
SLIDE 42

Bringing it all together...

m4.4xlarge agent ASG r3.8xlarge agent ASG Titus v2 v1 v2 v1

slide-43
SLIDE 43

Capacity guarantees for varied applications

slide-44
SLIDE 44

The capacity guarantee challenge

Demand for resources > Supply

slide-45
SLIDE 45

An execution sample from a cluster

Chart series: new batch of tasks, running #tasks, tasks launched

slide-46
SLIDE 46

An execution sample from a cluster

Chart series: new batch of tasks, running #tasks, tasks launched

Waiting for agents to free up… Or, for new agents from scale up

slide-47
SLIDE 47

An execution sample from a cluster

Chart series: new batch of tasks, running #tasks, tasks launched

Scale up and freed agents satisfy all new pending tasks

slide-48
SLIDE 48

An execution sample from a cluster

Chart series: new batch of tasks, running #tasks, tasks launched

What if a service was launched at this time?

Waiting for agents to free up… Or, new agents from scale up

slide-49
SLIDE 49

Capacity guarantees

Guarantee capacity for timely job starts

Mesos support for quotas, etc. evolving

^ Agreed upon
slide-50
SLIDE 50

Capacity guarantees

Guarantee capacity for timely job starts

Mesos support for quotas, etc. evolving

^ Agreed upon

Generally, optimize throughput for batch jobs and start latency for service jobs

slide-51
SLIDE 51

Capacity guarantees

Some service style jobs may be less important

Categorize by expected behavior instead

slide-52
SLIDE 52

Capacity guarantees

Some service style jobs may be less important

Categorize by expected behavior instead

Critical versus Flex (flexible) scheduling requirements

slide-53
SLIDE 53

Capacity guarantees

Critical Flex

Quotas

slide-54
SLIDE 54

Capacity guarantees

Quotas vs. Priorities

Quotas: Critical, Flex
Priorities: Critical, Flex (in resource allocation order)

slide-55
SLIDE 55

Capacity guarantees: hybrid view

Critical: AppC1, AppC2, AppC3, AppCN
Flex: AppF1, AppF2, AppF3, AppFN

Resource Allocation Order

slide-56
SLIDE 56

Capacity guarantees via Fenzo

Fenzo supports multi-tiered task queues

Multiple “buckets” per tier with “fair sharing” by dominant resource usage

Tier 0 Tier 1
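One way to picture tiers with fair-shared buckets is the sketch below. It is a minimal illustration and not Fenzo's actual queue API: the `Bucket` class, the `next_task` function, and the cluster totals are all assumptions made for the example. Lower-numbered tiers are drained first, and within a tier the bucket with the smallest dominant resource share goes next.

```python
from collections import deque

# Hypothetical cluster totals used to normalize each bucket's usage.
CLUSTER_TOTAL = {"cpus": 100.0, "memory_gb": 400.0}


def dominant_share(usage, total=CLUSTER_TOTAL):
    """Largest fraction of any single resource that this usage represents."""
    return max(usage[r] / total[r] for r in total)


class Bucket:
    """A queue of pending tasks plus the resources its running tasks hold."""

    def __init__(self, name):
        self.name = name
        self.usage = {r: 0.0 for r in CLUSTER_TOTAL}
        self.pending = deque()


def next_task(tiers):
    """tiers[0] is served first; within a tier, lowest dominant share wins."""
    for buckets in tiers:
        ready = [b for b in buckets if b.pending]
        if ready:
            bucket = min(ready, key=lambda b: dominant_share(b.usage))
            task = bucket.pending.popleft()
            for resource, amount in task["resources"].items():
                bucket.usage[resource] += amount
            return bucket.name, task["id"]
    return None  # nothing pending in any tier
```

For example, if bucket A in Tier 0 already holds half of the cluster's CPUs and bucket B holds a tenth of its memory, B's pending task is picked before A's, and Tier 1 is only considered once Tier 0 has nothing pending.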

slide-57
SLIDE 57

Translating application capacity to EC2 instances

  • Define per application capacity guarantees
  • Define per tier capacity guarantees
  • Translate to number of EC2 instances
slide-58
SLIDE 58

Defining application capacity

App1-cap = num_app_instances * app_instance_dimensions

app_instance_dimensions: { #cpus, memory, disk, network }

Agnostic to EC2 instance types
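The application capacity formula multiplies a resource vector by an instance count. A minimal sketch, assuming hypothetical field names and units for the four dimensions listed on the slide:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimensions:
    """Per-instance resource needs; field names and units are illustrative."""
    cpus: float
    memory_gb: float
    disk_gb: float
    network_mbps: float

    def scaled(self, factor):
        return Dimensions(
            self.cpus * factor,
            self.memory_gb * factor,
            self.disk_gb * factor,
            self.network_mbps * factor,
        )


def app_capacity(num_app_instances, app_instance_dimensions):
    """App-cap = num_app_instances * app_instance_dimensions."""
    return app_instance_dimensions.scaled(num_app_instances)


if __name__ == "__main__":
    # 10 instances of an app that needs 2 CPUs, 5 GB memory, 20 GB disk, 100 Mbps each.
    print(app_capacity(10, Dimensions(2, 5, 20, 100)))
```

Nothing in the result refers to an EC2 instance type, which is what keeps the guarantee independent of the hardware that eventually backs it.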

slide-59
SLIDE 59

Defining application capacity

Applications specify resource needs, not EC2 instance types

  • Can manage capacity guarantees using a variety of instance types
  • Eases migration to new instance types, thereby helps capacity procurement teams

slide-60
SLIDE 60

Tier Capacity = SUM (App1-cap + App2-cap + … + AppN-cap) + BUFFER

BUFFER:

  • Accommodate some new or ad hoc jobs with no guarantees
  • Red-black pushes of services temporarily double capacity

Defining Tier capacity
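The tier capacity formula is a per-resource sum of the application capacities plus the buffer. A small sketch, assuming capacities are plain dictionaries keyed by hypothetical resource names:

```python
RESOURCES = ("cpus", "memory_gb", "disk_gb", "network_mbps")  # illustrative names


def tier_capacity(app_caps, buffer):
    """Tier capacity = sum of per-app capacities + buffer, resource by resource."""
    total = {r: buffer.get(r, 0) for r in RESOURCES}
    for cap in app_caps:
        for r in RESOURCES:
            total[r] += cap.get(r, 0)
    return total


if __name__ == "__main__":
    apps = [
        {"cpus": 20, "memory_gb": 50, "disk_gb": 200, "network_mbps": 1000},
        {"cpus": 64, "memory_gb": 256, "disk_gb": 100, "network_mbps": 4000},
    ]
    # Buffer sized by hand here; the slides do not say how it is chosen.
    headroom = {"cpus": 16, "memory_gb": 64, "disk_gb": 50, "network_mbps": 1000}
    print(tier_capacity(apps, headroom))
```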

slide-61
SLIDE 61

Translate to number of instances

#EC2_instances = Tier_capacity / EC2_instance_dimensions

A tier may use multiple instance types

Critical = { m4.4xlarge, m3.2xlarge }
Flex = { r3.8xlarge, g2.8xlarge }
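Because both Tier_capacity and EC2_instance_dimensions are resource vectors, the division has to be interpreted somehow. The sketch below assumes the simplest reading: divide each resource separately, round up, and take the largest result so that the scarcest resource is covered. That interpretation, the function name, and the sample numbers are assumptions for illustration only.

```python
import math


def instances_needed(tier_capacity, instance_dimensions):
    """Smallest instance count that covers the tier on every listed resource."""
    return max(
        math.ceil(tier_capacity[r] / instance_dimensions[r])
        for r in instance_dimensions
    )


if __name__ == "__main__":
    tier = {"cpus": 100, "memory_gb": 300}
    # Illustrative per-instance size: 16 CPUs and 64 GB of memory.
    instance = {"cpus": 16, "memory_gb": 64}
    print(instances_needed(tier, instance))
```

In this sample the CPU requirement drives the count (7 instances), even though memory alone would fit on 5. A tier backed by several instance types would repeat the calculation per type for the share of capacity assigned to it.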

slide-62
SLIDE 62

Network resource and security groups

slide-63
SLIDE 63

Container executor

Augment missing pieces:

  • IP per container
  • Security - Security Groups, IAM roles
  • Isolation for networking bandwidth, disk I/O

MULTI-TENANT

slide-64
SLIDE 64

Elastic Network Interfaces (ENI)

AWS EC2 Instance
ENI0: IP0, IP1, IP2, IP3
ENI1: IP4, IP5, IP6, IP7
ENI2: IP8, IP9, IP10, IP11

  • Each EC2 instance in VPC has 2 or more ENIs
  • Each ENI can have 2 or more IPs
  • Security Groups are set on the ENI

slide-65
SLIDE 65

ENI+IP resource allocation model

A two level resource modeled in Fenzo

Each agent reports #ENIs and #IPs per ENI via custom attribute

Fenzo does allocation and usage tracking

ENI 1: Assigned Security Group: SG1; Used IPs Count: 2 of 7
ENI 2: Assigned Security Group: SG1,SG2; Used IPs Count: 1 of 7
ENI 3: Assigned Security Group: SG3; Used IPs Count: 7 of 7
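The two-level model (pick an ENI, then an IP on it) can be sketched as follows. This is an illustration only, not Fenzo's implementation: the `Eni` class and `assign_ip` function are hypothetical, and the matching rule (a task may share an ENI only when its security group set is exactly the ENI's) is an assumption. The sample state mirrors the three ENIs shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Eni:
    name: str
    max_ips: int
    security_groups: frozenset = frozenset()
    used_ips: int = 0


def assign_ip(enis, task_security_groups):
    """Return the name of the ENI that receives the task's IP, or None."""
    wanted = frozenset(task_security_groups)
    # Level 1a: reuse an ENI that already carries exactly these security groups.
    for eni in enis:
        if eni.security_groups == wanted and eni.used_ips < eni.max_ips:
            eni.used_ips += 1
            return eni.name
    # Level 1b: otherwise claim an ENI with no IPs in use and set its groups.
    for eni in enis:
        if eni.used_ips == 0:
            eni.security_groups = wanted
            eni.used_ips = 1
            return eni.name
    return None  # this agent cannot host the task


if __name__ == "__main__":
    agent = [
        Eni("ENI 1", 7, frozenset({"SG1"}), 2),
        Eni("ENI 2", 7, frozenset({"SG1", "SG2"}), 1),
        Eni("ENI 3", 7, frozenset({"SG3"}), 7),
    ]
    print(assign_ip(agent, {"SG1"}))  # shares ENI 1
    print(assign_ip(agent, {"SG3"}))  # ENI 3 is full and no ENI is free
```

A task requesting SG3 is rejected on this agent because ENI 3 has used all 7 IPs and the other ENIs are bound to different security groups, so the scheduler would need to look at another agent.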

slide-66
SLIDE 66

Plumbing VPC Networking into Docker

Diagram: a Titus EC2 host VM running Task 0 through Task 3. Host interfaces: eth1/ENI1 (SecGrp=A), eth2/ENI2 (SecGrp=X), eth3/ENI3 (SecGrp=Y,Z), carrying IP 1, IP 2, IP 3. Task labels include "No IP, SecGrp A", "SecGrp X", and "SecGrp Y,Z". Each task's pod root and app attach to the host through veth<id>. The host applies Linux Policy Based Routing + Traffic Control. The Titus EC2 Metadata Proxy is reached at 169.254.169.254 (non-routable IP) through IPTables NAT.

slide-67
SLIDE 67

Network bandwidth isolation

Each container gets an IP on one of the ENIs

Linux tc policies used on virtual Ethernet

For both incoming and outgoing traffic

Bandwidth limited to the requested value

No borrowing of unused bandwidth

Easy to reason about

slide-68
SLIDE 68

Ongoing and future work

slide-69
SLIDE 69

Current and future work

  • Fine grain capacity guarantees

○ Hierarchical sharing policies
○ Preemptions to satisfy priority tiers and sharing policies

  • Execution environment security hardening
  • Onboarding new applications
  • Looking forward to working with the community

slide-70
SLIDE 70

In Summary...

slide-71
SLIDE 71

In summary...

Mesos and Fenzo help us run lots of containers

  • In an elastic fashion
  • With guaranteed capacity for varied applications
  • Custom AWS integration gives us network resource isolation and security groups

slide-72
SLIDE 72

Questions?

Elastic Efficient Execution of Varied Containers

Sharma Podila
spodila@netflix.com
@podila
linkedin.com/in/spodila