TO MILLIONS OF SUMMONERS: SCOTT DELAP, SCALABILITY ARCHITECT, GDC 2012


slide-1
SLIDE 1

TO MILLIONS OF SUMMONERS

SCOTT DELAP

SCALABILITY ARCHITECT GDC 2012

slide-2
SLIDE 2

ABOUT ME – SCOTT DELAP Scalability Architect Joined Riot in 2008 About a year before beta @scottdelap sdelap@riotgames.com

slide-3
SLIDE 3

ABOUT RIOT GAMES

500+ EMPLOYEES

OFFICES IN SANTA MONICA, ST. LOUIS, DUBLIN, SEOUL

FOUNDED SEPT. 2006

slide-4
SLIDE 4

OUR MISSION

TO BE THE MOST PLAYER-FOCUSED GAME COMPANY IN THE WORLD.

slide-5
SLIDE 5
slide-6
SLIDE 6

LEAGUE OF LEGENDS: INTRO

July 2011: 15 MIL REGISTERED, 4 MIL MONTHLY, 1.4 MIL DAILY, 0.5 MIL PEAK CCU, 3.7 MIL DAILY HRS

November 2011: 32.5 MIL REGISTERED, 11.5 MIL MONTHLY, 4.2 MIL DAILY, 1.3 MIL PEAK CCU, 10.5 MIL DAILY HRS

slide-7
SLIDE 7

A UNIQUE SCALING CHALLENGE

Social elements require uniform access

Crafting an enjoyable user experience

GAME FEATURES DO NOT ALWAYS SUPPORT TRADITIONAL DECOMPOSITION

slide-8
SLIDE 8

MEETS THESE NEEDS?

slide-9
SLIDE 9

AGENDA

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-10
SLIDE 10

PROBLEM #1: HOW DO WE DEVELOP A SYSTEM RAPIDLY WHILE PLANNING FOR FUTURE CAPACITY NEEDS?

slide-11
SLIDE 11

LEAGUE OF LEGENDS: TECH OVERVIEW

CLIENT EXPERIENCE

PvP.net: Adobe Air, Flex
Game Client: C, DirectX

SERVER SIDE STACK

Apache Tomcat, Spring, ActiveMQ, Coherence, Hibernate, MySQL
PHP, Cake, MySQL
Game Servers (x4)

slide-12
SLIDE 12

TODAY’S FOCUS

CLIENT EXPERIENCE

PvP.net: Adobe Air, Flex
Game Client: C, DirectX

SERVER SIDE STACK

Apache Tomcat, Spring, ActiveMQ, Coherence, Hibernate, MySQL
PHP, Cake, MySQL
Game Servers (x4)

slide-13
SLIDE 13

A TECH STACK WITH NEW AND OLD ELEMENTS

[Diagram: MySQL; Apache Tomcat + Spring (x3); Coherence + Hibernate (x4)]

slide-14
SLIDE 14

BENEFITS OF TRADITIONAL JAVA

MATURE OPEN SOURCE ECOSYSTEM
ESTABLISHED TOOLS
LARGE POOL OF TALENTED DEVELOPERS

slide-15
SLIDE 15

ACCELERATING THE FOUNDATION WITH NoSQL

NoSQL SOLUTION: ORACLE COHERENCE

DATA STORED IN CACHES BY KEY
NUMEROUS USES
PROVIDES ELASTICITY

slide-16
SLIDE 16

NoSQL ENABLING RAPID GROWTH

1. Horizontal scaling of Coherence greatly simplified absorbing CCU growth over time

2. Design patterns enforced by Coherence promoted feature level scaling as well

slide-17
SLIDE 17

CACHING IN DETAIL

SHARDING LOGIC AT APPLICATION LEVEL

[Diagram: COHERENCE, DAO, HIBERNATE, MySQL]

slide-18
SLIDE 18

EMBRACING CACHE ADVANTAGES

[Diagram: COHERENCE, DAO, HIBERNATE, MySQL]
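The slides place a key-addressed cache next to the DAO, Hibernate and MySQL layers but do not show code. A minimal sketch of one way such a layering can work is a read-through DAO: look in the cache by key first, and fall back to the slower store only on a miss. Everything named here (`CachingDaoSketch`, `loader`, `loadCount`) is hypothetical, and a `ConcurrentHashMap` stands in for the distributed cache so the example runs on a plain JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through DAO; not Riot's code and not the Coherence API.
class CachingDaoSketch<K, V> {
    // Stand-in for a distributed cache keyed by entity id.
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Stand-in for the Hibernate/MySQL lookup behind the cache.
    private final Function<K, V> loader;
    // Counts how often the backing store was actually consulted.
    private final AtomicInteger loads = new AtomicInteger();

    CachingDaoSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it from the backing store on a miss. */
    V find(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }
}
```

Calling `find` twice with the same key runs the loader once; the second read is served from the cache.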

slide-20
SLIDE 20

LEVERAGING ADVANTAGES

GRID COMPUTING

TRANSPARENT PARTITIONING
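Transparent partitioning means the caller only supplies a key, and the data grid decides which node owns it. The slides do not describe how Coherence assigns partitions, so the sketch below only illustrates the general idea under an assumed scheme: hash the key to a fixed partition number, then map the partition to a node. `PartitionSketch`, `partitionFor` and `ownerOf` are hypothetical names.

```java
import java.util.List;

// Illustrative only: a simple hash-to-partition-to-node mapping.
class PartitionSketch {
    /** Maps a key to a partition in [0, partitionCount). */
    static int partitionFor(Object key, int partitionCount) {
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Picks the node that owns the key's partition (assumed round-robin assignment). */
    static String ownerOf(Object key, int partitionCount, List<String> nodes) {
        return nodes.get(partitionFor(key, partitionCount) % nodes.size());
    }
}
```

Because callers only ever pass a key, nodes can be added and partitions reassigned without changing application code, which is the property the slide highlights.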

slide-21
SLIDE 21

AGENDA

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-22
SLIDE 22

PROBLEM #2: HOW DO WE QUICKLY DEVELOP NEW FEATURES WHILE LIMITING BUGS?

slide-23
SLIDE 23

SIMPLE IS BEST

JAVA, MEMORY, NETWORK: FAST

MODERN CPU: 3 BILLION INSTRUCTIONS/SECOND

slide-24
SLIDE 24

Complexity is the enemy of quality

DON’T OVER-DESIGN

slide-25
SLIDE 25

RIG THE GAME

Divide inputs of algorithm, then parallel process

Continually coordinate

slide-26
SLIDE 26

RIG THE GAME

[Diagram: THREAD 1 and THREAD 2; Data blocks; Work blocks interleaved with Coordination]

slide-27
SLIDE 27

RIG THE GAME

[Diagram: Data blocks; THREAD 1 and THREAD 2]

slide-28
SLIDE 28

RIG THE GAME

[Diagram: THREAD 1 and THREAD 2, each with its own Data and Work blocks]
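The "divide inputs, then parallel process" option from the RIG THE GAME slides can be sketched as follows. The input is split into disjoint ranges before any work starts, each task touches only its own range, and results are combined once at the end, so no coordination happens while the work runs. The workload (sum of squares) and the name `PartitionedSumSketch` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example of up-front input partitioning.
class PartitionedSumSketch {
    static long sumOfSquares(int[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                // Each task reads only its own slice and keeps a private total.
                Callable<Long> task = () -> {
                    long local = 0;
                    for (int i = from; i < to; i++) {
                        local += (long) data[i] * data[i];
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // the only synchronization point
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The alternative on the slide, continual coordination, would correspond to every task updating a shared counter or lock-protected structure on each step.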

slide-29
SLIDE 29

AGENDA

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-30
SLIDE 30

PROBLEM #3: HOW DO WE HANDLE NOT JUST MONTHLY CHANGE BUT HOURLY CHANGE?

slide-31
SLIDE 31

CODE A DYNAMIC SYSTEM

LARGE SYSTEM CHANGES AS IT’S RUNNING

HARDWARE FAILURES

slide-32
SLIDE 32

CODE A DYNAMIC SYSTEM

LARGE SYSTEM CHANGES AS IT’S RUNNING

HARDWARE FAILURES

FIX? Next release? During downtime?

slide-34
SLIDE 34

CODE A DYNAMIC SYSTEM

Dynamic Cluster Recomposition Stateless Growth Patterns

TECHNOLOGIES W/ ELASTIC PROPERTIES

NOT EVERY PIECE OF YOUR STACK HAS TO BE ELASTIC

slide-35
SLIDE 35

CODE A DYNAMIC SYSTEM

1. All relevant configuration properties are dynamic

2. Coherence near caches used to propagate changes to nodes dynamically

3. Algorithms written so they are aware their variables may change while running
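Point 3 above, algorithms that expect their variables to change while running, can be illustrated with a loop that re-reads its setting on every iteration instead of copying it once at startup. This is a sketch under assumptions: `batchSize` is an invented property, and an `AtomicInteger` stands in for whatever mechanism delivers the new value (the slide names Coherence near caches for that role).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical worker whose batch size can be changed while it runs.
class DynamicConfigSketch {
    private final AtomicInteger batchSize; // invented property name

    DynamicConfigSketch(int initialBatchSize) {
        this.batchSize = new AtomicInteger(initialBatchSize);
    }

    /** Called from elsewhere (e.g. an operator tool) to change the setting live. */
    void updateBatchSize(int newSize) {
        if (newSize < 1) {
            throw new IllegalArgumentException("batch size must be positive");
        }
        batchSize.set(newSize);
    }

    /**
     * Drains `total` items and returns the size of each batch processed.
     * `betweenBatches` simulates configuration changing mid-run.
     */
    List<Integer> drain(int total, Runnable betweenBatches) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = total;
        while (remaining > 0) {
            // Re-read the property every iteration; never cache it in a local for the whole run.
            int size = Math.min(batchSize.get(), remaining);
            sizes.add(size);
            remaining -= size;
            betweenBatches.run();
        }
        return sizes;
    }
}
```

If the setting goes from 2 to 4 after the first batch, a run over 10 items processes batches of 2, 4 and 4 without a restart.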

slide-36
SLIDE 36

LARGER EXAMPLES OF DYNAMIC BEHAVIOR

Hotfixes require less downtime

Features can be deployed in advance of release windows

Entire machine/feature combinations can be deployed & updated

THREAD POOLS = DYNAMICALLY CONFIGURABLE
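The JDK's `ThreadPoolExecutor` already allows its size to be changed while tasks are running, which is one way a dynamically configurable thread pool can be built. The talk does not say how Riot's pools were implemented, so treat this as a sketch; `ResizablePoolSketch`, `newPool` and `resize` are hypothetical helpers around the real `setCorePoolSize` and `setMaximumPoolSize` methods.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helpers around java.util.concurrent.ThreadPoolExecutor.
class ResizablePoolSketch {
    static ThreadPoolExecutor newPool(int size) {
        return new ThreadPoolExecutor(
                size, size, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    /** Changes the pool size at runtime without recreating the pool. */
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            // Growing: raise the maximum first so core never exceeds it.
            pool.setMaximumPoolSize(newSize);
            pool.setCorePoolSize(newSize);
        } else {
            // Shrinking: lower the core first so the maximum never drops below it.
            pool.setCorePoolSize(newSize);
            pool.setMaximumPoolSize(newSize);
        }
    }
}
```

The order of the two setter calls matters because the executor rejects a configuration in which the core size exceeds the maximum.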

slide-37
SLIDE 37

AGENDA

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-38
SLIDE 38

PROBLEM #4: WHAT HAPPENS WHEN WE FOLLOW ALL THE RULES AND STILL RUN INTO ISSUES?

slide-39
SLIDE 39

SCALING BEST PRACTICES HAVE CONSEQUENCES

1. Scaling is hard

2. Let’s get rid of some things so we can do this more easily

3. What do we get rid of? I can’t decide…

4. Plan B…instead of what you can’t do, I’ll tell you what you can

5. Follow these X rules and everything will be fine

slide-40
SLIDE 40

SCALING BEST PRACTICES HAVE CONSEQUENCES

MAP REDUCE: If all problems can be written with a map step and a reduce step…

NoSQL: I’m taking away your joins…

CAP: Pick two…

slide-41
SLIDE 41

CONSEQUENCES

ATOMIC OPERATIONS OFTEN BECOME SCOPED BY ENTRY VALUES AND ROOT OBJECTS

[Diagram: Blog Entry]

slide-42
SLIDE 42

CONSEQUENCES

ATOMIC OPERATIONS OFTEN BECOME SCOPED BY ENTRY VALUES AND ROOT OBJECTS

[Diagram: Blog Entry, COMMENT]
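The blog entry and comment illustration can be read as follows: when the store only guarantees atomicity per entry, a comment has to live inside the blog entry's value if adding it must be atomic. The sketch below shows that scoping with a plain `ConcurrentHashMap`, whose `compute` method is atomic for a single key. `BlogStoreSketch` and its methods are hypothetical, and this is not the Coherence API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical store: the blog entry is the root object, comments are its children.
class BlogStoreSketch {
    // Key: blog entry id. Value: the entry's comments, replaced as a whole on each update.
    private final ConcurrentMap<String, List<String>> commentsByEntry =
            new ConcurrentHashMap<>();

    /** Atomic with respect to one blog entry; nothing spans two entries. */
    void addComment(String entryId, String comment) {
        commentsByEntry.compute(entryId, (id, existing) -> {
            List<String> updated = new ArrayList<>();
            if (existing != null) {
                updated.addAll(existing);
            }
            updated.add(comment);
            return Collections.unmodifiableList(updated);
        });
    }

    List<String> comments(String entryId) {
        return commentsByEntry.getOrDefault(entryId, Collections.emptyList());
    }
}
```

The cost of this design is that the root value grows with every child, which is the mismatch the following slides explore.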

slide-44
SLIDE 44

AN EXAMPLE OF A MISMATCH

SERVER

ROOT OBJECT

AS GAMES ARE ALLOCATED, CHILD OBJECTS ARE ADDED

slide-45
SLIDE 45

AN EXAMPLE OF A MISMATCH

COMPLEXITY OF CHILD OBJECTS

GAMES PER SERVER

slide-46
SLIDE 46

ROOT OBJECTS AND CHILD OBJECTS

[Diagram: MACHINE containing three Game Instance objects, each with Name, Players, State]

slide-47
SLIDE 47

EVOLUTION OF AN ANTI-PATTERN

[Diagram: MACHINE with six Child Objects of 2-50k each; sizes labeled <20k and >500k]

BOUNDING FACTORS

NETWORK TRANSFER

FAST OBJECT SERIALIZATION

slide-48
SLIDE 48

THE PIPE IS FULL

[Diagram: five MACHINE objects, each containing three Game Instance objects]

slide-49
SLIDE 49

DO WE REALLY HAVE ONE OBJECT?

[Diagram: Game Instance (Name, Players); MACHINE containing three Game Instance State objects]

slide-50
SLIDE 50

SMALLER IS BETTER!

[Diagram: five MACHINE objects, each containing three Game Instance State objects]
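To make "smaller is better" concrete, the sketch below compares the serialized size of a root object that embeds every game on a machine with the size of a single game's state. It uses standard Java serialization and invented data (`game-N` ids, an `IN_PROGRESS` state string), so the byte counts are only indicative, not the figures from the talk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Illustrative measurement; data layout and names are assumptions.
class EntrySizeSketch {
    /** Bytes produced by standard Java serialization of one cache value. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Root-object layout: one value per machine that embeds every game on it. */
    static HashMap<String, String> machineWithGames(int games) {
        HashMap<String, String> root = new HashMap<>();
        for (int i = 0; i < games; i++) {
            root.put("game-" + i, "IN_PROGRESS"); // hypothetical id and state
        }
        return root;
    }
}
```

Changing one game's state in the root-object layout means moving the whole machine value across the network, while a per-game entry moves only a few bytes. Comparing `serializedSize(machineWithGames(200))` with `serializedSize("IN_PROGRESS")` shows the gap.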
slide-51
SLIDE 51

AGENDA

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-52
SLIDE 52

PROBLEM #5: HOW DO WE KNOW WHEN WE HAVE A PROBLEM?

slide-53
SLIDE 53

MONITOR EVERYTHING

LOGS WITH MILLIONS OF OPERATIONS/DAY VS. [image]

slide-55
SLIDE 55

MONITOR EVERYTHING

WHAT HAPPENED HERE?

Networking issue!

slide-56
SLIDE 56

MONITOR EVERYTHING

1. Automate metrics gathering

2. Spring performance monitoring interceptor

3. Log out call stack on external calls

4. Sample internal calls

5. Automate reporting

6. Trivial cost vs. benefit
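Automated metrics gathering in the spirit of the list above can be sketched without touching business code by wrapping a service in a timing proxy. Spring ships an interceptor for this purpose (`PerformanceMonitorInterceptor`); the example below is not that class but a stand-in built on a JDK dynamic proxy so it runs without Spring. `TimingProxySketch`, `monitor`, `calls` and `totalNanos` are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for an AOP timing interceptor.
class TimingProxySketch {
    private final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> elapsedNanos = new ConcurrentHashMap<>();

    /** Wraps `target` so every call through `iface` is counted and timed. */
    @SuppressWarnings("unchecked")
    <T> T monitor(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    long start = System.nanoTime();
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the service's own exception
                    } finally {
                        String name = iface.getSimpleName() + "." + method.getName();
                        callCounts.computeIfAbsent(name, k -> new LongAdder()).increment();
                        elapsedNanos.computeIfAbsent(name, k -> new LongAdder())
                                .add(System.nanoTime() - start);
                    }
                });
    }

    long calls(String name) {
        LongAdder count = callCounts.get(name);
        return count == null ? 0 : count.sum();
    }

    long totalNanos(String name) {
        LongAdder nanos = elapsedNanos.get(name);
        return nanos == null ? 0 : nanos.sum();
    }
}
```

Wrapping, for example, a `Supplier` and calling `get()` once records one call under the name `Supplier.get`. A reporting job could periodically dump these counters, which covers the "automate reporting" step.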

slide-57
SLIDE 57

MONITOR EVERYTHING

…LET’S GREP THE RED ITEMS…

DATA IS USELESS WITHOUT AN EASY WAY TO VIEW IT.

slide-58
SLIDE 58

MONITOR EVERYTHING

AUTOMATE NEXT 5 QUESTIONS/ANSWERS

(Why should they be manual?)

slide-59
SLIDE 59

RECAP

EMBRACING JAVA AND NoSQL
SIMPLE IS BEST
CODE A DYNAMIC SYSTEM
SCALING BEST PRACTICES
MONITOR EVERYTHING

slide-60
SLIDE 60

SCOTT DELAP

SCALABILITY ARCHITECT sdelap@riotgames.com GDC 2012

www.riotgames.com/careers

(We’re also in the Career Pavilion at booth #CP1813)