Faking a Failover: Over the Top With Samba Clusters



slide-1
SLIDE 1

Faking a Failover

Over the Top With Samba Clusters

Christopher R. Hertel Samba Team May 2017

slide-2
SLIDE 2

Introductions

slide-3
SLIDE 3

Introductorationaryesquenessesism

Me:

  • Samba Team Elder
  • SMB Wizard with:

The opinions expressed are my own and not necessarily those of my employer, my spouse, my spirit familiar, the Internet, or the monster in the closet.

slide-4
SLIDE 4

Introductorationaryesquenessesism

Mantra

In theory, theory and practice are the same. In practice, they're not.

slide-5
SLIDE 5

Quick Review

Samba, CTDB, and Clusters

slide-6
SLIDE 6

Review: Samba

A Big Giant Semantics Engine

  • Locking and Sharing
  • Access Controls
  • File Attributes
  • File Names
  • Weird Behaviors

Samba has to keep track of a lot of STATE.

slide-7
SLIDE 7

Review: Samba TDBs

Using TDBs:

  • Shared Access to State Information
  • Atomicity / Consistency
  • Resilience (survives reboot)

TDBs, generally, form Samba's state machine.

slide-8
SLIDE 8

Review: Samba & CTDB

Using CTDB:

  • Volatile State
      ○ Changes rapidly
      ○ May be safely lost when a server node is lost
  • Persistent State
      ○ Less dynamic
      ○ Must be consistent

Provides a distributed state machine.
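
The split between volatile and persistent state can be illustrated with a toy model. This is only a sketch under loud assumptions: `Node`, `ToyCluster`, and every method name are hypothetical, and the model does not reflect how CTDB actually locates, migrates, or recovers records. It only shows the contract from the slide: volatile records may vanish with the node that held them, while persistent records must stay consistent across the surviving nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One cluster node in a toy model (not a CTDB data structure)."""
    name: str
    volatile: dict = field(default_factory=dict)    # fast-changing, held locally
    persistent: dict = field(default_factory=dict)  # local replica of cluster-wide state


class ToyCluster:
    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}

    def put_volatile(self, node, key, value):
        # Assumption: a volatile record lives only on the node that wrote it.
        self.nodes[node].volatile[key] = value

    def put_persistent(self, key, value):
        # Assumption: a persistent record is written to every live node.
        for node in self.nodes.values():
            node.persistent[key] = value

    def fail(self, node):
        # Losing a node discards whatever volatile records it held.
        del self.nodes[node]

    def get_volatile(self, key):
        for node in self.nodes.values():
            if key in node.volatile:
                return node.volatile[key]
        return None

    def get_persistent(self, key):
        # Every surviving replica holds the same data, so ask the first one.
        for node in self.nodes.values():
            return node.persistent.get(key)
        return None
```

In this model, a byte-range lock recorded with `put_volatile` on node "a" is gone after `fail("a")`, whereas a value stored with `put_persistent` is still readable from node "b".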

slide-9
SLIDE 9

Review: Samba Clusters

Samba Clusters:

  • Provide Windows Semantics
  • Coordinate cluster-wide state
  • Compensate for missing features in the underlying FS
  • Tools for cluster management
  • Hard Failover

When a server node fails, clients can reconnect to any cluster node.

slide-10
SLIDE 10

Quick Review

Durable, Resilient, and Persistent Handles

slide-11
SLIDE 11

Making Handles Sticky

Durable Handles

  • SMB2.0 / Windows Vista
  • Designed with WiFi in mind
  • Requires an OpLock
  • Limited state exposed

Samba provides limited support for Durable Handles in a single-server configuration. (Not intended for use in clusters.)

slide-12
SLIDE 12

Making Handles Sticky

Resilient Handles

  • SMB2.1 / Windows 7
  • Stronger guarantees
  • Doesn't need an OpLock
  • Tracks byte-range locks
  • Separate IOCTL call required

Samba doesn't support this, but it could be implemented in the VFS layer by catching the IOCTL call. Still, not intended for clusters.

slide-13
SLIDE 13

Making Handles Sticky

Persistent Handles

  • SMB2.2 (released as SMB3) / Windows 8
  • Real cluster failover
  • Automatically requested

Persistent handles were added specifically to support Continuous Availability (CA).

slide-14
SLIDE 14

Making Handles Sticky

Crash Recovery

  • Durable/Resilient handles provide file-handle recovery following a brief network outage.
  • Persistent handles add support for failover to another node following a cluster node failure.

slide-15
SLIDE 15

Why Fake a Failover?

slide-16
SLIDE 16

Fake Failover

What the Heck?

What do you mean by Fake Failover?

  • Reconnect a Durable Handle…
  • ...to a different node

Why do such a silly thing?

  • Samba has Durable Handle Support

  • Minimal state to keep
  • More clients
  • Prelude to Real Failover
  • Why not?
slide-17
SLIDE 17

Fake Failover

What could go wrong?

  • State must be replicated
  • Must re-establish the OpLock
  • Failover must finish within the timeout

  • Windows must be fooled

Remember our mantra? This is theory.
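
The first three risks above can be read as preconditions that a new node would have to check before honoring a reconnect. The following is a minimal sketch of that check, not Samba code: the function name, its parameters, and the reason strings are all hypothetical, and the fourth risk (fooling Windows) is not modeled at all.

```python
def reconnect_allowed(state_replicated, oplock_reestablished,
                      disconnected_at, now, timeout_s):
    """Return (ok, reason) for a hypothetical durable-handle reclaim on a new node.

    All times are in seconds on the same clock; timeout_s stands in for the
    durable handle timeout.
    """
    if not state_replicated:
        # The new node never received the handle state, so it has nothing to reclaim.
        return False, "handle state not available on this node"
    if now - disconnected_at > timeout_s:
        # The client took too long; the handle is treated as gone.
        return False, "durable handle timeout expired"
    if not oplock_reestablished:
        # Durable handles require an OpLock, so the reclaim cannot proceed without one.
        return False, "oplock could not be re-established"
    return True, "ok"
```

For example, a reconnect 30 seconds after the disconnect with a 60 second timeout passes, while the same reconnect at 100 seconds is refused on the timeout check.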

slide-18
SLIDE 18

Fake Failover

How would this work?

New semantics:

  • Reliable State
      Handle ID, & any state that exists for the duration of the open
  • Ephemeral State
      Uncommitted/Un-ACKed changes

Do not expect Durable Handles to survive a full cluster failure. (That's for Persistent Handles.)
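
One way to picture the reliable/ephemeral split is as two parts of a per-open record, where only the reliable part is ever copied to other nodes. The sketch below assumes hypothetical names throughout (`ReliableState`, `OpenFile`, and the fields `handle_id`, `path`, `access_mask`); the slides do not say which fields make up the reliable state beyond the handle ID.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ReliableState:
    """State fixed for the duration of the open (field names are illustrative)."""
    handle_id: str
    path: str
    access_mask: int


@dataclass
class OpenFile:
    reliable: ReliableState
    # Ephemeral: uncommitted or un-ACKed changes, never replicated.
    pending_writes: list = field(default_factory=list)

    def snapshot(self):
        # Only the reliable part is handed to the replication layer.
        return asdict(self.reliable)

    @classmethod
    def restore(cls, snapshot):
        # After a fake failover the new node starts with no ephemeral state;
        # the client is expected to resend anything that was never ACKed.
        return cls(reliable=ReliableState(**snapshot))
```

Restoring from a snapshot on another node yields the same `reliable` record and an empty `pending_writes` list, which matches the idea that un-ACKed changes are the client's problem to replay.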

slide-19
SLIDE 19

Fake Failover

Implementation Options

  • MemCacheD
      Distributed memory cache with client-driven replication
  • New CTDB modes
      Uncommitted/Un-ACKed changes

New CTDB storage modes were presented earlier by Amitay/Martin.
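
"Client-driven replication" in the first option means the cache servers do not talk to each other; whoever writes a record sends it to several servers itself. A rough sketch of that pattern follows. It deliberately uses in-memory stand-ins rather than a real memcached client library, so `FakeCacheServer` and `ReplicatingClient` are hypothetical names, and nothing here describes how Samba would actually integrate with a cache.

```python
class FakeCacheServer:
    """In-memory stand-in for one cache server (not a real memcached client)."""

    def __init__(self):
        self.data = {}
        self.up = True

    def set(self, key, value):
        if not self.up:
            raise ConnectionError("server down")
        self.data[key] = value

    def get(self, key):
        if not self.up:
            raise ConnectionError("server down")
        return self.data.get(key)


class ReplicatingClient:
    """Writes every record to all servers; reads from the first one that answers."""

    def __init__(self, servers):
        self.servers = list(servers)

    def set(self, key, value):
        written = 0
        for server in self.servers:
            try:
                server.set(key, value)
                written += 1
            except ConnectionError:
                pass  # tolerate a dead replica; the caller sees the count
        return written

    def get(self, key):
        for server in self.servers:
            try:
                value = server.get(key)
            except ConnectionError:
                continue  # try the next replica
            if value is not None:
                return value
        return None
```

With two servers, a handle record written through the client is still readable after one server goes down, which is the property a fake failover would need from its reliable state.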

slide-20
SLIDE 20

EPILOGUE

...and then they showed up with their pitchforks and torches and questions...
