SLIDE 1

MongoDB Backup and Recovery Field Guide

Tim Vaillancourt Sr Technical Operations Architect, Percona

SLIDE 2

`whoami`

{ name: "tim", lastname: "vaillancourt", employer: "percona", techs: [ "mongodb", "mysql", "cassandra", "redis", "rabbitmq", "solr", "python", "golang" ] }

SLIDE 3

Agenda

  • History
  • Methods
    ○ Logical
    ○ Binary
      ■ Cold
      ■ LVM
      ■ Hot Backup
  • Integrity / Consistency
    ○ mongodb_consistent_backup
  • Architecture
  • Restore and Validation

SLIDE 4

History

  • 3000-4000 BC: Culturally significant data backed up in a universal format
  • 1400: The Printing Press
  • 1600-1800: Chapultepec Aqueduct
  • 1990s: Floppy and Zip Disks
  • 2000s: No more Floppy/Zip Disks
  • Present: All my data is on Google Drive and I have 7 days of hourly Time Machine backups!
  • Future: ?

SLIDE 5

Replication != Backup

  • Replication is not a backup!
    ○ Replication is High Availability
    ○ Including
      ■ Binary/Statement-based Replication of any type
        • Delayed Replication***
      ■ RAID Arrays
  • <EOF>

SLIDE 6

Backup Methods

SLIDE 7

Logical Backups

  • Tools
    ○ mongodump
      ■ Uses find() queries with $snapshot to back up all collections
      ■ Supports Gzip and Threading in 3.2+
      ■ Outputs a directory containing BSON files in various subdirectories
    ○ Custom Queries
      ■ The client API could be used similarly to mongodump to perform logical backups
  • Benefits
    ○ Reduced storage footprint
    ○ Replication awareness
    ○ Compatibility

  • Drawbacks
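
A minimal sketch of how the mongodump options mentioned on this slide could be assembled into a command line. The function name, host and output path are hypothetical; --gzip and --numParallelCollections are the mongodump 3.2+ flags for compression and parallel collection dumps.

```python
import shlex


def build_mongodump_command(host, out_dir, gzip=True, parallel_collections=4):
    """Return the mongodump argument list for a logical backup (illustrative only)."""
    cmd = ["mongodump", "--host", host, "--out", out_dir]
    if gzip:
        # Compress each output file (mongodump 3.2+).
        cmd.append("--gzip")
    if parallel_collections:
        # Number of collections dumped in parallel (mongodump 3.2+).
        cmd += ["--numParallelCollections", str(parallel_collections)]
    return cmd


def as_shell_string(cmd):
    """Quote the argument list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

The list form can be passed directly to subprocess.run(cmd, check=True) on a host where mongodump is installed.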

SLIDE 8

Binary Backups: Cold Backup

  • Very simple process
  • Causes full outage to MongoDB instance!
  • Process
    ○ Stop mongod
    ○ Copy and archive dbPath
    ○ Start mongod
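
The three steps above can be sketched as follows. This is an illustration under assumptions: cold_backup and its parameters are hypothetical names, and how mongod is stopped and started (systemd, init script, etc.) is left to the two callables the caller supplies.

```python
import tarfile
from pathlib import Path


def cold_backup(db_path, archive_path, stop_mongod, start_mongod):
    """Stop mongod, archive dbPath, then always try to start mongod again."""
    stop_mongod()
    try:
        # Archive the whole dbPath directory into a gzip-compressed tarball.
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(db_path, arcname=Path(db_path).name)
    finally:
        # Restart even if archiving failed, to keep the outage short.
        start_mongod()
    return archive_path
```

On a systemd host the callables might wrap subprocess.run(["systemctl", "stop", "mongod"], check=True) and its "start" counterpart; the service name is an assumption.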

SLIDE 9

Binary Backups: LVM / Filer / Cloud Disk

  • Process
    ○ If Non-Journalled
      ■ db.fsyncLock()
      ■ Keep session open
    ○ Create block-device snapshot
    ○ Unlock the database
      ■ db.fsyncUnlock()
    ○ Copy or archive the snapshot directory
    ○ Remove block-device snapshot (as quickly as possible!)
  • LVM
    ○ Snapshots have been demonstrated to cause up to 30%* write latency impact to disk due to copy-on-write (COW)
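
For the LVM case, the process above could be laid out as an ordered plan like the sketch below. It only builds the steps and does not run them. The function name, volume names, sizes and paths are hypothetical, and the mount options are an assumption that may need adjusting per filesystem (XFS, for example, typically needs nouuid).

```python
def lvm_snapshot_plan(volume_group, logical_volume, snapshot_name, snapshot_size,
                      mount_point, archive_path, journaled=True):
    """Return ordered (kind, command) steps for an LVM snapshot backup.

    kind is "mongo" for a mongo shell command and "shell" for an OS command.
    """
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"/dev/{volume_group}/{snapshot_name}"
    steps = []
    if not journaled:
        # Flush and block writes; the session that took the lock must stay open.
        steps.append(("mongo", "db.fsyncLock()"))
    steps.append(("shell", ["lvcreate", "--snapshot", "--size", snapshot_size,
                            "--name", snapshot_name, origin]))
    if not journaled:
        steps.append(("mongo", "db.fsyncUnlock()"))
    steps += [
        ("shell", ["mount", "-o", "ro", snapshot, mount_point]),
        ("shell", ["tar", "-czf", archive_path, "-C", mount_point, "."]),
        ("shell", ["umount", mount_point]),
        # Drop the snapshot quickly to end the copy-on-write penalty.
        ("shell", ["lvremove", "-f", snapshot]),
    ]
    return steps
```

Keeping the plan as data makes it easy to review or log before anything touches the disk.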

SLIDE 10

Binary Backups: Hot Backup

  • PSMDB or MongoDB Enterprise
    ○ Pay $$$ for MongoDB Enterprise or download PSMDB for free(!)
    ○ db.adminCommand({ createBackup: 1, backupDir: "/data/mongodb/backup" })
    ○ Copy/archive the output path
    ○ Delete the backup output path
    ○ NOTE:
      ■ RocksDB-based createBackup creates filesystem hardlinks whenever possible!
      ■ Delete RocksDB backupDir as soon as possible to reduce bloom filter overhead!
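
A sketch of the run / archive / delete sequence above. hot_backup is a hypothetical helper; run_admin_command is whatever the caller uses to send a command to the admin database, and the script is assumed to run on the same host as mongod so that backupDir is a local path.

```python
import shutil
import tarfile
from pathlib import Path


def hot_backup(run_admin_command, backup_dir, archive_path):
    """Run createBackup, archive the output path, then delete it."""
    # Same command document as on the slide; the command name is the first key.
    run_admin_command({"createBackup": 1, "backupDir": str(backup_dir)})
    try:
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(backup_dir, arcname=Path(backup_dir).name)
    finally:
        # Remove the backup output path as soon as possible.
        shutil.rmtree(backup_dir, ignore_errors=True)
    return archive_path
```

With PyMongo installed (an assumption), run_admin_command could be MongoClient(uri).admin.command.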

SLIDE 11

Backup Integrity / Consistency

SLIDE 12

The “Distributed Cluster Backup Problem”

  • mongodump is consistent for a single node only!
  • Common to most or all database techs in sharded environments
  • Problems:
    ○ Backup tools consider single-instance integrity only
    ○ Backups of different shards may complete at different times
    ○ Changes replicate asynchronously
    ○ Data may be balancing / moving in the cluster
  • Risks:
    ○ Orphaned documents / references
    ○ Holes in data

SLIDE 13

Backups: mongodb_consistent_backup

  • Python project by Percona-Lab for consistent backups
  • URL: https://github.com/Percona-Lab/mongodb_consistent_backup
  • Best-effort support, not a “Percona Product”
  • Created to solve limitations in MongoDB backup tools:
    ○ Replica Set and Sharded Cluster awareness
    ○ Cluster-wide Point-in-time consistency
    ○ In-line Oplog backup (vs post-backup)
    ○ Notifications of success / failure
  • Extra Features
    ○ Remote Upload (AWS S3, Google Cloud Storage and Rsync)
    ○ Archiving (Tar or ZBackup deduplication and optional AES-at-rest)
    ○ CentOS/RHEL7 RPMs and Docker-based releases (.deb soon!)

SLIDE 14

Backups: mongodb_consistent_backup

  • 1.2.0
    ○ Multi-threaded Rsync Upload
    ○ Replica Set Tags support
    ○ Support for MongoDB SSL / TLS connections and client auth
    ○ Rotation / Expiry of old backups (locally-stored only)
  • Future
    ○ Incremental Backups
    ○ Binary-level Backups (Hot Backup, Cold Backup, LVM, Cloud-based, etc)
    ○ More Notification Methods (PagerDuty, Email, etc)
    ○ Restore Helper Tool
    ○ Instrumentation / Metrics
    ○ <YOUR AWESOME IDEA HERE> we take GitHub PRs (and it’s Python)!

SLIDE 15

Backup Architecture

SLIDE 16

Architecture: Simple Example

  • Method
    ○ Run mongodump (with --oplog) using a plain secondary
    ○ Store backups with on-site remote storage (filer, rsync, etc)
  • Potential Issues
    ○ Application Impact
      ■ I/O and CPU impact due to backups may affect application
      ■ Storage-engine and FS caches will become dirty
      ■ Primary Failure
        • A failure of the Primary may cause the Secondary running the backup to become Primary
        • This can be avoided by using a Read Preference of ‘secondary’ (supported in recent mongodump versions)
    ○ No Disaster Recovery
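
To guard against the Primary Failure case above, a backup script could check the node's role before and during the dump. A minimal sketch, assuming the reply of the isMaster command is available as a dict; the function names are hypothetical.

```python
def is_safe_backup_source(is_master_reply):
    """True only while the node reports itself as a secondary."""
    return bool(is_master_reply.get("secondary")) and not is_master_reply.get("ismaster")


def assert_still_secondary(fetch_is_master):
    """Fetch the current isMaster reply and abort if the node changed role."""
    reply = fetch_is_master()
    if not is_safe_backup_source(reply):
        raise RuntimeError("backup node is no longer a SECONDARY; aborting backup")
    return reply
```

With PyMongo (an assumption), fetch_is_master could be lambda: client.admin.command("isMaster") on a direct connection to the backup node.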

SLIDE 17

Architecture: Tag-Based Example

  • Replica Set Tags
    ○ Allow selection of MongoDB nodes using key/value pairs
    ○ Represented in JSON/single document
    ○ Many key/value pairs are possible
  • Example Backup from “west” Only
    ○ Specify a single node with a tag such as { location: “west” }
    ○ Use Read Preference Tag in mongodump/mongodb_consistent_backup to target a specific node.
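
A sketch of tag-based selection, assuming a replica set config document shaped like the output of rs.conf(). The helper names, hosts and replica set name are hypothetical; readPreference and readPreferenceTags are standard MongoDB connection string options.

```python
from urllib.parse import urlencode


def members_with_tags(replset_config, wanted_tags):
    """Hosts whose 'tags' contain every wanted key/value pair."""
    return [
        member["host"]
        for member in replset_config["members"]
        if all(member.get("tags", {}).get(k) == v for k, v in wanted_tags.items())
    ]


def tagged_secondary_uri(seed_hosts, replica_set, wanted_tags):
    """Connection string that prefers secondaries carrying the given tags."""
    tag_set = ",".join(f"{k}:{v}" for k, v in wanted_tags.items())
    query = urlencode(
        {"replicaSet": replica_set, "readPreference": "secondary",
         "readPreferenceTags": tag_set},
        safe=":,",  # keep the key:value,key:value tag syntax readable
    )
    return f"mongodb://{','.join(seed_hosts)}/?{query}"
```

The first helper shows which members a tag set would match; the second builds a URI a backup client could connect with.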

SLIDE 18

Architecture: Offsite Backup Example

  • Example
    ○ Create backup within local datacenter
    ○ Upload completed backups to other datacenter, cloud, etc
      ■ mongodb_consistent_backup supports Amazon S3, Google Cloud Storage and Rsync for remote upload!
  • Benefits
    ○ Fast backup time due to in-datacenter latency
  • Drawbacks
    ○ A full backup's data is uploaded on each backup job

SLIDE 19

Architecture: Disaster Recovery Example

  • Example
    ○ Place a SECONDARY node in another location
      ■ Dedicated node is recommended to reduce impact
      ■ hidden:true recommended
    ○ Run backup from off-site SECONDARY member
    ○ Optionally upload to Cloud Storage
  • Benefits
    ○ Only changes (replication) replicated to offsite location
    ○ Potentially faster uploads to Cloud Storage
  • Drawbacks
    ○ Bootstrap / Initial Sync may use high bandwidth (if not seeded by backup)
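
A sketch of the member document for such an off-site node, assuming the replica set config is handled as a plain dict. The helper names and host are hypothetical. A hidden member must also have priority 0, so the sketch sets both.

```python
def hidden_backup_member(member_id, host):
    """Member document for a dedicated, hidden, never-primary backup node."""
    return {"_id": member_id, "host": host, "priority": 0, "hidden": True}


def add_member(replset_config, member):
    """Return a new config with the member appended and the version bumped."""
    new_config = dict(replset_config)
    new_config["members"] = list(replset_config["members"]) + [member]
    new_config["version"] = replset_config["version"] + 1
    return new_config
```

The resulting document is what would be passed to rs.reconfig() or the replSetReconfig command.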

SLIDE 20

Restore and Validation

“It’s not a backup system, it’s a restore system” ~ Raymond Blum, Google SRE

SLIDE 21

Restoring and Validation

  • Methodology
    ○ Optimise restore time, not backup run time
      ■ Users and business care how fast their data is back, not how long it takes to back up
      ■ Binary-level backups are much faster to restore in MongoDB
  • Validation
    ○ This is very application specific
    ○ Randomly sample restored data and validate it
      ■ Example: Compare to Production
        • Compare real Production item, user, article, etc to backup
        • Ensure backup age doesn’t cause false alarms, i.e. test data older than backup
      ■ Example: Integration Test / QA
        • Run code integration tests or QA on restored data
      ■ Example: Production Backup as Test Data
        • Copy Production Data to Test periodically using backups
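
The "Compare to Production" example could look like the sketch below. It is an illustration under assumptions: documents are plain dicts keyed by _id, each carries a last-modified field (here the hypothetical updated_at), and only documents unchanged since the backup are compared so that backup age does not raise false alarms.

```python
import random


def validate_sample(production, restored, backup_time, sample_size=100,
                    modified_field="updated_at", seed=None):
    """Return the _ids of sampled production documents that differ in the restore.

    production and restored map _id -> document.
    """
    # Skip documents modified after the backup was taken.
    candidates = [
        _id for _id, doc in production.items()
        if doc[modified_field] < backup_time
    ]
    rng = random.Random(seed)
    sample = rng.sample(candidates, min(sample_size, len(candidates)))
    return [_id for _id in sample if restored.get(_id) != production[_id]]
```

An empty result means every sampled document matched; any returned _id is worth investigating before the backup is trusted.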

SLIDE 22

Thank You Sponsors!

SLIDE 23

SAVE THE DATE!

CALL FOR PAPERS OPENING SOON!

www.perconalive.com

April 23-25, 2018

Santa Clara Convention Center

SLIDE 24

Questions?