MongoDB Backup and Recovery Field Guide
Tim Vaillancourt Sr Technical Operations Architect, Percona
`whoami`
{ name: "tim", lastname: "vaillancourt", employer: "percona", techs: [ "mongodb", "mysql", "cassandra", "redis", "rabbitmq", "solr", "python", "golang" ] }
○ Logical
○ Binary
  ■ Cold
  ■ LVM
  ■ Hot Backup
○ mongodb_consistent_backup
data backed up in a universal format
Drive and I have 7 days of hourly Time Machine backups!
○ Replication is High Availability
○ Including
  ■ Binary/Statement-based Replication of any type
  ■ RAID Arrays
○ mongodump
  ■ Uses find() queries with $snapshot to back up all collections
  ■ Supports gzip compression and threading in 3.2+
  ■ Outputs a directory containing BSON files in various subdirectories
○ Custom Queries
  ■ The client API could be used similarly to mongodump to perform logical backups
○ Reduced storage footprint
○ Replication awareness
○ Compatibility
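As a rough illustration of the mongodump bullets above, the sketch below assembles a mongodump command line with gzip compression and parallel collection dumps (both available from 3.2). The host, the output path and the helper name `build_mongodump_cmd` are assumptions made for this example; the flags themselves (`--host`, `--out`, `--gzip`, `--numParallelCollections`) are standard mongodump options.

```python
import shlex


def build_mongodump_cmd(host, out_dir, gzip=True, parallel_collections=4):
    """Return the argv list for a logical backup with mongodump.

    `host` and `out_dir` are placeholders chosen by the caller.
    """
    cmd = ["mongodump", "--host", host, "--out", out_dir]
    if gzip:
        # Compress each dumped file (mongodump 3.2+).
        cmd.append("--gzip")
    if parallel_collections and parallel_collections > 1:
        # Dump several collections concurrently (mongodump 3.2+).
        cmd += ["--numParallelCollections", str(parallel_collections)]
    return cmd


if __name__ == "__main__":
    # Hypothetical host and path; print the command instead of running it.
    print(shlex.join(build_mongodump_cmd("mongo1.example.com:27017", "/backup/dump")))
```

The argv list can be handed to `subprocess.run` once the host and path are adjusted to the environment.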
○ Stop mongod
○ Copy and archive dbPath
○ Start mongod
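The three cold-backup steps could be scripted roughly as follows. This is a minimal sketch: `stop_mongod` and `start_mongod` are hypothetical callables supplied by the caller (for example, wrappers around the local init system), and the archive format (gzip tarball) is just one reasonable choice.

```python
import tarfile
from pathlib import Path


def cold_backup(db_path, archive_path, stop_mongod, start_mongod):
    """Stop mongod, archive dbPath into a gzip tarball, then start mongod again.

    `stop_mongod` and `start_mongod` are caller-supplied callables; they are
    assumptions of this sketch, not part of any MongoDB API.
    """
    stop_mongod()
    try:
        with tarfile.open(archive_path, "w:gz") as tar:
            # Store files under the dbPath directory name inside the archive.
            tar.add(db_path, arcname=Path(db_path).name)
    finally:
        # Restart mongod even if archiving failed.
        start_mongod()
    return archive_path
```

Wrapping the copy in `try`/`finally` keeps a failed archive step from leaving the node down.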
○ If Non-Journalled
  ■ db.fsyncLock()
  ■ Keep session open
○ Create block-device snapshot
○ Unlock the database
  ■ db.fsyncUnlock()
○ Copy or archive the snapshot directory
○ Remove the block-device snapshot (as quickly as possible!)
○ Snapshots have been demonstrated to cause up to 30%* write latency impact to disk due to copy-on-write (COW)
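The ordering of the snapshot steps above can be sketched as below. It assumes the caller passes in two hypothetical helpers: `admin_command(doc)`, which sends a command document to the admin database over a single open session, and `run(argv)`, which executes a system command. The `fsync`/`fsyncUnlock` commands are the server-side equivalents of `db.fsyncLock()` and `db.fsyncUnlock()`; the volume names and the 10G snapshot size are placeholders.

```python
def lvm_snapshot_backup(admin_command, run, origin_lv, snap_name,
                        snap_size="10G", journalled=True):
    """Create an LVM snapshot of the volume holding dbPath.

    Returns the `lvremove` argv that the caller should run after copying
    or archiving the snapshot contents.
    """
    locked = False
    if not journalled:
        # Equivalent of db.fsyncLock(): flush to disk and block writes.
        admin_command({"fsync": 1, "lock": True})
        locked = True
    try:
        run(["lvcreate", "--snapshot", "--size", snap_size,
             "--name", snap_name, origin_lv])
    finally:
        if locked:
            # Equivalent of db.fsyncUnlock(); always release the lock.
            admin_command({"fsyncUnlock": 1})
    # The snapshot lives in the same volume group as the origin volume.
    volume_group = origin_lv.rsplit("/", 1)[0]
    return ["lvremove", "--force", f"{volume_group}/{snap_name}"]
```

Returning the cleanup command rather than running it leaves the copy step to the caller, while making it hard to forget the prompt snapshot removal the slide calls for.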
○ Pay $$$ for MongoDB Enterprise or download Percona Server for MongoDB (PSMDB) for free(!)
○ db.adminCommand({ createBackup: 1, backupDir: "/data/mongodb/backup" })
○ Copy/archive the output path
○ Delete the backup output path
○ NOTE:
  ■ RocksDB-based createBackup creates filesystem hardlinks whenever possible!
  ■ Delete the RocksDB backupDir as soon as possible to reduce bloom filter overhead!
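A hedged sketch of the hot-backup flow above: run `createBackup`, archive the output path, then delete it. The helper `admin_command(doc)` is a hypothetical callable that sends a command to the admin database and returns the reply as a dict. The sketch also assumes the script runs on the same host as mongod, since `backupDir` is a path on the server's filesystem.

```python
import shutil
import tarfile
from pathlib import Path


def hot_backup(admin_command, backup_dir, archive_path):
    """Run createBackup, archive the output, then delete the output path."""
    result = admin_command({"createBackup": 1, "backupDir": str(backup_dir)})
    if not result.get("ok"):
        raise RuntimeError(f"createBackup failed: {result}")
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(backup_dir, arcname=Path(backup_dir).name)
    # Remove the backup directory promptly once it has been archived.
    shutil.rmtree(backup_dir)
    return archive_path
```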
sharded environment
○ Backup tools consider single-instance integrity
○ Backups of different shards may complete at different times
○ Changes replicate asynchronously
○ Data may be balancing / moving in the cluster
○ Orphaned documents / references
○ Holes in data
○ Replica Set and Sharded Cluster awareness
○ Cluster-wide Point-in-time consistency
○ In-line Oplog backup (vs post-backup)
○ Notifications of success / failure
○ Remote Upload (AWS S3, Google Cloud Storage and Rsync)
○ Archiving (Tar or ZBackup deduplication and optional AES-at-rest)
○ CentOS/RHEL7 RPMs and Docker-based releases (.deb soon!)
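To make "cluster-wide point-in-time consistency" more concrete, here is one plausible way to get there when each shard's oplog is captured alongside its dump: pick the newest timestamp that every shard's capture reaches and discard anything later. This is an illustrative sketch, not a description of mongodb_consistent_backup's internals; timestamps are modelled as `(seconds, increment)` tuples and all names are hypothetical.

```python
def consistent_point(last_ts_by_shard):
    """Return the newest timestamp that every shard's oplog capture reaches."""
    return min(last_ts_by_shard.values())


def trim_oplogs(oplogs_by_shard, limit_ts):
    """Drop oplog entries newer than limit_ts so all shards end together."""
    return {
        shard: [entry for entry in entries if entry["ts"] <= limit_ts]
        for shard, entries in oplogs_by_shard.items()
    }


if __name__ == "__main__":
    oplogs = {
        "shard0": [{"ts": (100, 1)}, {"ts": (105, 1)}],
        "shard1": [{"ts": (101, 1)}, {"ts": (103, 2)}],
    }
    last = {shard: entries[-1]["ts"] for shard, entries in oplogs.items()}
    limit = consistent_point(last)
    print(limit, trim_oplogs(oplogs, limit))
```

Replaying each shard's trimmed oplog on top of its dump then brings every shard to the same moment, which addresses the "backups of different shards may complete at different times" problem.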
○ Multi-threaded Rsync Upload
○ Replica Set Tags support
○ Support for MongoDB SSL / TLS connections and client auth
○ Rotation / Expiry of old backups (locally-stored only)
○ Incremental Backups
○ Binary-level Backups (Hot Backup, Cold Backup, LVM, Cloud-based, etc)
○ More Notification Methods (PagerDuty, Email, etc)
○ Restore Helper Tool
○ Instrumentation / Metrics
○ <YOUR AWESOME IDEA HERE> we take GitHub PRs (and it’s Python)!
○ Run mongodump (with --oplog) using a plain secondary
○ Store backups with on-site remote storage (filer, rsync, etc)
○ Application Impact
  ■ I/O and CPU impact due to backups may affect application
  ■ Storage-engine and FS caches will become dirty
  ■ Primary Failure
to become Primary
‘secondary’ (supported in recent mongodump versions)
○ No Disaster Recovery
○ Allow selection of MongoDB nodes using key/value pairs
○ Represented in JSON/single document
○ Many key/value pairs are possible
○ Specify a single node with a tag such as { location: "west" }
○ Use a Read Preference Tag in mongodump/mongodb_consistent_backup to target a specific node
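As an illustration of targeting a tagged node, the sketch below builds a mongodump command whose `--readPreference` value is a document with a `tagSets` entry. It assumes a recent mongodump that accepts a document for `--readPreference` (older versions only take a mode name), and the replica set name and hosts are placeholders; the helper names are hypothetical.

```python
import json


def read_preference_arg(tags, mode="secondary"):
    """Build the --readPreference value as a JSON document with one tag set."""
    return json.dumps({"mode": mode, "tagSets": [tags]})


def tagged_mongodump_cmd(replset_hosts, out_dir, tags):
    """Return the argv for a mongodump that reads from a tagged secondary."""
    return [
        "mongodump",
        "--host", replset_hosts,  # e.g. "rs0/host1:27017,host2:27017"
        "--oplog",
        "--out", out_dir,
        "--readPreference", read_preference_arg(tags),
    ]


if __name__ == "__main__":
    # Placeholder replica set and hosts; the tag matches the slide's example.
    print(tagged_mongodump_cmd(
        "rs0/mongo1.example.com:27017,mongo2.example.com:27017",
        "/backup/dump",
        {"location": "west"},
    ))
```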
○ Create backup within local datacenter
○ Upload completed backups to other datacenter, cloud, etc
  ■ mongodb_consistent_backup supports Amazon S3, Google Cloud Storage and Rsync for remote upload!
○ Fast backup time due to in-datacenter latency
○ A full backup's data is uploaded on each backup job
○ Place a SECONDARY node in another location
  ■ Dedicated node is recommended to reduce impact
  ■ hidden:true recommended
○ Run backup from off-site SECONDARY member
○ Optionally upload to Cloud Storage
○ Only changes (replication) are replicated to the off-site location
○ Potentially faster uploads to Cloud Storage
○ Bootstrap / Initial Sync may use high bandwidth (if not seeded by backup)
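For the dedicated off-site node, the replica set member document might look like the one built below. This is a sketch under assumptions: the host name, member id and tag are placeholders, and the document would still need to be added to the replica set configuration (for example via `rs.reconfig()`). The field names (`_id`, `host`, `priority`, `hidden`, `tags`) are standard replica set member settings.

```python
def offsite_backup_member(member_id, host, tags=None):
    """Return a replica set member document for a dedicated backup node."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # hidden members must have priority 0
        "hidden": True,  # keeps the node out of normal application reads
        "tags": dict(tags or {}),
    }


if __name__ == "__main__":
    # Placeholder host and tag for an off-site backup node.
    print(offsite_backup_member(3, "backup1.dr.example.com:27017", {"role": "backup"}))
```

A priority of 0 also means the backup node can never be elected Primary, which avoids the election risk of running backups on an ordinary secondary.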
“It’s not a backup system, it’s a restore system” ~ Raymond Blum, Google SRE
○ Optimise restore time, not backup run time
  ■ Users and business care how fast their data is back, not how long it takes to back up
  ■ Binary-level backups are much faster to restore in MongoDB
○ This is very application specific
○ Randomly sample restored data and validate it
  ■ Example: Compare to Production
  ■ Example: Integration Test / QA
  ■ Example: Production Backup as Test Data
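The "random sample and compare to production" idea can be sketched as below. To stay self-contained, both data sets are modelled as plain dicts mapping `_id` to document; in practice they would come from queries against the production and restored deployments. The function name and arguments are hypothetical, and documents changed in production after the backup was taken will naturally show up as mismatches.

```python
import random


def sample_and_compare(production, restored, sample_size, seed=None):
    """Check a random sample of production documents against a restore.

    Both arguments map a document's _id to the document itself. Returns the
    sampled _ids whose restored copy is missing or differs.
    """
    rng = random.Random(seed)
    ids = list(production)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    return [
        doc_id for doc_id in sample
        if restored.get(doc_id) != production[doc_id]
    ]


if __name__ == "__main__":
    prod = {i: {"_id": i, "value": i * 2} for i in range(1000)}
    restored = dict(prod)
    restored.pop(42)  # simulate a document missing from the restore
    print(sample_and_compare(prod, restored, sample_size=1000, seed=1))
```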
www.perconalive.com
Santa Clara Convention Center