dCache - delegated storage solutions, Tigran Mkrtchyan for dCache Team


slide-1
SLIDE 1

dCache - delegated storage solutions

Tigran Mkrtchyan for dCache Team ISGC 2016, Taiwan

slide-2
SLIDE 2

Delegated Storage | Tigran Mkrtchyan | 3/15/16 | Page 2

dCache on one slide

[Diagram: dCache components running in separate JVMs, connected by a message passing layer]

  • Door(s) (clients entry point): dcap, ftp, http, nfs
  • Pool Manager (requests scheduler)
  • Name Space (MetaData Server)
  • Pools (Data Server)
  • DBMS

slide-3
SLIDE 3

Usage around the World

  • ~ 80 installations
  • > 50% of WLCG storage
  • Biggest: 22 PB
  • Typical: ~100x nodes
  • Typical: ~10^7 files
slide-4
SLIDE 4

dCache as Storage System

  • Provides a single-rooted namespace.
  • Metadata (namespace) and data locations are independent.
  • Aggregates multiple storage nodes into a single storage system.
  • Manages data movement, replication, integrity.
  • Provides data migration between multiple tiers of storage (DISK, SSD, TAPE).
  • Uniquely handles different authentication mechanisms, like x509, Kerberos, login+password, auth tokens.
  • Provides access to the data via a variety of access protocols (WebDAV, NFSv4.1/pNFS, xxxFTP, DCAP, Xrootd).
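
The independence of namespace and data locations can be illustrated with a minimal sketch. This is not dCache code: the class names, the file ID format and the pool names are all hypothetical. It only shows the idea that a path maps to an opaque ID, and the ID (not the path) maps to the pools holding the data.

```python
import itertools


class Namespace:
    """Metadata only: maps user-visible paths to opaque file IDs."""

    def __init__(self):
        self._ids = {}
        self._next = itertools.count(1)

    def create(self, path):
        if path in self._ids:
            raise FileExistsError(path)
        file_id = f"file-{next(self._next):06d}"  # hypothetical ID format
        self._ids[path] = file_id
        return file_id

    def rename(self, old, new):
        # A rename changes metadata only; no data is moved.
        self._ids[new] = self._ids.pop(old)

    def lookup(self, path):
        return self._ids[path]


class LocationIndex:
    """Maps file IDs to the set of pools that hold a replica."""

    def __init__(self):
        self._pools = {}

    def add(self, file_id, pool):
        self._pools.setdefault(file_id, set()).add(pool)

    def locate(self, file_id):
        return set(self._pools.get(file_id, ()))


ns, loc = Namespace(), LocationIndex()
fid = ns.create("/data/run1/file.root")
loc.add(fid, "pool-a")
loc.add(fid, "pool-b")

# Renaming the file leaves its replicas untouched.
ns.rename("/data/run1/file.root", "/data/archive/file.root")
print(ns.lookup("/data/archive/file.root"), sorted(loc.locate(fid)))
```

Because locations are keyed by ID, a rename or a move within the namespace never requires touching the data servers.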

slide-6
SLIDE 6

dCache's data management

  • Automatic migration
  • Tape/disk/disk
  • HotSpot detection
  • Permanent migration jobs
  • Checksumming on transfer
  • Manual migration
  • Data replication
  • multiple copies
  • same host/rack/site policy
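
A host/rack/site policy for multiple copies can be sketched as follows. The slide does not say how dCache implements it, so the `Pool` record, the `pick_replica_pools` function and the greedy strategy are assumptions for illustration: copies are placed so that no two share the chosen attribute.

```python
from typing import NamedTuple


class Pool(NamedTuple):
    # Hypothetical pool description; not a dCache data structure.
    name: str
    host: str
    rack: str
    site: str


def pick_replica_pools(pools, copies, spread_by="host"):
    """Greedily choose `copies` pools, no two sharing the `spread_by` attribute."""
    chosen, seen = [], set()
    for pool in pools:
        if len(chosen) == copies:
            break
        key = getattr(pool, spread_by)
        if key not in seen:
            chosen.append(pool)
            seen.add(key)
    if len(chosen) < copies:
        raise ValueError(f"cannot place {copies} copies on distinct {spread_by}s")
    return chosen


pools = [
    Pool("p1", "host1", "rackA", "site1"),
    Pool("p2", "host1", "rackA", "site1"),
    Pool("p3", "host2", "rackA", "site1"),
    Pool("p4", "host3", "rackB", "site1"),
]

print([p.name for p in pick_replica_pools(pools, 2, spread_by="host")])
print([p.name for p in pick_replica_pools(pools, 2, spread_by="rack")])
```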
slide-7
SLIDE 7

Software-defined storage (or did you listen to Patrick carefully?)

  • Abstraction of logical storage services and capabilities from the underlying physical storage systems.
  • Automation with policy-driven storage provisioning, with service-level agreements replacing technology details.
  • Commodity hardware with storage logic abstracted into a software layer.

slide-8
SLIDE 8

Storage in dCache (what we have)

  • dCache provides the high-level service
  • Data replication and management are core dCache services
  • Each pool is attached to its own disks

[Diagram: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration; pool services, each with a block device]

slide-9
SLIDE 9

Storage in dCache (outsourcing, phase 1)

  • dCache provides the high-level service
  • Data replication and management are core dCache services
  • Each pool has its own 'partition' on shared storage
  • Each 'partition' is attached to its own block device

[Diagram: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration; pool services, each with a block device]

slide-10
SLIDE 10

Phase 1 (changing IO layer)

  • Single data server owns the data
  • Single data server manages data
      • flush to tape
      • restore from tape
      • removal
      • garbage collection
slide-11
SLIDE 11

Storage in dCache (outsourcing, phase 2)

  • dCache provides the high-level service
  • All pools see all 'partitions' on shared storage
  • Any pool can deliver data from any partition
  • The object store takes care of replication

[Diagram: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); pool services; block devices; Replication/Migration]

slide-12
SLIDE 12

Phase 2 (Changing core philosophy)

  • All data is managed by 'quorum'
      • group decision on who interacts with tape
      • group decision on who removes a file, and when
  • File location is always 'known'
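
The 'quorum' idea can be sketched as a strict-majority vote. The slide does not describe the actual decision protocol, so this is an assumption for illustration only: each pool votes for the pool that should flush a file to tape, and a candidate wins only with more than half of the whole group behind it. The function and pool names are hypothetical.

```python
from collections import Counter


def quorum_decision(votes, group_size):
    """Return the candidate backed by a strict majority of the group, else None.

    `votes` maps a voting pool to the pool it nominates.
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > group_size // 2 else None


votes = {"pool-a": "pool-b", "pool-b": "pool-b", "pool-c": "pool-a"}

print(quorum_decision(votes, group_size=3))  # two of three agree on pool-b
print(quorum_decision(votes, group_size=5))  # two of five is not a majority
```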
slide-13
SLIDE 13

Storage in dCache (outsourcing, phase 3)

  • dCache provides the high-level service
  • dCache can move data between regular and object-store (OS) pools

[Diagram: dCache services (Namespace, PoolSelection, Doors, Authn/Authz); Replication/Migration; pool services; block devices]

slide-14
SLIDE 14

Phase 3 (mixed environment)

  • Mixed setup
  • Islands of storage servers
  • Replication and data movement between islands

slide-15
SLIDE 15

Why CEPH

  • No specific hardware support
  • Runs on commodity hardware
  • Scalable to exabytes of data
  • Deployed at sites as storage system for OpenStack
  • Provides Object, Block and File interfaces
slide-16
SLIDE 16

And not only CEPH

  • Other object stores can be adopted
      • DDN WOS
      • Swift/S3/CDMI
  • Cluster file systems (as a side effect)
      • Lustre
      • GPFS
      • GlusterFS
slide-17
SLIDE 17

CEPH (extremely simplified)

  • OSD ~ a physical disk
  • CRUSH - determines how to store and retrieve data by computing data storage locations.
  • RADOS - distributes objects across the storage cluster and replicates objects.
  • librados - provides low-level access to the RADOS service.

[Diagram: OSDs, CRUSH, RADOS, LIBRADOS, with APP, RBD and CEPH FS on top]
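
Computing storage locations instead of looking them up can be illustrated with a much simpler stand-in. The sketch below is not CRUSH: it uses rendezvous (highest-random-weight) hashing, and the `place` function and OSD names are hypothetical. It shares only the key property that any client can derive the same placement from the object name and the list of OSDs, without a central table.

```python
import hashlib


def place(object_name, osds, replicas=2):
    """Rank OSDs by a hash of (object, osd) and keep the top `replicas`."""

    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


osds = ["osd.0", "osd.1", "osd.2", "osd.3"]
chosen = place("my-object", osds)

# Every client computes the same answer, regardless of list order.
print(chosen == place("my-object", list(reversed(osds))))
```

With this scheme, removing an OSD that does not hold an object leaves that object's placement unchanged, which keeps data movement small when the cluster changes.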

slide-18
SLIDE 18

Current work

  • Functional prototype only
  • Focus on stability first
  • RBD based
      • striping
      • alterable content
  • Object interface will be evaluated as well
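
Striping of a block image over objects can be sketched as an offset calculation. This is a generic RAID-0-style layout under stated assumptions: the `locate` function is hypothetical, and the 4 MiB object size, stripe unit and stripe count are illustrative defaults, not values taken from the prototype.

```python
MIB = 1024 * 1024


def locate(offset, object_size=4 * MIB, stripe_unit=4 * MIB, stripe_count=1):
    """Map an image byte offset to (object number, offset inside that object)."""
    if object_size % stripe_unit != 0:
        raise ValueError("object_size must be a multiple of stripe_unit")
    units_per_object = object_size // stripe_unit
    unit_no = offset // stripe_unit        # stripe unit index over the whole image
    stripe_no = unit_no // stripe_count    # row of units across one object set
    stripe_pos = unit_no % stripe_count    # which object inside the object set
    object_set = stripe_no // units_per_object
    object_no = object_set * stripe_count + stripe_pos
    offset_in_object = (stripe_no % units_per_object) * stripe_unit + offset % stripe_unit
    return object_no, offset_in_object


# Simple layout: consecutive 4 MiB chunks go to consecutive objects.
print(locate(5 * MIB))

# Tiny layout for illustration: 1-byte units striped over 2 objects of 2 bytes.
print([locate(off, object_size=2, stripe_unit=1, stripe_count=2) for off in range(5)])
```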
slide-19
SLIDE 19

Roadmap

  • Phase 1
      • running prototype is available today
      • some sites volunteer to help with testing
      • cleaning up to make generally available
  • Phase 2/3
      • depends on user demand
      • operational overhead, if any
      • support overhead, if any
slide-20
SLIDE 20

Summary

  • dCache is an in-demand storage system.
  • New technology provides the required building blocks.
  • Combining both lets us concentrate on the missing parts.
  • A working prototype is available for testing.
slide-21
SLIDE 21

Links

  • https://www.dcache.org/
  • https://en.wikipedia.org/wiki/Software-defined_storage
  • http://ceph.com/