Cloud File Services for Academics, Harry Mangalam, RCS


SLIDE 1

Cloud File Services for Academics

Harry Mangalam, RCS <harry.mangalam@uci.edu> x40084 Full Report: http://goo.gl/fCNSIa

SLIDE 2

BigData is here

  • Most of what we see is unstructured data
  • Or semi-structured (large ASCII files)
  • 100TB datasets now on HPC
  • More coming..
SLIDE 3

Bandwidth & Latency

  • BW to many commercial services tops out at 1-5MB/s.

  • Latency increases over distance and hops.
  • Academic/Data networks can be 10-100X faster.
  • LANs are fastest, often 100X faster than WANs.
  • WANs block some protocols, ports, etc.
  • Small file operations make everything worse.
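
To make the bandwidth figures concrete, here is a minimal sketch of how long a 100TB dataset (the size mentioned on the BigData slide) would take to move at a sustained rate. The function name `transfer_days` and the 100MB/s rate (taken here as "100X faster" than 1MB/s) are illustrative assumptions; the estimate uses decimal units and ignores latency, protocol overhead, and the small-file penalty noted above.

```python
def transfer_days(size_tb: float, rate_mb_per_s: float) -> float:
    """Days needed to move size_tb terabytes at a sustained rate in MB/s.

    Uses decimal units (1 TB = 1,000,000 MB) and ignores latency and
    per-file overhead, so real transfers will be slower.
    """
    size_mb = size_tb * 1_000_000
    seconds = size_mb / rate_mb_per_s
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # 1 and 5 MB/s come from the slide; 100 MB/s is an assumed "100X" case.
    for rate in (1, 5, 100):
        print(f"{rate:>4} MB/s: {transfer_days(100, rate):8.1f} days")
```

At the commercial rates quoted above, the estimate is measured in hundreds of days, which is why the faster academic networks and LANs matter for datasets of this size.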
SLIDE 4

Raw Storage Pricing

  • Raw storage halves in price every ~14mo
  • NAS-quality 4TB disk is $170 ($4.25/100GB)
  • 250TB Chassis is $35K ($14/100GB)
  • Lifetime of Chassis is 4yrs ($3.5/100GB/yr)
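
The per-100GB figures on this slide follow from simple division; the sketch below reproduces them. The helper name `cost_per_100gb` is hypothetical, and the calculation assumes decimal units (1TB = 10 x 100GB) and covers hardware only.

```python
def cost_per_100gb(price_usd: float, capacity_tb: float) -> float:
    """Hardware cost in USD per 100GB, assuming 1 TB = 10 x 100GB."""
    return price_usd / (capacity_tb * 10)


disk = cost_per_100gb(170, 4)            # NAS-quality 4TB disk
chassis = cost_per_100gb(35_000, 250)    # 250TB chassis
chassis_per_year = chassis / 4           # amortized over a 4-year lifetime

if __name__ == "__main__":
    print(f"disk:             ${disk:.2f}/100GB")
    print(f"chassis:          ${chassis:.2f}/100GB")
    print(f"chassis per year: ${chassis_per_year:.2f}/100GB/yr")
```

Amortizing the chassis over its 4-year lifetime gives a per-year figure that can be compared directly with the per-year commercial prices.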

SLIDE 5

Commercial Pricing

  • Cheapest is ~$24/100GB/yr (Google, iCloud)
  • Most expensive is ~$100/100GB/yr (some business services: Box, Druva)
  • Pricing varies by business model (Backblaze is unlimited)
  • Long Term Backup (Glacier @ $1/100GB/yr) is cheapest but has a horrible interface.
  • Since raw storage halves in price every ~14mo, pricing changes very rapidly.
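
As a rough illustration of how quickly a 14-month halving time erodes prices, the sketch below projects a price forward with a simple exponential model. The function name `projected_price` and the assumption that the trend continues smoothly are mine, not the slides'; commercial prices need not track raw storage this closely.

```python
def projected_price(price_now: float, months_ahead: float,
                    halving_months: float = 14.0) -> float:
    """Price after months_ahead, assuming it halves every halving_months."""
    return price_now * 0.5 ** (months_ahead / halving_months)


if __name__ == "__main__":
    # Start from the $4.25/100GB raw-disk figure quoted on the pricing slide.
    for months in (0, 14, 28, 48):
        price = projected_price(4.25, months)
        print(f"after {months:>2} months: ${price:.2f}/100GB")
```

Under this model, a price is cut to less than a tenth over a 4-year chassis lifetime, so any fixed price comparison goes stale quickly.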

SLIDE 6

5 overlapping uses

  • Interactive file storage
  • File Sharing
  • File Syncing (a change to a file is propagated to all copies of that file)
  • Short Term Backup (incremental backup, disaster recovery, client-initiated recovery)
  • Long Term Backup (dataset archives w. infrequent recovery)

SLIDE 7

Venn Diagram of previous slide

SLIDE 8

Important Features

  • Ability to scale greatly.
  • Support for very large files.
  • Easy-to-use, informative user interface.
  • Watch edits / co-editing.
  • Easy to set/show permissions: who can read, write, edit & delete

SLIDE 9

And More..

  • Pay only for what you use.
  • Sync/autosync across lots of devices.
  • Direct web-availability.
  • File versioning like Apple’s Time Machine or ZFS snapshots.

  • High bandwidth.
  • Highly reliable.
SLIDE 10

And Even More..

  • Multiple protocols (NFS, CIFS, HTTP/HTTPS, Globus, FTP/SFTP, etc).

  • Mac/iOS/Win/Linux/Android clients
  • Logging to show who has accessed files and how.
  • Security (file sharing vs backups)
  • Integrated Apps
  • Stateless/Connectionless storage
SLIDE 11

Local vs Commercial

  • Local resources tend to offer more, more flexible, and faster services at a cheaper long-term cost with less lock-in.
  • Commercial services tend to need less HR (staffing) and DC (data center) resources and offer more stability and decent options, at a higher aggregate cost.
  • Legal advantages to both.
  • Admin and users often have different and often opposing requirements; Admin tends to make the final decisions.

  • Best solution is probably “Both”.
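
For a sense of scale, the sketch below compares yearly costs for a 100TB dataset using the per-100GB/yr figures quoted on the pricing slides. The names are hypothetical, and the local figure is amortized chassis hardware only: it leaves out the staffing, power, and data center costs that the commercial services absorb, so it understates the true local cost.

```python
# Figures in USD per 100GB per year, taken from the pricing slides.
LOCAL_HARDWARE = 3.5      # $35K 250TB chassis amortized over 4 years
COMMERCIAL_LOW = 24.0     # cheapest consumer services
COMMERCIAL_HIGH = 100.0   # most expensive business services


def yearly_cost(size_tb: float, usd_per_100gb_yr: float) -> float:
    """Yearly cost in USD for size_tb terabytes (1 TB = 10 x 100GB)."""
    return size_tb * 10 * usd_per_100gb_yr


if __name__ == "__main__":
    size_tb = 100  # dataset size mentioned on the BigData slide
    for label, rate in (("local hardware", LOCAL_HARDWARE),
                        ("commercial low", COMMERCIAL_LOW),
                        ("commercial high", COMMERCIAL_HIGH)):
        print(f"{label:>15}: ${yearly_cost(size_tb, rate):>10,.0f}/yr")
```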
SLIDE 12

Local Sharing Services

  • ownCloud – 'Open Source Dropbox', with Mac/iOS/Win/Linux/Android clients
  • UCLA/CASS – Very large campus storage system, supports Globus/Grid, SMB/CIFS, NFS, CrashPlan.

SLIDE 13

Commercial Services & more..

  • See: <http://goo.gl/fCNSIa>
  • Harry Mangalam <harry.mangalam@uci.edu> x40084