Understanding and Finding Crash-Consistency Bugs in Parallel File Systems


slide-1
SLIDE 1

Understanding and Finding Crash-Consistency Bugs in Parallel File Systems

Jinghan Sun, Chen Wang, Jian Huang, and Marc Snir

University of Illinois at Urbana-Champaign Contact: Jinghan Sun (js39@illinois.edu)

slide-2
SLIDE 2

PFS failures are frequent and expensive

[Figure: three bar charts. Source: Hyperion Research 2019]

PFS Failure Frequency
Categories: Weekly, Monthly, Never, Not Reported
Bar values as extracted: 8%, 34%, 34%, 12%

Single Day Failure Cost
Categories: <$100K, $100K-$500K, $500K-$1M, >$1M, Not Reported
Bar values as extracted: 41%, 14%, 6%, 4%, 35%

PFS Recovery Time
Categories: <1 day, 2-3 days, 1 week, >1 week
Bar values as extracted: 59%, 24%, 14%, 3%

41% of PFSes suffer monthly or weekly failures, and their recovery is expensive and time-consuming.

slide-3
SLIDE 3

Introduction to parallel file systems

Parallel file system

  • Data striping
  • Separate metadata management
  • POSIX-compliant

Parallel I/O library

  • Higher-level abstractions: datasets, groups, collective I/O APIs

The HPC I/O stack is much more complex than the traditional I/O stack
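The "data striping" bullet above can be illustrated with a minimal sketch of round-robin striping, where consecutive fixed-size chunks of a file are spread over the storage targets in turn. The layout, the struct `stripe_loc`, and the function `stripe_locate` are illustrative assumptions, not the actual on-disk format of BeeGFS or OrangeFS.

```c
#include <stdint.h>

/* Location of one byte of a striped file: which storage target holds it
 * and at what offset inside that target's share of the file.
 * Hypothetical type, for illustration only. */
struct stripe_loc {
    uint32_t target;         /* index of the storage target */
    uint64_t target_offset;  /* byte offset within that target's data */
};

/* Assumed round-robin layout: chunk k of the file lives on target
 * k % num_targets. chunk_size and num_targets must be nonzero. */
static struct stripe_loc stripe_locate(uint64_t file_offset,
                                       uint64_t chunk_size,
                                       uint32_t num_targets)
{
    uint64_t chunk_index = file_offset / chunk_size;
    struct stripe_loc loc;

    loc.target = (uint32_t)(chunk_index % num_targets);
    /* Each target stores every num_targets-th chunk back to back. */
    loc.target_offset = (chunk_index / num_targets) * chunk_size
                        + file_offset % chunk_size;
    return loc;
}
```

With two targets and 512 KiB chunks, byte 0 maps to target 0 and byte 524288 maps to target 1, so a single sequential write already touches several servers. This is one reason a crash can leave the servers in mutually inconsistent states.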


slide-4
SLIDE 4

A PFS failure example

A PFS may experience severe data loss after a system-wide power outage

slide-5
SLIDE 5

A study of crash vulnerabilities on PFSes

Two file systems + seven workloads = 34 vulnerabilities

Workloads:
  • Atomic Replace via Rename
  • Write-ahead Logging
  • HDF5: create, delete, rename, resize, update

Consequences:
  • Data loss
  • DoS
  • Inaccessible dataset

slide-6
SLIDE 6

PFS crash vulnerabilities

[Figure: bar chart, "Number of Vulnerabilities on Different Filesystems". Series: BeeGFS, OrangeFS, ext4. Workloads: ARVR, WAL, H5-create, H5-delete, H5-resize, H5-rename, H5-write. Vertical axis: 1 to 10.]

The complexity of the PFS stack makes it more vulnerable to system crashes.

Vulnerability: the parallel I/O stack may corrupt user files if a crash happens in the middle of the computation (depending on the precise timing of disk accesses).

slide-7
SLIDE 7

Crash vulnerability example

    // atomic replace via rename (ARVR)
    bool atomic_update() {
        int fd = creat("file.tmp", 0644);
        write(fd, new, size);
        close(fd);
        return rename("file.tmp", "file.txt") == 0;
    }

The function tries to update the content of a file atomically.

Setup: BeeGFS with two storage servers (storage #1, storage #2) and one metadata server

slide-8
SLIDE 8

Crash vulnerability example

    // atomic replace via rename (ARVR)
    bool atomic_update() {
        int fd = creat("file.tmp", 0644);
        write(fd, new, size);
        close(fd);
        return rename("file.tmp", "file.txt") == 0;
    }

[Figure: operations issued by beegfs-client for atomic_update() on storage #1, the metadata server, and storage #2. Operations as extracted: creat chunk; append chunk; creat idfile; link idfile dentries/tmp; rename dentries/tmp dentries/file; unlink idfile_2; unlink old_chunk.]

Persistence order ≠ program order!
Two vulnerabilities discovered at system crash!
slide-9
SLIDE 9

Crash vulnerability example

[Figure: the operation trace across storage #1, the metadata server, and storage #2, with each "op param" entry marked as persisted or non-persisted.]

Inconsistency No.1
  • Cause: rename() persisted before append()
  • Ordering: cross-node dependency
  • Consequence: data loss
  • Fixed by fsck? No
slide-10
SLIDE 10

Crash vulnerability example

[Figure: the operation trace across storage #1, the metadata server, and storage #2, with each "op param" entry marked as persisted or non-persisted.]

Inconsistency No.2
  • Cause: unlink() persisted before rename()
  • Ordering: cross-node dependency
  • Consequence: data loss
  • Fixed by fsck? No
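The first two inconsistencies can be reproduced in a small model of the ARVR workload. This is a sketch under stated assumptions, not the behavior of BeeGFS itself: the model tracks only three operations (append to the new chunk, rename on the metadata server, unlink of the old chunk), and it reads the unlink() in Inconsistency No.2 as the removal of the old chunk on a storage server. The names `crash_state`, `visible_content`, and `arvr_consistent` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which operations reached stable storage before the crash
 * (simplified model: one flag per operation). */
struct crash_state {
    bool append_persisted;  /* storage server: data appended to the new chunk */
    bool rename_persisted;  /* metadata server: dentries/tmp -> dentries/file */
    bool unlink_persisted;  /* storage server: old chunk removed */
};

/* Content reachable through file.txt after the crash, or NULL if the
 * chunk it refers to no longer exists. */
static const char *visible_content(struct crash_state s,
                                   const char *old_data,
                                   const char *new_data)
{
    if (s.rename_persisted) {
        /* file.txt refers to the new chunk; it is empty if the
         * append never reached the disk (Inconsistency No.1). */
        return s.append_persisted ? new_data : "";
    }
    /* file.txt still refers to the old chunk, which is gone if the
     * unlink was persisted first (Inconsistency No.2). */
    return s.unlink_persisted ? NULL : old_data;
}

/* ARVR expectation: a reader sees the complete old content or the
 * complete new content, never anything else. Assumes both contents
 * are non-empty strings. */
static bool arvr_consistent(struct crash_state s,
                            const char *old_data, const char *new_data)
{
    const char *seen = visible_content(s, old_data, new_data);
    return seen != NULL &&
           (strcmp(seen, old_data) == 0 || strcmp(seen, new_data) == 0);
}
```

In this model, a state with only the rename persisted exposes an empty file.txt, and a state with the unlink but not the rename persisted leaves file.txt without its data. Both fail `arvr_consistent`, which matches the "data loss" consequence on the slides.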
slide-11
SLIDE 11

Crash vulnerability example

[Figure: the operation trace across storage #1, the metadata server, and storage #2, with each "op param" entry marked as persisted or non-persisted.]

Inconsistency No.3
  • Cause: unlink() persisted before rename()
  • Ordering: intra-node dependency
  • Consequence: data loss
  • Fixed by fsck? Yes
slide-12
SLIDE 12

PFSCheck design

Discovering PFS crash vulnerabilities systematically & efficiently

[Figure: PFSCheck workflow in five numbered steps.]
  • 1. Record: client-side traces and server-side traces
  • 2. Crash: crash states, followed by filesystem & app-level recovery
  • 3. Test: the workload checker marks each crash state as passed or failed
  • 4. Legal replay: the consistency model yields legal states (file system images that satisfy the given consistency model)
  • 5. Classification: report

slide-13
SLIDE 13

The PFSCheck design

[Figure: PFSCheck workflow, step 1 shown.]

Automated workload generation
  • Unified API for I/O libraries

1. Multi-level tracing
  • Joint server-side & client-side I/O call tracing
  • Network packet tracing
  • Correlation between server & client operations

slide-14
SLIDE 14

The PFSCheck design

[Figure: PFSCheck workflow, steps 1 to 3 shown.]

2. Efficient crash state emulation
  • Automated crash state generation via trace permutation
  • Perform necessary post-crash recovery

3. Consistency testing
  • Workload-specific consistency checker
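Crash state generation from a recorded trace can be sketched as follows. This is a simplification, not the PFSCheck algorithm: instead of permuting the trace, it enumerates every subset of operations that might have been persisted at the crash, and it flags subsets in which a later operation is persisted while an earlier one is not. The types `traced_op` and `reorder_count` and the function `enumerate_crash_states` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct traced_op {
    const char *name;  /* operation label, for reporting */
    int node;          /* server that executes the operation */
};

struct reorder_count {
    unsigned states;      /* crash states examined */
    unsigned cross_node;  /* states with a reordered pair on different servers */
    unsigned intra_node;  /* states with a reordered pair on the same server */
};

/* trace[] is in program order; bit i of mask set means op i was persisted.
 * n must be smaller than the bit width of unsigned. */
static struct reorder_count enumerate_crash_states(const struct traced_op *trace,
                                                   size_t n)
{
    struct reorder_count c = {0, 0, 0};

    for (unsigned mask = 0; mask < (1u << n); mask++) {
        bool cross = false, intra = false;

        for (size_t j = 0; j < n; j++) {
            if (!(mask & (1u << j)))
                continue;                 /* op j was not persisted */
            for (size_t i = 0; i < j; i++) {
                if (mask & (1u << i))
                    continue;             /* earlier op i persisted: in order */
                /* op j persisted although the earlier op i was not */
                if (trace[i].node == trace[j].node)
                    intra = true;
                else
                    cross = true;
            }
        }
        c.states++;
        if (cross) c.cross_node++;
        if (intra) c.intra_node++;
    }
    return c;
}
```

Each generated state would then go through post-crash recovery and the workload-specific checker. Exhaustive enumeration grows as 2^n, so a practical tool needs pruning, which is outside the scope of this sketch.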

slide-15
SLIDE 15

The PFSCheck design

[Figure: PFSCheck workflow, steps 1 to 5 shown.]

4. Legal replay based on given consistency model
  • Crash consistency model specifies the legitimate crash states of the parallel file system

5. Crash vulnerability classification
  • If a vulnerable crash state is not a legal state, we attribute it to the PFS
  • Otherwise, I/O libraries are blamed
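The classification rule in step 5 can be written as a short decision function. This is a minimal sketch: crash states and legal states are represented as opaque strings (for example, a digest of a file system image), which is an assumption, and the names `blame`, `is_legal_state`, and `classify` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum blame { BLAME_NONE, BLAME_PFS, BLAME_IO_LIBRARY };

/* True if crash_state equals one of the states produced by legal replay. */
static bool is_legal_state(const char *crash_state,
                           const char *const *legal_states, size_t n_legal)
{
    for (size_t i = 0; i < n_legal; i++)
        if (strcmp(crash_state, legal_states[i]) == 0)
            return true;
    return false;
}

/* checker_passed is the verdict of the workload checker for this state. */
static enum blame classify(const char *crash_state, bool checker_passed,
                           const char *const *legal_states, size_t n_legal)
{
    if (checker_passed)
        return BLAME_NONE;        /* not a vulnerable crash state */
    if (!is_legal_state(crash_state, legal_states, n_legal))
        return BLAME_PFS;         /* the PFS left a state its model forbids */
    return BLAME_IO_LIBRARY;      /* the state is legal, so the layer above is blamed */
}
```

A failing state that legal replay can also produce is charged to the I/O library, because the file system stayed within its consistency model. A failing state outside the legal set is charged to the PFS.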

slide-16
SLIDE 16

Conclusion

  • Motivation: crash vulnerabilities could be exacerbated on PFSes due to the complexity of the parallel I/O stack
  • Study:
      – the number of crash-consistency bugs on BeeGFS and OrangeFS is higher than on a local file system
      – the workload can fail in more ways on PFSes
      – consistency depends on how persistence is reordered across nodes
  • Proposed framework: PFS-specific crash consistency checker with a focus on automation and efficiency

slide-17
SLIDE 17

Thank you!

Contact: Jinghan Sun (js39@illinois.edu)
