Okeanos: Wasteless Journaling for Fast and Reliable Multistream Storage
Andromachi Hatzieleftheriou, Stergios V. Anastasiadis
Department of Computer Science University of Ioannina, Greece
critical for system and application reliability
effectively random I/O
async writes have good performance due to batching in memory
sync writes result in wasteful traffic due to excessive full-page I/Os
Figure: Write Traffic (Linux ext3). Total Journal Volume (MB) vs. Request Size (KB) for Data Journaling (data & metadata) and Ordered (metadata only). Page Size=4KB.
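The waste from full-page I/O on small synchronous writes can be illustrated with simple arithmetic. The sketch below assumes the 4KB page size stated in the write-traffic figure, assumes page-aligned requests, and ignores journal metadata such as descriptor and commit blocks. The function names are hypothetical and are not taken from the ext3 code.

```python
PAGE_SIZE = 4096  # 4KB page size, as stated for the write-traffic figure


def page_granularity_bytes(request_size: int, page_size: int = PAGE_SIZE) -> int:
    """Bytes written when a request is rounded up to whole pages."""
    pages = -(-request_size // page_size)  # ceiling division
    return pages * page_size


def wasted_fraction(request_size: int, page_size: int = PAGE_SIZE) -> float:
    """Fraction of the page-granularity traffic that carries unmodified data."""
    total = page_granularity_bytes(request_size, page_size)
    return (total - request_size) / total


if __name__ == "__main__":
    for size in (512, 1024, 4096, 4097):
        print(size, page_granularity_bytes(size), wasted_fraction(size))
```

Under these assumptions, a 1KB synchronous write still costs a full 4KB page of journal traffic, so three quarters of the bytes written are unmodified data.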
1. keep data on disk
2. sequential disk throughput
3. writes with high positioning overhead; unnecessary writes of unmodified data
batch random small writes in memory
journal data updates at subpage granularity
[Diagram labels: memory (pages); disk (journal, filesystem)]
large writes; disk traffic duplication
write threshold differentiates requests by size
[Diagram labels: memory (pages); disk (journal, filesystem); data deltas]
atomic updates of both data and metadata
data updates either journaled or not depending on request size
consistency at least as strict as default ext3 journaling mode (ordered)
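The size-based decision described above can be sketched as a small routing function. This is only an illustration: `WRITE_THRESHOLD`, `WriteRequest`, and `route_write` are hypothetical names, the threshold value is an assumption (the slides do not state it), and the handling of requests that are not journaled is assumed to be a direct write to the filesystem.

```python
from dataclasses import dataclass

# Hypothetical value; the slides do not state the actual write threshold.
WRITE_THRESHOLD = 4096


@dataclass
class WriteRequest:
    offset: int   # byte offset of the update within the file
    data: bytes   # payload of the update


def route_write(req: WriteRequest, threshold: int = WRITE_THRESHOLD) -> str:
    """Return where the data of a write request goes.

    Updates strictly below the threshold are journaled; larger ones
    are assumed to go directly to their final filesystem location.
    """
    return "journal" if len(req.data) < threshold else "filesystem"


if __name__ == "__main__":
    print(route_write(WriteRequest(0, b"x" * 512)))
    print(route_write(WriteRequest(0, b"x" * 8192)))
```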
[Diagram labels: page cache; block buffer (modified data, original data); journal descriptor block (header, tags); multiwrite journal block (data deltas); data copies]
a multiwrite journal block accumulates multiple subpage data updates
apply data deltas to corresponding final disk blocks
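A minimal in-memory sketch of these two steps follows: packing several subpage deltas into one page-sized block, then applying them to their final blocks. The tag layout (block number, offset, length), the class names, and the dictionary standing in for the disk are all assumptions made for illustration; the slides only show that a descriptor block carries a header and tags and that the multiwrite block carries data deltas.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page and block size


@dataclass
class Tag:
    block_no: int  # final disk block the delta belongs to (hypothetical field)
    offset: int    # byte offset of the delta inside that block
    length: int    # delta length in bytes


@dataclass
class MultiwriteBlock:
    capacity: int = PAGE_SIZE
    tags: list = field(default_factory=list)
    payload: bytearray = field(default_factory=bytearray)

    def add(self, block_no: int, offset: int, delta: bytes) -> bool:
        """Append a subpage delta; return False if the block is full."""
        if offset + len(delta) > PAGE_SIZE:
            raise ValueError("delta crosses a block boundary")
        if len(self.payload) + len(delta) > self.capacity:
            return False
        self.tags.append(Tag(block_no, offset, len(delta)))
        self.payload += delta
        return True


def apply_deltas(mw: MultiwriteBlock, disk: dict) -> None:
    """Copy each delta into its final block, in the order it was added."""
    pos = 0
    for tag in mw.tags:
        block = disk.setdefault(tag.block_no, bytearray(PAGE_SIZE))
        block[tag.offset:tag.offset + tag.length] = mw.payload[pos:pos + tag.length]
        pos += tag.length
```

For example, two small updates to different blocks share one journal block and are later copied to their final locations:

```python
mw = MultiwriteBlock()
mw.add(7, 100, b"abc")
mw.add(9, 0, b"xy")
disk = {}
apply_deltas(mw, disk)
```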
x86-based servers
quad-core 2.66GHz processor
3GB RAM
Seagate Cheetah SAS 300GB 15KRPM disks
Microbenchmarks
Postmark
MPI-IO over PVFS2
similar to NILFS (stable Linux port of LFS)
Figure: Write Latency (ms) vs. Number of Streams at 1 Mbps/stream, for Selective, Ordered, Wasteless, Data, and NILFS.
Figure: Read Latency (us) vs. Number of Streams at 1 Mbps/stream, for NILFS, Selective, Ordered, Data, and Wasteless.
Figure: Journal Throughput (MB/s) vs. Number of Streams at 1Kbps/stream, for Data, Wasteless, Selective, and Ordered.
Figure: File System Throughput (MB/s) vs. Number of Streams at 1Kbps/stream, for Ordered, Data, Wasteless, and Selective.
Lower is better!
- Small files workload
  wasteless increases transaction throughput
- Parallel I/O workload
  13 clients, 1 PVFS2 data server, 1 PVFS2 metadata server (15 machines)
  wasteless doubles the throughput of parallel application checkpointing
Figure: Postmark. Transactions/s vs. Request Size (KB), for Wasteless, Data, Selective, and Ordered.
Figure: MPI-IO over PVFS2 (Write Size 1KB). Throughput (MB/s) vs. Threads per Client, for Wasteless, Data, Selective, and Ordered.
apply subpage journaling of data updates to ensure reliability
wasteless journaling merges subpage writes into page-sized journal blocks
selective journaling journals only updates below a write threshold
reduced write latency
improved transaction throughput
avoided bandwidth waste
extend to virtualization environments and flash memory systems