Onyx: A Prototype Phase-Change Memory Storage Array




  1. Onyx: A Prototype Phase-Change Memory Storage Array
     Ameen Akel*, Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson
     Non-Volatile Systems Laboratory, Department of Computer Science and Engineering, University of California, San Diego
     *Now at Micron Technology

  2. 4 KB Operation Request Latencies
     [Figure: read and write request latency in µs (log scale, 0.01 to 10000) for disk, flash, current PCM, and projected PCM.]

  3. Advantages of Studying PCM SSDs
     • Understand current PCM performance
       – With current storage infrastructure
       – Versus other NV tech, e.g. flash SSDs
     • PCM performance may differ from simulation
       – Variance in write latency due to data
       – Wear-out characteristics
     • Use real applications to gauge performance
     • Understand how software should change for PCM
     • Prepare to integrate future-generation PCM

  4. Overview
     • Motivation
     • PCM Devices
       – Technology Overview
       – Micron P8P Devices
     • Onyx Architecture
       – Logical Architecture
       – PCM DIMMs
       – Physical Architecture
     • Performance Analysis
     • Applications and Conclusions

  5. PCM: The Device Level
     • PCM storage medium: Chalcogenide
       – Resistance depends on molecular phase
     • Writes
       – Heaters are attached to the chalcogenide
       – Current passed through heaters to change phase
       – Allows bit-alterable writes
     • Reads
       – Measure resistance through chalcogenide area
       – Resistance sensed by ability to sink current
     [Figure credit: M. Breitwisch et al., VLSI '07]

  6. PCM: The Device Level
     • PCM storage medium: Chalcogenide
       – Resistance depends on molecular phase
     • Writes
       – Heaters are attached to the chalcogenide
       – Current passed through heaters to change phase
       – Allows bit-alterable writes
     • Reads
       – Measure resistance through chalcogenide area
       – Resistance sensed by ability to sink current
     [Figure: XRD measurements of the amorphous, fcc, and hexagonal phases. Credit: M. Wuttig et al., FP6 Project CAMELS]

  7. PCM Write Operations in Depth
     • Material heated to…
       – > 600 °C then cooled quickly → Amorphous
       – ~350 °C then cooled slowly → Crystalline
     • Set and reset
       – Reset: 0 state
       – Set: 1 state
     [Figure: heating pulse diagram with duration labels 10 ns and 50-150 ns.]

  8. PCM Projections
     • Future PCM latency projections*:
         Operation   Latency
         Read        48 ns
         Set         150 ns
         Reset       40 ns
     • Process node progression: 90, 45, 32, 20, 9 nm
     *B. C. Lee et al. Architecting Phase Change Memory as a Scalable DRAM Alternative. ISCA 2009.

  9. P8P PCM
     • First-generation NOR-flash replacement
     • Part: NP8P128A13B1760E (P8P)
     • Process node: 90 nm
     • Capacity: 16 MB
     • Per-device bandwidth, latency, current
       – Write (64 bytes): 0.5 MB/s, 120 µs, 35 mA
       – Read (16 bytes): 48.6 MB/s, 314 ns, 15 mA
     • Lifetime: one million writes until first bit error
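The per-device bandwidth figures on the P8P slide appear to follow directly from the access size and latency. A minimal sketch of that arithmetic, assuming back-to-back operations on one device and assuming the slide's "MB/s" means 2^20 bytes per second (the function name is made up for illustration):

```python
MIB = 1024 * 1024  # assumption: the slide's "MB" is 2**20 bytes


def bandwidth_mib_per_s(bytes_per_op: int, latency_s: float) -> float:
    """Bandwidth of a single device issuing operations back to back."""
    return bytes_per_op / latency_s / MIB


# Figures taken from the P8P slide: 64-byte writes at 120 us, 16-byte reads at 314 ns.
write_bw = bandwidth_mib_per_s(64, 120e-6)
read_bw = bandwidth_mib_per_s(16, 314e-9)

print(f"write: {write_bw:.2f} MiB/s")
print(f"read:  {read_bw:.1f} MiB/s")
```

Under these assumptions the results land at roughly 0.5 and 48.6, matching the slide, which suggests the quoted bandwidths are latency-derived rather than separately measured.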

  10. Overview
      • Motivation
      • PCM Devices
        – Technology Overview
        – Micron P8P Devices
      • Onyx Architecture
        – Logical Architecture
        – PCM DIMMs
        – Physical Architecture
      • Performance Analysis
      • Applications and Conclusions

  11. Moneta: SSD for Emulated Fast NVMs
      • DRAM-based NV-SSD emulator
      • Learn by building
        – Hardware: controller & interconnect
        – Software: driver, file system, apps
      • Uses optimized software stack
        – Decreases request latency
        – Improves request concurrency
      [Diagram: Application, File System, OS IO Stack, and Moneta Driver running on a CPU with DRAM, connected over PCIe to Moneta and its DRAM banks.]

  12. Onyx: Phase-Change Memory SSD
      • Based on Moneta*
        – Shares hardware
        – Shares software stack
      • PCM replaces DRAM
        – Uses real PCM
        – Custom PCM controller
      [Diagram: Application, File System, OS IO Stack, and Onyx Driver running on a CPU with DRAM, connected over PCIe to Onyx and its PCM banks.]
      *A. M. Caulfield et al. Moneta: A high-performance storage array architecture for next-generation, non-volatile memories. MICRO 2010.

  13. Moneta/Onyx Architecture
      [Diagram labels: Host via PIO, Request Queue, Scoreboard, Tag Status Registers, Transfer Buffers, DMA Control, Host via DMA, Ring Control, Ring (4 GB/s), and four 2 GB PCM blocks.]

  14. Onyx PCM Controller
      • Request completion
        – Late completion: on PCM write completion
        – Early completion: on request reception
      • Start-Gap wear leveling*
        – Low-overhead wear leveling (two registers + logic)
        – Prevents hot spots from wearing out memory
        – Rotates line in memory every gap interval
      *M. K. Qureshi et al. Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling. MICRO 42.
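The slide only names Start-Gap and its two registers, so the following is a minimal software sketch of the scheme as described in the cited Qureshi et al. paper, not Onyx's controller logic. The class name, the `gap_interval` parameter, and the list standing in for PCM lines are all assumptions made for illustration.

```python
class StartGap:
    """Start-Gap remapping: n logical lines stored in n + 1 physical lines."""

    def __init__(self, n_lines: int, gap_interval: int):
        self.n = n_lines
        self.gap_interval = gap_interval   # writes between gap movements
        self.start = 0                     # register 1: rotation offset
        self.gap = n_lines                 # register 2: index of the spare line
        self.writes = 0
        self.mem = [None] * (n_lines + 1)  # stand-in for the physical lines

    def _physical(self, logical: int) -> int:
        # Rotate by start, then skip over the gap line.
        pa = (logical + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa

    def _move_gap(self) -> None:
        if self.gap == 0:
            # Gap wraps: the last line moves to slot 0 and start advances.
            self.mem[0] = self.mem[self.n]
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # The line just below the gap moves up into it.
            self.mem[self.gap] = self.mem[self.gap - 1]
            self.gap -= 1

    def write(self, logical: int, value) -> None:
        self.mem[self._physical(logical)] = value
        self.writes += 1
        if self.writes % self.gap_interval == 0:
            self._move_gap()

    def read(self, logical: int):
        return self.mem[self._physical(logical)]


# Hammer one logical line; its physical location keeps rotating.
sg = StartGap(n_lines=8, gap_interval=4)
for i in range(8):
    sg.write(i, f"line-{i}")
for _ in range(200):
    sg.write(3, "hot")
print(sg.read(3), sg.read(0), sg.start, sg.gap)
```

Each gap movement copies one line, so the extra write traffic is one line copy per `gap_interval` writes, which is the low-overhead property the slide refers to.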

  15. Closer Look at a PCM DIMM
      • 8 ranks of 5 PCM devices
        – 64 data bits + 16 ECC bits
        – Effectively 16 ranks per memory interface
      • Shared control and data lines
      • Capacity: 640 MB / DIMM
      [Diagram: Address[0:25] shared by Devices 0-4; Device 0 carries Data[0:15], Device 1 Data[16:31], Device 2 Data[32:47], Device 3 Data[48:63], Device 4 Data[64:79].]
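The rank layout above spreads each 80-bit word across five 16-bit devices. A small sketch of that striping, assuming bit 0 is the least significant bit of the word (the slide does not say), with helper names invented for illustration:

```python
from typing import List

LANE_BITS = 16  # data pins per device, from the diagram
LANES = 5       # devices per rank
LANE_MASK = (1 << LANE_BITS) - 1


def split_word(word: int) -> List[int]:
    """Split an 80-bit word so lane i holds bits 16*i through 16*i + 15."""
    if not 0 <= word < (1 << (LANE_BITS * LANES)):
        raise ValueError("word must fit in 80 bits")
    return [(word >> (LANE_BITS * i)) & LANE_MASK for i in range(LANES)]


def join_lanes(lanes: List[int]) -> int:
    """Reassemble the 80-bit word from the five per-device lanes."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & LANE_MASK) << (LANE_BITS * i)
    return word


lanes = split_word(0x4444_3333_2222_1111_0000)
print([hex(lane) for lane in lanes])
```

Because all five devices share the address and control lines, one access moves all five lanes together; which lane carries the ECC bits is not stated on the slide, so the sketch does not single one out.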

  16. Prototyping Advanced SSDs
      • Built on RAMP’s BEE3 board
        – Four FPGAs connected in a ring
        – Four DIMM slots per FPGA
        – PCIe 1.1 x8 host connection
      • System capacity: 10 GB
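The capacity figures in the deck can be cross-checked from the per-device numbers. A short worked calculation using only values quoted on the slides; reading the 2 GB per PCM block in the architecture diagram as the data-only (ECC-excluded) share of one FPGA's DIMMs is an inference, not something the slides state:

```python
DEVICE_MB = 16            # P8P capacity
DEVICES_PER_RANK = 5
RANKS_PER_DIMM = 8
DIMMS_PER_FPGA = 4
FPGAS = 4
DATA_BITS, ECC_BITS = 64, 16

dimm_mb = DEVICE_MB * DEVICES_PER_RANK * RANKS_PER_DIMM
raw_gb = dimm_mb * DIMMS_PER_FPGA * FPGAS / 1024

# Assumption: only the 64 data bits of each 80-bit word count as usable capacity.
data_fraction = DATA_BITS / (DATA_BITS + ECC_BITS)
data_gb_per_fpga = dimm_mb * DIMMS_PER_FPGA * data_fraction / 1024

print(f"per DIMM: {dimm_mb} MB")
print(f"raw system capacity: {raw_gb:.0f} GB")
print(f"data capacity per FPGA: {data_gb_per_fpga:.0f} GB")
```

The first two results reproduce the 640 MB per DIMM and 10 GB system capacity on the slides; the third gives 2 GB per FPGA, which is at least consistent with the four 2 GB PCM blocks in the architecture diagram.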

  17. Overview
      • Motivation
      • PCM Devices
        – Technology Overview
        – Micron P8P Devices
      • Onyx Architecture
        – Logical Architecture
        – PCM DIMMs
        – Physical Architecture
      • Performance Analysis
      • Applications and Conclusions

  18. Read Performance
      [Figure: read bandwidth (MB/s, 0 to 2000) versus request size (0.5 KB to 1024 KB) for Onyx, FusionIO, and Moneta.]

  19. Write Performance
      [Figure: write bandwidth (MB/s, 0 to 2000) versus request size (0.5 KB to 1024 KB) for Onyx-Late, Onyx-Early, FusionIO, and Moneta.]

  20. BerkeleyDB Performance
      [Figure: transactions per second (0 to 8000) on the BTree and HashTable BDB benchmarks for Onyx, FusionIO, and Moneta.]

  21. Potential PCM Applications
      • As a read cache
        – First-gen PCM read speeds compete with flash
        – Next-gen PCM should improve read performance
      • Replace DRAM in high-performance apps
        – PCM cost will likely drop below DRAM
        – Will scale aggressively past DRAM
      • Outpace flash in high-performance SSDs
        – Reduces complexity of management
        – Provides higher-rated lifetime
        – Saves power, logic, and design time

  22. Conclusions
      • Onyx designed to maximize PCM performance
      • More improvements possible as PCM scales
        – Onyx architecture will scale with PCM
        – Onyx will benefit from faster reads and writes
      • PCM simplifies SSD management relative to flash and improves small-access performance

  23. Thank You! Questions?
