Lecture 19: Virtual Memory

Sep 14, 2023

SLIDE 1

Adapted from UC Berkeley CS252 S01

Lecture 19: Virtual Memory

Virtual memory concept, virtual-to-physical translation, page table, TLB, Alpha 21264 memory hierarchy

SLIDE 2

Virtual Memory

Virtual memory (VM) allows programs to have the illusion of a very large memory that is not limited by physical memory size

Makes main memory (DRAM) act like a cache for secondary storage (magnetic disk)

Otherwise, application programmers would have to move data in and out of main memory themselves

That is how virtual memory was first proposed

Virtual memory also provides the following functions

Allowing multiple processes to share physical memory in a multiprogramming environment

Providing protection for processes (compare the Intel 8086: without VM, applications can overwrite the OS kernel)

Facilitating program relocation in the physical memory space

SLIDE 3

VM Example

SLIDE 4

Virtual Memory and Cache

VM address translation provides a mapping from the virtual address used by the processor to a physical address in main memory or a location in secondary storage

Cache terms vs. VM terms

Cache block => page

Cache miss => page fault

Tasks of hardware and OS

TLB does fast address translations

OS handles less frequent events:

page fault

TLB miss (when the software approach is used)

SLIDE 5

Virtual Memory and Cache

Parameter            L1 Cache                     Main Memory
Block (page) size    16-128 bytes                 4 KB to 64 KB
Hit time             1-3 cycles                   50-150 cycles
Miss penalty         8-300 cycles                 1M to 10M cycles
Miss rate            0.1-10%                      0.00001-0.001%
Address mapping      25-45 bits => 13-21 bits     32-64 bits => 25-45 bits

SLIDE 6

4 Qs for Virtual Memory

Q1: Where can a block be placed in the upper level?

Miss penalty for virtual memory is very high => full associativity is desirable (blocks may be placed anywhere in memory)

Software determines the location while the disk is being accessed (10M cycles is enough time for sophisticated replacement)

Q2: How is a block found if it is in the upper level?

Address is divided into a page number and a page offset

Page table and translation buffer are used for address translation

Q: Why does full associativity not affect hit time?

SLIDE 7

4 Qs for Virtual Memory

Q3: Which block should be replaced on a miss?

We want to reduce the miss rate, and replacement can be handled in software

Least Recently Used (LRU) is typically used

A typical approximation of LRU:

Hardware sets reference bits

OS records the reference bits and clears them periodically

OS selects a page among the least recently referenced for replacement
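The reference-bit approximation of LRU can be sketched as follows. This is a minimal illustration using the "aging" variant, which is one possible way to record reference bits; the class name, the 8-bit history width, and the method names are assumptions for illustration, not something the slides specify.

```python
HISTORY_BITS = 8  # assumed width of the per-page history value


class AgingReplacer:
    """Approximate LRU from periodically sampled reference bits."""

    def __init__(self, pages):
        self.ref_bit = {p: 0 for p in pages}  # set by "hardware" on each access
        self.history = {p: 0 for p in pages}  # maintained by the "OS"

    def access(self, page):
        """Hardware side: mark the page as referenced."""
        self.ref_bit[page] = 1

    def tick(self):
        """Periodic OS work: record each reference bit, then clear it."""
        for page, bit in self.ref_bit.items():
            shifted = self.history[page] >> 1
            self.history[page] = shifted | (bit << (HISTORY_BITS - 1))
            self.ref_bit[page] = 0

    def victim(self):
        """Pick a page with the smallest history (least recently referenced)."""
        return min(self.history, key=self.history.get)


replacer = AgingReplacer(["A", "B", "C"])
replacer.access("A"); replacer.access("B"); replacer.tick()
replacer.access("A"); replacer.tick()
replacer.access("C"); replacer.tick()
print(replacer.victim())  # B
```

Page B was referenced only in the oldest interval, so its history value is the smallest and it is chosen for replacement.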

Q4: What happens on a write?

Writing to disk is very expensive

Use a write-back strategy

SLIDE 8

Virtual and Physical Addresses

A virtual address consists of a virtual page number and a page offset. The virtual page number gets translated to a physical page number. The page offset is not changed

Diagram: virtual address = virtual page number (36 bits) + page offset (12 bits); translation yields physical address = physical page number (33 bits) + page offset (12 bits)
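As a small illustration of the split described on this slide, the sketch below separates a virtual address into page number and offset, then rebuilds a physical address with the offset unchanged. The 12-bit offset follows the slide; the function names and the sample addresses are made up for the example.

```python
PAGE_OFFSET_BITS = 12  # 12-bit page offset, i.e. 4 KB pages


def split_virtual_address(vaddr: int) -> tuple[int, int]:
    """Return (virtual page number, page offset)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


def join_physical_address(ppn: int, offset: int) -> int:
    """Concatenate a physical page number with the unchanged page offset."""
    return (ppn << PAGE_OFFSET_BITS) | offset


vpn, offset = split_virtual_address(0x12345678)
print(hex(vpn), hex(offset))                      # 0x12345 0x678
print(hex(join_physical_address(0x9A, offset)))   # 0x9a678
```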

SLIDE 9

Address Translation Via Page Table

Assume the access hits in main memory

SLIDE 10

Address Translation with Page Tables

A page table translates a virtual page number into a physical page number

A page table register indicates the start of the page table

The virtual page number is used as an index into the page table, whose entries contain

The physical page number

A valid bit that indicates whether the page is present in main memory

A dirty bit that indicates whether the page has been written

Protection information about the page (read only, read/write, etc.)

Since the page table contains a mapping for every virtual page, no tags are required (how does this compare with a cache?)

Page table access is slow; we will see the solution
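The lookup described above can be sketched as a direct index into a list of entries. This is only an illustration: the field names, the exception classes, the single `writable` flag standing in for the protection information, and the 13-bit offset are all assumptions.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 13  # assumed 8 KB pages


class PageFault(Exception):
    """Raised when the valid bit is clear (page not in main memory)."""


class ProtectionFault(Exception):
    """Raised when the access violates the page's protection bits."""


@dataclass
class PageTableEntry:
    ppn: int = 0
    valid: bool = False
    dirty: bool = False
    writable: bool = False  # simplified stand-in for the protection field


def translate(page_table, vaddr, is_write=False):
    """Translate a virtual address using a flat page table (a list of PTEs)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    pte = page_table[vpn]  # direct index by VPN: no tag comparison needed
    if not pte.valid:
        raise PageFault(vpn)
    if is_write:
        if not pte.writable:
            raise ProtectionFault(vpn)
        pte.dirty = True  # remember that the page must be written back
    return (pte.ppn << PAGE_OFFSET_BITS) | offset


table = [PageTableEntry() for _ in range(8)]
table[3] = PageTableEntry(ppn=5, valid=True, writable=True)
print(hex(translate(table, (3 << 13) | 0x10)))  # 0xa010
```

Because every virtual page has its own slot, the index alone identifies the entry, which is why no tag is stored.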

SLIDE 11

Page Table Diagram

SLIDE 12

Accessing Main Memory or Disk

A valid bit of zero means the page is not in main memory

A page fault then occurs, and the missing page is read in from disk
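A toy model of this behavior is sketched below: a page that is not resident triggers a fault and is brought in, and an evicted page is written to disk only if it is dirty (the write-back strategy). The class and counter names are hypothetical, and FIFO eviction is assumed purely to keep the example short; it is not the policy the lecture recommends.

```python
from collections import OrderedDict


class DemandPager:
    """Toy model: resident pages have valid=1; all other pages are on disk."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # vpn -> dirty flag, oldest first
        self.page_faults = 0
        self.disk_writes = 0

    def access(self, vpn, is_write=False):
        if vpn not in self.resident:  # valid bit is zero => page fault
            self.page_faults += 1
            if len(self.resident) == self.num_frames:
                _, dirty = self.resident.popitem(last=False)  # FIFO victim (assumed)
                if dirty:  # write back only modified pages
                    self.disk_writes += 1
            self.resident[vpn] = False  # page read in from disk, initially clean
        if is_write:
            self.resident[vpn] = True  # set the dirty bit


pager = DemandPager(num_frames=2)
pager.access(1, is_write=True)
pager.access(2)
pager.access(3)  # evicts dirty page 1
pager.access(2)  # hit
pager.access(4)  # evicts clean page 2
print(pager.page_faults, pager.disk_writes)  # 4 1
```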

SLIDE 13

How Large Is Page Table?

Suppose

48-bit virtual address

41-bit physical address

8 KB pages => 13-bit page offset

Each page table entry is 8 bytes

How large is the page table?

Virtual page number = 48 - 13 = 35 bits

Number of entries = number of pages = 2^35 = 32G

Total size = number of entries x bytes/entry = 32G x 8B = 256 Gbytes

Each process needs its own page table

Page tables can be very large, so they must be stored in main memory or even paged themselves, resulting in slow access

We need techniques to reduce page table size
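The size of a flat, single-level page table can be computed directly from the three stated parameters, which makes it easy to try other address widths. The function name is made up for this sketch, and it assumes the page size is a power of two.

```python
def flat_page_table_bytes(va_bits, page_bytes, pte_bytes):
    """Size of a single-level page table with one entry per virtual page."""
    offset_bits = page_bytes.bit_length() - 1  # log2 for power-of-two page sizes
    vpn_bits = va_bits - offset_bits
    return (1 << vpn_bits) * pte_bytes


# Parameters from the slide: 48-bit virtual address, 8 KB pages, 8-byte entries.
size = flat_page_table_bytes(va_bits=48, page_bytes=8 * 1024, pte_bytes=8)
print(size // 2**30, "GiB")  # 256 GiB

# A smaller, commonly used teaching configuration for comparison.
print(flat_page_table_bytes(32, 4 * 1024, 4) // 2**20, "MiB")  # 4 MiB
```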

SLIDE 14

TLB: Improving Page Table Access

We cannot afford to access the page table on every memory access, including cache hits (the cache itself would then make no sense)

Again, use a cache to speed up accesses to the page table! (a cache for a cache?)

The TLB (translation lookaside buffer) stores frequently accessed page table entries

A TLB entry is like a cache entry

Tag holds portions of the virtual address

Data portion holds the physical page number, protection field, valid bit, use bit, and dirty bit (as in a page table entry)

Usually fully associative or highly set associative

Usually 64 or 128 entries

Access the page table only on TLB misses
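A minimal sketch of a TLB sitting in front of the page table is shown below. The page table is modeled as a plain dictionary, only the physical page number is cached, and LRU replacement is assumed; all names are illustrative.

```python
from collections import OrderedDict


class TLB:
    """Small fully associative TLB with LRU replacement (an assumed policy)."""

    def __init__(self, page_table, num_entries=64):
        self.page_table = page_table  # dict: vpn -> ppn, stands in for the real table
        self.num_entries = num_entries
        self.entries = OrderedDict()  # vpn (tag) -> ppn (data), least recent first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:  # TLB hit: no page table access
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        self.misses += 1
        ppn = self.page_table[vpn]  # slow path: access the page table
        if len(self.entries) == self.num_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpn] = ppn
        return ppn


tlb = TLB({0: 7, 1: 3, 2: 9}, num_entries=2)
for vpn in [0, 1, 0, 2, 1]:
    tlb.lookup(vpn)
print(tlb.hits, tlb.misses)  # 1 4
```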

SLIDE 15

TLB Characteristics

The following are characteristics of TLBs

TLB size: 32 to 4,096 entries

Block size: 1 or 2 page table entries (4 or 8 bytes each)

Hit time: 0.5 to 1 clock cycle

Miss penalty: 10 to 30 clock cycles (go to page table)

Miss rate: 0.01% to 0.1%

Associativity: fully associative or set associative

Write policy: write back (replaced infrequently)
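To see how these numbers combine, the sketch below computes an average translation time as hit time plus miss rate times miss penalty. The formula is the usual average-access-time form applied to the TLB; the specific values are picked from the ranges above only as an example.

```python
def average_translation_cycles(hit_time, miss_rate, miss_penalty):
    """Average cycles per translation: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Example values from the listed ranges: 1-cycle hit, 0.1% miss rate, 30-cycle penalty.
cycles = average_translation_cycles(hit_time=1, miss_rate=0.001, miss_penalty=30)
print(round(cycles, 3))
```

With these values the average cost is about 1.03 cycles, so the rare misses add only a few percent to translation time.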

SLIDE 16

Alpha 21264 Data TLB

128 entries, fully associative

ASN (address space number, like a PID) to avoid flushing on context switches

Also checks protection

SLIDE 17

Determine Page Size

Factor               Favors          Comments
Page table size      Larger size     Inversely proportional to page size
Fast L1 cache hit    Larger size     L1 cache can be larger
I/O utilization      Larger size     Longer burst transfer
TLB hit rate         Larger size     Increasing TLB coverage
Storage efficiency   Smaller size    Reducing fragmentation
I/O efficiency       Smaller size    Avoiding unnecessary data transfer
Process start-up     Smaller size    Small processes are popular

Most commonly used size: 4KB or 8KB

Hardware may support a range of page sizes

OS selects the best one(s) for its purpose
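The "TLB coverage" comment can be made concrete: coverage is the amount of memory reachable without a TLB miss, that is, the number of entries times the page size. The sketch below uses a 128-entry TLB, as in the Alpha 21264, with a few page sizes chosen for illustration; the function name is hypothetical.

```python
def tlb_coverage_bytes(entries, page_bytes):
    """Memory reachable through the TLB without a miss."""
    return entries * page_bytes


for page_kb in (4, 8, 64):
    coverage_kb = tlb_coverage_bytes(128, page_kb * 1024) // 1024
    print(f"{page_kb} KB pages: {coverage_kb} KB covered")
# 4 KB pages: 512 KB covered
# 8 KB pages: 1024 KB covered
# 64 KB pages: 8192 KB covered
```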

SLIDE 18

Alpha 21264 TLB Access

Diagram labels: virtually indexed, physically tagged; physically indexed, physically tagged

SLIDE 19

Alpha 21264 Virtual Memory

Combining segmentation and paging

Segmentation: variable-size memory space range, usually defined by a base register and a limit field

Segmentation assigns meanings to address spaces and reduces the address space that needs paging (reducing page table size)

Paging is used on the address space of each segment

Three segments in Alpha

kseg: reserved for OS kernel, not under VM management

seg0: virtual addresses accessible to user processes

seg1: virtual addresses accessible to OS kernel

SLIDE 20

Two Viewpoints of Virtual Memory

Application programs

See a large, flat memory space

Assume fast access to every location

Hardware/OS hide the complexity

OS Kernel

Manages multiple process spaces

Reserves direct access to some portions of physical memory

May access physical memory, its own virtual memory, and the virtual memory of the current process

Hardware facilitates fast VM accesses, and the OS manages slow, less frequent events

SLIDE 21

Alpha 21264 Page Table

Diagram labels: 10-bit index fields; 1024 8B PTEs per page table; 13-bit page offsets; 28-bit physical page number

Page table access on TLB miss managed by software
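A software page table walk of this kind can be sketched as below. The 10-bit index width, the 1024 entries per table, and the 13-bit offset come from the slide labels; the use of three levels, the nested-dictionary representation, and all function names are assumptions made for illustration.

```python
LEVEL_BITS = 10   # 10-bit index => 1024 PTEs per table (1024 x 8B = one 8 KB page)
OFFSET_BITS = 13  # 8 KB pages
LEVELS = 3        # assumption: three levels of page tables


def split_alpha_va(vaddr):
    """Return ([level-1, level-2, level-3 indices], page offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    vpn = vaddr >> OFFSET_BITS
    indices = []
    for _ in range(LEVELS):
        indices.append(vpn & ((1 << LEVEL_BITS) - 1))
        vpn >>= LEVEL_BITS
    indices.reverse()  # highest-level index first
    return indices, offset


def walk(root, vaddr):
    """Follow the indices through nested tables; the leaf holds the physical page number."""
    indices, offset = split_alpha_va(vaddr)
    node = root
    for index in indices:
        node = node[index]  # a KeyError here models an unmapped page
    return (node << OFFSET_BITS) | offset


root = {1: {2: {3: 0x55}}}
vaddr = (1 << 33) | (2 << 23) | (3 << 13) | 0x1ABC
print(split_alpha_va(vaddr))   # ([1, 2, 3], 6844)
print(hex(walk(root, vaddr)))  # 0xababc
```

Only the tables that are actually touched need to exist, which is how a multi-level layout avoids allocating one flat table for the whole address space.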

SLIDE 22

Memory Protection

Memory protection: preventing unauthorized accesses to process and kernel memory

Memory protection implementation:

User programs can only access memory through virtual memory

Each PTE contains protection bits to allow shared but protected accesses

Protection fields in Alpha

Valid, user read enable, kernel read enable, user write enable, and kernel write enable
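A check against the five protection fields listed above might look like the following sketch. The class and function names are hypothetical, and the real PTE bit layout is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protection:
    valid: bool
    user_read: bool
    kernel_read: bool
    user_write: bool
    kernel_write: bool


def access_allowed(prot, kernel_mode, is_write):
    """Return True if the access is permitted by the PTE's protection fields."""
    if not prot.valid:
        return False
    if is_write:
        return prot.kernel_write if kernel_mode else prot.user_write
    return prot.kernel_read if kernel_mode else prot.user_read


# A page that users may read but only the kernel may write.
shared = Protection(valid=True, user_read=True, kernel_read=True,
                    user_write=False, kernel_write=True)
print(access_allowed(shared, kernel_mode=False, is_write=False))  # True
print(access_allowed(shared, kernel_mode=False, is_write=True))   # False
print(access_allowed(shared, kernel_mode=True, is_write=True))    # True
```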

SLIDE 23

Memory Hierarchy Example: Alpha 21264 in AlphaServer ES40

L1 instruction cache: 2-way, 64KB, 64-byte block, virtually indexed and tagged

Uses way prediction and line prediction to allow instruction fetching

Instruction prefetcher: stores four prefetched instructions, accessed before L2 cache

L1 data cache: 2-way, 64KB, 64-byte block, virtually indexed, physically tagged, write-through

Victim buffer: 8-entry, checked before L2 access

L2 unified cache: 1-way (direct-mapped), 1MB to 16MB, off-chip, write-back

Allows critical-word transfer to L1 cache; transfers 16B per 2.25ns

TLB: 128-entry fully associative for instructions and data (each)

ES40: L1 miss penalty 22ns, L2 130ns; up to 32GB memory; 256-bit memory buses (64-bit into processor)

Read 5.13 for more details