SLIDE 1

Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems

Presented by Hadeel Alabandi

2/18/2014

SLIDE 2

Introduction and Motivation

  • Cache partitioning and sharing is a serious issue for the effective utilization of multicore processors
  • Existing studies evaluated cache partitioning through simulation, which has several limitations:
  • Excessive simulation time
  • Absence of OS activities
  • Proneness to simulation inaccuracy

SLIDE 3

Introduction and Motivation (cont.)

  • In this paper, a software approach is used
  • It supports static and dynamic cache partitioning through memory address mapping
  • It emulates a hardware partitioning mechanism and examines cache partitioning policies on real systems
  • Three metrics were used in the evaluation for optimization purposes:
  • Performance
  • Fairness
  • QoS

SLIDE 4

Cache Partitioning for Multicore Processors

  • It has two interdependent parts:
  • Mechanism
  • Enforces cache partitioning
  • Provides input to the partitioning policy
  • Policy
  • Decides how much cache resource is allocated to each program, given an optimization objective

SLIDE 5

Adopted Evaluation Metrics in The Study

  • Performance metrics
  • Throughput (IPC)
  • Sum of the absolute IPCs of all programs
  • Combined miss rate
  • Aggregate miss rate over all programs
  • Combined misses
  • Total number of cache misses over all programs
  • QoS metrics
  • It is assumed that QoS constraints are never violated in this case
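
The three performance metrics above can be sketched directly from per-program hardware counters. This is a minimal illustration, assuming counter values (instructions, cycles, L2 accesses, L2 misses) collected per program, e.g. with a tool such as pfmon; the dictionary layout and numbers are made up.

```python
# Sketch of the three performance metrics from per-program counters.

def throughput(stats):
    """Throughput: sum of per-program IPCs (instructions per cycle)."""
    return sum(s["instructions"] / s["cycles"] for s in stats)

def combined_miss_rate(stats):
    """Combined miss rate: total L2 misses over total L2 accesses."""
    total_misses = sum(s["l2_misses"] for s in stats)
    total_accesses = sum(s["l2_accesses"] for s in stats)
    return total_misses / total_accesses

def combined_misses(stats):
    """Combined misses: total number of L2 misses."""
    return sum(s["l2_misses"] for s in stats)

# Hypothetical two-program workload.
workload = [
    {"instructions": 8e9, "cycles": 1e10, "l2_accesses": 2e8, "l2_misses": 1e7},
    {"instructions": 5e9, "cycles": 1e10, "l2_accesses": 3e8, "l2_misses": 6e7},
]
print(throughput(workload))          # 0.8 + 0.5
print(combined_miss_rate(workload))  # 7e7 / 5e8
print(combined_misses(workload))
```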

SLIDE 6

Adopted Evaluation Metrics in The Study (cont.)

  • Fairness metrics
  • Miss rates
  • The number of misses
  • The slowdown of each co-scheduled program should be identical after cache partitioning
  • In the study, fairness metrics are measured relative to single-core execution with a dedicated L2 cache
  • Data required for the policy metric and the evaluation metric were acquired by running a workload with different cache partitionings
  • The resulting correlation value lies in the range (-1 to 1)
  • If the result is 1, the correlation between the two metrics is perfect
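
A correlation in (-1, 1) of this kind is typically a Pearson coefficient. The sketch below computes it between a policy metric and an evaluation metric sampled over several partitionings; the metric values are invented for illustration and the helper name is not from the paper.

```python
# Pearson correlation between a policy metric and an evaluation metric,
# measured over several cache partitionings (values here are made up).
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

policy_metric = [0.90, 0.85, 0.70, 0.60]   # e.g. miss-rate fairness per partitioning
eval_metric   = [0.92, 0.88, 0.75, 0.58]   # e.g. slowdown fairness per partitioning
r = pearson(policy_metric, eval_metric)
print(r)   # close to 1 => the policy metric tracks the evaluation metric well
```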

SLIDE 7

Static OS-based Cache Partitioning

  • A static cache partitioning policy predetermines the amount of cache blocks allocated to each program at the beginning of its execution
  • Page coloring is used as the partitioning mechanism
  • Several bits of the physical address are shared between the cache set index and the physical page number
  • These bits are used as the page color
  • The physically addressed cache is divided into non-intersecting regions by page color
  • Pages with the same color are mapped to the same cache region
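
The color bits can be sketched numerically. Assuming the platform described later in the deck (4 MB, 16-way L2 cache with 64-byte lines and standard 4 KB pages), the set-index bits that lie above the page offset form the page color; the helper name below is illustrative.

```python
# Deriving the page color from a physical address for a 4 MB, 16-way,
# 64-byte-line L2 cache with 4 KB pages.

PAGE_SIZE  = 4096                 # 4 KB pages: physical page number = addr >> 12
LINE_SIZE  = 64                   # cache line size in bytes
CACHE_SIZE = 4 * 1024 * 1024
WAYS       = 16

SETS          = CACHE_SIZE // (LINE_SIZE * WAYS)   # 4096 sets
SETS_PER_PAGE = PAGE_SIZE // LINE_SIZE             # 64 sets indexed within one page
NUM_COLORS    = SETS // SETS_PER_PAGE              # 64 colors

def page_color(phys_addr):
    """The color bits are the set-index bits above the page offset:
    they belong to both the cache set index and the physical page number."""
    set_index = (phys_addr // LINE_SIZE) % SETS
    return set_index // SETS_PER_PAGE  # equivalently (phys_addr >> 12) % NUM_COLORS

print(NUM_COLORS)              # the cache splits into 64 non-intersecting regions
print(page_color(0x12345000))  # pages with equal color map to the same region
```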

SLIDE 8

Cache Partitioning – Page Coloring

SLIDE 9

Cache Partitioning – Page Coloring

SLIDE 10

Dynamic OS-based Cache Partitioning

  • Adjusts cache quotas among processes dynamically
  • Page recoloring procedure
  • Increases a process's cache resources (i.e., the number of colors used by the process)
  • The kernel rearranges the virtual memory mapping of the process by
  • Allocating physical pages of the new color
  • Copying the memory contents
  • Freeing the old pages
  • Remapping virtual pages causes performance overhead
  • The overall overhead is reduced by lowering the frequency of cache allocation adjustments
  • Another option is a lazy method of page migration, in which the content of a recolored page is moved only when it is accessed
  • The average overhead of dynamic partitioning was reduced to 2%
  • The highest migration overhead observed was 7%
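
The recoloring steps (allocate new-color pages, copy contents, free old pages, remap) can be sketched as below. This models the kernel's page table and physical memory as plain Python dictionaries rather than real kernel structures; all names are illustrative.

```python
# Toy model of the page recoloring procedure described above.

def recolor(page_table, phys_mem, free_pages_by_color, new_color):
    """Remap every virtual page of a process onto pages of new_color."""
    for vpage, old_ppage in list(page_table.items()):
        new_ppage = free_pages_by_color[new_color].pop()  # allocate new-color page
        phys_mem[new_ppage] = phys_mem[old_ppage]         # copy the memory contents
        del phys_mem[old_ppage]                           # free the old page
        page_table[vpage] = new_ppage                     # remap the virtual page

page_table = {0: "p_c3_0", 1: "p_c3_1"}                   # virtual -> physical page
phys_mem = {"p_c3_0": b"data0", "p_c3_1": b"data1"}
free_pages_by_color = {7: ["p_c7_0", "p_c7_1"]}           # free list for color 7

recolor(page_table, phys_mem, free_pages_by_color, 7)
print(page_table)  # both virtual pages now map to color-7 physical pages
```

A lazy variant would defer the copy until the page is first accessed, trading a page fault per touched page for not copying cold pages at all.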

SLIDE 11

Page Recoloring

SLIDE 12

Dynamic Cache Partitioning Policies

  • The policies adjust cache partitioning periodically, at the end of each epoch
  • Dynamic cache partitioning policy for performance
  • Adjusts cache partitioning dynamically
  • Metrics
  • Throughput (IPC)
  • Combined miss rate
  • Combined misses
  • Fair speedup
  • Dynamic cache partitioning policy for fairness
  • Two dynamic policies were implemented, based on FM0 and FM4
  • FM0 is the evaluation metric (i.e., the ratio of the current cumulative IPC over the baseline IPC)
  • FM4 is based on cache miss rates
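
The epoch-driven adjustment can be sketched as a simple loop: at each epoch boundary, try shifting one color between the two programs and keep the move if the chosen metric improves. This is a hand-rolled hill-climbing sketch, not the paper's exact policy; `measure` stands in for reading hardware counters, and the toy metric is invented.

```python
# Epoch-driven cache quota adjustment, sketched as one-color hill climbing
# between two co-scheduled programs sharing `total_colors` page colors.

def run_epochs(measure, total_colors=64, epochs=10):
    colors_a = total_colors // 2                 # start from an even split
    best = measure(colors_a)
    for _ in range(epochs):                      # one adjustment per epoch
        for delta in (+1, -1):                   # try giving/taking one color
            trial = colors_a + delta
            if 1 <= trial <= total_colors - 1:   # each program keeps >= 1 color
                score = measure(trial)
                if score > best:                 # keep the move if metric improves
                    best, colors_a = score, trial
                    break
    return colors_a, best

# Toy metric that peaks when program A holds 40 of the 64 colors.
peak = lambda c: -(c - 40) ** 2
final_colors, final_score = run_epochs(peak)
print(final_colors)  # climbs from 32 toward the optimum at 40
```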

SLIDE 13

Dynamic Cache Partitioning Policies (cont.)

  • Dynamic cache partitioning policy for QoS consideration
  • A two-core workload of two programs
  • The first is the target program
  • The second is the partner program
  • QoS guarantee
  • Ensure the target program's performance is larger than or equal to X% of a baseline execution of a homogeneous workload on a dual-core processor, with half of the cache capacity allocated to each program
  • Increase the performance of the partner program
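
The QoS guarantee reduces to a simple check: the target program must retain at least X% of its half-cache baseline performance before the partner may take more cache. A minimal sketch, with made-up IPC numbers and an illustrative helper name:

```python
# QoS guarantee check: target IPC must stay >= X% of its baseline IPC
# (baseline = homogeneous run with half the cache per program).

def qos_ok(target_ipc, baseline_ipc, x_percent=95.0):
    """True if the target keeps at least X% of its half-cache baseline IPC."""
    return target_ipc >= baseline_ipc * x_percent / 100.0

baseline_ipc = 1.20                 # hypothetical half-cache baseline IPC
print(qos_ok(1.18, baseline_ipc))   # 1.18 >= 1.14 -> partition is acceptable
print(qos_ok(1.05, baseline_ipc))   # 1.05 <  1.14 -> violates the QoS guarantee
```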

SLIDE 14

Experimental Methodology

  • Hardware and software platform
  • Dell PowerEdge 1950
  • Two dual-core, 3.0 GHz Intel Xeon 5160 processors and 8 GB Fully Buffered DIMM (FB-DIMM) main memory
  • Shared, 4 MB, 16-way set-associative L2 cache
  • Each core has a private 32 KB instruction cache and a private 32 KB data cache
  • Red Hat Enterprise Linux 4.0
  • Kernel linux-2.6.20.3
  • Performance data collected using pfmon

SLIDE 15

Evaluation Results


  • Shows the improvement of the best static partitioning of each workload over a shared cache

SLIDE 16

The Performance – Static & Dynamic

SLIDE 17

Fairness – Correlation between Evaluation Metrics and Policy Metrics

SLIDE 18

QoS – Static & Dynamic

SLIDE 19

Related Work

  • Cache partitioning for multicore processors
  • Page Coloring

SLIDE 20

Summary

  • An OS-based cache partitioning mechanism for multicore processors was designed and implemented
  • It was used to study different cache partitioning policies
  • Some findings of simulation-based studies were confirmed; however, this approach also revealed new insights that had not been shown by simulation
  • Future work
  • Reduce the cache partitioning overhead
  • Add an easy user interface
  • Conduct partitioning research at the compiler level for both multiprogramming and multithreaded applications

SLIDE 21

Discussion

  • Did the OS-based approach provide new insights and observations that simulation could not show?

SLIDE 22

References

  • Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems
  • http://www.contrib.andrew.cmu.edu/~hyoseunk/pdf/ecrts13-hyos-slides.pdf
  • http://ftp.cs.rochester.edu/~xiao/eurosys09/euro061-zhang.pdf
