CSEE 3827: Fundamentals of Computer Systems Single Cycle MIPS - - PowerPoint PPT Presentation



SLIDE 1

CSEE 3827: Fundamentals of Computer Systems

Single Cycle MIPS Implementation

SLIDE 2

Outline

  • We will examine two MIPS implementations
      • A single-cycle version
      • A pipelined version
  • Simple subset of MIPS, showing most aspects
      • Memory reference: lw, sw
      • Arithmetic/logical: add, sub, and, or, slt
      • Control transfer: beq, j
  • Next unit: CPU performance factors
      • Instruction count (determined by ISA and compiler)
      • Cycles per instruction and cycle time (determined by CPU hardware)

SLIDE 3

Instruction Execution

  • PC → instruction memory, fetch instruction
  • Register numbers → register file, read registers
  • Depending on instruction class:
      • Use ALU to calculate:
          • Arithmetic or logical result
          • Memory address for load/store
          • Branch target address
      • Access data memory for load/store
      • PC ← target address or PC + 4
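The steps above can be sketched as one function that performs a whole instruction per call. This is only an illustration: instructions are assumed to be pre-decoded dictionaries (the keys `op`, `rs`, `rt`, `rd`, `imm` are hypothetical names, not from the slides), memories are plain dictionaries, and 32-bit wraparound is ignored.

```python
def step(pc, regs, imem, dmem):
    """Execute one pre-decoded instruction and return the next PC."""
    instr = imem[pc]                      # PC -> instruction memory, fetch
    a = regs[instr["rs"]]                 # register numbers -> register file
    b = regs[instr["rt"]]
    next_pc = pc + 4                      # default: next sequential instruction
    op = instr["op"]
    if op in ("add", "sub"):              # ALU computes an arithmetic result
        regs[instr["rd"]] = a + b if op == "add" else a - b
    elif op == "lw":                      # ALU computes the address, then read data memory
        regs[instr["rt"]] = dmem[a + instr["imm"]]
    elif op == "sw":                      # ALU computes the address, then write data memory
        dmem[a + instr["imm"]] = b
    elif op == "beq":                     # ALU subtracts; a zero result means "equal"
        if a - b == 0:
            next_pc = pc + 4 + (instr["imm"] << 2)
    else:
        raise ValueError(f"unsupported op: {op}")
    return next_pc
```

Calling `step` repeatedly, feeding each returned PC back in, mimics one clock cycle per instruction.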

SLIDE 4

CPU Overview

SLIDE 5

Can’t just join wires together; use muxes to select between sources

SLIDE 6

Controller generates the select signals for the muxes (and the other control signals)

SLIDE 7

Combinational Elements

  • AND gate (Y = A & B)
  • Multiplexer (Y = S ? A : B)
  • Adder (Y = A + B)
  • Arithmetic/Logic Unit (ALU) (Y = F(A, B))
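Each of these elements is a pure function of its inputs, which a few lines of Python can mimic. This is a rough sketch assuming 32-bit values; the ALU function codes are borrowed from the ALU control slide later in this deck, and the extra `zero` output is an assumption matching the way branches use the ALU.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def and_gate(a, b):
    return a & b                      # Y = A & B

def mux(s, a, b):
    return a if s else b              # Y = S ? A : B

def adder(a, b):
    return (a + b) & MASK32           # Y = A + B, wrapped to 32 bits

def alu(f, a, b):
    """Return (Y, zero) where Y = F(A, B) for a 4-bit function code f."""
    if f == 0b0000:
        y = a & b
    elif f == 0b0001:
        y = a | b
    elif f == 0b0010:
        y = a + b
    elif f == 0b0110:
        y = a - b
    elif f == 0b0111:                 # set on less than (signed compare)
        y = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported ALU function: {f:04b}")
    y &= MASK32
    return y, y == 0
```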

SLIDE 8

Clocking Methodology

Combinational logic transforms data during each clock cycle, between clock edges. The longest combinational delay determines the minimum clock period.
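As a quick worked example of "longest delay sets the clock period", the sketch below sums component delays along the path each instruction class uses and takes the maximum. The delay values and the path lists are illustrative assumptions, not figures from the slides.

```python
# Assumed component delays in picoseconds (illustrative only).
delay_ps = {
    "imem": 200,       # instruction memory read
    "reg_read": 100,   # register file read
    "alu": 200,        # ALU operation
    "dmem": 200,       # data memory access
    "reg_write": 100,  # register file write
}

# Components each instruction class passes through in a single cycle.
paths = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw": ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw": ["imem", "reg_read", "alu", "dmem"],
    "beq": ["imem", "reg_read", "alu"],
}

path_delay_ps = {name: sum(delay_ps[c] for c in comps) for name, comps in paths.items()}

# In a single-cycle design every instruction gets the same period,
# so the slowest path determines it.
clock_period_ps = max(path_delay_ps.values())
```

With these assumed numbers the load path is the longest, so every instruction pays for the load's delay.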

SLIDE 9

Building a datapath incrementally

  • Datapath: elements that process data and addresses in the CPU
  • Datapath will execute one instruction in one clock cycle
  • Each datapath element can only do one function at a time
  • Hence, we need separate instruction and data memories
  • Use multiplexers where alternate data sources are used for different instructions

SLIDE 10

Instruction Fetch

  • Fetch the instruction at the address held in the PC register from instruction memory
  • Compute PC + 4 for the next instruction
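A minimal sketch of the fetch step, assuming instruction memory is modeled as a dictionary from byte address to 32-bit word (the name `imem` is hypothetical):

```python
def fetch(pc, imem):
    """Return (instruction word at PC, PC + 4 wrapped to 32 bits)."""
    instruction = imem[pc]            # the PC supplies the read address
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF # a dedicated adder computes this in parallel
    return instruction, pc_plus_4
```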
SLIDE 11

Part 1: Instruction Fetch

SLIDE 12

R-Format Instructions

  • Read two register operands
  • Perform arithmetic/logical operation
  • Write register result
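The three steps can be sketched as follows, assuming the register file is a 32-entry list and using the funct codes listed on the ALU control slides. The function name and the treatment of register 0 as read-only are choices made for this sketch.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    return x - (1 << 32) if x & 0x80000000 else x

# funct field -> operation, for the subset covered in this deck
FUNCT_OPS = {
    0b100000: lambda a, b: a + b,                                   # add
    0b100010: lambda a, b: a - b,                                   # sub
    0b100100: lambda a, b: a & b,                                   # and
    0b100101: lambda a, b: a | b,                                   # or
    0b101010: lambda a, b: 1 if to_signed(a) < to_signed(b) else 0, # slt
}

def execute_r_type(instr, regs):
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    funct = instr & 0x3F
    a, b = regs[rs], regs[rt]                 # read two register operands
    result = FUNCT_OPS[funct](a, b) & MASK32  # perform arithmetic/logical operation
    if rd != 0:                               # register 0 stays zero in MIPS
        regs[rd] = result                     # write register result
```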

SLIDE 13

Load/Store Instructions

  • Read register operands
  • Calculate address using 16-bit offset (sign-extend offset and use ALU)
  • Load: read memory and update register
  • Store: write register value to memory
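The address calculation can be illustrated with a short sketch. It assumes data memory is a dictionary from byte address to word and that unwritten locations read as 0; both are modeling choices for this example, not something the slides specify.

```python
MASK32 = 0xFFFFFFFF
LW, SW = 35, 43   # opcodes for load word and store word

def sign_extend16(imm):
    """Sign-extend a 16-bit field to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def execute_load_store(instr, regs, dmem):
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    # The ALU adds the base register and the sign-extended offset.
    address = (regs[rs] + sign_extend16(instr)) & MASK32
    if opcode == LW:
        regs[rt] = dmem.get(address, 0)   # load: read memory, update register
    elif opcode == SW:
        dmem[address] = regs[rt]          # store: write register value to memory
    else:
        raise ValueError(f"not a load/store opcode: {opcode}")
```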
SLIDE 14

Part 2: R-Type/Load/Store Datapath

SLIDE 15

Branch Instructions

  • Read register operands
  • Compare operands (use ALU: subtract and check zero output)
  • Calculate target address
      • Sign-extend displacement
      • Shift left two places (word displacement)
      • Add to PC+4 (already calculated by instruction fetch)
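A small sketch of the beq steps, assuming a 32-entry register list; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def branch_target(pc, instr):
    """Sign-extend the displacement, shift left 2, add to PC + 4."""
    displacement = sign_extend16(instr)
    return (pc + 4 + (displacement << 2)) & MASK32

def next_pc_beq(pc, instr, regs):
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    # The ALU subtracts; its zero output signals equality.
    zero = ((regs[rs] - regs[rt]) & MASK32) == 0
    return branch_target(pc, instr) if zero else (pc + 4) & MASK32
```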

SLIDE 16

Part 3: Instruction Fetch w. Branch

SLIDE 17

Full Datapath

SLIDE 18

Datapath Control Scheme

  • Main control controls the whole datapath based on the opcode
  • ALU control controls the ALU based on ALUOp (derived from the opcode by main control) and the function field (funct)

SLIDE 19

ALU Control Inputs/Outputs

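ALU control inputs and output:

  • ALUOp (2 bits), from Main Control
  • Instruction[5:0] (funct field)
  • Operation (4 bits), to the ALU

ALUOp encoding:

  Instruction   ALUOp
  R-type        10
  lw            00
  sw            00
  beq           01

Operation encoding:

  Operation   ALU function
  0000        AND
  0001        OR
  0010        add
  0110        subtract
  0111        set on less than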

(See Appendix C of text for implementation of corresponding ALU.)

SLIDE 20

ALU Control Implementation

  opcode   ALUOp (from main control)   Instruction[5:0]   Instruction operation   ALU function        Operation
  lw       00                          xxxxxx             load word               add                 0010
  sw       00                          xxxxxx             store word              add                 0010
  beq      01                          xxxxxx             branch equal            subtract            0110
  R-type   10                          100000             add                     add                 0010
  R-type   10                          100010             subtract                subtract            0110
  R-type   10                          100100             AND                     AND                 0000
  R-type   10                          100101             OR                      OR                  0001
  R-type   10                          101010             set on less than        set on less than    0111

SLIDE 21

ALU Control Truth Table

  ALUOp (from main control)   Instruction[5:0]   Operation
  00                          xxxxxx             0010
  00                          xxxxxx             0010
  01                          xxxxxx             0110
  10                          100000             0010
  10                          100010             0110
  10                          100100             0000
  10                          100101             0001
  10                          101010             0111
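The truth table maps directly onto a lookup. The sketch below is one possible rendering; the function name and the choice to raise an error on unlisted inputs are assumptions.

```python
# funct field -> 4-bit ALU operation, used only when ALUOp is 10 (R-type)
FUNCT_TO_OPERATION = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}

def alu_control(alu_op, funct):
    """Return the 4-bit ALU operation for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:        # lw / sw: add to form the address; funct ignored
        return 0b0010
    if alu_op == 0b01:        # beq: subtract to compare; funct ignored
        return 0b0110
    if alu_op == 0b10:        # R-type: decode the funct field
        if funct not in FUNCT_TO_OPERATION:
            raise ValueError(f"unsupported funct: {funct:06b}")
        return FUNCT_TO_OPERATION[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")
```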

SLIDE 22

ALU Control Truth Table 2

SLIDE 23

Datapath Control Scheme

SLIDE 24

Main control signals derive from instruction types

R-type:
  0          rs       rt       rd       shamt    funct
  31:26      25:21    20:16    15:11    10:6     5:0

Load/Store:
  35 or 43   rs       rt       constant
  31:26      25:21    20:16    15:0

Branch:
  4          rs       rt       constant
  31:26      25:21    20:16    15:0

  • rs: always read
  • rt: read, except for load
  • rd (R-type) or rt (load): write for R-type and load
  • constant: sign-extend and add
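Because the fields sit at fixed bit positions, they can all be extracted before the instruction type is known. A minimal sketch, with a hypothetical `decode` helper returning a dictionary:

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31:26
        "rs": (instr >> 21) & 0x1F,      # bits 25:21
        "rt": (instr >> 16) & 0x1F,      # bits 20:16
        "rd": (instr >> 11) & 0x1F,      # bits 15:11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,    # bits 10:6  (R-type only)
        "funct": instr & 0x3F,           # bits 5:0   (R-type only)
        "constant": instr & 0xFFFF,      # bits 15:0  (load/store/branch)
    }
```

The hardware does the same thing with wiring: every field is routed to its consumer each cycle, and the control signals decide which ones are actually used.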

SLIDE 25

R-Type Control Signals

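[Figure: datapath annotated with control signal values for an R-type instruction; ALUOp = 10, two other signals shown as 1]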

(Alt. illustration: Fig. 4.19)

SLIDE 26

lw Control Signals

[Figure: datapath annotated with control signal values for lw; ALUOp = 00, four other signals shown as 1]

(Alt. illustration: Fig. 4.20)

SLIDE 27

sw Control Signals

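[Figure: datapath annotated with control signal values for sw; ALUOp = 00, two signals shown as 1, two as don't-care (x)]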

SLIDE 28

beq Control Signals

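[Figure: datapath annotated with control signal values for beq; ALUOp = 01, one signal shown as 1, two as don't-care (x)]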

(Alt. illustration: Fig. 4.21)

SLIDE 29

Main Control Truth Table

  Instruction[31:26]   Instruction
  000000               R-type
  100011               lw
  101011               sw
  000100               beq
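The control-signal columns of this table did not survive in the slide text. The sketch below fills them in from the standard textbook single-cycle design (Patterson and Hennessy), which the slides cite for their figures. Treat the signal names and values as an assumption about what the slide showed; they are consistent with the ALUOp values given earlier. `None` stands for a don't-care.

```python
from typing import NamedTuple, Optional

class Control(NamedTuple):
    reg_dst: Optional[int]     # 1: write register comes from rd, 0: from rt
    alu_src: int               # 1: second ALU input is the sign-extended constant
    mem_to_reg: Optional[int]  # 1: register write data comes from data memory
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int                # 2-bit ALUOp sent to the ALU control

# Keyed by Instruction[31:26]; values per the textbook design (assumed).
MAIN_CONTROL = {
    0b000000: Control(1, 0, 0, 1, 0, 0, 0, 0b10),        # R-type
    0b100011: Control(0, 1, 1, 1, 1, 0, 0, 0b00),        # lw
    0b101011: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    0b000100: Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}

def main_control(opcode):
    if opcode not in MAIN_CONTROL:
        raise ValueError(f"unsupported opcode: {opcode:06b}")
    return MAIN_CONTROL[opcode]
```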

SLIDE 30

The j instruction

  • Unconditional jump to instruction at label
  • Instruction encoded in J-type format
  • Jump uses word addresses
  • Update PC with concatenation of:
      • Top 4 bits of old PC
      • 26-bit jump address
      • 00

j label

  2        address
  31:26    25:0
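The concatenation can be written out as bit operations. One caveat on this sketch: it takes the top 4 bits from PC + 4, as the MIPS definition does, where the slide says "old PC"; the two differ only when the jump is the last word of a 256 MB region. The function name is hypothetical.

```python
def jump_target(pc, instr):
    """Build the new PC for a J-type instruction at address pc."""
    address = instr & 0x03FFFFFF                  # 26-bit word address, bits 25:0
    upper = ((pc + 4) & 0xFFFFFFFF) & 0xF0000000  # top 4 bits of PC + 4
    return upper | (address << 2)                 # append 00 by shifting left 2
```

For example, a `j` whose address field is 0x0100008 and which sits at 0x00400000 transfers control to 0x00400020.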

SLIDE 31

Implementing the jump instruction
