ta9 Spring 2006 Amar Lior Adapted from Computer - - PDF document



SLIDE 1

Computer Structure

ta9

Spring 2006

Amar Lior

Adapted from Computer Organization and Design: The Hardware/Software Interface, Patterson and Hennessy (UCB), 3rd edition

Control

  • Selecting the operations to perform (ALU, read/write, etc.)
  • Controlling the flow of data (multiplexor inputs)
  • Information comes from the 32 bits of the instruction
  • Example:

    add $8, $17, $18

Instruction format:

    000000   10001   10010   01000   00000   100000
    op       rs      rt      rd      shamt   funct
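The field split above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name `decode_r_type` is made up for illustration.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (word >> 26) & 0x3F,  # bits 31-26
        "rs":    (word >> 21) & 0x1F,  # bits 25-21
        "rt":    (word >> 16) & 0x1F,  # bits 20-16
        "rd":    (word >> 11) & 0x1F,  # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
    }


# The bit pattern from the slide for: add $8, $17, $18
word = int("000000" "10001" "10010" "01000" "00000" "100000", 2)
fields = decode_r_type(word)
print(fields)
```

Here rs and rt come out as 17 and 18 (the two source registers) and rd as 8 (the destination), matching the assembly operands.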

  • ALU's operation based on instruction type and function code
  • e.g., what should the ALU do with this instruction
  • Example: lw $1, 100($2)

    35   2    1    100
    op   rs   rt   16-bit offset
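As a small illustration of the I-type layout, the sketch below packs the four fields of the lw example into one 32-bit word. It is a Python sketch under the field widths shown above (6, 5, 5 and 16 bits); the helper name `encode_i_type` is hypothetical.

```python
def encode_i_type(op, rs, rt, imm):
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit immediate into one word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $1, 100($2): op = 35, base register rs = 2, destination rt = 1, offset = 100
word = encode_i_type(35, 2, 1, 100)
print(f"{word:032b}")
```

The mask on `imm` keeps a negative offset inside the low 16 bits, so it cannot spill into the register fields.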

  • ALU control input

    0000  AND
    0001  OR
    0010  add
    0110  subtract
    0111  set-on-less-than
    1100  NOR

  • Why is the code for subtract 0110 and not 0011?
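One way to approach this question, assuming the ALU design of the textbook the slides are adapted from: the four control bits are not an arbitrary opcode but Ainvert, Bnegate and a 2-bit Operation select. Subtract is then "add" (Operation 10) with B inverted and a carry-in of 1, which gives 0110. The Python sketch below models that reading; the function name `alu` is made up for illustration, and set-on-less-than is simplified to the sign bit of a - b, ignoring overflow.

```python
MASK = 0xFFFFFFFF  # 32-bit datapath


def alu(control, a, b):
    """control = Ainvert, Bnegate, Operation[1:0] (4 bits, assumed layout)."""
    a_invert = (control >> 3) & 1
    b_negate = (control >> 2) & 1
    operation = control & 0b11
    if a_invert:
        a = ~a & MASK
    if b_negate:
        b = ~b & MASK
    if operation == 0b00:                  # AND (NOR when both inputs are inverted)
        return a & b
    if operation == 0b01:                  # OR
        return a | b
    total = (a + b + b_negate) & MASK      # Bnegate also drives the carry-in
    if operation == 0b10:                  # add, or subtract when Bnegate = 1
        return total
    return (total >> 31) & 1               # set-on-less-than: sign of a - b


print(alu(0b0010, 5, 3), alu(0b0110, 5, 3))
```

With this layout, 1100 is NOR because inverting both inputs and ANDing them computes NOT (a OR b).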

Control

SLIDE 2

  • Must describe hardware to compute the 4-bit ALU control input
  • given the instruction type (ALUOp, computed from the instruction type):
    00 = lw, sw; 01 = beq; 10 = arithmetic
  • and the function code for arithmetic instructions
  • Describe it using a truth table (can turn into gates)

Control


ALU Control bits

Instruction opcode   ALUOp   Instruction operation   Funct field   Desired ALU action   ALU control input
LW                   00      Load word               XXXXXX        Add                  0010
SW                   00      Store word              XXXXXX        Add                  0010
Branch eq            01      Branch equal            XXXXXX        Subtract             0110
R-type               10      Add                     100000        Add                  0010
R-type               10      Subtract                100010        Subtract             0110
R-type               10      AND                     100100        And                  0000
R-type               10      OR                      100101        Or                   0001
R-type               10      Set less than           101010        Set on less than     0111
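The ALU control table can be written directly as a lookup. This is a minimal Python sketch of the table above, not the gate-level implementation; the names `FUNCT_TO_ALU` and `alu_control` are hypothetical.

```python
# Funct field -> ALU control input, used only when ALUOp = 10 (R-type)
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op, funct=0):
    """Return the 4-bit ALU control input for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:   # lw, sw: add to form the address (funct is a don't care)
        return 0b0010
    if alu_op == 0b01:   # beq: subtract to compare (funct is a don't care)
        return 0b0110
    if alu_op == 0b10:   # R-type: decided by the funct field
        return FUNCT_TO_ALU[funct]
    raise ValueError("ALUOp value not listed in the table")


print(f"{alu_control(0b10, 0b100010):04b}")
```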


Single Cycle Implementation

Effect of each control signal (signal name, effect when deasserted, effect when asserted):

RegDst
    Deasserted: The register destination number for the Write register comes from the rt field.
    Asserted:   The register destination number for the Write register comes from the rd field.
RegWrite
    Deasserted: None.
    Asserted:   The register on the Write register input is written with the value on the Write data input.
ALUSrc
    Deasserted: The second ALU operand comes from the second register file output.
    Asserted:   The second ALU operand is the sign-extended lower 16 bits of the instruction.
PCSrc
    Deasserted: The PC is replaced by the output of the adder that computes the value of PC+4.
    Asserted:   The PC is replaced by the output of the adder that computes the branch target.
MemRead
    Deasserted: None.
    Asserted:   Data memory contents designated by the address input are put on the Read data output.
MemWrite
    Deasserted: None.
    Asserted:   Data memory contents designated by the address input are replaced by the value on the Write data input.
MemtoReg
    Deasserted: The value fed to the register Write data input comes from the ALU.
    Asserted:   The value fed to the register Write data input comes from the data memory.

SLIDE 3

Instruction   RegDst   ALUSrc   MemtoReg   RegWrite   MemRead   MemWrite   Branch   ALUOp1   ALUOp0
R-format      1        0        0          1          0         0          0        1        0
lw            0        1        1          1          1         0          0        0        0
sw            X        1        X          0          0         1          0        0        0
beq           X        0        X          0          0         0          1        0        1
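The main control unit is a pure function of the opcode, so the table can be sketched as a lookup. In the Python sketch below, cells that carry neither 1 nor X are taken to be 0, as in the usual textbook version of this table; that reading is an assumption. Opcode 35 for lw comes from the earlier example, while 0 (R-format), 43 (sw) and 4 (beq) are the standard MIPS opcodes. The names `CONTROL_TABLE` and `main_control` are made up for illustration.

```python
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

X = None  # don't care

# opcode -> signal values, in the column order of SIGNALS
CONTROL_TABLE = {
    0:  (1, 0, 0, 1, 0, 0, 0, 1, 0),  # R-format
    35: (0, 1, 1, 1, 1, 0, 0, 0, 0),  # lw
    43: (X, 1, X, 0, 0, 1, 0, 0, 0),  # sw
    4:  (X, 0, X, 0, 0, 0, 1, 0, 1),  # beq
}


def main_control(opcode):
    """Return a dict of control signal values for a 6-bit opcode."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


print(main_control(35))
```

The don't cares on sw and beq follow from RegWrite being 0: when no register is written, the choice of destination register and of write data source has no effect.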

[Figure: single-cycle datapath with control. Labeled blocks: PC, Instruction memory, Registers, Sign extend (16 to 32), Shift left 2, ALU, two Add units, Data memory, multiplexors, ALU control and Control. Instruction fields: Instruction [31-26], [25-21], [20-16], [15-11], [15-0], [5-0]. Control outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.]

R-type Instruction

[Figure: multicycle datapath with control. Labeled blocks: PC, Memory, Instruction register, Memory data register, Registers, A, B, Sign extend (16 to 32), two Shift left 2 units, ALU, ALUOut, multiplexors, ALU control and Control. Instruction fields: Instruction [31-26], [25-21], [20-16], [15-11], [15-0], [5-0], [25-0]; Jump address [31-0] is formed with PC [31-28]. Control outputs: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst.]

Multi Cycle

SLIDE 4

Action of the 1-bit control signals

Effect of each signal (signal name, effect when deasserted, effect when asserted):

RegDst
    Deasserted: The register file destination number for the Write register comes from the rt field.
    Asserted:   The register file destination number for the Write register comes from the rd field.
RegWrite
    Deasserted: None.
    Asserted:   Enables writing to the register file.
ALUSrcA
    Deasserted: The first ALU operand is the PC.
    Asserted:   The first ALU operand comes from register A.
MemRead
    Deasserted: None.
    Asserted:   The content of memory at the location specified by the Address input is put on the Memory data output.
MemWrite
    Deasserted: None.
    Asserted:   The memory content at the location specified by the Address input is replaced by the value on the Write data input.
MemtoReg
    Deasserted: The value fed to the register file Write data input comes from ALUOut.
    Asserted:   The value fed to the register file Write data input comes from the MDR.


Action of the 1-bit control signals

Effect of each signal (signal name, effect when deasserted, effect when asserted):

IorD
    Deasserted: The PC is used to supply the address to the memory unit.
    Asserted:   ALUOut is used to supply the address to the memory unit.
IRWrite
    Deasserted: None.
    Asserted:   The output of the memory is written to the IR.
PCWrite
    Deasserted: None.
    Asserted:   The PC is written; the source is controlled by PCSource.
PCWriteCond
    Deasserted: None.
    Asserted:   The PC is written if the Zero output from the ALU is also active.
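PCWrite and PCWriteCond together decide whether the PC is written in a given cycle, and IorD selects the memory address. The Python sketch below spells out that logic as described in the two tables above; the function names `pc_write_enable` and `memory_address` are hypothetical.

```python
def pc_write_enable(pc_write, pc_write_cond, zero):
    """The PC is written unconditionally (PCWrite) or on a taken branch
    (PCWriteCond together with the ALU Zero output)."""
    return bool(pc_write or (pc_write_cond and zero))


def memory_address(i_or_d, pc, alu_out):
    """IorD = 0 selects the PC (instruction fetch), IorD = 1 selects ALUOut (data access)."""
    return alu_out if i_or_d else pc


print(pc_write_enable(0, 1, 1), memory_address(1, 0x400000, 0x10010000))
```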


Actions of the 2-bit control signals

Signal name   Value (binary)   Effect
ALUOp         00               The ALU performs an add.
              01               The ALU performs a subtract.
              10               The funct field is used to determine the ALU operation.
ALUSrcB       00               The second input to the ALU comes from B.
              01               The second input is the constant 4.
              10               The second input is the sign-extended lower 16 bits of the IR.
              11               Same as above, but shifted left 2 bits (for branch).
PCSource      00               The output of the ALU (PC+4) is sent to the PC for writing.
              01               The contents of ALUOut are sent to the PC for writing.
              10               The jump target address (IR[25:0] shifted left 2 bits and concatenated with PC+4[31:28]) is sent to the PC for writing.
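The two multiplexors controlled by ALUSrcB and PCSource can be sketched as simple selections. This Python sketch follows the value table above; the function names are made up for illustration, and `pc_plus_4` stands for the already incremented PC whose top four bits are used for the jump address.

```python
MASK = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_operand_b(alu_src_b, b_reg, ir):
    """Select the second ALU input according to the 2-bit ALUSrcB."""
    imm = sign_extend16(ir)
    choices = {
        0b00: b_reg,     # register B
        0b01: 4,         # constant 4 (for PC + 4)
        0b10: imm,       # sign-extended immediate
        0b11: imm << 2,  # sign-extended immediate, shifted left 2 (branch)
    }
    return choices[alu_src_b] & MASK


def next_pc(pc_source, alu_result, alu_out, ir, pc_plus_4):
    """Select the value sent to the PC according to the 2-bit PCSource."""
    jump_target = (pc_plus_4 & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    return {0b00: alu_result, 0b01: alu_out, 0b10: jump_target}[pc_source]


print(hex(alu_operand_b(0b11, 0, 0xFFFF)))
```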

SLIDE 5

How many cycles will it take to execute this code?

    lw  $t2, 0($t3)
    lw  $t3, 4($t3)
    beq $t2, $t3, Label
    nop
    add $t5, $t2, $t3
    sw  $t5, 8($t3)
Label: ...

What is going on during the 8th cycle of execution?

In what cycle does the actual addition of $t2 and $t3 take place?

Simple Questions


Delayed Branch

  • In a 5-stage pipeline we can make the control hazard a feature by redefining the branch
  • A delayed branch always executes the instruction that follows it
  • Only the second instruction after the branch is affected by the branch
  • Compilers and assemblers try to place an instruction that should always execute right after the branch
  • This position is called the branch delay slot
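The difference between an ordinary branch and a delayed branch can be seen in a toy trace. The Python sketch below is a deliberately simplified model, not a pipeline simulator: every instruction is a `(name, target)` pair, a non-None `target` means a taken branch to that index, and a branch sitting in a delay slot is ignored. The function name `trace` and the sample program are made up for illustration.

```python
def trace(program, delay_slot):
    """Return the names of the instructions executed, in order."""
    executed = []
    pc = 0
    pending = None  # branch target waiting for the delay slot to finish
    while pc < len(program):
        name, target = program[pc]
        executed.append(name)
        next_pc = pc + 1
        if pending is not None:      # this instruction was the delay slot
            next_pc, pending = pending, None
        elif target is not None:     # a taken branch
            if delay_slot:
                pending = target     # takes effect after the next instruction
            else:
                next_pc = target     # takes effect immediately
        pc = next_pc
    return executed


program = [("add", None), ("beq", 3), ("sub", None), ("or", None)]
print(trace(program, delay_slot=False))
print(trace(program, delay_slot=True))
```

With `delay_slot=True` the `sub` after the branch runs even though the branch is taken, which is exactly why the compiler has to pick that instruction carefully.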


Delayed Branch

(a)
    add $s1, $s2, $s3
    if $s2 = 0 then
        ## Delay slot ##

  becomes

    if $s2 = 0 then
        add $s1, $s2, $s3

(b)
    sub $t4, $t5, $t6
    …
    add $s1, $s2, $s3
    if $s1 = 0 then
        ## Delay slot ##

  becomes

    …
    add $s1, $s2, $s3
    if $s1 = 0 then
        sub $t4, $t5, $t6

(c)
    add $s1, $s2, $s3
    if $s1 = 0 then
        ## Delay slot ##
    …
    sub $t4, $t5, $t6

  becomes

    add $s1, $s2, $s3
    if $s1 = 0 then
        sub $t4, $t5, $t6

SLIDE 6

Explanation to the examples

  • In (a) the delay slot is scheduled with an independent instruction from before the branch
  • In (b) the branch delay slot is scheduled from the target of the branch. This strategy is preferred when the branch is taken with high probability
  • In (c) the delay slot is scheduled from the not-taken fall-through
  • To make (b) and (c) legal, it must be OK to execute the sub instruction when the branch goes in the unexpected direction


Pros and Cons

  • Simple to implement
  • The compiler does the work, so if the compiler is improved there is no need to upgrade the hardware
  • Requires more delay slots when the depth of the pipe is increased
  • There is a problem with binary compatibility when taking code from a processor with X delay slots to a processor with Y delay slots (where X != Y)


Loop unrolling

Loop: lw   $t0, 0($s1)
      addu $t0, $t0, $s2
      sw   $t0, 0($s1)
      addi $s1, $s1, -4
      bne  $s1, $zero, Loop

SLIDE 7

Loop unrolling

Loop: lw   $t0, 0($s1)          # first element
      lw   $t1, -4($s1)         # second element, one word below the first
      addu $t0, $t0, $s2
      addu $t1, $t1, $s2
      sw   $t0, 0($s1)
      sw   $t1, -4($s1)
      addi $s1, $s1, -8         # step past both elements
      bne  $s1, $zero, Loop
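The same transformation can be sketched in a high-level language to check that the unrolled loop does the same work. The Python sketch below is an illustration only, not a translation of the assembly: it walks a list from the last element downward, and the unrolled version assumes an even number of elements. The function names are hypothetical.

```python
def add_scalar(values, scalar):
    """Original loop: one element per iteration, walking downward."""
    out = list(values)
    i = len(out) - 1
    while i >= 0:
        out[i] += scalar
        i -= 1
    return out


def add_scalar_unrolled(values, scalar):
    """Unrolled by two: two elements per iteration, half as many loop tests."""
    out = list(values)
    if len(out) % 2 != 0:
        raise ValueError("this sketch assumes an even number of elements")
    i = len(out) - 1
    while i >= 0:
        out[i] += scalar        # first copy of the loop body
        out[i - 1] += scalar    # second copy, next element down
        i -= 2                  # one index update and one test for both copies
    return out


data = [1, 2, 3, 4, 5, 6]
print(add_scalar(data, 10) == add_scalar_unrolled(data, 10))
```

The two copies of the body use different temporaries in the assembly version ($t0 and $t1), which removes the dependence between them and leaves more freedom for scheduling.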