Interconnection Structures Patrick Happ Raul Queiroz Feitosa



slide-1
SLIDE 1

Interconnection Structures

Patrick Happ Raul Queiroz Feitosa

slide-2
SLIDE 2 Interconnection Structures 2

Objective

To present key issues that affect interconnection design.

slide-3
SLIDE 3 Interconnection Structures 3

Outline

• Introduction
• Computer Busses
• Bus Types
• PCI
• PCI Express

slide-4
SLIDE 4

Introduction

• All the units must be connected.
• Different types of connection for different types of unit:
  • Memory
  • Input/Output
  • CPU

Interconnection Structures 4
slide-5
SLIDE 5

Unit Types

Interconnection Structures 5
slide-6
SLIDE 6 Interconnection Structures 6

Computer Busses

A bus is a common electrical pathway between multiple devices

slide-7
SLIDE 7 Interconnection Structures 7

Functional groups of bus lines

slide-8
SLIDE 8 Interconnection Structures 8

Data Bus

Carries data

Remember that there is no difference between “data” and “instruction” at this level.

Data bus width determines the amount of data moved in a single access.

• 8, 16, 32, or 64 bits

slide-9
SLIDE 9 Interconnection Structures 9

Address bus

Identifies the source or destination of data.

e.g. the CPU needs to read an instruction (data) from a given location in memory.

Address bus width determines the maximum memory capacity of the system.

e.g. the 8080 has a 16-bit address bus, giving a 64K address space.
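The relation between address bus width and address space can be checked with a short calculation. This is a minimal sketch; the function name is hypothetical, and it assumes every address selects one memory location.

```python
def address_space(address_bits: int) -> int:
    """Number of distinct locations an address bus of this width can select."""
    if address_bits < 0:
        raise ValueError("address bus width cannot be negative")
    return 2 ** address_bits


if __name__ == "__main__":
    # 16 address lines, as on the 8080, give 2**16 = 65536 = 64K locations.
    for bits in (16, 20, 32):
        print(f"{bits}-bit address bus: {address_space(bits)} locations")
```

Each extra address line doubles the number of locations that can be addressed.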

slide-10
SLIDE 10 Interconnection Structures 10

Control bus

Carries control and timing information. Typical control lines:

• Memory and I/O read/write signals
• Interrupt request/acknowledge
• Bus grant/request
• Clock signals
• Reset

slide-11
SLIDE 11 Interconnection Structures 11

Physical Realization of Bus Architecture

slide-12
SLIDE 12 Interconnection Structures 12

Single Bus Problems

• The more devices attached to the bus,
  • the greater the bus length,
  • the larger the propagation delays, and
  • the lower the maximum transfer rate.
• Most systems use multiple buses to overcome these problems.

slide-13
SLIDE 13 Interconnection Structures 13

Traditional bus architecture

Example: ISA bus

slide-14
SLIDE 14 Interconnection Structures 14

High Performance Bus

Example: PCI Bus

slide-15
SLIDE 15 Interconnection Structures 15

Bus Types

Dedicated

• Separate data and address lines

Multiplexed

• Shared lines
• Address valid or data valid control line
• Advantage: fewer lines
• Disadvantages:
  • More complex control
  • Potentially reduced performance
slide-16
SLIDE 16 Interconnection Structures 16

Bus Arbitration

There may be more than one potential bus master, e.g.

• CPU and DMA controller
• Multiple CPUs in a parallel shared bus system

An arbitration mechanism is required to guarantee that no more than one master controls the bus at a time.

slide-17
SLIDE 17 Interconnection Structures 17

Bus Arbitration

Why?

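Digital output model (figure: two output drivers, each with inputs IN and OE, supply Vcc, transistors Q1 to Q4, and output OUT, attached to the same bus line)

• OE=low, IN=low → Q3 off, Q4 on, OUT=low
• OE=low, IN=high → Q3 on, Q4 off, OUT=high
• OE=high → Q3 off, Q4 off, OUT=highZ

If one driver puts OUT=low (Q3 off, Q4 on) on the bus line while another puts OUT=high (Q3 on, Q4 off), the result is a short circuit. A driver with OUT=highZ (Q3 off, Q4 off) does not disturb the line.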
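The output model can be sketched in a few lines to show why two masters must never drive the bus at once. This is an illustrative model only; the function names are hypothetical, and `None` stands in for the highZ state.

```python
from typing import Optional, Sequence

HIGH_Z = None  # neither output transistor conducts


def driver_output(oe: int, inp: int) -> Optional[int]:
    """One output driver; OE is active low, as in the output model above."""
    if oe == 1:
        return HIGH_Z           # Q3 off, Q4 off
    return 1 if inp == 1 else 0  # Q3 on / Q4 off, or Q3 off / Q4 on


def resolve_bus_line(outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Combine all driver outputs attached to a single bus line."""
    driven = {value for value in outputs if value is not HIGH_Z}
    if len(driven) > 1:
        # One driver pulls the line high while another pulls it low.
        raise RuntimeError("bus contention: short circuit on the bus line")
    return driven.pop() if driven else HIGH_Z


if __name__ == "__main__":
    # Only the first driver is enabled, so the line simply follows it.
    print(resolve_bus_line([driver_output(0, 1), driver_output(1, 0)]))
```

Arbitration exists to make sure at most one driver per line has OE=low at any time, so the error branch is never reached in a working system.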

slide-18
SLIDE 18 Interconnection Structures 18

Methods of Arbitration

Centralised

a single arbiter grants bus access.

Arbiter (figure)

• Inputs: bus requests R0, R1, R2, ..., Rn-1
• Outputs: bus grants G0, G1, G2, ..., Gn-1
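A centralised arbiter maps the request lines onto at most one active grant line. The sketch below assumes a fixed-priority policy with R0 as the highest priority; the slides do not state which policy is used, so this choice is only an illustration.

```python
from typing import List


def arbitrate(requests: List[bool]) -> List[bool]:
    """Return grant lines G0..Gn-1 for request lines R0..Rn-1.

    Assumption: fixed priority, lowest index wins.
    """
    grants = [False] * len(requests)
    for index, requested in enumerate(requests):
        if requested:
            grants[index] = True
            break  # never grant the bus to more than one master
    return grants


if __name__ == "__main__":
    print(arbitrate([False, True, True]))
```

Other policies, such as round robin, would change only how the winning index is chosen; the single-grant guarantee stays the same.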

slide-19
SLIDE 19 Interconnection Structures 19

Methods of Arbitration

Distributed

Each module may claim the bus. Control logic on all modules.

slide-20
SLIDE 20 Interconnection Structures 20

Methods of Arbitration

Distributed

• Bus request and Busy lines are open collector.
• Requesting devices set Bus request=0.
• Requesting devices make Out=0; non-requesting devices make Out=In.
• The requesting device with In=1 gets the bus.
• It waits until Busy=1, sets Busy=0, and takes the bus.
• Upon relinquishing the bus, the device sets Busy=1.
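The In/Out chaining rules above can be simulated directly. This is a minimal sketch under stated assumptions: the devices are modelled as a list in chain order, the In of the first device is tied to 1, and the class and function names are hypothetical.

```python
from typing import List, Optional


def daisy_chain_winner(requesting: List[bool]) -> Optional[int]:
    """Index of the device that sees In=1 while requesting, or None."""
    chain_in = 1  # assumption: the head of the chain is tied to 1
    winner = None
    for index, wants_bus in enumerate(requesting):
        if wants_bus and chain_in == 1:
            winner = index
        # Requesting devices make Out=0; the others pass In through.
        chain_in = 0 if wants_bus else chain_in
    return winner


class SharedBus:
    """Tracks the Busy line: 1 means free, 0 means in use."""

    def __init__(self) -> None:
        self.busy = 1

    def try_acquire(self, requesting: List[bool]) -> Optional[int]:
        if self.busy != 1:
            return None  # the winner must wait until Busy=1
        winner = daisy_chain_winner(requesting)
        if winner is not None:
            self.busy = 0  # the winner sets Busy=0 and takes the bus
        return winner

    def release(self) -> None:
        self.busy = 1  # the device sets Busy=1 when it relinquishes the bus


if __name__ == "__main__":
    bus = SharedBus()
    print(bus.try_acquire([False, True, False, True]))
```

Because a requesting device forces Out=0 for everything downstream, the device closest to the head of the chain always wins, which gives an implicit fixed priority.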

slide-21
SLIDE 21 Interconnection Structures 21

Timing

Synchronous

• Events are determined by clock signals.
• The control bus includes a clock line.
• Usually synchronized on the leading edge.
• Usually a single cycle per event.
• A READY/WAIT line signals when the slave is expected to have completed the access.

Advantage: simple implementation (due to the clock signal).

slide-22
SLIDE 22 Interconnection Structures 22

Synchronous Timing Diagram

slide-23
SLIDE 23

Synchronous Timing Diagram

Interconnection Structures 23
slide-24
SLIDE 24 Interconnection Structures 24

Timing

Asynchronous

• No clock signal (due to distortion).
• Events are determined by the completion of earlier events.
• Status lines signal when the slave completes the access.

Advantages:

• allows for “fractional” cycles
• no minimal access time is imposed
• longer busses are possible (clock distortion is not an issue)

slide-25
SLIDE 25 Interconnection Structures 25

Asynchronous Timing Diagram

slide-26
SLIDE 26 Interconnection Structures 26

Data Transfer Type

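Bus data transfer types shown in the figure (time runs left to right):

• Write (multiplexed) operation
• Write (non-multiplexed) operation
• Read (multiplexed) operation
• Read (non-multiplexed) operation
• Read-modify-write operation
• Read after write operation
• Block data transfer

Each operation is drawn as a sequence of address and data phases. Reads include an access time before the data, the read-modify-write and read after write operations combine a data read and a data write on one address, and a block transfer follows one address with several data items.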

slide-27
SLIDE 27 Interconnection Structures 27

The PCI Bus

The bus structure of a Pentium 4.

slide-28
SLIDE 28 Interconnection Structures 28

The PCI Bus

Characteristics

Synchronous Parallel 32 or 64 bit transfers Up to 528 MB/s

The three 64-bit PCI slots and a single 32-bit PCI-slot
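The 528 MB/s peak figure follows from the bus width and the clock rate. The sketch below assumes the nominal PCI clock rates of 33 MHz and 66 MHz, which the slide does not state, and one transfer per clock cycle; the function name is hypothetical.

```python
def peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate of a parallel bus moving one word per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz  # MB/s, since the clock is in MHz


if __name__ == "__main__":
    for width, clock in ((32, 33), (32, 66), (64, 33), (64, 66)):
        print(f"{width}-bit @ {clock} MHz: {peak_mb_per_s(width, clock)} MB/s")
```

Only the widest and fastest combination, 64 bits at 66 MHz, reaches the quoted maximum.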
slide-29
SLIDE 29 Interconnection Structures 29

PCI Bus Signals(1)

Mandatory PCI bus signals.

slide-30
SLIDE 30 Interconnection Structures 30

PCI Bus Signals(2)

Optional PCI bus signals.

slide-31
SLIDE 31 Interconnection Structures 31

PCI Bus Transactions

Examples of 32-bit PCI bus transactions.

Annotations in the figure:

• AD: multiplexed address and data lines
• C/BE: bus command, or bit map of byte enables
• AD and C/BE are enabled
• Read: master will accept; write: data present
• Read: data present; write: slave will accept
• Signals are driven by the master, the slave, or both
slide-32
SLIDE 32 Interconnection Structures 32

Problem with parallel busses

Clock skew

• A phenomenon in synchronous circuits in which the clock signal (sent from the clock circuit) arrives at different components at different times.
slide-33
SLIDE 33 Interconnection Structures 33

Problem with parallel busses

Clock skew may be caused by many factors:

• wire length,
• variation in intermediate devices,
• capacitive coupling,
• material imperfections, …

As the clock rate increases, less variation can be tolerated if the circuit is to function properly. This imposes a clock rate limit on the parallel bus.

slide-34
SLIDE 34 Interconnection Structures 34

PCIe

A typical PCI Express system.

Manages multiple PCIe streams

slide-35
SLIDE 35

PCIe

Interconnection Structures 35

• Switch: manages multiple PCIe streams.
• PCIe endpoint: an I/O device or controller that implements PCIe, such as a Gigabit Ethernet switch, a graphics or video controller, a disk interface, or a communications controller.
• PCIe/PCI bridge: allows older PCI devices to be connected to a PCIe-based system.

slide-36
SLIDE 36 Interconnection Structures 36

PCIe Characteristics

• Communication flows through one or more pairs of unidirectional connections, called lanes.
• Each connection of a lane is a differential pair: two wires carrying complementary signals, which provides high noise immunity.
• PCIe devices communicate through a link, which is built up from a collection of one or more lanes.
  • Links with 1, 2, 4, 8, 16, and 32 lanes are allowed.

slide-37
SLIDE 37 Interconnection Structures 37

PCIe Configuration

paired serial links

slide-38
SLIDE 38 Interconnection Structures 38

PCIe Evolution

Evolution (raw signalling rate per lane, in each direction)

• 2004: PCIe 1.1 → 2.5 GT/s
• 2007: PCIe 2.0 → 5.0 GT/s
• 2010: PCIe 3.0 → 8.0 GT/s
• 2017: PCIe 4.0 → 16.0 GT/s
• 2019: PCIe 5.0 → 32.0 GT/s
• 2021: PCIe 6.0 → 64.0 GT/s
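Treating the per-lane figures as raw signalling rates in gigatransfers per second, the usable byte rate depends on the line encoding. The sketch below is an estimate under stated assumptions: 8b/10b encoding for PCIe 1.1 and 2.0 (a known property of those generations that the slides do not mention), 128b/130b for 3.0 to 5.0, and no protocol overhead. PCIe 6.0 is left out because it uses a different signalling scheme. All names are hypothetical.

```python
# version -> (raw rate in GT/s, payload bits, total bits per encoded block)
GENERATIONS = {
    "1.1": (2.5, 8, 10),
    "2.0": (5.0, 8, 10),
    "3.0": (8.0, 128, 130),
    "4.0": (16.0, 128, 130),
    "5.0": (32.0, 128, 130),
}

ALLOWED_LANES = (1, 2, 4, 8, 16, 32)


def lane_gbytes_per_s(version: str) -> float:
    """Approximate usable GB/s per lane, per direction."""
    rate, payload_bits, total_bits = GENERATIONS[version]
    return rate * payload_bits / total_bits / 8


def link_gbytes_per_s(version: str, lanes: int) -> float:
    """Approximate usable GB/s for a link built from several lanes."""
    if lanes not in ALLOWED_LANES:
        raise ValueError(f"unsupported lane count: {lanes}")
    return lanes * lane_gbytes_per_s(version)


if __name__ == "__main__":
    for version in GENERATIONS:
        print(version, round(lane_gbytes_per_s(version), 3), "GB/s per lane")
```

The switch from 8b/10b to 128b/130b is why PCIe 3.0 nearly doubles the usable bandwidth of 2.0 even though the raw rate only rises from 5.0 to 8.0.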

slide-39
SLIDE 39

PCIe Evolution

Interconnection Structures 39

Source: Wikipedia

slide-40
SLIDE 40 Interconnection Structures 40

PCIe Protocol Layers

• A protocol is a set of rules governing the conversation between two parties.
• A protocol stack is a hierarchy of protocols that deals with different issues at different layers.
• The PCI Express protocol stack has 3 layers: physical, link, and transaction.

slide-41
SLIDE 41 Interconnection Structures 41

PCI Express – physical layer

• It deals with moving bits from a sender to a receiver.
• Recall that each point-to-point connection consists of one or more pairs of simplex (unidirectional) links, called lanes; 1, 2, 4, 8, 16, or 32 pairs are allowed.
• No master clock: 128b/130b encoding ensures enough signal transitions to maintain synchronization.

Figure: a PCI Express x16 slot and a PCI Express x1 slot.
slide-42
SLIDE 42 Interconnection Structures 42

PCIe Multilane Distribution

slide-43
SLIDE 43 Interconnection Structures 43

PCIe Link Layer

• It deals with packet transmission.
• It adds a sequence number and an error-detecting code (CRC) to the header + payload.
• If the CRC check passes, the receiver sends back an acknowledgement packet; otherwise it asks for retransmission.
• This greatly improves data integrity.

Packet format at each layer (figure):

• Transaction layer: header, payload
• Link layer: seq #, header, payload, CRC
• Physical layer: frame, seq #, header, payload, CRC, frame
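The link-layer behaviour (add a sequence number and a CRC, then acknowledge or request retransmission) can be sketched as follows. This is not the real PCIe packet layout: the 2-byte sequence field, the use of `zlib.crc32`, and the function names are assumptions made for illustration.

```python
import zlib
from typing import Optional, Tuple


def link_wrap(seq: int, packet: bytes) -> bytes:
    """Prefix a transaction-layer packet with a sequence number, append a CRC."""
    body = seq.to_bytes(2, "big") + packet
    return body + zlib.crc32(body).to_bytes(4, "big")


def link_receive(frame: bytes, expected_seq: int) -> Tuple[str, Optional[bytes]]:
    """Check the CRC and sequence number; answer ACK or ask for retransmission."""
    body, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    seq = int.from_bytes(body[:2], "big")
    if zlib.crc32(body) != received_crc or seq != expected_seq:
        return "NAK", None  # sender should retransmit
    return "ACK", body[2:]  # hand header + payload up to the transaction layer


if __name__ == "__main__":
    good = link_wrap(7, b"header+payload")
    print(link_receive(good, 7))

    damaged = bytearray(good)
    damaged[3] ^= 0x01  # flip one bit in transit
    print(link_receive(bytes(damaged), 7))
```

The CRC only detects corruption; integrity comes from the retransmission that a failed check triggers.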
slide-44
SLIDE 44 Interconnection Structures 44

PCI Express – transaction layer

• It handles bus actions.
• It splits transactions into a request and a response, separated in time.
• It may divide each lane into up to eight virtual circuits, each handling a different class of traffic.
• Flow control guarantees that the transmitter sends data only when there is free space in the receiver buffer.

Packet format at each layer (figure):

• Transaction layer: header, payload
• Link layer: seq #, header, payload, CRC
• Physical layer: frame, seq #, header, payload, CRC, frame
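The flow-control rule can be illustrated with a credit counter: the transmitter spends one credit per packet and stops when it has none, and the receiver returns credits as it frees buffer space. This is a simplified sketch; the class names, the one-credit-per-packet accounting, and the buffer size are assumptions, not the actual PCIe credit scheme.

```python
from collections import deque


class Receiver:
    def __init__(self, buffer_slots: int) -> None:
        self.capacity = buffer_slots
        self.buffer = deque()

    def accept(self, packet: str) -> None:
        if len(self.buffer) >= self.capacity:
            raise OverflowError("receiver buffer overrun")
        self.buffer.append(packet)

    def drain_one(self) -> int:
        """Consume one buffered packet and return the number of credits freed."""
        if self.buffer:
            self.buffer.popleft()
            return 1
        return 0


class Transmitter:
    def __init__(self, receiver: Receiver) -> None:
        self.receiver = receiver
        self.credits = receiver.capacity  # initial credits = free buffer slots

    def send(self, packet: str) -> bool:
        if self.credits == 0:
            return False  # no free space at the receiver: hold the packet
        self.credits -= 1
        self.receiver.accept(packet)
        return True

    def add_credits(self, count: int) -> None:
        self.credits += count


if __name__ == "__main__":
    rx = Receiver(buffer_slots=2)
    tx = Transmitter(rx)
    print([tx.send(p) for p in ("a", "b", "c")])  # the third send is held back
    tx.add_credits(rx.drain_one())
    print(tx.send("c"))
```

Because the transmitter never sends without a credit, the overrun check in `accept` is never triggered, and no packet is dropped for lack of buffer space.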
slide-45
SLIDE 45

Exercise 1

Consider two microprocessors having 8- and 16-bit-wide external data buses, respectively. The two processors are otherwise identical and their bus cycles take just as long.

a) Suppose all instructions and operands are two bytes long. By what factor do the maximum data transfer rates differ?

b) Repeat assuming that half of the operands and instructions are one byte long.

Interconnection Structures 45
slide-46
SLIDE 46

Exercise 2

For a synchronous read operation (slides 23 and 24), the memory module must place the data on the bus sufficiently ahead of the falling edge of the Read signal to allow for signal settling. Assume a microprocessor bus is clocked at 10 MHz.

a) When, at the latest, should memory data be placed on the bus after the Read signal is asserted?

b) How many wait states (clock cycles) need to be inserted for proper read operation if the read-to-data-available time of a memory chip is 350 ns?

Interconnection Structures 46
slide-47
SLIDE 47

Exercise 3

A microprocessor has an increment memory direct instruction, which adds 1 to the value in a memory location. The instruction has five stages: fetch opcode (four bus clock cycles), fetch operand address (three cycles), fetch operand (three cycles), add 1 to operand (three cycles), and store operand (three cycles).

a) By what amount (in percent) will the duration of the instruction increase if we have to insert two bus wait states in each memory read and memory write operation?

b) Repeat assuming that the increment operation takes 13 cycles instead of 3 cycles.

Interconnection Structures 47
slide-48
SLIDE 48

Exercise 4

Consider a 32-bit microprocessor whose bus cycle is the same duration as that of a 16-bit microprocessor. Assume that, on average, 20% of the operands and instructions are 32 bits long, 40% are 16 bits long, and 40% are only 8 bits long. Calculate the improvement achieved when fetching instructions and operands with the 32-bit microprocessor.
Interconnection Structures 48
slide-49
SLIDE 49 Interconnection Structures 49

Interconnection Structures

END