SLIDE 1 Introduction to Field Programmable Gate Arrays
Lecture 2/3
CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May – 9 June 2007 Javier Serrano, CERN AB-CO-HT
SLIDE 2 Outline
Digital Signal Processing using FPGAs
- Introduction. Why FPGAs for DSP?
- Fixed point and its subtleties.
- Doing arithmetic in hardware.
- Distributed Arithmetic (DA).
- COordinate Rotation DIgital Computer (CORDIC).
SLIDE 4 Why FPGAs for DSP? (1)
[Figure: a conventional DSP device (Von Neumann architecture) with a single MAC unit and register between Data In and Data Out, next to an FPGA with 256 registers (Reg0 to Reg255) and 256 coefficients (C0 to C255) between Data In and Data Out.]
Conventional DSP device: 256 loops needed to process samples.
FPGA: all 256 MAC operations in 1 clock cycle.
Reason 1: FPGAs handle high computational workloads
SLIDE 5 FPGAs are ideal for multi-channel DSP designs
[Figure: four channels (ch1 to ch4) of 20 MHz samples, each with its own LPF, compared with a single multi-channel LPF running on 80 MHz samples.]
Many low sample rate channels can be multiplexed (e.g. TDM) and processed in the FPGA, at a high rate. Interpolation (using zeros) can also drive sample rates higher.
SLIDE 6 Why FPGAs for DSP? (2)
Q = (A x B) + (C x D) + (E x F) + (G x H) can be implemented in parallel
[Figure: four multipliers computing A×B, C×D, E×F and G×H in parallel, feeding adders that produce Q.]
Reason 2: Tremendous flexibility. But is this the only way in the FPGA?
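SLIDE 7 Customize Architectures to Suit Your Ideal Algorithms
[Figure: three implementations of the same sum of products. Parallel: four multipliers and adders. Semi-parallel: two multipliers, adders and an accumulator register (D/Q). Serial: one multiplier, adders and an accumulator register (D/Q).]
FPGAs allow area (cost) / performance tradeoffs. Optimized for? Speed (parallel) or cost (serial).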
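As a rough illustration of the area/speed tradeoff, the sketch below models the sum of products Q with a configurable number of multipliers. The function name `sum_of_products` and the one-batch-per-clock-tick cost model are assumptions made for this example; pipeline latency and adder-tree depth are ignored.

```python
def sum_of_products(pairs, n_multipliers):
    """Return (result, clock_ticks) for sum(a*b) using n_multipliers multipliers."""
    acc = 0
    ticks = 0
    for start in range(0, len(pairs), n_multipliers):
        batch = pairs[start:start + n_multipliers]
        # One clock tick: every available multiplier works on one pair,
        # and the partial sum is added to the accumulator register.
        acc += sum(a * b for a, b in batch)
        ticks += 1
    return acc, ticks


pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]  # (A,B), (C,D), (E,F), (G,H)
for name, n in [("parallel", 4), ("semi-parallel", 2), ("serial", 1)]:
    q, ticks = sum_of_products(pairs, n)
    print(f"{name:13s} multipliers={n} Q={q} ticks={ticks}")
```

All three variants give the same Q; only the number of multipliers (area) and the number of clock ticks (speed) change.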
SLIDE 8 Why FPGAs for DSP? (3)
[Figure: a system built from separate parts (a DSP card with DSP processors, an FPGA holding DDCs, DUCs, MACs and control, A/D and D/A converters, an AFE and SDRAM; a network card with PowerPCs, an ASSP, an FPGA, quad TRx, SSTL3 translators, SDRAM and hundreds of termination resistors) compared with a single FPGA integrating PowerPCs, MACs, DUCs, DDCs, logic, control, PL4, CORBA and 3.125 Gbps links, alongside A/D and D/A converters, an ASSP and SDRAM.]
Reason 3: Integration simplifies PCBs
SLIDE 10
Unsigned integers: positive values only
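A minimal sketch of how an N-bit unsigned register behaves, assuming plain binary weighting (bit i has weight 2^i). The helper names `unsigned_range` and `to_unsigned` are made up for this example.

```python
def unsigned_range(n_bits):
    """Smallest and largest value an n_bits-wide unsigned word can hold."""
    return 0, (1 << n_bits) - 1


def to_unsigned(value, n_bits):
    """Keep only the n_bits least-significant bits, as an n_bits register would."""
    return value & ((1 << n_bits) - 1)


print(unsigned_range(8))                  # (0, 255)
print(format(to_unsigned(10, 8), "08b"))  # 00001010
print(to_unsigned(255 + 1, 8))            # 0: the carry out of the MSB is lost
```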
SLIDE 11
2’s complement
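In 2's complement the MSB carries a negative weight, -2^(N-1), and all other bits keep their usual positive weights. The sketch below, with hypothetical helper names, converts between Python integers and N-bit patterns under that assumption.

```python
def to_twos_complement(value, n_bits):
    """Return the n_bits-wide bit pattern (as a non-negative int) encoding value."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {n_bits} bits")
    return value & ((1 << n_bits) - 1)


def from_twos_complement(pattern, n_bits):
    """Decode an n_bits-wide pattern: the MSB weighs -2**(n_bits-1)."""
    sign_bit = 1 << (n_bits - 1)
    return (pattern & (sign_bit - 1)) - (pattern & sign_bit)


print(format(to_twos_complement(-3, 4), "04b"))  # 1101
print(from_twos_complement(0b1111, 4))           # -1
print(from_twos_complement(0b0111, 4))           # 7
```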
SLIDE 12 Fixed point binary numbers
Example: 3 integer bits and 5 fractional bits
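A possible way to model this format in software is shown below. It assumes an 8-bit 2's complement word in which the sign bit is counted among the 3 integer bits, so the range is -4 to 4 - 2^-5 with a resolution of 2^-5 = 0.03125; the slide may use a different convention. The names `to_fixed` and `to_float` are illustrative only.

```python
FRAC_BITS = 5   # fractional bits
WORD_BITS = 8   # assumed total width: 3 integer bits (including sign) + 5 fractional


def to_fixed(x):
    """Quantize a real number to the nearest representable raw integer."""
    raw = round(x * (1 << FRAC_BITS))
    lo, hi = -(1 << (WORD_BITS - 1)), (1 << (WORD_BITS - 1)) - 1
    if not lo <= raw <= hi:
        raise ValueError(f"{x} is outside the representable range")
    return raw


def to_float(raw):
    """Interpret a raw integer as a fixed-point value (binary point 5 bits from the right)."""
    return raw / (1 << FRAC_BITS)


raw = to_fixed(2.71875)
print(raw, to_float(raw))       # 87 2.71875
print(to_float(to_fixed(3.14159)))  # nearest representable value to 3.14159
```

The hardware only ever sees the raw integer; the position of the binary point is a convention the designer has to keep track of.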
SLIDE 13 Fixed point truncation vs. rounding
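Note that in 2’s complement, truncation is biased (it always moves the value towards minus infinity, so the mean error is about half an LSB), while rounding to nearest is essentially unbiased.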
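The bias can be checked numerically. The sketch below drops 4 fractional bits from every 8-bit 2's complement value, once by truncation (arithmetic right shift) and once by adding half an LSB before shifting, and averages the error. The function names and the choice of 4 dropped bits are assumptions for this example.

```python
def truncate(raw, drop):
    """Drop the `drop` least-significant bits (arithmetic shift, rounds towards -inf)."""
    return raw >> drop


def round_nearest(raw, drop):
    """Add half an output LSB, then truncate (ties go towards +inf)."""
    return (raw + (1 << (drop - 1))) >> drop


drop = 4
values = range(-128, 128)
scale = 1 << drop
mean_trunc = sum(truncate(v, drop) - v / scale for v in values) / len(values)
mean_round = sum(round_nearest(v, drop) - v / scale for v in values) / len(values)

print(mean_trunc)  # -0.46875 output LSBs: close to -1/2
print(mean_round)  # 0.03125 output LSBs: small residue from the tie case
```

The small positive residue left by this simple rounding scheme comes from always rounding ties upwards; it shrinks as more bits are dropped.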
SLIDE 15
The Full Adder (FA)
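A bit-level model of a full adder, and of a ripple-carry adder built by chaining full adders, might look like this. The gate equations are the standard ones; the function names are chosen for this sketch.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, n_bits, cin=0):
    """Add two n_bits-wide words by chaining full adders, LSB first."""
    result = 0
    carry = cin
    for i in range(n_bits):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry  # carry is the carry out of the MSB


print(ripple_carry_add(5, 6, 4))  # (11, 0)
print(ripple_carry_add(9, 8, 4))  # (1, 1): 17 does not fit in 4 bits
```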
SLIDE 16 Add/subtract circuit
S = A+B when Control=‘0’
S = A-B when Control=‘1’
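One common way to build such a circuit, assumed here since the schematic is not reproduced, is to XOR every bit of B with Control and feed Control into the carry input of the adder: in 2's complement, A - B = A + not(B) + 1. A word-level sketch with a hypothetical `add_sub` function:

```python
def add_sub(a, b, control, n_bits):
    """Return the n_bits-wide pattern of A+B (control=0) or A-B (control=1)."""
    mask = (1 << n_bits) - 1
    b_eff = (b ^ (mask if control else 0)) & mask  # XOR each bit of B with Control
    return (a + b_eff + control) & mask            # Control also drives the carry in


print(add_sub(5, 3, 0, 4))                 # 8
print(add_sub(5, 3, 1, 4))                 # 2
print(format(add_sub(3, 5, 1, 4), "04b"))  # 1110, i.e. -2 in 2's complement
```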
SLIDE 17 Saturation
You can’t let the data path become arbitrarily wide. Saturation involves overflow detection and a multiplexer. Useful in accumulators (like the one in the PI controller we use in the lab).
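The idea can be sketched as follows: compute the sum with enough headroom, detect overflow, and let a multiplexer choose between the sum and the nearest limit. The function name and the 8-bit accumulator width are assumptions for this example.

```python
def saturating_add(a, b, n_bits):
    """Add two signed values and clamp the result to the n_bits 2's complement range."""
    hi = (1 << (n_bits - 1)) - 1
    lo = -(1 << (n_bits - 1))
    total = a + b      # in hardware: one extra bit of headroom
    if total > hi:     # positive overflow detected
        return hi
    if total < lo:     # negative overflow detected
        return lo
    return total       # no overflow: the multiplexer passes the sum through


acc = 0
for error in [100, 100, -50]:
    acc = saturating_add(acc, error, 8)
    print(acc)  # 100, then 127 (saturated), then 77
```

Without saturation, the second addition would wrap around to a large negative value, which is usually far more harmful in a control loop than clipping.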
SLIDE 18
Multiplication: pencil & paper approach
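In binary, the pencil and paper method reduces to shifting and adding: each bit of the multiplier either selects a shifted copy of the multiplicand or zero. A minimal sketch for unsigned operands (function name chosen for this example):

```python
def shift_add_multiply(a, b, n_bits):
    """Multiply unsigned a by unsigned n_bits-wide b, one partial product per bit of b."""
    product = 0
    for i in range(n_bits):
        if (b >> i) & 1:       # AND gates: partial product is either a or 0
            product += a << i  # weight the partial product by 2**i and add
    return product


print(shift_add_multiply(13, 11, 4))  # 143
```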
SLIDE 19 A 4-bit unsigned multiplier using Full Adders and AND gates
Of course, you can use embedded multipliers if your chip has them!
SLIDE 20 Constant coefficient multipliers using ROM
For “easy” coefficients, there are smarter ways. E.g. to multiply a number A by 31, left-shift A by 5 places then subtract A.
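Both ideas can be sketched in a few lines. The ROM variant below assumes a common arrangement in which the input is split into 4-bit nibbles, each nibble addresses a 16-entry table of precomputed multiples, and the shifted table outputs are added; the slide's circuit may be organized differently. The coefficient 57 and all function names are arbitrary choices for this example.

```python
def make_rom(coeff, addr_bits=4):
    """Precompute coeff * address for every possible address."""
    return [coeff * addr for addr in range(1 << addr_bits)]


def rom_multiply(a, rom, n_bits=8, addr_bits=4):
    """Multiply unsigned n_bits-wide a by the constant stored in rom."""
    result = 0
    for shift in range(0, n_bits, addr_bits):
        nibble = (a >> shift) & ((1 << addr_bits) - 1)
        result += rom[nibble] << shift  # table lookup, weighted by nibble position
    return result


def times_31(a):
    """'Easy' coefficient: 31*A = 32*A - A, one shift and one subtraction."""
    return (a << 5) - a


rom = make_rom(57)
print(rom_multiply(200, rom))  # 11400
print(times_31(7))             # 217
```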
SLIDE 21 Division: pencil & paper
- Uses add/subtract blocks presented earlier.
- MSB produced first: this will usually imply we have to wait for the whole operation to finish before feeding the result to another block.
- Longer combinational delays than in multiplication: an N by N division will always take longer than an N by N multiplication.
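The pencil and paper procedure can be modelled as below. This sketch uses the restoring variant (subtract, and keep the old remainder if the result is negative); a hardware array built from add/subtract blocks may well use the non-restoring variant instead. Operands are assumed unsigned and the function name is made up for this example.

```python
def restoring_divide(dividend, divisor, n_bits):
    """Unsigned division of an n_bits-wide dividend; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = 0
    quotient = 0
    for i in reversed(range(n_bits)):  # quotient bits come out MSB first
        remainder = (remainder << 1) | ((dividend >> i) & 1)  # bring down next bit
        trial = remainder - divisor
        if trial >= 0:                 # subtraction succeeded: quotient bit is 1
            remainder = trial
            quotient |= 1 << i
        # otherwise keep the old remainder ("restore"): quotient bit is 0
    return quotient, remainder


print(restoring_divide(100, 7, 8))  # (14, 2)
```

Each quotient bit depends on the sign of a full-width subtraction that itself depends on the previous step, which is where the long combinational delays come from.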
SLIDE 22
Pipelining the division array
SLIDE 23 Square root
- Take a division array, cut it in half (diagonally) and you have square root. Square root is therefore faster than division!
- Although with less ripple through, this block suffers from the same problems as the division array.
- Alternative approach: first guess with a ROM, then use an iterative algorithm such as Newton-Raphson.
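The alternative approach can be sketched in floating point as follows. Everything specific here is an assumption for illustration: the input is taken to be normalized to [1, 4), the ROM has 16 entries holding the square root of each interval midpoint, and the Newton-Raphson update is y = (y + x/y)/2. A hardware design might iterate on the inverse square root instead, to avoid the division.

```python
import math

ROM_ADDR_BITS = 4
ROM_SIZE = 1 << ROM_ADDR_BITS
# Hypothetical ROM contents: sqrt of the midpoint of each of 16 intervals covering [1, 4).
ROM = [math.sqrt(1 + 3 * (i + 0.5) / ROM_SIZE) for i in range(ROM_SIZE)]


def sqrt_newton(x, iterations=2):
    """Square root of x in [1, 4): ROM first guess, then Newton-Raphson refinement."""
    if not 1 <= x < 4:
        raise ValueError("x must be normalized to [1, 4)")
    index = min(int((x - 1) / 3 * ROM_SIZE), ROM_SIZE - 1)
    y = ROM[index]              # first guess from the ROM
    for _ in range(iterations):
        y = 0.5 * (y + x / y)   # Newton-Raphson step for y*y - x = 0
    return y


for x in (1.0, 2.0, 3.99):
    print(x, sqrt_newton(x), math.sqrt(x))
```

Because Newton-Raphson roughly doubles the number of correct bits per iteration, a better first guess (a larger ROM) directly reduces the number of iterations needed.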
SLIDE 25 Distributed Arithmetic (DA) 1/2
Digital filtering is about sums of products:
$$y = \sum_{n=0}^{N-1} c[n] \cdot x[n]$$
Let’s assume:
- c[n] constant (prerequisite to use DA)
- x[n] input signal B bits wide
Then:
$$y = \sum_{n=0}^{N-1} c[n] \cdot \left( \sum_{b=0}^{B-1} x_b[n] \cdot 2^b \right)$$
where x_b[n] is bit number b of x[n]. And after some rearrangement of terms:
$$y = \sum_{b=0}^{B-1} 2^b \cdot \left( \sum_{n=0}^{N-1} c[n] \cdot x_b[n] \right)$$
The term in parentheses can be implemented with an N-input LUT.
SLIDE 26 Distributed Arithmetic (DA) 2/2
$$y = \sum_{b=0}^{B-1} 2^b \cdot \left( \sum_{n=0}^{N-1} c[n] \cdot x_b[n] \right)$$
[Figure: N shift registers holding the bits of x[0], x[1], …, x[N-1] each feed one bit per clock tick, starting with bit x0, to the address inputs of the LUT. The LUT output goes to an adder and register with a 2^-1 scaling block, producing y.]
Generates a result every B clock ticks. By replicating logic one can trade off speed vs. area, to the limit of getting one result per clock tick.
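A software model of the bit-serial DA scheme is sketched below, assuming unsigned B-bit samples. The LUT has 2^N entries: entry `addr` holds the sum of the coefficients c[n] whose address bit n is set. For simplicity the model weights each LUT output by 2^b with a left shift, rather than halving the accumulator on every tick as the 2^-1 block does; the two differ only by a constant scale factor. Function names, coefficients and samples are made up for this example.

```python
def build_da_lut(coeffs):
    """LUT[addr] = sum of c[n] over all n where bit n of addr is 1."""
    n_taps = len(coeffs)
    return [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << n_taps)
    ]


def da_dot_product(lut, samples, n_bits):
    """Compute sum(c[n] * x[n]) one bit plane at a time (unsigned n_bits-wide x[n])."""
    y = 0
    for b in range(n_bits):               # one clock tick per bit, LSB first
        addr = 0
        for n, x in enumerate(samples):
            addr |= ((x >> b) & 1) << n   # bit b of every sample forms the LUT address
        y += lut[addr] << b               # weight the LUT output by 2**b and accumulate
    return y


coeffs = [3, -1, 4, 2]
samples = [5, 9, 2, 7]
lut = build_da_lut(coeffs)
print(da_dot_product(lut, samples, 4))               # 28
print(sum(c * x for c, x in zip(coeffs, samples)))   # 28, direct computation
```

No multiplier is used anywhere: the cost is a LUT whose size grows as 2^N, which is why long filters are usually split into several smaller LUTs whose outputs are added.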
SLIDE 28
COordinate Rotation DIgital Computer (CORDIC)
SLIDE 29
Pseudo-rotations
SLIDE 30
Basic CORDIC iterations
SLIDE 31
Angle accumulator
SLIDE 32
The scaling factor
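In the commonly published form of circular CORDIC, each pseudo-rotation by atan(2^-i) stretches the vector by sqrt(1 + 2^-2i), so the total gain after n iterations is the product of those terms. The sketch below evaluates that product under this assumption; the function name is chosen for the example.

```python
import math


def cordic_gain(n_iterations):
    """Accumulated gain of n circular CORDIC pseudo-rotations (i = 0 .. n-1)."""
    gain = 1.0
    for i in range(n_iterations):
        gain *= math.sqrt(1 + 2.0 ** (-2 * i))
    return gain


for n in (4, 8, 16):
    print(n, cordic_gain(n), 1 / cordic_gain(n))
```

The gain converges quickly to about 1.6468 (inverse about 0.6073), and since it depends only on the number of iterations it can be compensated with a single constant.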
SLIDE 33
Rotation Mode
SLIDE 34
Example: calculate sin and cos of 30º
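A floating-point sketch of rotation mode for this example is given below. It assumes the commonly published iterations x' = x - d·y·2^-i, y' = y + d·x·2^-i, z' = z - d·atan(2^-i), with d = +1 when z >= 0 and -1 otherwise; the slides may use different sign conventions. Starting from x = 1/gain, y = 0 compensates the scaling factor up front. In hardware the multiplications by 2^-i are shifts and the atan values come from a small table; the function name and the 16 iterations are arbitrary choices.

```python
import math


def cordic_rotate(angle_rad, n_iterations=16):
    """Return (cos, sin) of an angle in [-pi/2, pi/2] using circular CORDIC, rotation mode."""
    gain = 1.0
    for i in range(n_iterations):
        gain *= math.sqrt(1 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle_rad   # pre-scaled start vector, angle to rotate by
    for i in range(n_iterations):
        d = 1 if z >= 0 else -1            # rotate so that the residual angle z goes to 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)      # angle accumulator
    return x, y


cos30, sin30 = cordic_rotate(math.radians(30))
print(cos30, sin30)  # close to 0.8660 and 0.5
```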
SLIDE 35 Vectoring Mode
Vector magnitude
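In vectoring mode the same iterations are used, but the rotation direction is chosen to drive y towards zero; x then ends up holding the magnitude multiplied by the CORDIC gain, and the angle accumulator holds atan(y/x). The sketch below assumes the commonly published sign conventions and an input with x > 0; the function name and the final division by the gain are choices made for this example.

```python
import math


def cordic_vector(x, y, n_iterations=16):
    """Return (magnitude, angle) of the vector (x, y), x > 0, using CORDIC vectoring mode."""
    gain = 1.0
    for i in range(n_iterations):
        gain *= math.sqrt(1 + 2.0 ** (-2 * i))
    z = 0.0
    for i in range(n_iterations):
        d = -1 if y >= 0 else 1            # rotate towards the x axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)      # accumulates the angle rotated through
    return x / gain, z                     # remove the scaling factor from the magnitude


magnitude, angle = cordic_vector(3.0, 4.0)
print(magnitude, angle)  # close to 5 and atan(4/3)
```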
SLIDE 36
Circular coordinate system
SLIDE 37
Other coordinate systems
SLIDE 38
Generalized CORDIC equations
SLIDE 39
Summary of CORDIC functions
SLIDE 40
Precision and convergence
SLIDE 41
FPGA implementation
SLIDE 42
Iterative bit-serial design
SLIDE 43
Acknowledgements
Many thanks to Jeff Weintraub (Xilinx University Program) and Bob Stewart (University of Strathclyde) for many of these slides.