System-on-Chip Design
Transaction-Level Modeling with SystemC
- Dr. Hao Zheng
- Comp. Sci & Eng.
- U of South Florida

Motivation
Why use transaction-level modeling and ESL languages?
- Manage growing system complexity
- Enable HW/SW co-design
[Figure: system modeling graph. Axes: Computation and Communication, each graded as un-timed, approximate-timed, and cycle-timed. Models A to F are plotted as points.]

A. Specification Model ("Untimed Functional Model")
B. Component-Assembly Model ("Architecture Model", "Timed Functional Model")
C. Bus-Arbitration Model ("Transaction Model")
D. Bus-Functional Model ("Communication Model", "Behavior-Level Model")
E. Cycle-Accurate Computation Model
F. Implementation Model ("Register-Transfer Level (RTL) Model")

* Figure and taxonomy by Gajski and Cai, UC Irvine
Objects:
- Computation: Behaviors
- Communication: Variables
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait
Objects:
- Computation: Processors, Memories, IP
- Communication: Variable Channels
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait

[Figure: B1 (v1 = a*a;) on PE1, B2 (v2 = v1 + b*b;) on PE2, and B3 (v3 = v1 - b*b;) and B4 (v4 = v2 + v3; c = sequ(v4);) on PE3, connected by channels cv11, cv12, and cv2.]
Objects:
- Computation: Processors, Memories, IP, arbiters
- Communication: Abstract Bus Channels
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait

Computation: behavior
Communication: abstract channels
A network of communicating sequential processes connected by abstract channels.
Computation: behavioral, approximately timed
Communication: protocol bus channels

Objects:
- Computation: Processors, Memories, IP, arbiters
- Communication: Protocol Bus Channels
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait

[Figure: B1 (v1 = a*a;) on PE1, B2 (v2 = v1 + b*b;) on PE2, B3 (v3 = v1 - b*b;) and B4 (v4 = v2 + v3; c = sequ(v4);) on PE3, and PE4 (Arbiter). Interfaces: 1: master interface, 2: slave interface, 3: arbiter interface. Protocol signals shown (IProtocolSlave): ready, ack, address[15:0], data[31:0].]

Objects:
- Computation: Processors, Memories, IP, arbiters, Wrappers
- Communication: Abstract Bus Channels
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait
[Figure: PE1 to PE4 refined to cycle-accurate form, shown as state sequences (S0 to S4, S0 to S3) and an instruction stream (MOV r1, 10; MUL r1, r1, r1; ...; MLA r1, r2, r2, r1; ...), with channels cv11, cv12, and cv2 and interfaces 1, 2, 3.]

Communication is approximately timed.
Objects:
- Computation: Processors, Memories, IP, arbiters, Wrappers
- Communication: Buses/Wires
Composition:
- Hierarchy
- Execution order: Sequential, Parallel, Pipelined, States
Synchronization:
- Notify/Wait

[Figure: model F. PE1 to PE4 shown as state sequences (S0 to S4, S0 to S3) and an instruction stream (MOV r1, 10; MUL r1, r1, r1; ...; MLA r1, r2, r2, r1; ...), connected by bus wires MCNTR, MADDR, and MDATA, with interrupt and req signals.]
[Figure: dataflow example with four modules: Constant, Adder, Fork, and Printer.]
sc_fifo<T> operations:

    void read(T&);
    T read();
    bool nb_read(T&);
    int num_available();

    void write(const T&);
    bool nb_write(const T&);
    int num_free();

    sc_fifo(int size=16);
    sc_fifo(char* name, int size=16);
- sc_fifo_in<T>: supports only read operations.
- sc_fifo_out<T>: supports only write operations.
    template <class T>
    SC_MODULE(DF_Adder) {
      sc_fifo_in<T> din1, din2;
      sc_fifo_out<T> dout;

      void process() {
        while (1)
          dout.write(din1.read() + din2.read());
      }

      SC_CTOR(DF_Adder) { SC_THREAD(process); }
    };
    template <class T>
    SC_MODULE(DF_Const) {
      sc_fifo_out<T> dout;

      void process() {
        while (1) dout.write(constant_);
      }

      SC_HAS_PROCESS(DF_Const);
      DF_Const(sc_module_name N, const T& C)
        : sc_module(N), constant_(C)
      { SC_THREAD(process); }

      T constant_;
    };
    template <class T>
    SC_MODULE(DF_Fork) {
      sc_fifo_in<T> din;
      sc_fifo_out<T> dout1, dout2;

      void process() {
        while (1) {
          T value = din.read();
          dout1.write(value);
          dout2.write(value);
        }
      }

      SC_CTOR(DF_Fork) { SC_THREAD(process); }
    };
    template <class T>
    SC_MODULE(DF_Printer) {
      sc_fifo_in<T> din;

      void process() {
        for (int i = 0; i < n_iter; i++) {
          T value = din.read();
          cout << name() << " " << value << endl;
        }
        done_ = true; return;  // terminate
      }

      SC_HAS_PROCESS(DF_Printer);
      DF_Printer(...) ... { SC_THREAD(process); }
    };
    int sc_main(int argc, char* argv[]) {
      DF_Const<int> constant("constant", 1);
      DF_Adder<int> adder("adder");
      DF_Fork<int> fork("fork");
      DF_Printer<int> printer("printer", 10);

      sc_fifo<int> const_out("const_out", 5);
      sc_fifo<int> adder_out("adder_out", 1);
      sc_fifo<int> feedback("feedback", 1);
      sc_fifo<int> to_printer("2printer", 1);

      feedback.write(42);  // channel init.
      ...
    }
Port binding

    int sc_main(int argc, char* argv[]) {
      ...
      constant.dout(const_out);
      adder.din1(feedback);
      adder.din2(const_out);
      adder.dout(adder_out);
      fork.din(adder_out);
      fork.dout1(feedback);
      fork.dout2(to_printer);
      printer.din(to_printer);

      sc_start();  // no sim. time limit
      return 0;
    }
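To see what the network above computes, here is a minimal sketch in plain C++ (standard library only, not SystemC) that steps the four nodes by hand. The function name trace_dataflow and the use of std::deque in place of sc_fifo are assumptions made for illustration; FIFO capacities and blocking are ignored, because a single sequential loop never holds more than one token per queue.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical helper (not part of the slides): replays the
// Constant -> Adder -> Fork -> Printer network sequentially and
// returns the values the printer would receive.
std::vector<int> trace_dataflow(int constant, int initial_feedback, int n_iter) {
    assert(n_iter >= 0);
    std::deque<int> const_out, adder_out, feedback, to_printer;
    feedback.push_back(initial_feedback);   // mirrors feedback.write(42)

    std::vector<int> printed;
    while (static_cast<int>(printed.size()) < n_iter) {
        // Constant node: emit one token.
        const_out.push_back(constant);

        // Adder node: consume one token from each input, emit the sum.
        int sum = feedback.front() + const_out.front();
        feedback.pop_front();
        const_out.pop_front();
        adder_out.push_back(sum);

        // Fork node: copy the token to both outputs.
        int value = adder_out.front();
        adder_out.pop_front();
        feedback.push_back(value);
        to_printer.push_back(value);

        // Printer node: consume and record the token.
        printed.push_back(to_printer.front());
        to_printer.pop_front();
    }
    return printed;
}
```

Under these assumptions, trace_dataflow(1, 42, 10) returns the ten values 43 through 52: the initial token 42 on the feedback channel is incremented by the constant 1 on every trip around the loop.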
Computational delay

    template <class T>
    SC_MODULE(DF_Const) {
      sc_fifo_out<T> dout;

      void process() {
        while (1) {
          wait(200, SC_NS);
          dout.write(constant_);
        }
      }
      ...
    };
    template <class T>
    SC_MODULE(DF_Adder) {
      ...
      void process() {
        while (1) {
          T data = din1.read() + din2.read();
          wait(200, SC_NS);
          dout.write(data);
        }
      }
      ...
    };
[Figure: PE1 (B1), PE2 (B2), and PE3 (B3, B4) with channels cv11, cv12, and cv2, plus PE4 (Arbiter); interfaces are numbered 1, 2, 3.]

Abstract channels are implemented in an abstract communication structure.
    class bus_if : virtual public sc_interface {
    public:
      virtual void burst_read(char* data, unsigned addr, unsigned len) = 0;
      virtual void burst_write(char* data, unsigned addr, unsigned len) = 0;
    };

How many cycles would be needed to complete a burst transaction in an RTL model?
    class simple_bus : public bus_if, public sc_channel {
    public:
      simple_bus(sc_module_name nm, unsigned mem_size, sc_time cycle_time)
        : sc_channel(nm), _cycle_time(cycle_time) {...}
      ~simple_bus() {...}

      virtual void burst_read(...) {}
      virtual void burst_write(...) {}

    protected:
      char* _mem;
      sc_time _cycle_time;
      sc_mutex _bus_mutex;  // ensure exclusive access to _mem
    };
    sc_mutex name;

    name.lock();     // lock the mutex
    name.trylock();  // non-blocking lock; returns an int:
                     //   0 for success, -1 otherwise
    name.unlock();   // free a locked mutex
    virtual void burst_read(char* data, unsigned addr, unsigned len)
    {
      _bus_mutex.lock();
      // Block the caller for the data xfer
      // Modeling mem read delay
      wait(len * _cycle_time);
      // xfer data
      memcpy(data, _mem + addr, len);
      _bus_mutex.unlock();
    }
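The timing that burst_read models can be worked out by hand: each burst holds the bus for len * cycle_time, and the mutex forces concurrent callers to go one after another. The plain C++ sketch below (standard library only, not SystemC) does that arithmetic. The helper name completion_times_ns is hypothetical, and it assumes that all bursts are requested at time 0 and are served in the order listed; which waiting caller actually acquires the mutex next is not something this sketch models.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: given burst lengths and a cycle time in ns,
// returns the time at which each burst completes, assuming all bursts
// are requested at time 0 and served strictly in list order.
std::vector<std::uint64_t> completion_times_ns(const std::vector<unsigned>& lens,
                                               std::uint64_t cycle_ns) {
    assert(cycle_ns > 0);
    std::vector<std::uint64_t> done;
    std::uint64_t now = 0;
    for (unsigned len : lens) {
        // Same delay as wait(len * _cycle_time) in the bus model.
        now += static_cast<std::uint64_t>(len) * cycle_ns;
        done.push_back(now);
    }
    return done;
}
```

For example, with a 10 ns cycle, a 4-element burst followed by an 8-element burst completes at 40 ns and 120 ns under these assumptions: the second caller waits for the first to release the bus before its own 80 ns transfer starts.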
    virtual void burst_write(char* data, unsigned addr, unsigned len)
    {
      _bus_mutex.lock();
      // Block the caller for the data xfer
      // Modeling mem write delay
      wait(len * _cycle_time);
      // xfer data
      memcpy(_mem + addr, data, len);
      _bus_mutex.unlock();
    }

Arbitration is not supported. Any idea how to do it?
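One possible direction, offered only as a sketch: collect the pending requests and let an arbiter pick one by fixed priority before the transfer proceeds. The plain C++ below (standard library only, not SystemC) shows just that selection step. The names Request and pick_winner, the convention that a lower number means higher priority, and the -1 return value for "no request" are all assumptions for illustration and do not come from the slides.

```cpp
#include <cassert>
#include <vector>

// Hypothetical request record: which master is asking, and its priority.
struct Request {
    int master_id;  // assumption: non-negative, since -1 means "nobody"
    int priority;   // assumption: lower value = higher priority
};

// Returns the id of the master to be granted the bus, or -1 if there
// are no pending requests. Ties go to the earliest request in the list.
int pick_winner(const std::vector<Request>& pending) {
    if (pending.empty()) return -1;
    const Request* best = &pending.front();
    for (const Request& r : pending) {
        assert(r.master_id >= 0);
        if (r.priority < best->priority) best = &r;
    }
    return best->master_id;
}
```

In a SystemC model, a selection like this could sit in a separate arbiter process that grants the bus once per cycle, with each master blocking until it is granted before its burst continues. Other policies, such as round-robin, would replace only the comparison inside the loop.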
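\[
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{pmatrix}
\cdot
\begin{pmatrix}
b_{1,1} & b_{1,2} & \cdots & b_{1,k} \\
b_{2,1} & b_{2,2} & \cdots & b_{2,k} \\
\vdots  & \vdots  & \ddots & \vdots  \\
b_{n,1} & b_{n,2} & \cdots & b_{n,k}
\end{pmatrix}
=
\begin{pmatrix}
c_{1,1} & c_{1,2} & \cdots & c_{1,k} \\
c_{2,1} & c_{2,2} & \cdots & c_{2,k} \\
\vdots  & \vdots  & \ddots & \vdots  \\
c_{m,1} & c_{m,2} & \cdots & c_{m,k}
\end{pmatrix}
\]

where

\[
c_{i,j} = \sum_{x=1}^{n} a_{i,x}\, b_{x,j}
\]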
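The definition above translates directly into three nested loops. The following is a minimal plain C++ reference sketch (standard library only); the Matrix alias and the function name multiply are assumptions for illustration, and the element type is fixed to int to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Computes C = A * B for an m x n matrix A and an n x k matrix B,
// using c[i][j] = sum over x of a[i][x] * b[x][j].
Matrix multiply(const Matrix& a, const Matrix& b) {
    const std::size_t m = a.size();
    const std::size_t n = b.size();
    const std::size_t k = (n > 0) ? b[0].size() : 0;

    Matrix c(m, std::vector<int>(k, 0));
    for (std::size_t i = 0; i < m; ++i) {
        assert(a[i].size() == n);  // inner dimensions must agree
        for (std::size_t j = 0; j < k; ++j) {
            for (std::size_t x = 0; x < n; ++x) {
                c[i][j] += a[i][x] * b[x][j];
            }
        }
    }
    return c;
}
```

For A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], this yields C = [[19, 22], [43, 50]]; for instance, c_{1,1} = 1*5 + 2*7 = 19.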