Fifth International Conference on Broadband and Wireless Computing, Communication and Applications, Nov. 4, 2010
Advanced Design Issues for OASIS Network-on-Chip Architecture
Kenichi Mori, Adam Esch, Abderazek Ben Abdallah, Kenichi Kuroda
The University of Aizu, Japan
2010/11/4 BWCCA 2010

Contents
• Background
• Original OASIS NoC
  – Architecture
  – Drawback
• Our contribution
• Proposed ONoC mechanism
  – Stall-go flow control methodology
  – ONoC (Optimized NoC) architecture
• Simulation results
• Summary

Background
• A Network-on-Chip (NoC) can solve the problems of bus-based systems
• It is a scalable architectural platform with huge potential to handle growing complexity
• Processing elements are connected via a packet-switched communication network
[Figure: a bus-based system and a Network-on-Chip system, each with processing elements P1 to P6. P: processing element, S: switch]

OASIS NoC: Network
• The original OASIS* NoC has a 4x4 mesh network
• Each router has one processing element
[Figure: OASIS whole network]
* A. Ben Abdallah, M. Sowa, Basic Network-on-Chip Interconnection for Future Gigascale MCSoCs Applications: Communication and Computation Orthogonalization, JASSST2006, Dec. 4-9, 2006.

OASIS NoC: Routing
• The routing algorithm is static XY routing
• The switching method is wormhole
[Figure: OASIS whole network with a source and a destination node; flit structure with routing information]

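The static XY routing named on this slide can be sketched as follows. This is a minimal illustration, not the OASIS router logic itself: the coordinate convention (x grows eastward, y grows northward) and the function names `xy_route` and `xy_path` are assumptions made for the example.

```python
# Hypothetical sketch of static XY routing on a 2D mesh.
# Assumption: x grows toward East and y grows toward North.

def xy_route(cur, dst):
    """Return the output port for a flit at router `cur` heading to `dst`."""
    (cx, cy), (dx, dy) = cur, dst
    # Resolve the X dimension first, then the Y dimension.
    if dx > cx:
        return "East"
    if dx < cx:
        return "West"
    if dy > cy:
        return "North"
    if dy < cy:
        return "South"
    return "Local"  # already at the destination router


def xy_path(src, dst):
    """List the routers a flit visits from `src` to `dst`."""
    step = {"East": (1, 0), "West": (-1, 0), "North": (0, 1), "South": (0, -1)}
    path = [src]
    cur = src
    while cur != dst:
        mx, my = step[xy_route(cur, dst)]
        cur = (cur[0] + mx, cur[1] + my)
        path.append(cur)
    return path
```

Because the X offset is always resolved before the Y offset, every source and destination pair uses a single fixed path, which is what makes the scheme static.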
OASIS NoC: Router design
• One router has three pipeline stages
• First stage (input ports): buffering and routing mechanisms
• Second stage (sw_alloc): scheduling and flow control mechanisms
• Third stage (crossbar): sends each flit to the appropriate next port
[Figure: router block diagram with five Input_port blocks (Local, South, North, West, East), sw_alloc and crossbar. Signals shown: data_in_L/S/N/W/E[379:0], data_out_L/S/N/W/E[379:0], port_req[24:0], cntrl[24:0], tail, drop_in, drop_out]

OASIS NoC drawback
• The original OASIS NoC has an overhead problem
  – A large number of flits are dropped when the network is congested
[Figure: a PE and router sending to a congested router whose buffer is full. Dropped flits cause a large overhead because the node has to send them again]

Our contribution
• The Optimized NoC (ONoC) can overcome the OASIS overhead problem
• To avoid dropped flits, an efficient stall-go (ESG) algorithm is proposed

Efficient stall-go (ESG) algorithm
[Figure: Mealy machine for the ESG algorithm. States: Stop, Go, Sent. Inputs: Nearly_full, Data_sent. Output: Out. Transition labels recoverable from the figure: Nearly_full = 1, Data_sent = 0, Out = 0 (appears twice); Nearly_full = 0, Data_sent = 0, Out = 1; and two further labels with Data_sent = 1, one with Nearly_full = 0 and one with Nearly_full = 1]

ONoC: Architecture
• ESG is implemented between the input port and the scheduler
• ESG receives the nearly full and data sent signals
• If the receiver FIFO is about to become full, stall-go stops the sender from sending flits
[Figure: two neighbouring routers, each with an ESG block and a Scheduler. Signals shown: nearly full, stop, data_sent, block, grant (1 bit each), data_in and data_out (20 bits)]

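The effect of the nearly full signal can be illustrated with a small software model. This is only a sketch under stated assumptions, not the ESG hardware: the function `simulate`, the rule that nearly full is raised when one free slot remains, and the congestion model (the receiver forwards one flit every `drain_every` cycles) are all hypothetical. The cost of re-sending dropped flits is not modelled beyond counting the drops.

```python
from collections import deque

# Hypothetical model of one sender and one receiver FIFO.
# Assumption: nearly_full is raised when at most one slot is free.

def simulate(num_flits, depth, drain_every, stall_go):
    """Return (dropped, cycles) for delivering num_flits through the FIFO."""
    if depth < 2:
        raise ValueError("this sketch needs a FIFO depth of at least 2")
    fifo = deque()
    to_send = num_flits
    delivered = dropped = cycle = 0
    while delivered < num_flits:
        # The sender samples the receiver's flag at the start of the cycle.
        nearly_full = len(fifo) >= depth - 1
        # Congested receiver: forwards one flit every `drain_every` cycles.
        if fifo and cycle % drain_every == 0:
            fifo.popleft()
            delivered += 1
        if to_send > 0:
            if stall_go and nearly_full:
                pass  # stall: hold the flit until the flag clears
            elif len(fifo) < depth:
                fifo.append(cycle)
                to_send -= 1
            else:
                dropped += 1  # flit is lost and has to be sent again
        cycle += 1
    return dropped, cycle
```

Calling `simulate(20, 4, 2, stall_go=True)` and `simulate(20, 4, 2, stall_go=False)` contrasts the two behaviours: with the stall the sender never writes into a full FIFO, while the blind sender loses flits once the FIFO fills.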
ONoC: Router design
[Figure: ONoC router block diagram with the ESG implemented, five Input_port blocks (Local, South, North, West, East), sw_alloc and crossbar. Signals shown: data_in_L/S/N/W/E[19:0], data_out_L/S/N/W/E[19:0], xaddr[2:0], yaddr[2:0], port_req[24:0], cntrl[24:0], sw_req[4:0], stop[4:0], data_sent[4:0], tail_sent[4:0], Nearly_full]

Efficient stall-go achievement
• Flits are sent without overhead
[Figure: a PE and router sending to a congested router whose buffer is full. The sender just stops sending]

Simulation parameters
ONoC parameter configuration:
  Network size        3x3 mesh
  Buffer depth        4, 8, 16 and 32
  Flit size           20 bits (header: 12 bits, payload: 8 bits)
  Forwarding          Wormhole switching
  Scheduling          Round-robin
  Flow control        Stall-go
  Routing             Static X-Y routing
  Target application  JPEG codec
  Target device       Altera Stratix III
  Input data size     120,015 bytes (ratio 200x200)

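The round-robin scheduling listed in the parameters can be sketched as a simple arbiter. The slides do not show the scheduler's internals, so the function name `round_robin_grant` and the port ordering below are assumptions for illustration only.

```python
# Hypothetical round-robin arbiter for the five router ports.
# Assumption: port order is Local, South, North, West, East.

def round_robin_grant(requests, last_grant):
    """Return the index of the next requesting port after `last_grant`.

    `requests` is a list of booleans, one per port. `last_grant` is the
    index granted previously, or -1 if nothing has been granted yet.
    Returns None when no port is requesting.
    """
    n = len(requests)
    # Start searching just after the previous winner and wrap around.
    for offset in range(1, n + 1):
        port = (last_grant + offset) % n
        if requests[port]:
            return port
    return None
```

Starting the search one position after the previous winner gives the most recently served port the lowest priority, so no requesting port can be starved.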
ONoC communication time analysis
• ONoC total communication time is less than that of OASIS at small buffer depths
[Figure: communication time in cycles (axis from 0 to 250,000) versus buffer depth (4, 8, 16, 32) for OASIS and ONoC]

ONoC complexity analysis
  Buffer size  Architecture  Area (ALUTs)   Power (mW)  Speed (MHz)
  4            ONoC          5,485 (5%)     649.17      185.87
  4            OASIS         5,282 (5%)     649.03      207.90
  8            ONoC          8,269 (7%)     660.02      186.60
  8            OASIS         7,890 (7%)     659.31      195.05
  16           ONoC          10,538 (9%)    682.80      161.26
  16           OASIS         10,279 (9%)    681.63      177.43
  32           ONoC          17,416 (15%)   716.87      153.96
  32           OASIS         16,569 (15%)   716.02      172.38
Annotation on the slide: 4.38 % extra hardware

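The relative area cost of ONoC at each buffer depth can be worked out directly from the ALUT counts in the table. The slide does not say how its 4.38 % figure was derived, so the sketch below only computes the per-depth ratios and does not attempt to reproduce that number; the names `area_aluts` and `overhead_percent` are introduced here for illustration.

```python
# ALUT counts taken from the complexity table: depth -> (ONoC, OASIS).
area_aluts = {
    4: (5485, 5282),
    8: (8269, 7890),
    16: (10538, 10279),
    32: (17416, 16569),
}

def overhead_percent(onoc, oasis):
    """Extra area of ONoC relative to OASIS, in percent."""
    return 100.0 * (onoc - oasis) / oasis

overheads = {depth: overhead_percent(*pair) for depth, pair in area_aluts.items()}

for depth, pct in overheads.items():
    print(f"buffer depth {depth:2d}: {pct:.2f} % extra ALUTs")
```

With these counts the per-depth overhead lies between roughly 2.5 % and 5.1 %.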
Summary
• This research presents an optimization technique and the architecture of an Optimized NoC (ONoC)
• ONoC achieves 14.18 % less communication time than OASIS, and its area is only 4.38 % larger than that of OASIS
Ongoing work
• Buffer borrowing algorithm
• Short cut bus

Thank you for listening