Thomas Ilsche (thomas.ilsche@tu-dresden.de), Joseph Schuchart, Robert Schöne, Daniel Hackenberg
Center for Information Services and High Performance Computing (ZIH)
Combining Instrumentation and Sampling for Trace-based Application Performance Analysis
8th International Parallel Tools Workshop, Stuttgart, Germany, October 2, 2014
Based on [10] Juckeland, G.: Trace-based Performance Analysis for Hardware Accelerators. Ph.D. thesis, TU Dresden (2012)
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 33.34      0.02     0.02     7208     0.00     0.00  open
 16.67      0.03     0.01      244     0.04     0.12  offtime
 16.67      0.04     0.01        8     1.25     1.25  memccpy
 16.67      0.05     0.01        7     1.43     1.43  write
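The flat profile above mixes two data sources: gprof takes the call counts from compiler-inserted instrumentation and the times from periodic sampling. A minimal sketch of the sampling arithmetic, using only the rows shown on the slide (the variable and function names are illustrative, not part of gprof):

```python
# Sampling period stated in the profile header:
# "Each sample counts as 0.01 seconds."
SAMPLE_PERIOD_S = 0.01

# (name, self seconds, calls) copied from the flat profile on the slide.
flat_profile = [
    ("open", 0.02, 7208),
    ("offtime", 0.01, 244),
    ("memccpy", 0.01, 8),
    ("write", 0.01, 7),
]


def samples(self_seconds, period=SAMPLE_PERIOD_S):
    """Number of samples that produced a given self time."""
    return round(self_seconds / period)


# Self time is a multiple of the sampling period, so each entry
# corresponds to a whole number of samples.
sample_counts = {name: samples(s) for name, s, _ in flat_profile}

for name, _, calls in flat_profile:
    print(f"{name:8s} {sample_counts[name]} sample(s), {calls} calls")
```

With so few samples per function, the time columns are only a coarse estimate, while the call counts are exact because they come from instrumentation.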
Estimated aggregate size of event trace: 3851MB
Estimated requirements for largest trace buffer (max_buf): 3851MB
Estimated memory requirements (SCOREP_TOTAL_MEMORY): 3860MB
(hint: When tracing set SCOREP_TOTAL_MEMORY=3860MB to avoid intermediate flushes
type     max_buf[B]       visits  time[s]  region
ALL   4,038,048,140  161,849,290   119.61  ALL
USR   4,038,047,650  161,849,275   115.72  USR
OMP             412           12     0.07  OMP
COM              78            3     3.82  COM

USR     365,389,440   14,053,440     3.58  Graph::lcgrand(int)
USR     322,737,636   12,412,986     5.78  std::_List_iterator<int>::operator*() const
USR     208,735,202    8,028,277     3.70  std::_List_iterator<int>::operator++()
USR     201,389,266    7,745,741     3.02  std::_List_iterator<int>::_List_iterator…
USR     200,350,128   12,521,883     6.12  std::_List_iterator<int>::operator!=…
…
USR       1,040,000       40,000     0.01  Graph::Node* std::__addressof…
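The estimate above can be read as simple arithmetic on the per-region rows. The following sketch derives trace bytes per visit, share of the trace buffer, and mean time per visit for the three regions whose names are shown in full. The helper `summarize` and the choice of derived metrics are assumptions for illustration, not part of Score-P.

```python
# Totals copied from the table on the slide.
TOTAL_MAX_BUF = 4_038_048_140  # ALL, bytes
USR_MAX_BUF = 4_038_047_650    # USR group, bytes

# (max_buf in bytes, visits, time in s, region); only rows whose
# region names are not truncated on the slide.
regions = [
    (365_389_440, 14_053_440, 3.58, "Graph::lcgrand(int)"),
    (322_737_636, 12_412_986, 5.78,
     "std::_List_iterator<int>::operator*() const"),
    (208_735_202, 8_028_277, 3.70,
     "std::_List_iterator<int>::operator++()"),
]


def summarize(max_buf, visits, time_s):
    """Derived metrics for one region (hypothetical helper)."""
    return {
        "bytes_per_visit": max_buf / visits,
        "buffer_share": max_buf / TOTAL_MAX_BUF,
        "ns_per_visit": time_s / visits * 1e9,
    }


usr_share = USR_MAX_BUF / TOTAL_MAX_BUF
summaries = {name: summarize(b, v, t) for b, v, t, name in regions}

print(f"USR share of trace buffer: {usr_share:.4%}")
for name, s in summaries.items():
    print(f"{s['buffer_share']:6.2%}  {s['bytes_per_visit']:5.1f} B/visit  "
          f"{s['ns_per_visit']:6.0f} ns/visit  {name}")
```

Regions that are visited millions of times, run for well under a microsecond per visit, and still occupy a noticeable share of the buffer are the ones where full instrumentation is most costly.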