Performance Annotations for Complex Software Systems
Daniele Rogora∗ Antonio Carzaniga∗ Amer Diwan$ Matthias Hauswirth∗ Robert Soulé†
∗USI, Switzerland †Yale University, USA $Google, USA
EuroSys’20
documented complexity:
[Plot: Time (seconds) vs. List size (million)]
actual behavior
concrete metrics
significant statistics
specific characterization, not merely an aggregate profile
run-time, memory allocation, lock-holding time, ...
input parameters, global variables, ... even in nested, structured objects
identified automatically!
std::list<int>::sort.time(this) {
    uint s = *(this->_M_impl._M_node._M_storage._M_storage);
    [s > 49584 && s < 1450341]   Norm(53350.31 - 2.10*s + 0.12*s*log(s), 12463.88);
    [s > 1589482 && s < 2085480] Norm(-90901042.29 + 63.11*s, 899547.29);
    [s > 2098759 && s < 3415880] Norm(56712024.50 + 35.38*s, 3379580.27);
}
[Plot: Time (seconds) vs. List size (million)]
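One way to read the std::list<int>::sort annotation is as a piecewise function: each bracketed guard selects a Norm(mean, stddev) term for a range of list sizes, and sizes that fall between the ranges have no model. The sketch below is only an illustration of that reading. The names Norm and sort_time_annotation are hypothetical, the slide does not state the base of log (base 2 is assumed here), and the time unit is not stated either.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// One Norm(mean, stddev) term of an annotation (hypothetical type name).
struct Norm {
    double mean;
    double stddev;
};

// Hypothetical helper: evaluates the std::list<int>::sort.time annotation
// for a list of size s. Returns std::nullopt when no guard matches.
// ASSUMPTION: log() in the annotation is taken to be log base 2.
std::optional<Norm> sort_time_annotation(double s) {
    assert(s >= 0.0);
    if (s > 49584 && s < 1450341)
        return Norm{53350.31 - 2.10 * s + 0.12 * s * std::log2(s), 12463.88};
    if (s > 1589482 && s < 2085480)
        return Norm{-90901042.29 + 63.11 * s, 899547.29};
    if (s > 2098759 && s < 3415880)
        return Norm{56712024.50 + 35.38 * s, 3379580.27};
    return std::nullopt;  // s lies outside every modeled range
}
```

For example, sort_time_annotation(2000000) selects the second term, while sort_time_annotation(1500000) returns std::nullopt because that size falls in the gap between the first two guards.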
get_func_mm_tree(RANGE_OPT_PARAM *param, Item *pred, Item_func *cond_func, Item *val, bool inv);
[Plot: Time (s) vs. cond_func->arg_count]
get_func_mm_tree.time(cond_func) {
    uint ac = cond_func->arg_count;
    Norm(156569 - 269.041*ac + 0.414447*ac^2, 15781.22);
}
mysql_execute_command(THD *thd, bool first_level);
[Plot: Time (ms) vs. thd.m_query_length; series: dvv=12, dvv=0]
mysql_execute_command.time(thd) {
    uint len = thd->m_query_string.len;
    uint dvv = thd->variables.dynamic_variable_version;
    Norm(168.65 + 4.94*len + 1886.87*dvv, 2489.04);
}
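The mysql_execute_command annotation combines two features in a single Norm term, so the dvv=12 and dvv=0 series differ by a constant offset of 1886.87*12 for the same query length. A minimal sketch of evaluating that term, and of measuring how far an observed time lies from the predicted mean, might look as follows. The names Norm, mysql_execute_command_time, and z_score are hypothetical and not part of the presented tool, and the time unit is whatever unit the annotation uses.

```cpp
#include <cassert>
#include <cmath>

// One Norm(mean, stddev) term of an annotation (hypothetical type name).
struct Norm {
    double mean;
    double stddev;
};

// Hypothetical helper: the Norm term of mysql_execute_command.time for a
// query of length len and a dynamic_variable_version of dvv.
Norm mysql_execute_command_time(double len, double dvv) {
    return Norm{168.65 + 4.94 * len + 1886.87 * dvv, 2489.04};
}

// Distance of an observed time from the predicted mean, in standard deviations.
double z_score(const Norm& n, double observed) {
    assert(n.stddev > 0.0);
    return (observed - n.mean) / n.stddev;
}
```

With len = 16000 and dvv = 12 the predicted mean is 168.65 + 79040 + 22642.44 = 101851.09, and an observation exactly one standard deviation above it has a z-score of 1.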
Find function: entry point for the analysis
Find global variables
Find parameters
VARIABLES: all variables accessible by the target function
Build class graph
CLASS GRAPH
Check possible dynamic types: determine all possible dynamic types for each statically defined variable
Generate info
Find location:
Explore complex types:
Generate code
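The exploration of complex types can be pictured as a walk over the class graph that starts at each variable and collects an access expression for every scalar member it reaches. The sketch below is a hypothetical illustration of that idea only, not the presented implementation: the types Field and ClassGraph, the function collect_scalar_paths, and the type names QueryString and SessionVars are all assumptions, and the sketch ignores the dynamic-type step entirely.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A member of a class: its name, its type name, and whether it is a pointer.
struct Field {
    std::string name;
    std::string type;
    bool is_pointer;
};

// Hypothetical class graph: class name -> its members.
using ClassGraph = std::map<std::string, std::vector<Field>>;

// Collects access expressions for scalar members reachable from expr.
// Types absent from the graph are treated as scalars. max_depth bounds the
// walk so that cyclic type definitions cannot recurse forever.
void collect_scalar_paths(const ClassGraph& graph, const std::string& type,
                          const std::string& expr, bool expr_is_pointer,
                          int max_depth, std::vector<std::string>& out) {
    assert(max_depth >= 0);
    auto it = graph.find(type);
    if (it == graph.end()) {
        out.push_back(expr_is_pointer ? "*(" + expr + ")" : expr);
        return;
    }
    if (max_depth == 0) return;
    for (const Field& f : it->second) {
        std::string member = expr + (expr_is_pointer ? "->" : ".") + f.name;
        collect_scalar_paths(graph, f.type, member, f.is_pointer,
                             max_depth - 1, out);
    }
}

// Toy graph using member names that appear in the annotations; the type
// names QueryString and SessionVars are made up for this example.
ClassGraph example_graph() {
    return ClassGraph{
        {"THD", {{"m_query_string", "QueryString", false},
                 {"variables", "SessionVars", false}}},
        {"QueryString", {{"len", "size_t", false}}},
        {"SessionVars", {{"dynamic_variable_version", "uint", false}}},
    };
}
```

Starting from a parameter thd of type THD*, the walk over this toy graph yields thd->m_query_string.len and thd->variables.dynamic_variable_version, which are the two features used in the mysql_execute_command annotation.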
void __attribute__ ((noinline)) test_quad_int(int t) {
    for (int i = 0; i < t; i++) {
        usleep(t);
    }
}
[Plot: Time (msecs) vs. t]
void __attribute__ ((noinline)) test_linear_branches_one_f(int a, int b, int c) {
    if (a < 10) {
        for (int i = 0; i < 10 - a; i++) {
            usleep(400);
        }
    } else {
        usleep(4000);
        for (int i = 0; i < a - 10; i++)
            usleep(400);
    }
}
[Plot: Time (msecs) vs. a]
void __attribute__ ((noinline)) test_interaction_linear_quad(int a, int b) {
    for (int i = 0; i < a; i++)
        usleep(b*b);
}
[Plot: Time (msecs) vs. a; series: b=19, b=9, b=0]
[Plot: Time (msecs) vs. b; series: a=19, a=9, a=0]
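The shapes a model should recover for these synthetic functions follow directly from the requested sleep time. The sketch below computes that nominal total in microseconds for each test function, ignoring call and scheduling overhead, so measured times will be somewhat higher. The nominal_* function names are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>

// Nominal sleep of test_quad_int: t calls to usleep(t), quadratic in t.
long nominal_quad_int(int t) {
    return t > 0 ? static_cast<long>(t) * t : 0;
}

// Nominal sleep of test_linear_branches_one_f: two linear pieces in a,
// with a jump at a = 10 caused by the extra usleep(4000). b and c are unused.
long nominal_linear_branches_one_f(int a) {
    if (a < 10) return 400L * (10 - a);
    return 4000L + 400L * (a - 10);
}

// Nominal sleep of test_interaction_linear_quad: a calls to usleep(b*b),
// linear in a and quadratic in b.
long nominal_interaction_linear_quad(int a, int b) {
    return a > 0 ? static_cast<long>(a) * b * b : 0;
}
```

For instance, nominal_linear_branches_one_f gives 400 at a = 9 and 4000 at a = 10, which is the discontinuity between the two branches, and nominal_interaction_linear_quad(19, 19) gives 6859 microseconds.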
[Plot: Time (ms) vs. length; series: clock=2.6GHz, clock=1.7GHz, clock=0.8GHz]
ff_h2645_extract_rbsp.time(length, cpu_clock) {
    uint l = length;
    uint clock = cpu_clock;
    Norm(43.32 + 0.055*l - 1.46e-05*clock
}
[Plot: Wait Time (ms) vs. h->param.i_threads; series: det=true, det=false, height=2160, height=240]
[Plot: Wait Time (ms) vs. h->param.i_height; series: det=true, det=false, threads=12, threads=2]
x264_8_encoder_encode.wait_time(h, pic_in) {
    bool sliced = h->param.b_sliced_threads;
    uint height = h->param.i_height;
    uint threads = h->param.i_threads;
    uint dequant = h->thread.dequant4_mf;
    bool det = pic_in->param.b_deterministic;
    [sliced] Norm(-56362 + 189.17*height - 3221.21*threads
    [!sliced] 0.55Norm(108.7, 188.65); 0.30Norm(7282, 51465.24); ...
}
[Plot: Time (ms)]
mysql_execute_command(thd).time{
    uint len = thd->m_query_string.len;
    Norm(6630.19 + 0.86*len, 15.78);
}
[Plot: Time (ms) vs. thd.m_query_length; series: dvv=12, dvv=0]
mysql_execute_command(thd).time{
    uint len = thd->m_query_string.len;
    uint dvv = thd->variables.dynamic_variable_version;
    Norm(168.65 + 4.94*len + 1886.87*dvv, 2489.04);
}
[Call graph: ..., test_quick_select(...), get_mm_tree(...), get_func_mm_tree(...), get_mm_parts(...), tree_and(...), tree_or(...), key_or(...); edge labels: IN,OR/AND; IN,OR/AND; IN; OR/AND; IN; IN; IN; IN]
test_quick_select(THD *thd, Key_map keys_to_use, table_map prev_tables, ha_rows limit, bool force_quick_range, const enum_order interesting_order, const QEP_shared_owner *tab, Item *cond, Key_map *needed_reg, QUICK_SELECT_I **quick, bool ignore_table_scan);
[Plot: Time (s) vs. thd.m_query_string.length; series: vptr <= 562874922, vptr > 562874922]
test_quick_select.time(thd, cond) {
    uint len = thd->m_query_string.len;
    uint vptr = cond->_vptr.Parse_tree_node_tmpl;
    [vptr <= 562874922] Norm(467533 - 50.21*len + 0.0036*len^2, 282711.59);
    [vptr > 562874922]  Norm(-53.603 + 0.057*len, 157.57);
}
[Call graph with per-function plots; edge labels: IN,AND; IN,AND; IN; AND; IN; IN; IN; IN]
...
test_quick_select(...) [Plot: Time (s) vs. thd.m_query_string.length; series: vptr <= 562874922, vptr > 562874922]
get_mm_tree(...) [Plot: Time (s) vs. thd.m_query_string.length; series: vptr <= 562874922, vptr > 562874922]
get_func_mm_tree(...) [Plot: Time (s) vs. cond_func->arg_count]
get_mm_parts(...)
tree_and(...)
tree_or(...)
key_or(...)
key_or(RANGE_OPT_PARAM *param, SEL_ROOT *key1, SEL_ROOT *key2);
[Plot: Time (usecs) vs. key2.elements]
key_or.time(key2) {
    uint e = key2->elements;
    Norm(-0.276 + 0.073*e + 0.062*e*log(e), 2.24);
}