1
Charging from Sampled Network Usage
Nick Duffield, Carsten Lund, Mikkel Thorup
AT&T Labs-Research, Florham Park, NJ
2
Do Charging and Sampling Mix?
• Usage-sensitive charging
    charge based on sampled network usage
• Is sampling necessary?
    just count all packets/bytes in network?
    measure and export all traffic flow stats?
• Is sampled usage reliable enough?
    risk of overcharging or undercharging
3
• Compare charging on port size
    coarse granularity: OC3 ⇒ OC12 ⇒ OC48 ⇒ OC192
• Implicit resource management
    price disincentive to greedy use
• Differentiated services
    will require differentiated charges
4
• Mirror pricing policy in router configuration?
    separate counter for each billable packet stream
• Scaling/dimensionality issues
    potentially many determinants to pricing
        – ToS, application type, source/dest IP address, …
    routers must support large number of counters
• Configuration issues
    change pricing policy ⇒ reconfigure counters
        – administrative cost
5
[Figure: flow 1, flow 2, flow 3, flow 4]
• IP flow abstraction
    set of packets identified with “same” address, ports, etc.
    packets that are “close” together in time
    possible protocol-based flow demarcation
        – e.g. terminate on TCP FIN
• IP flow summaries
    reports of measured flows from routers
        – flow identifiers, total packets/bytes, router state
• Several flow definitions in commercial use
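The flow abstraction above can be sketched in Python as follows. This is a minimal illustration, not any router's actual implementation: the `Packet` fields, the `summarize_flows` name, and the inactivity `timeout` value are assumptions made for the example, and commercial flow definitions differ in detail.

```python
from collections import namedtuple

# Hypothetical packet record: arrival time, flow key (addresses, ports, etc.),
# size in bytes, and whether the packet carries a TCP FIN.
Packet = namedtuple("Packet", "time key size fin")

def summarize_flows(packets, timeout):
    """Group time-ordered packets into flows and return their summaries.

    A flow is a run of packets with the same key whose gaps do not exceed
    `timeout`; a TCP FIN also ends the flow. Each summary is
    (key, first_time, last_time, packet_count, byte_count).
    """
    open_flows = {}  # key -> [first_time, last_time, packet_count, byte_count]
    summaries = []
    for p in packets:
        flow = open_flows.get(p.key)
        if flow is not None and p.time - flow[1] > timeout:
            summaries.append((p.key, *flow))  # idle too long: export the old flow
            flow = None
        if flow is None:
            flow = [p.time, p.time, 0, 0]
            open_flows[p.key] = flow
        flow[1] = p.time
        flow[2] += 1
        flow[3] += p.size
        if p.fin:
            summaries.append((p.key, *flow))  # protocol-based demarcation
            del open_flows[p.key]
    for key, flow in open_flows.items():      # flush flows still open at the end
        summaries.append((key, *flow))
    return summaries

packets = [
    Packet(0, "A", 100, False),
    Packet(1, "A", 200, False),
    Packet(2, "B", 10, True),
    Packet(100, "A", 50, False),
]
summaries = summarize_flows(packets, timeout=30)
```

Here key "A" yields two flows because of the long gap before its last packet, and key "B" yields one flow closed by its FIN.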
6
• Measure traffic flows as they occur
    export flow summaries to billing system
• Flow volumes
• Cost
    network resources for transmission
    storage/processing at billing system
7
• Sampling
    statistician's reflex action to large datasets
• Export selected flows
    reduce transmission/storage/processing costs
• Sufficiently accurate for pricing?
    risk of overcharging (⇒ irate customers)
    risk of undercharging (⇒ irate shareholders)
8
• Packet sampling
    when router can't form flows at line rate
        – scaling at a single router
• Flow sampling
    managing volume of flow statistics
        – scaling across downstream measurement infrastructure
• Complementary
    could combine
        – e.g. 1 in N packet sampling + flow sampling
9
• Each flow i has
    “size” $x_i$
        – bytes or packets
    “color” $c_i$
        – combination of IP address, port, ToS etc. that maps to billable stream (= customer + billing class)
• Goal
    to estimate total usage X(c) in each color c
    $X(c) = \sum_{i : c_i = c} x_i$
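As an illustration of the definition of X(c), the sketch below totals flow sizes per color. The flow records, the address-to-customer table, and the `color_of` mapping are hypothetical; a real mapping would follow the provider's billing classes.

```python
from collections import defaultdict

def usage_by_color(flows, color_of):
    """Return {color: total bytes}, i.e. X(c) summed over flows i with c_i = c."""
    usage = defaultdict(int)
    for flow in flows:
        usage[color_of(flow)] += flow["bytes"]
    return dict(usage)

# Hypothetical data: source address identifies the customer, ToS the billing class.
customers = {"10.0.0.1": "alice", "10.0.0.2": "bob"}
flows = [
    {"src": "10.0.0.1", "tos": 0, "bytes": 500},
    {"src": "10.0.0.1", "tos": 0, "bytes": 1500},
    {"src": "10.0.0.2", "tos": 8, "bytes": 700},
]
usage = usage_by_color(flows, lambda f: (customers[f["src"]], f["tos"]))
```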
10
• Match sampling method to flow characteristics
    high fraction of traffic found in small fraction of long flows
        – sample long flows more frequently than short flows
            ▪ large contributions to usage more reliably estimated
• Manage sampling error through charging scheme
    make charging insensitive to small usage
        – sampling error for small usage not reflected in charge to user
• Trade-off
    allow small consistent undercount to reduce risk of overcharge
• Show how to relate sampling and charging parameters
    simple rules to achieve desired accuracy
11
• Sample 1 in N flows
    estimate total bytes by N times sampled bytes
• Problem:
    long flow lengths
        – estimate sensitive to inclusion or omission of a single long flow
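A small worked example of the problem above, assuming periodic 1 in N selection (every N-th flow, starting from some phase). The flow sizes are made up for illustration.

```python
def one_in_n_estimates(sizes, n):
    """Return the estimate of total bytes for each of the n possible sampling phases.

    Phase k keeps flows k, k+n, k+2n, ... and scales their bytes by n.
    """
    return [n * sum(sizes[phase::n]) for phase in range(n)]

sizes = [1_000] * 9 + [1_000_000]   # nine short flows and one long flow
actual = sum(sizes)                 # 1_009_000 bytes
estimates = one_in_n_estimates(sizes, 10)
```

Nine of the ten phases miss the long flow and estimate 10,000 bytes; the one phase that catches it estimates 10,000,000. The average over phases equals the actual 1,009,000 bytes, but no single estimate is close to it.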
12
• Sample flow summary of size x with prob. p(x)
• Estimate usage X by
    boost up size x by factor 1/p(x) in estimate X'
        – compensate for chance of not being sampled
    $X' = \sum_{\text{flows sampled}} x / p(x)$
• Choose p(x) to be increasing in x
    longer flows more likely to be sampled
    compare size-independent sampling: p(x) = 1/N
13
• Fixed set of flow sizes $\{x_1, x_2, \ldots, x_n\}$
    we only consider randomness of sampling
• X' is unbiased estimator of actual usage $X = \sum_i x_i$
    $E[X'] = X$: averaging over all possible samplings
    holds for all probability functions p(x)
• Proof:
    $X' = \sum_i w_i x_i / p(x_i)$
        – $w_i$ random variable
            ▪ $w_i = 1$ with prob. $p(x_i)$, 0 otherwise
        – $E[w_i] = p(x_i)$, hence $E[X'] = E[\sum_i w_i x_i / p(x_i)] = \sum_i x_i = X$
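The proof can be checked numerically by enumerating every sampling outcome for a small set of flows. The sketch below assumes the sampling decisions are independent and uses exact fractions; the flow sizes and the two probability functions are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def expected_estimate(sizes, p):
    """Average X' over every possible sampling outcome, weighted by its probability."""
    expectation = Fraction(0)
    for outcome in product([0, 1], repeat=len(sizes)):  # outcome[i] plays the role of w_i
        probability = Fraction(1)
        estimate = Fraction(0)
        for w, x in zip(outcome, sizes):
            probability *= p(x) if w else 1 - p(x)
            estimate += w * Fraction(x) / p(x)
        expectation += probability * estimate
    return expectation

sizes = [10, 200, 5000]
size_dependent = lambda x: min(Fraction(1), Fraction(x, 1000))  # increasing in x
one_in_seven = lambda x: Fraction(1, 7)                         # size-independent
```

Both choices of p(x) return exactly the actual usage, 5210, as the proof predicts for any p(x) that is positive on every flow size.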
14
• Trade-off accuracy vs. number of samples
• Express trade-off through cost function
    cost = variance(X') + $z^2$ × average number of samples
        – parameter z: relative importance of variance vs. # samples
• Which choice of p(x) minimizes cost?
• $p_z(x) = \min\{1, x/z\}$
    flows with size ≥ z: always selected
    flows with size < z: selected with probability x/z
• Trade-off
    smaller z
        – more samples, lower variance
    larger z
        – fewer samples, higher variance
• Will call sampling with $p_z(x)$ “optimal”
[Figure: $p_z(x)$ plotted against x; axis marks at 1 and z]
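The cost function above can be evaluated directly for a given set of flows. The sketch assumes independent sampling decisions, so that each flow contributes x²(1 − p(x))/p(x) to the variance of X'; the flow sizes and the alternative probability functions are illustrative only.

```python
def variance(sizes, p):
    """Variance of X' when flow x is sampled independently with probability p(x)."""
    return sum(x * x * (1 - p(x)) / p(x) for x in sizes)

def expected_samples(sizes, p):
    """Average number of flows selected."""
    return sum(p(x) for x in sizes)

def cost(sizes, p, z):
    """cost = variance(X') + z^2 * average number of samples."""
    return variance(sizes, p) + z * z * expected_samples(sizes, p)

def p_threshold(z):
    """The probability function p_z(x) = min{1, x/z}."""
    return lambda x: min(1.0, x / z)

sizes = [10, 200, 5000]
z = 1000
optimal_cost = cost(sizes, p_threshold(z), z)
alternatives = [p_threshold(500), p_threshold(2000), lambda x: 0.5]
alternative_costs = [cost(sizes, p, z) for p in alternatives]
```

For these sizes the cost under p_z is about 1,379,900, lower than under each alternative. Because the cost is a sum of per-flow terms, minimizing each term separately gives p(x) = x/z, capped at 1.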
15
• Nearly as simple as 1 in N sampling
        – no random number generators

    sample(x) {
        static count = 0        // running size of unselected small flows
        if (x >= z) {
            select_flow         // flows of size >= z are always selected
        } else {
            count += x
            if (count >= z) {
                count = count - z
                select_flow
            }
        }
    }
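A runnable Python rendering of the counter-based sampler above, together with the usage estimate it produces. Two assumptions are made here: thresholds are compared with ≥ so that flows of size exactly z are always selected, in line with p_z, and each selected flow is credited with x / p_z(x) = max(x, z). The function names and flow sizes are illustrative.

```python
def make_sampler(z):
    """Return a sample(x) function that keeps its own running counter."""
    count = 0
    def sample(x):
        nonlocal count
        if x >= z:
            return True          # large flows are always selected
        count += x
        if count >= z:
            count -= z           # carry the remainder forward
            return True
        return False
    return sample

def estimate_usage(sizes, z):
    """Sum max(x, z) over the selected flows."""
    sample = make_sampler(z)
    return sum(max(x, z) for x in sizes if sample(x))

sizes = [300] * 10 + [5000]
estimate = estimate_usage(sizes, 1000)
```

Here three of the ten small flows are selected and credited with 1000 each, so the estimate matches the actual 8000. In general the small-flow total is undercounted by whatever remains in the counter, which is less than z.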
16
• Resampling to progressively thin flow summaries
• Finer resampling ($z_1 \le z_2 \le z_3$) preserves statistics
    final flow stream at billing system has same statistical properties as would original stream sampled once with $z_3$
[Figure: Router → Aggregation Server → Billing System]
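One way to see why resampling preserves the statistics is to check the selection probabilities exactly. The sketch assumes that each stage forwards a selected flow with its renormalized size max(x, z), which is x / p_z(x); this detail is not stated on the slide and is an assumption of the example.

```python
from fractions import Fraction

def p_z(x, z):
    """Selection probability min{1, x/z}, as an exact fraction."""
    return min(Fraction(1), Fraction(x, z))

def two_stage_probability(x, z1, z2):
    """Chance that a flow of size x survives sampling at z1, then resampling at z2."""
    renormalized = max(x, z1)  # size reported by the first stage (assumed)
    return p_z(x, z1) * p_z(renormalized, z2)

z1, z2 = 100, 1000
all_match = all(
    two_stage_probability(x, z1, z2) == p_z(x, z2)
    for x in range(1, 5000, 37)
)
```

With z1 ≤ z2, the two-stage probability equals p_z2(x) for every size tested, and the final renormalized size max(max(x, z1), z2) equals max(x, z2), the size a single pass with z2 would report.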
17
• NetFlow traces
• Color flows
• Compare
        – same average sampling rate
• Measure of accuracy
    $\sum_c |X'(c) - X(c)| / \sum_c X(c)$
• Heavy-tailed flow size distribution is our friend!
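The accuracy measure above can be computed as in the sketch below. The function name and the per-color usage figures are made up for illustration; they are not results from the NetFlow traces.

```python
def weighted_relative_error(estimated, actual):
    """Sum over colors of |X'(c) - X(c)|, divided by the sum over colors of X(c)."""
    total = sum(actual.values())
    return sum(abs(estimated.get(c, 0) - x) for c, x in actual.items()) / total

actual = {"a": 100, "b": 300}
estimated = {"a": 110, "b": 280}
error = weighted_relative_error(estimated, actual)  # (10 + 20) / 400
```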
18
• Optimal sampling
    no sampling error for flows larger than z
• Exploit in charging scheme
    fixed charge for small usage
    usage-sensitive charge only for usage above insensitivity level L
• Charge according to estimated usage
    f(X'(c)) = a + b max{L, X'(c)}
        – coefficients a, b and level L could depend on color c
• Only usage above L needs reliable estimation
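A short sketch of the charging function f above. The coefficient values are arbitrary and chosen only to make the arithmetic easy to follow.

```python
def charge(estimated_usage, a, b, level):
    """f(X') = a + b * max{L, X'}: flat below the insensitivity level, linear above it."""
    return a + b * max(level, estimated_usage)

# Hypothetical tariff: a = 10, b = 2 per unit of usage, insensitivity level L = 100.
below = charge(50, a=10, b=2, level=100)
at_level = charge(100, a=10, b=2, level=100)
above = charge(150, a=10, b=2, level=100)
```

Usage of 50 and usage of 100 both cost 210, so estimation error below L never reaches the bill; usage of 150 costs 310.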
19
• Given target accuracy
    relate sampling threshold z to level L
• Theorem
    Variance(X') ≤ z X (tight bound)
    now assume: $z \le \varepsilon^2 L$
        – Std.Dev. X' ≤ ε X if X ≥ L
            ▪ bound sampling error of estimated usage > L
        – Std.Dev. f(X') ≤ ε f(X)
            ▪ bound error of charge based on estimated usage
• Bounds hold for any flow sizes $\{x_i\}$
    no assumption on flow size distribution
        – just choose $z \le \varepsilon^2 L$
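The bounds can be checked on a concrete set of flows. Under p_z with independent sampling decisions (an assumption of this sketch), a flow of size x < z contributes x(z − x) to the variance and larger flows contribute nothing, so the variance is at most zX. The flow sizes below are invented; L, ε and z follow the relation on the slide.

```python
import math

def threshold_variance(sizes, z):
    """Var(X') under p_z: x * (z - x) for each flow with x < z, zero otherwise."""
    return sum(x * (z - x) for x in sizes if x < z)

level = 10**7              # insensitivity level L
epsilon = 0.1              # target relative error
z = 10**5                  # sampling threshold, z = epsilon^2 * L
sizes = [50_000] * 300 + [2_000_000]
actual = sum(sizes)        # 17_000_000, above L
var = threshold_variance(sizes, z)
std_dev = math.sqrt(var)
```

For these sizes the standard deviation is about 866,000, well inside the bound εX = 1,700,000.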
20
• Target parameters
    $L = 10^7$, $\varepsilon = 10\%$ ⇒ $z = \varepsilon^2 L = 10^5$
• Scatter plot
    ratio estimated/actual usage vs. actual usage
        – each color c
    estimation of higher usage
• Want to avoid
    ratio > $1 + \varepsilon = 1.1$ and usage > $L = 10^7$
• Less than 1 in 1000 “bad” points
21
• Aim:
    reduce chance of overestimating usage
• Method:
    theorem gave bound: Var(X') ≤ z X
    anticipate upwards variations in X' by subtracting off multiples of std. dev.
        – charge according to $X'_s = X' - s\sqrt{zX'}$
    again: no assumptions on flow size distribution
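A sketch of the variance compensation step, using the formula X'_s = X' − s√(zX'). The function name and input values are illustrative.

```python
import math

def compensated_usage(estimated_usage, z, s):
    """X'_s = X' - s * sqrt(z * X').

    The result can be negative when X' is small relative to s^2 * z; under the
    charge a + b * max{L, X'_s} that case is simply billed at the level L.
    """
    return estimated_usage - s * math.sqrt(z * estimated_usage)

one_sigma = compensated_usage(1e7, z=1e5, s=1)
two_sigma = compensated_usage(1e7, z=1e5, s=2)
```

With X' = 10^7 and z = 10^5, the bound on the standard deviation is 10^6, so one multiple lowers the billed usage to 9 × 10^6 and two multiples lower it to 8 × 10^6.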
22
• Scatter pushed down:
    no points with ratio > 1.1 and usage > $10^7$
• Drawback
    more unbillable usage
        – when $X'_s < X$
• Small unbillable usage for heavy users
    ratio → 1
    Std.Dev.(X')/X' vanishes as X grows
23
• Scatter pushed down further:
    no points with ratio > 1
• Trade-off
    unbillable usage vs. chance that $X'_s > X$

    X'_s > X?    unbill. bytes    s
    3%           3.1%             1
    0%           6.2%             2
    50%
24
• Make sampling more accurate
    reduce z!
• For unbillable fraction < η
    choose $s z \le \eta^2 L$
• Example:
    s = 2, η = 10%
    reduce z
        – from $10^5$ to $10^4$
• Alternative
    increase coefficient a in charge f(X) to cover costs
25
• Want to reduce z
    better accuracy, less unbillable usage
• Drawback
    increased sample volume
• Solution
    make billing period longer instead
        – usage roughly proportional to billing period
        – allows increased charge insensitivity level L
    sample production rate controlled by threshold z
        – rate $r \sum_x f(x) p_z(x)$
            ▪ flow arrival rate r, fraction f(x) of flows size x
• Need only $z = \varepsilon^2 L$
    larger L allows smaller error ε for given z
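The two relations above can be put side by side: the sample production rate depends only on z, while the error for a given z shrinks as L grows. The flow size distribution and arrival rate below are invented for illustration.

```python
import math

def sample_rate(arrival_rate, size_fractions, z):
    """r * sum over x of f(x) * p_z(x), with p_z(x) = min{1, x/z}."""
    return arrival_rate * sum(f * min(1.0, x / z) for x, f in size_fractions.items())

def relative_error(z, level):
    """Error epsilon obtained from z = epsilon^2 * L."""
    return math.sqrt(z / level)

# Hypothetical mix: 90% of flows are 100 bytes, 9% are 10 kB, 1% are 1 MB.
size_fractions = {100: 0.9, 10_000: 0.09, 1_000_000: 0.01}
rate_z5 = sample_rate(1000, size_fractions, z=100_000)  # about 19.9 samples/s
rate_z4 = sample_rate(1000, size_fractions, z=10_000)   # about 109 samples/s
error_short = relative_error(10**5, 10**7)              # about 0.10
error_long = relative_error(10**5, 4 * 10**7)           # about 0.05
```

Lowering z from 10^5 to 10^4 raises the sample rate more than fivefold in this example, whereas quadrupling L (for instance through a four times longer billing period) halves the error at the original sample rate.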
26
• Size-dependent optimal sampling
    preferentially sample large flows
        – more accurate usage estimates for given sample volume
        – sample flow of size x with probability $p_z(x)$
• Charging from measured usage X'
    charge f(X') = a + b max{L, X'}
        – fixed charge for usage below insensitivity level L
        – only need to reliably estimate usage above L
• Sampling/charging accuracy
    choose $z = \varepsilon^2 L$ to get standard error ε
• Variance compensation
    replace X' by $X'_s = X' - s\sqrt{zX'}$
• Longer billing cycle
    increases L, better accuracy (ε) at given sampling rate (z)
27
• Dynamic control of sample volume
    aim:
        – bound sample rate when arrival rate r varies
    method:
        – dynamic adjustment of sampling threshold z