
Giuseppe Bianchi

Lecture 7. TCP mechanisms for:

  • data transfer control / flow control
  • error control
  • congestion control

Graphical examples (Java applets) of several algorithms at: http://www.ce.chalmers.se/~fcela/tcp-tour.html

Data transfer control over TCP

A double-face issue:


TCP pipelining

[Figure: pipelined transmission with window W=6]

$$\mathrm{thr} = \min\left(C,\ \frac{W \cdot MSS}{RTT + MSS/C}\right)$$

Why flow control?

[Figure: sender and receiver]

Window-based flow control

[Figure: receiver buffer, filled with data arriving from IP and emptied by the application process calling read(); the receiver window is marked on the buffer]

Sender constraint: LastByteSent - LastByteAcked <= RcvWindow

[Figure: TCP header: source port, destination port, 32 bit sequence number, 32 bit acknowledgement number, header length, 6 bit reserved, flags (URG, ACK, PSH, RST, SYN, FIN), window size, checksum, urgent pointer]
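The sender-side constraint LastByteSent - LastByteAcked <= RcvWindow can be sketched as a small check. This is a minimal illustration, not an actual TCP implementation; the function names are hypothetical.

```python
def usable_window(last_byte_sent, last_byte_acked, rcv_window):
    """Bytes the sender may still transmit without overrunning the receiver."""
    in_flight = last_byte_sent - last_byte_acked
    # Invariant to preserve: LastByteSent - LastByteAcked <= RcvWindow.
    return max(0, rcv_window - in_flight)


def can_send(segment_len, last_byte_sent, last_byte_acked, rcv_window):
    """True if sending segment_len more bytes keeps the invariant."""
    return segment_len <= usable_window(last_byte_sent, last_byte_acked, rcv_window)
```

For example, with 2048 bytes in flight and an advertised window of 4096 bytes, `usable_window(4095, 2047, 4096)` leaves room for 2048 more bytes.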

What is flow control needed for?

  • Window flow control guarantees that the receiver buffer is able to accept the outstanding segments.
  • When the receiver buffer is full, the receiver just sends back win=0.
  • In essence, flow control guarantees that the transmission bit rate never exceeds the receiver rate on average!
  • Note that the instantaneous transmission rate is arbitrary, and that the receiver rate is discretized (application reads).

Sliding window

Dynamic window-based flow control reduces to a pure sliding window when the receiver app is very fast in reading data…

[Figure: a window of W=3 segments "sliding" forward over sequence numbers 1 to 9 as segments S=4, S=5, S=6, S=7 are sent]

Dynamic window - example

TCP CONN SETUP. Exchanged param: MSS=2K, sender ISN=2047, WIN=4K (carried by receiver SYN-ACK). Rec. buffer: 4K, EMPTY.

  • Sender application does a 2K write: sender sends 2K, seq=2048. Rec. buffer (4K) holds 2K; receiver replies Ack=4096, win=2048.
  • Sender application does a 3K write: sender sends 2K, seq=4096, then is blocked. Rec. buffer (4K) is FULL; receiver replies Ack=6144, win=0.
  • Receiver application does a 2K read: rec. buffer (4K) holds 2K; receiver sends Ack=6144, win=2048.
  • Sender unblocks and may send the last 1K: 1K, seq=6144.
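The receiver side of this exchange can be replayed with a toy model. The sketch below assumes a 4096-byte buffer, in-order delivery and a receiver that advertises exactly its free buffer space; the class and method names are hypothetical.

```python
class Receiver:
    """Toy receiver: fixed-size buffer, advertises the free space as win."""

    def __init__(self, buffer_size, next_expected_seq):
        self.buffer_size = buffer_size
        self.buffered = 0                      # bytes waiting for the application
        self.next_expected = next_expected_seq

    def on_segment(self, seq, length):
        """Accept an in-order segment and return the (ack, win) to send back."""
        assert seq == self.next_expected, "sketch handles in-order data only"
        assert length <= self.buffer_size - self.buffered, "sender overran the window"
        self.buffered += length
        self.next_expected += length
        return self.next_expected, self.buffer_size - self.buffered

    def app_read(self, length):
        """Application reads data; return the window update (ack, win)."""
        self.buffered -= min(length, self.buffered)
        return self.next_expected, self.buffer_size - self.buffered


if __name__ == "__main__":
    rx = Receiver(buffer_size=4096, next_expected_seq=2048)
    print(rx.on_segment(2048, 2048))   # first 2K segment
    print(rx.on_segment(4096, 2048))   # second 2K segment fills the buffer
    print(rx.app_read(2048))           # application frees 2K
    print(rx.on_segment(6144, 1024))   # last 1K of the 3K write
```

The first three calls return the same ack and win values as the exchange above: (4096, 2048), (6144, 0) and (6144, 2048).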

Performance: bounded by receiver buffer size

  • Up to 1992, common operating systems had transmitter & receiver buffers defaulting to 4096 bytes
      • e.g. SunOS 4.1.3
  • Way suboptimal over Ethernet LANs
      • raising the buffer to 16384 bytes = 40% throughput increase (Papadopoulos & Parulkar, 1993)
      • e.g. Solaris 2.2 default
  • Most socket APIs allow apps to set (increase) socket buffer sizes
  • But the theoretical maximum remains W=65535 bytes…
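As one example of a socket API that lets the application change buffer sizes, the sketch below uses the `SO_RCVBUF` and `SO_SNDBUF` options through Python's standard `socket` module. The helper name and the 16384-byte values are illustrative assumptions; the operating system is free to round or clamp the request.

```python
import socket


def make_socket_with_buffers(rcvbuf_bytes=16384, sndbuf_bytes=16384):
    """Create a TCP socket and request specific send/receive buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    return sock


if __name__ == "__main__":
    s = make_socket_with_buffers()
    # Read back the effective value, which may differ from the request.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    s.close()
```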

Maximum achievable throughput (assuming infinite speed line…)

[Figure: throughput (Mbps, from 1 to 1000) versus RTT (ms, up to 500) for W = 65535 bytes]
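On an infinite-speed line the sender can deliver at most one window per round trip, so the bound is W/RTT. A minimal sketch of that arithmetic follows; the function name and the chosen RTT values are assumptions for illustration.

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Window-limited throughput W/RTT in Mbit/s (infinite-speed line)."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


if __name__ == "__main__":
    for rtt_ms in (1, 10, 100, 500):
        mbps = max_throughput_mbps(65535, rtt_ms)
        print(f"RTT {rtt_ms:>3} ms -> {mbps:8.2f} Mbit/s")
```

With W = 65535 bytes and RTT = 100 ms this gives about 5.24 Mbit/s, regardless of how fast the line is.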

Window Scale Option

  • Appears in SYN segment
      • operates only if both peers understand the option
  • Allows client & server to agree on a different W scale
      • specified in terms of bit shift (up to 14)
      • maximum window: 65535 * 2^b
      • b=14 means max W = 1,073,725,440 bytes!!
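The scaled window is simply the 16-bit window field shifted left by the negotiated count. The sketch below checks the arithmetic; the function and constant names are hypothetical.

```python
MAX_BASE_WINDOW = 65535  # largest value of the 16-bit window field
MAX_SHIFT = 14           # largest shift count allowed by the option


def scaled_window(window_field, shift):
    """Effective window in bytes for a 16-bit field and a negotiated shift."""
    if not 0 <= window_field <= MAX_BASE_WINDOW:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count out of range")
    return window_field << shift


if __name__ == "__main__":
    print(scaled_window(65535, 14))  # 1073725440
```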

Blocked sender deadlock problem

  • Rec. buffer (4K) is FULL: the sender is BLOCKED.
  • Application read: the rec. buffer (4K) now holds 2K, and the receiver sends ACK=X, WIN=2K.
  • If that window update is lost, the sender REMAINS BLOCKED FOREVER!!
  • Since the ACK does not carry data, no ack from the sender is expected, so the receiver never notices the loss….

Solution: Persist timer

Interactive applications

Ideal rlogin operation: 4 transmitted segments per 1 byte!!!!!

[Figure: CLIENT (keystroke, display) and SERVER (server echo) exchange four segments: data byte, ack of data byte, echo of data byte, ack of echoed byte]

Interactive apps create some tricky situations….

The silly window syndrome

SCENARIO: a bulk data source sends over a TCP connection to an interactive user who reads one byte at a time; the recv buffer is full.

  • Fill up buffer until win=0
  • 1 byte read: Ack=X, win=1
  • Sender sends 1 byte: Buffer FULL, Ack=X+1, win=0
  • 1 byte read: Ack=X+1, win=1
  • Sender sends 1 byte: Buffer FULL, Ack=X+2, win=0
  • Forever!

Network loaded with tinygrams (40 bytes header + 1 byte payload!!)

Silly window solution

  • Problem discovered by David Clark (MIT), 1982
  • Easily solved by preventing the receiver from sending a window update for 1 byte
  • Rule: send a window update when:
      • the receiver buffer can handle a whole MSS, or
      • half of the receiver buffer has emptied (if smaller than MSS)
  • The sender may also apply the rule
      • by waiting before sending data when win is low
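The receiver-side rule can be sketched as a threshold on the free buffer space: a full MSS, or half the buffer when that is smaller. This is a minimal illustration with hypothetical function names, not the code of any particular TCP stack.

```python
def should_send_window_update(free_bytes, buffer_size, mss):
    """Receiver-side rule: advertise only a worthwhile amount of free space."""
    threshold = min(mss, buffer_size // 2)
    return free_bytes >= threshold


def advertised_window(free_bytes, buffer_size, mss):
    """Window to advertise: keep announcing 0 until enough space is free."""
    if should_send_window_update(free_bytes, buffer_size, mss):
        return free_bytes
    return 0
```

With a 4096-byte buffer and MSS = 1460, a single byte read by the application still yields win=0, so the sender is never invited to send a tinygram.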

Nagle's algorithm (RFC 896, 1984)

[Figure: the sender transmits 1 byte and WAITs for its ack; the 2 bytes typed in the meantime go out together when the ack arrives; the sender WAITs again and then sends 3 bytes]

NAGLE RULE: a TCP connection can have only ONE SMALL outstanding segment

Self-clocking algorithm:

  • on LANs, plenty of tinygrams
  • on slow WANs, data aggregation
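The Nagle rule can be sketched with a toy sender that keeps at most one small segment outstanding and lets the returning ack release whatever has accumulated. This is a simplified model with hypothetical names; in particular it assumes every ack acknowledges the outstanding small segment.

```python
class NagleSender:
    """Toy sender that allows at most one small unacknowledged segment."""

    def __init__(self, mss):
        self.mss = mss
        self.pending = b""            # bytes written by the app, not yet sent
        self.small_in_flight = False  # True while a small segment awaits its ack
        self.sent = []                # segments handed to the network, in order

    def write(self, data):
        self.pending += data
        self._try_send()

    def on_ack(self):
        # The ack "clocks out" whatever accumulated while waiting.
        self.small_in_flight = False
        self._try_send()

    def _try_send(self):
        # Full-sized segments are never delayed.
        while len(self.pending) >= self.mss:
            self.sent.append(self.pending[:self.mss])
            self.pending = self.pending[self.mss:]
        # A small segment goes out only if no other small one is outstanding.
        if self.pending and not self.small_in_flight:
            self.sent.append(self.pending)
            self.pending = b""
            self.small_in_flight = True
```

Writing one byte at a time while acks are slow makes the segments grow (1 byte, then 2, then 3), which is the data aggregation effect on slow WANs; when acks return quickly, almost every byte still travels alone.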

Comments about Nagle's algo

  • Over Ethernet: about 16 ms round trip time
      • Nagle's algo starts operating only when the user types more than 60 characters per second (!!!)
  • Disabling Nagle's algorithm: a feature offered by some TCP APIs
      • set TCP_NODELAY
      • example: mouse movement over X-windows terminal
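As one concrete case of such an API, Python's standard `socket` module exposes the `TCP_NODELAY` option. The helper name below is hypothetical; the option constants are the real ones from the module.

```python
import socket


def disable_nagle(sock):
    """Turn Nagle's algorithm off for this TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    disable_nagle(s)
    # A non-zero value means the option is set.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    s.close()
```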

PUSH flag

Used to notify:

  • the TCP sender to send data
      • but for this a header flag is NOT needed! A "push" type indication in the TCP sender API is sufficient
  • the TCP receiver to pass received data to the application

[Figure: TCP header layout, with the PSH flag among the six flags URG, ACK, PSH, RST, SYN, FIN]

Urgent data

  • URG on: notifies the receiver that "urgent" data is placed in the segment
  • When URG is on, the urgent pointer contains the position of the last byte of urgent data
      • or the one after the last, as some bugged implementations do??
      • and the first? No way to specify it!
  • The receiver is expected to pass all data up to the urgent pointer to the app
      • interpretation of urgent data is left to the app
      • typical usage: Ctrl-C (interrupt) in rlogin & telnet; abort in FTP
  • Urgent data is a second exception to the blocked sender

[Figure: TCP header layout, with the URG flag and the urgent pointer field]

TCP Error control

TCP: a reliable transport

  • TCP is a reliable protocol
      • all data sent are guaranteed to be received
      • very important feature, as IP is an unreliable network layer
  • Employs positive acknowledgement
      • cumulative ack
      • selective ack may be activated when both peers implement it (via an option)
  • Does not employ negative ack
      • error discovery via timeout (retransmission timer)
      • but an "implicit NACK" is available

Error discovery via retransmission timer expiration

[Figure: timeline. Send segment; retransmission timer: waits for acknowledgement; re-send segment]

Fundamental problem: setting the retransmission timer right!

Lost data == lost ack

[Figure: two timelines. In one the DATA segment is lost, in the other its ack is lost; in both cases the retransmission timer expires and the sender retransmits DATA]

Although a lost ack may be discovered via subsequent acks

[Figure: DATA 1 and DATA 2 are sent, each with its own retransmission timer; Ack 1 is lost, but Ack 2 arrives before the timers expire: Data 1 & 2 OK]

Retransmission timer setting

RTO = retransmission TimeOut

  • RTO TOO SHORT: the RTO expires and DATA is retransmitted before the ack of the first copy arrives. Too late!!!!
  • RTO TOO LONG: DATA is lost and the sender waits for the whole RTO before retransmitting. Might have txed!!!!

Retransmission timer setting

Cannot be fixed by protocol! Two reasons:

  • Different network scenarios have very different performance
      • LANs (short RTTs)
      • WANs (long RTTs)
  • The same network has time-varying performance (very fast time scale)
      • when congestion occurs (RTT grows) and disappears (RTT drops)

Adaptive RTT setting

  • Proposed in RFC 793
  • Based on dynamic RTT estimation
      • sender samples the time between sending SEQ and receiving ACK (M)
      • estimates RTT (R) by low pass filtering M (autoregressive, 1 pole)

$$R = \alpha R + (1-\alpha) M, \qquad \alpha = 0.9$$

  • Sets the RTO:

$$RTO = \beta R, \qquad \beta = 2 \ \text{(recommended)}$$
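The RFC 793 style estimator is a one-line filter followed by a multiplication. The sketch below uses α = 0.9 and β = 2 as stated; the function names, the initial estimate and the sample values are hypothetical.

```python
ALPHA = 0.9  # smoothing factor
BETA = 2.0   # recommended multiplier


def update_rtt(r, m, alpha=ALPHA):
    """One low-pass filter step: R = alpha*R + (1 - alpha)*M."""
    return alpha * r + (1.0 - alpha) * m


def rto(r, beta=BETA):
    """Retransmission timeout derived from the smoothed RTT."""
    return beta * r


if __name__ == "__main__":
    r = 100.0  # hypothetical initial estimate, in ms
    for m in (100.0, 120.0, 300.0, 110.0):  # hypothetical samples, in ms
        r = update_rtt(r, m)
        print(f"M={m:6.1f}  R={r:7.2f}  RTO={rto(r):7.2f}")
```

Because α is close to 1, a single large sample moves R only slightly, which is why this estimator reacts slowly to sudden RTT changes.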

Problem: constant value β=2

SCENARIO 1: lightly loaded long-distance communication

  • Propagation = 100
  • Queueing = mean 5, most in range 0-10
  • RTO = 2 x measured_RTT ~ 2 x 105 = 210: TOO LARGE!

SCENARIO 2: mildly loaded short-distance communication

  • Propagation = 1
  • Queueing = mean 10, most in range 0-50
  • RTO = 2 x measured_RTT ~ 2 x 11 = 22: WAY TOO SMALL!

Problem: constant value β=2

SCENARIO 3: slow speed links

[Figure: transmission delay (ms) versus packet size (bytes) on a 28.8 Kbps link]

Natural variation of packet sizes causes a large variation in RTT!

(From RFC 1122: utilization on a 9.6 kbps link can improve from 10% up to 90% with the adoption of the Jacobson algorithm.)

Jacobson RTO (1988)

Idea: make it depend on measured variance!

$$
\begin{aligned}
Err &= M - A \\
A &:= A + g \cdot Err \\
D &:= D + h \cdot (|Err| - D) \\
RTO &= A + 4D
\end{aligned}
$$

  • M = measured RTT sample; A = smoothed RTT estimate
  • g = gain (1/8)
      • conceptually equivalent to 1-α, but set to a slightly different value
  • D = mean deviation
      • conceptually similar to standard deviation, but cheaper (does not require a square root computation)
  • h = 1/4
  • Jacobson's implementation: based on integer arithmetic (very efficient)
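The update rules Err = M - A, A := A + g·Err, D := D + h·(|Err| - D), RTO = A + 4D translate almost line by line into code. The sketch below uses floating point for readability, so it is not Jacobson's integer-arithmetic implementation; the function name is hypothetical.

```python
G = 1 / 8  # gain for the average A
H = 1 / 4  # gain for the mean deviation D


def jacobson_update(a, d, m, g=G, h=H):
    """Apply one RTT sample M and return the new (A, D, RTO)."""
    err = m - a
    a = a + g * err
    d = d + h * (abs(err) - d)
    return a, d, a + 4 * d


if __name__ == "__main__":
    # Hypothetical state A=100, D=10 and a sample M=140 (same time unit).
    print(jacobson_update(100.0, 10.0, 140.0))  # (105.0, 17.5, 175.0)
```

A sample far from the average raises D quickly, so the RTO widens when the RTT becomes variable and tightens again when it is stable.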

Guessing right? Karn's problem

[Figure: two scenarios in which DATA is retransmitted after the RTO expires and an ack then arrives; it is ambiguous whether the sample M should be measured from the original transmission or from the retransmission (M?)]

Solution to Karn's problem


When the RTO reaches 64 secs, stay there; persist up to 9 minutes, then reset the connection.
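Karn's algorithm is usually stated as two rules: discard RTT samples taken on retransmitted segments, and double the RTO at every expiry. The sketch below follows that common statement together with the 64-second cap; the class and method names are hypothetical, and giving up after 9 minutes is left out.

```python
MAX_RTO_S = 64.0  # cap on the backed-off RTO, in seconds


class KarnTimer:
    """Toy retransmission timer for a single outstanding segment."""

    def __init__(self, initial_rto_s):
        self.rto_s = initial_rto_s
        self.retransmitted = False

    def on_timeout(self):
        """RTO expired: retransmit and back off exponentially, capped at 64 s."""
        self.retransmitted = True
        self.rto_s = min(self.rto_s * 2, MAX_RTO_S)
        return self.rto_s

    def on_ack(self, measured_rtt_s):
        """Return the RTT sample to feed the estimator, or None if ambiguous."""
        sample = None if self.retransmitted else measured_rtt_s
        self.retransmitted = False
        return sample
```

An ack that follows a retransmission yields no sample, because it cannot be told which copy of the segment it acknowledges.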

Need for implicit NACKs

[Figure: sender timeline in which one segment is lost, later segments keep being sent, and the lost segment is retransmitted only when the RTO expires]


The Fast Retransmit Algorithm

[Figure: Seq=50 is received and acknowledged with ack=100; Seq=100 is lost; each later segment (Seq=150, …) triggers another ack=100; on the fourth ack=100, i.e. the third duplicate, the sender performs Fast Retransmit (FR) of Seq=100 without waiting for the RTO]
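The duplicate-ack counting behind Fast Retransmit can be sketched as follows. The threshold of three duplicates matches the figure, where the fourth ack=100 triggers FR; the class name and the return convention are hypothetical.

```python
DUP_ACK_THRESHOLD = 3  # duplicates needed to trigger a fast retransmit


class FastRetransmitDetector:
    """Counts duplicate acks and signals when to retransmit early."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the sequence number to retransmit now, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack  # resend the segment starting at the acked byte
        else:
            self.last_ack = ack
            self.dup_count = 0
        return None


if __name__ == "__main__":
    detector = FastRetransmitDetector()
    print([detector.on_ack(100) for _ in range(4)])
```

Waiting for three duplicates rather than one keeps the sender from reacting to simple reordering of segments in the network.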