
Mailing List Archive: nsp: juniper

duplicate acks, EX3300 VC

 

 



mike.williams at comodo

May 17, 2012, 9:16 AM

Post #1 of 3
duplicate acks, EX3300 VC

Hey all,

Before I punt this to JTAC, has anyone had any experience with
poor/highly-variable TCP throughput from a small stack of EX3300s?

We've got a stack of 3, one 48 port, and two 24 ports, and since they went in
we can't get reliable TCP transfers transatlantic.
Linux-Linux can go really fast, but involve Windows and we get a pitiful
~100KBps, regardless of tuning done.
Junos is 11.4R2.14.

It's taken us *forever* to home in on the issue possibly being the EXs,
because who'd have thought a switch couldn't handle packets at a few 10s of
megabytes per second (10-20k PPS x 3).

To cut a looooooooong story short:
<internet><srx650><ex3300><linux firewall><same ex3300><server>
Linux firewall sees the 2 initial TCP packets correctly, but the server
generally only gets the second one, or if it gets the first it's after the
second. Then we're into a bazillion duplicate acks, out-of-order packets, and
TCP retransmissions.
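
To illustrate why a swapped pair of segments is enough to set off duplicate acks, here is a minimal sketch of a receiver tracking the next byte it expects. The function name `classify_segments` and the `(seq, length)` tuple format are made up for illustration, and the model ignores SACK, sequence-number wraparound and partially overlapping segments, so treat it as a toy rather than a description of any real TCP stack.

```python
def classify_segments(segments, initial_seq=0):
    """Label each (seq, length) segment the way a simple receiver would see it.

    Hypothetical helper: ignores SACK, sequence-number wraparound and
    partially overlapping segments.
    """
    expected = initial_seq      # next in-order byte the receiver wants
    buffered = {}               # out-of-order segments held back: seq -> length
    labels = []
    for seq, length in segments:
        if seq == expected:
            labels.append("in-order")
            expected += length
            # Drain any buffered segments that are now contiguous.
            while expected in buffered:
                expected += buffered.pop(expected)
        elif seq > expected:
            # A gap: the receiver answers with a duplicate ack for `expected`.
            labels.append("out-of-order (dup ack sent)")
            buffered[seq] = length
        else:
            # Data already received: typically answered with a duplicate ack too.
            labels.append("duplicate (dup ack sent)")
    return labels, expected


# The second packet arriving before the first, as described above.
labels, next_expected = classify_segments([(1460, 1460), (0, 1460)])
print(labels, next_expected)
```

With the two segments swapped, the first arrival is labelled out-of-order and the second fills the gap, so the receiver ends up expecting byte 2920 but has already sent one duplicate ack along the way.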

I found the 'show system statistics tcp' command a short while ago and it's,
well, "interesting".


> show system statistics tcp
fpc0:
--------------------------------------------------------------------------
Tcp:
    84769061 packets sent
        16676437 data packets (2039615568 bytes)
        1416 data packets retransmitted (1526176 bytes)
        0 resends initiated by MTU discovery
        67141526 ack only packets (23539653 packets delayed)
        0 URG only packets
        0 window probe packets
        22 window update packets
        3468634 control packets
    125994683 packets received
        15916504 acks(for 2039560634 bytes)
        82630576 duplicate acks
        0 acks for unsent data
        25574925 packets received in-sequence(3702132560 bytes)
        43149892 completely duplicate packets(5824 bytes)
        10 old duplicate packets
        5 packets with some duplicate data(2140 bytes duped)
        0 out-of-order packets(0 bytes)
        0 packets of data after window(0 bytes)
        0 window probes
        24585 window update packets
        23 packets received after close
        0 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short


fpc1 and fpc2 have similar numbers, even though these packets have no need to
leave fpc0. There aren't even any active servers off fpc1/2 yet.
fpc0 has been up 33 days, so has seen almost 30 duplicate acks per second
since it booted.
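
The "almost 30 per second" figure can be checked with a few lines of arithmetic. This sketch only reuses the fpc0 counters and the 33-day uptime quoted in this post; the helper name `per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def per_second(count, uptime_days):
    """Average events per second over the given uptime."""
    return count / (uptime_days * SECONDS_PER_DAY)


duplicate_acks = 82_630_576      # "duplicate acks" counter from fpc0
packets_received = 125_994_683   # "packets received" counter from fpc0
uptime_days = 33                 # fpc0 uptime quoted in the post

rate = per_second(duplicate_acks, uptime_days)
share = duplicate_acks / packets_received

print(f"{rate:.1f} duplicate acks per second")
print(f"{share:.0%} of received TCP packets were duplicate acks")
```

The average works out to just under 29 per second, and duplicate acks make up roughly two thirds of the TCP packets this counter set has seen.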

--
Mike Williams
_______________________________________________
juniper-nsp mailing list juniper-nsp [at] puck
https://puck.nether.net/mailman/listinfo/juniper-nsp


p.mayers at imperial

May 17, 2012, 10:30 AM

Post #2 of 3
Re: duplicate acks, EX3300 VC

On 17/05/12 17:16, Mike Williams wrote:
> Hey all,
>
> Before I punt this to JTAC, has anyone had any experience with
> poor/highly-variable TCP throughput from a small stack of EX3300s?

This is *through* the switch, yes? Not *to* it?

> We've got a stack of 3, one 48 port, and two 24 ports, and since they went in
> we can't get reliable TCP transfers transatlantic.
> Linux-Linux can go really fast, but involve Windows and we get a pitiful
> ~100KBps, regardless of tuning done.
> Junos is 11.4R2.14.

I have no experience of that platform (or indeed any Juniper switch) but
this sounds awfully like packet drops due to small buffers.

Linux has a whole bunch of pluggable/selectable TCP congestion control
algorithms, and the defaults are, usually, much better behaved in the
face of packet loss than those on Windows, which could explain why you
see different behaviour with different OSes.
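
For anyone wanting to check which algorithm a Linux host is actually using, here is a small sketch that reads the relevant sysctl files. The file names `tcp_congestion_control` and `tcp_available_congestion_control` under `/proc/sys/net/ipv4` are the standard Linux ones; the helper `read_tcp_setting` and its `base` parameter are made up for illustration, and the function simply returns `None` on systems where the files do not exist.

```python
from pathlib import Path


def read_tcp_setting(name, base="/proc/sys/net/ipv4"):
    """Return the whitespace-split contents of a TCP sysctl file, or None.

    `base` is a parameter only so the function can be pointed at another
    directory; on Linux the real files live under /proc/sys/net/ipv4.
    """
    try:
        return (Path(base) / name).read_text().split()
    except OSError:
        return None   # not Linux, or the file is not readable


if __name__ == "__main__":
    print("in use:   ", read_tcp_setting("tcp_congestion_control"))
    print("available:", read_tcp_setting("tcp_available_congestion_control"))
```

Running the same check on both ends of a slow transfer is a cheap way to rule the congestion control choice in or out before blaming the switch.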

>
> It's taken us *forever* to home in on the issue possibly being the EXs,
> because who'd have thought a switch couldn't handle packets at a few 10s of
> megabytes per second (10-20k PPS x 3).

That is (presumably) the bulk/aggregate throughput. The instantaneous
throughput might be (a lot) higher, depending on the TCP window size,
the inter-packet spacing, whether TCP segmentation offload is in use,
and so forth.

If the switch has small buffers (which cheap switches often do) then an
instantaneous burst to line rate, combined with traffic to/from other
ports, can cause drops. These drops can KILL TCP performance without
adequate TCP stack tuning, and a decent congestion control algorithm.
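
A back-of-the-envelope sketch of the burst argument: if a window's worth of data arrives back-to-back at line rate while the egress port can only drain part of that rate, the difference has to sit in the switch buffer. Every number below (a 1 MB burst, a 1 Gb/s ingress, 600 Mb/s left over on the egress port) is an assumption picked for illustration, not a measurement from this thread or a property of the EX3300, and the function names are made up.

```python
def peak_queue_bytes(burst_bytes, ingress_bps, egress_bps):
    """Peak egress-queue depth for one back-to-back burst.

    The burst arrives at `ingress_bps` while the port drains at
    `egress_bps`; whatever cannot drain during the arrival must be buffered.
    """
    if egress_bps >= ingress_bps:
        return 0.0
    return burst_bytes * (1 - egress_bps / ingress_bps)


def burst_duration_ms(burst_bytes, line_rate_bps):
    """Time for the burst to arrive at line rate, in milliseconds."""
    return burst_bytes * 8 / line_rate_bps * 1000


# Assumed values, for illustration only.
burst = 1_000_000        # one TCP window sent back-to-back, in bytes
ingress = 1e9            # 1 Gb/s ingress port
egress_free = 6e8        # egress capacity left after other traffic

print(f"burst lasts {burst_duration_ms(burst, ingress):.1f} ms at line rate")
print(f"peak queue  {peak_queue_bytes(burst, ingress, egress_free):.0f} bytes")
```

The point is that the average rate says nothing about this peak: the same transfer can average a few tens of megabytes per second while still asking the switch to hold hundreds of kilobytes for a few milliseconds.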

>
> To cut a looooooooong story short;
> <internet><srx650><ex3300><linux firewall><same ex3300><server>
> Linux firewall sees the 2 initial TCP packets correctly, but the server
> generally only gets the second one, or if it gets the first it's after the
> second. Then we're into a bazillion duplicate acks, out-of-order packets, and
> TCP retransmissions.

Roughly how many dropped packets are you seeing, as a ratio?

Out-of-order packets are a bit odd; are you doing something peculiar like
per-packet load balancing?
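
The loss ratio matters because, for Reno-style congestion control, the simplified Mathis et al. model bounds throughput at roughly (MSS / RTT) * C / sqrt(p), with C = sqrt(3/2). The sketch below inverts that to ask what loss ratio would by itself explain ~100KBps. The MSS of 1460 bytes and the 80 ms round trip are assumptions (neither is stated in the thread), "~100KBps" is read as 100,000 bytes per second, and the model ignores window limits and timeouts, so this is an order-of-magnitude guide at best.

```python
import math

MATHIS_C = math.sqrt(3 / 2)   # constant from the simplified Mathis et al. model


def mathis_throughput(mss_bytes, rtt_s, loss_ratio):
    """Approximate steady-state throughput in bytes/s for Reno-style TCP."""
    return (mss_bytes / rtt_s) * MATHIS_C / math.sqrt(loss_ratio)


def loss_for_throughput(mss_bytes, rtt_s, throughput_bytes_per_s):
    """Invert the model: loss ratio that would cap TCP at this throughput."""
    return (MATHIS_C * mss_bytes / (rtt_s * throughput_bytes_per_s)) ** 2


mss = 1460          # assumed Ethernet MSS, not stated in the thread
rtt = 0.080         # assumed 80 ms transatlantic round trip
observed = 100_000  # "~100KBps" from the post, read as bytes per second

p = loss_for_throughput(mss, rtt, observed)
print(f"implied loss ratio: {p:.3f}")
```

Under these assumptions the implied loss is on the order of 5%, which is high enough that it should be visible in a packet capture on either side of the switch.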

>
> I found the 'show system statistics tcp' command a short while ago and it's,
> well, "interesting".
>
>
>> show system statistics tcp
> fpc0:
> --------------------------------------------------------------------------
> Tcp:
> 84769061 packets sent
> 16676437 data packets (2039615568 bytes)
> 1416 data packets retransmitted (1526176 bytes)

Are you sure this command shows what you think it does?

This looks awfully like statistics for the local operating system i.e.
the TCP stack on the switch, used to handle telnet/SSH/other management.

Gathering these kinds of stats for *forwarded* traffic would imply the
switch is doing TCP header inspection (unlikely), as you need to track TCP
connection state.


chris at nmedia

May 17, 2012, 10:37 AM

Post #3 of 3
Re: duplicate acks, EX3300 VC

Those TCP statistics have nothing to do with traffic passing through the switch ports; they cover traffic to the control plane (FreeBSD).


--
Keep them laughing half the time, scared of you the other half. And always keep them guessing. -- Clair George