Mailing List Archive: Cisco: NSP

Flow Control and 10GE interfaces

 

 

matt at melbourne

Nov 22, 2009, 12:28 PM

Post #1 of 16
Flow Control and 10GE interfaces

Hi,

What is the general recommendation on enabling flow control on
Ethernet interfaces? Is it a legacy feature from when devices had smaller
buffers, or is it still required for specific applications? We are having
issues with an Enterprise NAS solution: servers using it for storage report
losing connectivity. The NAS is connected to the switch fabric (a pair
of Catalyst 6509s) by two 2*10GE port-channels (10GBase-SR optics). Receive
flow control is enabled on the switch side ("flowcontrol receive on"), but
the member interface statistics show no pause frames being received or
sent.

The vendor is now suggesting that flowcontrol needs to be enabled end-to-end
- e.g. on aggregation switches downstream from the Catalyst 6509s towards
the servers and on the hosts. However, the utilisation on the NAS
port-channels is only ~400Mbps. Does enabling flowcontrol make sense here?

Cheers,

Matt

--
Matthew Melbourne

_______________________________________________
cisco-nsp mailing list cisco-nsp [at] puck
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


ross at kallisti

Nov 23, 2009, 5:41 AM

Post #2 of 16
Re: Flow Control and 10GE interfaces

On Sun, Nov 22, 2009 at 08:28:24PM -0000, Matthew Melbourne wrote:
> What is the general recommendation regarding enabling flow control on
> Ethernet interfaces. Is it a legacy issue when devices had smaller buffers,
> or is it still required for specific applications? We are having issues with
> an Enterprise NAS solution where servers using it for storage are claiming
> to be losing connectivity. The NAS is connected to the switch fabric (a pair
> of Catalyst 6509s) by two 2*10GE port-channels (10GBase-SR optics); receive
> flow control is enabled on the switch side "flowcontrol receive on", but no
> input or output pause frames are being received/sent according to the member
> interface statistics.
>
> The vendor is now suggesting that flowcontrol needs to be enabled end-to-end
> - e.g. on aggregation switches downstream from the Catalyst 6509s towards
> the servers and on the hosts. However, the utilisation on the NAS
> port-channels is only ~400Mbps. Does enabling flowcontrol make sense here?

Storage vendors seem to blame a plethora of issues on disabled
Ethernet Flow Control. Every discussion that I've ever had with any
of them, every document that I've ever read, totally fails to
understand what ethernet flow control does and how it works. No one
is even aware of the head of line blocking problem. Remember - when
you pause your NAS, you pause it for EVERYONE. Maybe I've talked to
the wrong folks, but no one seems to understand this. It's almost
like EMC thinks they designed their NAS for a single client...

The answer is very simple: if someone thinks that ethernet flow
control is the answer, the burden of proof is on them to answer
difficult questions about what the actual problem is, what flow
control is going to solve, and why they think that it won't cause more
problems than it's worth. At best it does nothing, realistically it
interferes with TCP flow control, and at worst it pauses your storage
and breaks every client.

--
Ross Vandegrift
ross [at] kallisti

"If the fight gets hot, the songs get hotter. If the going gets tough,
the songs get tougher."
--Woody Guthrie


gert at greenie

Nov 23, 2009, 7:48 AM

Post #3 of 16
Re: Flow Control and 10GE interfaces

Hi,

On Mon, Nov 23, 2009 at 08:41:58AM -0500, Ross Vandegrift wrote:
> The answer is very simple: if someone thinks that ethernet flow
> control is the answer, the burden of proof is on them to answer
> difficult questions about what the actual problem is, what flow
> control is going to solve, and why they think that it won't cause more
> problems than it's worth. At best it does nothing, realistically it
> interferes with TCP flow control, and at worst it pauses your storage
> and breaks every client.

I tend to disagree with this statement when put so broadly. We've seen
problems where lack of flow control combined with a switch with too-tiny
buffers and bursty ingress traffic led to buffer overflow on egress, and
packet loss. If the switch used flow control here to space the
ingress traffic better (that is: stop and restart the flow for milliseconds
at a time), packet loss would be avoidable.

Of course, this can indeed backfire - as in: one egress port is
way overloaded, and flow control spreads the pain from there to all other
egress ports served by the ingress port in question.

So indeed, flow control is not a panacea. I agree with this :-)
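
As a rough check on the "milliseconds at a time" figure above: an 802.3x
PAUSE frame carries a 16-bit pause_time counted in quanta of 512 bit times,
so the longest pause one frame can request depends on link speed. The sketch
below just does that arithmetic; the function and constant names are made up
for illustration. At 10GE it comes out at a little over 3 ms per frame.

```python
# Sketch: how long can a single IEEE 802.3x PAUSE frame stop a sender?
# The pause_time field is 16 bits wide and counts quanta of 512 bit times.
# Names below are illustrative, not taken from any vendor tool.

PAUSE_QUANTUM_BITS = 512    # one quantum = 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF   # largest value of the 16-bit pause_time field


def pause_duration_s(quanta: int, link_bps: float) -> float:
    """Return the requested pause length in seconds."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause_time must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


if __name__ == "__main__":
    for name, bps in (("1GE", 1e9), ("10GE", 10e9)):
        ms = pause_duration_s(MAX_PAUSE_QUANTA, bps) * 1e3
        print(f"{name}: longest pause per frame = {ms:.2f} ms")
```

A receiver that needs a longer stop has to keep refreshing the pause, and a
pause_time of zero cancels an earlier one.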

gert
--
USENET is *not* the non-clickable part of WWW!
//www.muc.de/~gert/
Gert Doering - Munich, Germany gert [at] greenie
fax: +49-89-35655025 gert [at] net


p.mayers at imperial

Nov 23, 2009, 8:05 AM

Post #4 of 16
Re: Flow Control and 10GE interfaces

Gert Doering wrote:
> Hi,
>
> On Mon, Nov 23, 2009 at 08:41:58AM -0500, Ross Vandegrift wrote:
>> The answer is very simple: if someone thinks that ethernet flow
>> control is the answer, the burden of proof is on them to answer
>> difficult questions about what the actual problem is, what flow
>> control is going to solve, and why they think that it won't cause more
>> problems than it's worth. At best it does nothing, realistically it
>> interferes with TCP flow control, and at worst it pauses your storage
>> and breaks every client.
>
> I tend to disagree with this statement in this broadness. We've seen
> problems where lack of flow control combined with a switch with too-tiny
> buffers and bursty ingress traffic led to buffer overflow on egress, and
> packet loss. If the switch would use flow control here to space the
> ingress traffic better (that is: stop and restart the flow for milliseconds
> at a time), packet loss would be avoidable.
>
> Of course, this can indeed fire backwards - as in: one egress port is
> way overloaded, and flow control spreads the pain from there to all other
> egress ports served by the ingress port in question.
>
> So indeed, flow control is not a panacea. I agree with this :-)

An interesting wrinkle (to some) is that stock flow control is not QoS
(i.e. 802.1p codepoint) aware - it's all-or-nothing, meaning your
low-bandwidth diffserv/EF flow gets paused as well as your less-than
best-effort 999.9mbit/sec FTP transfer :o(

There's a flow control extension somewhere for per-802.1p flow control,
but I can't find the references for this.

QoS seems to have gone out of fashion however, so whether this is
relevant is another matter ;o)


gert at greenie

Nov 23, 2009, 8:31 AM

Post #5 of 16
Re: Flow Control and 10GE interfaces

Hi,

On Mon, Nov 23, 2009 at 04:05:16PM +0000, Phil Mayers wrote:
> >So indeed, flow control is not a panacea. I agree with this :-)
>
> An interesting wrinkle (to some) is that stock flow control is not QoS
> (i.e. 802.1p codepoint) aware - it's all-or-nothing, meaning your
> low-bandwidth diffserv/EF flow gets paused as well as your less-than
> best-effort 999.9mbit/sec FTP transfer :o(

Oh. Even better point. So yes, flow control definitely needs to be
activated with care.

"Big buffers" is it, then :-) - plus "big pipes!".

gert


Pawel_Sikora at netia

Nov 23, 2009, 8:48 AM

Post #6 of 16
Re: Flow Control and 10GE interfaces

Phil Mayers wrote:

>An interesting wrinkle (to some) is that stock flow control is not QoS
>(i.e. 802.1p codepoint) aware - it's all-or-nothing, meaning your
>low-bandwidth diffserv/EF flow gets paused as well as your less-than
>best-effort 999.9mbit/sec FTP transfer :o(

>There's a flow control extension somewhere for per-802.1p flow control,
>but I can't find the references for this.

Not such a distant idea; the Nexus 5000 supports it, AFAIK:

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-542809.html


Pawel/


b.turnbow at twt

Nov 23, 2009, 9:50 AM

Post #7 of 16
Re: Flow Control and 10GE interfaces

-----Original Message-----
From: cisco-nsp-bounces [at] puck [mailto:cisco-nsp-bounces [at] puck] On Behalf Of Phil Mayers
Sent: Monday, 23 November 2009 17:05
To: Gert Doering
Cc: Matthew Melbourne; cisco-nsp [at] puck; Ross Vandegrift
Subject: Re: [c-nsp] Flow Control and 10GE interfaces

Gert Doering wrote:
> Hi,
>
> On Mon, Nov 23, 2009 at 08:41:58AM -0500, Ross Vandegrift wrote:
>> The answer is very simple: if someone thinks that ethernet flow
>> control is the answer, the burden of proof is on them to answer
>> difficult questions about what the actual problem is, what flow
>> control is going to solve, and why they think that it won't cause more
>> problems than it's worth. At best it does nothing, realistically it
>> interferes with TCP flow control, and at worst it pauses your storage
>> and breaks every client.
>
> I tend to disagree with this statement in this broadness. We've seen
> problems where lack of flow control combined with a switch with too-tiny
> buffers and bursty ingress traffic led to buffer overflow on egress, and
> packet loss. If the switch would use flow control here to space the
> ingress traffic better (that is: stop and restart the flow for milliseconds
> at a time), packet loss would be avoidable.
>
> Of course, this can indeed fire backwards - as in: one egress port is
> way overloaded, and flow control spreads the pain from there to all other
> egress ports served by the ingress port in question.
>
> So indeed, flow control is not a panacea. I agree with this :-)

>An interesting wrinkle (to some) is that stock flow control is not QoS
>(i.e. 802.1p codepoint) aware - it's all-or-nothing, meaning your
>low-bandwidth diffserv/EF flow gets paused as well as your less-than
>best-effort 999.9mbit/sec FTP transfer :o(

>There's a flow control extension somewhere for per-802.1p flow control,
>but I can't find the references for this.

The Nexus family does PFC (no, it's not a card; they reused the acronym):
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-542809.html
Basically, it enables sending a pause per class.
They did it for FCoE and it is proprietary; the white paper has the standard mumbo jumbo about
how it is becoming a standard and everyone is adopting Cisco's proposal.


Brian

>QoS seems to have gone out of fashion however, so whether this is
>relevant is another matter ;o)


kgraham at industrial-marshmallow

Nov 23, 2009, 11:40 AM

Post #8 of 16
Re: Flow Control and 10GE interfaces

> The answer is very simple: if someone thinks that ethernet flow
> control is the answer, the burden of proof is on them to answer
> difficult questions about what the actual problem is, what flow
> control is going to solve, and why they think that it won't cause more
> problems than it's worth. At best it does nothing, realistically it
> interferes with TCP flow control, and at worst it pauses your storage
> and breaks every client.

My understanding of this must be broken... If the pause frame is
only sent when or immediately before RX buffers are exhausted, then
TX queuing is triggered (hopefully only briefly before those buffers
are exhausted). This would seem to trigger behavior consistent w/ a
congested interface (which in fact it is, just prior to reaching line
rate, as the receiver can't take it off interface buffers fast enough).

Short of host-side implementation details such as one slow MSI-X queue
starving others, isn't this providing exactly the congestion feedback
that would be expected (queue-on-congestion, drop when queue
exceeded)?



gert at greenie

Nov 23, 2009, 1:28 PM

Post #9 of 16
Re: Flow Control and 10GE interfaces

Hi,

On Mon, Nov 23, 2009 at 11:40:17AM -0800, Kevin Graham wrote:
> Short of host-side implementation details such as one slow MSI-X queue
> starving others, isn't this providing exactly the congestion feedback
> that would be expected (queue-on-congestion, drop when queue
> exceeded)?

so you have one ingress port ("the NAS"), 20 egress ports ("the clients").

Egress port 1 fills up.

What are you going to do? Flow-control (-> slow down 19 other ports)
or drop?

(For "3 ingress ports, 1 egress port, bursty traffic that doesn't
exceed the egress port speed *on average*", the answer is different)
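
The first scenario above (one ingress port, 20 egress ports, one of them
congested) can be sketched as a toy tick-based model. Everything here is an
assumption for illustration: the queue depth, the 10x slower client, and the
idea that the switch pauses the ingress port the moment any egress queue is
full. It is not how any particular switch implements 802.3x; it only shows
the head-of-line effect being argued about.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    delivered: List[int]   # frames delivered per client
    dropped: List[int]     # frames dropped per client


def simulate(use_pause: bool, ticks: int = 1000, clients: int = 20,
             queue_cap: int = 8, slow_period: int = 10) -> Result:
    """One NAS port feeds `clients` egress ports, one frame each per tick.

    Client 0 is the congested port: it drains one frame every
    `slow_period` ticks, the others drain one frame every tick.
    """
    period = [slow_period] + [1] * (clients - 1)
    queue = [0] * clients
    delivered = [0] * clients
    dropped = [0] * clients

    for t in range(ticks):
        # With pause enabled, a full egress queue stops the NAS for everyone.
        paused = use_pause and any(q >= queue_cap for q in queue)
        if not paused:
            for i in range(clients):
                if queue[i] < queue_cap:
                    queue[i] += 1
                else:
                    dropped[i] += 1   # tail drop on the full queue only
        for i in range(clients):
            if t % period[i] == 0 and queue[i] > 0:
                queue[i] -= 1
                delivered[i] += 1
    return Result(delivered, dropped)


if __name__ == "__main__":
    for label, use_pause in (("drop", False), ("pause", True)):
        r = simulate(use_pause)
        print(f"{label:5s} slow client: {r.delivered[0]:4d} delivered, "
              f"{r.dropped[0]:4d} dropped; "
              f"each fast client: {r.delivered[1]:4d} delivered")
```

In this model, dropping costs only the congested client, while pausing
avoids every drop but pulls the 19 healthy clients down to roughly the slow
client's rate.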

gert



nick at inex

Nov 23, 2009, 1:54 PM

Post #10 of 16
Re: Flow Control and 10GE interfaces

On 23/11/2009 21:28, Gert Doering wrote:
> What are you going to do? Flow-control (-> slow down 19 other ports)
> or drop?

The answer to this depends on the application. If you're running regular
IP then yes, drop a few packets. No-one will care too much.

FCoE is a different matter and dropped packets are an extremely serious
problem, and in this case it would seem more useful to exercise flow
control and stuff up the transfers of a bunch of other clients in order not
to drop anything for the one.

FCoE is not IP. Don't confuse them.

Nick


kgraham at industrial-marshmallow

Nov 23, 2009, 1:57 PM

Post #11 of 16
Re: Flow Control and 10GE interfaces

> so you have one ingress port ("the NAS"), 20 egress ports ("the clients").
>
> Egress port 1 fills up.
>
> What are you going to do? Flow-control (-> slow down 19 other ports)
> or drop?

Agreed, egress queuing and "flowcontrol send" seems logically flawed, but
the NAS case I see cited is "flowcontrol receive" on the switch side.
In this case, the egress port pauses, backs up, and further traffic to it
drops -- there's no reason I can see for this to have any impact on other
ports.

In an edge-device (NAS, server, whatever) it seems far more likely that
the -host- is what needs the pause (flowcontrol receive), not the switch
(flowcontrol send).


David at hughes

Nov 23, 2009, 2:30 PM

Post #12 of 16
Re: Flow Control and 10GE interfaces

On 24/11/2009, at 3:50 AM, Brian Turnbow wrote:

> The nexus family does PFC (no it's not a card, they reused the acronym)
> http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-542809.html
> Basically enables sending a pause per class.
> They did it for FCOE and it is proprietary , the white paper has the standard mumbo jumbo about
> how it is becoming a standard and everyone is adopting Cisco's proposal..



That info is a little dated. Sure, in the "Datacentre Ethernet" days, when Cisco were out there alone doing this stuff, it was proprietary. That's no longer the case. It all comes under the CEE banner (Converged Enhanced Ethernet) and is being formalised by the IEEE as Data Centre Bridging. In particular, you'd be interested in the following standards:

802.1Qbb (priority based flow control)
802.1Qau (congestion notification)

See http://www.ieee802.org/1/pages/dcbridges.html for all the gory details.
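
To make the difference between classic pause and priority-based flow control
concrete, here is a minimal sketch that builds both MAC Control frames as raw
bytes. The layout (destination 01-80-C2-00-00-01, EtherType 0x8808, opcode
0x0001 for PAUSE and 0x0101 for PFC, one timer versus an enable vector plus
eight per-priority timers) follows the published frame formats; the helper
names and the example priority are hypothetical, and nothing here actually
transmits a frame.

```python
import struct
from typing import Mapping

MAC_CONTROL_DST = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast
MAC_CONTROL_ETHERTYPE = 0x8808
OPCODE_PAUSE = 0x0001    # classic 802.3x PAUSE: one timer for the whole link
OPCODE_PFC = 0x0101      # priority-based flow control: one timer per priority
MIN_FRAME_NO_FCS = 60    # minimum Ethernet frame length, excluding the FCS


def _control_frame(src_mac: bytes, opcode: int, params: bytes) -> bytes:
    """Assemble header + opcode + parameters, zero-padded to minimum size."""
    if len(src_mac) != 6:
        raise ValueError("source MAC must be 6 bytes")
    header = MAC_CONTROL_DST + src_mac + struct.pack(
        "!HH", MAC_CONTROL_ETHERTYPE, opcode)
    return (header + params).ljust(MIN_FRAME_NO_FCS, b"\x00")


def pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Classic PAUSE: stops all traffic on the link for `quanta` quanta."""
    return _control_frame(src_mac, OPCODE_PAUSE, struct.pack("!H", quanta))


def pfc_frame(src_mac: bytes, quanta_by_priority: Mapping[int, int]) -> bytes:
    """PFC: pauses only the listed 802.1p priorities (0-7)."""
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in quanta_by_priority.items():
        if not 0 <= prio <= 7:
            raise ValueError("priority must be 0..7")
        enable_vector |= 1 << prio
        timers[prio] = quanta
    return _control_frame(
        src_mac, OPCODE_PFC, struct.pack("!H8H", enable_vector, *timers))


if __name__ == "__main__":
    src = bytes.fromhex("020000000001")          # made-up local MAC
    print(pause_frame(src, 0xFFFF).hex())
    print(pfc_frame(src, {3: 0xFFFF}).hex())     # pause priority 3 only
```

The classic frame has nowhere to say which traffic class should stop, which
is the all-or-nothing behaviour complained about earlier in the thread.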

But, in Cisco kit, only Nexus does it. There are other vendors (Brocade / Foundry, for example) that can support it, and even folk like BNT are making noises. Could be light at the end of the tunnel.


David
...


md at bts

Nov 24, 2009, 12:00 AM

Post #13 of 16
Re: Flow Control and 10GE interfaces

On Mon, 23 Nov 2009 11:40:17 -0800 (PST), Kevin Graham wrote
> > The answer is very simple: if someone thinks that ethernet flow
> > control is the answer, the burden of proof is on them to answer
> > difficult questions about what the actual problem is, what flow
> > control is going to solve, and why they think that it won't cause more
> > problems than it's worth. At best it does nothing, realistically it
> > interferes with TCP flow control, and at worst it pauses your storage
> > and breaks every client.
>
> My understanding of this must be broken... If the pause frame is
> only sent when or immediately before RX buffers are exhausted, then
> TX queuing is triggered (hopefully only briefly before those buffers
> are exhausted). This would seem to trigger behavior consistent w/ a
> congested interface (which in fact it is, just prior to reaching line
> rate, as the receiver can't take it off interface buffers fast enough).

Yes, what you described is basically a case where the interface runs at a faster
speed than the data path behind it.

Some examples: an oversubscribed 10GE card with only 8 Gbps of bandwidth to the switch
fabric or system bus, or a 100 Mbps ethernet interface in front of a 34 Mbps microwave
link.

This is exactly the *only* situation, where classic flow control makes sense and
does really help, since it properly triggers output queueing at the sending side
when the real data-path speed is reached. Any other usage is likely to cause
more problems than benefits.

With kind regards,

M.



ross at kallisti

Nov 24, 2009, 8:24 AM

Post #14 of 16
Re: Flow Control and 10GE interfaces

On Tue, Nov 24, 2009 at 09:00:51AM +0100, Marian Ďurkovič wrote:
> On Mon, 23 Nov 2009 11:40:17 -0800 (PST), Kevin Graham wrote
> > My understanding of this must be broken... If the pause frame is
> > only sent when or immediately before RX buffers are exhausted, then
> > TX queuing is triggered (hopefully only briefly before those buffers
> > are exhausted). This would seem to trigger behavior consistent w/ a
> > congested interface (which in fact it is, just prior to reaching line
> > rate, as the receiver can't take it off interface buffers fast enough).
>
> Yes, what you described is basically a case where the interface runs at faster
> speed than the data path behind it.
>
> Some examples: oversubscribed 10GE card with only 8 Gbps bandwidth to the switch
> fabric or system bus, 100 Mbps ethernet interface in front of 34 Mbps microwave
> link.
>
> This is exactly the *only* situation, where classic flow control makes sense and
> does really help, since it properly triggers output queueing at the sending side
> when the real data-path speed is reached. Any other usage is likely to cause
> more problems than benefits.

But in these cases you're saturated! So why not just drop the frame and
let the upper-layer figure out that it needs to back off? You're just
delaying the inevitable by invoking flow control and hiding the
information from the upper layer.

--
Ross Vandegrift


md at bts

Nov 24, 2009, 9:02 AM

Post #15 of 16
Re: Flow Control and 10GE interfaces

On Tue, Nov 24, 2009 at 11:24:26AM -0500, Ross Vandegrift wrote:
> > Yes, what you described is basically a case where the interface runs at faster
> > speed than the data path behind it.
> >
> > Some examples: oversubscribed 10GE card with only 8 Gbps bandwidth to the switch
> > fabric or system bus, 100 Mbps ethernet interface in front of 34 Mbps microwave
> > link.
> >
> > This is exactly the *only* situation, where classic flow control makes sense and
> > does really help, since it properly triggers output queueing at the sending side
> > when the real data-path speed is reached. Any other usage is likely to cause
> > more problems than benefits.
>
> But in these cases you're saturated! So why not just drop the frame and
> let the upper-layer figure out that it needs to back off? You're just
> delaying the inevitable by invoking flow control and hiding the
> information from the upper layer.

Not exactly. By using flow control you're in fact signalling the real
data-path speed to the sender, i.e. the sender knows it is talking to, say,
a "34 Mbps ethernet" interface and not to a wire-speed 100 Mbps one.
It can use this information to apply QoS properly or to smooth microbursts
using its output buffers. For instance, you'll hardly get IPTV
working over such 34 Mbps microwave links without flowcontrol enabled.
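
The microburst argument above can be illustrated with a small model: a
sender with a deep output queue bursts at line rate into a radio that can
accept 3 frames per tick but forward only 1 (roughly the 100:34 ratio), with
a shallow buffer on the radio. All numbers, and the assumption that a pause
takes effect instantly with no frames in flight, are invented for the
sketch.

```python
from typing import Tuple


def simulate(flow_control: bool, ticks: int = 400, burst: int = 30,
             burst_every: int = 40, line_rate: int = 3,
             radio_cap: int = 10) -> Tuple[int, int]:
    """Return (delivered, dropped) frame counts.

    The sender receives `burst` frames every `burst_every` ticks (average
    load below the radio's 1 frame/tick) and transmits up to `line_rate`
    frames per tick. The radio buffers at most `radio_cap` frames.
    """
    sender_q = radio_q = delivered = dropped = 0
    for t in range(ticks):
        if t % burst_every == 0:
            sender_q += burst
        # The radio pauses the sender when it lacks room for a full
        # line-rate tick; the frames then wait in the sender's own queue.
        paused = flow_control and radio_q > radio_cap - line_rate
        sending = 0 if paused else min(line_rate, sender_q)
        sender_q -= sending
        accepted = min(sending, radio_cap - radio_q)
        radio_q += accepted
        dropped += sending - accepted   # overflow of the shallow buffer
        if radio_q > 0:                 # the radio forwards one frame per tick
            radio_q -= 1
            delivered += 1
    return delivered, dropped


if __name__ == "__main__":
    for label, fc in (("no flow control", False), ("flow control", True)):
        delivered, dropped = simulate(fc)
        print(f"{label:16s} delivered={delivered} dropped={dropped}")
```

With pause enabled the burst is held in the sender's larger queue and drains
at the radio's real rate; without it, part of each burst is lost even though
the average load fits the link.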


With kind regards,

M.




kgraham at industrial-marshmallow

Nov 24, 2009, 9:22 AM

Post #16 of 16
Re: Flow Control and 10GE interfaces

> This is exactly the *only* situation, where classic flow control makes sense and
> does really help, since it properly triggers output queueing at the sending side
> when the real data-path speed is reached.

OK, the vitriol towards .3x in this thread was so strong I was concerned I had
somehow misunderstood it.

> Any other usage is likely to cause more problems than benefits.

The documentation I could find is vague at best, but are any switches actually
doing end-to-end .3x and signaling ingress ports on a congested egress queue?
(I infer that these are the 'problems' everyone is citing?)

At least for simple flowcontrol, propagating across the bus/fabric seems wholly
broken and unnecessarily complex. However, sending a pause on a congested port
complex (i.e. on PINNACLE's interfaces to the bus or MEDUSA/HYPERION) towards
the port strikes me as both desirable and the most likely form of implementation.