
Mailing List Archive: NANOG: users

Microsoft.com PMTUD black hole?

 

 



nathana at fsr

May 6, 2008, 12:07 PM

Post #1 of 51 (771 views)
Permalink
Microsoft.com PMTUD black hole?

Hello,

Has anyone else here seen problems with microsoft/msn/hotmail/live.com
sites not performing PMTUD correctly? We have, for a while now, had
people on our network complain of poor microsoft.com reachability, and
discovered we can work around the issue by changing MSS on all TCP SYN
as they go out of our network.

I recently watched the whole conversation between msn.com and a host on
our network (with the MSS rewrite disabled), and if I'm reading it
right, we are following PMTUD protocol correctly by sending back ICMP
type 3 code 4, but all Microsoft hosts seem to ignore this and continue
to send packets back to our host that are too large for the path.

I hope I'm wrong and that it is we who are doing something stupid, but
after cruising Google for a while, I found a multitude of other
complaints from people connected to other ISPs specifically about not
being able to reach Microsoft web sites. It seems crazy that MS could
have PMTUD broken for so long with nobody ever raising a complaint to
them directly, though, which makes me wonder if there is another answer
here that I'm missing.

I sent the following message to a couple of addresses that I gleaned
from ARIN WHOIS for the IP block in question and threw hostmaster in
there just in case it went somewhere, but noc[at]microsoft.com appears to
be defunct. I have yet to receive acknowledgment of receipt from the
other address.

Are there any microsoft.com admins that hang out here that can comment
on this or get in touch with me, or is there perhaps someone on here
with connections to the Microsoft NOC?

(BTW, I stripped the referenced libpcap attachment off of this message
to the list just so that I wouldn't accidentally incur the wrath of
NANOG...if y'all want to see it, I'm happy to post it.)

Thanks,

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com

-------- Original Message --------
Subject: Microsoft/MSN/Live!/Hotmail behind blackhole router?
Date: Thu, 01 May 2008 19:00:46 -0700
From: Nathan Anderson/FSR <nathana[at]fsr.com>
To: hostmoaster[at]microsoft.com, noc[at]microsoft.com, iprrms[at]microsoft.com

To microsoft.com NOC admins:

I work for a regional ISP in the inland Pacific Northwest. Many of our
customers' connections have MTUs of less than 1500, and we get routine
complaints from them that they have trouble reaching web sites that are
under your administration.

Usually we can fix the problem by "mangling" the TCP SYNs originating
from our customers and headed to the world to reflect a lower MSS value;
however, we would rather not have to do that. The fact that we are
REQUIRED to do this in order for your sites to be reachable by our
customers strongly suggests that either the servers that respond to HTTP
requests sent to www.microsoft/msn/hotmail/live.com are behind routers
that are blocking ALL ICMP traffic sent their way -- even ICMP type 3
code 4 (packet too large, DF set), which is necessary in order for Path
MTU Discovery to work -- or the servers themselves are not listening to
the ICMP messages that we are sending their way when our routers are
forced to drop a packet sent by you which is too large to be forwarded
to a customer of ours.

I set up a test connection "on the bench" so to speak, and had our
router capture a copy of the conversation between our test client and
www.msnbc.msn.com and forward that conversation encapsulated in TZSP to
the same test client over a different interface. The capture clearly
shows our test client establishing the TCP connection with MSNBC
(SYN/SYN+ACK/ACK), and then goes on to show MSNBC send ethernet
MTU-sized packets our way that an intermediate router of ours drops and
responds with "packet too big, DF set." Despite this, MSNBC continues
to retransmit the original packet with the same payload and the same size
back to us. We continue to respond "packet too big, DF set," but the
MSNBC server never seems to get the message (literally).

We see the same behavior with all sites across the board contained
within the 207.46.0.0/16 space, regardless of actual hostname/FQDN.

We also find this ironic considering that Microsoft published a Technet
article a few years back on black hole routers and the problems they
pose, found at http://technet.microsoft.com/en-us/library/bb878081.aspx
(which we can't read/access unless we are mangling the MSS).

We would appreciate it if Microsoft NOC admins would please look into
the matter and take the appropriate corrective action: allowing ICMP
type 3 code 4 messages through your routers/firewalls, and making sure
that your servers respond to them appropriately as defined in RFC 1191.

I have attached the capture we made of the conversation to this e-mail
message in libpcap format for your analysis. The test client itself had
a 1500 MTU to a desktop router, which in turn had an MTU of 1492 on its
uplink to us.
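
For reference, the arithmetic behind those two MTU figures can be sketched roughly as below. This assumes plain IPv4 and TCP headers of 20 bytes each with no options; the function name is made up for illustration and is not taken from any capture or tool mentioned here.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER

client_mss = mss_for_mtu(1500)   # what a host with a 1500 MTU would advertise
path_mss = mss_for_mtu(1492)     # what actually fits through a 1492 uplink
overshoot = client_mss - path_mss
```

Under those assumptions a full-sized segment sent toward the client is a few bytes too big for the uplink, which is the situation PMTUD is supposed to sort out.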

I am available to answer any additional clarifying questions you may have.

Thank you for your time and attention to this matter.

Regards,

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com


_______________________________________________
NANOG mailing list
NANOG[at]nanog.org
http://mailman.nanog.org/mailman/listinfo/nanog


brandon at rd

May 6, 2008, 12:58 PM

Post #2 of 51 (748 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

> Has anyone else here seen problems with microsoft/msn/hotmail/live.com
> sites not performing PMTUD correctly?

I used to see it a lot when hosting on windows was popular and people
realised they needed a firewall or decided to add a load balancer
but broke PMTUD by leaving it enabled on the servers.

I've not heard of it for some time so those people got
a clue or moved to something else (or everyone worked around them)

brandon



iljitsch at muada

May 6, 2008, 1:26 PM

Post #3 of 51 (747 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On 6 mei 2008, at 21:58, Brandon Butterworth wrote:

>> Has anyone else here seen problems with microsoft/msn/hotmail/live.com
>> sites not performing PMTUD correctly?

> I used to see it a lot when hosting on windows was popular and people
> realised they needed a firewall or decided to add a load balancer
> but broke PMTUD by leaving it enabled on the servers.

> I've not heard of it for some time so those people got
> a clue or moved to something else (or everyone worked around them)

Many years ago I had occasion to terminate dial-up service over L2TP
from modem pools operated by a service provider who shall remain
nameless to protect the guilty. This service had the unfortunate
tendency to drop all packets larger than 576 bytes. So we needed to
negotiate a 576-byte MTU over PPP.

We then got many complaints from users who dialed in using ISDN
routers (yes this was a while ago) because of broken path MTU
discovery. The behavior that Microsoft exhibits was EXTREMELY common
in those days, and I have no reason to assume it's any less common
today. (I also see it regularly with IPv6.) What I did was clear the
DF bit on packets going out to the L2TP virtual interfaces so the
packets could be fragmented.

A more common approach is to rewrite the MSS option in all TCP SYNs
with a smaller value so there won't be TCP segments large enough to
trigger the problem. AFAIK, all boxes that do PPPoE do this.

All of this even went so far that the IETF came up with RFC 4821,
which will do path MTU discovery by correlating lost packets with
packet sizes to determine the path MTU rather than depend on ICMP
messages.
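
A very rough sketch of that idea, searching for the largest packet size that gets through without relying on ICMP at all. This is a simplification and not the procedure from RFC 4821 itself: the binary search, the function names and the simulated 1492-byte path are all assumptions for illustration, and a real implementation has to cope with packets lost for reasons unrelated to size.

```python
from typing import Callable

def search_path_mtu(probe_ok: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe_ok(size) is True.

    Assumes `low` is known to work and that delivery is monotonic:
    if a size gets through, every smaller size does too.
    """
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe_ok(mid):
            low = mid                 # probe was acknowledged: path MTU >= mid
        else:
            high = mid - 1            # probe was lost: treat it as too big
    return low

# Simulated path whose real MTU is 1492; a real sender would instead
# send a probe segment of that size and wait for an acknowledgement.
simulated = lambda size: size <= 1492
found = search_path_mtu(simulated, low=576, high=1500)
```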



nathana at fsr

May 6, 2008, 1:57 PM

Post #4 of 51 (750 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Brandon Butterworth wrote:

> I used to see it a lot when hosting on windows was popular and people
> realised they needed a firewall or decided to add a load balancer
> but broke PMTUD by leaving it enabled on the servers.

Yeah, but this is Microsoft's OWN server farm we are talking about here,
not some small podunk IIS-based hosting provider.

...well, you may be right. I am probably giving MS too much credit here.

On another note, someone pointed out to me off-list that I apparently
tyop'd "hostmaster" when I sent the e-mail to MS. I have since re-sent
it to the properly-spelled address and again promptly received a "User
unknown" bounceback.

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



nathana at fsr

May 6, 2008, 2:29 PM

Post #5 of 51 (744 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Iljitsch van Beijnum wrote:

> A more common approach is to rewrite the MSS option in all TCP SYNs

[snip]

Yeah, we do this now, but the software that we have been using for PPPoE
termination as well as for a huge portion of our clients (MikroTik
RouterOS) doesn't do it correctly in my estimation when you flip on the
automatic "change-tcp-mss" option...it rewrites the MSS in ALL SYNs
passing through it, either coming OR going. This has the effect of
breaking communication with other hosts that actually have a SMALLER MSS
than our PPPoE customers since our client will get a SYN+ACK from the
remote host that we have rewritten to reflect a larger MSS than the
remote host is capable of dealing with. Because MikroTik rewrote both
the SYNs generated by us as well as received by us, our customer's host
is now under the impression that the lowest MSS between the two hosts
matches its own.

At least that's the best theory I've come up with. We can write (and
have written) custom IP manglers on the MikroTik boxes that only touch
SYNs generated by our clients, and only when the MSS is larger than a
certain value (in order to honor MSSes even lower than that allowed by
their PPPoE gateway). But it's a PITA to deal with. I'd just rather
everyone follow protocol. :-P Although we can't always expect everyone
to do it by the book, I don't think it is too much to ask that those who
operate sizable networks that nearly everyone is required to interact
with on a daily basis (read: Microsoft) act responsibly.
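
To make the difference concrete, here is a small model of the two behaviours described above. It is only a sketch: `Syn`, `rewrite_always` and `clamp_outbound` are hypothetical names, the "rewrite everything" function models the behaviour as described in this post rather than anything verified against RouterOS, and the 1452 ceiling assumes a 1492-byte PPPoE MTU.

```python
from dataclasses import dataclass

@dataclass
class Syn:
    from_customer: bool   # True if the SYN was sent by one of our PPPoE clients
    mss: int              # value of the TCP MSS option in the SYN

def rewrite_always(syn: Syn, target: int) -> int:
    """Naive behaviour: force every SYN to the target MSS, in both directions."""
    return target

def clamp_outbound(syn: Syn, ceiling: int) -> int:
    """Only lower the MSS, and only on SYNs our own customers generate."""
    if syn.from_customer and syn.mss > ceiling:
        return ceiling
    return syn.mss

remote = Syn(from_customer=False, mss=536)       # remote host with a small MSS
inflated = rewrite_always(remote, 1452)          # MSS is raised past what it asked for
preserved = clamp_outbound(remote, 1452)         # left alone
```

The second function never raises an MSS and never touches inbound SYNs, so a remote host that advertises something smaller keeps its own value.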

> All of this even went so far that the IETF came up with RFC 4821,
> which will do path MTU discovery by correlating lost packets with
> packet sizes to determine the path MTU rather than depend on ICMP
> messages.

What's funny is that I ran my tests from a Windows XP host with the
recently-released Service Pack 3 installed, which is supposed to
activate Microsoft's "PMTUD Black Hole Router Detection" by default
(available pre-SP3 but apparently not turned on without a registry
change). I haven't read up on exactly how it's supposed to work, but I
think the basic idea is that if the TCP connection is negotiated
properly but it doesn't get a response beyond that, it will try lower
and lower MSSes until it does.
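
Purely as an illustration of that "try lower and lower" idea, and not a description of what Windows actually does: the sketch below steps an assumed MTU down through the plateau values listed in RFC 1191 until something gets through. The function names and the `gets_through` callback are invented for the example.

```python
# Plateau values from RFC 1191, section 7.1 (common MTUs), largest first.
PLATEAUS = [65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68]

def next_lower_mtu(current: int) -> int:
    """Next plateau strictly below `current`, or the 68-byte floor."""
    for mtu in PLATEAUS:
        if mtu < current:
            return mtu
    return PLATEAUS[-1]

def fall_back(start_mtu, gets_through, max_steps=len(PLATEAUS)):
    """Step the assumed MTU down until a packet of that size is acknowledged."""
    mtu = start_mtu
    for _ in range(max_steps):
        if gets_through(mtu):
            return mtu
        mtu = next_lower_mtu(mtu)
    return mtu
```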

However it works (or doesn't as the case may be), it didn't make a lick
of difference. I waited and waited for content to be delivered to me
until eventually Microsoft's end sent me a TCP RST.

While I was poking at this, though, I had a thought...most IP stacks I
believe keep a path MTU cache of some sort. I know Windows does: if I
send an ICMP packet with DF set that is larger than the PPPoE gateway
can handle, I get something similar to the following:

C:\Documents and Settings\nathana>ping 64.126.160.1 -f -l 1472

Pinging 64.126.160.1 with 1472 bytes of data:

Reply from 64.126.142.249: Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
[...]

Next time that I try the same thing, Windows doesn't even bother trying
to send the packet. It looks at its PMTU table for that IP, and already
KNOWS it is too big:

C:\Documents and Settings\nathana>ping 64.126.160.1 -f -l 1472

Pinging 64.126.160.1 with 1472 bytes of data:

Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
[...]
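
For anyone wondering where 1472 comes from, the sizes work out roughly as follows. The sketch assumes a 20-byte IPv4 header with no options and an 8-byte ICMP echo header; the 1492 figure is the PPPoE uplink MTU mentioned earlier in the thread, and the function names are made up.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
ICMP_HEADER = 8    # bytes, echo request header

def ping_packet_size(payload: int) -> int:
    """Total IPv4 packet size for `ping -l <payload>`."""
    return IPV4_HEADER + ICMP_HEADER + payload

def largest_ping_payload(path_mtu: int) -> int:
    """Biggest -l value that should pass with DF set over a given path MTU."""
    return path_mtu - IPV4_HEADER - ICMP_HEADER

full_size = ping_packet_size(1472)        # fills a 1500-byte MTU exactly
pppoe_limit = largest_ping_payload(1492)  # largest payload for a 1492 path
```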

However, even when trying this with www.msnbc.msn.com, and with the
MSNBC entry in its PMTU cache (and its IP set statically in my 'hosts'
file so that Akamai/MS round-robin DNS doesn't screw with me during the
test), when I tried to build a TCP connection to MSNBC from this same
host, Windows told the remote host it had a 1460 MSS.

Now, although that makes sense, in order to avoid issues like the one we
are facing with Microsoft, would it not make _more_ sense for the stack
to look at the PMTU cache first, and then adjust its own MSS just for
connections to that one host? Maybe even send out an MTU - 40 ICMP
packet to the host that we want to build a TCP connection with FIRST to
get an ICMP type 3 code 4 response from the router in-between with the
smaller MTU?

That would put the burden of PMTUD on the host requesting the TCP
session rather than on the one responding, but if hosts were "smarter"
like this it seems to me it might smooth out some of these issues. The
remote end could be "broken" with respect to PMTUD but it wouldn't matter.

Thoughts?

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



nathana at fsr

May 6, 2008, 2:32 PM

Post #6 of 51 (742 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Nathan Anderson/FSR wrote:

[...]

> connections to that one host? Maybe even send out an MTU - 40 ICMP


:s/40/sized. Brain fart.

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



bonomi at mail

May 6, 2008, 3:53 PM

Post #7 of 51 (744 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]


> Date: Tue, 06 May 2008 14:29:03 -0700
> From: Nathan Anderson/FSR <nathana[at]fsr.com>
> Subject: Re: [NANOG] Microsoft.com PMTUD black hole?
>
>
> Now, although that makes sense, in order to avoid issues like the one we
> are facing with Microsoft, would it not make _more_ sense for the stack
> to look at the PMTU cache first, and then adjust its own MSS just for
> connections to that one host?

This _is_ Microsoft we're talking about, remember. 'sense' and 'Microsoft'
are, at a =minimum= orthogonal to each other -- and may not even inhabit
the same address-space. <wry grin>

As for standards, it is official Microsoft policy to "embrace and extend",
not to implement in a way compatible with the rest of the world. *sigh*

I -don't- believe the rumor that "PMTUD/Vista Ultimate" sends incrementally
increasing-size packets, and uses the first one that -doesn't- get through
as the size limit. <giggle>





tomb at byrneit

May 6, 2008, 5:51 PM

Post #8 of 51 (736 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Interestingly, Windows XP, Sp3, released today, describes changes in
PMTUD behavior.

Black Hole Router detection is now on by default:

http://download.microsoft.com/download/6/8/7/687484ed-8174-496d-8db9-f02b40c12982/Overview%20of%20Windows%20XP%20Service%20Pack%203.pdf





tme at multicasttech

May 6, 2008, 6:06 PM

Post #9 of 51 (736 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On May 6, 2008, at 8:51 PM, Tomas L. Byrnes wrote:

> Interestingly, Windows XP, Sp3, released today, describes changes in
> PMTUD behavior.
>
> Black Hole Router detection is now on by default:
>
> http://download.microsoft.com/download/6/8/7/687484ed-8174-496d-8db9-f02b40c12982/Overview%20of%20Windows%20XP%20Service%20Pack%203.pdf
>

<http://download.microsoft.com/download/6/8/7/687484ed-8174-496d-8db9-f02b40c12982/Overview%20of%20Windows%20XP%20Service%20Pack%203.pdf>

or

http://tinyurl.com/323xb

Regards





nathana at fsr

May 6, 2008, 6:12 PM

Post #10 of 51 (734 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

All,

A member of Microsoft's GNS network escalations team saw my postings on
NANOG about this issue and took offense at my use of this forum to raise
this issue with them, and criticized me as being unprofessional and
lacking in business acumen.

Therefore, I would like to publicly apologize for my actions here. It
was not my intention to "humiliate" Microsoft into compliance but rather
to find a means of effective contact with them since none was to be
found before today. However, I recognize that I did step over the line,
especially with regards to one comment I made in an earlier post about
"giving Microsoft too much credit." I apologize for this and retract
this, and ask their forgiveness.

As I promised, I will not be posting any more to this list regarding
this issue unless it is to report the final verdict that I receive from
my now-open ticket with Microsoft (thanks to this list, I found an
effective contact), or to discuss the mechanics of PMTUD in general.

Regards,

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



nathana at fsr

May 6, 2008, 6:18 PM

Post #11 of 51 (736 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Tomas L. Byrnes wrote:

> Interestingly, Windows XP, Sp3, released today, describes changes in
> PMTUD behavior.
>
> Black Hole Router detection is now on by default:

As I pointed out in my post earlier today timestamped at 2:29PM, I was
using an XP SP3 host to perform my tests with, and it made no
difference. I also used BBR's DrTCP application to make sure that black
hole router detection was, in fact, enabled on my XP box before
commencing my packet captures.

I cannot explain why it made no difference, but at the same time I don't
know enough about how WinNT's black hole router detection works to begin
speculating at this point. I do plan on looking into it, however.

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



iljitsch at muada

May 6, 2008, 10:22 PM

Post #12 of 51 (731 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On 6 mei 2008, at 23:48, Valdis.Kletnieks[at]vt.edu wrote:

>> A more common approach is to rewrite the MSS option in all TCP SYNs
>> with a smaller value so there won't be TCP segments large enough to
>> trigger the problem. AFAIK, all boxes that do PPPoE do this.

> And just the other day, you were saying:

>> Very few people out there use an MTU significantly below 1500 bytes. A
>> 1500-byte MTU will give you an _average_ packet size of ~1000 on
>> long-lived TCP flows because there is one tiny ACK for every two full size
>> data segments.

Right. Why is that noteworthy?
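
The ~1000 figure can be checked with a quick calculation. This sketch assumes a bare 40-byte ACK (IPv4 plus TCP headers, no options) and counts packets in both directions of the flow; the function name is made up.

```python
def average_packet_size(mtu: int, ack_size: int = 40, data_per_ack: int = 2) -> float:
    """Mean size over a flow with one bare ACK per `data_per_ack` full segments."""
    total_bytes = data_per_ack * mtu + ack_size
    total_packets = data_per_ack + 1
    return total_bytes / total_packets

avg = average_packet_size(1500)   # (2 * 1500 + 40) / 3
```

That comes out a little over 1000 bytes, in line with the quoted estimate.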

I have a lot more to say about MTU issues in this draft about
negotiating MTUs between two hosts/routers on a subnet so jumboframes
can be deployed without manual configuration:

http://www.ietf.org/internet-drafts/draft-van-beijnum-multi-mtu-02.txt

> Apparently, there's a *reason* why RFC1122, section 3.3.3 says:

> It is generally desirable to avoid local fragmentation and to
> choose EMTU_S low enough to avoid fragmentation in any gateway
> along the path. In the absence of actual knowledge of the
> minimum MTU along the path, the IP layer SHOULD use
> EMTU_S <= 576 whenever the destination address is not on a
> connected network, and otherwise use the connected network's
> MTU.

Tell it to Microsoft and their ICMP-filtering friends...
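
Read literally, the quoted RFC 1122 text amounts to something like the sketch below. It is only an illustration of that one paragraph: the function name and the example addresses are invented, it handles a single connected network, and it covers only the case where nothing is known about the path.

```python
import ipaddress

def choose_emtu_s(destination: str, connected_network: str, link_mtu: int) -> int:
    """Pick an effective send MTU per the quoted text, absent path knowledge."""
    dest = ipaddress.ip_address(destination)
    net = ipaddress.ip_network(connected_network)
    if dest in net:
        return link_mtu          # on-link: use the connected network's MTU
    return min(576, link_mtu)    # off-link: stay at or below 576

on_link = choose_emtu_s("192.0.2.10", "192.0.2.0/24", 1500)
off_link = choose_emtu_s("207.46.1.1", "192.0.2.0/24", 1500)
```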



iljitsch at muada

May 6, 2008, 10:29 PM

Post #13 of 51 (731 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On 6 mei 2008, at 23:29, Nathan Anderson/FSR wrote:

> Now, although that makes sense, in order to avoid issues like the one we
> are facing with Microsoft, would it not make _more_ sense for the stack
> to look at the PMTU cache first, and then adjust its own MSS just for
> connections to that one host? Maybe even send out an MTU - 40 ICMP
> packet to the host that we want to build a TCP connection with FIRST to
> get an ICMP type 3 code 4 response from the router in-between with the
> smaller MTU?

No. This would add significant delay because you'd have to give the
other side enough time to respond to the large packet (also sending a
large packet on something like GPRS/EDGE is a waste of bandwidth and
battery power) while if there is ICMP filtering, there won't be a
response, which is exactly the reason why we're in this bind in the
first place (along with the stupid idea that DF should be set for ALL
packets rather than just once in a while).

And adjusting the MSS based on ephemeral information is the wrong
thing to do in the first place. The path MTU can vary. Once you've
advertised a small MSS you can never increase it.

It is incredibly unprofessional that people enable PMTUD, then break
it and require the rest of the world to implement workarounds. Either
use PMTUD properly by accepting the ICMP messages or turn PMTUD off.



randy at psg

May 6, 2008, 10:44 PM

Post #14 of 51 (730 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

> A member of Microsoft's GNS network escalations team saw my postings on
> NANOG about this issue and took offense at my use of this forum to raise
> this issue with them, and criticized me as being unprofessional and
> lacking in business acumen.

they try that intimidation every time a vulnerability or bug is
revealed. laugh and post their overly-aggressive message on a public
web site.

randy



gdt at gdt

May 7, 2008, 12:12 AM

Post #15 of 51 (730 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Nathan Anderson/FSR wrote:
> A member of Microsoft's GNS network escalations team saw my postings on
> NANOG about this issue and took offense at my use of this forum to raise
> this issue with them, and criticized me as being unprofessional and
> lacking in business acumen.

Hang on a tick. Aren't you one of their customers...<looking through
mail spool>...

> As I pointed out in my post earlier today timestamped at 2:29PM, I was
> using an XP SP3 host to perform my tests with...

...why yes, you are.

I can't think of any other supplier that would be so unprofessional and
so lacking in business acumen as to say that their customer was UALIBI.

Amazing. A fine case study of a person in customer contact undoing the
work of millions of dollars in PR. Whatever you say about Steve Ballmer
he's a great sales person at heart. He must despair at some of his staff.

--
Glen Turner



newton at internode

May 7, 2008, 1:05 AM

Post #16 of 51 (729 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On 07/05/2008, at 4:42 PM, Glen Turner wrote:

> Amazing. A fine case study of a person in customer contact undoing the
> work of millions of dollars in PR.

I wouldn't worry too much about it, Glen. My observation is that the
millions of dollars in PR isn't working very well either :-)

- mark


--
Mark Newton                               Email:  newton[at]internode.com.au (W)
Network Engineer                          Email:  newton[at]atdot.dotat.org  (H)
Internode Systems Pty Ltd                 Desk:   +61-8-82282999
"Network Man" - Anagram of "Mark Newton"  Mobile: +61-416-202-223



bjorn at mork

May 7, 2008, 1:10 AM

Post #17 of 51 (729 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Iljitsch van Beijnum <iljitsch[at]muada.com> writes:

> Many years ago I had occasion to terminate dial-up service over L2TP
> from modem pools operated by a service provider who shall remain
> nameless to protect the guilty. This service had the unfortunate
> tendency to drop all packets larger than 576 bytes. So we needed to
> negotiate a 576-byte MTU over PPP.
>
> We then got many complaints from users who dialed in using ISDN
> routers (yes this was a while ago) because of broken path MTU
> discovery. The behavior that Microsoft exhibits was EXTREMELY common
> in those days, and I have no reason to assume it's any less common
> today. (I also see it regularly with IPv6.) What I did was clear the
> DF bit on packets going out to the L2TP virtual interfaces so the
> packets could be fragmented.

Right. I once stumbled across a SOHO-router doing just that. I never
understood why, but now you've given at least one explanation how it
could appear to be a good idea.

I can also provide the reason why we found it to be an extremely bad
idea at the time: Some (most? all?) systems won't set both the DF flag
and the identification field at the same time. If you clear the DF flag
without changing the identification field, you might end up with
fragmented packets that are impossible to reassemble. Which was why I
stumbled across the DF-clearing SOHO-router in the first place. The
random problems it generated were extremely difficult to debug, and when
we started we truly believed that we had a problem with a layer 4 load
balancing switch.

Note: There are solutions that will both clear the DF flag and generate
a new id. E.g. http://www.openbsd.org/faq/pf/scrub.html

This is the proper way to clear DF, if you must. Never just clear it.
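
A toy model of why that goes wrong. This is a deliberately simplified sketch: real reassembly keys on source, destination, protocol and ID and uses offsets in 8-byte units, whereas here a single src/dst/protocol is assumed and offsets are plain byte positions. All names are hypothetical.

```python
from collections import defaultdict

def fragment(packet_id: int, payload: bytes, chunk: int):
    """Split a payload into (id, offset, data) fragments of at most `chunk` bytes."""
    return [(packet_id, off, payload[off:off + chunk])
            for off in range(0, len(payload), chunk)]

def reassemble(fragments):
    """Group fragments by ID only, as a receiver would for one src/dst/protocol."""
    buckets = defaultdict(dict)
    for packet_id, offset, data in fragments:
        buckets[packet_id][offset] = data   # same ID + same offset overwrites
    return {pid: b"".join(parts[o] for o in sorted(parts))
            for pid, parts in buckets.items()}

a, b = b"A" * 8, b"B" * 8
# Both packets carry ID 0 (sender relied on DF), then DF is cleared en route.
same_id = reassemble(fragment(0, a, 4) + fragment(0, b, 4))
# A scrubber that assigns fresh IDs keeps the two packets apart.
fresh_id = reassemble(fragment(1, a, 4) + fragment(2, b, 4))
```

With a shared ID the receiver cannot tell which fragments belong to which original packet, so one packet is lost or mixed with the other; with fresh IDs both come back intact.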



Bjørn



rsk at gsp

May 7, 2008, 6:45 AM

Post #18 of 51 (701 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

On Tue, May 06, 2008 at 06:12:42PM -0700, Nathan Anderson/FSR wrote:
> A member of Microsoft's GNS network escalations team saw my postings on
> NANOG about this issue and took offense at my use of this forum to raise
> this issue with them, and criticized me as being unprofessional and
> lacking in business acumen.

This is a typical Microsoft reaction: blame the messenger for their
own incompetence, laziness, stupidity, and greed. I think you should
post their asinine message so that it can receive the public ridicule
it surely deserves.

---Rsk




patrick at zill

May 7, 2008, 6:49 AM

Post #19 of 51 (699 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Glen Turner wrote:
> Amazing. A fine case study of a person in customer contact undoing the
> work of millions of dollars in PR. Whatever you say about Steve Ballmer
> he's a great sales person at heart. He must despair at some of his staff.
>

The rest of us however, despair at having to support their crap.

Patrick Giagnocavo
patrick[at]zill.net



stephen at sprunk

May 7, 2008, 9:32 AM

Post #20 of 51 (694 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Thus spake "Nathan Anderson/FSR" <nathana[at]fsr.com>
> A member of Microsoft's GNS network escalations team saw my
> postings on NANOG about this issue and took offense at my use
> of this forum to raise this issue with them, and criticized me as
> being unprofessional and lacking in business acumen.

First, it's "unprofessional and lacking in business acumen" for someone to
criticize their customers to their face. As one manager taught me, "The
customer may not always be right, but they're never wrong."

Second, it's their own damn fault for not maintaining their contact
information properly in public databases. If the only option they leave you
is to post to NANOG, because they don't respond to (or even accept) direct
requests to the listed contacts, then that's what you have to do.

Many companies are guilty of the latter, and we all get the benefit of
seeing the state of their customer service for reference when making future
buying decisions. Very few are arrogant enough to do the former, though.

S

Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking




nathana at fsr

May 7, 2008, 11:50 AM

Post #21 of 51 (693 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Iljitsch van Beijnum wrote:

> No. This would add significant delay because you'd have to give the
> other side enough time to respond to the large packet (also sending a
> large packet on something like GPRS/EDGE is a waste of bandwidth and
> battery power) while if there is ICMP filtering, there won't be a
> response, which is exactly the reason why we're in this bind in the
> first place

I admit the idea needs tweaking (at best), and it was just a stray
thought :-), but 1) even if there is ICMP filtering happening way at the
other end, I (the TCP initiator) will still get a response from the
router in the middle (RITM) that is reducing the total path MTU if I try
to send a packet through it larger than the actual path MTU, and 2) if I
don't get a response to my single large packet (either from a RITM or
the other end) in a timely fashion (less than a second?), then the
client/initiator may just assume that path MTU == local MTU and will set
its MSS accordingly (which is no different than what is happening now),
until it has a reason to think differently.
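
The fallback logic in points 1) and 2) above can be sketched roughly as follows. This is only an illustration in Python, not what any real TCP stack does: `send_probe` is a hypothetical callable standing in for "send one full-sized packet with DF set and wait", and the 40-byte overhead assumes IPv4 and TCP headers without options.

```python
from typing import Callable, Optional

IPV4_TCP_OVERHEAD = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def choose_mss(local_mtu: int,
               send_probe: Callable[[int, float], Optional[int]],
               timeout_s: float = 1.0) -> int:
    """Pick an MSS to advertise after a single probe attempt.

    send_probe(size, timeout) is hypothetical: it returns the next-hop MTU
    reported in an ICMP type 3 code 4 message, or None if nothing came back
    before the timeout (probe delivered, or the ICMP was filtered).
    """
    reported_mtu = send_probe(local_mtu, timeout_s)
    if reported_mtu is not None and reported_mtu < local_mtu:
        path_mtu = reported_mtu   # case 1: a router in the middle answered
    else:
        path_mtu = local_mtu      # case 2: no answer, assume path MTU == local MTU
    return path_mtu - IPV4_TCP_OVERHEAD


# A router in front of a 1492-byte PPPoE link answers the probe.
print(choose_mss(1500, lambda size, timeout: 1492))  # 1452
# Nothing comes back, so the local MTU is assumed.
print(choose_mss(1500, lambda size, timeout: None))  # 1460
```

Note that the sketch only measures the path in the probing direction.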

Also, if there is already something in the local PMTU cache for a single
host address, I'm not sure I follow why it would be a bad idea for the
TCP initiator to consult that cache when preparing the SYN. Although,
on second thought, I suppose it is possible (and, in more than a few
cases, likely) that in instances of route path asymmetry, the PMTU of
the path from the initiator to the server may be different than the PMTU
of the path back from the server to the client. Hmmm.

Okay, scratch that idea then. :-P

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



nathana at fsr

May 7, 2008, 12:24 PM

Post #22 of 51 (692 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Here is a brief update on the situation:

I have been in contact with someone at Microsoft's service operations
center, who has confirmed for me that MS does in fact block _all_ ICMP
at the edge of their network, that they are aware that this will in fact
break PMTUD, and that they have no current plans to change this practice
which they have implemented in the interest of security.
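
To make the failure mode concrete, here is a toy model in Python of what happens to a full-sized packet with DF set when a link on the path has a smaller MTU. The hop list, MTU values, and function name are invented for illustration; this says nothing about Microsoft's actual topology.

```python
from typing import List, Optional


def send_with_df(packet_size: int, hop_mtus: List[int],
                 icmp_reaches_sender: bool) -> Optional[int]:
    """Return the packet size that finally gets through, or None for a black hole.

    Models classic PMTUD: a router that cannot forward a DF packet drops it
    and reports its next-hop MTU via ICMP type 3 code 4. If that ICMP never
    reaches the sender, the sender keeps retransmitting the same size.
    """
    size = packet_size
    for _ in range(len(hop_mtus) + 1):   # each retry clears at least one hop
        bottleneck = next((m for m in hop_mtus if m < size), None)
        if bottleneck is None:
            return size                  # fits every hop, delivered
        if not icmp_reaches_sender:
            return None                  # dropped silently, sender never learns
        size = bottleneck                # sender lowers its estimate and retries
    return size


path = [1500, 1492, 1500]  # hypothetical path with one PPPoE-sized link
print(send_with_df(1500, path, icmp_reaches_sender=True))   # 1492
print(send_with_df(1500, path, icmp_reaches_sender=False))  # None
```

With all ICMP filtered at the sender's edge, the second case is the one that applies: small packets (the handshake) get through, full-sized ones vanish.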

Nevertheless, the person I have been in contact with is naturally not
the final decision-maker on this issue and is going to continue to pass
the issue on up the chain of command for me. So although this issue is
not over and I do not have a final verdict from MS yet, I felt that,
given that I don't know how much time to expect to pass between now and
when that final verdict is rendered, it would be appropriate to let
everybody here know what I have learned thus far. Hopefully public
dissemination of this information will prevent others in a
position similar to mine from having to helplessly beat their heads into
their keyboards.

I, naturally, voiced my strong objection over this security policy, and
attempted to make a reasoned argument with the contact I have over
there. We will see what comes of this.

Some have asked me to post copies of my private communication with my
Microsoft contact here. I don't think it is appropriate for me to post
copies of private communication without the other party's consent, so I
will have to decline unless he first gives me said consent.

Others have asked for valid contact information for the Microsoft NOC,
since the ARIN records for their 207.46.0.0/16 do not appear to be up to
date. I eventually found a working e-mail address from somebody
off-list who pointed to the WHOIS lookup from TUCOWS for
microsoft.comosoft.com (which I'm still not clear on what exactly this
is...). The e-mail address that was gleaned from this lookup was
msnhst[at]microsoft.com, which goes to the Microsoft Corporate Domains
Team. They, in turn, forwarded my message on to
msnalerts[at]microsoft.com, which generated a ticket # for me and is, as I
understand it, the e-mail address I was looking for in the first place
(leads to their network/system people).

I hope this is helpful to others.

Regards,

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



nathana at fsr

May 7, 2008, 12:38 PM

Post #23 of 51 (684 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Valdis.Kletnieks[at]vt.edu wrote:

> The usual case where you get screwed over is when the router trying to toss
> the ICMP FRAG NEEDED is *behind* the ICMP-munching firewall. And in case (2),
> you still can't assume that path MTU == local MTU, because your local MTU is
> likely 1500, and the fragging router is often trying to stuff your 1500 byte
> packet down a PPPoE tunnel that's got an MTU of 1492....

Yes, but my point was precisely that one OR the other side (server OR
client) is going to NOT have the ICMP-munching firewall in between
itself and the "RITM" as I have affectionately been calling it (although
it is definitely possible that there are two ICMP-munchers on either
side of the RITM).

And case #2 is exactly what is occurring right now _anyway_: hosts
assume that path MTU == local MTU even if there is already an active
PMTU cache entry from a recent earlier communication with the remote
host. So I don't see how making that assumption _after_ making an
honest attempt at actively determining whether or not it is actually the
case is any more broken than the way things are already being done.

The problem is that, as I realized at the end of the message you quoted,
there are potentially multiple paths between the same two hosts, and the
path that the packet takes in one direction is not guaranteed to be the
same path that the packet takes in the opposite direction.
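
The asymmetry problem can be shown with a little arithmetic. In this Python sketch the two hop lists are made-up numbers; the point is only that the PMTU is the smallest link MTU along a given direction, so a value learned or cached for one direction says nothing certain about the other.

```python
def path_mtu(link_mtus):
    """The path MTU is the smallest link MTU along one direction of travel."""
    return min(link_mtus)


client_to_server = [1500, 1500, 1500]  # hypothetical forward path
server_to_client = [1500, 1492, 1500]  # hypothetical return path via a PPPoE link

forward = path_mtu(client_to_server)
reverse = path_mtu(server_to_client)

print(forward, reverse)            # 1500 1492
# Largest TCP payload per direction, assuming 40 bytes of IPv4 + TCP headers.
print(forward - 40, reverse - 40)  # 1460 1452
```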

--
Nathan Anderson
First Step Internet, LLC
nathana[at]fsr.com



tomb at byrneit

May 7, 2008, 12:43 PM

Post #24 of 51 (682 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

I'm not sure what the issue is here.

Just about every modern firewall I've used has an option to enable PMTU
on interfaces, while blocking all other ICMP.

Is MS not running something manufactured in the last 10 years at their
perimeter?
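
The kind of policy described above, passing the ICMP messages that PMTUD depends on while dropping the rest, could be sketched like this in Python. It is not any vendor's firewall syntax; the function name and the allow-list are illustrative assumptions. Type 3 code 4 is "Destination Unreachable, fragmentation needed and DF set".

```python
# (type, code) pairs to let through; every other ICMP message is dropped.
ALLOWED_ICMP = {
    (3, 4),  # Destination Unreachable: fragmentation needed and DF set
}


def permit_icmp(icmp_type: int, icmp_code: int) -> bool:
    """Return True if an inbound ICMP message should pass the filter."""
    return (icmp_type, icmp_code) in ALLOWED_ICMP


print(permit_icmp(3, 4))  # True: PMTUD keeps working
print(permit_icmp(8, 0))  # False: echo request is still blocked
```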





michael at rancid

May 7, 2008, 12:46 PM

Post #25 of 51 (683 views)
Permalink
Re: Microsoft.com PMTUD black hole? [In reply to]

Nathan Anderson/FSR wrote:
> Here is a brief update on the situation:
>
> I have been in contact with someone at Microsoft's service operations
> center, who has confirmed for me that MS does in fact block _all_ ICMP
> at the edge of their network, that they are aware that this will in fact
> break PMTUD, and that they have no current plans to change this practice
> which they have implemented in the interest of security.

Although the need for your previous apology has already been questioned
in this forum, the confirmation that they block not only certain ICMP
types, but all ICMP, further vacates the need for any apology for
criticizing this behavior in a public forum. It is disheartening for
those of us who use and support MSFT's products to learn that their
understanding of security lacks even the basic nuance to know not to
block an entire--critical--portion of the Internet Protocol. Perhaps
they should also block _all_ TCP and UDP as well, and then we can move on.

I agree with Iljitsch that it happens frequently, but I think I am
justified in expecting more than that from Microsoft. Anything less
would be unprofessional.

*Speaking for myself only, of course!*

michael

