
Mailing List Archive: nsp: juniper

JUNOS and MX Trio cards

 

 



ras at e-gerbil

Jun 29, 2010, 1:05 AM

Post #1 of 16
JUNOS and MX Trio cards

For all those who were wondering about code stability for Trio cards, I
have my first experience to report. We just got our first shipment
of MPC2 cards, and tested them out in an MX960 running 10.2R1 with MPC2
cards only, no classic DPCs.

When I went to commit the config of the very first routed port with a
firewall filter (IMHO a fairly simple config, about a dozen terms, but
making use of chained filters), the FPC the port was on promptly
crashed. Every time the FPC would reboot and come back up, it would
immediately crash again. Moving the interface config with the filter to
a different FPC caused that FPC to crash as well. Disabling the firewall
filter caused the crashing to stop.

But, the box didn't fully recover on its own. Following the crash, some
packets forwarded through that box were being blackholed. After doing a
GRES/NSR switchover, the blackholing cleared briefly, but started again
the exact instant the backup RE came back online. I tried disabling the
GRES/NSR config, but the blackholing still didn't go away. A complete
PFE restart was required to clear the blackholing.

Oh and BTW, the pending route BGP stall bug is worse than ever in 10.2.
On an MX960 with RE-S-2000 and a BGP config consisting of nothing more
than an IBGP mesh (28 sessions) and a SINGLE TRANSIT SESSION, it took
just over 12 minutes before a single route from the transit session was
successfully installed to hardware.

So far things aren't looking good.

--
Richard A Steenbergen <ras [at] e-gerbil> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
_______________________________________________
juniper-nsp mailing list juniper-nsp [at] puck
https://puck.nether.net/mailman/listinfo/juniper-nsp


mtinka at globaltransit

Jun 29, 2010, 1:59 AM

Post #2 of 16
Re: JUNOS and MX Trio cards

On Tuesday 29 June 2010 04:05:55 pm Richard A Steenbergen
wrote:

> So far things aren't looking good.

Very, very nasty, indeed. Hope you have JTAC running around
on this. Would be glad to hear what comes of it. Nasty,
indeed.

On my end, while away on tour, Juniper came back and took
back a test MX80 and got it reinstalled with a more stable
version of JUNOS 10.2. The previous one was causing regular
Gig-E SFP modules to output a rather weak laser, and the
remote devices didn't have any clue that there was anything
on the other side (850nm, no less).

The box came back with a newer JUNOS and fixed laser power.
Yet to follow-up with details, but it's odd.

Cheers,

Mark.


dwinkworth at att

Jun 29, 2010, 8:37 AM

Post #3 of 16
Re: JUNOS and MX Trio cards

When you say 'transit session' what do you mean exactly? Also disappointed to hear about the bugs.

Is the stuck-in-pending issue easily reproducible? I have read some of your past posts, but recently it sounds like this can be reproduced without a lot of effort?






ras at e-gerbil

Jun 29, 2010, 12:59 PM

Post #4 of 16
Re: JUNOS and MX Trio cards

On Tue, Jun 29, 2010 at 08:37:20AM -0700, Derick Winkworth wrote:
> When you say 'transit session' what do you mean exactly? Also
> disappointed to hear about the bugs.

Transit (n): An EBGP session where an external ASN sends you a full copy
of the global routing table, usually in exchange for money. :)

> Is the stuck-in-pending issue easily reproducible? I have read some
> of your past posts, but recently it sounds like this can be
> reproduced without a lot of effort?

Trivially reproducible here, all that seems to be required is a decent
number of BGP sessions that you have to send the update to. Just last
night I noticed it took over 6 minutes to remove the routes and stop
forwarding traffic to an EBGP session I shut down on a 9.6R4 router
(which was mostly cpu idle before starting), and EX8200s running 10.1
have taken 5-7 minutes to start installing or exchanging routes with
nothing more than 2 IBGP RR feeds and a local transit session. Usually
the problem is worst after a fresh reboot, where it can take 10-20
minutes to actually install the routing table into hw, but on newer code
it seems to be happening on an otherwise stable router with just a
single BGP session flap.
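
One way to put a number on the stall described above is to record a timestamp when the session event happens, poll the count of routes installed in hardware at intervals, and compute how long it took for the first route to appear. The sketch below only does the arithmetic. How the samples are collected is left open, and the function name, the sample values, and the timestamps are all hypothetical and do not come from the thread.

```python
from datetime import datetime, timedelta


def install_delay(event_time, samples, min_installed=1):
    """Return the time from event_time until the first sample at or after it
    that shows at least min_installed routes, or None if that never happens.

    samples is an iterable of (timestamp, installed_route_count) tuples.
    """
    for when, installed in sorted(samples):
        if when >= event_time and installed >= min_installed:
            return when - event_time
    return None


# Hypothetical polling results: minutes after the event, routes installed.
start = datetime(2010, 6, 29, 1, 0, 0)
samples = [
    (start + timedelta(minutes=m), n)
    for m, n in [(1, 0), (5, 0), (10, 0), (13, 1200), (15, 310000)]
]

delay = install_delay(start, samples)
print(delay)  # 0:13:00
```

Comparing the returned delay against a threshold of your choosing (say, a few minutes) gives a simple pass/fail check that can be repeated across code versions.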

--
Richard A Steenbergen <ras [at] e-gerbil> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)


dwinkworth at att

Jun 29, 2010, 6:50 PM

Post #5 of 16
Re: JUNOS and MX Trio cards

So basically, this stalled route issue has been going on for so long that it's truthful to say that Juniper probably doesn't think it's important to fix? Or they don't care?

I wonder what their official line is. Might be similar to their official line with respect to the manufacturing issue with the EX series, where so many ASICs are just bad... I think they have some code in JUNOS now that detects the bad ASICs and just resets them when the failure is detected.

How unfortunate. I wonder if Alca-Lu can do better. Lord knows Cisco couldn't care less about code quality. Surely some networking vendor must give a sh*t.








zorick at fr

Jun 30, 2010, 1:26 AM

Post #6 of 16
Re: JUNOS and MX Trio cards

On Tue, Jun 29, 2010 at 06:50:49PM -0700, Derick Winkworth wrote:

[dd]

>
> How unfortunate. I wonder if Alca-Lu can do better. Lord knows Cisco couldn't care less about code quality. Surely some networking vendor must give a sh*t.

Small brief from our ALU equipment evaluation:
BGP: 4-byte ASN unsupported; BGP as PE-CE protocol unsupported; IOM2 can
keep forwarding traffic for a detached neighbour for 30 min.

Do you still wonder whether they can do better?

--
ZA-RIPE||ZA1-UANIC


ras at e-gerbil

Jun 30, 2010, 1:30 AM

Post #7 of 16
Re: JUNOS and MX Trio cards

On Tue, Jun 29, 2010 at 06:50:49PM -0700, Derick Winkworth wrote:
> So basically, this stalled route issue has been going on for so long
> that it's truthful to say that Juniper probably doesn't think it's
> important to fix? Or they don't care?

6 years by my count. The weird thing is I'm constantly running into
plenty of really smart competent people at Juniper who do want to help,
they just have no idea that things are really this broken, or they
aren't empowered to do anything about it. I guess you could call that
"they don't care" at a corporate level.

Oh and btw, they also broke subinterface counters in 10.2R1:

IF-MIB::ifDescr.539 = STRING: xe-1/0/1
IF-MIB::ifDescr.583 = STRING: xe-1/0/1.0

IF-MIB::ifHCInOctets.539 = Counter64: 3917358216
IF-MIB::ifHCInOctets.583 = Counter64: 0

IF-MIB::ifHCOutOctets.539 = Counter64: 20928565080777
IF-MIB::ifHCOutOctets.583 = Counter64: 8643351
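
A check for the symptom in the output above (a logical unit reporting zero octets while its parent port counts traffic) can be scripted against saved walk output. The following is a minimal sketch that assumes the text has the same `IF-MIB::<object>.<ifIndex> = <TYPE>: <value>` shape as the lines quoted here; the function names are hypothetical, and the sample data is copied from this post.

```python
import re
from collections import defaultdict

# Sample lines copied from the output quoted in this post.
SAMPLE = """\
IF-MIB::ifDescr.539 = STRING: xe-1/0/1
IF-MIB::ifDescr.583 = STRING: xe-1/0/1.0
IF-MIB::ifHCInOctets.539 = Counter64: 3917358216
IF-MIB::ifHCInOctets.583 = Counter64: 0
IF-MIB::ifHCOutOctets.539 = Counter64: 20928565080777
IF-MIB::ifHCOutOctets.583 = Counter64: 8643351
"""

LINE_RE = re.compile(r"^IF-MIB::(\w+)\.(\d+) = \w+: (.*)$")


def parse_walk(text):
    """Return {ifIndex: {object_name: value}} from walk-style text."""
    table = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            obj, index, value = match.groups()
            table[int(index)][obj] = value
    return dict(table)


def suspicious_units(table, counter="ifHCInOctets"):
    """List logical units whose counter is 0 while the parent port's is not."""
    by_name = {row["ifDescr"]: row for row in table.values() if "ifDescr" in row}
    flagged = []
    for name, row in by_name.items():
        parent, dot, _unit = name.rpartition(".")
        if not dot or parent not in by_name:
            continue  # not a logical unit, or the parent was not in the walk
        parent_row = by_name[parent]
        if counter not in row or counter not in parent_row:
            continue
        if int(row[counter]) == 0 and int(parent_row[counter]) > 0:
            flagged.append(name)
    return sorted(flagged)


print(suspicious_units(parse_walk(SAMPLE)))  # ['xe-1/0/1.0']
```

A zero on a unit is not proof of a bug by itself (the unit may simply be idle), so this is best used as a first-pass filter before comparing against known traffic levels.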

The really scary part is I can name a lot of big networks who are
critically under-provisioned right now, because they've been holding off
on new DPC purchases for their MX's pending shipment of Trio cards. I
think Juniper is about to find themselves in a world of hurt when these
guys go to deploy the new cards and discover what a disaster modern
JUNOS has become. Seriously, how the hell do you manage to ship
production code that has broken subinterface counters? Does anyone in
systest actually do anything any more?

--
Richard A Steenbergen <ras [at] e-gerbil> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)


m.kucharczyk at net

Jun 30, 2010, 2:00 AM

Post #8 of 16
Re: JUNOS and MX Trio cards

On Wednesday 30 of June 2010 03:50:49 Derick Winkworth wrote:
> I wonder what their official line is. Might be similar to their official
> line with respect to the manufacturing issue with the EX series, where so
> many ASICs are just bad... I think they have some code in JUNOS now that
> detects the bad ASICs and just resets them when the failure detected.

Do you have some more information about broken ASICs in the EX series? We have a few
EX switches and we have some problems with them (strange behavior with STP,
some link UP/DOWN issues).

Thanks and regards,
Marcin


dwinkworth at att

Jun 30, 2010, 2:28 AM

Post #9 of 16
Re: JUNOS and MX Trio cards

#############
6 years by my count. The weird thing is I'm constantly running into
plenty of really smart competent people at Juniper who do want to help,
they just have no idea that things are really this broken, or they
aren't empowered to do anything about it. I guess you could call that
"they don't care" at a corporate level.
##############

Yeah, let me just say I work with a number of supremely competent people at Juniper who care immensely about the customer and the product. I can't emphasize that enough. I think Juniper does care; actually, I think that there is a "paradigm" shift happening there with respect to how code is produced. I understand things will get much, much better in 10.3 thru 10.5.

In the meantime, 10.0R3 / R4 will likely be our production code releases. Knock on wood, these will do at the moment... The folks we are working with at Juniper are putting some effort into making sure these are solid releases for us.


##############
Does anyone in systest actually do anything any more?
##############

I have actually heard there is some frustration there. See comment above about paradigm-shift.


dwinkworth at att

Jun 30, 2010, 4:25 AM

Post #10 of 16
Re: JUNOS and MX Trio cards

hahahaha nice!





________________________________
From: Andrey Zarechansky <zorick [at] fr>
To: juniper-nsp [at] puck
Sent: Wed, June 30, 2010 3:26:50 AM
Subject: Re: [j-nsp] JUNOS and MX Trio cards

On Tue, Jun 29, 2010 at 06:50:49PM -0700, Derick Winkworth wrote:

[dd]

>
> How unfortunate. I wonder if Alca-Lu can do better. Lord knows Cisco couldn't care less about code quality. Surely some networking vendor must give a sh*t.

Small brief from our ALU equipment evaluation:
BGP:4-byte ASN unsupported, BGP:PE-CE protocol unsupported, IOM2 can forward
traffic for detached neighbour for 30 min

Do you still wonder they can do better?



philxor at gmail

Jun 30, 2010, 6:54 AM

Post #11 of 16
Re: JUNOS and MX Trio cards

When were you evaluating it? I think both of those two features have been available for some time now, almost 2 years.

Not to say they do not have their own set of issues along with Juniper and Cisco...

Phil


On Jun 30, 2010, at 4:26 AM, Andrey Zarechansky wrote:

> On Tue, Jun 29, 2010 at 06:50:49PM -0700, Derick Winkworth wrote:
>
> [dd]
>
>>
>> How unfortunate. I wonder if Alca-Lu can do better. Lord knows Cisco couldn't care less about code quality. Surely some networking vendor must give a sh*t.
>
> Small brief from our ALU equipment evaluation:
> BGP:4-byte ASN unsupported, BGP:PE-CE protocol unsupported, IOM2 can forward
> traffic for detached neighbour for 30 min
>
> Do you still wonder they can do better?


chrisccnpspam2 at gmail

Jun 30, 2010, 7:49 AM

Post #12 of 16
Re: JUNOS and MX Trio cards

Is there any way that someone could tell me how to reproduce the stalled
route issue? How do I see if it is happening, etc.?

We are about to purchase some MXs to replace our M series and would love to
nail this.



zorick at fr

Jun 30, 2010, 8:11 AM

Post #13 of 16
Re: JUNOS and MX Trio cards

On Wed, Jun 30, 2010 at 09:54:16AM -0400, Phil Bedard wrote:
> When were you evaluating it? I think both of those two features have been available for some time now, almost 2 years.
>

Less than half a year ago. What we had in the lab:
- a few 7750 SR-7 boxes, IOM2-based hardware, running TiMOS 7.something
- a few 7450 ESS boxes, don't remember the OS version
- Agilent N2X router tester

1. We were unable to configure a 4-byte ASN.

2. We were able to configure BGP as the PE-CE protocol, but it never came up.
An ALU network expert found an internal reference saying that BGP as PE-CE
will be available in 2010Q3, starting with the new major release of TiMOS.

3. The IOM2 issue was reproduced by sending a disarranged full view from the
N2X to the 7750 SR and making random withdrawals and re-announcements for the
set of prefixes under test.

Possibly ALU has fixed the IOM issue with the new hardware on the IOM3 card, but
we haven't tested this due to the limited lab environment.
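
The churn pattern described in step 3 (a full view announced in shuffled order, followed by random withdrawals and re-announcements) can be sketched as a schedule generator. This is not the N2X configuration used in the test: the function name, the flap fraction, the round count, and the stand-in prefixes are all assumptions chosen for illustration.

```python
import ipaddress
import random


def churn_plan(prefixes, rounds, flap_fraction=0.1, seed=0):
    """Yield (round, action, prefix) tuples.

    Round 0 announces every prefix in shuffled order. Each later round picks
    a random subset and toggles it: announced prefixes are withdrawn,
    withdrawn prefixes are re-announced.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    order = list(prefixes)
    rng.shuffle(order)
    for prefix in order:
        yield (0, "announce", prefix)

    announced = set(order)
    flaps_per_round = max(1, int(len(order) * flap_fraction))
    for rnd in range(1, rounds + 1):
        for prefix in rng.sample(order, flaps_per_round):
            if prefix in announced:
                announced.discard(prefix)
                yield (rnd, "withdraw", prefix)
            else:
                announced.add(prefix)
                yield (rnd, "announce", prefix)


# 256 stand-in prefixes; a real full view would be far larger.
prefixes = list(ipaddress.ip_network("10.0.0.0/16").subnets(new_prefix=24))
plan = list(churn_plan(prefixes, rounds=3))
print(len(plan))  # 331
```

Replaying the plan against the device under test while a traffic generator sends to the withdrawn prefixes is one way to see how long stale forwarding entries persist.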


--
ZA-RIPE||ZA1-UANIC


joeyconcrete at gmail

Jun 30, 2010, 2:27 PM

Post #14 of 16
Re: JUNOS and MX Trio cards

On 30 June 2010 15:49, Chris Evans <chrisccnpspam2 [at] gmail> wrote:
> Is there any way that someone could tell me how to reproduce the stalled
> route issue? How do I see if it is happening, etc.?
>
> We are about to purchase some mx's to replace our m series and would love to
> nail this.

The question of stability (or lack thereof) is of interest to me.

I began an exercise a few months back researching the options
available to replace some of our Cisco gear with Juniper. At the time
- it was looking like a combination of the M7i and the EX series
switches - but since learning the EX has limitations in regard to MPLS
and the fact the M7i is getting old - the MX looked a perfect
candidate; decent port density with sufficient horsepower. Despite the
attractiveness of the platform, I'm not sure I could cope with the
sleepless nights.



mtinka at globaltransit

Jun 30, 2010, 10:06 PM

Post #15 of 16
Re: JUNOS and MX Trio cards

On Wednesday 30 June 2010 05:28:39 pm Derick Winkworth
wrote:

> I understand
> things will get much, much better in 10.3 thru 10.5.

Without any confirmation from anyone at Juniper, I suspect
the same. This would mirror experiences with JUNOS 9,
where anything pre-9.3 was really terrible.

When I look back at JUNOS 8, 8.5 seems to be the favorite,
although I see 8.1 has EEOL (well, that seems to have run
out too, last May).

If we trend this, it would make sense to stay on 8.5 and 9.6
until 10.3 is out, and remain on that up to 10.6 until 11.3
is out, and then on that till 11.6 until 12.3 is out... see
a pattern :-).

Mark.


mtinka at globaltransit

Jun 30, 2010, 11:36 PM

Post #16 of 16
Re: JUNOS and MX Trio cards

On Thursday 01 July 2010 05:27:26 am Joe Hughes wrote:

> I began an exercise a few months back researching the
> options available to replace some of our Cisco gear with
> Juniper. At the time - it was looking like a combination
> of the M7i and the EX series switches -

We implemented this combo for some Metro deployments in our
attempt to have a non-STP-based control plane in the Access.
It works quite well.

But the MX80 makes much more sense now.

> but since
> learning the EX has limitations in regard to MPLS and
> the fact the M7i is getting old - the MX looked a
> perfect candidate; decent port density with sufficient
> horsepower. Despite the attractiveness of the platform,
> I'm not sure I could cope with the sleepless nights.

We couldn't wait to get the Trio-based cards and moved to
purchase our new batch of MX480 DPC's. Even if we'd gotten
them (which would have been several months later), tons of
bugs would need to be worked out (recall the start of this
thread).

The real PITA is that the Trio cards will give you more
value for money when you start looking at platforms like the
MX240 or higher. Just that the code sucks today. I mean,
what Richard was trying to do was pretty stock. If this
issue is not limited to the batch of kit he received, JUNOS
has really become something else.

Mark.
