Mailing List Archive: nsp: extreme

Suggestions?

pkranz at unwiredltd

Mar 16, 2006, 9:14 AM

Post #1 of 3 (1230 views)
Suggestions?

Greetings,
 Our Extreme BlackDiamond 6808 crashed last night in such a way that
it could not be reached via its serial ports or IP addresses. We would
appreciate any suggestions on how to prevent this from recurring. The
following information was collected:

<A hard crash occurred about 20 minutes after the following error>
03/16/2006 03:04:52.01 <Crit:SYST> Sys-health-check [EDP] checksum error
(slow-path) on MSM-A, port 0xb
03/16/2006 03:04:29.77 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-A. prev=56 cur=5b
03/16/2006 03:04:28.01 <Crit:SYST> Sys-health-check [DIAG] First 16 bytes of
unknown pkt (slow-path) on slot 7
03/16/2006 03:04:15.00 <Crit:SYST> Sys-health-check [DIAG] First 16 bytes of
unknown pkt (slow-path) on slot 2
03/16/2006 03:04:07.01 <Crit:SYST> Sys-health-check [DIAG] First 16 bytes of
unknown pkt (slow-path) on slot 1
03/16/2006 03:04:04.79 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-B. prev=0 cur=7
03/16/2006 03:04:04.77 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-A. prev=4c cur=51
03/16/2006 03:03:55.02 <Crit:KERN> Sys-health-check [CPU] checksum error
(slow-path) on slot 8
03/16/2006 03:03:00.01 <Crit:SYST> Sys-health-check [DIAG] First 16 bytes of
unknown pkt (slow-path) on slot 7
03/16/2006 03:02:58.77 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-A. prev=47 cur=4c

Also the following hidden errors were seen:
03/16/2006 03:56:07.23 <Crit:ENG> PBUS SYNC ERROR (2) got=100ff expected=ff
on MSM-A
03/16/2006 03:56:02.23 <Crit:ENG> PBUS SYNC ERROR (1) got=100ff expected=ff
on MSM-A
03/16/2006 03:55:57.23 <Crit:ENG> PBUS SYNC ERROR (0) got=100ff expected=ff
on MSM-A
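
For anyone triaging a similar log, here is a minimal Python sketch that re-joins the wrapped lines and tallies the Sys-health-check errors per MSM or slot. The function names and regular expressions are illustrative only, and treating prev/cur as hexadecimal counters is an assumption based on values such as "5b" and "4c" in the log above, not something the log format documents.

```python
import re
from collections import Counter

# A few entries copied from the log above, still wrapped as in the mail.
LOG = """\
03/16/2006 03:04:52.01 <Crit:SYST> Sys-health-check [EDP] checksum error
(slow-path) on MSM-A, port 0xb
03/16/2006 03:04:29.77 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-A. prev=56 cur=5b
03/16/2006 03:04:04.79 <Crit:KERN> Sys-health-check [INT] checksum error
(fast-path) on MSM-B. prev=0 cur=7
03/16/2006 03:03:55.02 <Crit:KERN> Sys-health-check [CPU] checksum error
(slow-path) on slot 8
"""

# An entry starts with a timestamp; anything else is a wrapped continuation.
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2} ")
ENTRY = re.compile(
    r"Sys-health-check \[(?P<kind>\w+)\].*?\((?P<path>slow|fast)-path\) "
    r"on (?P<where>MSM-[AB]|slot \d+)"
)
# Assumption: prev/cur are hexadecimal counters.
COUNTERS = re.compile(r"prev=(?P<prev>[0-9a-f]+) cur=(?P<cur>[0-9a-f]+)")


def unwrap(text):
    """Join mail-wrapped continuation lines back onto their log entry."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def summarize(text):
    """Return (error count per location, list of (location, cur - prev))."""
    per_location = Counter()
    deltas = []
    for entry in unwrap(text):
        match = ENTRY.search(entry)
        if not match:
            continue
        per_location[match["where"]] += 1
        counters = COUNTERS.search(entry)
        if counters:
            delta = int(counters["cur"], 16) - int(counters["prev"], 16)
            deltas.append((match["where"], delta))
    return per_location, deltas


if __name__ == "__main__":
    counts, deltas = summarize(LOG)
    for where, count in counts.most_common():
        print(f"{where}: {count} error(s)")
    for where, delta in deltas:
        print(f"{where}: counter advanced by {delta}")
```

A tally like this makes it easier to see whether the errors cluster on one MSM or are spread across several slots.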

Peter Kranz
Founder/CEO - Unwired Ltd
www.UnwiredLtd.com
Desk: 510-868-1614 x100
Mobile: 510-207-0000
Fax: 510-217-6031
pkranz at unwiredltd.com


xds at LanGame

Mar 17, 2006, 2:11 AM

Post #2 of 3 (1117 views)
Suggestions? [In reply to]

Which version of EW is in use? EW checks the health of the
communication between the MSM CPU module and the I/O modules; the check
is made using EDP packets. In this case I think there is a loss of
synchronization between the CPU and some of your I/O modules, so a large
number of packets lose the right path and some packet memory mapping is
wrong. You can try this command:

config sys-health-check auto-recovery <number of tries> [offline | online] (BlackDiamond)

This will try to perform packet memory scanning and mapping, but use it
with care: if you specify offline, the MSM will shut down the faulty module.
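
As a rough illustration of how the template above fills in, the Python sketch below builds the command string and rejects anything other than the two modes shown. The helper name is hypothetical, and the requirement that the number of tries be a positive integer is an assumption; the valid range depends on the EW release and should be checked against its command reference.

```python
VALID_MODES = ("offline", "online")


def auto_recovery_command(tries, mode="online"):
    """Build the auto-recovery command line from the template in the post.

    Assumption: 'tries' must be a positive integer; the real limits are
    not stated in the thread.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not isinstance(tries, int) or tries < 1:
        raise ValueError("tries must be a positive integer")
    return f"config sys-health-check auto-recovery {tries} {mode}"


if __name__ == "__main__":
    # 'online' is the default here because 'offline' can take a module
    # out of service, as the post warns.
    print(auto_recovery_command(3))
```

Defaulting to online is a design choice for this sketch only, made because the post cautions that offline can shut down a module.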



br,
CCNP Atanas Yankov
Network Administrator
AngelSoft Ltd.



pkranz at unwiredltd

Mar 17, 2006, 5:04 AM

Post #3 of 3 (1104 views)
Suggestions? [In reply to]

We are running 7.1.1b16.

I set up FDB scanning:

configure fdb-scan period 60
configure fdb-scan failure-action sys-health-check
enable fdb-scan slot 1
enable fdb-scan slot 2
enable fdb-scan slot 3
enable fdb-scan slot 4
enable fdb-scan slot 5
enable fdb-scan slot 6
enable fdb-scan slot 7
enable fdb-scan slot 8
enable fdb-scan slot MSM-A
enable fdb-scan slot MSM-B
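
Since the enable line repeats once per slot, the same list can be generated rather than typed. The Python sketch below only reproduces the command text shown above for pasting into a CLI session; the function name and the SLOTS list are illustrative, and it does not talk to the switch.

```python
def fdb_scan_commands(slots, period=60, failure_action="sys-health-check"):
    """Return the fdb-scan commands from the post for the given slots."""
    commands = [
        f"configure fdb-scan period {period}",
        f"configure fdb-scan failure-action {failure_action}",
    ]
    commands += [f"enable fdb-scan slot {slot}" for slot in slots]
    return commands


# Slots 1-8 plus both MSMs, matching the list in the post.
SLOTS = [str(n) for n in range(1, 9)] + ["MSM-A", "MSM-B"]

if __name__ == "__main__":
    print("\n".join(fdb_scan_commands(SLOTS)))
```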

And it spit this out almost immediately:

03/16/2006 08:49:12.95 <Warn:SYST> FDB Scan: entry 43/0 marked 'remapped'
03/16/2006 08:49:12.95 <Warn:SYST> FDB Scan: (1) sw/hw dpath mismatch slot 1
bucket 43 entry 0

I hope that resolves the issue!

Peter Kranz
Founder/CEO - Unwired Ltd
www.UnwiredLtd.com
Desk: 510-868-1614 x100
Mobile: 510-207-0000
pkranz at unwiredltd.com

_______________________________________________
extreme-nsp mailing list
extreme-nsp at puck.nether.net
https://puck.nether.net/mailman/listinfo/extreme-nsp
