
Mailing List Archive: Quagga: Users

sick quagga syndrome

 

 



mike-quagga at tiedyenetworks

Dec 16, 2011, 9:31 AM

sick quagga syndrome

Hi group,

I've raised this before without getting a response. It relates to
earlier list posts about bad state in ospfd that can spread across
neighbors and can only be cleared by restarting the affected processes.
It may also be a good argument for adding a 'fuzzer' or a test suite to
ospfd to try and shake some of this stuff out.
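
One way to picture the 'fuzzer' idea is a generator of slightly corrupted packets. What follows is only a minimal Python sketch under stated assumptions: `bit_flip_mutations` is a hypothetical helper, not part of Quagga, and it does not cover how the mutated bytes would be delivered to ospfd.

```python
import random
from typing import Iterator


def bit_flip_mutations(packet: bytes, count: int, seed: int = 0) -> Iterator[bytes]:
    """Yield `count` copies of `packet`, each with exactly one random bit flipped.

    Hypothetical helper for illustration only. The seed makes a failing
    case reproducible.
    """
    if not packet:
        raise ValueError("packet must not be empty")
    rng = random.Random(seed)
    for _ in range(count):
        mutated = bytearray(packet)
        pos = rng.randrange(len(mutated))      # which byte to corrupt
        mutated[pos] ^= 1 << rng.randrange(8)  # which bit inside that byte
        yield bytes(mutated)
```

Each mutated packet could then be handed to whatever test harness exercises the daemon's packet parser; that harness is outside the scope of this sketch.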



>
> Sick quagga syndrome
>
> Scenario:
>
> west, east, and cisco are running ospf and are in area 0. During a
> switch upgrade (unplug from old, connect to new), these three routers
> develop a shared, bad ospf state that can only be cleared by manually
> killing and restarting the ospf process on all three routers. Please note
> that my clocks are set to the wrong times, but there are no out-of-step
> ticks, backwards-running clocks or other fatal time problems; the
> timestamps are simply wrong for the purposes of these logs.
>
> Players:
>
> Router 'west' - Quagga 0.9.16, linux 2.6.33.4, loopback ip xx.67
> Router 'east' - Quagga 0.9.16, linux 2.6.33.4, loopback ip xx.68
> Router 'cisco' - c3725-is-mz.123-26.bin, loopback ip xx.66
>
> Neighbor relationships:
>
> east:
>
> Neighbor ID  Pri  State         Dead Time  Address       Interface             RXmtL  RqstL  DBsmL
> xx.67        1    Full/DROther  1.496s     10.0.1.225    vlan2:10.0.1.226      0      0      0
> xx.66        5    Full/Backup   1.500s     10.0.1.227    vlan2:10.0.1.226      0      0      0
> xx.67        200  Full/Backup   57.496s    172.16.10.1   vlan103:172.16.10.4   0      0      0
> xx.67        10   Full/DR       17.496s    172.16.24.26  vlan105:172.16.24.27  0      0      0
>
> west:
>
> Neighbor ID  Pri  State        Dead Time  Address       Interface             RXmtL  RqstL  DBsmL
> xx.68        1    Full/DR      1.711s     10.0.1.226    vlan2:10.0.1.225      0      0      0
> xx.66        5    Full/Backup  1.696s     10.0.1.227    vlan2:10.0.1.225      0      0      0
> xx.68        200  Full/DR      59.713s    172.16.10.4   vlan103:172.16.10.1   0      0      0
> xx.68        5    Full/Backup  19.713s    172.16.24.27  vlan105:172.16.24.26  0      0
>
> Cisco:
>
> xx.67 1 FULL/DROTHER 00:00:01 10.0.1.225 FastEthernet0/0
> xx.68 1 FULL/DR 00:00:01 10.0.1.226 FastEthernet0/0
>
>
> Start of problem:
>
> The switches are being consolidated from 2 x 24 port gigE to 1 x 48
> port gigE, which involves merely moving each ethernet connection one at a
> time from the old switches to the new one. Interlinks are established to
> maintain full connectivity between devices on the old switches and those
> already moved, just to keep things smooth and make recovery possible if
> things go badly. Each move causes the expected link down/link up and, of
> course, ospf link state updates and so forth, since most ethernet moves
> take 30 seconds or longer to complete.
>
> Over on east, we begin to see syslog messages flooding the circular
> buffer. Note that 1033 copies of the following message were logged in this
> one-second period:
>
> Oct 21 18:30:33 eastbridge daemon.warn ospfd[2586]: Link State Update: LSA checksum error 76f, ba17.
> Oct 21 18:30:33 eastbridge daemon.warn ospfd[2586]: Link State Update: LSA checksum error 76f, ba17.
> Oct 21 18:30:33 eastbridge daemon.warn ospfd[2586]: Link State Update: LSA checksum error 76f, ba17.
>
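
A per-second tally like the 1033 figure above can be reproduced from a saved log with a few lines of Python. This is a minimal sketch that assumes the syslog layout shown in the quoted lines (timestamp, host, facility.level, `ospfd[pid]:`, message). `count_per_second` and `LINE_RE` are illustrative names, not anything shipped with Quagga.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed layout, taken from the quoted lines:
#   Oct 21 18:30:33 eastbridge daemon.warn ospfd[2586]: <message>
# A leading "> " quote marker is tolerated so pasted mail text also works.
LINE_RE = re.compile(
    r"^(?:>\s*)?(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"
    r"(?P<host>\S+)\s+\S+\s+ospfd\[\d+\]:\s+(?P<msg>.*)$"
)


def count_per_second(lines: Iterable[str]) -> Counter:
    """Count identical ospfd messages per (timestamp, message) pair."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip lines that are not ospfd syslog entries
            counts[(match["ts"], match["msg"])] += 1
    return counts
```

Calling `count_per_second(open(path))` on a saved log and then `.most_common(5)` on the result would list the noisiest message/second pairs first.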
> Over on west, it begins noticing problems with east and cisco:
>
> Oct 21 17:45:44 westbridge daemon.warn ospfd[2903]: Link State Update: Unknown Neighbor 216.7.64.68 on int: vlan103:172.16.10.1
> Oct 21 17:45:44 westbridge daemon.warn ospfd[2903]: Link State Update: Unknown Neighbor 216.7.64.68 on int: vlan103:172.16.10.1
> Oct 21 17:45:45 westbridge daemon.warn ospfd[2903]: Link State Acknowledgment: Neighbor[65.127.32.66] state ExStart is less than Exchange
> Oct 21 17:45:46 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[216.7.64.68] state ExStart is less than Exchange
> Oct 21 17:45:46 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[216.7.64.68] state ExStart is less than Exchange
> Oct 21 17:45:46 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[216.7.64.68] state ExStart is less than Exchange
> Oct 21 17:45:47 westbridge daemon.warn ospfd[2903]: Link State Acknowledgment: Neighbor[65.127.32.66] state ExStart is less than Exchange
> Oct 21 17:45:48 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[216.7.64.68] state ExStart is less than Exchange
> Oct 21 17:45:48 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[216.7.64.68] state ExStart is less than Exchange
> Oct 21 17:45:48 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[65.127.32.66] state ExStart is less than Exchange
> Oct 21 17:45:48 westbridge daemon.warn ospfd[2903]: Link State Update: Neighbor[12.149.130.17] state ExStart is less than Exchange
>
> Meanwhile on cisco, we start getting BAD LSA CHKSUM errors flooding its
> logging buffer, complaining about west:
>
> (start of switch cutover)
> Oct 21 17:35:32: %OSPF-5-ADJCHG: Process 1, Nbr 216.7.64.67 on FastEthernet0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
> Oct 21 17:36:22: %OSPF-5-ADJCHG: Process 1, Nbr 216.7.64.67 on FastEthernet0/0 from LOADING to FULL, Loading Done
> Oct 21 17:36:23: %OSPF-4-BADLSATYPE: Invalid lsa: Bad LSA chksum Type 2, Length 32, LSID 172.16.24.26 from 216.7.64.67, 10.0.1.225, FastEthernet0/0
> Oct 21 17:36:29: %OSPF-4-BADLSATYPE: Invalid lsa: Bad LSA chksum Type 2, Length 32, LSID 172.16.24.26 from 216.7.64.67, 10.0.1.225, FastEthernet0/0
> Oct 21 17:36:35: %OSPF-4-BADLSATYPE: Invalid lsa: Bad LSA chksum Type 2, Length 32, LSID 172.16.24.26 from 216.7.64.67, 10.0.1.225, FastEthernet0/0
> Oct 21 17:36:42: %OSPF-4-BADLSATYPE: Invalid lsa: Bad LSA chksum Type 2, Length 32, LSID 172.16.24.26 from 216.7.64.67, 10.0.1.225, FastEthernet0/0
>
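
Both east and cisco are rejecting LSAs on checksum grounds. For reference, RFC 2328 defines the LSA checksum as a Fletcher checksum over the whole LSA except the LS age field, stored at byte offset 16 of the LSA header. The Python sketch below shows that calculation and the matching check. The function names are illustrative, the code is not taken from ospfd or IOS, and it assumes the bytes passed in are exactly one LSA.

```python
from typing import Tuple

AGE_LEN = 2           # LS age (first two bytes) is excluded from the checksum
CHECKSUM_OFFSET = 16  # position of the 2-byte checksum field in the LSA header


def fletcher(data: bytes) -> Tuple[int, int]:
    """Return the two running Fletcher sums (c0, c1), each modulo 255."""
    c0 = c1 = 0
    for byte in data:
        c0 = (c0 + byte) % 255
        c1 = (c1 + c0) % 255
    return c0, c1


def lsa_checksum(lsa: bytes) -> int:
    """Compute the 16-bit checksum for an LSA, ignoring any value already stored."""
    body = bytearray(lsa[AGE_LEN:])
    off = CHECKSUM_OFFSET - AGE_LEN
    body[off:off + 2] = b"\x00\x00"  # checksum field is zero while summing
    c0, c1 = fletcher(bytes(body))
    x = ((len(body) - off - 1) * c0 - c1) % 255
    if x == 0:
        x = 255
    y = 510 - c0 - x
    if y > 255:
        y -= 255
    return (x << 8) | y


def lsa_checksum_ok(lsa: bytes) -> bool:
    """A stored checksum is valid when both sums over the body come out to zero."""
    return fletcher(lsa[AGE_LEN:]) == (0, 0)
```

Because LS age is left out of the sum, a router can age an LSA without recomputing the checksum, while a change to any other byte should make `lsa_checksum_ok` return `False`.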
>
> Resolution:
>
> On cisco, clear ip ospf was used to reset the ospf process. This
> stopped the 'LSA checksum error 76f, ba17' messages on east.
>
> On west, a kill/restart of ospfd was performed, which stopped the
> messages on cisco and restored neighbor adjacencies.
>
>
>
>
Attachments: sickquagga (5.50 KB)
