
Elmar.Vonlanthen at united-security-providers
Oct 10, 2011, 4:43 AM
Re: ospfd system call 'sendmsg' causes a kerneloops
Hello again

I'm sorry, I gave the wrong line number in ospf_packet.c. The correct
line is 737 (inside ospf_write(), not ospf_write_frags()):
    ret = sendmsg (ospf->fd, &msg, flags);

Here is some more information from the zlog_warn messages I added.
This time there was no kernel oops:

2011-10-07 17:03:36 chgut11fw01 ospfd[17932]: ospf_write (before write frags): dst=224.0.0.5, id=28384, off=0, len=64, iface=gAAchgut1, mtu=1356
2011-10-07 17:03:36 chgut11fw01 ospfd[17932]: ospf_write (before htosys): dst=224.0.0.5, id=28384, off=0, len=64, iface=gAAchgut1, mtu=1356
2011-10-07 17:03:36 chgut11fw01 ospfd[17932]: ospf_write (after htosys): dst=224.0.0.5, id=57454, off=0, len=16384, iface=gAAchgut1, mtu=1356
2011-10-07 17:03:36 chgut11fw01 ospfd[17932]: ospf_write (after sendmsg): dst=224.0.0.5, id=57454, off=0, len=16384, iface=gAAchgut1, mtu=1356

And this time there was a kernel oops (note that no "after sendmsg"
line was logged):

2011-10-07 17:03:37 chgut11fw01 ospfd[17932]: ospf_write (before write frags): dst=224.0.0.5, id=28429, off=0, len=64, iface=gAAchgut1, mtu=1356
2011-10-07 17:03:37 chgut11fw01 ospfd[17932]: ospf_write (before htosys): dst=224.0.0.5, id=28429, off=0, len=64, iface=gAAchgut1, mtu=1356
2011-10-07 17:03:37 chgut11fw01 ospfd[17932]: ospf_write (after htosys): dst=224.0.0.5, id=3439, off=0, len=16384, iface=gAAchgut1, mtu=1356

Best regards
Elmar

> -----Original Message-----
> From: quagga-users-bounces [at] lists [mailto:quagga-users-bounces [at] lists] On Behalf Of Vonlanthen, Elmar
> Sent: Monday, 10 October 2011 11:54
> To: quagga-users [at] lists
> Subject: [quagga-users 12498] ospfd system call 'sendmsg' causes a kerneloops
>
> Hello all
>
> I have a weird problem with a kernel oops, if quagga tries to send a
> multicast packet.
>
> This is the setup I have:
>
> [ host chgut11fw01 ] ================================================== [ host chgut1fw01 ]
>
> (dev gAAchgut1) 10.254.11.1 ---------gre/ospf--------- 10.254.1.11 (dev gAAchgut11)
>                      V                                      V
> (dev eth1)      10.10.111.4 -----strongswan/ipsec----- 10.10.101.4 (dev eth1)
>
>
> Now, if I restart ipsec or reconfigure the gre tunnel (ip tun del ...;
> ip tun add ...; ip link set...; ip addr add ...) on host chgut11fw01 in
> a loop, after some seconds a kernel oops happens:
>
> 2011-10-10 09:32:08 chgut11fw01 pluto[16283]: interface gAAchgut1 activated
> 2011-10-10 09:32:08 chgut11fw01 pluto[16283]: 10.254.11.1 appeared on gAAchgut1
> 2011-10-10 09:32:08 chgut11fw01 ospfd[17011]: ospfTrapIfStateChange trap sent: 10.254.11.1 now Point-To-Point
> 2011-10-10 09:32:08 chgut11fw01 ospfd[17011]: interface 10.254.11.1 [59605] join AllSPFRouters Multicast group.
> 2011-10-10 09:32:08 chgut11fw01 kernel: skb_over_panic: text:c1305749 len:64 put:64 head:c7ea6a00 data:c7ea6a50 tail:0xc7ea6a90 end:0xc7ea6a80 dev:<NULL>
> 2011-10-10 09:32:08 chgut11fw01 kernel: ------------[ cut here ]------------
> 2011-10-10 09:32:09 chgut11fw01 kernel: kernel BUG at net/core/skbuff.c:128!
> 2011-10-10 09:32:09 chgut11fw01 kernel: invalid opcode: 0000 [#40] SMP
> 2011-10-10 09:32:09 chgut11fw01 kernel: Modules linked in: nf_conntrack_netlink nfnetlink ip_gre gre authenc xfrm4_mode_transport deflate zlib_deflate ctr twofish_generic twofish_common serpent cryptd aes_i586 aes_generic blowfish cast5 cbc ecb rmd160 sha512_generic sha256_generic sha1_generic xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key tun ipt_LOG xt_limit ipt_REJECT xt_state ipt_REDIRECT ipt_MASQUERADE xt_policy xt_TCPMSS xt_tcpmss xt_tcpudp xt_NOTRACK iptable_filter iptable_nat xt_mark xt_connmark iptable_mangle iptable_raw ip_tables x_tables nf_conntrack_tftp nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack rtc ppdev parport_pc parport w83792d i2c_dev i2c_i801 i2c_core pl2303 usbserial coretemp hwmon usbhid ohci_hcd uhci_hcd ehci_hcd usbcore e1000 e1000e aufs ata_piix libata
> 2011-10-10 09:32:09 chgut11fw01 kernel:
> 2011-10-10 09:32:09 chgut11fw01 kernel: Pid: 17011, comm: ospfd Tainted: G D 3.0.4-SMP #1 /LakePort
> 2011-10-10 09:32:09 chgut11fw01 kernel: EIP: 0060:[<c12b6575>] EFLAGS: 00210296 CPU: 0
> 2011-10-10 09:32:09 chgut11fw01 kernel: EIP is at skb_put+0x85/0x90
> 2011-10-10 09:32:09 chgut11fw01 kernel: EAX: 00000078 EBX: c1305749 ECX: 00200082 EDX: ffffff8b
> 2011-10-10 09:32:09 chgut11fw01 kernel: ESI: dd6420c0 EDI: 00000040 EBP: d7d23c7c ESP: d7d23c54
> 2011-10-10 09:32:09 chgut11fw01 kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> 2011-10-10 09:32:09 chgut11fw01 kernel: Process ospfd (pid: 17011, ti=d7d22000 task=de908a90 task.ti=d7d22000)
> 2011-10-10 09:32:09 chgut11fw01 kernel: Stack:
> 2011-10-10 09:32:09 chgut11fw01 kernel: c13fd19c c1305749 00000040 00000040 c7ea6a00 c7ea6a50 c7ea6a90 c7ea6a80
> 2011-10-10 09:32:09 chgut11fw01 kernel: c13fb224 d83d90c0 d7d23d34 c1305749 d7d23d20 02916cfd 4c9e613e 00000001
> 2011-10-10 09:32:09 chgut11fw01 kernel: 00000001 d7d23ce4 d7d23cc8 00000000 0000e8d5 d7d23ecc caddb300 c7ea6a50
> 2011-10-10 09:32:09 chgut11fw01 kernel: Call Trace:
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c1305749>] ? raw_sendmsg+0x5a9/0x850
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c1305749>] raw_sendmsg+0x5a9/0x850
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c1305749>] raw_sendmsg+0x5a9/0x850
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12ccebd>] ? __rtnl_unlock+0xd/0x10
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12c06bd>] ? netdev_run_todo+0x3d/0x220
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c133d400>] ? _raw_read_lock_irq+0x20/0x20
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c130efa2>] inet_sendmsg+0x42/0xa0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12ea0e5>] ? do_ip_setsockopt+0x75/0xbe0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c10213e7>] ? __wake_up_sync_key+0x47/0x60
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12af947>] sock_sendmsg+0xa7/0xd0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b5df9>] ? skb_queue_tail+0x39/0x50
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12af947>] ? sock_sendmsg+0xa7/0xd0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b8be3>] ? verify_iovec+0x53/0xb0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b0640>] __sys_sendmsg+0x2f0/0x300
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12af9de>] ? sockfd_lookup_light+0x1e/0x70
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b003a>] ? sys_sendto+0xaa/0xe0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c1047d65>] ? sched_clock_local+0xa5/0x180
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c102f898>] ? nsecs_to_jiffies+0x8/0x10
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12eac91>] ? ip_setsockopt+0x41/0xa0
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b0796>] sys_sendmsg+0x36/0x60
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c12b13c9>] sys_socketcall+0xe9/0x280
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c133d6a5>] syscall_call+0x7/0xb
> 2011-10-10 09:32:09 chgut11fw01 kernel: [<c1330000>] ? packet_recvmsg+0x410/0x440
> 2011-10-10 09:32:09 chgut11fw01 kernel: Code: 00 00 89 4c 24 14 8b 88 a4 00 00 00 89 54 24 0c 89 4c 24 10 8b 40 50 89 5c 24 04 c7 04 24 9c d1 3f c1 89 44 24 08 e8 dc 4b 08 00 <0f> 0b eb fe b9 24 b2 3f c1 eb ae 55 89 e5 57 56 89 d6 53 89 c3
> 2011-10-10 09:32:09 chgut11fw01 kernel: EIP: [<c12b6575>] skb_put+0x85/0x90 SS:ESP 0068:d7d23c54
> 2011-10-10 09:32:09 chgut11fw01 kernel: ---[ end trace 88d35f34d6a6da4a ]---
> 2011-10-10 09:32:09 chgut11fw01 pluto[16283]: "chgut1_aa" #2: Dead Peer Detection (RFC 3706) enabled
> 2011-10-10 09:32:09 chgut11fw01 pluto[16283]: "chgut1_aa" #2: sent QI2, IPsec SA established {ESP=>0xc4d50da8 <0xcfd0a69f}
>
> I'm not sure if quagga is really the problem, but it is ospfd which
> executes the system call "sendmsg" in ospfd/ospf_packet.c on line 568
> (version 0.99.20):
>     ret = sendmsg (fd, msg, flags);
>
> I can reproduce the behavior with the following software versions:
> - quagga 0.99.17, 0.99.18, 0.99.19, 0.99.20
> - kernel (vanilla) 2.6.35.10, 2.6.35.14, 3.0.4 (each one with SMP enabled)
> - strongswan 4.3.5, 4.5.3
>
> Unfortunately I am only able to reproduce the error on the following
> hardware:
> Nexcom NSA-1068N7
> (http://www.omtec.de/network-security-appliances/nsa-1086n7/
>
> This is the quagga configuration on chgut11fw01 (the config on
> chgut1fw01 is similar):
> ospfd.conf:
> hostname chgut11fw01
> log syslog
>
> interface gAAchgut1
>  ip ospf network point-to-point
>  ip ospf authentication-key hallowel
>  ip ospf cost 15
>  ip ospf hello-interval 2
>  ip ospf dead-interval 10
>
> interface eth0
> interface eth1
> interface eth2
> interface eth3
> interface lo
>
> router ospf
>  ospf router-id 172.16.111.1
>  auto-cost reference-bandwidth 1
>  redistribute kernel route-map mynet
>  no ospf rfc1583compatibility
>  passive-interface eth0
>  passive-interface eth1
>  passive-interface eth2
>  passive-interface eth3
>  network 10.254.11.1/32 area 0
>  network 10.254.1.11/32 area 0
>  network 172.16.111.0/24 area 0
>
> access-list 99 permit 172.16.111.0 0.0.0.255
> access-list 99 deny any
> route-map mynet permit 10
>  match ip address 99
> line vty
>
> zebra.conf:
> hostname chgut11fw01
> log syslog
> table 254
>
> Do you have any ideas?
>
> Thank you very much.
>
> Elmar Vonlanthen