Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Quagga: Dev

Re: LSA checksum error - diff

 

 

Quagga dev RSS feed   Index | Next | Previous | View Threaded


jafojtik at seznam

Oct 3, 2011, 1:56 PM

Post #1 of 5 (555 views)
Permalink
Re: LSA checksum error - diff

Dear Balaji,

> Can you send the fix as a patch so that it becomes easy to review the
> changes per se

here is it. If you like it, I can improve it a little bit.

--------------------------------------------------------------------------


*** M:\quagga\quagga-0.99.18\ospfd\ospf_packet.c Mon Mar 21 13:09:14 2011
--- P:\USR_SRC\quagga-0.99.18\ospfd\ospf_packet.c Mon Oct 3 22:53:08 2011
***************
*** 1528,1533 ****
--- 1528,1537 ----
list_free (ls_upd);
}

+
+ static int LSA_CHECKSUM_ERR_COUNT = 0;
+ static int LSA_CHECKSUM_ERR_SEC = 0;
+
/* Get the list of LSAs from Link State Update packet.
And process some validation -- RFC2328 Section 13. (1)-(2). */
static struct list *
***************
*** 1561,1568 ****
sum = lsah->checksum;
if (sum != ospf_lsa_checksum (lsah))
{
! zlog_warn ("Link State Update: LSA checksum error %x, %x.",
sum, lsah->checksum);
continue;
}

--- 1565,1590 ----
sum = lsah->checksum;
if (sum != ospf_lsa_checksum (lsah))
{
! struct timeval clock;
!
! gettimeofday(&clock, NULL);
! LSA_CHECKSUM_ERR_COUNT++;
!
! if(LSA_CHECKSUM_ERR_COUNT==1)
! {
! zlog_warn("Link State Update: LSA checksum error %x, %x.",
sum, lsah->checksum);
+ LSA_CHECKSUM_ERR_SEC = clock.tv_sec;
+ }
+ else
+ {
+ if(labs(clock.tv_sec-LSA_CHECKSUM_ERR_SEC)>60)
+ {
+ zlog_warn("Link State Update: LSA checksum error %x, %x repeated %d times.",
+ sum, lsah->checksum, LSA_CHECKSUM_ERR_COUNT);
+ LSA_CHECKSUM_ERR_COUNT = 0;
+ }
+ }
continue;
}


--------------------------------------------------------------------------

regards
Jara

_______________________________________________
Quagga-dev mailing list
Quagga-dev [at] lists
http://lists.quagga.net/mailman/listinfo/quagga-dev


shemminger at vyatta

Oct 3, 2011, 3:27 PM

Post #2 of 5 (532 views)
Permalink
Re: LSA checksum error - diff [In reply to]

On Mon, 03 Oct 2011 22:56:55 +0200
"Jaroslav Fojtik" <jafojtik [at] seznam> wrote:

> Dear Balaji,
>
> > Can you send the fix as a patch so that it becomes easy to review the
> > changes per se
>
> here is it. If you like it, I can improve it a little bit.
>
> --------------------------------------------------------------------------
>
>
> *** M:\quagga\quagga-0.99.18\ospfd\ospf_packet.c Mon Mar 21 13:09:14 2011
> --- P:\USR_SRC\quagga-0.99.18\ospfd\ospf_packet.c Mon Oct 3 22:53:08 2011
> ***************
> *** 1528,1533 ****
> --- 1528,1537 ----
> list_free (ls_upd);
> }
>
> +
> + static int LSA_CHECKSUM_ERR_COUNT = 0;
> + static int LSA_CHECKSUM_ERR_SEC = 0;
> +
> /* Get the list of LSAs from Link State Update packet.
> And process some validation -- RFC2328 Section 13. (1)-(2). */
> static struct list *
> ***************
> *** 1561,1568 ****
> sum = lsah->checksum;
> if (sum != ospf_lsa_checksum (lsah))
> {
> ! zlog_warn ("Link State Update: LSA checksum error %x, %x.",
> sum, lsah->checksum);
> continue;
> }
>
> --- 1565,1590 ----
> sum = lsah->checksum;
> if (sum != ospf_lsa_checksum (lsah))
> {
> ! struct timeval clock;
> !
> ! gettimeofday(&clock, NULL);
> ! LSA_CHECKSUM_ERR_COUNT++;
> !
> ! if(LSA_CHECKSUM_ERR_COUNT==1)
> ! {
> ! zlog_warn("Link State Update: LSA checksum error %x, %x.",
> sum, lsah->checksum);
> + LSA_CHECKSUM_ERR_SEC = clock.tv_sec;
> + }
> + else
> + {
> + if(labs(clock.tv_sec-LSA_CHECKSUM_ERR_SEC)>60)
> + {
> + zlog_warn("Link State Update: LSA checksum error %x, %x repeated %d times.",
> + sum, lsah->checksum, LSA_CHECKSUM_ERR_COUNT);
> + LSA_CHECKSUM_ERR_COUNT = 0;
> + }
> + }
> continue;
> }
>
>
> --------------------------------------------------------------------------
>
> regards
> Jara
>

Thank you for the patch. The concept is good but next time please try
and follow the guidelines in the HACKING file.

1. variable names shouldn't be all upper case (that is reserved for macros).
2. use 'diff -up ' to get better patch format.

I think a more general solution (inspired by linux kernel netratelimit())
would be something like the following (untested):

diff --git a/lib/thread.c b/lib/thread.c
index fd841c2..8745c6f 100644
--- a/lib/thread.c
+++ b/lib/thread.c
@@ -216,6 +216,39 @@ recent_relative_time (void)
return relative_time;
}

+/* Limit messages from the same place */
+int __ratelimit(struct msg_ratelimit *rt, const char *func)
+{
+ struct timeval now;
+
+ /* if no limit on messages per second? */
+ if (rt->interval == 0)
+ return 1;
+
+ quagga_real_stabilised (&now);
+ if (rt->begin.tv_sec == 0 && rt->begin.tv_usec == 0)
+ rt->begin = now;
+
+ else if (timeval_elapsed (now, rt->begin) / TIMER_SECOND_MICRO > rt->interval)
+ {
+ if (rt->missed)
+ zlog_warn("%s: %d messages surpressed\n",
+ func, rt->missed);
+ rt->begin = now;
+ rt->printed = 0;
+ rt->missed = 0;
+ }
+
+ if (rt->burst && rt->burst > rt->printed) {
+ rt->printed++;
+ return 1;
+ } else {
+ rt->missed++;
+ return 0;
+ }
+}
+
+
static unsigned int
cpu_record_hash_key (struct cpu_thread_history *a)
{
diff --git a/lib/thread.h b/lib/thread.h
index 978aa6b..78d4ce8 100644
--- a/lib/thread.h
+++ b/lib/thread.h
@@ -216,4 +216,21 @@ extern unsigned long thread_consumed_time(RUSAGE_T *after, RUSAGE_T *before,
extern struct timeval recent_time;
/* Similar to recent_time, but a monotonically increasing time value */
extern struct timeval recent_relative_time (void);
+
+/* Allow control of rate limit of messages. Used to avoid log
+ overflow for protocol errors, etc. */
+#define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
+#define DEFAULT_RATELIMIT_BURST 10
+
+struct msg_ratelimit {
+ struct timeval begin;
+ unsigned int interval;
+ unsigned int burst;
+ unsigned int printed;
+ unsigned int missed;
+};
+
+extern int __ratelimit(struct msg_ratelimit *rt, const char *func);
+#define ratelimit(state) __ratelimit(state, __func__)
+
#endif /* _ZEBRA_THREAD_H */
diff --git a/ospfd/ospf_packet.c b/ospfd/ospf_packet.c
index 0f338d3..d74b9c9 100644
--- a/ospfd/ospf_packet.c
+++ b/ospfd/ospf_packet.c
@@ -1535,6 +1535,11 @@ ospf_ls_req (struct ip *iph, struct ospf_header *ospfh,
list_free (ls_upd);
}

+static struct msg_ratelimit ospf_lsa_errorlimit = {
+ .interval = DEFAULT_RATELIMIT_INTERVAL,
+ .burst = DEFAULT_RATELIMIT_BURST,
+};
+
/* Get the list of LSAs from Link State Update packet.
And process some validation -- RFC2328 Section 13. (1)-(2). */
static struct list *
@@ -1560,7 +1565,8 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,

if (length > size)
{
- zlog_warn ("Link State Update: LSA length exceeds packet size.");
+ if (ratelimit(&ospf_lsa_errorlimit))
+ zlog_warn ("Link State Update: LSA length exceeds packet size.");
break;
}

@@ -1568,15 +1574,17 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,
sum = lsah->checksum;
if (sum != ospf_lsa_checksum (lsah))
{
- zlog_warn ("Link State Update: LSA checksum error %x, %x.",
- sum, lsah->checksum);
+ if (ratelimit(&ospf_lsa_errorlimit))
+ zlog_warn ("Link State Update: LSA checksum error %x, %x.",
+ sum, lsah->checksum);
continue;
}

/* Examine the LSA's LS type. */
if (lsah->type < OSPF_MIN_LSA || lsah->type >= OSPF_MAX_LSA)
{
- zlog_warn ("Link State Update: Unknown LS type %d", lsah->type);
+ if (ratelimit(&ospf_lsa_errorlimit))
+ zlog_warn ("Link State Update: Unknown LS type %d", lsah->type);
continue;
}

@@ -1600,7 +1608,8 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,
* This neighbor must know the exact usage of O-bit;
* the bit will be set in Type-9,10,11 LSAs only.
*/
- zlog_warn ("LSA[Type%d:%s]: O-bit abuse?", lsah->type, inet_ntoa (lsah->id));
+ if (ratelimit(&ospf_lsa_errorlimit))
+ zlog_warn ("LSA[Type%d:%s]: O-bit abuse?", lsah->type, inet_ntoa (lsah->id));
continue;
}
#endif /* STRICT_OBIT_USAGE_CHECK */




_______________________________________________
Quagga-dev mailing list
Quagga-dev [at] lists
http://lists.quagga.net/mailman/listinfo/quagga-dev


jafojtik at seznam

Oct 3, 2011, 11:52 PM

Post #3 of 5 (522 views)
Permalink
Re: LSA checksum error - diff [In reply to]

Dear Stephen,

> > > Can you send the fix as a patch so that it becomes easy to review the
> > > changes per se
> Thank you for the patch. The concept is good but next time please try
> and follow the guidelines in the HACKING file.
I has been solved compromising my computer caused by too many logs that I cannot
influence. For other patches I will look at it.

> 1. variable names shouldn't be all upper case (that is reserved for macros).
> 2. use 'diff -up ' to get better patch format.
>
> I think a more general solution (inspired by linux kernel netratelimit())
> would be something like the following (untested):
I hope that you are on a good way to make quagga more robust.

Next step should be to log also IP address of the mad peer that is spamming with
bad LSA packets.

And finally it would be nice to fix ospfd in such a way that it would not
generate corrupted packets.

regards
Jara



> diff --git a/lib/thread.c b/lib/thread.c
> index fd841c2..8745c6f 100644
> --- a/lib/thread.c
> +++ b/lib/thread.c
> @@ -216,6 +216,39 @@ recent_relative_time (void)
> return relative_time;
> }
>
> +/* Limit messages from the same place */
> +int __ratelimit(struct msg_ratelimit *rt, const char *func)
> +{
> + struct timeval now;
> +
> + /* if no limit on messages per second? */
> + if (rt->interval == 0)
> + return 1;
> +
> + quagga_real_stabilised (&now);
> + if (rt->begin.tv_sec == 0 && rt->begin.tv_usec == 0)
> + rt->begin = now;
> +
> + else if (timeval_elapsed (now, rt->begin) / TIMER_SECOND_MICRO > rt->interval)
> + {
> + if (rt->missed)
> + zlog_warn("%s: %d messages surpressed\n",
> + func, rt->missed);
> + rt->begin = now;
> + rt->printed = 0;
> + rt->missed = 0;
> + }
> +
> + if (rt->burst && rt->burst > rt->printed) {
> + rt->printed++;
> + return 1;
> + } else {
> + rt->missed++;
> + return 0;
> + }
> +}
> +
> +
> static unsigned int
> cpu_record_hash_key (struct cpu_thread_history *a)
> {
> diff --git a/lib/thread.h b/lib/thread.h
> index 978aa6b..78d4ce8 100644
> --- a/lib/thread.h
> +++ b/lib/thread.h
> @@ -216,4 +216,21 @@ extern unsigned long thread_consumed_time(RUSAGE_T *after, RUSAGE_T *before,
> extern struct timeval recent_time;
> /* Similar to recent_time, but a monotonically increasing time value */
> extern struct timeval recent_relative_time (void);
> +
> +/* Allow control of rate limit of messages. Used to avoid log
> + overflow for protocol errors, etc. */
> +#define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
> +#define DEFAULT_RATELIMIT_BURST 10
> +
> +struct msg_ratelimit {
> + struct timeval begin;
> + unsigned int interval;
> + unsigned int burst;
> + unsigned int printed;
> + unsigned int missed;
> +};
> +
> +extern int __ratelimit(struct msg_ratelimit *rt, const char *func);
> +#define ratelimit(state) __ratelimit(state, __func__)
> +
> #endif /* _ZEBRA_THREAD_H */
> diff --git a/ospfd/ospf_packet.c b/ospfd/ospf_packet.c
> index 0f338d3..d74b9c9 100644
> --- a/ospfd/ospf_packet.c
> +++ b/ospfd/ospf_packet.c
> @@ -1535,6 +1535,11 @@ ospf_ls_req (struct ip *iph, struct ospf_header *ospfh,
> list_free (ls_upd);
> }
>
> +static struct msg_ratelimit ospf_lsa_errorlimit = {
> + .interval = DEFAULT_RATELIMIT_INTERVAL,
> + .burst = DEFAULT_RATELIMIT_BURST,
> +};
> +
> /* Get the list of LSAs from Link State Update packet.
> And process some validation -- RFC2328 Section 13. (1)-(2). */
> static struct list *
> @@ -1560,7 +1565,8 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,
>
> if (length > size)
> {
> - zlog_warn ("Link State Update: LSA length exceeds packet size.");
> + if (ratelimit(&ospf_lsa_errorlimit))
> + zlog_warn ("Link State Update: LSA length exceeds packet size.");
> break;
> }
>
> @@ -1568,15 +1574,17 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,
> sum = lsah->checksum;
> if (sum != ospf_lsa_checksum (lsah))
> {
> - zlog_warn ("Link State Update: LSA checksum error %x, %x.",
> - sum, lsah->checksum);
> + if (ratelimit(&ospf_lsa_errorlimit))
> + zlog_warn ("Link State Update: LSA checksum error %x, %x.",
> + sum, lsah->checksum);
> continue;
> }
>
> /* Examine the LSA's LS type. */
> if (lsah->type < OSPF_MIN_LSA || lsah->type >= OSPF_MAX_LSA)
> {
> - zlog_warn ("Link State Update: Unknown LS type %d", lsah->type);
> + if (ratelimit(&ospf_lsa_errorlimit))
> + zlog_warn ("Link State Update: Unknown LS type %d", lsah->type);
> continue;
> }
>
> @@ -1600,7 +1608,8 @@ ospf_ls_upd_list_lsa (struct ospf_neighbor *nbr, struct stream *s,
> * This neighbor must know the exact usage of O-bit;
> * the bit will be set in Type-9,10,11 LSAs only.
> */
> - zlog_warn ("LSA[Type%d:%s]: O-bit abuse?", lsah->type, inet_ntoa (lsah->id));
> + if (ratelimit(&ospf_lsa_errorlimit))
> + zlog_warn ("LSA[Type%d:%s]: O-bit abuse?", lsah->type, inet_ntoa (lsah->id));
> continue;
> }
> #endif /* STRICT_OBIT_USAGE_CHECK */
>
>
>
>
> _______________________________________________
> Quagga-dev mailing list
> Quagga-dev [at] lists
> http://lists.quagga.net/mailman/listinfo/quagga-dev


_______________________________________________
Quagga-dev mailing list
Quagga-dev [at] lists
http://lists.quagga.net/mailman/listinfo/quagga-dev


paul at jakma

Oct 6, 2011, 4:25 AM

Post #4 of 5 (516 views)
Permalink
Re: LSA checksum error - diff [In reply to]

On Mon, 3 Oct 2011, Jaroslav Fojtik wrote:

> here is it. If you like it, I can improve it a little bit.

> {
> ! struct timeval clock;
> !
> ! gettimeofday(&clock, NULL);
> ! LSA_CHECKSUM_ERR_COUNT++;
> !
> ! if(LSA_CHECKSUM_ERR_COUNT==1)
> ! {
> ! zlog_warn("Link State Update: LSA checksum error %x, %x.",
> sum, lsah->checksum);
> + LSA_CHECKSUM_ERR_SEC = clock.tv_sec;
> + }
> + else
> + {
> + if(labs(clock.tv_sec-LSA_CHECKSUM_ERR_SEC)>60)

It'd probably be better to add rate-limiting support generally to the
logging code, along with a log message to say "your daemon is hammering at
the logging sub-system at the moment".

Or, as Denis pointed out on the bugzilla comment, you can use syslog and
the existing support in syslog daemons for this.

regards,
--
Paul Jakma paul [at] jakma twitter: @pjakma PGP: 64A2FF6A
Fortune:
If you lose your temper at a newspaper columnist, he'll get rich,
or famous or both.
_______________________________________________
Quagga-dev mailing list
Quagga-dev [at] lists
http://lists.quagga.net/mailman/listinfo/quagga-dev


jafojtik at seznam

Oct 11, 2011, 3:01 AM

Post #5 of 5 (498 views)
Permalink
Re: LSA checksum error - diff [In reply to]

Dear Paul,

> It'd probably be better to add rate-limiting support generally to the
> logging code, along with a log message to say "your daemon is hammering at
> the logging sub-system at the moment".
There are macros like REPORT_ONCE for logging even inside Linux kernel source code.


> Or, as Denis pointed out on the bugzilla comment, you can use syslog and
> the existing support in syslog daemons for this.

It is a bad approach to generate gigabytes of useless messages and rely on
external tool to filter them. Reported messages are totally useless, because noboody
knows WHICH PEER is sending bad LSA packers.


Anyway, syslog filters only last message. If there were more repeating messages,
this problem would arise anew. It is not a solution. It might be only dirty
workaround.

regards
Jara


>
> regards,
> --
> Paul Jakma paul [at] jakma twitter: @pjakma PGP: 64A2FF6A
> Fortune:
> If you lose your temper at a newspaper columnist, he'll get rich,
> or famous or both.


_______________________________________________
Quagga-dev mailing list
Quagga-dev [at] lists
http://lists.quagga.net/mailman/listinfo/quagga-dev

Quagga dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.