
Mailing List Archive: nsp: juniper

Regular maintenance advice

 

 



skeeve+junipernsp at eintellego

Apr 3, 2012, 7:28 AM

Post #1 of 16 (2612 views)
Regular maintenance advice

Hey all,

I am designing a document for low-level technicians to regularly
(depending on the sensitivity of the device) log in to the Juniper
router or switch, look around, and make sure that things are 'ok'.

I am seeking comments on anything else that would be useful for a
technician to look at, anything that would catch their eye as
potentially wrong.

So far I have:

---

RJ01 – Router

Description: Standard Juniper Router or Switch

1. show log messages

a.     Look at the last few days for anything suspicious

          i.     Interfaces flapping

2. show interfaces terse

a.     Anything down that shouldn't be?

3. show chassis alarms

a.     Look for any alarm information

4. show system snapshot

a.     If older than 1 week, then 'request system snapshot'

5. show system uptime

a.     As expected?

6. show system storage

a.     Confirm / (root) disk space is not getting full.


---

Skeeve Stevens, CEO

eintellego Pty Ltd
skeeve [at] eintellego ; www.eintellego.net

Phone: 1300 753 383 ; Fax: (+612) 8572 9954

Cell +61 (0)414 753 383 ; skype://skeeve

facebook.com/eintellego

twitter.com/networkceoau ; www.linkedin.com/in/skeeve

PO Box 7726, Baulkham Hills, NSW 1755 Australia


The Experts Who The Experts Call
Juniper - Cisco – Brocade - IBM

_______________________________________________
juniper-nsp mailing list juniper-nsp [at] puck
https://puck.nether.net/mailman/listinfo/juniper-nsp


jgoodwin at studio442

Apr 3, 2012, 7:41 AM

Post #2 of 16 (2563 views)
Re: Regular maintenance advice [In reply to]

On 04/04/12 00:28, Skeeve Stevens wrote:
> 1. Show log messages
>
> a. Look at last few days for anything suspicious
>
> i. Interfaces flapping

"show int | match flap" is your friend. Also chassisd

> 2. Show interfaces terse
>
> a. Anything down that shouldn’t be?

Also anything *up* that shouldn't be.

If you can be strict about it, you can say that anything other than
up/up or down/down is a problem.
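
A minimal sketch of the strict rule above (flag anything that is not up/up or down/down), assuming the output of "show interfaces terse" has already been captured as text. The sample output, the function name unexpected_interfaces and the OK_STATES set are illustrative only; they are not taken from a real device or from any Juniper tool.

```python
# Illustrative capture of "show interfaces terse"; not from a real device.
SAMPLE_TERSE = """\
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
ge-0/0/0.0              up    up   inet     192.0.2.1/30
ge-0/0/1                up    down
ge-0/0/2                down  down
so-1/0/0                down  up
"""

# Admin/link combinations treated as healthy under the strict rule.
OK_STATES = {("up", "up"), ("down", "down")}


def unexpected_interfaces(terse_output, ok_states=OK_STATES):
    """Return (name, admin, link) for every interface outside ok_states."""
    problems = []
    for line in terse_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines and short continuation lines.
        if len(fields) < 3 or fields[0] == "Interface":
            continue
        name, admin, link = fields[:3]
        # Ignore any line whose 2nd/3rd columns are not admin/link states.
        if admin not in ("up", "down") or link not in ("up", "down"):
            continue
        if (admin, link) not in ok_states:
            problems.append((name, admin, link))
    return problems


if __name__ == "__main__":
    for name, admin, link in unexpected_interfaces(SAMPLE_TERSE):
        print(f"{name}: admin {admin}, link {link}")
```

Passing a wider ok_states set is one way to allow combinations you consider normal on particular hardware.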

> 3. Show chassis alarm
>
> a. Look for any alarm information

If you have any EX (at least, can't remember for SRX/J, not for M/...)
also add:

show system alarms

(It's sad how few people know about this)

> 4. Show system snapshot
>
> a. If older than 1 week then – ‘Request system snapshot’

er, why?
Do a snapshot on OS upgrade, shouldn't be needed after that.

Verifying that "commit sync" is the default is also good.

> 5. Show system uptime
>
> a. As expected?
>
> 6. Show system storage
>
> a. Confirm / (root) disk space is not getting full.


skeeve+junipernsp at eintellego

Apr 3, 2012, 7:59 AM

Post #3 of 16 (2563 views)
Re: Regular maintenance advice [In reply to]

Excellent, Julien.

btw. Doing the show system snapshot on an EX4200 stack just showed me:

user [at] hos> show system snapshot
error: external media missing or invalid

I'm guessing a USB key should be installed by default for this? Or do you
think a switch may not need it?




adam at leff

Apr 3, 2012, 8:14 AM

Post #4 of 16 (2571 views)
Re: Regular maintenance advice [In reply to]

If you're running the 10.4 variant that has the dual boot partitions, no
USB key is needed.

Just change your command to: show system snapshot media internal

~Adam



misha.gzirishvili at gmail

Apr 3, 2012, 8:35 AM

Post #5 of 16 (2559 views)
Re: Regular maintenance advice [In reply to]

Hi Skeeve,
I think forwarding messages to a syslog server will avoid the routine of
logging in to each device.
rsyslog or syslog-ng with a web interface and a MySQL backend will let your
support staff search for the messages they want using a web UI.
For uptime and disk usage, I think SNMP is the best way.
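
As a rough sketch of what a script on the syslog server could look for, the following counts link-down events per interface. It assumes the collector stores Junos messages tagged SNMP_TRAP_LINK_DOWN that end with "ifName <interface>"; the sample lines, the hostname and the threshold are made up for illustration, and the exact message text can vary by release.

```python
import re
from collections import Counter

# Illustrative syslog lines; real message text may differ by Junos release.
SAMPLE_LOG = """\
Apr  3 07:01:12 rj01 mib2d[1234]: SNMP_TRAP_LINK_DOWN: ifIndex 530, ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/0/1
Apr  3 07:01:15 rj01 mib2d[1234]: SNMP_TRAP_LINK_UP: ifIndex 530, ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/0/1
Apr  3 07:03:40 rj01 mib2d[1234]: SNMP_TRAP_LINK_DOWN: ifIndex 530, ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/0/1
Apr  3 09:20:02 rj01 mib2d[1234]: SNMP_TRAP_LINK_DOWN: ifIndex 541, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-0/1/0
"""

# Capture the interface name from each link-down message.
LINK_DOWN = re.compile(r"SNMP_TRAP_LINK_DOWN:.*\bifName (\S+)")


def count_link_downs(log_text):
    """Count link-down events per interface name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINK_DOWN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def flapping(log_text, threshold=2):
    """Interfaces with at least `threshold` link-down events, sorted by name."""
    counts = count_link_downs(log_text)
    return sorted(name for name, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    print(flapping(SAMPLE_LOG))
```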
On Apr 3, 2012 6:44 PM, "Julien Goodwin" <jgoodwin [at] studio442> wrote:

> On 04/04/12 00:28, Skeeve Stevens wrote:
> > 1. Show log messages
> >
> > a. Look at last few days for anything suspicious
> >
> > i. Interfaces flapping
>
> "show int | match flap" is your friend. Also chassisd
>
> > 2. Show interfaces terse
> >
> > a. Anything down that shouldn’t be?
>
> Also anything *up* that shouldn't be.
>
> If you can be strict about it you can say anything but up/up and
> down/down are problems.
>
> > 3. Show chassis alarm
> >
> > a. Look for any alarm information
>
> If you have any EX (at least, can't remember for SRX/J, not for M/...)
> also add:
>
> show system alarms
>
> (It's sad how few people know about this)
>
> > 4. Show system snapshot
> >
> > a. If older than 1 week then – ‘Request system snapshot’
>
> er, why?
> Do a snapshot on OS upgrade, shouldn't be needed after that.
>
> Verifing "commit sync" is default is also good.
>
> > 5. Show system uptime
> >
> > a. As expected?
> >
> > 6. Show system storage
> >
> > a. Confirm / (root) disk space is not getting full.
>
>
> _______________________________________________
> juniper-nsp mailing list juniper-nsp [at] puck
> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
_______________________________________________
juniper-nsp mailing list juniper-nsp [at] puck
https://puck.nether.net/mailman/listinfo/juniper-nsp


phil at juniper

Apr 3, 2012, 9:06 AM

Post #6 of 16 (2561 views)
Re: Regular maintenance advice [In reply to]

Skeeve Stevens writes:
>I am designing a document for low level technicians to regularly
>(depending on sensitivity of the device) login to the Juniper
>router/or switch to look around and make sure that things are 'ok'.

How much of this is generic (or can be made generic) enough to cook
into an op script? Checks like "indicate system uptime of less
than one week" and "indicate if /, /config, or /tmp is more than
90% full" are trivial, and interface flapping is simple enough, but
"show suspicious log messages" are more human detectable than
scriptable.

I'd be happy enough to do the script work if we can come up with
a reasonable set of "system health" diagnostic checks.

Okay, I worked up a bit of a template for it. See attached.

Thanks,
Phil
Attachments: juniper-checkup.slax (1.54 KB)
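
For anyone who would rather run the same two checks off-box, here is a small sketch of the uptime and filesystem tests described above, working on captured "show system storage" text. It uses the 90% figure from the message; the sample output, the function names and the WATCHED tuple are assumptions made for illustration, and the uptime is taken as a number of seconds rather than parsed from the CLI.

```python
# Filesystems named in the message above.
WATCHED = ("/", "/tmp", "/config")
WEEK_SECONDS = 60 * 60 * 24 * 7

# Illustrative capture of "show system storage"; not from a real device.
SAMPLE_STORAGE = """\
Filesystem              Size       Used      Avail  Capacity   Mounted on
/dev/da0s2a             184M       170M        14M       92%  /
devfs                   1.0K       1.0K         0B      100%  /dev
/dev/md8                126M        10K       116M        0%  /tmp
/dev/da0s4d              62M       104K        57M        0%  /config
"""


def full_filesystems(storage_output, threshold=90, watched=WATCHED):
    """Return (mount point, percent used) for watched filesystems over threshold."""
    full = []
    for line in storage_output.splitlines():
        fields = line.split()
        # Data rows have six columns with a percentage in the fifth.
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        mount = fields[5]
        used = int(fields[4].rstrip("%"))
        if mount in watched and used > threshold:
            full.append((mount, used))
    return full


def recently_rebooted(uptime_seconds, minimum=WEEK_SECONDS):
    """True when the system has been up for less than `minimum` seconds."""
    return uptime_seconds < minimum


if __name__ == "__main__":
    for mount, used in full_filesystems(SAMPLE_STORAGE):
        print(f"{mount} is {used}% full")
    print("recently rebooted:", recently_rebooted(3600))
```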


wrx230 at gmail

Apr 3, 2012, 9:18 AM

Post #7 of 16 (2578 views)
Re: Regular maintenance advice [In reply to]

Why don't you poll all of this via SNMP?

On Apr 3, 2012, at 9:06 AM, Phil Shafer <phil [at] juniper> wrote:

> Okay, I worked up a bit of a template for it. See attached.
>
> Thanks,
> Phil
>
> version 1.0;
>
> ns junos = "http://xml.juniper.net/junos/*/junos";
> ns xnm = "http://xml.juniper.net/xnm/1.1/xnm";
> ns jcs extension = "http://xml.juniper.net/junos/commit-scripts/1.0";
> ns dyn extension = "http://exslt.org/dynamic";
>
> import "../import/junos.xsl";
>
> param $uptime = 60 * 60 * 24 * 7;
> param $filesystem-threshold = 80;
>
> var $fsnames := {
>     <fs> "/";
>     <fs> "/tmp";
>     <fs> "/config";
> }
>
> var $checks := {
>     <check> {
>         <name> "System Uptime";
>         <rpc> {
>             <get-system-uptime-information>;
>         }
>         <test> "uptime-information/up-time/@junos:seconds < $uptime";
>     }
>     <check> {
>         <name> "Filesystem Space";
>         <rpc> {
>             <get-system-storage>;
>         }
>         for-each ($fsnames/fs) {
>             <test message=. _ " is full">
>                 "filesystem[mounted-on = '" _ .
>                 _ "'][number(used-percent) > $filesystem-threshold]";
>         }
>     }
> }
>
> match / {
>     <op-script-results> {
>         var $conn = jcs:open();
>
>         for-each ($checks/check) {
>             expr jcs:output("Checking ", name);
>             var $check = .;
>             expr jcs:output(" [rpc ", local-name(rpc/node()), "]");
>             var $res = jcs:execute($conn, rpc);
>             if ($res/..//xnm:error) {
>                 expr jcs:output(" error from rpc: ", $res/..//xnm:error);
>             } else {
>                 for-each (test) {
>                     var $test = .;
>                     for-each ($res) {
>                         var $p = dyn:evaluate($test);
>                         if (boolean($p)) {
>                             var $msg = jcs:first-of($test/@message,
>                                                     "failed condition");
>                             expr jcs:output(" error from test: ", $msg);
>                         } else {
>                             expr jcs:output(" [passed]");
>                         }
>                     }
>                 }
>             }
>         }
>
>         expr jcs:close($conn);
>     }
> }


tom at snnap

Apr 3, 2012, 11:18 AM

Post #8 of 16 (2548 views)
Re: Regular maintenance advice [In reply to]

On 3 April 2012 15:41, Julien Goodwin <jgoodwin [at] studio442> wrote:
> If you can be strict about it you can say anything but up/up and
> down/down are problems.

What about SONET/SDH interfaces that display down/up?

The interface can be admin down, but if it's still receiving a
SONET/SDH signal from the other side then line proto will be up -
nothing necessarily wrong with that. :-)

In reply to Skeeve's original email, is there any reason you couldn't
script something like this? At least give a device a once-over and
produce a summary report of "problems for this device", after which the
tech can target only the devices that have issues needing
attention. Otherwise you find yourself wasting time looking at a bunch
of boxes that don't need to be looked at when you could be doing
something more productive.

Or better yet, syslog and SNMP trap collectors and some scripts that
produce a dashboard highlighting any issues detected. :-)

Scripts, scripts, scripts everywhere. :-)
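
A minimal sketch of the "problems for this device" summary idea, assuming each device's checks have already produced a list of problem strings. The device names, the problem text and the summary_report function are hypothetical; how the problems are collected is left open.

```python
# Hypothetical per-device check results: device name -> list of problems.
SAMPLE_RESULTS = {
    "rj01": [],
    "rj02": ["ge-0/0/1: admin up, link down", "/ is 92% full"],
}


def summary_report(results):
    """Build a report that lists only the devices with at least one problem."""
    lines = []
    for device in sorted(results):
        problems = results[device]
        if not problems:
            continue  # healthy devices are left out of the report
        lines.append(f"{device}: {len(problems)} problem(s)")
        lines.extend(f"  - {problem}" for problem in problems)
    return "\n".join(lines) if lines else "No problems found."


if __name__ == "__main__":
    print(summary_report(SAMPLE_RESULTS))
```

Leaving healthy devices out keeps the report short enough that a tech can go straight to the boxes that need attention.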
_______________________________________________
juniper-nsp mailing list juniper-nsp [at] puck
https://puck.nether.net/mailman/listinfo/juniper-nsp


piotr.szlenk at gmail

Apr 3, 2012, 11:57 AM

Post #9 of 16 (2582 views)
Re: Regular maintenance advice [In reply to]

Skeeve,

Try this one. This should provide info about current code on both
partitions on EX series.

> show system snapshot media internal
Information for snapshot on internal (/dev/da0s1a) (backup)
Creation date: Mar 20 15:39:34 2012
JUNOS version on snapshot:
jbase : 11.2R1.2
jcrypto-ex: 11.2R1.2
jdocs-ex: 11.2R1.2
jkernel-ex: 11.2R1.2
jroute-ex: 11.2R1.2
jswitch-ex: 11.2R1.2
jweb-ex: 11.2R1.2
Information for snapshot on internal (/dev/da0s2a) (primary)
Creation date: Mar 20 18:08:56 2012
JUNOS version on snapshot:
jbase : 11.4R1.6
jcrypto-ex: 11.4R1.6
jdocs-ex: 11.4R1.6
jkernel-ex: 11.4R1.6
jroute-ex: 11.4R1.6
jswitch-ex: 11.4R1.6
jweb-ex: 11.4R1.6
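
The output above shows the backup slice on 11.2R1.2 and the primary on 11.4R1.6. A small sketch of how that mismatch could be detected from captured text follows; the input is an abbreviated copy of the output above, and the function names and regular expressions are my own assumptions about the layout, not a Juniper-provided parser.

```python
import re

# Abbreviated copy of the "show system snapshot media internal" output above.
SNAPSHOT_OUTPUT = """\
Information for snapshot on internal (/dev/da0s1a) (backup)
Creation date: Mar 20 15:39:34 2012
JUNOS version on snapshot:
jbase : 11.2R1.2
jkernel-ex: 11.2R1.2
Information for snapshot on internal (/dev/da0s2a) (primary)
Creation date: Mar 20 18:08:56 2012
JUNOS version on snapshot:
jbase : 11.4R1.6
jkernel-ex: 11.4R1.6
"""

# Start of a slice section; group 2 is the role (primary or backup).
HEADER = re.compile(r"Information for snapshot on \S+ \((\S+)\) \((\w+)\)")
# A "package: version" line such as "jbase : 11.2R1.2".
PACKAGE = re.compile(r"^\s*(j[\w-]+)\s*:\s*(\S+)\s*$")


def snapshot_versions(output):
    """Map slice role ('primary' or 'backup') to {package: version}."""
    versions = {}
    role = None
    for line in output.splitlines():
        header = HEADER.search(line)
        if header:
            role = header.group(2)
            versions[role] = {}
            continue
        package = PACKAGE.match(line)
        if package and role is not None:
            versions[role][package.group(1)] = package.group(2)
    return versions


def slices_match(output):
    """True only when both slices are present and carry identical versions."""
    versions = snapshot_versions(output)
    return "primary" in versions and versions["primary"] == versions.get("backup")


if __name__ == "__main__":
    print("slices match:", slices_match(SNAPSHOT_OUTPUT))
```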



--
Piotr Szlenk
 e-mail: piotr.szlenk [at] gmail | mobile: +48793717288



gordon at gswsystems

Apr 3, 2012, 4:15 PM

Post #10 of 16 (2555 views)
Re: Regular maintenance advice [In reply to]

Most of this you can automate on your monitoring boxes.
e.g. use rancid to generate an email on config changes; interface
flaps & chassis alarms will generate SNMP alerts.

You only need to snapshot when upgrading code. Definitely make that
part of the upgrade procedure, and let rancid keep track of the config.

Another thing to look at would be BGP peers - number of routes,
uptimes, etc. Low uptimes on a peer can indicate a problem at the far
end that the customer isn't aware of.

Cheers,
Gordon




skeeve+junipernsp at eintellego

Apr 3, 2012, 4:23 PM

Post #11 of 16 (2559 views)
Re: Regular maintenance advice [In reply to]

Phil,

Great help!



skeeve+junipernsp at eintellego

Apr 3, 2012, 4:23 PM

Post #12 of 16 (2556 views)
Re: Regular maintenance advice [In reply to]

I'm really looking for something more interactive when it's needed.

On Wed, Apr 4, 2012 at 02:18, Morgan Mclean <wrx230 [at] gmail> wrote:

> Why don't you poll all of this via snmp?


skeeve+junipernsp at eintellego

Apr 3, 2012, 4:24 PM

Post #13 of 16 (2555 views)
Re: Regular maintenance advice [In reply to]

Tom,

Phil just did that.... nice too.



skeeve+junipernsp at eintellego

Apr 3, 2012, 4:27 PM

Post #14 of 16 (2591 views)
Re: Regular maintenance advice [In reply to]

Thanks for that, Piotr.

What are the current thoughts/best practices on the snapshot?

Like the version mismatch in your output, I have some switches in the
same state.

Should they be running a current snapshot if possible (maybe except while
upgrading or while the new release is becoming stable)?



cra at wpi

Apr 3, 2012, 9:20 PM

Post #15 of 16 (2585 views)
Permalink
Re: Regular maintenance advice [In reply to]

On Wed, Apr 04, 2012 at 09:27:29AM +1000, Skeeve Stevens wrote:
> On Wed, Apr 4, 2012 at 04:57, Piotr Szlenk <piotr.szlenk [at] gmail> wrote:
> > Try this one. This should provide info about current code on both
> > partitions on EX series.
> >
> > > show system snapshot media internal
>
> Thanks for that Piotr.
>
> What are the current thoughts/best practices on the snapshot?
>
> Like your mis-match below, I have some switches which are the same.

After a software upgrade, and once you are happy with the new software,
you need to manually synchronize the software version from the primary
to the backup slice on every VC member, one member at a time.
Otherwise, if the primary flash fails or becomes corrupted, the switch
will boot from the backup slice, which is still running the old version.

> Should they be running a current snapshot if possible (maybe except while
> upgrading or becoming stable) ?

It is only necessary after a change in software version, or after flash
corruption, such as when the switch is power cycled without being shut
down properly (for a repair, substitute 1 or 2 for "alternate" as
necessary):

ex-switch> request system snapshot media internal slice alternate member N
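
Since the command has to be repeated for each VC member, a small helper can print the exact lines for a technician to paste in one at a time. This is only a sketch: the function name and the member list are made up for illustration, and it just formats the command shown above without talking to the switch.

```python
def snapshot_commands(members, slice_name="alternate"):
    """Build one snapshot command per VC member, in the order given.

    slice_name is "alternate" for the normal post-upgrade sync, or
    "1" / "2" when repairing a specific slice.
    """
    if slice_name not in ("alternate", "1", "2"):
        raise ValueError("slice_name must be 'alternate', '1' or '2'")
    return [
        f"request system snapshot media internal slice {slice_name} member {m}"
        for m in members
    ]


if __name__ == "__main__":
    # Hypothetical two-member virtual chassis.
    for cmd in snapshot_commands([0, 1]):
        print(cmd)
```

Run the commands one member at a time, as described above, rather than pasting the whole list at once.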

Another useful command to see what partitions are currently being used
for what purposes:

ex-switch> show system storage partitions
fpc0:
--------------------------------------------------------------------------
Boot Media: internal (da0)
Active Partition: da0s2a
Backup Partition: da0s1a
Currently booted from: active (da0s2a)

Partitions information:
  Partition  Size    Mountpoint
  s1a        184M    altroot
  s2a        184M    /
  s3d        369M    /var/tmp
  s3e        123M    /var
  s4d        62M     /config
  s4e        unused  (backup config)
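
If you collect this output from many switches, the "Currently booted from" line is the one worth checking automatically. Below is a rough Python sketch that parses saved output in the format above and lists any member that is not booted from its active partition. The function names are invented for this example, and it assumes every member section starts with an "fpcN:" line as in the sample; other platforms or releases may print something different.

```python
import re


def parse_partitions(output):
    """Return {member: {field: value}} for the 'Key: value' header lines."""
    members = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(fpc\d+):", line)
        if header:
            # Start of a new member section, e.g. "fpc0:"
            current = members.setdefault(header.group(1), {})
            continue
        key, sep, value = line.partition(":")
        # Skip lines without a value (table rows, separators, headings).
        if current is not None and sep and value.strip():
            current[key.strip()] = value.strip()
    return members


def members_not_on_active(output):
    """List members whose 'Currently booted from' line does not say active."""
    return [
        name
        for name, fields in parse_partitions(output).items()
        if not fields.get("Currently booted from", "").startswith("active")
    ]
```

An empty list means every member in the output is running from its active partition; anything else is worth a closer look before the next upgrade.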


skeeve+junipernsp at eintellego

Apr 3, 2012, 9:57 PM

Post #16 of 16 (2560 views)
Permalink
Re: Regular maintenance advice [In reply to]

Gordon,

Thanks. I already have a different profile for the BGP devices with all of
that.

...Skeeve

Skeeve Stevens, CEO
eintellego Pty Ltd
skeeve [at] eintellego ; www.eintellego.net

Phone: 1300 753 383 ; Fax: (+612) 8572 9954

Cell +61 (0)414 753 383 ; skype://skeeve

facebook.com/eintellego

twitter.com/networkceoau ; www.linkedin.com/in/skeeve

PO Box 7726, Baulkham Hills, NSW 1755 Australia

The Experts Who The Experts Call
Juniper - Cisco – Brocade - IBM



On Wed, Apr 4, 2012 at 09:15, Gordon Smith <gordon [at] gswsystems> wrote:

> Most of this can be automated on your monitoring boxes. For example,
> use rancid to generate an email on config changes; interface flaps and
> chassis alarms will generate SNMP alerts.
>
> You only need to snapshot when upgrading code. Definitely make that part
> of the upgrade procedure, and let rancid keep track of the config.
>
> Another thing to look at would be BGP peers: number of routes, uptimes,
> etc. A low uptime on a peer can indicate a problem at the far end that
> the customer isn't aware of.
>
> Cheers,
> Gordon
