Mailing List Archive: Linux-HA: Users

obscure networking failover help

 

 



rob.dawson at investis

Sep 18, 2002, 6:03 AM

Post #1 of 4 (1110 views)
obscure networking failover help

Hi,

I'm considering how to get heartbeat to failover a set of nodes for me, and
having some fun trying to figure out what I actually need. Alan mentioned
there were a number of reasonably clued up people on this list (thanks, Alan
:-) so I thought I'd drop a line in here and see if someone could help.

What I'm trying to do is set up a couple of firewalls, with failover. I'll
see if I can diagram it below, but it sort of goes like this: I have an
external IP address which can float, no problem. I have an internal link
from each firewall box (live and failover - fw1 & fw2) to each dmz box
(again, live and failover, dmz1 & 2 - it'd be nice to manage an
active-active config, but I'm not sure heartbeat will manage that, and,
really, it's probably overkill. nice overkill, but still.. :-) This leaves a
total of 4 internal cables, plus the heartbeat eth link between fw1 & fw2,
and between dmz1 & dmz2, plus the serial heartbeat links between fw1-fw2 &
dmz1-dmz2


         internet
    |                |
    |                |
   e0                e0
+-------+S0----S0+-------+
|  fw1  |e3----e3|  fw2  |
+-------+e2    e2+-------+
   e1      \  /      e1
    |       \/       |
    N1      N2       N1
    |       /\       |
   e0      /  \      e0
+-------+e3    e3+-------+
| dmz1  |S0----S0| dmz2  |
+-------+e4----e4+-------+
   e1    e2    e2    e1
    |      \  /      |
    |       \/       |
    |       /\       |
+-------+__/  \__+-------+
|local1 |        |local2 |
+-------+        +-------+

IP addresses:
fw1-e0=192.168.0.40/24      fw2-e0=192.168.0.41/24
fw1-S0=serial link          fw2-S0=serial link
fw1-e1=192.168.10.1/28      fw2-e1=192.168.10.5/28
fw1-e2=192.168.10.9/28      fw2-e2=192.168.10.13/28
fw1-e3=192.168.10.17/30     fw2-e3=192.168.10.18/30
N1=192.168.10.3/28          N2=192.168.11/28
N1=192.168.10.4/28          N2=192.168.12/28
dmz1-e0=192.168.10.2/28     dmz2-e0=192.168.10.6/28
dmz1-e1=192.168.11.1/24     dmz2-e1=192.168.11.2/24
dmz1-e2=192.168.12.2/24     dmz2-e2=192.168.12.1/24
dmz1-e3=192.168.10.14/28    dmz2-e3=192.168.10.10/28
dmz1-e4=192.168.10.21/30    dmz2-e4=192.168.10.22/30

I've put aside N1 & N2 addresses for failover, but I'm still
having trouble seeing how to set it up...

Now, the problem is not the internet side - that's easy. Nor is it in the
internal side, with the load balancers - I can accept (unlike many) that we
drop a minute or two of traffic - what's behind this is a reasonable-sized
web farm, so we'd really like not to drop anything, but we can accept that
in case of hardware or admin failure :-) it might take a minute or so to
cope. The traffic is being NATed through, so there's minimal handling on the
box itself - connections would drop, as conntrack wouldn't be able to hand
over between boxes without a heck of a lot of interesting kernel coding,
which I'm not feeling up to this week :-)

The bit I have trouble with is how to get failover working nicely on the
network between fw & dmz. The only way I can see it working is, say, in the
situation of fw1/dmz1 being the live boxes, and fw1 dies for some reason -
fw2 will pick up the external link, and traffic will pile through there, get
directed internally (via NAT) and wander through dmz2, and caper on happily.
If, however, dmz2 goes down as well, I was looking at how I can then a) have
fw2 become aware of that, and b) have it redirect traffic through dmz1.
About the only way I can come up with is having a set of firewall scripts
that heartbeat runs on ip-up/down equivalents, and 6 heartbeat links - one
for every pair of hosts. This is... not clean. Not even close. Unless
heartbeat is more flexible than it seems in regard to this particular setup,
I'd be looking at 4 running copies on every box, with 4 sets of config files
et al.
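
A rough sketch of where the figure of 6 links comes from, assuming one heartbeat link per unordered pair of the four hosts. The host names are taken from the diagram; nothing here is heartbeat configuration, it is only the counting argument written out in Python.

```python
from itertools import combinations

# The four hosts from the diagram above.
HOSTS = ["fw1", "fw2", "dmz1", "dmz2"]

# One heartbeat link per unordered pair of hosts.
links = list(combinations(HOSTS, 2))

print(len(links))  # 6
for a, b in links:
    print(f"{a} <-> {b}")
```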

Is there some way of doing this cleanly, or do I have to make this up as I
go along? I've had a browse through the maillist archives, but I'm more or
less at a loss as to what to search for. I couldn't see anything that seemed
appropriate, although that's probably myopia more than anything...

I expect I'm looking at this from the wrong angle. Would someone care to
hand me a mirror, so I can see round the bend?

Many thanks,
Rob Dawson

System Administrator
Investis Ltd
Ph. 020 7071 8513
Mb. 077 8917 2195


david.lang at digitalinsight

Sep 18, 2002, 10:14 PM

Post #2 of 4 (1029 views)
obscure networking failover help [In reply to]

what you would need to do is to put switches between the firewall and the
DMZ box, use floating addresses on both sides of the firewalls, and on both
sides of the DMZ boxes, then switches again between the DMZ and the local
boxes

so you do

     outside
  ------------
  |          |
 fw1        fw2
  |          |
  ------------
  |          |
 dmz1       dmz2
  |          |
  ------------
  |          |
local1     local2

for max redundancy have each of the network layers (the switches -----)
actually be a pair of switches joined together so that any one component
can fail and the traffic will still have a route available to get through.

basically treat each layer as if it was alone and didn't have HA on the other
layers, implement the HA on that layer and then go on to the next.

the problem gets much easier to deal with that way.
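
A minimal sketch of the layer-by-layer idea, in Python. The layer and host names follow the diagram above; the `pick_active` helper and the set of failed hosts are hypothetical and only illustrate the reasoning, they are not part of heartbeat. It assumes the switches between layers let any healthy node in one layer reach any healthy node in the next.

```python
# Hypothetical model: each layer is an ordered list of nodes, preferred node first.
LAYERS = {
    "fw": ["fw1", "fw2"],
    "dmz": ["dmz1", "dmz2"],
    "local": ["local1", "local2"],
}


def pick_active(layers, failed):
    """Choose the first healthy node in each layer, independently of the other layers."""
    active = {}
    for name, nodes in layers.items():
        healthy = [n for n in nodes if n not in failed]
        # None means the whole layer is down, so no path exists.
        active[name] = healthy[0] if healthy else None
    return active


# fw1 and dmz2 both down: each layer still resolves on its own.
print(pick_active(LAYERS, failed={"fw1", "dmz2"}))
```

Because each layer is decided on its own, the fw1-plus-dmz2 failure from the original question needs no special handling in this model.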

David Lang



rob.dawson at investis

Sep 19, 2002, 3:57 AM

Post #3 of 4 (1029 views)
obscure networking failover help [In reply to]

Hmm. That's about what I figured.

Which makes heartbeat somewhat more complex to set up. :-(


I came up with another solution late last night, tho. I figured, use
heartbeat to manage the external IP, and the internal IP, and write up a
series of scripts to manage fiddling the internal network to cope - it's all
NAT traffic, or squid with parents, so if one interface goes down, it's not
going to cause problems there - all I got to do is set up refiddling the
firewall scripts for each case...

Rob Dawson

System Administrator
Investis Ltd
Ph. 020 7071 8513
Mb. 077 8917 2195

> -----Original Message-----
> From: David Lang [mailto:david.lang [at] digitalinsight]
> Sent: 19 September 2002 06:15
> To: 'Rob Dawson'; 'linux-ha [at] muc'
> Subject: RE: obscure networking failover help
>
>
> what you would need to do is to put switches between the
> firewall and the
> DMZ box, use floating addresses on both sides of the
> firewalls, and on both
> sides of the DMZ boxes, then switches again between the DMZ
> and the local
> boxes
>

From: "Wallwork, Nathan" <nwallwo [at] pnm>
Date: Thu, 19 Sep 2002 10:00:45 -0600 (MDT)
Subject: obscure networking failover help
In-Reply-To: <D53BF43BC70DD511A22500508BB3C0077D9475 [at] wlvexc00>
Message-ID: <Pine.LNX.4.44.0209190826480.2828-100000 [at] test0>

On Wed, 18 Sep 2002, David Lang wrote:

> what you would need to do is to put switches between the firewall and the
> DMZ box, use floating addresses on both sides of the firewalls, and on both
> sides of the DMZ boxes, then switches again between the DMZ and the local
> boxes
>
> so you do
>
>      outside
>   ------------
>   |          |
>  fw1        fw2
>   |          |
>   ------------
>   |          |
>  dmz1       dmz2
>   |          |
>   ------------
>   |          |
> local1     local2
>
> for max redundancy have each of the network layers (the switches -----)
> actually be a pair of switches joined together so that any one component
> can fail and the traffic will still have a route available to get through.
>
> basically treat each layer as if it was alone and didn't have HA on the other
> layers, implement the HA on that layer and then go on to the next.
>
> the problem gets much easier to deal with that way.

This is exactly what I've been working towards.

It's important to draw the switches in the picture, because they
may fail, so you need to plan for that, and you need to be able to
distinguish between switch failures and react accordingly.


         |           |
        fw1         fw2
         |           |
       sw1a ------- sw2a
         |           |
         |           |
   eth0  |   null    |  eth0
       dmz1 ------- dmz2
   eth1  |   modem   |  eth1
         |           |
         |           |
       sw1b ------- sw2b
         |           |
      local1      local2


Suppose we focus on dmz1 and dmz2.

We can set up a null-modem serial cable between them, and we can
set up udp broadcast over eth0 and eth1, but what we really care
about is the ability to route traffic.

Suppose the udp broadcast over eth0 fails. Both dmz1 and dmz2 will
notice, but that alone would not be enough to determine if the udp
broadcast was failing because sw1a was down or because sw1b was down,
so it wouldn't be enough information to determine which dmz host is
still able to route traffic.

For dmz1 to be able to route traffic, it must be able to reach both
sw1a and sw1b.

For dmz2 to be able to route traffic, it must be able to reach both
sw2a and sw2b.

What we need is for dmz1 to go into standby if sw1a or sw1b is down
and for dmz2 to go into standby if sw2a or sw2b is down.

That's where ipfail comes into play.

We can configure sw1a and sw1b as ping nodes for dmz1 and configure
sw2a and sw2b as ping nodes for dmz2. Then if sw1a or sw1b cannot
be reached by dmz1, ipfail will tell dmz1 to go into standby, and
if sw2a or sw2b cannot be reached by dmz2, ipfail will tell dmz2
to go into standby.
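
A small Python sketch of the decision rule described above, under the assumption that a node should give up its resources as soon as any one of its own switches stops answering. The `PING_NODES` table and both helper functions are hypothetical; this models the concept only and is not ipfail's actual algorithm or configuration.

```python
# Hypothetical ping-node assignment, taken from the diagram above.
PING_NODES = {
    "dmz1": ["sw1a", "sw1b"],
    "dmz2": ["sw2a", "sw2b"],
}


def should_standby(node, unreachable):
    """A node goes into standby if any of its own switches is unreachable."""
    return any(sw in unreachable for sw in PING_NODES[node])


def decide(unreachable):
    """Return the standby decision for each dmz node."""
    return {node: should_standby(node, unreachable) for node in PING_NODES}


# sw1b is down: only dmz1 loses its ability to route.
print(decide({"sw1b"}))
```

Unlike a failed broadcast over one interface, the per-switch check says which side lost connectivity, which is the information needed to pick the node that should stay active.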

That's the concept anyway. I've been having trouble getting ipfail
to work the way I want it to, but hopefully we'll get it working soon.
If you want more details, read the recent posts on the dev list. If
you want to help work on it, join the dev list.


rob.dawson at investis

Sep 20, 2002, 2:46 AM

Post #4 of 4 (1029 views)
obscure networking failover help [In reply to]

I'm not sure your concept covers things completely, but I *am* thinking of
my situation specifically, so I might be reading things between the lines
that aren't there... :-)

In the case you've shown, what happens if, say, sw2a & sw1b fail? both sides
fail over to each other?

If we have dmz1 connected to both sw1a & sw2a, we can route traffic even if
we have multiple failures - it's just a matter of how. In this case, two
separate services on two separate IP addresses (say, set up squid1 & squid2)
would enable you to have the failover cope gracefully. And, given a 4 port
network card isn't that hard to get hold of - or two 2-port, preferably, if
you can fit them in, for redundancy - having both dmz boxes talking to both
switches on both sides of them is not too much of a stretch, I figure.
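
To make the double-failure case concrete, here is a rough Python sketch comparing the two cabling options. The topology tables and helper names are hypothetical, and the model is deliberately narrow: it only asks whether a dmz box still has one live switch above it and one below it, and ignores the links between switches and whether the fw and local boxes can reach the surviving switch.

```python
from itertools import combinations

SWITCHES = ["sw1a", "sw2a", "sw1b", "sw2b"]

# Upstream ("a") and downstream ("b") switches each dmz box is cabled to.
SINGLE_HOMED = {
    "dmz1": {"up": ["sw1a"], "down": ["sw1b"]},
    "dmz2": {"up": ["sw2a"], "down": ["sw2b"]},
}
DUAL_HOMED = {
    "dmz1": {"up": ["sw1a", "sw2a"], "down": ["sw1b", "sw2b"]},
    "dmz2": {"up": ["sw1a", "sw2a"], "down": ["sw1b", "sw2b"]},
}


def can_route(links, failed):
    """A box can route if at least one upstream and one downstream switch is alive."""
    up_ok = any(s not in failed for s in links["up"])
    down_ok = any(s not in failed for s in links["down"])
    return up_ok and down_ok


def routers(topology, failed):
    """Names of the dmz boxes that can still pass traffic."""
    return sorted(n for n, links in topology.items() if can_route(links, failed))


# Compare both cabling options for every pair of failed switches.
for failed in combinations(SWITCHES, 2):
    f = set(failed)
    print(failed, routers(SINGLE_HOMED, f), routers(DUAL_HOMED, f))
```

In this model the sw2a plus sw1b failure leaves neither single-homed box able to route, while both dual-homed boxes still can.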

Then, of course, assuming dmz1 & dmz2 are routing, your redundant IP link
for heartbeat could be used to send the traffic between them, and out via
the other switch - admittedly that's a bit messy...

Of course, it's possible there's another facet I've missed...
