Mailing List Archive: Linux-HA: Users

Simple file sync

 

 



lmb at teuto

Jul 9, 1999, 5:18 AM

Post #1 of 25 (1016 views)
Simple file sync

Good morning.

In a redundant/failover system, there are often lots of configuration files
which need to be synchronised. While we are all anxiously awaiting Stephen's
cluster layer, it should be possible to implement something a little less
sophisticated right now.

Each configuration file gets tagged with a numeric serial number and an MD5
checksum.

Using Alan's heartbeat, the nodes communicate their serial numbers/checksums
for each configuration file (either at startup, when a file was changed, or in
regular intervals). The highest serial with a valid checksum wins and is
transferred to all nodes.

A system keeps old revisions of the files and only advertises a configuration
file with a valid checksum.

The configuration file manager has a list of configuration files and which
programs to call if one of them changes. (This file itself cannot be
synchronised over this mechanism very well. Since it doesn't change in a
working system, I propose that it should be set immutable, and if it gets
corrupted, the node is treated as failed.)

Should two files have the same serial but different checksums, the one with
the highest mtime wins. (This case may occur when a configuration file is
"checked in" to the system on two nodes at virtually the same time - however,
this case is "very unlikely"(tm) in the case I am considering, since the
configuration files get modified by the admin user)
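
The selection rule described above (highest serial with a valid checksum wins,
mtime breaks ties) can be sketched as follows. This is only an illustration:
the Advert record, its field names and pick_winner() are hypothetical, and the
file content is carried along purely so the sketch can check the checksum.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    """One node's advertisement for a configuration file (hypothetical layout)."""
    node: str
    serial: int
    checksum: str   # MD5 hex digest the node claims for the file
    mtime: float    # modification time, seconds since the epoch
    content: bytes  # included only so this sketch can check validity


def is_valid(ad):
    """An advertisement is valid if its checksum matches its content."""
    return hashlib.md5(ad.content).hexdigest() == ad.checksum


def pick_winner(adverts):
    """Highest serial with a valid checksum wins; mtime breaks serial ties."""
    valid = [ad for ad in adverts if is_valid(ad)]
    if not valid:
        return None
    return max(valid, key=lambda ad: (ad.serial, ad.mtime))
```

Comparing the tuple (serial, mtime) means mtime is only consulted when two
valid files carry the same serial, which matches the tie-break above.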

This is not perfect, but IMHO good enough to synchronise firewall rules,
interface configurations etc in a simple setup. (Like two failover
loadbalancers, which is the case I am looking at)

What do you all think?

Sincerely,
Lars Marowsky-Brée

--
Lars Marowsky-Brée
Network Management

teuto.net Netzdienste GmbH - DPN Verbund-Partner


alanr at bell-labs

Jul 9, 1999, 3:01 PM

Post #2 of 25 (1007 views)
Simple file sync [In reply to]

Lars Marowsky-Bree wrote:
>
> Good morning.
>
> In a redundant/failover system, there are often lots of configuration files
> which need to be synchronised. While we are all anxiously awaiting Stephens
> cluster layer, it should be possible to implement something a little less
> sophisticated right now.
>
> Each configuration file gets tagged with a numeric serial number and a md5
> checksum.
>
> Using Alan's heartbeat, the nodes communicate their serial numbers/checksums
> for each configuration file (either at startup, when a file was changed, or in
> regular intervals). The highest serial with a valid checksum wins and is
> transferred to all nodes.
>
> A system keeps old revisions of the files and only advertises a configuration
> file with a valid checksum.
>
> The configuration file manager has a list of configuration files and which
> programs to call if one of them changes. (This file itself can not be
> synchronised over this mechanism very well - since it doesn't change in a
> working system, I propose that it should be set immuteable and if it gets
> corrupted, the node is treated as failed).
>
> Should two files have the same serial but different checksums, the one with
> the highest mtime wins. (This case may occur when a configuration file is
> "checked in" to the system on two nodes at virtually the same time - however,
> this case is "very unlikely"(tm) in the case I am considering, since the
> configuration files get modified by the admin user)
>
> This is not perfect, but IMHO good enough to synchronise firewall rules,
> interface configurations etc in a simple setup. (Like two failover
> loadbalancers, which is the case I am looking at)
>
> What do you all think?

Something like this is definitely needed. I have a few different thoughts on
how one might carry this out.

I don't think Stephen's stuff will solve this problem except for basic cluster
configuration.
>
> Each configuration file gets tagged with a numeric serial number and a md5
> checksum.
>
> Using Alan's heartbeat, the nodes communicate their serial numbers/checksums
> for each configuration file (either at startup, when a file was changed, or in
> regular intervals). The highest serial with a valid checksum wins and is
> transferred to all nodes.

There's a little more to this than you describe. Often a service also needs to
be restarted when things change. Rather than automatically picking up a new
configuration, I would suggest something slightly more secure, which ensures
that everyone has the same configuration before starting - otherwise damage
may have already been done. A method like this might be a better choice:

When a node enters the cluster, it verifies the MD5 of its config
before starting the relevant service.

When you want to change a file, you use rcp or rdist or something
similar to create files called "config.new" (for example). Then
you send out a cluster request like
"newconfig-check servicename checksum".
The result of newconfig-check is that everyone checks whether their
config.new file matches the given checksum and reports back to the
originator. When the originator has verified that everyone has reported
back with the correct checksum, it issues "newconfig-switch servicename",
which causes each node to move "config" to "config.old", move
"config.new" to "config", and restart servicename.
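
A minimal in-memory sketch of this check-then-switch sequence, assuming the
nodes can be modelled as objects in one process. The Node class and roll_out()
are hypothetical names; a real version would send newconfig-check and
newconfig-switch over the cluster and would need timeouts for nodes that never
answer.

```python
import hashlib


class Node:
    """In-memory stand-in for a cluster member (hypothetical)."""

    def __init__(self, name, config, config_new=None):
        self.name = name
        self.files = {"config": config}
        if config_new is not None:
            self.files["config.new"] = config_new
        self.restarts = 0

    def newconfig_check(self, checksum):
        """Report whether our staged config.new matches the given checksum."""
        staged = self.files.get("config.new")
        return staged is not None and hashlib.md5(staged).hexdigest() == checksum

    def newconfig_switch(self):
        """Rotate config -> config.old, config.new -> config, then restart."""
        self.files["config.old"] = self.files["config"]
        self.files["config"] = self.files.pop("config.new")
        self.restarts += 1  # stands in for restarting the service


def roll_out(nodes, checksum):
    """Switch only if every node confirms the staged file; else change nothing."""
    if not all(node.newconfig_check(checksum) for node in nodes):
        return False
    for node in nodes:
        node.newconfig_switch()
    return True
```

If any node reports a mismatch, no node switches, so the cluster keeps running
one consistent configuration.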

The current incarnation of the heartbeat protocol isn't the best choice
for a file transfer protocol. You could make one on top of it, but it would
be some work, and the serial links aren't the greatest as far as bandwidth is
concerned. There are some simple things I'm planning to do to make the protocol
more suitable for this kind of activity, but the potential bandwidth problem
remains.

On the other hand, the current protocol is in some sense a broadcast protocol,
so it might be a better choice than first glance might tell you.

Probably good to hook this into RCS as you alluded.

-- Alan Robertson
alanr [at] bell-labs


lmb at teuto

Jul 10, 1999, 4:31 AM

Post #3 of 25 (1010 views)
Simple file sync [In reply to]

On 1999-07-09T16:01:11,
Alan Robertson <alanr [at] bell-labs> said:

> I don't think Stephen's stuff will solve this problem except for basic cluster
> configuration.

Hm. I thought Stephen's code was supposed to provide a general cluster wide
database, which could be used for configuration files too.

> There's a little more to this than you describe. Often a service also needs to
> be restarted when things change.

That's why there is a program which gets called when an update is received over
the network.

> When a node enters the cluster, it verifies the MD5 of it's config
> before starting the relevant service.

And it asks the cluster whether any newer revision exists before
starting the service, since the configuration might have been updated during
the downtime. (Timeout problem: I think it might be wiser to start the service
with the locally available configuration as quickly as possible and then ask
the cluster - maybe we are the only node alive, in which case waiting for the
cluster would be pretty stupid.)

> When you want to change a file, you use rcp or rdist or something
> similar to create files called "config.new" (for example). Then
> you send out a cluster request like
> "newconfig-check servicename checksum".
> The result of newconfig-check is that everyone checks their config.new
> files, and if each answers back if their config file matches the given
> checksum, and reports back to the originator. When the originator
> verifies that everyone has reported back with the correct checksum,
> then it issues "newconfig-switch servicename" which then causes
> each node to move "config to "config.old", and move "config.new" to
> config, and restart servicename.


Too complicated. And I see the problem of lengthy timeouts when one of the
client nodes (i.e. one which is supposed to report back) fails during this
cycle. Instead, I would do it like this:

When a new configuration file is successfully checked in locally, the node
broadcasts a "newconfig <servicename> <serial> <checksum>". All other nodes
pull the file from the node, verify the checksum and restart their local
daemons (if necessary - maybe only one of them is supposed to be active atm).
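
The receiving side of such a "newconfig" broadcast might look roughly like
this. It is a sketch under assumptions: the message is taken to be a single
whitespace-separated line, fetch is a hypothetical callable that pulls the
file from the announcing node (the transport is left open), and
handle_newconfig() is an invented name.

```python
import hashlib
import os
import tempfile


def parse_newconfig(message):
    """Split 'newconfig <servicename> <serial> <checksum>' into its fields."""
    keyword, service, serial, checksum = message.split()
    if keyword != "newconfig":
        raise ValueError("not a newconfig message")
    return service, int(serial), checksum


def handle_newconfig(message, local_serial, fetch, target_path):
    """Pull, verify and install a newer config; return True if installed."""
    service, serial, checksum = parse_newconfig(message)
    if serial <= local_serial:
        return False  # we already have this revision or a newer one
    data = fetch(service)
    if hashlib.md5(data).hexdigest() != checksum:
        return False  # corrupted transfer: keep the current file
    # Write to a temporary file in the same directory, then rename it over
    # the target so readers never see a half-written config.
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp, target_path)
    return True  # the caller would now restart the daemon if necessary
```

A node that fails the checksum simply stays on its old revision, which is the
"multiple active configurations" case the protocol has to tolerate anyway.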

There will always be a short delay until all nodes have switched over, and we
must always consider that a node may not switch over to the new configuration
because of some problem. While it is nice to minimize switch over delays so it
stays <3s or so, we must cope with multiple active configurations as neatly as
possible anyway, so we should not go out of our way to prevent this from
happening. At least not in "Phase I".

> The current incarnation of the heartbeat protocol isn't the best choice
> for a file transfer protocol. You could make one on top of it, but it would
> be some work, and the serial links aren't the greatest as far as bandwidth is
> concerned. There are some simple things I'm planning to do to make the protocol
> more suitable for this kind of activity, but the potential bandwidth problem
> remains.

The problem with not using the OOB heartbeat mechanism is that you may receive
a newconfig request (because the serial ring is still working fine), but can't
retrieve it, because the ethernet is misconfigured / corrupted. (And you can't
update to the correct ethernet configuration because you can't get the file)

Doesn't the new heartbeat code over PPP use UDP which gets fragmented down
nicely, so transferring even 100k (which is a mighty big configuration file!)
shouldn't interfere too much with normal heartbeat operation?

> On the other hand, the current protocol is in some sense a broadcast protocol,
> so it might be a better choice than first glance might tell you.

Yes, this would be useful, even though not very much in my 2-3 node setup.

Sincerely,
Lars Marowsky-Brée

--
Lars Marowsky-Brée
Network Management

teuto.net Netzdienste GmbH - DPN Verbund-Partner


andy at globalauctions

Jul 10, 1999, 5:20 AM

Post #4 of 25 (1010 views)
Simple file sync [In reply to]

On Fri, 9 Jul 1999, Alan Robertson wrote:
> Then
> you send out a cluster request like
> "newconfig-check servicename checksum".
> The result of newconfig-check is that everyone checks their config.new
> files, and if each answers back if their config file matches the given
> checksum, and reports back to the originator. When the originator
> verifies that everyone has reported back with the correct checksum,
> then it issues "newconfig-switch servicename" which then causes
> each node to move "config to "config.old", and move "config.new" to
> config, and restart servicename.

By over-specifying configuration behavior (I think _maybe_ you are, but I
might be jumping the gun), are you not limiting the ability to have rolling
upgrades?

-Andy


alanr at bell-labs

Jul 11, 1999, 12:22 AM

Post #5 of 25 (1012 views)
Simple file sync [In reply to]

Lars Marowsky-Bree wrote:
>
> On 1999-07-09T16:01:11,
> Alan Robertson <alanr [at] bell-labs> said:
>
> > I don't think Stephen's stuff will solve this problem except for basic cluster
> > configuration.
>
> Hm. I thought Stephen's code was supposed to provide a general cluster wide
> database, which could be used for configuration files too.

I'm not sure what you mean here. I don't think he's planning on modifying
apache code to let him supply /etc/httpd/conf/access.conf, etc. My guess is
that most of the (dozens of) configuration files on a system won't be in his
database.

> And it talks to the cluster asking whether any newer revision exists before
> starting the service. The configuration might have been updated during the
> downtime. (Timeout problem: I think it might be more wise to start the service
> with the locally available configuration as quickly as possible and then ask
> the cluster - maybe we are the only node alive, then waiting for the cluster
> would be pretty stupid)

It's easy to see which nodes are currently active in the configuration. It's
probably a mistake to start this process while a cluster transition is in
progress.

> > When you want to change a file, you use rcp or rdist or something
> > similar to create files called "config.new" (for example). Then
> > you send out a cluster request like
> > "newconfig-check servicename checksum".
> > The result of newconfig-check is that everyone checks their config.new
> > files, and if each answers back if their config file matches the given
> > checksum, and reports back to the originator. When the originator
> > verifies that everyone has reported back with the correct checksum,
> > then it issues "newconfig-switch servicename" which then causes
> > each node to move "config to "config.old", and move "config.new" to
> > config, and restart servicename.
>
> Too complicated. And I see the problem of lengthy timeouts when one of the
> client nodes (ie one which is supposed to report back) fails during this
> cycle. Instead, I would do it like this:
>
> When a new configuration file is successfully checked in locally, the node
> broadcasts a "newconfig <servicename> <serial> <checksum>". All other nodes
> pull the file from the node, verify the checksum and restart their local
> daemons (if necessary - maybe only one of them is supposed to be active atm).

This sounds OK to me.

> > The current incarnation of the heartbeat protocol isn't the best choice
> > for a file transfer protocol. You could make one on top of it, but it would
> > be some work, and the serial links aren't the greatest as far as bandwidth is
> > concerned. There are some simple things I'm planning to do to make the protocol
> > more suitable for this kind of activity, but the potential bandwidth problem
> > remains.
>
> The problem with not using the OOB heartbeat mechanism is that you may receive
> a newconfig request (because the serial ring is still working fine), but can't
> retrieve it, because the ethernet is misconfigured / corrupted. (And you can't
> update to the correct ethernet configuration because you can't get the file)
>
> Doesn't the new heartbeat code over PPP use UDP which gets fragmented down
> nicely, so transferring even 100k (which is a mighty big configuration file!)
> shouldn't interfere too much with normal heartbeat operation?

The current code sets up PPP links but doesn't turn on or expect routing,
doesn't run a routing protocol, nor would it deal very well with some other
application (potentially) monopolizing the link.

It is possible to run a system with only raw serial (non-PPP) links. There are
some reasons why that might be desirable. For example, PPP links take up to 8
seconds to come up. This can be confusing to systems coming up in terms of
their deciding whether others in the world are alive, etc.

On the other hand, I have done telnet across the PPP links with no ill effects.
It's pretty slick. I haven't tried FTP.

> > On the other hand, the current protocol is in some sense a broadcast protocol,
> > so it might be a better choice than first glance might tell you.
>
> Yes, this would be useful, even though not very much in my 2-3 node setup.

It would make a 3x difference for a 4-node configuration.
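
One reading of the 3x figure, assuming the source would otherwise have to send
a separate copy to each of the other nodes, is sketched below. The function
names are hypothetical.

```python
def unicast_transfers(nodes):
    """Copies the source must send when each peer is served separately."""
    return nodes - 1


def broadcast_transfers(nodes):
    """A broadcast medium delivers one copy to every peer at once."""
    return 1 if nodes > 1 else 0


for n in (2, 3, 4):
    print(n, unicast_transfers(n), broadcast_transfers(n))
```

With 4 nodes this gives 3 unicast transfers against 1 broadcast; with 2 nodes
the two approaches cost the same.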

It's probably a bad idea to send out configuration updates while the main
network is down anyway (unless, as you pointed out, that's why it's down).

Of course, you're always welcome to try it out. That's probably better than
guessing. You could just try to FTP across the PPP links and see what happens
to heartbeat traffic. That might give a feel for whether this would be a
problem in practice.

I haven't done that, but running a couple or three big FTPs while the links are
running at low speed ought to be a good test :-)

By the way, I'll have to continue this conversation next weekend. I'll be on
vacation until then, and there are no phones where I'm going :-)

By the way Lars, this is a really good suggestion in general (file
synchronization, etc). It would be good to see an implementation of something
like this.


-- Alan Robertson
alanr [at] bell-labs


alanr at bell-labs

Jul 11, 1999, 12:30 AM

Post #6 of 25 (1013 views)
Simple file sync [In reply to]

Andy Poling wrote:
>
> On Fri, 9 Jul 1999, Alan Robertson wrote:
> > Then
> > you send out a cluster request like
> > "newconfig-check servicename checksum".
> > The result of newconfig-check is that everyone checks their config.new
> > files, and if each answers back if their config file matches the given
> > checksum, and reports back to the originator. When the originator
> > verifies that everyone has reported back with the correct checksum,
> > then it issues "newconfig-switch servicename" which then causes
> > each node to move "config to "config.old", and move "config.new" to
> > config, and restart servicename.
>
> By over-specifying configuration behavior (I think _maybe_ you are, but I
> might be jumping the gun), are you not limiting the ability to have rolling
> upgrades?

You're probably right. Perhaps some kind of version compatibility stamps or
identifiers are needed? For example, "This config file is for Apache 1.3.6", so
that the config files are directed either to particular machines or to machines
by version (or something like this).

-- Alan Robertson
alanr [at] bell-labs


lmb at teuto

Jul 11, 1999, 2:23 AM

Post #7 of 25 (1017 views)
Simple file sync [In reply to]

On 1999-07-11T01:22:34,
Alan Robertson <alanr [at] bell-labs> said:

> > When a new configuration file is successfully checked in locally, the node
> > broadcasts a "newconfig <servicename> <serial> <checksum>". All other
> > nodes pull the file from the node, verify the checksum and restart their
> > local daemons (if necessary - maybe only one of them is supposed to be
> > active atm).
> This sounds OK to me.

Good, then I will try to implement that. Shouldn't be too difficult (yeah
right, that's how it always starts).

> On the other hand, I have done telnet across the PPP links with no ill
> effects. It's pretty slick. I haven't tried FTP.

Maybe tftp is a good choice even. Since it fragments the transfers into
512byte blocks anyway, it should work fine and not interfere with the
heartbeat.

> I haven't done that, but running a couple or three big FTPs while the links
> are running at low speed ought to be a good test :-)

But not very good for real world situations. Serial ports ought to be running
at 115.2 kbit/s or at least 57.6 kbit/s, and I don't think a small configuration
file sync will be a problem then.
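
A rough back-of-the-envelope check of these rates for the 100k file mentioned
earlier. The sketch assumes 8N1 serial framing (10 bits on the wire per byte)
and ignores PPP, IP and UDP/TFTP overhead, so real transfers would be somewhat
slower; transfer_seconds() is a hypothetical helper.

```python
def transfer_seconds(size_bytes, line_rate_bit_s, bits_per_byte=10):
    """Time to push size_bytes over an asynchronous serial line.

    bits_per_byte=10 assumes 8N1 framing (start bit + 8 data bits + stop
    bit); protocol header overhead is ignored.
    """
    return size_bytes * bits_per_byte / line_rate_bit_s


for rate in (115200, 57600):
    print(f"{rate} bit/s: {transfer_seconds(100 * 1024, rate):.1f} s")
```

Under these assumptions a 100 KiB file needs roughly 9 seconds at 115200 bit/s
and roughly 18 seconds at 57600 bit/s if it has the link to itself.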

In a 2-node setup one doesn't even need any routing protocols.

> By the way Lars, this is a really good suggestion in general (file
> synchronization, etc). It would be good to see an implementation of
> something like this.

Yes. I will implement the above this week (as time permits, but well, it IS my
current work project...).

Sync'ing _major_ content data (like web sites, databases etc) is probably best
done via rsync / CODA though. Hm. Maybe it is best to base this directly on
CODA, disconnected operation right from the start. Have a CODA share with the
master data, if a configuration file is checked in locally, copy it to the
CODA tree, and all nodes which are connected to CODA will notice and restart
their service...

Very elegant, but overkill for a 2 node setup and the monster code of CODA
isn't exactly KISS ;-)

I need to think about this some more.

Let's see, I get to report my solutions at Linux Kongress or better hide in a
hole. Oops.

Anyway, have fun on your vacation ;-)

Sincerely,
Lars Marowsky-Brée

--
Lars Marowsky-Brée
Network Management

teuto.net Netzdienste GmbH - DPN Verbund-Partner


steveu at netpage

Jul 11, 1999, 3:47 AM

Post #8 of 25 (1013 views)
Simple file sync [In reply to]

Lars Marowsky-Bree wrote:

> Sync'ing _major_ content data (like web sites, databases etc) is probably best
> done via rsync / CODA though. Hm. Maybe it is best to base this directly on
> CODA, disconnected operation right from the start. Have a CODA share with the
> master data, if a configuration file is checked in locally, copy it to the
> CODA tree, and all nodes which are connected to CODA will notice and restart
> their service...

What about omirr? I only noticed it recently, looking through the packages in
Debian, and it looks potentially interesting for certain HA file replication
applications (possibly as it stands, and possibly in a modified form to provide
greater control over what is replicated). It seems it hasn't been touched much in
2 years. I'm wondering if that is because of serious problems, or if the author
just got bored and abandoned it.

Steve


alanr at bell-labs

Jul 11, 1999, 6:48 AM

Post #9 of 25 (1016 views)
Simple file sync [In reply to]

Lars Marowsky-Bree wrote:
>
> On 1999-07-11T01:22:34,
> Alan Robertson <alanr [at] bell-labs> said:
>
> > > When a new configuration file is successfully checked in locally, the node
> > > broadcasts a "newconfig <servicename> <serial> <checksum>". All other
> > > nodes pull the file from the node, verify the checksum and restart their
> > > local daemons (if necessary - maybe only one of them is supposed to be
> > > active atm).
> > This sounds OK to me.
>
> Good, then I will try to implement that. Shouldn't be too difficult (yeah
> right, thats how it always starts).

I'll send you the newest (unreleased) version of heartbeat - it has the /proc
monitoring, so you can easily see the system configuration. Gives you more
freedom in how you implement things. It doesn't all install neatly yet, but
you'll figure it out :-)

> > On the other hand, I have done telnet across the PPP links with no ill
> > effects. It's pretty slick. I haven't tried FTP.
>
> Maybe tftp is a good choice even. Since it fragments the transfers into
> 512byte blocks anyway, it should work fine and not interfere with the
> heartbeat.

Doesn't tftp have some known security holes, such that many (most?) sites turn
it off?

> > I haven't done that, but running a couple or three big FTPs while the links
> > are running at low speed ought to be a good test :-)
>
> But not very good for real world situations. Serial ports ought to be running
> at 115,2kbit/s or at least 57,6kbit/s, and I don't think a small configuration
> file sync will be a problem then.

The point is -- make the *test* much worse than the expected reality, then when
things go much worse than expected in the real world, you don't get surprised as
much or as often. There are logs of lost packets in the heartbeat world. Be
sure and check them out. There are hooks for dealing with lost packets in the
current code, but no error recovery protocol.

> In a 2-node setup one doesn't even need any routing protocols.

But, then the solution is *limited* to 2-node setups. That's my concern. This
limitation isn't acceptable in the long term.

> > By the way Lars, this is a really good suggestion in general (file
> > synchronization, etc). It would be good to see an implementation of
> > something like this.
>
> Yes. I will implement the above this week (as time permits, but well, it IS my
> current work project...).
>
> Sync'ing _major_ content data (like web sites, databases etc) is probably best
> done via rsync / CODA though. Hm. Maybe it is best to base this directly on
> CODA, disconnected operation right from the start. Have a CODA share with the
> master data, if a configuration file is checked in locally, copy it to the
> CODA tree, and all nodes which are connected to CODA will notice and restart
> their service...
>
> Very elegant, but overkill for a 2 node setup and the monster code of CODA
> isn't exactly KISS ;-)

From what I heard Peter Braam say about CODA, I don't think CODA's the right
route. Feel free to ignore this part for now.

> I need to think about this some more.

I'd suggest either rsync or something like "Poor man's file replication" methods
described earlier. Note that the newest version of rsync has license problems
with regard to commercial use (sigh).
>
> Lets see, I get to report my solutions at Linux Kongress or better hide in a
> hole. Oops.

When does this come up?

>
> Anyway, have fun on your vacation ;-)

I will! No computer except maybe a little Civilization: Call to Power :-), and
a few days' time up in the mountains.


Thanks Lars!

-- Alan Robertson
alanr [at] bell-labs


sct at redhat

Jul 12, 1999, 7:36 AM

Post #10 of 25 (1010 views)
Simple file sync [In reply to]

Hi,

On Fri, 9 Jul 1999 14:18:29 +0200, Lars Marowsky-Bree <lmb [at] teuto>
said:

> In a redundant/failover system, there are often lots of
> configuration files which need to be synchronised. While we are all
> anxiously awaiting Stephens cluster layer, it should be possible to
> implement something a little less sophisticated right now.

Sure.

Actually, what I was planning in the long term was simply to have a
layer on top of a cluster-wide database, accessed in exactly the same
simple manner as libdbm (with the possible addition of a transaction
facility).

This can work very well indeed. libdbm is an extremely simple but
versatile API. You can store information about the config files in
the dbm "repository" by using hash keys such as "config#0", "config#1"
etc to contain values of the form "/etc/hosts=HOSTS", which associates
a config file with a key into the database. Looking up keys
"HOSTS#0", "HOSTS#1" etc. will return the successive chunks of the
stored file, say 1K at a time.
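
A small sketch of this key layout, written against any dict-like store with
bytes keys and values. A plain dict works, and so should a handle opened with
Python's dbm module; store_file() and load_file() are hypothetical names.

```python
CHUNK = 1024  # "say 1K at a time"


def store_file(db, index, path, key, data):
    """Register path under config#<index>; store data as <key>#0, <key>#1, ..."""
    db[f"config#{index}".encode()] = f"{path}={key}".encode()
    for n, start in enumerate(range(0, len(data), CHUNK)):
        db[f"{key}#{n}".encode()] = data[start:start + CHUNK]


def load_file(db, index):
    """Return (path, data) for the file registered under config#<index>."""
    path, key = db[f"config#{index}".encode()].decode().split("=", 1)
    chunks = []
    n = 0
    while f"{key}#{n}".encode() in db:
        chunks.append(db[f"{key}#{n}".encode()])
        n += 1
    return path, b"".join(chunks)
```

For example, store_file(db, 0, "/etc/hosts", "HOSTS", data) followed by
load_file(db, 0) returns the path and the original bytes. The sketch does not
delete stale chunks when a file shrinks, so a real version would need to track
the chunk count or remove leftover keys.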

It's a really simple model with lots of advantages. First, the libdbm
API exists today, and has loads of implementations. Secondly, NIS
*already* has the ability to export libdbm files around a cluster and
to maintain cluster synchronisation of the files. (The one thing it
doesn't give is transactional updates, but that can be done via
versioning, adding a version name to the hash keys and updating the
"valid" version as a single key update.)
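
The versioning idea can be sketched in the same style. The key naming below
(NAME@VERSION#n, NAME@VERSION#count and NAME:valid) is an assumption made for
illustration, as are publish() and read_valid(); the sketch also assumes the
store applies writes in the order they are issued.

```python
def publish(db, name, version, chunks):
    """Write a complete new version, then flip the 'valid' pointer last."""
    for n, chunk in enumerate(chunks):
        db[f"{name}@{version}#{n}".encode()] = chunk
    db[f"{name}@{version}#count".encode()] = str(len(chunks)).encode()
    # The single key update below is the commit point: readers see either
    # the old version or the new one, never a mixture.
    db[f"{name}:valid".encode()] = version.encode()


def read_valid(db, name):
    """Read whichever version the 'valid' pointer currently names."""
    version = db[f"{name}:valid".encode()].decode()
    count = int(db[f"{name}@{version}#count".encode()].decode())
    return b"".join(db[f"{name}@{version}#{n}".encode()] for n in range(count))
```

Chunks of a half-written version are invisible to readers until the pointer is
updated, provided old versions are not deleted while someone is still reading
them.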

This sort of model would let you start coding a libdbm-to-config-file
maintenance layer right now, independently of any other coding going
on. If somebody wants to come up with a better, transactional,
distributed version of libdbm later then the code can be ported
forward with minimal changes.

--Stephen


alanr at bell-labs

Jul 16, 1999, 5:31 AM

Post #11 of 25 (1011 views)
Simple file sync [In reply to]

"Stephen C. Tweedie" wrote:
>
> Hi,
>
> On Fri, 9 Jul 1999 14:18:29 +0200, Lars Marowsky-Bree <lmb [at] teuto>
> said:
>
> > In a redundant/failover system, there are often lots of
> > configuration files which need to be synchronised. While we are all
> > anxiously awaiting Stephens cluster layer, it should be possible to
> > implement something a little less sophisticated right now.
>
> Sure.
>
> Actually, what I was planning in the long term was simply to have a
> layer on top of a cluster-wide database, accessed in exactly the same
> simple manner as libdbm (with the possible addition of a transaction
> facility).
>
> This can work very well indeed. libdbm is an extremely simply but
> versatile API. You can store information about the config files in
> the dbm "repository" by using hash keys such as "config#0", "config#1"
> etc to contain values of the form "/etc/hosts=HOSTS", which associates
> a config file with a key into the database. Looking up keys
> "HOSTS#0", "HOSTS#1" etc. will return the successive chunks of the
> stored file, say 1K at a time.

This is close, but not quite right. You have to have Config#0:HOSTS#0 (or
something similar) as the keys in order to have transactional semantics.

> It's a really simple model with lots of advantages. First, the libdbm
> API exists today, and has loads of implementations. Secondly, NIS
> *already* has the ability to export libdbm files around a cluster and
> to maintain cluster synchronisation of the files. (The one thing it
> doesn't give is transactional updates, but that can be done via
> versioning, adding a version name to the hash keys and updating the
> "valid" version as a single key update.)

I'm concerned about what seem to be notable disadvantages to NIS (not DBM) for
the cluster as well.

First, a machine can't be bound to more than one NIS domain
at a time. This means that each cluster would have to have its own NIS
domain, but cannot bind to the cluster's NIS domain - because binding to the NIS
domain is already used for things like password and group file access, which are
usually handled according to local site policy, and not directly associated with
cluster membership -- in fact can't be used for cluster membership. Having it
bind to the cluster domain would mean that each cluster had its own independent
password, group, etc. file. This in turn means that you cannot use the yp
library functions for accessing the data. I'm not sure whether a machine can be a
slave server for more than one NIS domain at a time. If it can, then all may be
well (and this item is a red herring), but if it can't, then there are problems
afoot, because a high-availability server can't be an NIS client (because then
the NIS server becomes a single point of failure).

Second, NIS is unreliable in practice for distributing data. It has been my
experience that one NIS map or another gets trashed and has to be
redistributed several times a year. This is not where you want to start for a
high-availability file distribution model.

Third, NIS has the wrong file distribution model for a high-availability
environment. In NIS, there is a single master server which is a single point of
failure for distributing updates in the NIS world. I suppose you could move the
NIS master around the cluster as part of the recovery process, but the
complexity of this as part of the cluster's basic services makes me more than a
bit nervous. It seems to me that this adds unnecessary complexity to basic
cluster services in order to cover up what is basically the wrong underlying
file distribution model.

> This sort of model would let you start coding a libdbm-to-config-file
> maintenance layer right now, independently of any other coding going
> on. If somebody wants to come up with a better, transactional,
> distributed version of libdbm later then the code can be ported
> forward with minimal changes.

And, after all is said and done, it still doesn't get the data where you need it
(in files), and you have to then write a piece of code which still has to
checksum the file and have transactional installation properties anyway.

I would offer the opinion that this doesn't sound particularly simple, elegant
or reliable when all the details involved are considered.

-- Alan Robertson
alanr [at] bell-labs


sct at redhat

Jul 16, 1999, 2:50 PM

Post #12 of 25 (1008 views)
Simple file sync [In reply to]

Hi,

On Fri, 16 Jul 1999 06:31:02 -0600, Alan Robertson
<alanr [at] bell-labs> said:

> I'm concerned about what seem to be notable disadvantages to NIS
> (not DBM) for the cluster as well.

> First, a machine can't be bound to more than one NIS domain at a
> time. This means that each cluster would have to have it's own NIS
> domain

That doesn't follow at all: there is no reason why we can't use
non-overlapping parts of the yp map space.

> Second, NIS is unreliable in practice for distributing data. It has
> been my experience that one NIS map or another gets gets trashed,
> and has to be redistributed several times a year. This is not where
> you want to start for a high-availability file distribution model.

That's not a valid argument. If your NIS implementation has bugs,
then fix it, don't write a new protocol.

> Third, NIS has the wrong file distribution model for a
> high-availability environment. In NIS, there is a single master
> server which is a single point of failure for distributing updates
> in the NIS world.

Read my initial post! NIS is a way to get this sort of thing going.
It is not something I'd expect to be the ultimate cluster database.
The master is only going to be needed for reconfiguration in the
scenarios posted so far.

>> This sort of model would let you start coding a libdbm-to-config-file
>> maintenance layer right now, independently of any other coding going
>> on. If somebody wants to come up with a better, transactional,
>> distributed version of libdbm later then the code can be ported
>> forward with minimal changes.

> And, after all is said and done, it still doesn't get the data where
> you need it (in files), and you have to then write a piece of code
> which still has to checksum the file and have transactional
> installation properties anyway.

> I would offer the opinion that this doesn't sound particularly
> simple, elegant or reliable when all the details involved are
> considered.

Doing a NIS-to-text-file extractor would, IMHO, be a rather
straightforward job compared to implementing a new cluster data
distribution mechanism. Text file configuration is not the be-all and
end-all of cluster data anyway --- many applications will be able to
take advantage of a distributed libdbm directly.

I can see ways of doing text file extraction from libdbm, but I can't
see any effective way of getting clustered libdbm functionality on top
of a text-file distribution mechanism. To me, that strongly implies
that the dbm API is more fundamental, and if we consider using that as
a quick and easy way of getting data around the cluster, then we can
both build more flexible APIs on top of it and reimplement the actual
data distribution mechanism in the future without breaking the API
itself.
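
As a rough illustration of text file extraction from a dbm database, here is a minimal sketch using Python's standard `dbm` module. The function name `extract_to_text` and the one-line-per-key output format are assumptions made for the example; a real extractor would emit whatever syntax the target configuration file requires.

```python
import dbm


def extract_to_text(db_path: str, out_path: str) -> int:
    """Write every key/value pair of a dbm database as 'key value' lines.

    Hypothetical helper: keys are written in sorted order so that two
    nodes holding the same data produce byte-identical text files.
    Returns the number of records written.
    """
    count = 0
    with dbm.open(db_path, "r") as db, \
            open(out_path, "w", encoding="utf-8") as out:
        for key in sorted(db.keys()):
            value = db[key]
            # dbm stores raw bytes; this sketch assumes UTF-8 text.
            out.write(f"{key.decode('utf-8')} {value.decode('utf-8')}\n")
            count += 1
    return count
```

The extractor only reads through the dbm interface, so swapping the storage behind that interface for something distributed would not change this code.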

I'm not arguing that NIS is the future of clustering. Rather, it's
libdbm that I like: it is a rather flexible API, one which already has
distributed implementations, and one to which we can add proper
clustering semantics in the future.

--Stephen


wiegand at suse

Jul 18, 1999, 2:21 AM

Post #13 of 25 (1010 views)
Permalink
Simple file sync [In reply to]

Hello,

please let me add my 2 pence to the NIS and LIBDBM discussion without
quoting all the previous postings. If I got it right, we are talking about
configuration file synchronisation here, but these two are record based,
so why should we use them? I have no knowledge of existing applications
which make use of either. Sounds to me like introducing an additional
artificial layer just for the sake of using these protocols.

NIS has been around for a while and is well understood -- that's why Sun
have abandoned it and developed NIS+ ;-) NIS Maps have been created to
distribute records of users, machines, services and others, but certainly
not configuration files. Maps are distributed unconditionally, so you
would have to introduce additional layers for dissemination control.

In NIS, clients do not store data, so every cluster node would have to be
a NIS master or slave server. We would probably have to set up separate
domains for each cluster, but then the clusters would be hard to integrate
into a common infrastructure.

NIS scores surprisingly high on the list of forbidden protocols in any
security aware environment. And it's so easy to spoof. And, as Alan
pointed out, it does have the occasional hiccup. No, this is not a matter
of insufficient bug fixing, since I have myself collected enough sad
experience with AIX, Solaris and Linux over the past few years.

LIBDBM is basically a low level key/data storage mechanism. All the layers
of synchronisation, file building and check-in or check-out would have to
be added programmatically. Sounds like an artificially introduced extra
round to me.

My idea is to use CVS. I'm almost sure this must have been proposed
before, it's so obvious. Zero development effort.

CVS is file oriented. And we do want to work on files.

CVS provides versioning. And we can take advantage of it by using tags.
E.g. one tag for every node. This would even give us rolling upgrades for
free.

We don't have to worry about tools for managing the files. There is cvs
itself, useful for managing and extracting. There is tkCVS, there is an
EMACS CVS mode, and I wouldn't be surprised if there was also a module for
Perl.

All it takes to distribute these files is to trigger a "cvs update" on the
nodes, and it can even use SSH for authentication and encryption.

CVS does have to have one repository (which can well be kept outside the
cluster if we like). Is this really a single point of failure? I think
not. The repository is needed only when changing. (As Stephen pointed out,
this would also be true for NIS or other protocols.)

Why can't we build up a CVS tree *somewhere* on the net and have the
nodes' working directories under /etc/ha/config.d just check out from
there? And then maybe make /etc/httpd/var/httpd.conf a symbolic link to
the corresponding file within this tree? Then all we need to communicate
is the root of the tree.
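
The symbolic-link part of this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `link_into_tree` is hypothetical, the checkout itself (the "cvs update") is assumed to have been done separately, and the atomic swap relies on POSIX rename semantics.

```python
import os


def link_into_tree(tree_root: str, relative_name: str, target_path: str) -> None:
    """Make target_path a symlink to tree_root/relative_name.

    Hypothetical helper: the link is created under a temporary name and
    renamed into place, so target_path never disappears in between.
    """
    source = os.path.join(tree_root, relative_name)
    if not os.path.isfile(source):
        raise FileNotFoundError(source)

    tmp_link = target_path + ".tmp-link"
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)  # clear a leftover from an interrupted run
    os.symlink(source, tmp_link)
    os.replace(tmp_link, target_path)  # renames the link, not its target
```

For example, `link_into_tree("/etc/ha/config.d", "httpd.conf", "/etc/httpd/var/httpd.conf")` would point the web server's configuration at the checked-out copy.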

It has been suggested to investigate using the heartbeat media (serial
lines etc.) for synchronizing the config files. I believe that clusters
should not touch config files while they are unstable. Would you trust
config files that were updated while the network was broken? So we would
have to introduce an extra layer of verification. And I would strongly
vote against speeding up serial lines to their limit. I'm using 9600 bps,
because I need it to be rock solid, not fast and "working most of the
time".

This CVS stuff is only good for persistent data, of course. Transient
data, like cluster or application states, are completely different. Maybe
we should not even store volatile data on disk, just in-memory.

Volker


--
Volker Wiegand Phone: +49 (0) 6196 / 50951-24
SuSE Rhein/Main AG i.G. Fax: +49 (0) 6196 / 40 96 07
Mergenthalerallee 45-47 Mobile: +49 (0) 179 / 292 66 76
D-65760 Eschborn E-Mail: Volker.Wiegand [at] suse


lm at bitmover

Jul 18, 1999, 7:34 AM

Post #14 of 25 (1008 views)
Permalink
Simple file sync [In reply to]

: My idea is to use CVS. I'm almost sure this must have been proposed
: before, it's so obvious. Zero development effort.

For what it is worth - I've had a number of discussions with Red Hat
about putting _all_ configuration files, not just the HA ones, under
version control. So the idea has other supporters.

However, it's not so clear that you want to use CVS. The Red Hat
discussions were surrounding BitKeeper because it is a far more
scalable answer.


d_martin at worldnet

Jul 18, 1999, 8:46 AM

Post #15 of 25 (1009 views)
Permalink
Simple file sync [In reply to]

Here's my $.02:

Volker Wiegand wrote:
> CVS does have to have one repository (which can well be kept outside the
> cluster if we like). Is this really a single point of failure? I think
> not. The repository is needed only when changing.

I agree that this is the case, and given that it is, it seems to me that any
method of distributing config files is fine, and should be left up to the
user. You might like CVS, and I might like rdist. My buddy might prefer to
hand rcp them.

I also agree that revision control is a good idea, but should also be left up
to the user. I think the real key is to DISCUSS these options in the
documentation.

Derek


sct at redhat

Jul 19, 1999, 6:11 AM

Post #16 of 25 (1011 views)
Permalink
Simple file sync [In reply to]

Hi,

On Sun, 18 Jul 1999 08:34:47 -0600, Larry McVoy <lm [at] bitmover> said:

> : My idea is to use CVS. I'm almost sure this must have been proposed
> : before, it's so obvious. Zero development effort.

> For what it is worth - I've had a number of discussions with Red Hat
> about putting _all_ configuration files, not just the HA ones, under
> version control. So the idea has other supporters.

For what it is worth - linuxconf, as distributed with all recent Red Hat
versions, already has the ability to do this for all the config files it
controls, through the "profile" facility. You can keep multiple
different system profiles (read: "CVS branches") and do versioning on
each one.

--Stephen


wiegand at suse

Jul 20, 1999, 3:58 AM

Post #17 of 25 (1017 views)
Permalink
Simple file sync [In reply to]

On Sun, 18 Jul 1999, Derek Martin wrote:

> Here's my $.02:
>
> Volker Wiegand wrote:
> > CVS does have to have one repository (which can well be kept
> > outside the cluster if we like). Is this really a single point
> > of failure? I think not. The repository is needed only when
> > changing.
>
> I agree that this is the case, and given that it is, it seems to me
> that any method of distributing config files is fine, and should be
> left up to the user. You might like CVS, and I might like rdist. My
> buddy might prefer to hand rcp them.
>

It should *NOT* be left to the user. We are talking about defining an
infrastructure here. In this environment CVS acts like an API, with well
defined paths, maybe even providing specialized commands for updating.
Think only of the "event" handling needed to trigger rolling updates.

> I also agree that revision control is a good idea, but should also be
> left up to the user. I think the real key is to DISCUSS these options
> in the documentation.
>
> Derek

This versioning is not only a nice-to-have, but at the heart of the whole
design. What I want is "dissemination control", i.e. knowing which version
of which configuration file is active on any node at any given time. If
you or your buddy can guarantee this in the same way as CVS can, show it
to me, and I will happily abandon CVS and buy into your solution.
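
To show what "dissemination control" could look like independently of the tool used, here is a minimal sketch in Python that builds a per-node manifest of file checksums and compares two of them. It does not call CVS at all; the function names `manifest` and `out_of_sync` are hypothetical, and MD5 is used only because it is the checksum mentioned earlier in this thread.

```python
import hashlib
import os


def manifest(config_dir: str) -> dict[str, str]:
    """Map each file under config_dir (relative path) to its MD5 digest."""
    result = {}
    for root, dirs, files in os.walk(config_dir):
        # Skip the CVS bookkeeping directories of a working copy.
        dirs[:] = [d for d in dirs if d != "CVS"]
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                digest = hashlib.md5(fh.read()).hexdigest()
            result[os.path.relpath(path, config_dir)] = digest
    return result


def out_of_sync(reference: dict[str, str], node: dict[str, str]) -> list[str]:
    """Return sorted names that differ or exist on only one side."""
    names = set(reference) | set(node)
    return sorted(n for n in names if reference.get(n) != node.get(n))
```

Each node would publish its manifest, and an empty result from `out_of_sync` means that node carries exactly the reference set of files. How the manifests travel between nodes is left open here.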

Volker



wiegand at suse

Jul 28, 1999, 11:30 PM

Post #18 of 25 (1012 views)
Permalink
Simple file sync [In reply to]

On Sun, 18 Jul 1999, Larry McVoy wrote:

> [...]
> However, it's not so clear that you want to use CVS. The Red Hat
> discussions were surrounding BitKeeper because it is a far more
> scalable answer.

Now that I found the BitKeeper homepage, let me just quote a few
statements:

"This is not free software, you have to pay for it
one way or another."

"The retail price is $2600/user"

I know that they have also set up conditions where you don't have to pay
them money, but these include other kinds of obligations I might not be
willing to abide by.

To be honest, this is definitely not what I will be looking into further.

Volker



steveu at netpage

Jul 28, 1999, 11:35 PM

Post #19 of 25 (1009 views)
Permalink
Simple file sync [In reply to]

Volker Wiegand wrote:

> On Sun, 18 Jul 1999, Larry McVoy wrote:
>
> > [...]
> > However, it's not so clear that you want to use CVS. The Red Hat
> > discussions were surrounding BitKeeper because it is a far more
> > scalable answer.
>
> Now that I found the BitKeeper homepage, let me just quote a few
> statements:
>
> "This is not free software, you have to pay for it
> one way or another."
>
> "The retail price is $2600/user"
>
> I know that they have also set up conditions where you don't have to pay
> them money, but these include other kinds of obligations I might not be
> willing to abide by.
>
> To be honest, this is definitely not what I will be looking into further.
>
> Volker

Quite true. With its current conditions BitKeeper is good for managing free
software projects, but cannot actually become a part of them. Since it was
BitKeeper's author that actually suggested its use I am a bit confused.

Steve


lm at bitmover

Jul 29, 1999, 1:26 AM

Post #20 of 25 (1009 views)
Permalink
Simple file sync [In reply to]

On Thu, Jul 29, 1999 at 08:30:03AM +0200, Volker Wiegand wrote:
> On Sun, 18 Jul 1999, Larry McVoy wrote:
> > However, it's not so clear that you want to use CVS. The Red Hat
> > discussions were surrounding BitKeeper because it is a far more
> > scalable answer.
>
> Now that I found the BitKeeper homepage, let me just quote a few
> statements:
>
> "This is not free software, you have to pay for it
> one way or another."
>
> "The retail price is $2600/user"

This is a really skewed way of presenting the data, my friend. First of
all, single user repositories are free. So if you don't want to pay
for it, do this

alias bk='USER=I_AM_REALLY_CHEAP /usr/bitkeeper/bk'

and you never have to pay a dime.

Second of all, the $2600 price is for very large sites - the price, as
you clearly could tell about 4 lines above where you got that one, goes
as low as $800/seat for small shops of 6 or less.

Third, I've had discussions with Red Hat about using BitKeeper for
sys admin on Red Hat boxes. There is a real chance that one day
you'll be using BitKeeper for sysadmin whether you like it or not.

Thanks for your kind consideration.
--
---
Larry McVoy lm [at] bitmover http://www.bitmover.com/lm


wiegand at suse

Jul 29, 1999, 4:40 AM

Post #21 of 25 (1012 views)
Permalink
Simple file sync [In reply to]

On Thu, 29 Jul 1999, Larry McVoy wrote:

> On Thu, Jul 29, 1999 at 08:30:03AM +0200, Volker Wiegand wrote:
> > On Sun, 18 Jul 1999, Larry McVoy wrote:
> > > However, it's not so clear that you want to use CVS. The Red Hat
> > > discussions were surrounding BitKeeper because it is a far more
> > > scalable answer.
> >
> > Now that I found the BitKeeper homepage, let me just quote a few
> > statements:
> >
> > "This is not free software, you have to pay for it
> > one way or another."
> >
> > "The retail price is $2600/user"
>
> This is a really skewed way of presenting the data, my friend. First of
> all, single user repositories are free. So if you don't want to pay
> for it, do this
>

Sorry, Larry, it is not skewed. This is not free software, and I was just
stating that I will not be considering commercial software for what I am
doing as an open source project if I can help it (i.e. if there's free
software that perfectly fits the need).

This whole discussion centered around using a version control system for
the purpose of storing production data, e.g. within companies. Since this
storing is not an "open source project" itself (which I interpret as the
act of development, and not the usage of the result of such programming in
production), and since those companies might not be willing to publish the
history of such changes, the option of using it for free does not apply.

> alias bk='USER=I_AM_REALLY_CHEAP /usr/bitkeeper/bk'
>
> and you never have to pay a dime.
>
> Second of all, the $2600 price is for very large sites - the price, as
> you clearly could tell about 4 lines above where you got that one, goes
> as low as $800/seat for small shops of 6 or less.
>

To be honest, I really did not read carefully enough. I was just assuming
that discounts were given for higher volumes. So I restate my second quote
into "$800/seat", albeit without changing the meaning of it.

> Third, I've had discussions with Red Hat about using BitKeeper for
> sys admin on Red Hat boxes. There is a real chance that one day
> you'll be using BitKeeper for sysadmin whether you like it or not.
>

Hmmm, there's more than one way to skin a cat. Without going into further
detail, I might answer, "There is a real chance that I'll get along
without ever using a Red Hat box if I don't like it." And to be honest
again, I have had enough of monopoly companies. Maybe we should not carry
this too much further.

> Thanks for your kind consideration.
>

Again, Larry, I have no interest whatsoever in saying anything against
your product. In fact, I honestly did not know you were in any way
connected with it.

Let's try to get back to reality. When you said "it is a far more scalable
answer", I was only curious to find out what BitKeeper had and CVS didn't,
technically speaking of course. What is it that makes it superior?

Volker



tom at globalauctions

Jul 29, 1999, 8:28 AM

Post #22 of 25 (1007 views)
Permalink
Simple file sync [In reply to]

On Thu, 29 Jul 1999, Larry McVoy wrote:

> Second of all, the $2600 price is for very large sites - the price, as
> you clearly could tell about 4 lines above where you got that one, goes
> as low as $800/seat for small shops of 6 or less.

Such a deel.


> Third, I've had discussions with Red Hat about using BitKeeper for
> sys admin on Red Hat boxes. There is a real chance that one day
> you'll be using BitKeeper for sysadmin whether you like it or not.

That sounds like something bill gates would say.


Tom O'Toole - tom.otoole [at] jhu
Protect privacy, boycott Intel: http://www.bigbrotherinside.org
Joachim Kempin says Windows is like Moby Dick, and, you know, he's right...
...just ask Netscape.


wanger at redhat

Jul 29, 1999, 8:25 PM

Post #23 of 25 (1008 views)
Permalink
Simple file sync [In reply to]

On Thu, 29 Jul 1999 13:40:20 +0200 (MEST), Volker Wiegand wrote:

>> Third, I've had discussions with Red Hat about using BitKeeper for
>> sys admin on Red Hat boxes. There is a real chance that one day
>> you'll be using BitKeeper for sysadmin whether you like it or not.
>>
>
>Hmmm, there's more than one way to skin a cat. Without going into further
>detail, I might answer, "There is a real chance that I'll get along
>without ever using a Red Hat box if I don't like it." And to be honest
>again, I have had enough of monopoly companies. Maybe we should not carry
>this too much further.

Larry's discussions with us have been limited to him presenting his
arguments for why this should happen. I don't see it going any further
than that until the licensing scheme is changed to make bitkeeper
completely free and open. Even then, I'm not convinced a source
control system to manage config files is the correct approach.

Mike

-----------------------------------------------------------------------
Mike Wangsmo Red Hat, Inc

"I've seen this before in Montana! Its snowing, nobody lick a flag
pole" -- Peggy Hill


wiegand at suse

Jul 30, 1999, 12:18 AM

Post #24 of 25 (1011 views)
Permalink
Simple file sync [In reply to]

On Thu, 29 Jul 1999 wanger [at] redhat wrote:

> [...]
> Larry's discussions with us have been limited to him presenting his
> arguments for why this should happen. I don't see it going any further
> than that until the licensing scheme is changed to make bitkeeper
> completely free and open.

That sounds different. Fine with me. And then let's evaluate it against
CVS or whatever.

> Even then, I'm not convinced a source
> control system to manage config files is the correct approach.

Now we are getting somewhere. Why not? What is the correct approach?
Please convince me. This is the discussion I want to have.

> [...]

Volker



jochen at scram

Jul 30, 1999, 7:43 AM

Post #25 of 25 (1015 views)
Permalink
Simple file sync [In reply to]

Hi Larry,

> Third, I've had discussions with Red Hat about using BitKeeper for
> sys admin on Red Hat boxes. There is a real chance that one day
> you'll be using BitKeeper for sysadmin whether you like it or not.

This will be the day I switch away from the Red Hat distribution, as I
did with S.u.S.E. over their use of the non-free (think speech) YaST tool
for system administration.

Cheers,
Jochen
