
rw26 at acf3
Mar 10, 1998, 9:47 AM
Post #4 of 4
Another possible inter-node communication system
On Tue, 10 Mar 1998, Alan Cox wrote:

> > My memory is that the IrDA spec allows for multiple devices to communicate with
> > each other, with some kind of address-resolution and assignment protocol. It
> > *seemed* to imply that multiple devices could talk to each other. Does anyone
> > out there know enough to know if that is what was intended. Maybe it was only
> > intended for two...
> >
>
> Its intended to handle many to 1 communications. The big problem is the protocol
> design is complex and many layered, that IMHO is the biggest thing that stopped
> irda taking off. Dag Brattel (I hope I spelt that right) has a nearly working
> Linux IRDA
>

It's Dag Brattli, http://www.cs.uit.no/~dagb/irda/irda.html

--randy

From Alan Robertson <alanr [at] bell-labs> Wed Mar 11 04:38:21 1998 [62]
From: Alan Robertson <alanr [at] bell-labs> (Alan Robertson <alanr [at] bell-labs>)
Date: Wed, 11 Mar 1998 05:38:21 +0100
Subject: Another possible inter-node communication system
In-Reply-To: <Pine.OSF.3.95.980310114437.26322A-100000 [at] acf3>
Message-ID: <ML-3.3.889591101.7457.alanr [at] alanrhome>

Thanks for all the up-to-date references.

I looked at the specs (again), and it looks *possible* to do a keep-alive using it, perhaps even for "N" machines. That's why I sent it out. I suspect the Linux drivers are months (a year?) away from being able to do all we'd like for that purpose though.

It looks like one might be able to use it as a sort of "poor-man's TCP". Connection setup time could be a problem with "N" machines all vying to send trivial packets to each other :-)

It looks like the "multiplexer mode" (LM-MUX mode) might be useful for letting each machine grab the link and query "who's there?" perhaps without setting up N-squared individual connections.
-- Alan Robertson alanr [at] bell-labs

> On Tue, 10 Mar 1998, Alan Cox wrote:
>
> > > My memory is that the IrDA spec allows for multiple devices to
> > > communicate with each other, with some kind of address-resolution and
> > > assignment protocol. It *seemed* to imply that multiple devices could
> > > talk to each other. Does anyone out there know enough to know if that
> > > is what was intended. Maybe it was only intended for two...
> > >
> >
> > Its intended to handle many to 1 communications. The big problem is the
> > protocol design is complex and many layered, that IMHO is the biggest
> > thing that stopped irda taking off. Dag Brattel (I hope I spelt that
> > right) has a nearly working Linux IRDA
> >
>
> It's Dag Brattli, http://www.cs.uit.no/~dagb/irda/irda.html
>
> --randy

From Alan Robertson <alanr [at] bell-labs> Wed Mar 11 05:37:37 1998 [63]
From: Alan Robertson <alanr [at] bell-labs> (Alan Robertson <alanr [at] bell-labs>)
Date: Wed, 11 Mar 1998 06:37:37 +0100
Subject: Linux-HA project structure?
Message-ID: <ML-3.3.889594657.1942.alanr [at] alanrhome>

Is there a list of who is working on what regarding Linux-HA?

In other words, is there some kind of structure (individuals, teams, committees, whatnot) to the efforts associated with Linux-HA?

Here's all I know about (which isn't much):

Activity        Participants
----------------------------
FAQ:            Harald Milz

I'm willing to put this up on my (personal) web site, if there's more to it than this, and others would be interested in having such a list on the web.

Thanks!!

-- Alan Robertson alanr [at] bell-labs

From Alan Robertson <alanr [at] bell-labs> Thu Mar 12 02:41:12 1998 [64]
From: Alan Robertson <alanr [at] bell-labs> (Alan Robertson <alanr [at] bell-labs>)
Date: Thu, 12 Mar 1998 03:41:12 +0100
Subject: Linux-HA project structure?
In-Reply-To: <B1D4CEFC33F9D011AD4800805F0565F0053780 [at] hdi014s>
Message-ID: <ML-3.3.889670472.225.alanr [at] alanrhome>

Eric van Dijken <dijken [at] tfi>
> Hi,
>
> I'am working solo on a Linux HA project, using Linux Alpha systems.
>
> Greetings Eric.

Eric,

Is it something you can tell us about? I'm primarily interested in Alphas also.

-- Alan Robertson alanr [at] bell-labs

From Alan Robertson <alanr [at] bell-labs> Fri Mar 13 06:31:36 1998 [65]
From: Alan Robertson <alanr [at] bell-labs> (Alan Robertson <alanr [at] bell-labs>)
Date: Fri, 13 Mar 1998 07:31:36 +0100
Subject: Basic Linux-HA architecture
In-Reply-To: <199803120813.JAA27765 [at] hdxl29>
Message-ID: <ML-3.3.889770696.9124.alanr [at] alanrhome>

Eric vanDijken wrote:
> My goals are:
> - IP takeover
> - Application takeover
> - 99,9 % available system
>
> I'am currently working on a flowchart about what to do when.

I spent some time thinking about this this evening and would offer some (probably obvious) thoughts.

First: In an HA-system, the heartbeat and IP takeover seem to be most fundamental. So, it seems to me that you're starting in the right place.

Second: It seems that having Dynamic DNS would be most beneficial for such a system, especially when one has a round-robin DNS configuration and wants to gracefully shut down a node for maintenance. What you want to do is:

A) Take node OOS (out of service) by:
   1) inform DNS to stop delivering to this IP address
   2) Tell the applications to stop accepting new clients
      [This may mean telling Beowulf or whoever that it shouldn't be considered for job scheduling]
   3) wait a while for applications to naturally terminate
   4) give applications the signal to shut down anyway

B) Take node offline by:
   1) performing the IP takeover

This means that each node has to have an administrative IP address so you can get into it even when someone has taken over its "real" IP address.

Now, without dynamic DNS, step A.1 can't be done realistically.
But, before we are ready to ship any product :-), it will be available, and it would seem good to plan to accommodate it, and to allow for graceful maintenance shutdowns as well as catastrophic ones.

Wandering into more random thoughts:

It would seem that a SYSV init-style directory structure would be helpful for allowing various applications to "plug into" the HA architecture.

For example: Postulate a directory named /etc/ha.d, with a subdirectory named init.d. In this directory are various application scripts which get invoked whenever a node changes its HA status. For example, one could have a script named 020beowulf which would be invoked as "/etc/ha.d/init.d/020beowulf online" after a node successfully joined the cluster. When the node was taken out of service it would be invoked with an "OOS" argument, and invoked with an "offline" argument when the current machine was no longer any part of the cluster. Such a convention would allow applications to plug into the HA architecture and be informed of changes in the state of the world neatly and cleanly.

Similarly, one could imagine another directory named /etc/ha.d/config.d, with scripts in it to be invoked when the cluster configuration changed (but not the current machine). For instance, one might imagine a script 020beowulf which was called as "/etc/ha.d/config.d/020beowulf online alpha-1.kpn.com". It could also be called with an OOS and offline argument as above. You could have scripts in this directory which would carry out the flowchart that you are designing.

Does this seem like a good way to think about it? Does anyone else see the need for three states (rather than just two)?

-- Alan Robertson alanr [at] bell-labs
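
To make the /etc/ha.d convention above a little more concrete, here is a minimal sketch of a dispatcher that runs every executable script in such a directory, in name order, with the new state as its argument. It is only an illustration under stated assumptions: the function names plan_invocations and run_ha_scripts are hypothetical, nothing like them is specified in the message, and the choice to skip non-executable files and to run scripts in lexical order is borrowed from the usual SysV init habit rather than from any agreed Linux-HA design.

```python
import os
import subprocess

# The three states proposed in the message; spellings follow the post.
VALID_STATES = ("online", "OOS", "offline")


def plan_invocations(directory, state, node=None):
    """Return the command lines to run, in lexical (020..., 030...) order.

    Hypothetical helper: 'node' is only passed for config.d-style scripts,
    which are told which other machine changed state.
    """
    if state not in VALID_STATES:
        raise ValueError(f"unknown HA state: {state!r}")
    commands = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything not marked executable.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        argv = [path, state]
        if node is not None:
            argv.append(node)
        commands.append(argv)
    return commands


def run_ha_scripts(directory, state, node=None):
    """Run each planned script in turn and return {script_path: exit_status}."""
    results = {}
    for argv in plan_invocations(directory, state, node):
        # check=False: one failing plug-in should not stop the others (an assumption).
        results[argv[0]] = subprocess.run(argv, check=False).returncode
    return results
```

With this sketch, run_ha_scripts("/etc/ha.d/init.d", "online") would correspond to the "/etc/ha.d/init.d/020beowulf online" call described above, and run_ha_scripts("/etc/ha.d/config.d", "online", "alpha-1.kpn.com") to the config.d case.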