
Mailing List Archive: Linux-HA: Users

multiple SCSI hosts

 

 



jo2y at positroncomp

Mar 28, 1999, 10:05 PM

Post #1 of 6 (1696 views)
multiple SCSI hosts

Hi,
I've been thinking about the shared SCSI bus question, and I have
some ideas that I would like to bounce off some people before I spend too
many hours researching an empty dream. I've done a little reading on the
archives, but I don't think I'm using the right search words.

I'm curious why we should restrict ourselves to current SCSI
hardware that is available? What about designing a special card that does
what we want? I'll admit up front that I am nowhere near familiar with
the current SCSI specs, but I would guess that we could add control codes
to the set as long as we don't overlap in namespace.

The card that I have in mind would work just like any other SCSI
card when by itself, but when you configure it in a network of machines,
each SCSI card could 'ping' the other SCSI cards on the bus, and tell if
they are still alive. My first thoughts on how control is done would be by
SCSI ID. The lower your number the more authority you have.
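
The ID-based scheme above can be sketched in a few lines of Python. This only illustrates the proposal; it is not an existing SCSI feature. `elect_master` and the `alive_by_id` mapping are made-up names, and the 'ping' results are assumed to have been collected already. Note that standard SCSI bus arbitration runs the other way on a narrow bus (ID 7 has the highest priority), so "lowest ID wins" is purely the convention proposed in this post.

```python
def elect_master(alive_by_id):
    """Return the lowest SCSI ID that answered the 'ping', or None.

    alive_by_id is a hypothetical mapping {scsi_id: answered_ping}.
    """
    alive = [scsi_id for scsi_id, answered in alive_by_id.items() if answered]
    return min(alive) if alive else None


# Card 0 has gone silent, so card 3 takes over as the authority.
print(elect_master({0: False, 3: True, 5: True}))
```

Every card would have to run the same rule on the same view of the bus; cards that disagree about who is alive would elect different masters.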

I read that one problem is that when a card comes back on-line it
reinitializes the drives, but a card designed for this purpose
could be made to detect if the bus is already active.

I have some more ideas about this, but they start getting into
details of implementation. I'm hoping to get one of a few replies:
either someone can point me to a link to someone who is already doing
this, someone can point out why this idea is flawed, or someone can
point me to some links where I could start researching more.

thanks
-james


steveu at netpage

Mar 29, 1999, 2:03 AM

Post #2 of 6 (1635 views)
multiple SCSI hosts [In reply to]

This is my first intrusion into the HA forum. I have implemented a variety of
high availability and fault tolerant hardware and software at various times
over the years, so I hope I have something meaningful to say.

I don't really see a problem with current commercial SCSI PCI cards in high
availability systems, as far as functionality is concerned. The cards don't
have to reinitialise the SCSI drives as they come up. That is (usually)
controllable in software, and the drivers can be adapted appropriately.
Adaptec, BusLogic and other cards all seem to have the functionality
needed to provide a high availability solution, with hosts rebooting at any
time. An important issue, however, is how you terminate the bus. The active
termination on some cards goes crazy during reboot, so termination must be
provided elsewhere. As long as the cards are just another SCSI node along the
bus there should be no problem with reboots. Cycling the power is another
issue, and starts to get to the heart of problems with SCSI and high
availability.

Detecting failures and working around them in a high availability system is
simple compared to the MAJOR problem you have to face - systems that don't
die, but just act screwy. Screwy here could mean babbling, or functionality
that comes and goes intermittently. When you think you have allowed for every
screwy effect an intermittent fault could cause, you can be sure you haven't.
SCSI offers lots of potential for screwy behaviour. Like most systems, it has
the potential for babbling faults, but these seem quite rare in practice. The
real problems are intermittent connections, grounding problems and so forth.

An important issue, for any complex multi-machine SCSI configuration, is that
it tends to get flaky, due to grounding problems. This not only causes data
errors - they can actually be bad enough to blow up the hardware. Grounding
was a particularly bad problem in the Ultra-2 era, with high clock rates on
single-ended systems. Now that the newest SCSI hardware has gone differential
(yes, I know differential was always an option for SCSI, but I haven't seen
much differential hardware in use for the older SCSI revisions), grounding
should be a bit less of a bother. The system is still not immune from grounding
troubles, though. It's just a bit more tolerant. Remember, those maximum cable
lengths quoted in the SCSI specs have nothing to do with science (unlike, say,
the maximum length for thick Ethernet, which is based on the speed of light),
but are engineering guesstimates of what you might be able to get away with.
The Linux HA HOWTO says a bit about ground and termination problems, but
doesn't sufficiently emphasise the extent to which these, and SCSI's
generally poor design for distributed operation, are a pain in the posterior.

Modern SCSI variants use very flimsy connectors, which are quite trouble
prone. SCSI is a parallel bus, so it has a lot of connections, and requires
every one to be reliable. It has no fault tolerance, and only simple parity
(which is often not even implemented) as a fault detection mechanism. Taking
SCSI cables and connectors outside a box, to connect to another box, is
asking for trouble. It's fine in a test environment. In the real world
anything remotely fragile gets damaged, even in a fairly well controlled
room, or a rack. I only ever feel comfortable with SCSI when it's all tidily
in one box, with one power supply. One processor box connected to one
adjacent RAID box, tightly cabled together, and plugged into the same power
outlet is about the greatest risk I want to take with distributed SCSI. I've
never seen anything more complex be 100% reliable.

So, spreading a SCSI bus around a number of boxes may have more potential for
reducing availability than increasing it.
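
A rough way to put numbers on this: if the shared bus only works when every attached box, cable and connector is behaving, the individual availabilities multiply. The sketch below assumes independent failures and an invented figure of 99% per box; neither assumption comes from measurements, and `shared_bus_availability` is just an illustrative name.

```python
def shared_bus_availability(per_box, boxes):
    """Availability of a bus that needs every attached box to behave.

    per_box is an assumed, illustrative availability (0..1) for one box
    together with its cabling; failures are assumed independent.
    """
    return per_box ** boxes


for n in (1, 2, 4, 8):
    print(n, round(shared_bus_availability(0.99, n), 4))
```

Under these assumptions each extra box on the bus lowers the availability of the whole bus, which is the opposite of what the extra hosts were meant to achieve.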

"If you want to know about fault tolerant systems ask Microsoft. I know of
nobody else that tolerates quite so many faults in the systems!"

Steve





okeefe at lcse

Mar 29, 1999, 6:51 AM

Post #3 of 6 (1621 views)
multiple SCSI hosts [In reply to]

A few comments on Steve's and James's comments about SCSI and HA:

Fibre Channel is the disk drive industry's answer to most of the
problems mentioned with parallel SCSI. It is a fast, scalable,
network interface that fixes all of the physical interface issues.
Each Fibre Channel port can connect directly to 100s to 1000s of devices;
it is a Gigabit serial interface using either optical or coax
and can extend 10s of kilometers; it has a smart switch framework
called a "Fabric" that can effectively isolate hosts that have gone
mad or are just in a state of constant stupidity (for example, NT
insists on re-formatting any drive it can see on a SCSI bus if it doesn't
have a Microsoft NT disk label embedded on it).
>
> An important issue, for any complex multi-machine SCSI configuration, is that
> it tends to get flaky, due to grounding problems. This not only causes data
> errors - they can actually be bad enough to blow up the hardware.
Goes away with FC.

> Modern SCSI variants use very flimsy connectors, which are quite trouble
> prone. SCSI is a parallel bus, so it has a lot of connections, and requires
> every one to be reliable.
FC is a serial interface that uses only point-to-point connections: there is no
physical bus.

> It has no fault tolerance, and only simple parity
> (which is often not even implemented) as a fault detection mechanism. Taking
> SCSI cables and connectors outside a box, to connect to another box, is
> asking for trouble.

Goes away with FC, which has very sophisticated error checking mechanisms.

> It's fine in a test environment. In the real world
> anything remotely fragile gets damaged, even in a fairly well controlled
> room, or a rack. I only ever feel comfortable with SCSI when it's all tidily
> in one box, with one power supply. One processor box connected to one
> adjacent RAID box, tightly cabled together, and plugged into the same power
> outlet is about the greatest risk I want to take with distributed SCSI. I've
> never seen anything more complex be 100% reliable.
>
> So, spreading a SCSI bus around a number of boxes may have more potential for
> reducing availability than increasing it.

Again, Fibre Channel basically gets rid of all these problems, but of
course brings a few problems of its own :-( There is a tendency in
current FC chip sets to do the equivalent of a SCSI BUS RESET --
in FC it is called a LIP (Loop Initialization Protocol) -- whenever
"problems" occur, for example new devices or adapters using
different chip sets appearing on the bus. This problem should go
away as FC matures, and can be avoided altogether if you have the
$$$ for a FC fabric.

Regarding heartbeats: IP heartbeats are great since they are integrated
with the host naming environment, and they can help you determine if
hosts are down or if the network is down (especially if hosts are
connected with both FC/SCSI and Ethernet). Our group has been
working with the industry to define a SCSI lock protocol
(called Device Locks) which includes a feature which could be
used for heartbeating hosts.

Each device lock is a multiple readers/single writer lock
that sits on a device (each device can have thousands or millions of
these locks). A lock held by a host must be tickled by the host
every X seconds, where X is defined in the mode page. By assigning
each host a specific lock for heartbeating, and by regularly
reading the expired lock bitmap using a DLOCK command, it is
possible to determine which clients can connect to a device and
appear to be operational.
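
The heartbeat use of Device Locks can be modelled roughly as below. This is a toy simulation only: it does not reproduce the actual DLOCK command or mode page layout (see the spec for those), and `HeartbeatLocks`, `tickle`, `expired_bitmap` and `hosts_down` are hypothetical names. Time is passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatLocks:
    """Toy model of per-host heartbeat locks held on one shared device."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # the "X seconds" from the mode page
        self.last_tickle = {}        # lock number -> time of last tickle

    def tickle(self, lock_id, now):
        """A host refreshes the lock it holds."""
        self.last_tickle[lock_id] = now

    def expired_bitmap(self, now):
        """Bit n is set if lock n has not been tickled within the timeout."""
        bitmap = 0
        for lock_id, last in self.last_tickle.items():
            if now - last > self.timeout_s:
                bitmap |= 1 << lock_id
        return bitmap


def hosts_down(bitmap, host_to_lock):
    """Map expired lock bits back to the hosts assigned to them."""
    return sorted(h for h, lock in host_to_lock.items() if bitmap & (1 << lock))


locks = HeartbeatLocks(timeout_s=5)
locks.tickle(0, now=0)   # host "a" holds lock 0
locks.tickle(1, now=0)   # host "b" holds lock 1
locks.tickle(0, now=4)   # only "a" keeps tickling
print(hosts_down(locks.expired_bitmap(now=7), {"a": 0, "b": 1}))
```

Any host that can still read the device sees the same bitmap, so the check does not depend on the IP network being up.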

Device Locks are about 6 months away from being standardized in SCSI-3.
For more information about them get the spec from the following
web page: http://gfs.lcse.umn.edu. Currently Seagate FC drives
and a few RAID vendors have implemented DLOCKs. They should be more
widely available once the command makes it through the formal SCSI
standardization process.

Matt
Matthew T. O'Keefe okeefe [at] ece (612) 625-6306
Director Pretty Cool Software Laboratory
University of Minnesota FAX: (612) 625-4583
Minneapolis, MN 55455 WWW: http://www.lcse.umn.edu/~okeefe





steveu at netpage

Mar 29, 1999, 8:34 AM

Post #4 of 6 (1626 views)
multiple SCSI hosts [In reply to]

Comments on comments on comments about SCSI, FC and HA.

FC seems interesting, but pretty immature right now. You referred to some quirks in
current chip sets, and I have heard of others. I think FC is a maybe technology -
maybe it will get critical mass, or maybe it will fade away. Right now we only
have SCSI in the mass market low cost arena, and FC commands an unreasonable price
premium. Maybe that will go, but since SCSI has an unreasonable premium over IDE
maybe it will stay.

I haven't worked with a real FC system yet, but from what I understand it has some
significant SPOF problems of its own. The switching fabric, which negotiates
amongst participants, looks like the most serious one. That's a single piece of
electronics negotiating amongst multiple fibres, is it not? The electronics should
have a much lower failure rate than the mechanics of the drives, so it offers some
real availability benefits (somewhat like a RAID box does). It does seem like a
weak link for a large system, though. Are there redundant options for the SPOFs? I
haven't read about them in descriptions of FC.

I was not aware that FC has a coax option. Sounds like a dumb idea for long runs,
as it removes the electrical isolation that makes FC look so nice. Did you really
mean that FC on coax is designed for long runs? I would think coax would only be
appropriate within a single box.

I certainly hope an isolated (presumably fibre based) solution with simple cabling
and robust connections does replace SCSI. Maybe it will be FC. I certainly
consider SCSI a liability in HA systems.

Steve


okeefe at lcse

Mar 29, 1999, 9:54 AM

Post #5 of 6 (1627 views)
multiple SCSI hosts [In reply to]

>
> Comments on comments on comments about SCSI, FC and HA.
>
> FC seems interesting, but pretty immature right now. You referred to some quirks in
> current chip sets, and I have heard of others. I think FC is a maybe technology -
> maybe it will get critical mass, or maybe it will fade away. Right now we only
> have SCSI in the mass market low cost arena, and FC commands an unreasonable price
> premium. Maybe that will go, but since SCSI has an unreasonable premium over IDE
> maybe it will stay.

I think it will stay because Seagate is giving it away on the drives. When the
adapter prices drop below $300 that will also help a lot. Most of the major PC
OEMs (Dell, Compaq, IBM) are shipping it now.

>
> I haven't worked with a real FC system yet, but from what I understand it has some
> significant SPOF problems of its own. The switching fabric, which negotiates
> amongst participants, looks like the most serious one. That's a single piece of
> electronics negotiating amongst multiple fibres, is it not? The electronics should
> have a much lower failure rate than the mechanics of the drives, so it offers some
> real availability benefits (somewhat like a RAID box does). It does seem like a
> weak link for a large system, though. Are there redundant options for the SPOFs? I
> haven't read about them in descriptions of FC.

Yes, you can use multiple Fabrics to provide multiple paths. Some companies
like McData are selling these products already, using standard FC fabric switches
with some redundancy hardware for multipathing and fail-over built in. They are pretty
pricey, so part of our effort includes building some of this support for multiple
Fabrics into Linux. Brocade and Seagate have been supportive of our efforts.
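
As a rough picture of what multipathing with fail-over means on the host side, here is a minimal sketch. It is not the Linux implementation mentioned above; `MultipathDevice`, `submit` and the two example paths are hypothetical, and each path is modelled as a plain function that raises `OSError` when that fabric (adapter, switch or cable) has failed.

```python
class MultipathDevice:
    """Try each fabric path in order; fall over to the next on an I/O error."""

    def __init__(self, paths):
        # paths: list of (name, send) pairs; send(request) returns a result
        # or raises OSError if that path has failed.
        self.paths = list(paths)

    def submit(self, request):
        failures = []
        for name, send in self.paths:
            try:
                return name, send(request)
            except OSError as exc:
                failures.append(f"{name}: {exc}")
        raise OSError("all paths failed: " + "; ".join(failures))


def dead_path(request):
    raise OSError("link down")


def live_path(request):
    return f"done {request}"


dev = MultipathDevice([("fabric-A", dead_path), ("fabric-B", live_path)])
print(dev.submit("read block 42"))
```

A real driver would also need to decide when a failed path may be retried and how to avoid reordering outstanding requests; the sketch ignores both.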

Other cool things about Fibre Channel:

(1) Termination is not an issue since, as I mentioned in my last note, connections
are all point to point. However, this does not mean that a device failure
cuts the "logical" bus.

(2) In fact, FC has a port bypass capability that allows
signal pass through: this also allows for clean, safe hot pluggability to
replace failed drives or enclosures. Port bypass can be controlled
by the disk device, the enclosure, or the host.

>
> I was not aware that FC has a coax option. Sounds like a dumb idea for long runs,
> as it removes the electrical isolation that makes FC look so nice. Did you really
> mean that FC on coax is designed for long runs? I would think coax would only be
> appropriate within a single box.

Coax reduces the per-port cost compared to more expensive optical ports.
When the disk vendors started getting serious about Fibre Channel they insisted
on a coax capability for cost (not distance) reasons. Actually, I should
really say copper, not coax. What people are using as an alternative to fibre
is twinax, which is basically a differential pair inside a shield, with one
pair for transmit and one for receive. Differential lines tend to align
signal edges better and reduce emissions. Right now twinax can go 30 meters
at 1 Gigabit; the design plans in FC are to go to 2 Gbit, then 4, then
8 Gbit.

With fibre connections FC can run for 10s of kilometers -- a team at the U. of
Minnesota just reported results for such long runs, and they were pretty good.
This is very useful for disaster recovery and tolerance.

>
> I certainly hope an isolated (presumably fibre based) solution with simple cabling
> and robust connections does replace SCSI. Maybe it will be FC. I certainly
> consider SCSI a liability in HA systems.

Quite true...
Matt



alan at lxorguk

Mar 29, 1999, 9:55 AM

Post #6 of 6 (1636 views)
multiple SCSI hosts [In reply to]

> significant SPOF problems of its own. The switching fabric, which negotiates
> amongst participants, looks like the most serious one. That's a single piece of
> electronics negotiating amongst multiple fibres, is it not? The electronics should
> have a much lower failure rate than the mechanics of the drives, so it offers some
> real availability benefits (somewhat like a RAID box does). It does seem like a


You can have multiple links or fabrics. The Box Hill kit I borrowed had
two Fibre Channel loops across all the drives.
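
To see why a second loop or fabric helps with the SPOF concern, here is a back-of-envelope sketch. It assumes the paths fail independently and uses an invented 99% availability per path; `redundant_availability` is an illustrative name, not anything from a vendor.

```python
def redundant_availability(per_path, paths):
    """Availability when any one of several independent paths is enough.

    per_path is an assumed, illustrative availability (0..1) of one
    loop or fabric; the paths are assumed to fail independently.
    """
    return 1 - (1 - per_path) ** paths


for n in (1, 2, 3):
    print(n, round(redundant_availability(0.99, n), 6))
```

The gain disappears if both paths share a common failure cause (same power feed, same cable tray), so the independence assumption matters more than the exact figures.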
