
Mailing List Archive: Linux Virtual Server: Users

[lvs-users] LVS + Database

 

 



sayupirapo at gmail

Jun 13, 2012, 5:32 PM

Post #1 of 17 (1059 views)
Permalink
[lvs-users] LVS + Database

Hi,
How can I configure load balancing for a database cluster with LVS?
I use keepalived, and I already have a load balancer configuration for
the web servers, but I don't know how it works for a database.

Which port should I use?
How do I set up a health check to monitor the real servers?

Thanks

--
Sayuri Komatsu
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users [at] LinuxVirtualServer
Send requests to lvs-users-request [at] LinuxVirtualServer
or go to http://lists.graemef.net/mailman/listinfo/lvs-users


anders.henke at 1und1

Jun 14, 2012, 1:01 AM

Post #2 of 17 (1021 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On June 13th, YesGood wrote:
> Hi,
> How I can configure load balancing Database cluster with the LVS.?
> I use keepalived, and I have one configuration for the load balancer for
> the web server, but I don't know how work for the database.
>
> with what port I work?

This depends on your specific database server and installation.
For example, MySQL runs on port 3306 by default, but may be
run on any arbitrary port.

Please note in general, your specific application has to be fine with being
load balanced, as LVS only balances on the network level.

LVS doesn't look inside your TCP connections, so it can't e.g. tell
"read" from "write" requests and route them to different nodes or
clusters accordingly. LVS also doesn't synchronize data between your
database servers - that's the job of your DBMS.

Actually, there aren't many DBMSs that permit load-balanced database
connections; some of them implement load balancing on a different
level, and yet others may be "mis-used" for load balancing.

For example, I'm running a MySQL "write master" whose data is
replicated onto a few dozen MySQL servers. Those replicas run in
"read-only" mode, and an LVS load balancer distributes incoming
connections among those read-only nodes.
The use cases are the following:
-any application that writes data connects to the "write master".
-any application that absolutely requires consistent data connects to
the "write master" as well.
-any application that only reads data, and is fine with that data being
off by a few seconds, connects to the read-only cluster.
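
A minimal sketch of the three rules above, in Python. The host names and
the Workload type are hypothetical and only illustrate the decision; in
practice each application is simply configured with one endpoint or the other.

```python
from dataclasses import dataclass

# Hypothetical endpoints: the master's own address and the LVS virtual IP
# that fronts the read-only replicas. Neither value comes from the thread.
WRITE_MASTER = ("db-master.example.net", 3306)
READ_ONLY_VIP = ("db-read.example.net", 3306)


@dataclass(frozen=True)
class Workload:
    writes: bool             # the application modifies data
    needs_fresh_reads: bool  # the application cannot tolerate replication lag


def pick_endpoint(w: Workload) -> tuple[str, int]:
    """Return (host, port) for an application, following the three rules."""
    if w.writes or w.needs_fresh_reads:
        return WRITE_MASTER
    # Read-only and tolerant of data that is a few seconds old.
    return READ_ONLY_VIP
```

Note that the decision is made per application, not per query: LVS never
sees the queries, it only distributes connections to the read-only cluster.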

> how health check to monitor the real servers?

keepalived has a "MISC_CHECK" checker for health checks; you may add a
small, short-running script (shell/perl/python/whatever) to test the
availability of your database servers that way: connect to the database,
optionally run some small availability test, disconnect from the
database, return 0 for success and any other return code for failure.
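
A minimal sketch of such a script, in Python. It only verifies that the
server accepts a TCP connection; a real check would log in and run a small
query with the client library of your DBMS. The function names and the
3-second timeout are assumptions of this sketch.

```python
import socket
import sys


def database_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Open and close a TCP connection; True if the server accepted it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # A real check would authenticate and run a trivial query here.
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def exit_code(host: str, port: int) -> int:
    """Map the result to the convention described above: 0 means healthy."""
    return 0 if database_is_reachable(host, port) else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    # Invoked as: <script> <host> <port>
    sys.exit(exit_code(sys.argv[1], int(sys.argv[2])))
```

The short timeout matters: the script should finish well within the
check timeout configured in keepalived.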

As a starting point, you may e.g. use the nagios-plugins; they
connect to various database systems and already implement this
"return code interface".


Anders
--
1&1 Internet AG Expert Systems Architect (IT Operations)
Brauerstrasse 50 v://49.721.91374.0
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich,
Robert Hoffmann, Andreas Hofmann, Markus Huhn, Hans-Henning Kettler,
Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren




sayupirapo at gmail

Jun 14, 2012, 6:29 AM

Post #3 of 17 (1019 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Thank you, Anders Henke.

But is it possible to build a load-balanced database cluster with LVS
in which all clients connect to the database servers in read and write
mode? Or are there better tools for this purpose?

I want to build a cluster with PostgreSQL, so I would use port 5432 in
keepalived.conf.
Are there any rules for the script used in MISC_CHECK?
I don't know Nagios; which condition should I check with it?

For the replicas in the database server pool, I am thinking of working
with DRBD+OCFS2, and of using heartbeat for health-check monitoring of
the database server pool.
Is that a feasible scenario?

--
Sayuri Komatsu


anders.henke at 1und1

Jun 14, 2012, 9:08 AM

Post #4 of 17 (1034 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On June 14th 2012, YesGood wrote:
> And, but it's possible to build a load balancing cluster of database with
> LVS?
> where all clients connects to database servers in write and read mode.
> or
> there are other best tools for this target?

It's possible, when the database servers do support such a pattern.

In terms of databases, you're looking for terms like "synchronous
multi-master replication". As far as I know, PG doesn't support multi-master
replication out of the box, but there are a few third-party tools
that claim to add such support to PG.

http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

looks like a good starting point.

Maybe you'd like to take a look at

http://www.postgresql.org/docs/current/interactive/high-availability.html
and
http://www.postgresql.org/docs/current/interactive/different-replication-solutions.html

as well (both for load balancing as well as HA).

The next question is whether the overall design is actually an improvement
over a simple active/passive HA cluster.

> I want to build a cluster with postgreSQL, and then I would use the port
> 5432 in the keepalived.conf
> For the script in the MISC_CHECK, there are some rules?
> And I don't know nagios, but with the nagios, what condition I should check?

Basically Nagios consists of a daemon, a web interface and a set of
so-called "plugins". The latter are developed independently from
Nagios and are simple command line utilities available via http://nagiosplugins.org/.

For example this one:

---cut
$ /usr/lib/nagios/plugins/check_pgsql --help
check_pgsql v1.4.15 (nagios-plugins 1.4.15)
Copyright (c) 1999-2007 Nagios Plugin Development Team
<nagiosplug-devel [at] lists>

Test whether a PostgreSQL Database is accepting connections.

Usage:
check_pgsql [-H <host>] [-P <port>] [-c <critical time>] [-w <warning time>]
[-t <timeout>] [-d <database>] [-l <logname>] [-p <password>]
[...]
This plugin tests a PostgreSQL DBMS to determine whether it is active and
accepting queries. In its current operation, it simply connects to the
specified database, and then disconnects. If no database is specified, it
connects to the template1 database, which is present in every functioning
PostgreSQL DBMS.

The plugin will connect to a local postmaster if no host is specified. To
connect to a remote host, be sure that the remote postmaster accepts TCP/IP
connections (start the postmaster with the -i option).
[...]
---cut

would give a good start for such a "MISC_CHECK" script.
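
A rough sketch of wrapping the plugin for MISC_CHECK, in Python. Nagios
plugins exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN); the
wrapper collapses that to 0 or 1. This matters if misc_dynamic is enabled,
since keepalived then reads exit codes from 2 upwards as weight changes
rather than failures (please check the keepalived.conf man page for your
version). Treating WARNING as a failure and the 5-second plugin timeout are
assumptions of this sketch; the plugin path is the one from the help output
above and differs between distributions.

```python
import subprocess
import sys

# Path taken from the help output above; adjust it for your distribution.
CHECK_PGSQL = "/usr/lib/nagios/plugins/check_pgsql"


def run_plugin(argv: list[str], timeout: float = 10.0) -> int:
    """Run a Nagios-style plugin and return 0 (healthy) or 1 (failed)."""
    try:
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Plugin missing or hung: treat the real server as down.
        return 1
    # Assumption: only a plain OK (exit code 0) counts as healthy.
    return 0 if result.returncode == 0 else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    # Invoked as: <script> <host> <port>
    host, port = sys.argv[1], sys.argv[2]
    sys.exit(run_plugin([CHECK_PGSQL, "-H", host, "-P", port, "-t", "5"]))
```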

> For the replicas in the database servers pool, I thinking work with the
> DRBD+ocfs2.

From what you're writing, I assume the following:
-you'll be installing PG on two nodes "behind" an LVS load balancer.
-DRBD is used to give all nodes shared block storage.
-A shared OCFS2 filesystem will be run on top of the DRBD volumes.

In order to speed up database queries, a lot of information is
cached in the local memory (RAM) of each node. So if one node writes
a block onto the OCFS2 filesystem in a DBMS that is not multi-master-aware,
the other nodes won't know about the changed information: the DBMS making
the change isn't aware of any other DBMS to notify. If the other nodes are
asked for the same information, they might answer with locally cached,
but outdated information.

So if your application reads from one PG node while write requests
go to a different PG node, you may end up with invalid data.

That's just a very simple example. In reality, your nodes will not
only write "wrong" records to their shared database, they may also
damage indexes, overwrite consistent data with inconsistent data or
simply shred your data in ways your DBMS never anticipated and
can't repair.
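
The stale-read case can be shown with a toy model in Python. This is not
how PostgreSQL is implemented; NaiveNode is a hypothetical stand-in for any
DBMS that caches in RAM and is unaware of other writers on the same storage.

```python
class NaiveNode:
    """A DBMS instance that caches rows in RAM and knows of no peers."""

    def __init__(self, shared_disk: dict):
        self.disk = shared_disk  # stands in for the DRBD/OCFS2 volume
        self.cache = {}          # stands in for the node's in-memory cache

    def read(self, key):
        if key not in self.cache:          # cache miss: fetch from disk
            self.cache[key] = self.disk[key]
        return self.cache[key]

    def write(self, key, value):
        self.cache[key] = value
        self.disk[key] = value             # no other node is notified


disk = {"balance": 100}
node_a, node_b = NaiveNode(disk), NaiveNode(disk)

node_b.read("balance")          # node B caches the value 100
node_a.write("balance", 40)     # node A updates the shared disk
stale = node_b.read("balance")  # node B still answers from its cache
print(disk["balance"], stale)   # prints: 40 100
```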

This issue might be avoided if PG could be configured not to use any caching
at all and to service every read request from disk (I don't know if that's
possible). However, this way you'd also give up what makes PG fast: its cache
in memory. Retrieving data from memory is a lot faster than retrieving data
from disk.

Depending on the exact configuration, writes may also be temporarily held
in the RAM of each node, but that's another story and may shred even more data
much faster in this setup.

A multi-master-aware DBMS would lock the affected database records per write
(much like the lock manager for OCFS2 does) and ensure that any changed
data has also been written onto all other nodes. Network latency usually
becomes an issue here; that's why many of those systems rely on
non-Ethernet networks for their synchronization work.

Then the next dimension: even if this clustering didn't damage your data,
it most likely wouldn't improve your performance either. At least the actual
performance for writes is likely to suffer.

-DRBD services read requests from local storage (unless that local
storage has failed), so read requests are usually fast and not an
issue. Writing can be an issue, as write requests have to be performed by
both nodes in the DRBD cluster before they may be marked
as "complete". So your database is not only waiting for the local disk
to complete the "write" transaction, it's also waiting for the network
to transmit the changed block and for the other node to acknowledge and
write that block to its local disk.
At the very least, you're adding this network latency to every write transaction.
DRBD permits some tuning (protocols A to C), but the less strict protocols
add inconsistency, so for a shared filesystem, you're stuck with synchronous mode.
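
As a back-of-the-envelope model in Python: if the local write and the
replicated write proceed in parallel, the write completes when the slower
path finishes. The function and the numbers are hypothetical and only meant
to show where the extra latency comes from.

```python
def synchronous_write_latency_ms(local_disk_ms: float,
                                 network_rtt_ms: float,
                                 remote_disk_ms: float) -> float:
    """Rough time until a synchronously replicated write is complete.

    The remote path has to send the block, write it on the peer and
    return the acknowledgement; the local path is just the local disk.
    """
    return max(local_disk_ms, network_rtt_ms + remote_disk_ms)


# Hypothetical figures: two identical disks (5 ms) and a 0.5 ms round trip.
single_node = 5.0
replicated = synchronous_write_latency_ms(5.0, 0.5, 5.0)
print(single_node, replicated)  # prints: 5.0 5.5
```

With identical disks on both nodes, the difference is exactly the network
round trip, paid on every write.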

-OCFS2 requires a lock manager; this lock manager makes sure only one
node at a time is able to write to a file on a shared filesystem.

So if you're running two non-multi-master-aware database systems on
the same shared file system, only one of them is actually able to write
to the shared transaction log. The other one will either throw an error
message (->no additional performance) or wait until the lock is available
(->no additional performance).

-OCFS2 works with block sizes from 512 bytes to 4 KB. DRBD works on top of a
4 KB internal block size. So in probably the worst scenario, you're using
OCFS2 with a 512-byte block size and for every OCFS2 block write, DRBD
will attempt to re-sync 4 KB of actual data.
If DRBD and OCFS2 aren't properly aligned to each other, this may eat
up any remaining performance pretty fast.
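
Taking the block sizes above at face value, the worst case works out as
follows. This is a simplified Python sketch; the function name is made up,
and it ignores alignment and any write merging the block layer may do.

```python
def write_amplification(fs_block_bytes: int,
                        replication_block_bytes: int = 4096) -> float:
    """Bytes replicated per byte written, for one isolated block write."""
    if fs_block_bytes <= 0:
        raise ValueError("block size must be positive")
    # A write smaller than the replication block still costs a full block.
    return max(1.0, replication_block_bytes / fs_block_bytes)


print(write_amplification(512))   # prints: 8.0
print(write_amplification(4096))  # prints: 1.0
```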

> And use heartbeat for the health-checking monitoring in the pool of
> database servers.
> It's a feasible scenario?

Heartbeat is a complete cluster manager and not a monitoring tool.
It's usually used for automatically managing what needs to be done when
one node fails or a failed node later shows up again.

If you're considering DRBD replication (which is limited to two servers),
I'd rather recommend setting up a "typical" active/passive HA setup.

One node is active, while the other node only replicates any changes
from the master. In order to improve performance, I'd rather recommend
tuning the PG configuration. I'm no PG expert, so I can't give much advice
on this; however, I'm aware that many Linux distributions initially
set up PG in a very conservative, extremely slow default configuration
with much room for improvement in terms of performance.


Anders


sethcall at gmail

Jun 14, 2012, 10:04 AM

Post #5 of 17 (1018 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Great response, Anders. And I second that: by default, PG is configured
extremely conservatively.

There is a PG health check added to HAProxy as a patch in 1.4 (easy to add
yourself), or you can use 1.5beta.

http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=38b4156a691557f4eda30445f0ef8ce61f280dfc

If you look at the diff of checks.c, you can 'see' what they are doing (you
have to understand a little more context of checks.c to really grok it, I
think). I can tell it's somehow parsing data coming back from postgres, but
I can't tell where or how. I'd guess that when HAProxy does a TCP connect to
postgres, you can add a check like this to actually look at the response
data and see if it looks good.

Anyway, whether or not haproxy is suitable for you, maybe the
code in this patch would help you script something better.

Or, you could see how you like this script:
http://bucardo.org/wiki/Check_postgres



sayupirapo at gmail

Jun 16, 2012, 11:18 PM

Post #6 of 17 (992 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Hi,
Thank you for your posts.

I am reading the PostgreSQL links you suggested. They have helped me
understand more.

I wanted to build a master-master cluster, and I found that the PGCluster
tools allow for this environment. But the more I read about this topic, the
more it seems the performance could be worse than that of a single server.

I need to build a load-balanced database cluster, and for that I am
looking at master-master servers or read/write servers.

> From what you're writing, I do assume the following:
> -you'll be installing PG on two nodes "behind" a LVS load balancer.

Yes.

> -DRBD is used to give all nodes a shared block storage.
> -A shared OCFS2 file system will be run on top of the DRBD volumes.

Yes, but now I believe that's not the best choice for master-master
replication.

I have changed my mind: now I prefer to build a master/hot standby
(read/write node plus read-only node) setup for the load-balanced cluster.
Is it possible to send requests to both nodes to balance the load?

> If you're considering DRBD-replication (which is limited to two servers),
> I'd rather recommend to setup a "typical" active/passive HA setup.

With active/passive HA, I can't offer load balancing.
I have read about the DRBD three-node setup.
I am also thinking of a NAS shared disk, but apparently that isn't
compatible with OCFS or with GFS.
Which is the best choice for sharing data between database servers while
offering load balancing?

I have heard of HAProxy, which also offers load balancing. If using HAProxy
is equivalent to using LVS, I must choose between the two. At the moment I
don't know HAProxy.


--
Sayuri Komatsu


david at davidcoulson

Jun 17, 2012, 4:49 AM

Post #7 of 17 (997 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On 6/17/12 2:18 AM, YesGood wrote:
> Changed, now I prefer build a master-hot standby (read/write node-read
> only node ) server for the load balancing cluster. It's possible send
> request for both nodes for build load balance?
No, you only want to send connections to the read/write server,
otherwise clients can't update data.
> With active/passive HA, I don't offer load balancing.
> I see about DRBD three-node setup.
> And also I thinking of the NAS shared-disk, but apparently that isn't
> compatible with the ocfs, neither with gfs.
> Which is the best choice for the shared data between databases server, and
> offer load balancing?
What are you trying to accomplish? Do you really need load balancing
between two boxes, or would an active/standby database be enough? You
say you want it for performance, but when a system dies you will need to
be able to support your entire workload on one box.
>
> I listen of the HAProxy, that offer too load balancing. If I use HAProxy is
> equal to use LVS, and so I must choice between both. In this moment I don't
> know HAProxy.
>
There are a dozen ways to get requests from a client to multiple boxes.
You have a database problem, not a load balancer problem.




david at davidcoulson

Jun 17, 2012, 4:14 PM

Post #8 of 17 (991 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On 6/17/12 7:04 PM, YesGood wrote:

>
> says:
> one that can accept connections and serves read-only queries is called
> a /hot standby/ server.
That would require logic in the app to send read-only queries to the
correct system.
>
> You say you want it for performance, but when a system dies you
> will need to be able to support your entire workload on one box.
>
>
> Please, Can you explain me more ?
Well, if you have 2 systems, and one fails, then you presumably need to
be able to run everything you need on one box (or N-1 boxes, where N is
the total number of systems you have).

In general, running active/active is way more complicated to build,
support and troubleshoot than active/backup. Most likely, you think you
need load balancing, but you really don't. Or, if you do need it to support
your workload, then you're going to have a bad day when you lose a system.
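
The N-1 argument as a small Python check. The function and the load figures
are hypothetical; "capacity" is whatever unit you measure your workload in
(queries per second, for example).

```python
def survives_one_failure(total_load: float, nodes: int,
                         capacity_per_node: float) -> bool:
    """True if the remaining N-1 nodes can carry the whole workload."""
    if nodes < 2:
        return False  # a single node has nothing to fail over to
    return total_load <= (nodes - 1) * capacity_per_node


# Two boxes that can each handle 100 units of load:
print(survives_one_failure(100, 2, 100))  # prints: True
print(survives_one_failure(150, 2, 100))  # prints: False
```

In the first case one box was already enough, so load balancing wasn't
needed for performance; in the second case the cluster only works while
both boxes are up.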

David




graeme at graemef

Jun 18, 2012, 4:58 AM

Post #9 of 17 (990 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On Sun, 2012-06-17 at 07:49 -0400, David Coulson wrote:
> There are a dozen ways to get requests from a client to multiple boxes.
> you have a database problem, not a load balancer problem.

I agree with this.

The biggest question is: *why* do you want to load-balance connections
to a database?

Graeme




sayupirapo at gmail

Jun 18, 2012, 7:06 AM

Post #10 of 17 (991 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

---------- Forwarded message ----------
From: YesGood <sayupirapo [at] gmail>
Date: 2012/6/17
Subject: Re: [lvs-users] LVS + Database
To: David Coulson <david [at] davidcoulson>


Hi,
Thank you, David, for your response.

I need load balancing for a database cluster formed of 2 or more
database servers.

Several clients can send query requests (reads and writes) to the
database servers, which are the nodes of the cluster.
With load balancing, a request can then be answered by any available
server that is not busy with another request at that moment.

I read about master-master servers, which can offer load balancing, but
they are said to be worse than a single server because of the locking
needed for writes. So I searched for a better solution, which is why I
mentioned master/hot standby.

This page:
http://www.postgresql.org/docs/current/interactive/high-availability.html

says:
one that can accept connections and serves read-only queries is called a *hot
standby* server.

> You say you want it for performance, but when a system dies you will need
> to be able to support your entire workload on one box.

Could you please explain this in more detail?

--
Sayuri Komatsu


sayupirapo at gmail

Jun 18, 2012, 7:07 AM

Post #11 of 17 (991 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

---------- Forwarded message ----------
From: YesGood <sayupirapo [at] gmail>
Date: 2012/6/17
Subject: Re: [lvs-users] LVS + Database
To: David Coulson <david [at] davidcoulson>


Thank you again.

> That would require logic in the app to send read-only queries to the
> correct system.

I don't know about logic in the app, but is it possible to build a
load-balanced master/hot standby system using logic in the database? If
yes, I will study this.

> Well, if you have 2 systems, and one fails, then you presumably need to be
> able to run everything you need on one box (or N-1 boxes, where N is the
> total number of systems you have).

Does your example mean the following? If one system (database server)
fails, the load balancer removes the failed server and keeps offering the
database service with the remaining nodes of the cluster.
Changes made on the surviving active system are applied to the failed
system before it offers service again. I haven't decided what to use
for storage yet, but in an environment using a shared disk this is not
needed, right?

> Most likely, you think you need load balancing, but you really don't.

Why?

> Or, if you do to support your workload, then you're going to be having a
> bad day when you lose a system.

But that would be less of a problem if it's possible to use master/hot
standby for load balancing.
What do you suggest to improve a database server that receives many
requests from different clients?

--
Sayuri Komatsu


jmack at wm7d

Jun 18, 2012, 7:13 AM

Post #12 of 17 (988 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On Mon, 18 Jun 2012, YesGood wrote:

>> Most likely, you think you need load balancing, but you really don't.
>>
>
> Why?

The problems with load balancing a database server were laid
out in the HOWTO about 10 years ago. The short answer is that
you can load balance reads, but you can't load balance
writes. If you want to load balance writes, the database has
to do it.

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!



sayupirapo at gmail

Jun 18, 2012, 3:00 PM

Post #13 of 17 (980 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Hi everybody

Building a load-balanced database cluster is, at this moment, an
investigation.
The reasons why are:

- I have web servers in a load-balanced environment, and
- I am looking for another system that can be load balanced.
- I chose the database because it would be interesting in an environment
  with many clients,
- and so offer the best service.

In short, from all the responses I received:

- Building a load-balanced database system is possible.
- The types are:
  - master-master (both accept writes), which is the most difficult and
    complicated. It can be worse than a single server.
  - master-standby (read/write and read-only): only the master can
    receive write requests, and the other server is a standby that only
    receives read requests. But this needs logic in the app. What is that?
- But building a load-balanced system for a database isn't the best choice,
  because it could offer outdated data. Right?
- Now, are there any references or a bibliography that discuss this?

Thank you.
--
Sayuri Komatsu


enno+lvs at groeper-berlin

Jun 20, 2012, 2:03 AM

Post #14 of 17 (935 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Hi,

On 19.06.2012 00:00, YesGood wrote:
> Hi everybody
>
> The build load balancer cluster of database, in this moment is an
> investigation.
> the response of why are:
>
> - I have web server in a load balancer environment, and
> - I search other system, which can offer load balancing.
> - I choice database, because it's would be interesting, in a much client
> environment
> - and so offer best service.
Using a load-balanced cluster doesn't automatically mean offering "best
service". Such a cluster introduces a huge amount of complexity,
especially in a database environment. Load balancing web servers is
usually no problem.
To run a load-balanced database cluster you should have a good reason
and some know-how. Such a good reason would be to _know_ that a single
server just can't manage the load.
If it is for high availability, you don't need to load balance. This is a
different usage scenario. For high availability you use things like
pacemaker or heartbeat.

>
> In short of all response receive
>
> - build a load balance system of database is possible.
> - type are:
> - master-master (both offer write ), is most difficult and
> complicated. That can be worse than only server.
> - master-standby (write and only read), only one server the master
> can receive write request, and other server is standby and only receive
> read request. But need logic in app. What is it?
If I understand you correctly, you think there is a special app for that?
That is not the case. It simply means that the applications (PHP scripts or
whatever) running on your web servers need to have the logic implemented.
Most applications don't have that, and in that case you just shouldn't do
load balancing unless you really know what you are doing.

> - but build a load balance system for database, isn't best choice.
> Because could offer outdated data Right?
Not only that. You can really screw up your whole database, if you don't
know what you are doing.
The biggest problem is consistency, yes.

> - Now, are there some references, bibliography that speak about that?
About what? Load balancing database servers? The possibilities for that
usually depend on the database management system you are using (MySQL,
pgsql, Oracle, ...). It's a really specific problem.
First you should know what you really need and what you want to do.
Can't your current database server manage the load? Do you want high
availability to decrease the time the server is down?
If you do it wrong, you could even increase the downtime with high
availability due to the complexity these things introduce.
Is this all academic interest or do you need to have some production
system up and running?

HTH,
Enno


sayupirapo at gmail

Jun 20, 2012, 5:56 PM

Post #15 of 17 (934 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

Hi
Thank you, Enno Gröper

On Wed, 20 Jun 2012 Enno Gröper wrote:
> > - Now, are there some references, bibliography that speak about that?
> About what? Loadbalancing database servers? The possibilities for that
> usually depend on the database management system your are using (MySQL,
> pgsql, Oracle, ...). It's a really specific problem.
>

I mean references on load balancing in general, and on load balancing
databases in general.

> First you should know what you really need and what you want to do.
> Can't your current database server manage the load? Do you want high
> availability to decrease the time the server is down?
> If you do it wrong, you could even increase the downtime with high
> availability due to the complexity this things introduce.
> Is this all academic interest or do you need to have some production
> system up and running?
>

I really want both high availability and load balancing for the database
server. But I think that if I first build a load-balanced database, the load
balancing itself provides high availability, without implementing it
separately with pacemaker or heartbeat. Right?
It's an academic interest.

--
Sayuri Komatsu


sethcall at gmail

Jun 20, 2012, 6:43 PM

Post #16 of 17 (935 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

If you are set on PostgreSQL + load balancing, then use Bucardo to
implement a master-master PostgreSQL database.

Then, put HAProxy or LVS in front to load balance.

Then, be confused when your application makes a write to one master, and
then selects from another but the data hasn't propagated over.

Ok, fine, then instead implement master-master but do an active/standby.

Then ask yourself why you are using Bucardo just to have active/standby
(which can be done with master/slave + custom scripting or pgpool-II).

Ok, fine, wait until PostgreSQL 9.3 (mid 2013), which the pg team says
*might* have native master-master.

Ok, give up on all that. Just use master/slave, and instead of worrying
about load balancing, learn how to write good schemas and good queries, use
PostgreSQL 9.1 'unlogged tables' (for 10x faster write performance for
can-lose data), use table partitioning for large tables, use 'WITH
RECURSIVE' to optimize parent/child queries,... or forget optimizing...
just use PostgreSQL for your relational data but use something like redis
for data that doesn't need powerful SQL-level querying but has high
write/read characteristics.

But I think what everyone is trying to tell you, the path of most
resistance is doing what you say you want. Academic or not.






anders.henke at 1und1

Jun 25, 2012, 6:38 AM

Post #17 of 17 (888 views)
Permalink
Re: [lvs-users] LVS + Database [In reply to]

On 06/18/2012, YesGood wrote:
> The build load balancer cluster of database, in this moment is an
> investigation.
> the response of why are:
>
> - I have web server in a load balancer environment, and
> - I search other system, which can offer load balancing.
> - I choice database, because it's would be interesting, in a much client
> environment
> - and so offer best service.

The definition of "best service" may vary - what's yours?

> In short of all response receive
>
> - build a load balance system of database is possible.
> - type are:
> - master-master (both offer write ), is most difficult and
> complicated. That can be worse than only server.


It did take Oracle a few years to write RAC and 11g, so it shouldn't be that easy :-)


> - master-standby (write and only read), only one server the master
> can receive write request, and other server is standby and only receive
> read request. But need logic in app. What is it?

The application, which acts as a "client" to the database, needs to know
when it's fine to ask the "probably" outdated standby system. For
example, it may be fine to retrieve information for the product page of
an online shop from a standby system, which usually isn't off by more than
a few seconds.
However, when prices are being calculated at the paygate, your application
clearly wants accurate pricing and so asks the write master.
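
A minimal sketch of what such "logic in the app" could look like. The
Node class, the lag_seconds field and the max_staleness parameter are all
hypothetical names made up for illustration; a real application would have
to obtain the replication lag from its own DBMS.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    lag_seconds: float  # how far this node is behind the master (0.0 for the master)


def choose_node(master, standby, max_staleness):
    """Pick the node for one read.

    max_staleness is the replication lag (in seconds) this particular read
    can tolerate; 0 means the read needs current data.
    """
    if max_staleness > 0 and standby.lag_seconds <= max_staleness:
        return standby
    return master


# Hypothetical cluster state: the standby is 2.5 seconds behind.
master = Node("master", 0.0)
standby = Node("standby", 2.5)

product_page = choose_node(master, standby, max_staleness=10.0)
paygate = choose_node(master, standby, max_staleness=0.0)
print(product_page.name, paygate.name)
```

The point is that only the application knows the tolerance of each read;
nothing in the TCP connection tells a load balancer which case it is.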

This is clearly out of scope of what LVS can do. LVS (IPVS) balances at
OSI layer 4 and below. Your database queries happen at higher layers,
beyond the capabilities of LVS.

In terms of databases, the MySQL proxy tries to implement a high-level load balancer:
MySQL proxy is an application which looks like "the database server" to
the client. It analyzes incoming SQL queries and, according to one's own configuration,
may send some kinds of queries to one node, while other queries are sent
to a different node. However, the risk of doing it "wrong" (for the
application) is still fairly high at that level.
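
To illustrate why query-level routing can still go "wrong", here is a
deliberately naive sketch. It is not MySQL proxy's actual rule set; the
function name and the routing rule are assumptions made up for this example.

```python
def route_statement(sql):
    """Naive rule: anything starting with SELECT goes to the standby."""
    words = sql.split()
    if not words:
        return "master"
    return "standby" if words[0].upper() == "SELECT" else "master"


statements = [
    "SELECT name FROM products",
    "UPDATE products SET price = 5",
    # Starts with SELECT, but takes row locks as part of a write
    # transaction, so it really belongs on the master.
    "SELECT * FROM cart WHERE id = 1 FOR UPDATE",
]
for sql in statements:
    print(route_statement(sql), "<-", sql)
```

The third statement is sent to the standby by this rule, although the
application needs it on the master. A proxy can add more rules, but it
never knows the application's intent.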


> - but build a load balance system for database, isn't best choice.
> Because could offer outdated data Right?

Either poor performance or outdated data. Or a hefty price tag.
Maybe more than one of them.


Poor Performance:
If the master has to ensure that all data is replayed on the standby
system, any write request may only be marked as "done" when it has been
written on both servers. So in the best case (parallel IO, very rare for database
replication), the slower of the two servers defines the maximum performance.

Usually, your write requests in synchronous IO need to wait for the
master to complete, the master in turn asks the standby to complete
the write request, and when both have agreed on being "done", the master
reports the write request as "done" to the client application. So
basically, this setup adds up just about every possible latency.
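
The arithmetic can be sketched as follows. The function and the millisecond
figures are hypothetical and only show how the latencies combine; they are
not measurements.

```python
def sync_write_latency(master_write_ms, standby_write_ms, network_rtt_ms,
                       parallel=False):
    """Time until the client sees "done" for one synchronous write."""
    if parallel:
        # Best case: master writes locally while the standby write is in flight.
        return max(master_write_ms, network_rtt_ms + standby_write_ms)
    # Usual case: master writes, then asks the standby, then waits for its answer.
    return master_write_ms + network_rtt_ms + standby_write_ms


# Assumed figures: 5 ms per disk write, 1 ms network round trip.
print(sync_write_latency(5, 5, 1))                 # sequential
print(sync_write_latency(5, 5, 1, parallel=True))  # parallel
```

With these assumed numbers, a write that takes 5 ms on a single server
takes 11 ms in the usual sequential case and 6 ms in the parallel best case.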


Outdated data:
When the master reports something as "done" while it hasn't yet made
it to the standby system, you may describe this as an asynchronous
operation. While "usually" the time needed to process the write
requests on the standby system is very low and barely noticeable,
other situations may also occur. For example, the standby server
has to rebuild a broken disk in its local RAID, and so a lot of IO
performance goes to rebuilding the disk instead of the DBMS. Instead of
a few milliseconds, a write request may need at least a few seconds to proceed
from master to standby.

If the application doesn't know when it's fine to ask the standby system and when
it has to ask the master, this may also result in data loss:
1: application writes to master
2: application reads from standby before the changes have been replicated from master
3: application uses the data it read to compute changes, which are again
   written to master. Result: the changes from step 1 are lost.
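
The three steps above can be simulated with two dictionaries standing in
for the two servers. This is only a toy model: the "stock" counter and the
run_scenario function are made up for illustration, and replication is
simply assumed to happen too late.

```python
def run_scenario(read_from_master):
    master = {"stock": 10}
    standby = dict(master)  # standby starts in sync with the master

    # Step 1: the application writes to the master (3 items sold).
    master["stock"] = master["stock"] - 3

    # Step 2: the application reads again; replication has not run yet,
    # so the standby still holds the old value.
    source = master if read_from_master else standby
    seen = source["stock"]

    # Step 3: the application computes a change from what it read
    # (1 more item sold) and writes the result to the master.
    master["stock"] = seen - 1

    standby.update(master)  # replication finally catches up
    return master["stock"]


print(run_scenario(read_from_master=False))  # 9: the sale from step 1 is lost
print(run_scenario(read_from_master=True))   # 6: both sales are recorded
```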

Hefty price tag:
Clusters of in-memory databases don't need to rely that much on disk performance,
as data is being stored in RAM. Once Infiniband or Myrinet network
adapters are installed for the communication between cluster nodes, the network
latency decreases quite a lot (compared to Ethernet), so one may run
with synchronous data replication and still get high performance.

However, your 100G database will also require 100G of RAM - per node.
And in order to reduce network latency, you may need to install a
secondary network for replication of your database data, ideally using
low-latency network adapters like Infiniband or Myrinet, which are also
not as cheap as standard Ethernet.

> - Now, are there some references, bibliography that speak about that?

Hmmm. I consider this basic knowledge in computer science,
starting from simple physics, to the CAP theorem, to the fallacies of
distributed computing.

For example: in terms of storage speed and latency, the caches in a CPU are
usually fastest, then comes "usual" RAM, then local disk storage, and
after that the Ethernet network.

If an application needs to access networked remote disk storage, you
can simply add up two of the worst latencies and see that this won't
be as fast as just about any local disk storage.





Anders
--
1&1 Internet AG Expert Systems Architect (IT Operations)
Brauerstrasse 50 v://49.721.91374.0
D-76135 Karlsruhe f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich,
Robert Hoffmann, Andreas Hofmann, Markus Huhn, Hans-Henning Kettler,
Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren


