
Mailing List Archive: DRBD: Users

DRBD Sync stalls

 

 



DBOYN at POSTPATH

Jul 14, 2009, 4:39 PM

Post #1 of 7 (1469 views)
DRBD Sync stalls

Hi,
Does anybody have an idea how I should troubleshoot the following problem?

My peers were connecting and synchronizing without any problem until the network guys introduced a firewall between the peers
(this is all in my lab, but I wanted to emulate remote site replication).
As you can see, the peers still connect OK and synchronization starts, but within a few seconds it stalls and remains in this state forever.

What is even worse is that this introduces some kind of kernel problem, as over time the machine becomes inaccessible through ssh.
Ping still works, but no telnet connection can be established to any of the active service ports.

Please help with any troubleshooting suggestions or ideas!

Thanks!


"
Every 2.0s: cat /proc/drbd                              Tue Jul 14 16:30:00 2009

version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by buildsvn [at] c5-x8664-buil, 2008-10-03 10:12:56
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:178836 nr:0 dw:745036 dr:2388809 al:5441 bm:5364 lo:1 pe:2056 ua:251 ap:2049
        [>....................] sync'ed:  0.1% (183823/183825)M
        stalled
        resync: used:1/61 hits:721 misses:4 starving:0 dirty:0 changed:4
        act_log: used:4/127 hits:180818 misses:5996 starving:0 dirty:555 changed:5441

10: cs:Unconfigured
"
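
For anyone reading the status output above, here is a minimal Python sketch (not part of the original post) that pulls the per-device counters out of /proc/drbd text in the 8.0-style format shown. In the DRBD 8 documentation, pe counts requests sent to the peer that are still pending, ua counts requests from the peer not yet acknowledged, and ap counts application requests not yet answered. The function names and the pending_threshold value are illustrative assumptions, not DRBD defaults.

```python
import re

# Sample copied from the /proc/drbd output in the post above.
SAMPLE = """\
version: 8.0.13 (api:86/proto:86)
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:178836 nr:0 dw:745036 dr:2388809 al:5441 bm:5364 lo:1 pe:2056 ua:251 ap:2049
        [>....................] sync'ed:  0.1% (183823/183825)M
        stalled
10: cs:Unconfigured
"""


def parse_drbd_status(text):
    """Return {minor: {field: value}} for each device block in /proc/drbd text."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*(\d+):\s+cs:(\S+)", line)
        if header:
            # A new device block starts, e.g. " 0: cs:SyncSource ...".
            current = {"cs": header.group(2), "stalled": False}
            devices[int(header.group(1))] = current
            continue
        if current is None:
            continue  # still in the version/GIT-hash preamble
        if line.strip() == "stalled":
            current["stalled"] = True
        # Two-letter counters such as "ns:178836" or "pe:2056".
        for key, value in re.findall(r"\b([a-z]{2}):(\d+)\b", line):
            current[key] = int(value)
    return devices


def looks_stalled(dev, pending_threshold=1000):
    """Heuristic only: the threshold is an arbitrary assumption."""
    syncing = dev.get("cs", "").startswith("Sync")
    return dev.get("stalled", False) or (
        syncing and dev.get("pe", 0) > pending_threshold
    )


if __name__ == "__main__":
    for minor, dev in sorted(parse_drbd_status(SAMPLE).items()):
        print(minor, dev.get("cs"), "stalled" if looks_stalled(dev) else "ok")
```

On a live system the text would come from reading /proc/drbd instead of SAMPLE. A large pe value that never drains, as in the output above, suggests the sync source is waiting on replies from the peer, though that alone does not say why.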

Boyn, Dimitar G.

Technical Marketing Engineer

dboyn [at] postpath



3979 Freedom Circle, Santa Clara, CA 95054 <http://maps.yahoo.com/py/maps.py?Pyt=Tmap&addr=1200+Villa+Street%2C+Suite+150&csz=Mountain+View%2C+CA+94041-1106&country=us>

ICQ# 57-539-824

IM: Y! niobd


adam.taylor at wml

Jul 15, 2009, 12:29 PM

Post #2 of 7 (1378 views)
Re: DRBD Sync stalls [In reply to]

Hi,

I have run into a similar problem with Symantec Backup Agent running on the
same network and, for some reason, killing the TCP/IP stack. More than likely
this issue is related to the network. Any chance of sharing your
configs?

Cheers,

___________________________________________________


Adam Taylor | Engineer | WML Software
Unit 3c | 14-22 Triton Drive | Albany | Auckland

P. +64 9 477 4555 | F. +64 9 478 6926
DDI. +64 9 477 6375 | MOB. +64 21 621 519
E. <mailto:adam.taylor [at] wml> adam.taylor [at] wml
W. <http://www.wml.co.nz/> www.wml.co.nz |
<http://www.compose.co.nz/> www.compose.co.nz


WML Software




timkom at gmail

Jul 15, 2009, 3:31 PM

Post #3 of 7 (1379 views)
Re: DRBD Sync stalls [In reply to]

This behaviour looks like what happens when the DRBD resource port is closed
on one node. It must be open on both nodes.

Tino
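
One way to act on the suggestion above is a plain TCP connect test, run from each node against its peer. The sketch below is an illustration, not something posted in the thread; the host names and the port number in the usage comment are placeholders, and the real port is whatever the resource's address line in drbd.conf specifies.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage (peer names and port 7788 are assumptions):
#   on node A:  port_is_reachable("node-b.example", 7788)
#   on node B:  port_is_reachable("node-a.example", 7788)
```

A successful connect only shows that the TCP handshake gets through. In this thread the peers do connect and the stall comes later, so a passing check in both directions would not rule out the firewall interfering with the bulk transfer.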


dboyn at postpath

Aug 3, 2009, 9:25 PM

Post #4 of 7 (1264 views)
Re: DRBD Sync stalls [In reply to]

Hi, again!
Hoping that the new DRBD 8.3 might address my issue, I did the upgrade.
:-( It didn't help.
As I did not receive any answers to my original post 2 weeks ago, I guess my problem is unique?
What would be the best way to troubleshoot? Is there any way to troubleshoot?

Thanks as always!
./Dimitar Boyn


lars.ellenberg at linbit

Aug 4, 2009, 2:05 AM

Post #5 of 7 (1287 views)
Re: DRBD Sync stalls [In reply to]

On Mon, Aug 03, 2009 at 09:25:09PM -0700, Димитър Бойн wrote:
> Hi, again!
> So hoping that the new DRBD 8.3 might address my issue I did the upgrade.
> :-( it didn't help.
> As I did not receive any answers to my original post 2 weeks ago I guess my problem is unique?

There have been two answers.

> What would be the best way to troubleshoot? Any way to troubleshoot?

have a look at my sig ;)

--
: Lars Ellenberg
: LINBIT HA-Solutions GmbH
: DRBD®/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


groen692 at grosc

Aug 4, 2009, 3:19 AM

Post #6 of 7 (1258 views)
Re: DRBD Sync stalls [In reply to]

Hi all,

I've seen this behaviour as well.
1) DRBD is doing a full sync
2) after a period of time the sync stalls
3) a little while later the whole machine freezes (only the
source machine)

My setup:
openSUSE 11.1
drbd 8.3.1 (the same happened with 8.2)
Both machines were connected to the same 1 gig switch (a simple 1 gig
switch for staging purposes).

My impression was that it was HW related, since I was installing three
clusters and only one of them was showing this behaviour. The machine
involved was a Fujitsu Siemens TX200 server.

mfg,

jeroen


dboyn at postpath

Aug 7, 2009, 3:36 PM

Post #7 of 7 (1232 views)
Re: DRBD Sync stalls [In reply to]

Thanks for the hints!
I have another DRBD pair configured with the same hardware, cloned OS images etc., which syncs just fine…
All server nodes are connected to the same switch :-(

Any ideas?

Here is the config:
cat /etc/sysctl.conf | grep net

net.ipv4.ip_forward=0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.lo.accept_redirects = 0
net.ipv4.conf.eth0.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.lo.accept_redirects = 0
net.ipv4.conf.eth0.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_syncookies = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.tcp_max_syn_backlog = 1280
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.ip_local_port_range = 16384 65536
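
A listing like the one above is easier to review mechanically. The following Python sketch is an illustration added for readers, not something posted in the thread: it reports keys that are set more than once, TCP options that are switched off, and a local port range whose upper bound exceeds 65535, the highest valid TCP port. The function names and the choice of which keys to flag are assumptions.

```python
from collections import Counter

# TCP options worth a second look when set to 0 (an illustrative selection).
TCP_OPTIONS = (
    "net.ipv4.tcp_window_scaling",
    "net.ipv4.tcp_sack",
    "net.ipv4.tcp_timestamps",
)


def parse_sysctl(lines):
    """Return a list of (key, value) pairs from sysctl.conf-style lines."""
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue  # skip blanks, comments, and anything that is not key=value
        key, value = line.split("=", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs


def audit(pairs):
    """Return a list of human-readable findings for the parsed settings."""
    findings = []
    counts = Counter(key for key, _ in pairs)
    for key, n in sorted(counts.items()):
        if n > 1:
            findings.append(f"{key}: set {n} times")
    settings = dict(pairs)  # for repeated keys the last assignment is kept
    for key in TCP_OPTIONS:
        if settings.get(key) == "0":
            findings.append(f"{key} is disabled")
    port_range = settings.get("net.ipv4.ip_local_port_range")
    if port_range:
        bounds = [int(p) for p in port_range.split()]
        if bounds and max(bounds) > 65535:
            findings.append(
                f"net.ipv4.ip_local_port_range upper bound {max(bounds)} exceeds 65535"
            )
    return findings


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as conf:
        for finding in audit(parse_sysctl(conf)):
            print(finding)
```

Run against the listing above, this would report the repeated accept_source_route, rp_filter and accept_redirects entries, the three disabled TCP options, and the 65536 upper bound. With window scaling off, a TCP window cannot exceed 65,535 bytes, which caps throughput at roughly window size divided by round-trip time on a high-latency link. The thread does not establish whether any of these settings is related to the stall.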

