
Mailing List Archive: Zope: Users

Frequent ZOPE crashes

 

 



andreas.krasa at wu-wien

Nov 25, 2009, 8:05 AM

Post #1 of 7 (1574 views)
Frequent ZOPE crashes

Hello Mailinglist,

we've been using ZOPE in combination with the Silva CMS for around four
years now to serve our University's homepage. And everything worked fine
so far.

A week ago we switched to a new layout (for corporate reasons) and now
we're experiencing frequent crashes of the Zope servers. Fortunately,
they reconnect to the ZODB on their own, but since this now happens
roughly every five minutes, I'm rather worried that it might
permanently damage the ZODB.

I have absolutely no idea how this can happen, as we're using the same
python, libxml2, libxslt and other module versions as with the old
homepage - in fact the new site even runs on the same hardware. We never
experienced any problems like these up until now.

As far as I understand, it takes some C module to actually make ZOPE
segfault, is that right?

Versions we're using:

Python 2.4.6
Zope 2.11.2
LibXML2 2.7.3
LibXSLT 1.1.24
Python-LDAP 2.3.6
Setuptools 0.6c9
and a Kerberos Module

plus the Silva CMS (2.1) on top.

We have four ZOPE servers, each running two ZEO processes and a separate
ZODB. The machines all run Red Hat Enterprise Linux 5.4. In front of them,
Apache, Squid and Pound take care of caching.

We examined the core dump files with gdb, but unfortunately this was not
very helpful: the crash happens either during garbage collection or
somewhere in ceval. So basically something trashes certain Python
objects some time before the crash.

Do you have *any* hints on how to track down this problem? Or are there
any known problems with the versions above? The changelogs didn't reveal
any plausible cause to me...

Kind regards,
Andreas Krasa
_______________________________________________
Zope maillist - Zope [at] zope
https://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
https://mail.zope.org/mailman/listinfo/zope-announce
https://mail.zope.org/mailman/listinfo/zope-dev )


lukesh at seznam

Nov 25, 2009, 8:37 AM

Post #2 of 7 (1479 views)
Re: Frequent ZOPE crashes

First, try to rule out an error outside Zope itself. Try installing
everything on a plain, brand-new (and reliable!) machine. Do not restore
from any backups!




andreas.krasa at wu-wien

Nov 29, 2009, 5:14 AM

Post #3 of 7 (1446 views)
Re: Frequent ZOPE crashes

On 25.11.09 17:37, Jaroslav Lukesh wrote:
> First, try to rule out an error outside Zope itself. Try installing
> everything on a plain, brand-new (and reliable!) machine. Do not
> restore from any backups!

Hi Jaroslav,

we're currently in the process of tracking down errors outside of ZOPE.

We have installed a completely new server from scratch with RHEL 5.4 and
re-installed Python 2.4.6 and the latest versions of libxml2 and
libxslt there. We double-checked the LD config and made sure that the
correct shared objects get loaded (via lsof).
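
The lsof check can also be scripted against a running process. The
following is only a minimal sketch, assuming Linux (it reads
/proc/<pid>/maps) and a current Python 3 rather than the Python 2.4
used on the servers; the function names and the default library
prefixes are made up for illustration and are not part of Zope or Silva.

```python
def loaded_libraries(maps_text, names=("libxml2", "libxslt")):
    """Return the sorted, de-duplicated paths of mapped files whose
    basename starts with one of the given (illustrative) prefixes."""
    found = set()
    for line in maps_text.splitlines():
        # A maps line that refers to a file has six fields:
        # address range, permissions, offset, device, inode, path.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no file behind it
        path = fields[5].strip()
        basename = path.rsplit("/", 1)[-1]
        if any(basename.startswith(name) for name in names):
            found.add(path)
    return sorted(found)


def loaded_libraries_for_pid(pid):
    """Read /proc/<pid>/maps (Linux only) and report matching libraries."""
    with open("/proc/%d/maps" % pid) as handle:
        return loaded_libraries(handle.read())
```

Calling loaded_libraries_for_pid() with the PID of each Zope process
shows at a glance whether an unexpected copy of a library (for example
the distribution's own one) is mapped next to the self-built one.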

We also reinstalled a few other modules that contain C code (such as
python-ldap), which we need for authentication.

Unfortunately that didn't really help much. We still experience crashes.

Are there any known issues with Zope 2.11.2, LibXML2 and/or LibXSLT that
could cause these problems?

The only thing we re-used is the Data.fs, which we have to, because
we're talking about a production system here.

Also note that we have used exactly the same setup for a long time now,
even on the same hardware, without any of these troubles. The problems
only started when we switched over to a new (and probably more
resource-intensive) layout.

We're unfortunately still not able to reproduce these crashes.

Kind regards,
Andreas


lukesh at seznam

Nov 29, 2009, 10:52 AM

Post #4 of 7 (1438 views)
Re: Frequent ZOPE crashes

Try caching page segments selectively; that should let you identify the
problematic piece of the page. If you do not use page segments, divide
the page into pieces and then reassemble them.
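
Dividing the page and testing the pieces amounts to a bisection. A
rough sketch of the idea, under the strong assumptions that exactly one
segment is responsible and that "does it crash with these segments
rendered live?" can be answered reliably (for example by running under
production load for a while). The segment names and the crashes_with
callback are hypothetical; nothing here is a Zope or Silva API.

```python
def find_culprit(segments, crashes_with):
    """Halve the list of page segments until one suspect remains.

    segments:     list of (hypothetical) segment names.
    crashes_with: caller-supplied function; it receives the segments
                  that are rendered live (all others served from cache)
                  and returns True if crashes were observed.
    """
    candidates = list(segments)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still produces the crash.
        candidates = first if crashes_with(first) else second
    return candidates[0] if candidates else None
```

With five segments this needs at most three test rounds, which matters
when each round means waiting for a crash that shows up only every few
minutes.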





tseaver at palladion

Nov 29, 2009, 12:57 PM

Post #5 of 7 (1434 views)
Re: Frequent ZOPE crashes

Andreas Krasa wrote:
> We're unfortunately still not able to reproduce these crashes.

Can you set 'ulimit -c' to get a core file? That might at least help
point to the extension which is to blame (although it may just show the
"downstream" victim of a heap munge).
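
If changing the shell's ulimit in the start script is inconvenient, the
same limit can be raised from inside a Python process through the
standard resource module (Unix only). This is just a sketch; the
function name is made up, and where it would be called during Zope
startup is left open.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit up to the hard limit.

    An unprivileged process may not exceed the hard limit, so this is
    the most it can do on its own. Returns the resulting (soft, hard).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

If the hard limit itself is 0, no core file will be written and the
limit has to be raised by the parent process or the system
configuration. On Linux, where the file ends up is controlled by the
kernel.core_pattern setting.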

What versions of libxml2 / libxslt are you using? How about lxml?


Tres.
--
===================================================================
Tres Seaver +1 540-429-0999 tseaver [at] palladion
Palladion Software "Excellence by Design" http://palladion.com


andreas.krasa at wu-wien

Nov 29, 2009, 11:59 PM

Post #6 of 7 (1420 views)
Re: Frequent ZOPE crashes

Hi Tres,

thank you very much for your reply!

On 29.11.09 21:57, Tres Seaver wrote:
> Can you set 'ulimit -c' to get a core file, which might at least help
> point to the extension which is to blame (although it may just show the
> "downstream" victim of a heap munge).
>
> What versions of libxml2 / libxslt are you using? How about lxml?

Yes, we did set the ulimit and were indeed able to get a core dump for
each crash (each between 300 and 700 MB). We tried to debug them using
"gdb", but unfortunately they only reveal two situations in which the
crashes occur:

1) During garbage collection where the gc tries to clean up damaged
python objects
2) During some "ceval" process, also related to accessing damaged python
objects

Unfortunately, that doesn't reveal what exactly trashes the objects. To
us it seems that this happens some time before either of the two code
paths mentioned above tries to access the objects and crashes ZOPE.

For now, we don't really see a reproducible pattern, as it seems to be a
somewhat more complex user behavior which leads to this. We were able to
extract a few URLs from the core dumps, but accessing those directly
does nothing. The last access logged in the Z2.log before the core dump
also triggers nothing when accessed directly.
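
Since the damage appears to be done some time before the crash, it may
be worth looking at every request in a window before each core dump
rather than only the last one. A minimal sketch, assuming the Z2.log
lines follow the Apache combined log format and that the crash time is
taken from the core file's timestamp; the regular expression, function
name and window length are illustrative only.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout: host ident user [timestamp] "METHOD path ..." status ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}|-) '
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def requests_before(log_lines, crash_time, window_seconds=60):
    """Return (timestamp, method, path) for every request logged within
    window_seconds before crash_time; unparsable lines are skipped."""
    start = crash_time - timedelta(seconds=window_seconds)
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        when = datetime.strptime(match.group("when"), TIME_FORMAT)
        if start <= when <= crash_time:
            hits.append((when, match.group("method"), match.group("path")))
    return hits
```

Comparing these windows across several crashes might show a request
that keeps turning up shortly beforehand. Replaying a whole window in
order, instead of a single URL, is closer to the "more complex user
behavior" suspected above.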

We're running ZOPE-2.11.2 with an eggified version of ZODB3-3.8.4 plus
libxml2-2.7.6, libxslt-1.1.26 and lxml-2.2.4 now, the crashes still
happen. Previously we've been running with ZOPE-2.11.2, libxml2-2.7.3,
libxslt-1.1.24 and lxml-2.1.5. That also crashed ZOPE occasionally.

This only happened since we switched to a new layout (probably in
combination with a few minor Silva updates).

We have been using the same system software (RHEL5), hardware, python
version and libxml2/libxslt/lxml versions with our old layout, where
everything worked fine for years.

I would be happy to paste any particular gdb outputs if that is of any
help...?

Kind regards,
Andreas


tseaver at palladion

Nov 30, 2009, 7:58 AM

Post #7 of 7 (1415 views)
Re: Frequent ZOPE crashes

Andreas Krasa wrote:
> We're running ZOPE-2.11.2 with an eggified version of ZODB3-3.8.4 plus
> libxml2-2.7.6, libxslt-1.1.26 and lxml-2.2.4 now, the crashes still
> happen. Previously we've been running with ZOPE-2.11.2, libxml2-2.7.3,
> libxslt-1.1.24 and lxml-2.1.5. That also crashed ZOPE occasionally.

Does your application ever use the libxml2 / libxslt Python bindings
directly? If so, I would go over that part of your app with a
microscope: it is incredibly easy to trigger segfaults from those
bindings. If not, then I would look for help on the lxml mailing list.

> This only happened since we switched to a new layout (probably in
> combination with a few minor Silva updates).

By "new layout", do you mean a new site theme? If so, how do lxml /
libxml2 / libxslt interact with your application to generate the theme?
What is structurally different about the new theme?

> We have been using the same system software (RHEL5), hardware, python
> version and libxml2/libxslt/lxml versions with our old layout, where
> everything worked fine for years.
>
> I would be happy to paste any particular gdb outputs if that is of any
> help...?

I'm afraid that won't help: the GC segfaults indicate somebody is
munging the heap way before the segfault is triggered.



Tres.
