Mailing List Archive: Apache: Dev

Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile


DRuggeri at primary

Jan 18, 2012, 8:40 AM

Post #1 of 10 (485 views)
Permalink
Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile

All;
I stumbled across this yesterday and was hoping some of our more
experienced openssl developers may be able to offer suggestions on how I
can track this down. I've been testing on 2.2.21 though the code should
be the same in trunk/2.4. The patch I've applied is currently proposed
for backport in 2.2 (and works fine until using an openssl engine).

Patch applied to 2.2.21 distribution - trunk already has this:
http://people.apache.org/~druggeri/patches/httpd-2.2-SSLProxyMachineCertificateChainFile.patch

When the new SSLProxyMachineCertificateChainFile directive is set at the
same time SSLCryptoDevice is set, a segfault occurs during
ssl_hook_pre_config while calling SSL_load_error_strings. The backtrace
I gathered with dbx points to something deeper inside openssl, but I'm
sure I've done something to cause it.

t@1 (l@1) signal SEGV (no mapping at the fault address) in err_cmp at
0xffffffff7ab05540
0xffffffff7ab05540: err_cmp : ld [%o0 + 4], %o3
Current function is ssl_hook_pre_config (optimized)
280 SSL_load_error_strings();
(dbx) where
current thread: t@1
[1] err_cmp(0xffffffff7ae542a8, 0xffffffff7fff9470, 0x22cd,
0x100251f30, 0xac, 0xab), at 0xffffffff7ab05540
[2] lh_retrieve(0x10023aa80, 0xffffffff7fff9470, 0x14064057, 0x57,
0x10024edc8, 0xffffffff7ab05540), at 0xffffffff7ab034bc
[3] int_err_get_item(0xffffffff7fff9470, 0xffffffff7acb4528, 0x14520,
0xffffffff7aca0008, 0x19b904, 0x14400), at 0xffffffff7ab0476c
[4] ERR_func_error_string(0x64, 0xffffffff7acbdf48, 0x14520,
0xffffffff7acbdf48, 0xffffffff7acb4528, 0x14400), at 0xffffffff7ab053d0
[5] ERR_load_SSL_strings(0x0, 0xffffffff77e542a8, 0xffffffff77e4f0d0,
0x51d8, 0x105df4, 0x5000), at 0xffffffff77d492f8
=>[6] ssl_hook_pre_config(pconf = ???, plog = ???, ptemp = ???)
(optimized), at 0xffffffff77f08f04 (line ~280) in "mod_ssl.c"
[7] ap_run_pre_config(pconf = ???, plog = ???, ptemp = ???)
(optimized), at 0x10004cfe4 (line ~85) in "config.c"
[8] main(argc = ???, argv = ???) (optimized), at 0x100031954 (line
~709) in "main.c"
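One way to read the top frame, offered as an interpretation rather than a diagnosis: the faulting instruction loads from the address held in the first argument, so the comparator appears to have been handed a stored table entry whose memory is no longer mapped. A minimal sketch of that kind of comparator and lookup follows. The names toy_err_entry, toy_err_cmp and toy_lookup are hypothetical and are not OpenSSL's, and the linear scan only stands in for a real hash lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an error-string table entry. */
typedef struct {
    unsigned long error;   /* packed error code used as the lookup key */
    const char   *string;  /* human-readable text for that code */
} toy_err_entry;

/* Comparator in the style a hash table would call: 0 means "same key". */
int toy_err_cmp(const void *a_void, const void *b_void)
{
    const toy_err_entry *a = (const toy_err_entry *)a_void; /* stored entry */
    const toy_err_entry *b = (const toy_err_entry *)b_void; /* caller's key */
    assert(a != NULL && b != NULL);
    return (a->error > b->error) - (a->error < b->error);
}

/* Linear stand-in for the hash lookup: every stored pointer that is
 * visited gets dereferenced by the comparator. */
const toy_err_entry *toy_lookup(const toy_err_entry *const *table,
                                size_t n, unsigned long code)
{
    toy_err_entry key = { code, NULL };
    for (size_t i = 0; i < n; i++) {
        if (toy_err_cmp(table[i], &key) == 0)
            return table[i];
    }
    return NULL;
}
```

If any pointer in the table referred to memory that had since been unmapped, the read of a->error is where a fault like the one above would occur.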

For reference, removing one directive or the other avoids the segfault.
This seems to be brought on by the combination of the two (and possibly
the engine implementation).

Any ideas?

--
Daniel Ruggeri


sctemme at apache

Jan 18, 2012, 11:13 PM

Post #2 of 10 (461 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On Jan 18, 2012, at 8:40 AM, Daniel Ruggeri wrote:

> All;
> I stumbled across this yesterday and was hoping some of our more
> experienced openssl developers may be able to offer suggestions on how I
> can track this down. I've been testing on 2.2.21 though the code should
> be the same in trunk/2.4. The patch I've applied is currently proposed
> for backport in 2.2 (and works fine until using an openssl engine).
>
> Patch applied to 2.2.21 distribution - trunk already has this:
> http://people.apache.org/~druggeri/patches/httpd-2.2-SSLProxyMachineCertificateChainFile.patch
>
> When the new SSLProxyMachineCertificateChainFile directive is set at the
> same time SSLCryptoDevice is set, a segfault occurs during
> ssl_hook_pre_config while calling SSL_load_error_strings. The backtrace
> I gathered with dbx points to something deeper inside openssl, but I'm
> sure I've done something to cause it.

Interesting... which version of OpenSSL? Must be 0.9.7 or 0.9.8, because err_cmp() disappeared after that. And the signature doesn't match what we're seeing in the backtrace.

And which platform? Solaris? SPARC or x86_64?

> t@1 (l@1) signal SEGV (no mapping at the fault address) in err_cmp at
> 0xffffffff7ab05540
> 0xffffffff7ab05540: err_cmp : ld [%o0 + 4], %o3
> Current function is ssl_hook_pre_config (optimized)
> 280 SSL_load_error_strings();
> (dbx) where
> current thread: t@1
> [1] err_cmp(0xffffffff7ae542a8, 0xffffffff7fff9470, 0x22cd,
> 0x100251f30, 0xac, 0xab), at 0xffffffff7ab05540
> [2] lh_retrieve(0x10023aa80, 0xffffffff7fff9470, 0x14064057, 0x57,
> 0x10024edc8, 0xffffffff7ab05540), at 0xffffffff7ab034bc
> [3] int_err_get_item(0xffffffff7fff9470, 0xffffffff7acb4528, 0x14520,
> 0xffffffff7aca0008, 0x19b904, 0x14400), at 0xffffffff7ab0476c
> [4] ERR_func_error_string(0x64, 0xffffffff7acbdf48, 0x14520,
> 0xffffffff7acbdf48, 0xffffffff7acb4528, 0x14400), at 0xffffffff7ab053d0
> [5] ERR_load_SSL_strings(0x0, 0xffffffff77e542a8, 0xffffffff77e4f0d0,
> 0x51d8, 0x105df4, 0x5000), at 0xffffffff77d492f8
> =>[6] ssl_hook_pre_config(pconf = ???, plog = ???, ptemp = ???)
> (optimized), at 0xffffffff77f08f04 (line ~280) in "mod_ssl.c"
> [7] ap_run_pre_config(pconf = ???, plog = ???, ptemp = ???)
> (optimized), at 0x10004cfe4 (line ~85) in "config.c"
> [8] main(argc = ???, argv = ???) (optimized), at 0x100031954 (line
> ~709) in "main.c"
>
> For reference, removing one directive or the other avoids the segfault.
> This seems to be brought on by the combination of the two (and possibly
> the engine implementation).

So the combination of directives causes some memory to be overwritten that ends up pointing outside httpd's allocated address space. Does the order of the directives matter?

Which Engine if I may ask? A fix was applied to the CHIL Engine that removes a dangling cleanup function pointer which caused a segfault on startup on platforms that vary the address location in which libraries are loaded (RHEL 5 being a prime example). I don't remember off the top of my head which OpenSSL version got the fix.

Can you reproduce with a non-optimized, debug/symbols enabled build of OpenSSL and Apache? With the latest versions of each?

S.

> Any ideas?
>
> --
> Daniel Ruggeri
>


--
sctemme [at] apache http://www.temme.net/sander/
PGP FP: FC5A 6FC6 2E25 2DFD 8007 EE23 9BB8 63B0 F51B B88A

View my availability: http://tungle.me/sctemme


DRuggeri at primary

Jan 30, 2012, 3:43 PM

Post #3 of 10 (444 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

It's been hell lately - sorry for the sloooooow reply

On 1/19/2012 1:13 AM, Sander Temme wrote:
> Interesting... which version of OpenSSL? Must be 0.9.7 or 0.9.8, because err_cmp() disappeared after that. And the signature doesn't match what we're seeing in the backtrace.
>
> And which platform? Solaris? SPARC or x86_64?

I was building on Sparc - but I'll have to try with openssl 1.0.0.

>
>> ...
> So the combination of directives causes some memory to be overwritten that ends up pointing outside httpd's allocated address space. Does the order of the directives matter?
>
> Which Engine if I may ask? A fix was applied to the CHIL Engine that removes a dangling cleanup function pointer which caused a segfault on startup on platforms that vary the address location in which libraries are loaded (RHEL 5 being a prime example). I don't remember off the top of my head which OpenSSL version got the fix.
>
> Can you reproduce with a non-optimized, debug/symbols enabled build of OpenSSL and Apache? With the latest versions of each?
>
> S.
>

I'll try messing with the order and will let you know how I get on - the
chil engine is the one in use but this is a fairly recent openssl
(0.9.8r). I didn't explicitly enable optimization of either build but
did explicitly add "-g" which seemed to create a build of httpd with
debug symbols but a regular old build of openssl. I have some other
platforms available (RHEL being one of them) and will try soon to see
what I get there.

--
Daniel Ruggeri


shenson at opensslfoundation

Jan 30, 2012, 4:49 PM

Post #4 of 10 (448 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 30/01/2012 23:43, Daniel Ruggeri wrote:
> It's been hell lately - sorry for the sloooooow reply
>
> On 1/19/2012 1:13 AM, Sander Temme wrote:
>> Interesting... which version of OpenSSL? Must be 0.9.7 or 0.9.8, because err_cmp() disappeared after that. And the signature doesn't match what we're seeing in the backtrace.
>>
>> And which platform? Solaris? SPARC or x86_64?
>
> I was building on Sparc - but I'll have to try with openssl 1.0.0.
>
>>
>>> ...
>> So the combination of directives causes some memory to be overwritten that ends up pointing outside httpd's allocated address space. Does the order of the directives matter?
>>
>> Which Engine if I may ask? A fix was applied to the CHIL Engine that removes a dangling cleanup function pointer which caused a segfault on startup on platforms that vary the address location in which libraries are loaded (RHEL 5 being a prime example). I don't remember off the top of my head which OpenSSL version got the fix.
>>
>> Can you reproduce with a non-optimized, debug/symbols enabled build of OpenSSL and Apache? With the latest versions of each?
>>
>> S.
>>
>
> I'll try messing with the order and will let you know how I get on - the
> chil engine is the one in use but this is a fairly recent openssl
> (0.9.8r). I didn't explicitly enable optimization of either build but
> did explicitly add "-g" which seemed to create a build of httpd with
> debug symbols but a regular old build of openssl. I have some other
> platforms available (RHEL being one of them) and will try soon to see
> what I get there.
>

The fix is in 0.9.8r; the relevant patch is here:

http://cvs.openssl.org/chngview?cn=19659

Steve.
--
Dr Stephen Henson. OpenSSL Software Foundation, Inc.
1829 Mount Ephraim Road
Adamstown, MD 21710
+1 877-673-6775
shenson [at] opensslfoundation


DRuggeri at primary

Feb 2, 2012, 11:02 AM

Post #5 of 10 (444 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 1/19/2012 1:13 AM, Sander Temme wrote:
> Interesting... which version of OpenSSL? Must be 0.9.7 or 0.9.8, because err_cmp() disappeared after that. And the signature doesn't match what we're seeing in the backtrace.
>
> And which platform? Solaris? SPARC or x86_64?

I tried building against 1.0.0g and get a new error (with or without the
new SSLProxyMachineCertificateChainFile directive).
[Thu Feb 02 12:48:48 2012] [error] Init: Failed to enable Crypto Device
API `chil'
[Thu Feb 02 12:48:48 2012] [error] SSL Library Error: 2164682865
error:81067071:CHIL engine:HWCRHK_INIT:unit failure
[Thu Feb 02 12:48:48 2012] [error] SSL Library Error: 638287981
error:260B806D:engine routines:ENGINE_TABLE_REGISTER:init failed
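For readers decoding these log lines: the decimal "SSL Library Error" number and the hex value after "error:" are the same packed code. Below is a minimal sketch, assuming the OpenSSL 0.9.8/1.0.x layout (8-bit library, 12-bit function, 12-bit reason) that the ERR_GET_LIB, ERR_GET_FUNC and ERR_GET_REASON macros unpack. The helper names are hypothetical and not part of OpenSSL.

```c
#include <assert.h>
#include <stdio.h>

/* Pack three fields into one code: 8-bit library, 12-bit function,
 * 12-bit reason (the layout assumed in the lead-in). */
unsigned long err_pack(unsigned lib, unsigned func, unsigned reason)
{
    assert(lib <= 0xff && func <= 0xfff && reason <= 0xfff);
    return ((unsigned long)lib << 24) |
           ((unsigned long)func << 12) |
           (unsigned long)reason;
}

/* Unpack the individual fields again. */
unsigned err_lib(unsigned long e)    { return (unsigned)((e >> 24) & 0xff); }
unsigned err_func(unsigned long e)   { return (unsigned)((e >> 12) & 0xfff); }
unsigned err_reason(unsigned long e) { return (unsigned)(e & 0xfff); }

/* Format a code as hex plus its three fields into buf. */
void err_describe(unsigned long e, char *buf, size_t len)
{
    snprintf(buf, len, "%08lX lib=0x%02X func=0x%03X reason=0x%03X",
             e, err_lib(e), err_func(e), err_reason(e));
}
```

With these helpers, 2164682865 unpacks to library 0x81, function 0x67, reason 0x71, and 638287981 unpacks to library 0x26, function 0xB8, reason 0x6D, matching the hex strings in the log.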

Since this happens with every attempt to start, I suspect it has nothing
to do with the new directive and more to do with something I did on the
openssl build. Attempts to run this guy in dbx bomb out with SIGILL here:
[1] _sparcv9_fmadd_probe(0x0, 0x1, 0x0, 0x82, 0xffffffff7ec00200,
0x6), at 0xffffffff7a96353c
[2] OPENSSL_cpuid_setup(0x0, 0x0, 0x0, 0x1000, 0x0, 0x0), at
0xffffffff7a9631ec

If I force the ENV var OPENSSL_sparcv9cap=3, then things seem to chug
along and fail at the same point in the logs as above. Not sure where to
go from here or if it's even worth pursuing. I'm guessing openssl 1.0.0
isn't going to play nice in my environment.

> So the combination of directives causes some memory to be overwritten
> that ends up pointing outside httpd's allocated address space. Does
> the order of the directives matter? Which Engine if I may ask? A fix
> was applied to the CHIL Engine that removes a dangling cleanup
> function pointer which caused a segfault on startup on platforms that
> vary the address location in which libraries are loaded (RHEL 5 being
> a prime example). I don't remember off the top of my head which
> OpenSSL version got the fix. Can you reproduce with a non-optimized,
> debug/symbols enabled build of OpenSSL and Apache? With the latest
> versions of each? S.

After some testing, order doesn't matter - same results regardless. I
have confirmed that the chil update (noted by you and Steve) is applied
to this version of openssl (looks like it was formally incorporated into
the one I'm using). Unfortunately, openssl doesn't have a
debug-solaris64-sparcv9-cc build option so I can't produce a debug
version of openssl. I did remove optimizations from httpd, though... I'm
going to try with 0.9.8t and see if anything is different.

--
Daniel Ruggeri


DRuggeri at primary

Feb 3, 2012, 9:45 AM

Post #6 of 10 (429 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 2/2/2012 1:02 PM, Daniel Ruggeri wrote:
> Since this happens with every attempt to start, I suspect it has nothing
> to do with the new directive and more to do with something I did on the
> openssl build.

I was, indeed, doing something stupid. A build with openssl 1.0.0g
replicates the behavior of 0.9.8g in that it fails when
SSLProxyMachineCertificateChainFile is enabled. The annoying part is
that (due to the error I get when running in dbx) I can get no useful
information in a debug session from Solaris.

... so I've switched to RHEL and gdb and have interesting information.
Under Linux, I get this error on init:
[Fri Feb 03 10:56:21 2012] [error] Init: Failed to enable Crypto Device
API `chil'
[Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 2164682852
error:81067064:CHIL engine:HWCRHK_INIT:already loaded
[Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 638287981
error:260B806D:engine routines:ENGINE_TABLE_REGISTER:init failed

This only happens when SSLProxyMachineCertificateChainFile is set....
With some quick debugging I see that the hwcrhk_finish DOES NOT get
called during ssl_cleanup_pre_config... but DOES get called when the
directive has been removed. To me, it looks like httpd has not
registered the engine for cleanup, but that certainly shouldn't be
impacted by this patch. It seems something in the process of loading the
store is complicating things.

I'll continue poking around, but pointers are certainly appreciated.

--
Daniel Ruggeri


shenson at opensslfoundation

Feb 3, 2012, 10:27 AM

Post #7 of 10 (426 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 03/02/2012 17:45, Daniel Ruggeri wrote:
> On 2/2/2012 1:02 PM, Daniel Ruggeri wrote:
>> Since this happens with every attempt to start, I suspect it has nothing
>> to do with the new directive and more to do with something I did on the
>> openssl build.
>
> I was, indeed, doing something stupid. A build with openssl 1.0.0g
> replicates the behavior of 0.9.8g in that it fails when
> SSLProxyMachineCertificateChainFile is enabled. The annoying part is
> that (due to the error I get when running in dbx) I can get no useful
> information in a debug session from Solaris.
>
> ... so I've switched to RHEL and gdb and have interesting information.
> Under Linux, I get this error on init:
> [Fri Feb 03 10:56:21 2012] [error] Init: Failed to enable Crypto Device
> API `chil'
> [Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 2164682852
> error:81067064:CHIL engine:HWCRHK_INIT:already loaded
> [Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 638287981
> error:260B806D:engine routines:ENGINE_TABLE_REGISTER:init failed
>
> This only happens when SSLProxyMachineCertificateChainFile is set....
> With some quick debugging I see that the hwcrhk_finish DOES NOT get
> called during ssl_cleanup_pre_config... but DOES get called when the
> directive has been removed. To me, it looks like httpd has not
> registered the engine for cleanup, but that certainly shouldn't be
> impacted by this patch. It seems something in the process of loading the
> store is complicating things.
>
> I'll continue poking around, but pointers are certainly appreciated.
>

Hmm... the ENGINE code is careful not to shut down an ENGINE if keys exist which
make use of it.

So there is a possibility that some chain verification leaves a reference to
an RSA key which prevents the ENGINE from closing down completely.

In engines/e_chil.c try commenting out the line containing
ERR_load_HWCRHK_strings().

Only side effect of doing that is you will only get numerical error codes and
not error strings.
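
The behaviour described above can be sketched with a toy model. This is not OpenSSL code: the names toy_engine, toy_init, toy_finish, config_pass and hw_loaded are all hypothetical, and the model assumes the engine module keeps one piece of module-level state for the loaded hardware library. It shows how one leftover key reference can cause the finish step to be skipped, so that the next initialisation attempt reports "already loaded".

```c
#include <assert.h>
#include <stdbool.h>

/* Models module-level state in the engine: is the hardware library loaded? */
bool hw_loaded = false;

/* Hypothetical engine handle with a count of functional references. */
typedef struct {
    int funct_ref;
} toy_engine;

/* Take a reference; the first reference on a handle loads the hardware. */
bool toy_init(toy_engine *e)
{
    if (e->funct_ref == 0) {
        if (hw_loaded)
            return false;      /* "already loaded": an earlier finish never ran */
        hw_loaded = true;
    }
    e->funct_ref++;
    return true;
}

/* Drop a reference; only the last one unloads the hardware. */
void toy_finish(toy_engine *e)
{
    assert(e->funct_ref > 0);
    if (--e->funct_ref == 0)
        hw_loaded = false;
}

/* One configuration pass. leak_key_ref models a key reference that is
 * never released, e.g. one still held by an uncleaned context. */
bool config_pass(bool leak_key_ref)
{
    toy_engine e = { 0 };
    if (!toy_init(&e))         /* enabling the crypto device */
        return false;
    toy_init(&e);              /* a key bound to the engine takes a reference */
    if (!leak_key_ref)
        toy_finish(&e);        /* key released, reference dropped */
    toy_finish(&e);            /* end-of-pass cleanup */
    return true;
}
```

In this model, repeated calls to config_pass(false) keep succeeding, while a call to config_pass(true) leaves hw_loaded set and makes the following pass fail at toy_init.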

Steve.
--
Dr Stephen Henson. OpenSSL Software Foundation, Inc.
1829 Mount Ephraim Road
Adamstown, MD 21710
+1 877-673-6775
shenson [at] opensslfoundation


sctemme at apache

Feb 3, 2012, 11:41 AM

Post #8 of 10 (426 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

Remember the CHIL engine cleanup was fixed to prevent a dangling cleanup function pointer... I forget which OpenSSL version got that fix but in any case RH only recently backported it.

I'm sure I didn't test with any proxy config at the time.

S.

--
Sander Temme
sander [at] temme

Sent from my phone

On Feb 3, 2012, at 1:27 PM, Dr Stephen Henson <shenson [at] opensslfoundation> wrote:

> On 03/02/2012 17:45, Daniel Ruggeri wrote:
>> On 2/2/2012 1:02 PM, Daniel Ruggeri wrote:
>>> Since this happens with every attempt to start, I suspect it has nothing
>>> to do with the new directive and more to do with something I did on the
>>> openssl build.
>>
>> I was, indeed, doing something stupid. A build with openssl 1.0.0g
>> replicates the behavior of 0.9.8g in that it fails when
>> SSLProxyMachineCertificateChainFile is enabled. The annoying part is
>> that (due to the error I get when running in dbx) I can get no useful
>> information in a debug session from Solaris.
>>
>> ... so I've switched to RHEL and gdb and have interesting information.
>> Under Linux, I get this error on init:
>> [Fri Feb 03 10:56:21 2012] [error] Init: Failed to enable Crypto Device
>> API `chil'
>> [Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 2164682852
>> error:81067064:CHIL engine:HWCRHK_INIT:already loaded
>> [Fri Feb 03 10:56:21 2012] [error] SSL Library Error: 638287981
>> error:260B806D:engine routines:ENGINE_TABLE_REGISTER:init failed
>>
>> This only happens when SSLProxyMachineCertificateChainFile is set....
>> With some quick debugging I see that the hwcrhk_finish DOES NOT get
>> called during ssl_cleanup_pre_config... but DOES get called when the
>> directive has been removed. To me, it looks like httpd has not
>> registered the engine for cleanup, but that certainly shouldn't be
>> impacted by this patch. It seems something in the process of loading the
>> store is complicating things.
>>
>> I'll continue poking around, but pointers are certainly appreciated.
>>
>
> Hmm... the ENGINE code is careful not to shut down an ENGINE if keys exist which
> make use of it.
>
> So there is a possibility that some chain verification leaves a reference to
> an RSA key which prevents the ENGINE from closing down completely.
>
> In engines/e_chil.c try commenting out the line containing
> ERR_load_HWCRHK_strings().
>
> Only side effect of doing that is you will only get numerical error codes and
> not error strings.
>
> Steve.
> --
> Dr Stephen Henson. OpenSSL Software Foundation, Inc.
> 1829 Mount Ephraim Road
> Adamstown, MD 21710
> +1 877-673-6775
> shenson [at] opensslfoundation


DRuggeri at primary

Feb 3, 2012, 2:57 PM

Post #9 of 10 (426 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 2/3/2012 12:27 PM, Dr Stephen Henson wrote:
> Hmm... the ENGINE code is careful not to shut down an ENGINE if keys exist which
> make use of it.
>
> So there is a possibility that some chain verification leaves a reference to
> an RSA key which prevents the ENGINE from closing down completely.
>
> In engines/e_chil.c try commenting out the line containing
> ERR_load_HWCRHK_strings().
>
> Only side effect of doing that is you will only get numerical error codes and
> not error strings.
>
> Steve.

I will try that on Monday. This is a good tip, though, and gives me an
avenue to explore! Thanks!


On 2/3/2012 1:41 PM, Sander Temme wrote:
> Remember the CHIL engine cleanup was fixed to prevent a dangling cleanup function pointer... I forget which OpenSSL version got that fix but in any case RH only recently backported it.
>
> I'm sure I didn't test with any proxy config at the time.

Correct, sir. I am compiling and packaging for three platforms from the
latest sources available - I do all of my testing with two-way proxy
authentication. This recent test was openssl 1.0.0g but the behavior is
observed also in 0.9.8t. I am certain that this is an issue only when
using SSLProxyMachineCertificateChainFile (currently in trunk and
proposed for backport in 2.2) with an engine.

--
Daniel Ruggeri


DRuggeri at primary

Mar 2, 2012, 4:37 PM

Post #10 of 10 (370 views)
Permalink
Re: Segfault in openssl's err_cmp when using SSLCryptoDevice and new SSLProxyMachineCertificateChainFile [In reply to]

On 2/3/2012 4:57 PM, Daniel Ruggeri wrote:
> On 2/3/2012 12:27 PM, Dr Stephen Henson wrote:
>> Hmm... the ENGINE code is careful not to shut down an ENGINE if keys exist which
>> make use of it.
>>
>> So there is a possibility that some chain verification leaves a reference to
>> an RSA key which prevents the ENGINE from closing down completely.
>>
>> In engines/e_chil.c try commenting out the line containing
>> ERR_load_HWCRHK_strings().
>>
>> Only side effect of doing that is you will only get numerical error codes and
>> not error strings.
>>
>> Steve.
> I will try that on Monday. This is a good tip, though, and gives me an
> avenue to explore! Thanks!

Yep! This was ultimately what the problem was - a missing cleanup of the
context after the config stage. Not a problem for straightforward certs
without an engine, but posed a problem in CHIL. Thank you for pointing
this out.

I'm still scratching my head about why the error manifested as a
segfault on Solaris SPARC and as CHIL (validly) complaining/bombing out
on AIX and RHEL. Unfortunately, it seems my debugger gets in the way
when trying to figure this out, so it may be a mystery to me forever.

--
Daniel Ruggeri
