Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Apache: Dev

Fix for Windows bug#52476

 

 

Apache dev RSS feed   Index | Next | Previous | View Threaded


claudioc at microsoft

Aug 9, 2012, 11:13 AM

Post #1 of 12 (254 views)
Permalink
Fix for Windows bug#52476

Please code review the fix and let me know if you find any issue.

Attached is the proposed patch for
server\mpm\winnt\child.c

Summary for code reviewers:
If AcceptFilter is 'connect' or 'none', we read data from socket on worker thread. We use blocking recv and assign context->overlapped.Pointer to heap allocated buffer. It is the same procedure as in case of 'AcceptFilter data', but done in worker thread to keep listen thread unblocked.

Note:
It looks like context with overlapped.Pointer == NULL is not processed correctly in windows version of httpd. It may be related to the fact that winnt_insert_network_bucket() rejects context records with overlapped.Pointer == NULL


Please advise on what the next step(s) should be.

Thanks
Claudio
Attachments: child.c.patch (1.49 KB)


claudioc at microsoft

Aug 9, 2012, 11:26 AM

Post #2 of 12 (248 views)
Permalink
RE: Fix for Windows bug#52476 [In reply to]

Better patch, fixed minor issue after another code review.

Thanks
Claudio


From: Claudio Caldato (MS OPEN TECH) [mailto:claudioc [at] microsoft]
Sent: Thursday, August 9, 2012 11:13 AM
To: dev [at] httpd
Subject: Fix for Windows bug#52476

Please code review the fix and let me know if you find any issue.

Attached is the proposed patch for
server\mpm\winnt\child.c

Summary for code reviewers:
If AcceptFilter is 'connect' or 'none', we read data from socket on worker thread. We use blocking recv and assign context->overlapped.Pointer to heap allocated buffer. It is the same procedure as in case of 'AcceptFilter data', but done in worker thread to keep listen thread unblocked.

Note:
It looks like context with overlapped.Pointer == NULL is not processed correctly in windows version of httpd. It may be related to the fact that winnt_insert_network_bucket() rejects context records with overlapped.Pointer == NULL


Please advise on what the next step(s) should be.

Thanks
Claudio
Attachments: child.c.patch (1.50 KB)


trawick at gmail

Aug 9, 2012, 12:52 PM

Post #3 of 12 (249 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Thu, Aug 9, 2012 at 2:26 PM, Claudio Caldato (MS OPEN TECH)
<claudioc [at] microsoft> wrote:
> Better patch, fixed minor issue after another code review.

I tested and seemed to get good results, but my testing isn't
reproducible enough with/without various patches to be conclusive.

A couple of comments on the patch itself:

* Why is BytesRead initialized to 0 in the other function? (I didn't
include that part of your patch in my test.)
* Pass apr_get_netos_error() to ap_log_error instead of rc after a
Winsock function like recv() fails. (apr_get_os_error() otherwise)

More comments below:


> Thanks
>
> Claudio

> From: Claudio Caldato (MS OPEN TECH) [mailto:claudioc [at] microsoft]
> Sent: Thursday, August 9, 2012 11:13 AM
> To: dev [at] httpd
> Subject: Fix for Windows bug#52476
>
>
>
> Please code review the fix and let me know if you find any issue.
>
>
>
> Attached is the proposed patch for
>
> server\mpm\winnt\child.c
>
>
>
> Summary for code reviewers:
>
> If AcceptFilter is ‘connect’ or ‘none’, we read data from socket on worker
> thread. We use blocking recv and assign context->overlapped.Pointer to heap
> allocated buffer. It is the same procedure as in case of ‘AcceptFilter
> data’, but done in worker thread to keep listen thread unblocked.

The big picture doesn't seem correct. The main path of httpd (that
stuff called via ap_run_process_connection()) needs to be able to read
at will. It shouldn't help to move a read() to before the call to
ap_run_process_connection().


> Note:
>
> It looks like context with overlapped.Pointer == NULL is not processed
> correctly in windows version of httpd. It may be related to the fact that
> winnt_insert_network_bucket() rejects context records with
> overlapped.Pointer == NULL

Uhh, maybe I'm confused but that sounds like the issue! (I'm not
familiar with that hook.)

What about this patch?

Index: server/mpm/winnt/child.c
===================================================================
--- server/mpm/winnt/child.c (revision 1371377)
+++ server/mpm/winnt/child.c (working copy)
@@ -780,11 +780,15 @@
apr_bucket *e;
winnt_conn_ctx_t *context = ap_get_module_config(c->conn_config,
&mpm_winnt_module);
- if (context == NULL || (e = context->overlapped.Pointer) == NULL)
+ if (context == NULL) {
return AP_DECLINED;
+ }

- /* seed the brigade with AcceptEx read heap bucket */
- APR_BRIGADE_INSERT_HEAD(bb, e);
+ if ((e = context->overlapped.Pointer) != NULL) {
+ /* seed the brigade with AcceptEx read heap bucket */
+ APR_BRIGADE_INSERT_HEAD(bb, e);
+ }
+
/* also seed the brigade with the client socket. */
e = apr_bucket_socket_create(socket, c->bucket_alloc);
APR_BRIGADE_INSERT_TAIL(bb, e);

>
>
>
>
>
> Please advise on what the next step(s) should be.
>
>
>
> Thanks
>
> Claudio
>
>



--
Born in Roswell... married an alien...
http://emptyhammock.com/


trawick at gmail

Aug 9, 2012, 1:28 PM

Post #4 of 12 (255 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Thu, Aug 9, 2012 at 3:52 PM, Jeff Trawick <trawick [at] gmail> wrote:
> On Thu, Aug 9, 2012 at 2:26 PM, Claudio Caldato (MS OPEN TECH)
> <claudioc [at] microsoft> wrote:
>> Better patch, fixed minor issue after another code review.
>
> I tested and seemed to get good results, but my testing isn't
> reproducible enough with/without various patches to be conclusive.
>
> A couple of comments on the patch itself:
>
> * Why is BytesRead initialized to 0 in the other function? (I didn't
> include that part of your patch in my test.)
> * Pass apr_get_netos_error() to ap_log_error instead of rc after a
> Winsock function like recv() fails. (apr_get_os_error() otherwise)
>
> More comments below:
>
>
>> Thanks
>>
>> Claudio
>
>> From: Claudio Caldato (MS OPEN TECH) [mailto:claudioc [at] microsoft]
>> Sent: Thursday, August 9, 2012 11:13 AM
>> To: dev [at] httpd
>> Subject: Fix for Windows bug#52476
>>
>>
>>
>> Please code review the fix and let me know if you find any issue.
>>
>>
>>
>> Attached is the proposed patch for
>>
>> server\mpm\winnt\child.c
>>
>>
>>
>> Summary for code reviewers:
>>
>> If AcceptFilter is ‘connect’ or ‘none’, we read data from socket on worker
>> thread. We use blocking recv and assign context->overlapped.Pointer to heap
>> allocated buffer. It is the same procedure as in case of ‘AcceptFilter
>> data’, but done in worker thread to keep listen thread unblocked.
>
> The big picture doesn't seem correct. The main path of httpd (that
> stuff called via ap_run_process_connection()) needs to be able to read
> at will. It shouldn't help to move a read() to before the call to
> ap_run_process_connection().
>
>
>> Note:
>>
>> It looks like context with overlapped.Pointer == NULL is not processed
>> correctly in windows version of httpd. It may be related to the fact that
>> winnt_insert_network_bucket() rejects context records with
>> overlapped.Pointer == NULL
>
> Uhh, maybe I'm confused but that sounds like the issue! (I'm not
> familiar with that hook.)
>
> What about this patch?
>
> Index: server/mpm/winnt/child.c
> ===================================================================
> --- server/mpm/winnt/child.c (revision 1371377)
> +++ server/mpm/winnt/child.c (working copy)
> @@ -780,11 +780,15 @@
> apr_bucket *e;
> winnt_conn_ctx_t *context = ap_get_module_config(c->conn_config,
> &mpm_winnt_module);
> - if (context == NULL || (e = context->overlapped.Pointer) == NULL)
> + if (context == NULL) {
> return AP_DECLINED;
> + }
>
> - /* seed the brigade with AcceptEx read heap bucket */
> - APR_BRIGADE_INSERT_HEAD(bb, e);
> + if ((e = context->overlapped.Pointer) != NULL) {
> + /* seed the brigade with AcceptEx read heap bucket */
> + APR_BRIGADE_INSERT_HEAD(bb, e);
> + }
> +
> /* also seed the brigade with the client socket. */
> e = apr_bucket_socket_create(socket, c->bucket_alloc);
> APR_BRIGADE_INSERT_TAIL(bb, e);

no, that's wrong

you only need to put the socket bucket there if you need to put data
in front of it; (when declining, core's insert_network_bucket hook
will do the right thing with the socket)

AFAICT there's no functional issue with winnt_insert_network_bucket,
though it could defer to core to handle the socket bucket

>
>>
>>
>>
>>
>>
>> Please advise on what the next step(s) should be.
>>
>>
>>
>> Thanks
>>
>> Claudio
>>
>>
>
>
>
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/



--
Born in Roswell... married an alien...
http://emptyhammock.com/


trawick at gmail

Aug 9, 2012, 2:50 PM

Post #5 of 12 (249 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Thu, Aug 9, 2012 at 4:28 PM, Jeff Trawick <trawick [at] gmail> wrote:
> On Thu, Aug 9, 2012 at 3:52 PM, Jeff Trawick <trawick [at] gmail> wrote:
>> On Thu, Aug 9, 2012 at 2:26 PM, Claudio Caldato (MS OPEN TECH)
>> <claudioc [at] microsoft> wrote:
>>> Better patch, fixed minor issue after another code review.
>>
>> I tested and seemed to get good results, but my testing isn't
>> reproducible enough with/without various patches to be conclusive.
>>
>> A couple of comments on the patch itself:
>>
>> * Why is BytesRead initialized to 0 in the other function? (I didn't
>> include that part of your patch in my test.)
>> * Pass apr_get_netos_error() to ap_log_error instead of rc after a
>> Winsock function like recv() fails. (apr_get_os_error() otherwise)
>>
>> More comments below:
>>
>>
>>> Thanks
>>>
>>> Claudio
>>
>>> From: Claudio Caldato (MS OPEN TECH) [mailto:claudioc [at] microsoft]
>>> Sent: Thursday, August 9, 2012 11:13 AM
>>> To: dev [at] httpd
>>> Subject: Fix for Windows bug#52476
>>>
>>>
>>>
>>> Please code review the fix and let me know if you find any issue.
>>>
>>>
>>>
>>> Attached is the proposed patch for
>>>
>>> server\mpm\winnt\child.c
>>>
>>>
>>>
>>> Summary for code reviewers:
>>>
>>> If AcceptFilter is ‘connect’ or ‘none’, we read data from socket on worker
>>> thread. We use blocking recv and assign context->overlapped.Pointer to heap
>>> allocated buffer. It is the same procedure as in case of ‘AcceptFilter
>>> data’, but done in worker thread to keep listen thread unblocked.
>>
>> The big picture doesn't seem correct. The main path of httpd (that
>> stuff called via ap_run_process_connection()) needs to be able to read
>> at will. It shouldn't help to move a read() to before the call to
>> ap_run_process_connection().
>>
>>
>>> Note:
>>>
>>> It looks like context with overlapped.Pointer == NULL is not processed
>>> correctly in windows version of httpd. It may be related to the fact that
>>> winnt_insert_network_bucket() rejects context records with
>>> overlapped.Pointer == NULL
>>
>> Uhh, maybe I'm confused but that sounds like the issue! (I'm not
>> familiar with that hook.)
>>
>> What about this patch?
>>
>> Index: server/mpm/winnt/child.c
>> ===================================================================
>> --- server/mpm/winnt/child.c (revision 1371377)
>> +++ server/mpm/winnt/child.c (working copy)
>> @@ -780,11 +780,15 @@
>> apr_bucket *e;
>> winnt_conn_ctx_t *context = ap_get_module_config(c->conn_config,
>> &mpm_winnt_module);
>> - if (context == NULL || (e = context->overlapped.Pointer) == NULL)
>> + if (context == NULL) {
>> return AP_DECLINED;
>> + }
>>
>> - /* seed the brigade with AcceptEx read heap bucket */
>> - APR_BRIGADE_INSERT_HEAD(bb, e);
>> + if ((e = context->overlapped.Pointer) != NULL) {
>> + /* seed the brigade with AcceptEx read heap bucket */
>> + APR_BRIGADE_INSERT_HEAD(bb, e);
>> + }
>> +
>> /* also seed the brigade with the client socket. */
>> e = apr_bucket_socket_create(socket, c->bucket_alloc);
>> APR_BRIGADE_INSERT_TAIL(bb, e);
>
> no, that's wrong
>
> you only need to put the socket bucket there if you need to put data
> in front of it; (when declining, core's insert_network_bucket hook
> will do the right thing with the socket)
>
> AFAICT there's no functional issue with winnt_insert_network_bucket,
> though it could defer to core to handle the socket bucket
>
>>
>>>
>>>
>>>
>>>
>>>
>>> Please advise on what the next step(s) should be.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Claudio

Attached is an alternate version of your patch which uses a different
control over how many bytes should be read.

With TO_READ = 0 (disable your code), it falls over consistently.
With TO_READ = 1 (read just 1 byte) it works reasonably consistently
(1 handshake failure in 100s of attempted connections).

Is there something about the state of the socket that gets reset once
a vanilla recv() is performed? (a little more later)

wrowe had the suggestion recently that we weren't manufacturing the
apr socket properly and the socket state was wrong. I think that
creation of the apr socket is as good as it can be, though in some
experiments I did I was left nervous about this code on the listening
socket:

/* The event needs to be removed from the accepted socket,
* if not removed from the listen socket prior to accept(),
*/
rv = WSAEventSelect(nlsd, events[2], FD_ACCEPT);
if (rv) {
ap_log_error(APLOG_MARK, APLOG_ERR,
apr_get_netos_error(), ap_server_conf, APLOGNO(00333)
"WSAEventSelect() failed.");
CloseHandle(events[2]);
return 1;
}

This makes the listening socket non-blocking, and any attempt to set
it blocking results in WSAEINVAL. I dunno which of this leaks into
the connected child, or if is related to the observation that the
vanilla recv() somehow allows the apr-ized socket to work fine.

We do run this code on the client socket for AcceptFilter None:

/* Per MSDN, cancel the inherited association of this socket
* to the WSAEventSelect API, and restore the state corresponding
* to apr_os_sock_make's default assumptions (really, a flaw within
* os_sock_make and os_sock_put that it does not query).
*/
WSAEventSelect(context->accept_socket, 0, 0);

I guess this is all okay, but voodoo is all that comes to mind due to
the first WSAEventSelect() magically making the socket non-blocking.
Attachments: read_in_MPM.txt (1.30 KB)


trawick at gmail

Aug 9, 2012, 3:35 PM

Post #6 of 12 (248 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Thu, Aug 9, 2012 at 5:50 PM, Jeff Trawick <trawick [at] gmail> wrote:
> On Thu, Aug 9, 2012 at 4:28 PM, Jeff Trawick <trawick [at] gmail> wrote:
>> On Thu, Aug 9, 2012 at 3:52 PM, Jeff Trawick <trawick [at] gmail> wrote:
>>> On Thu, Aug 9, 2012 at 2:26 PM, Claudio Caldato (MS OPEN TECH)
>>> <claudioc [at] microsoft> wrote:
>>>> Better patch, fixed minor issue after another code review.
>>>
>>> I tested and seemed to get good results, but my testing isn't
>>> reproducible enough with/without various patches to be conclusive.
>>>
>>> A couple of comments on the patch itself:
>>>
>>> * Why is BytesRead initialized to 0 in the other function? (I didn't
>>> include that part of your patch in my test.)
>>> * Pass apr_get_netos_error() to ap_log_error instead of rc after a
>>> Winsock function like recv() fails. (apr_get_os_error() otherwise)
>>>
>>> More comments below:
>>>
>>>
>>>> Thanks
>>>>
>>>> Claudio
>>>
>>>> From: Claudio Caldato (MS OPEN TECH) [mailto:claudioc [at] microsoft]
>>>> Sent: Thursday, August 9, 2012 11:13 AM
>>>> To: dev [at] httpd
>>>> Subject: Fix for Windows bug#52476
>>>>
>>>>
>>>>
>>>> Please code review the fix and let me know if you find any issue.
>>>>
>>>>
>>>>
>>>> Attached is the proposed patch for
>>>>
>>>> server\mpm\winnt\child.c
>>>>
>>>>
>>>>
>>>> Summary for code reviewers:
>>>>
>>>> If AcceptFilter is ‘connect’ or ‘none’, we read data from socket on worker
>>>> thread. We use blocking recv and assign context->overlapped.Pointer to heap
>>>> allocated buffer. It is the same procedure as in case of ‘AcceptFilter
>>>> data’, but done in worker thread to keep listen thread unblocked.
>>>
>>> The big picture doesn't seem correct. The main path of httpd (that
>>> stuff called via ap_run_process_connection()) needs to be able to read
>>> at will. It shouldn't help to move a read() to before the call to
>>> ap_run_process_connection().
>>>
>>>
>>>> Note:
>>>>
>>>> It looks like context with overlapped.Pointer == NULL is not processed
>>>> correctly in windows version of httpd. It may be related to the fact that
>>>> winnt_insert_network_bucket() rejects context records with
>>>> overlapped.Pointer == NULL
>>>
>>> Uhh, maybe I'm confused but that sounds like the issue! (I'm not
>>> familiar with that hook.)
>>>
>>> What about this patch?
>>>
>>> Index: server/mpm/winnt/child.c
>>> ===================================================================
>>> --- server/mpm/winnt/child.c (revision 1371377)
>>> +++ server/mpm/winnt/child.c (working copy)
>>> @@ -780,11 +780,15 @@
>>> apr_bucket *e;
>>> winnt_conn_ctx_t *context = ap_get_module_config(c->conn_config,
>>> &mpm_winnt_module);
>>> - if (context == NULL || (e = context->overlapped.Pointer) == NULL)
>>> + if (context == NULL) {
>>> return AP_DECLINED;
>>> + }
>>>
>>> - /* seed the brigade with AcceptEx read heap bucket */
>>> - APR_BRIGADE_INSERT_HEAD(bb, e);
>>> + if ((e = context->overlapped.Pointer) != NULL) {
>>> + /* seed the brigade with AcceptEx read heap bucket */
>>> + APR_BRIGADE_INSERT_HEAD(bb, e);
>>> + }
>>> +
>>> /* also seed the brigade with the client socket. */
>>> e = apr_bucket_socket_create(socket, c->bucket_alloc);
>>> APR_BRIGADE_INSERT_TAIL(bb, e);
>>
>> no, that's wrong
>>
>> you only need to put the socket bucket there if you need to put data
>> in front of it; (when declining, core's insert_network_bucket hook
>> will do the right thing with the socket)
>>
>> AFAICT there's no functional issue with winnt_insert_network_bucket,
>> though it could defer to core to handle the socket bucket
>>
>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Please advise on what the next step(s) should be.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Claudio
>
> Attached is an alternate version of your patch which uses a different
> control over how many bytes should be read.
>
> With TO_READ = 0 (disable your code), it falls over consistently.
> With TO_READ = 1 (read just 1 byte) it works reasonably consistently
> (1 handshake failure in 100s of attempted connections).
>
> Is there something about the state of the socket that gets reset once
> a vanilla recv() is performed? (a little more later)
>
> wrowe had the suggestion recently that we weren't manufacturing the
> apr socket properly and the socket state was wrong. I think that
> creation of the apr socket is as good as it can be, though in some
> experiments I did I was left nervous about this code on the listening
> socket:
>
> /* The event needs to be removed from the accepted socket,
> * if not removed from the listen socket prior to accept(),
> */
> rv = WSAEventSelect(nlsd, events[2], FD_ACCEPT);
> if (rv) {
> ap_log_error(APLOG_MARK, APLOG_ERR,
> apr_get_netos_error(), ap_server_conf, APLOGNO(00333)
> "WSAEventSelect() failed.");
> CloseHandle(events[2]);
> return 1;
> }
>
> This makes the listening socket non-blocking, and any attempt to set
> it blocking results in WSAEINVAL. I dunno which of this leaks into
> the connected child, or if is related to the observation that the
> vanilla recv() somehow allows the apr-ized socket to work fine.
>
> We do run this code on the client socket for AcceptFilter None:
>
> /* Per MSDN, cancel the inherited association of this socket
> * to the WSAEventSelect API, and restore the state corresponding
> * to apr_os_sock_make's default assumptions (really, a flaw within
> * os_sock_make and os_sock_put that it does not query).
> */
> WSAEventSelect(context->accept_socket, 0, 0);
>
> I guess this is all okay, but voodoo is all that comes to mind due to
> the first WSAEventSelect() magically making the socket non-blocking.

Woohoo! I get consistent success with this bit of code in lieu of the
patch being discussed/tweaked in this thread. I think this means that
simply calling recv() to change some aspect of the socket state is
what is needed, and the problem isn't related to the queuing of any
pre-read data. (That's not to say that this is the desired way to
change that state, but it is better than nothing.)

#define TO_PEEK 1
if (TO_PEEK > 0 &&
(context->overlapped.Pointer == NULL) &&
(context->accept_socket != INVALID_SOCKET)) {
char buffer[TO_PEEK];

rc = recv(context->accept_socket, buffer, TO_PEEK, MSG_PEEK);
if (rc == SOCKET_ERROR) {
ap_log_error(APLOG_MARK, APLOG_ERR, apr_get_netos_error(),
ap_server_conf, APLOGNO(00365)
"worker_main: recv(MSG_PEEK) error");
}
}

--
Born in Roswell... married an alien...
http://emptyhammock.com/


wrowe at rowe-clan

Aug 9, 2012, 4:35 PM

Post #7 of 12 (248 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On 8/9/2012 11:26 AM, Claudio Caldato (MS OPEN TECH) wrote:
> Better patch, fixed minor issue after another code review.

Sadly, it won't fix the defect.

Yes, you are successfullly performing a blocking initial read.

And the pipe remains unblocked for the rest of the connection, so any
further blocking reads will be ignored (just like the initial read by
mod_ssl failed to block). Any module expecting blocking behavior on
subsequent reads will be disappointed.


trawick at gmail

Aug 10, 2012, 9:41 AM

Post #8 of 12 (248 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Thu, Aug 9, 2012 at 7:35 PM, William A. Rowe Jr. <wrowe [at] rowe-clan> wrote:
> On 8/9/2012 11:26 AM, Claudio Caldato (MS OPEN TECH) wrote:
>> Better patch, fixed minor issue after another code review.
>
> Sadly, it won't fix the defect.
>
> Yes, you are successfullly performing a blocking initial read.
>
> And the pipe remains unblocked for the rest of the connection, so any
> further blocking reads will be ignored (just like the initial read by
> mod_ssl failed to block). Any module expecting blocking behavior on
> subsequent reads will be disappointed.

The SSL case with the chitchat to set up the connection is a bit
different than the simple case where as soon as you read you have
everything needed, so it does seem moderately interesting that the
read|peek makes the simple SSL testcase work.

Also in the FWLIW department:

a. once we get to the process_connection hook, APR's limited knowledge
of the socket state (timeout, non-blocking) is the same on Linux and
Windows

(That was expected all along.)

I had hoped to query state directly from the OS socket too, but
apparently the Microsoft folks don't think you need to be able to do
that, and I don't see any external sysinternals|other tools to tell
you anything interesting about the socket other than the normal
netstat stuff.

b. Using WSAEnumNetworkEvents to clear any existing events didn't seem to help

/* start new */
rv = WSAEnumNetworkEvents(context->accept_socket, 0, &lpNetworkEvents);
if (rv == SOCKET_ERROR) {
ap_log_error(APLOG_MARK, APLOG_WARNING, 0, ap_server_conf,
"failed to clear network events on connected socket");
}
/* end new */
apr_os_sock_make(&context->sock, &sockinfo, context->ptrans);

c. switching out the WSAEvent stuff for a straight
select(is-listener-readable) doesn't work around the problem either

--
Born in Roswell... married an alien...
http://emptyhammock.com/


trawick at gmail

Aug 10, 2012, 10:31 AM

Post #9 of 12 (247 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Fri, Aug 10, 2012 at 12:41 PM, Jeff Trawick <trawick [at] gmail> wrote:
> On Thu, Aug 9, 2012 at 7:35 PM, William A. Rowe Jr. <wrowe [at] rowe-clan> wrote:
>> On 8/9/2012 11:26 AM, Claudio Caldato (MS OPEN TECH) wrote:
>>> Better patch, fixed minor issue after another code review.
>>
>> Sadly, it won't fix the defect.
>>
>> Yes, you are successfullly performing a blocking initial read.
>>
>> And the pipe remains unblocked for the rest of the connection, so any
>> further blocking reads will be ignored (just like the initial read by
>> mod_ssl failed to block). Any module expecting blocking behavior on
>> subsequent reads will be disappointed.
>
> The SSL case with the chitchat to set up the connection is a bit
> different than the simple case where as soon as you read you have
> everything needed, so it does seem moderately interesting that the
> read|peek makes the simple SSL testcase work.
>
> Also in the FWLIW department:
>
> a. once we get to the process_connection hook, APR's limited knowledge
> of the socket state (timeout, non-blocking) is the same on Linux and
> Windows
>
> (That was expected all along.)
>
> I had hoped to query state directly from the OS socket too, but
> apparently the Microsoft folks don't think you need to be able to do
> that, and I don't see any external sysinternals|other tools to tell
> you anything interesting about the socket other than the normal
> netstat stuff.
>
> b. Using WSAEnumNetworkEvents to clear any existing events didn't seem to help
>
> /* start new */
> rv = WSAEnumNetworkEvents(context->accept_socket, 0, &lpNetworkEvents);
> if (rv == SOCKET_ERROR) {
> ap_log_error(APLOG_MARK, APLOG_WARNING, 0, ap_server_conf,
> "failed to clear network events on connected socket");
> }
> /* end new */
> apr_os_sock_make(&context->sock, &sockinfo, context->ptrans);
>
> c. switching out the WSAEvent stuff for a straight
> select(is-listener-readable) doesn't work around the problem either

The APR socket gets set to non-blocking prior to the first call to
apr_socket_recv():

#0 apr_socket_opt_set (sock=0x2933890, opt=8, on=1)
at network_io/win32/sockopt.c:104
#1 0x00423daa in ap_core_output_filter (f=0xc2bb310, new_bb=0x2933d20)
at core_filters.c:394
#2 0x0042197c in ap_pass_brigade (next=0xc2bb310, bb=0x2933d20)
at util_filter.c:533
#3 0x0045be1e in bio_filter_out_pass (bio=<value optimized out>)
at ssl_engine_io.c:134
#4 bio_filter_out_flush (bio=<value optimized out>) at ssl_engine_io.c:155
#5 0x0045bff9 in bio_filter_in_read (bio=0x28aa9a8,
in=0x2899348 "0╪ \f\003\001", inlen=11) at ssl_engine_io.c:458
#6 0x004b30f7 in BIO_read ()
#7 0x00c1b798 in ?? ()

We picked up that apr_socket_opt_set() from the async-dev branch with
r327872, though the timeout calls in there were changed subsequently.
I wonder if that call is stray and it doesn't get along with the
timeout handling on Windows because of the SO_RCVTIMEO/SO_SENDTIMEO
use on Windows, in lieu of non-blocking socket + poll like on Unix.

At the time it was added, the new code was

apr_socket_timeout_set(client_socket, 0)
apr_socket_opt_set(client_socket, APR_SO_NONBLOCK, 1)

(redundant unless there was some APR glitch at the time)

The apr_socket_timeout_set() was subsequently removed.

I'll proceed with the obvious test. (I'm pretending that someone is
out there reading this and will chime in with any hints or other
comments based on my stream of observations :) )

--
Born in Roswell... married an alien...
http://emptyhammock.com/


trawick at gmail

Aug 10, 2012, 10:43 AM

Post #10 of 12 (252 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Fri, Aug 10, 2012 at 1:31 PM, Jeff Trawick <trawick [at] gmail> wrote:
> On Fri, Aug 10, 2012 at 12:41 PM, Jeff Trawick <trawick [at] gmail> wrote:
>> On Thu, Aug 9, 2012 at 7:35 PM, William A. Rowe Jr. <wrowe [at] rowe-clan> wrote:
>>> On 8/9/2012 11:26 AM, Claudio Caldato (MS OPEN TECH) wrote:
>>>> Better patch, fixed minor issue after another code review.
>>>
>>> Sadly, it won't fix the defect.
>>>
>>> Yes, you are successfullly performing a blocking initial read.
>>>
>>> And the pipe remains unblocked for the rest of the connection, so any
>>> further blocking reads will be ignored (just like the initial read by
>>> mod_ssl failed to block). Any module expecting blocking behavior on
>>> subsequent reads will be disappointed.
>>
>> The SSL case with the chitchat to set up the connection is a bit
>> different than the simple case where as soon as you read you have
>> everything needed, so it does seem moderately interesting that the
>> read|peek makes the simple SSL testcase work.
>>
>> Also in the FWLIW department:
>>
>> a. once we get to the process_connection hook, APR's limited knowledge
>> of the socket state (timeout, non-blocking) is the same on Linux and
>> Windows
>>
>> (That was expected all along.)
>>
>> I had hoped to query state directly from the OS socket too, but
>> apparently the Microsoft folks don't think you need to be able to do
>> that, and I don't see any external sysinternals|other tools to tell
>> you anything interesting about the socket other than the normal
>> netstat stuff.
>>
>> b. Using WSAEnumNetworkEvents to clear any existing events didn't seem to help
>>
>> /* start new */
>> rv = WSAEnumNetworkEvents(context->accept_socket, 0, &lpNetworkEvents);
>> if (rv == SOCKET_ERROR) {
>> ap_log_error(APLOG_MARK, APLOG_WARNING, 0, ap_server_conf,
>> "failed to clear network events on connected socket");
>> }
>> /* end new */
>> apr_os_sock_make(&context->sock, &sockinfo, context->ptrans);
>>
>> c. switching out the WSAEvent stuff for a straight
>> select(is-listener-readable) doesn't work around the problem either
>
> The APR socket gets set to non-blocking prior to the first call to
> apr_socket_recv():
>
> #0 apr_socket_opt_set (sock=0x2933890, opt=8, on=1)
> at network_io/win32/sockopt.c:104
> #1 0x00423daa in ap_core_output_filter (f=0xc2bb310, new_bb=0x2933d20)
> at core_filters.c:394
> #2 0x0042197c in ap_pass_brigade (next=0xc2bb310, bb=0x2933d20)
> at util_filter.c:533
> #3 0x0045be1e in bio_filter_out_pass (bio=<value optimized out>)
> at ssl_engine_io.c:134
> #4 bio_filter_out_flush (bio=<value optimized out>) at ssl_engine_io.c:155
> #5 0x0045bff9 in bio_filter_in_read (bio=0x28aa9a8,
> in=0x2899348 "0╪ \f\003\001", inlen=11) at ssl_engine_io.c:458
> #6 0x004b30f7 in BIO_read ()
> #7 0x00c1b798 in ?? ()
>
> We picked up that apr_socket_opt_set() from the async-dev branch with
> r327872, though the timeout calls in there were changed subsequently.
> I wonder if that call is stray and it doesn't get along with the
> timeout handling on Windows because of the SO_RCVTIMEO/SO_SENDTIMEO
> use on Windows, in lieu of non-blocking socket + poll like on Unix.
>
> At the time it was added, the new code was
>
> apr_socket_timeout_set(client_socket, 0)
> apr_socket_opt_set(client_socket, APR_SO_NONBLOCK, 1)
>
> (redundant unless there was some APR glitch at the time)
>
> The apr_socket_timeout_set() was subsequently removed.
>
> I'll proceed with the obvious test. (I'm pretending that someone is
> out there reading this and will chime in with any hints or other
> comments based on my stream of observations :) )

This patch is testing great so far with the AcceptFilter-none+SSL
scenario on Windows.

Index: server/core_filters.c
===================================================================
--- server/core_filters.c (revision 1371377)
+++ server/core_filters.c (working copy)
@@ -391,10 +391,6 @@
if (ctx == NULL) {
ctx = apr_pcalloc(c->pool, sizeof(*ctx));
net->out_ctx = (core_output_filter_ctx_t *)ctx;
- rv = apr_socket_opt_set(net->client_socket, APR_SO_NONBLOCK, 1);
- if (rv != APR_SUCCESS) {
- return rv;
- }
/*
* Need to create tmp brigade with correct lifetime. Passing
* NULL to apr_brigade_split_ex would result in a brigade

I'll run it through the test framework on Linux before committing.

--
Born in Roswell... married an alien...
http://emptyhammock.com/


info at apachelounge

Aug 13, 2012, 5:58 AM

Post #11 of 12 (237 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

Also here it is running now without issues till now here with
AcceptFilter-none+SSL

Steffen

-----Original Message-----
From: Jeff Trawick
Sent: Friday, August 10, 2012 7:43 PM Newsgroups: gmane.comp.apache.devel
To: dev [at] httpd
Subject: Re: Fix for Windows bug#52476

This patch is testing great so far with the AcceptFilter-none+SSL
scenario on Windows.

Index: server/core_filters.c
===================================================================
--- server/core_filters.c (revision 1371377)
+++ server/core_filters.c (working copy)
@@ -391,10 +391,6 @@
if (ctx == NULL) {
ctx = apr_pcalloc(c->pool, sizeof(*ctx));
net->out_ctx = (core_output_filter_ctx_t *)ctx;
- rv = apr_socket_opt_set(net->client_socket, APR_SO_NONBLOCK, 1);
- if (rv != APR_SUCCESS) {
- return rv;
- }
/*
* Need to create tmp brigade with correct lifetime. Passing
* NULL to apr_brigade_split_ex would result in a brigade

I'll run it through the test framework on Linux before committing.


trawick at gmail

Aug 13, 2012, 6:47 AM

Post #12 of 12 (236 views)
Permalink
Re: Fix for Windows bug#52476 [In reply to]

On Mon, Aug 13, 2012 at 8:58 AM, Apache Lounge <info [at] apachelounge> wrote:
> Also here it is running now without issues till now here with
> AcceptFilter-none+SSL

awesome/thanks!

>
> Steffen
>
> -----Original Message----- From: Jeff Trawick
> Sent: Friday, August 10, 2012 7:43 PM Newsgroups: gmane.comp.apache.devel
> To: dev [at] httpd
> Subject: Re: Fix for Windows bug#52476
>
>
> This patch is testing great so far with the AcceptFilter-none+SSL
> scenario on Windows.
>
> Index: server/core_filters.c
> ===================================================================
> --- server/core_filters.c (revision 1371377)
> +++ server/core_filters.c (working copy)
> @@ -391,10 +391,6 @@
> if (ctx == NULL) {
> ctx = apr_pcalloc(c->pool, sizeof(*ctx));
> net->out_ctx = (core_output_filter_ctx_t *)ctx;
> - rv = apr_socket_opt_set(net->client_socket, APR_SO_NONBLOCK, 1);
> - if (rv != APR_SUCCESS) {
> - return rv;
> - }
> /*
> * Need to create tmp brigade with correct lifetime. Passing
> * NULL to apr_brigade_split_ex would result in a brigade
>
> I'll run it through the test framework on Linux before committing.
>



--
Born in Roswell... married an alien...
http://emptyhammock.com/

Apache dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.