Mailing List Archive: Apache: Dev

Httpd 3.0 or something else

 

 



Brian.Akins at turner

Nov 9, 2009, 10:51 AM

Post #26 of 69 (977 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On 11/9/09 1:40 PM, "Brian Akins" <Brian.Akins [at] turner> wrote:

> On 11/9/09 1:36 PM, "Graham Leggett" <minfrin [at] sharp> wrote:
>
>>> It works really well for proxy.
>>
>> Aka "static data" :)
>
> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl
> stuff, etc (Full disclosure, I wrote the horrid perl stuff.)

Replying to my own post:

What we discussed, some on list and some at ApacheCon, was having a really
good and simple process manager. mod_fcgid is too much work to configure for
mere mortals. If we just had something like:

AssociateExternal .php /path/to/my/php-cgi

And it did the sensible thing (whether fcgi, http, wscgi, etc.) then all the
"config" is in one place. Obviously, we could have some "advanced" process
management directives.

If your app needed some special config stuff, we could easily pass it across
somehow.
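
As a rough illustration of the proposal (AssociateExternal is not an existing httpd directive), the lookup behind such a line could be as small as a table from file extension to external program. This is a Python sketch only; `parse_associations` and `handler_for` are made-up names, and choosing the protocol (fcgi, http, etc.) is left out entirely.

```python
import os.path

def parse_associations(config_text):
    """Parse hypothetical 'AssociateExternal <ext> <program>' lines into a dict."""
    table = {}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "AssociateExternal":
            _, ext, program = parts
            table[ext] = program
    return table

def handler_for(table, request_path):
    """Return the external program for a request path, or None if unmapped."""
    _, ext = os.path.splitext(request_path)
    return table.get(ext)

table = parse_associations("AssociateExternal .php /path/to/my/php-cgi")
php_handler = handler_for(table, "/index.php")   # mapped extension
png_handler = handler_for(table, "/logo.png")    # unmapped extension
```

Keeping the whole mapping in one table is what would put all the "config" in one place; any advanced process-management settings would hang off the same entry.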

--
Brian Akins


minfrin at sharp

Nov 9, 2009, 10:59 AM

Post #27 of 69 (972 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Akins, Brian wrote:

>>> It works really well for proxy.
>> Aka "static data" :)
>
> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl
> stuff, etc (Full disclosure, I wrote the horrid perl stuff.)

Doesn't matter, once httpd proxy gets hold of it, it's just shifting
static bits.

Something I want to teach httpd to do is buffer up data for output, and
then forget about the output to focus on releasing the backend resources
ASAP, ready for the next request when it (eventually) comes. The fact
that network writes block makes this painful to achieve.

Proxy had an optimisation that released proxied backend resources when
it detected EOS from the backend but before attempting to pass it to the
frontend, but someone refactored that away at some point. It would be
good if such an optimisation was available server wide.

I want to be able to write something to the filter stack, and get an
EWOULDBLOCK (or similar) back if it isn't ready. I could then make
intelligent decisions based on this. For example, if I were a cache, I
would carry on reading from the backend and writing the data to the
cache, while the frontend was saying "not now, slow browser ahead". I
could have long since finished caching and closed the backend connection
and freed the resources, before the frontend returned "cool, ready for
you now", at which point I answer "no worries, have the cached content I
prepared earlier".
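
The behaviour described above can be sketched with a write that reports "would block" instead of waiting. This is a simulation in Python, not httpd's filter API: `SlowFrontend`, `try_write`, and `serve_through_cache` are hypothetical names, and the "cache" is just a list.

```python
class SlowFrontend:
    """Stand-in for a client connection that is only sometimes writable."""
    def __init__(self):
        self.ready = False
        self.received = []

    def try_write(self, chunk):
        """Return True if the chunk was accepted, False for 'would block'."""
        if not self.ready:
            return False
        self.received.append(chunk)
        return True

def serve_through_cache(backend_chunks, frontend):
    """Drain the backend at full speed; return the cache and the chunks still owed."""
    cache, pending = [], []
    for chunk in backend_chunks:
        cache.append(chunk)                    # the cache is always writable
        if pending or not frontend.try_write(chunk):
            pending.append(chunk)              # client not ready: keep order, do not wait
    return cache, pending                      # the backend can be released here

frontend = SlowFrontend()
cache, pending = serve_through_cache([b"a", b"b", b"c"], frontend)
# The backend is finished and could be closed; the client has seen nothing yet.
frontend.ready = True
for chunk in pending:
    frontend.try_write(chunk)
```

The point of the sketch is the ordering: the backend loop ends before the frontend accepts a single byte, and the client is later served from what was kept aside.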

Regards,
Graham
--


Brian.Akins at turner

Nov 9, 2009, 11:05 AM

Post #28 of 69 (971 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On 11/9/09 1:59 PM, "Graham Leggett" <minfrin [at] sharp> wrote:


> Doesn't matter, once httpd proxy gets hold of it, it's just shifting
> static bits.

True.

> Something I want to teach httpd to do is buffer up data for output, and
> then forget about the output to focus on releasing the backend resources
> ASAP, ready for the next request when it (eventually) comes. The fact
> that network writes block makes this painful to achieve.

FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I
think this is what perlbal does as well. Same can be done outside apache
using X-sendfile like methods. Seems like we could move this "inside"
apache fairly easily. Maybe we can do it with a filter. I tried once and got
it to filter "most" backend stuff to a temp file, but it tended to miss and
block. That was a while ago, and I haven't learned enough about the
filters since then to think it would work any better.

Maybe a mod_buffer that goes to a file?

Also, all these temp files are normally in tmpfs for us.
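
A minimal Python sketch of the buffer-to-a-file-then-sendfile pattern follows. It is not how nginx or any httpd module implements it; `spool_then_send` is a made-up name and a local socket pair stands in for the client connection. Python's `socket.sendfile()` uses `os.sendfile()` where the platform supports it and falls back to plain `send()` otherwise.

```python
import socket
import tempfile

def spool_then_send(backend_chunks, client_sock):
    """Write the whole backend response to a temp file, then send the file."""
    with tempfile.TemporaryFile() as spool:
        for chunk in backend_chunks:
            spool.write(chunk)
        # The backend connection could be released at this point.
        spool.flush()
        spool.seek(0)
        return client_sock.sendfile(spool)   # returns the number of bytes sent

server_side, client_side = socket.socketpair()
sent = spool_then_send([b"hello ", b"world"], server_side)
server_side.close()
received = client_side.recv(1024)
client_side.close()
```

Placing the temp directory on tmpfs, as mentioned above, would keep the spool file in memory while still giving the kernel a file handle to send from.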

--
Brian Akins


gstein at gmail

Nov 9, 2009, 11:06 AM

Post #29 of 69 (972 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, Nov 9, 2009 at 13:59, Graham Leggett <minfrin [at] sharp> wrote:
> Akins, Brian wrote:
>
>>>> It works really well for proxy.
>>> Aka "static data" :)
>>
>> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl
>> stuff, etc (Full disclosure, I wrote the horrid perl stuff.)
>
> Doesn't matter, once httpd proxy gets hold of it, it's just shifting
> static bits.
>
> Something I want to teach httpd to do is buffer up data for output, and
> then forget about the output to focus on releasing the backend resources
> ASAP, ready for the next request when it (eventually) comes. The fact
> that network writes block makes this painful to achieve.
>
> Proxy had an optimisation that released proxied backend resources when
> it detected EOS from the backend but before attempting to pass it to the
> frontend, but someone refactored that away at some point. It would be
> good if such an optimisation was available server wide.
>
> I want to be able to write something to the filter stack, and get an
> EWOULDBLOCK (or similar) back if it isn't ready. I could then make
> intelligent decisions based on this. For example, if I were a cache, I
> would carry on reading from the backend and writing the data to the
> cache, while the frontend was saying "not now, slow browser ahead". I
> could have long since finished caching and closed the backend connection
> and freed the resources, before the frontend returned "cool, ready for
> you now", at which point I answer "no worries, have the cached content I
> prepared earlier".

These issues are already solved by moving to a Serf core. It is fully
asynchronous.

Backend handlers will no longer "push" bits towards the network. The
core will "pull" them from a bucket. *Which* bucket is defined by a
{URL,Headers}->Bucket mapping system.
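
To make the pull model concrete, here is a small Python sketch of a {URL,Headers}->Bucket mapping. It does not use serf; the rule table, `bucket_for`, and the two bucket factories are assumptions made for illustration, with generators standing in for buckets.

```python
def file_bucket(url, headers):
    """Hypothetical content generator for static files."""
    yield b"contents of " + url.encode()

def proxy_bucket(url, headers):
    """Hypothetical content generator for proxied requests."""
    yield b"proxied " + url.encode()

# Ordered (predicate, bucket factory) rules; the first match wins.
RULES = [
    (lambda url, headers: url.startswith("/app/"), proxy_bucket),
    (lambda url, headers: True, file_bucket),
]

def bucket_for(url, headers):
    """Map a request to the bucket that will generate its response."""
    for matches, make_bucket in RULES:
        if matches(url, headers):
            return make_bucket(url, headers)

def core_pull(bucket):
    """The core, not the handler, decides when to take the next chunk."""
    return b"".join(bucket)

app_body = core_pull(bucket_for("/app/x", {}))
static_body = core_pull(bucket_for("/index.html", {}))
```

Nothing in either generator touches the network; output only moves when `core_pull` asks for it.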

Cheers,
-g


minfrin at sharp

Nov 9, 2009, 11:17 AM

Post #30 of 69 (974 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Akins, Brian wrote:

> FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I
> think this is what perlbal does as well. Same can be done outside apache
> using X-sendfile like methods. Seems like we could move this "inside"
> apache fairly easily. Maybe we can do it with a filter. I tried once and got
> it to filter "most" backend stuff to a temp file, but it tended to miss and
> block. That was a while ago, and I haven't learned enough about the
> filters since then to think it would work any better.
>
> Maybe a mod_buffer that goes to a file?

mod_disk_cache can be made to do this quite trivially (it's on the list
of things to do When I Have Time(TM)).

In theory, a mod_disk_buffer could do this quite easily, on condition
upstream writes didn't block.

Regards,
Graham
--


Brian.Akins at turner

Nov 9, 2009, 11:19 AM

Post #31 of 69 (971 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On 11/9/09 2:06 PM, "Greg Stein" <gstein [at] gmail> wrote:

> These issues are already solved by moving to a Serf core. It is fully
> asynchronous.

Okay that's one convert, any others? ;)

That's what Paul and I discussed a lot last week.

My ideal httpd 3.0 is:

Libev + serf + lua

--
Brian Akins


paul at querna

Nov 9, 2009, 11:21 AM

Post #32 of 69 (974 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, Nov 9, 2009 at 11:06 AM, Greg Stein <gstein [at] gmail> wrote:
> On Mon, Nov 9, 2009 at 13:59, Graham Leggett <minfrin [at] sharp> wrote:
>> Akins, Brian wrote:
>>
>>>>> It works really well for proxy.
>>>> Aka "static data" :)
>>>
>>> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl
>>> stuff, etc (Full disclosure, I wrote the horrid perl stuff.)
>>
>> Doesn't matter, once httpd proxy gets hold of it, it's just shifting
>> static bits.
>>
>> Something I want to teach httpd to do is buffer up data for output, and
>> then forget about the output to focus on releasing the backend resources
>> ASAP, ready for the next request when it (eventually) comes. The fact
>> that network writes block makes this painful to achieve.
>>
>> Proxy had an optimisation that released proxied backend resources when
>> it detected EOS from the backend but before attempting to pass it to the
>> frontend, but someone refactored that away at some point. It would be
>> good if such an optimisation was available server wide.
>>
>> I want to be able to write something to the filter stack, and get an
>> EWOULDBLOCK (or similar) back if it isn't ready. I could then make
>> intelligent decisions based on this. For example, if I were a cache, I
>> would carry on reading from the backend and writing the data to the
>> cache, while the frontend was saying "not now, slow browser ahead". I
>> could have long since finished caching and closed the backend connection
>> and freed the resources, before the frontend returned "cool, ready for
>> you now", at which point I answer "no worries, have the cached content I
>> prepared earlier".
>
> These issues are already solved by moving to a Serf core. It is fully
> asynchronous.
>
> Backend handlers will no longer "push" bits towards the network. The
> core will "pull" them from a bucket. *Which* bucket is defined by a
> {URL,Headers}->Bucket mapping system.

I was talking to Aaron about this at ApacheCon.

I agree in general, a serf-based core does give us a good start.

But Serf Buckets and the event loop definitely do need some more work
-- simple things, like if the backend bucket is a socket, how do you
tell the event loop that a "would block" return value maps to a file
descriptor talking to an origin server? You don't want to just keep
looping over it until it returns data; you want to poll on the origin
socket, and only try to read when data is available.

I am also concerned about the patterns of sendfile() in the current
serf bucket architecture, and making a whole pipeline do sendfile
correctly seems quite difficult.

-Paul


gstein at gmail

Nov 9, 2009, 12:08 PM

Post #33 of 69 (974 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, Nov 9, 2009 at 14:21, Paul Querna <paul [at] querna> wrote:
>...
> I agree in general, a serf-based core does give us a good start.
>
> But Serf Buckets and the event loop definitely do need some more work
> -- simple things, like if the backend bucket is a socket, how do you
> tell the event loop, that a would block rvalue maps to a file
> descriptor talking to an origin server.   You don't want to just keep
> looping over it until it returns data, you want to poll on the origin
> socket, and only try to read when data is available.

The goal would be that the handler's (aka content generator, aka serf
bucket) socket would be processed in the same select() as the client
connections. When the bucket has no more data from the backend, then
it returns "done for now". Eventually, all network reads/writes
finalize and control returns to the core loop. If data comes in from the
backend, then the core loop wakes up and that bucket can read/return data.

There are two caveats that I can think of, right off hand:

1) Each client connection is associated with one bucket generating the
response. Ideally, you would not bother to read that bucket
unless/until the client connection is ready for writing. But that
could create a deadlock internal to the bucket -- *some* data may need
to be consumed from the backend, processed, and returned to the
backend to "unstick" the entire flow (think SSL). Even though nothing
pops out the top of the bucket, internal processing may need to
happen.

2) If you have 10,000 client connections, and some number of sockets
in the system ready for read/write... how do you quickly determine
*which* buckets to poll to get those sockets processed? You don't want
to poll 9999 idle connections/buckets if only one is ready for
read/write. (note: there are optimizations around this; if the bucket
wants to return data, but wasn't asked to, then next-time-around it
has the same data; no need to drill way down to the source bucket to
attempt to read network data; tho this kinda sets up a busy loop until
that bucket's client is ready for writing)

Are either of these the considerations you were thinking of?

I can certainly see some kind of system to associate buckets and the
sockets that affect their behavior. Though that could get pretty crazy
since it doesn't have to be a 1:1 mapping. One backend socket might
actually service multiple buckets, and vice-versa.

> I am also concerned about the patterns of sendfile() in the current
> serf bucket architecture, and making a whole pipeline do sendfile
> correctly seems quite difficult.

Well... it generally *is* quite difficult in the presence of SSL,
gzip, and chunking. Invariably, content is mangled before hitting the
network, so sendfile() rarely gets a chance to play ball.

But if you really are just dealing with plain files (maybe prezipped),
then the read_for_sendfile() should be workable. Most buckets can't do
squat with it, and should just use a default function. But the file
bucket can return a proper handle.
(and it is entirely possible/reasonable that the signature should be
adjusted to simplify the process)

Cheers,
-g


nick at webthing

Nov 9, 2009, 12:14 PM

Post #34 of 69 (973 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Akins, Brian wrote:

> What we discussed, some on list and some at ApacheCon, was having a really
> good and simple process manager. mod_fcgid is too much work to configure for
> mere mortals. If we just had something like:
>
> AssociateExternal .php /path/to/my/php-cgi

Sounds interesting. Any notes from apachecon or otherwise on
that discussion?

--
Nick Kew


minfrin at sharp

Nov 9, 2009, 1:19 PM

Post #35 of 69 (971 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Greg Stein wrote:

> These issues are already solved by moving to a Serf core. It is fully
> asynchronous.
>
> Backend handlers will no longer "push" bits towards the network. The
> core will "pull" them from a bucket. *Which* bucket is defined by a
> {URL,Headers}->Bucket mapping system.

How is "pull" different from "push"[1]?

Pull, by definition, is blocking behaviour.

You will only run as often as you are pulled, and never more often. And
if the pull is controlled by how quickly the client is accepting the
data, which is typically orders of magnitude slower than the backend can
push, you have no opportunity to try to speed up the server in any way.

Push however, gives you a choice: the push either worked (yay! go
browser!), or it didn't (sensible alternative behaviour, like cache it
for later in a connection filter). Push happens as fast as the backend, not
as slow as the frontend.

So far I'm not convinced it is a step forward, will have to think about
it more.

[1] Apart from the obvious.

Regards,
Graham
--


gstein at gmail

Nov 9, 2009, 2:10 PM

Post #36 of 69 (968 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, Nov 9, 2009 at 16:19, Graham Leggett <minfrin [at] sharp> wrote:
> Greg Stein wrote:
>> These issues are already solved by moving to a Serf core. It is fully
>> asynchronous.
>>
>> Backend handlers will no longer "push" bits towards the network. The
>> core will "pull" them from a bucket. *Which* bucket is defined by a
>> {URL,Headers}->Bucket mapping system.
>
> How is "pull" different from "push"[1]?

The network loop pulls data from the content-generator.

Apache 1.x and 2.x had a handler that pushed data at the network.
There is no loop, of course, since each worker had direct control of
the socket to push data into.

> Pull, by definition, is blocking behaviour.

You may want to check your definitions.

When you read from a serf bucket, it will return however much you ask
for, or as much as it has without blocking. When it gives you that
data, it can say "I have more", "I'm done", or "This is what I had
without blocking".
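
The three outcomes of a read can be sketched as follows. This is a Python model of the contract described above, not serf's C API; the `Status` names, the `Bucket` class, and its `feed` method are invented for the example.

```python
from enum import Enum

class Status(Enum):
    MORE = "I have more"
    DONE = "I'm done"
    WOULD_BLOCK = "this is what I had without blocking"

class Bucket:
    def __init__(self):
        self.buffered = bytearray()   # data that has already arrived
        self.finished = False         # set once the source is exhausted

    def feed(self, data, last=False):
        """Simulate data arriving from the source."""
        self.buffered += data
        self.finished = self.finished or last

    def read(self, requested):
        """Return (data, status) immediately; never wait for more input."""
        data = bytes(self.buffered[:requested])
        del self.buffered[:requested]
        if self.buffered:
            return data, Status.MORE
        if self.finished:
            return data, Status.DONE
        return data, Status.WOULD_BLOCK

bucket = Bucket()
bucket.feed(b"abcdef")
first = bucket.read(4)     # some data, more already buffered
second = bucket.read(4)    # the rest, but the source is not finished
bucket.feed(b"gh", last=True)
third = bucket.read(4)     # final data
```

In this model the reader is never parked inside `read`; a WOULD_BLOCK status just tells the caller to come back when the source has produced more.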

> You will only run as often as you are pulled, and never more often. And
> if the pull is controlled by how quickly the client is accepting the
> data, which is typically orders of magnitude slower than the backend can
> push, you have no opportunity to try to speed up the server in any way.

Eh? Are you kidding me?

One single network thread can manage N client connections. As each
becomes writable, the loop reads ("pulls") from the bucket and jams it
into the client socket. If you're really fancy, then you know what the
window is, and you ask the bucket for that much data.

> Push however, gives you a choice: the push either worked (yay! go
> browser!), or it didn't (sensible alternative behaviour, like cache it
> for later in a connection filter). Push happens as fast as the backend, not
> as slow as the frontend.

Push means that you have a worker per connection, pushing the response
onto the network. I really would like to see us get away from a worker
per connection.

Once a worker thread determines which bucket to create/build, then it
passes it along to the network thread, and returns for more work. The
network thread can then manage N connections with their associated
response buckets.

If one network thread cannot read/generate the content fast enough,
then you use multiple threads to keep the connections full.

Then you want to add in a bit of control around reading of requests in
order to manage the backlog of responses (and any potential memory
buildup that entails). If the network thread is consuming 100M and 20k
sockets, you may want to stop accepting connections or accept but read
them slowly until the pressure eases. etc...
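
The backlog control mentioned in the last paragraph could start out as a simple threshold check. This Python sketch takes the 100M and 20k figures from the paragraph above purely as example limits; `admission` is a hypothetical name, and treating either limit as sufficient is an assumption.

```python
MAX_BUFFERED_BYTES = 100 * 1024 * 1024   # the "100M" figure, used as an example limit
MAX_SOCKETS = 20_000                     # the "20k sockets" figure

def admission(buffered_bytes, open_sockets):
    """Return 'accept', or 'hold' once either example limit is reached."""
    if buffered_bytes >= MAX_BUFFERED_BYTES or open_sockets >= MAX_SOCKETS:
        return "hold"     # stop accepting, or accept but read requests slowly
    return "accept"

light_load = admission(10 * 1024 * 1024, 500)
heavy_load = admission(120 * 1024 * 1024, 500)
```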

Cheers,
-g


minfrin at sharp

Nov 9, 2009, 3:47 PM

Post #37 of 69 (961 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Greg Stein wrote:

>> How is "pull" different from "push"[1]?
>
> The network loop pulls data from the content-generator.
>
> Apache 1.x and 2.x had a handler that pushed data at the network.
> There is no loop, of course, since each worker had direct control of
> the socket to push data into.

As I said in [1], apart from the obvious ;)

>> Pull, by definition, is blocking behaviour.
>
> You may want to check your definitions.
>
> When you read from a serf bucket, it will return however much you ask
> for, or as much as it has without blocking. When it gives you that
> data, it can say "I have more", "I'm done", or "This is what I had
> without blocking".

Who is "you"?

Up till now, my understanding is that "you" is the core, and therefore
not under control of a module writer.

Let me put it another way. Imagine I am a cache module. I want to read
as much as possible as fast as possible from a backend, and I want to
write this data to two places simultaneously: the cache, and the
downstream network. I know the cache is always writable, but the
downstream network I am not sure of, I only want to write to the
downstream network when the downstream network is ready for me.

How would I do this in a serf model?

>> You will only run as often as you are pulled, and never more often. And
>> if the pull is controlled by how quickly the client is accepting the
>> data, which is typically orders of magnitude slower than the backend can
>> push, you have no opportunity to try to speed up the server in any way.
>
> Eh? Are you kidding me?
>
> One single network thread can manage N client connections. As each
> becomes writable, the loop reads ("pulls") from the bucket and jams it
> into the client socket. If you're really fancy, then you know what the
> window is, and you ask the bucket for that much data.

That I understand, but it makes no difference as I see it - your loop
only reads from the bucket and jams it into the client socket if the
client socket is good and ready to accept data.

If the client socket isn't good and ready, the bucket doesn't get pulled
from, and resources used by the bucket are left in limbo until the
client is done. If the bucket wants to do something clever, like cache,
or release resources early, it can't - because as soon as it returns the
data it has to wait for the client socket to be good and ready all over
again. The server runs as slow as the browser, which in computing terms
is glacially slow.

>> Push however, gives you a choice: the push either worked (yay! go
>> browser!), or it didn't (sensible alternative behaviour, like cache it
>> for later in a connection filter). Push happens as fast as the backend, not
>> as slow as the frontend.
>
> Push means that you have a worker per connection, pushing the response
> onto the network. I really would like to see us get away from a worker
> per connection.

Only if you write it that way (which we have done till now).

There is no reason why one event loop can't handle many requests at the
same time.

One event loop handling many requests each == event MPM (speed and
resource efficient, but we'd better be bug free).
Many event loops handling many requests each == worker MPM (compromise).
Many event loops handling one request each == prefork (reliable old
workhorse).

In theory if we turn the content handler into a filter and bootstrap the
filter stack with a bucket of some kind, this may work.

In fact, using both "push" and "pull" at the same time might also make
some sense - your event loop creates a bucket from which data is
"pulled" (serf model), which is in turn "pulled" by a filter stack
(existing filter stack model) and "pushed" upstream.

Functions that work better as a "pull" (proxy and friends) can be
pulled, functions that work better as a "push" (like caching) can be
filters.

Regards,
Graham
--


nikke at acc

Nov 10, 2009, 2:29 AM

Post #38 of 69 (930 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, 9 Nov 2009, Graham Leggett wrote:

> Akins, Brian wrote:
>
>> FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I
>> think this is what perlbal does as well. Same can be done outside apache
>> using X-sendfile like methods. Seems like we could move this "inside"
>> apache fairly easily. Maybe we can do it with a filter. I tried once and got
>> it to filter "most" backend stuff to a temp file, but it tended to miss and
>> block. That was a while ago, and I haven't learned enough about the
>> filters since then to think it would work any better.
>>
>> Maybe a mod_buffer that goes to a file?
>
> mod_disk_cache can be made to do this quite trivially (it's on the list
> of things to do When I Have Time(TM)).
>
> In theory, a mod_disk_buffer could do this quite easily, on condition
> upstream writes didn't block.

I'm guessing that this would be the good-looking implementation of my
ugly-but-working making-disk-cache-work-for-large-files patchset
(version for 2.2.9 at
https://issues.apache.org/bugzilla/show_bug.cgi?id=39380, I'm in the
process of respinning it for 2.2.14 but ENOTIME makes testing slow).

The main issue I had when cobbling that together was to deal with the
fact that stuff<tm> wants to block, and it really isn't obvious in the
current httpd core how to do this nicely when you have a one-to-many
situation.

As you might remember, I "solved" it by spawning a thread to deal with
caching files in the background when needed. Since our use case is
delivering static files it works, but it sure would be nice with an
infrastructure that tried to help you instead of being damn near
hostile at times.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | nikke [at] acc
---------------------------------------------------------------------------
Quantum Trek: Time travel with a twist!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Brian.Akins at turner

Nov 10, 2009, 8:14 AM

Post #39 of 69 (932 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On 11/9/09 3:08 PM, "Greg Stein" <gstein [at] gmail> wrote:

> 2) If you have 10,000 client connections, and some number of sockets
> in the system ready for read/write... how do you quickly determine
> *which* buckets to poll to get those sockets processed? You don't want
> to poll 9999 idle connections/buckets if only one is ready for
> read/write.

Epoll/kqueue/etc. takes care of that for you.

--
Brian Akins


minfrin at sharp

Nov 10, 2009, 8:37 AM

Post #40 of 69 (928 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Paul Querna wrote:

> But Serf Buckets and the event loop definitely do need some more work
> -- simple things, like if the backend bucket is a socket, how do you
> tell the event loop, that a would block rvalue maps to a file
> descriptor talking to an origin server. You don't want to just keep
> looping over it until it returns data, you want to poll on the origin
> socket, and only try to read when data is available.

I think it can probably be generally stated that every request processed
by the server has N descriptors associated with that request (instead of
1 descriptor, in the current code).

In the case of a simple file transfer, there are two descriptors, one
belonging to the file, the other belonging to the network socket.

In the case of a proxy, one socket belongs to the backend connection,
and the other belongs to a frontend network socket.

And descriptors might need to be polled for read, or for write, or both
(SSL).

If a mechanism existed whereby all descriptors associated with a request
could be given to the event loop, we could be completely asynchronous
throughout the server, from the reading from the backend, to the writing
to the frontend.

Regards,
Graham
--


minfrin at sharp

Nov 10, 2009, 8:37 AM

Post #41 of 69 (928 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Greg Stein wrote:

>> I am also concerned about the patterns of sendfile() in the current
>> serf bucket architecture, and making a whole pipeline do sendfile
>> correctly seems quite difficult.
>
> Well... it generally *is* quite difficult in the presence of SSL,
> gzip, and chunking. Invariably, content is mangled before hitting the
> network, so sendfile() rarely gets a chance to play ball.

Not necessarily - a sensible cache that writes an interim response to
disk should ideally replace the current in-memory response with a
sendfile-capable file bucket.

Having done whatever filtering magic is required, the server just goes
"here kernel, give this file to the network, I'm off to serve the next
request, bye".

Regards,
Graham
--


jim at jaguNET

Nov 10, 2009, 9:01 AM

Post #42 of 69 (930 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Nov 9, 2009, at 2:19 PM, Akins, Brian wrote:

> On 11/9/09 2:06 PM, "Greg Stein" <gstein [at] gmail> wrote:
>
>> These issues are already solved by moving to a Serf core. It is fully
>> asynchronous.
>
> Okay that's one convert, any others? ;)
>

I said the same thing back on the 4th ;)

> That's what Paul and I discussed a lot last week.
>
> My ideal httpd 3.0 is:
>
> Libev + serf + lua

+1

For 3.0, I see us breaking the mold and the API in a pretty
substantial way.


gstein at gmail

Nov 10, 2009, 9:03 AM

Post #43 of 69 (929 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Mon, Nov 9, 2009 at 18:47, Graham Leggett <minfrin [at] sharp> wrote:
>...
>> When you read from a serf bucket, it will return however much you ask
>> for, or as much as it has without blocking. When it gives you that
>> data, it can say "I have more", "I'm done", or "This is what I had
>> without blocking".
>
> Who is "you"?

Anybody who reads from a bucket. In this case, the core network loop
when a client connection is ready for writing.

> Up till now, my understanding is that "you" is the core, and therefore
> not under control of a module writer.
>
> Let me put it another way. Imagine I am a cache module. I want to read
> as much as possible as fast as possible from a backend, and I want to
> write this data to two places simultaneously: the cache, and the
> downstream network. I know the cache is always writable, but the
> downstream network I am not sure of, I only want to write to the
> downstream network when the downstream network is ready for me.
>
> How would I do this in a serf model?

No module *anywhere* ever writes to the network.

The core loop reads/pulls from a bucket when it needs more data (for
writing to the network).

When your cache bucket reads from its interior bucket, it can also
drop the content into a file, off to the side. Think of this bucket as
a filter. All content that is read through it will be dumped into a
file, too.
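
A cache bucket of this kind might look like the following Python sketch. `CachingBucket` is a made-up class, in-memory `io.BytesIO` objects stand in for both the interior bucket and the cache file, and an empty read is taken to mean end of data.

```python
import io

class CachingBucket:
    """Wraps an interior source; everything read through it is also cached."""
    def __init__(self, interior, cache_file):
        self.interior = interior        # any object with read(n) -> bytes
        self.cache_file = cache_file    # any object with write(bytes)

    def read(self, requested):
        data = self.interior.read(requested)
        if data:
            self.cache_file.write(data)   # copy "off to the side"
        return data

interior = io.BytesIO(b"response body")
cache_file = io.BytesIO()
bucket = CachingBucket(interior, cache_file)

sent = b""
while True:
    chunk = bucket.read(5)      # the core loop pulls; the bucket never writes out
    if not chunk:
        break
    sent += chunk
```

Note that in this form the cache fills only as fast as the reader pulls, which is the behaviour being debated in this thread.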

>...
> That I understand, but it makes no difference as I see it - your loop
> only reads from the bucket and jams it into the client socket if the
> client socket is good and ready to accept data.
>
> If the client socket isn't good and ready, the bucket doesn't get pulled
> from, and resources used by the bucket are left in limbo until the
> client is done. If the bucket wants to do something clever, like cache,
> or release resources early, it can't - because as soon as it returns the
> data it has to wait for the client socket to be good and ready all over
> again. The server runs as slow as the browser, which in computing terms
> is glacially slow.

I'm not sure that I understand you, or that you're familiar with the
serf bucket model.

The bucket can certainly cache data as it flows through. No problem
there. Once the bucket has returned all of its data, it can close its
file handle or socket or whatever resources it may have.

Buckets are one-time use, so once it has returned all of its data, it
can throw out any resources.

And no... the server does NOT run as slow as the browser. There are N
browsers connected, and the server is processing ALL of them. One
single response bucket is running as fast as its client, sure, but the
server certainly is not idle.

>...
> One event loop handling many requests each == event MPM (speed and
> resource efficient, but we'd better be bug free).
> Many event loops handling many requests each == worker MPM (compromise).
> Many event loops handling one request each == prefork (reliable old
> workhorse).

These have no bearing. The current MPM model is based on
content-generators writing/pushing data into the network.

A serf-based model reads from content-generators.

> In theory if we turn the content handler into a filter and bootstrap the
> filter stack with a bucket of some kind, this may work.
>
> In fact, using both "push" and "pull" at the same time might also make
> some sense - your event loop creates a bucket from which data is
> "pulled" (serf model), which is in turn "pulled" by a filter stack
> (existing filter stack model) and "pushed" upstream.

That is NOT the design that Paul, Justin, and I envision. The
core is serf. So *everything* is read/pull-based.

The old-style handlers and filters get their own thread and push into
a pipe, or an in-memory data queue. The core loop uses a bucket which
reads out of that pipe.
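
The compatibility path for old-style handlers can be sketched like this in Python. It is an illustration under assumptions, not the planned design: `QueueBucket`, `legacy_handler`, and the `END` sentinel are invented, and an in-memory queue is used instead of a pipe.

```python
import queue
import threading

END = object()   # sentinel marking the end of the response

def legacy_handler(push):
    """Old-style handler: pushes output as it generates it."""
    for part in (b"<html>", b"hello", b"</html>"):
        push(part)

class QueueBucket:
    """Runs a push-style handler in its own thread; the core pulls from the queue."""
    def __init__(self, handler):
        self.chunks = queue.Queue()
        self.done = False
        self.thread = threading.Thread(target=self._run, args=(handler,))
        self.thread.start()

    def _run(self, handler):
        handler(self.chunks.put)
        self.chunks.put(END)

    def read(self):
        """Return whatever is queued right now; b'' with done False means 'nothing yet'."""
        out = []
        while not self.done:
            try:
                item = self.chunks.get_nowait()
            except queue.Empty:
                break
            if item is END:
                self.done = True
            else:
                out.append(item)
        return b"".join(out)

bucket = QueueBucket(legacy_handler)
bucket.thread.join()          # only to make the demo deterministic
body = bucket.read()
```

The handler thread can block as much as it likes; the core loop only ever calls `read`, which returns immediately.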

>...

Cheers,
-g


gstein at gmail

Nov 10, 2009, 9:10 AM

Post #44 of 69 (930 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Tue, Nov 10, 2009 at 11:14, Akins, Brian <Brian.Akins [at] turner> wrote:
> On 11/9/09 3:08 PM, "Greg Stein" <gstein [at] gmail> wrote:
>
>> 2) If you have 10,000 client connections, and some number of sockets
>> in the system ready for read/write... how do you quickly determine
>> *which* buckets to poll to get those sockets processed? You don't want
>> to poll 9999 idle connections/buckets if only one is ready for
>> read/write.
>
> Epoll/kqueue/etc. takes care of that for you.

Sorry. I wasn't clear.

You have 10k buckets representing the response for 10k clients. The
core loop reads the response from the bucket, and writes that to the
network.

Now. A client socket wakes up as writable. I think it is pretty easy
to say "read THAT bucket" to get data for writing.

Consider the scenario where one of those responses is proxied -- it is
arriving from a backend origin server. That underlying read-socket is
stuffed into the core loop. When that read-socket becomes available
for reading, *which* client response bucket do you start reading from?
And what happens if the client socket is not writable?

You could just zip thru the 10k response buckets and poll each one for
data to read, and the serf design states that the underlying
read-socket *will* get read. But you've gotta do a lot of polling to
get there.

I think that will be an interesting problem to solve. I believe it
would be something like this:

Consider when a request arrives. The core looks at the Request-URI and
the Headers. From these inputs, it determines the appropriate
response. In this case, that response is identified by a bucket,
configured with those inputs. (and somewhere in here, any Request-Body
is managed; but ignore that for now) As that response bucket is
constructed, along with all interior/nested buckets, that construction
can say "I've got an FD here. Please add this to the core loop." The
FD would be added, and would then be associated with the response
bucket, so we know which to read when the FD wakes up.
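
One way to picture that association, as a rough sketch only. It uses
Python's selectors module in place of whatever the real core loop
would use, and the names CoreLoop, add_fd and ResponseBucket are
hypothetical. The point is that the FD is registered together with
the response bucket it belongs to, so a wakeup identifies the bucket
directly instead of requiring a scan over all 10k.

```python
import selectors
import socket


class ResponseBucket:
    """Hypothetical stand-in for the outermost bucket of one client response."""

    def __init__(self, name):
        self.name = name


class CoreLoop:
    def __init__(self):
        self._sel = selectors.DefaultSelector()

    def add_fd(self, sock, response):
        # Called while the response bucket (and its nested buckets) is
        # being constructed: "I've got an FD here. Please add this to
        # the core loop."
        self._sel.register(sock, selectors.EVENT_READ, data=response)

    def remove_fd(self, sock):
        self._sel.unregister(sock)

    def ready_responses(self, timeout=0):
        # Only the responses whose backend FD woke up are returned, so
        # the loop never has to poll the idle ones.
        return [key.data for key, _events in self._sel.select(timeout)]

    def close(self):
        self._sel.close()
```

This sketch does not address the second question above, namely what
to do when the backend FD is readable but the client socket is not
writable.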

Cheers,
-g


gstein at gmail

Nov 10, 2009, 9:12 AM

Post #45 of 69 (929 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Tue, Nov 10, 2009 at 12:01, Jim Jagielski <jim [at] jagunet> wrote:
> On Nov 9, 2009, at 2:19 PM, Akins, Brian wrote:
>> On 11/9/09 2:06 PM, "Greg Stein" <gstein [at] gmail> wrote:
>>
>>> These issues are already solved by moving to a Serf core. It is fully
>>> asynchronous.
>>
>> Okay that's one convert, any others? ;)

Convert? Bah. Justin and myself *started* serf. I'm rather biased, and
have never been a simple convert. Messiah, maybe. ;-)

>> That's what Paul and I discussed a lot last week.
>>
>> My ideal httpd 3.0 is:
>>
>> Libev + serf + lua
>
> +1
>
> For 3.0, I see us breaking the mold and the API in a pretty
> substantial way.

+1 and ditto.

(tho I think we can provide for old handlers thru the pipe mechanism I
described earlier on this thread)

Cheers,
-g


minfrin at sharp

Nov 10, 2009, 9:54 AM

Post #46 of 69 (928 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

Greg Stein wrote:

>> Who is "you"?
>
> Anybody who reads from a bucket. In this case, the core network loop
> when a client connection is ready for writing.

So would it be correct to say that in this theoretical httpd, the httpd
core, and nobody else, would read from the serf bucket?

>> Up till now, my understanding is that "you" is the core, and therefore
>> not under control of a module writer.
>>
>> Let me put it another way. Imagine I am a cache module. I want to read
>> as much as possible as fast as possible from a backend, and I want to
>> write this data to two places simultaneously: the cache, and the
>> downstream network. I know the cache is always writable, but the
>> downstream network I am not sure of, I only want to write to the
>> downstream network when the downstream network is ready for me.
>>
>> How would I do this in a serf model?
>
> No module *anywhere* ever writes to the network.
>
> The core loop reads/pulls from a bucket when it needs more data (for
> writing to the network).
>
> When your cache bucket reads from its interior bucket, it can also
> drop the content into a file, off to the side. Think of this bucket as
> a filter. All content that is read through it will be dumped into a
> file, too.

Makes sense, but what happens when the cache has finished reading the
interior bucket after the first pass through the code?

At this point, my cache needs to make a decision, and before it can make
that decision it wants to know whether upstream is capable of swallowing
the data right now without blocking.

If the answer is yes, I cache the data and pass the data upstream and
wait to be called again immediately, because I know upstream won't block.

If the answer is no, I *don't* pass data upstream (because it would
block from my perspective), and I read from the interior bucket again,
cache some more, and then ask again whether to pass the two data chunks
upstream.

How does my cache get the answer to its question?

And how does my cache code know when it is safe to read from the
interior bucket without blocking?

>> That I understand, but it makes no difference as I see it - your loop
>> only reads from the bucket and jams it into the client socket if the
>> client socket is good and ready to accept data.
>>
>> If the client socket isn't good and ready, the bucket doesn't get pulled
>> from, and resources used by the bucket are left in limbo until the
>> client is done. If the bucket wants to do something clever, like cache,
>> or release resources early, it can't - because as soon as it returns the
>> data it has to wait for the client socket to be good and ready all over
>> again. The server runs as slow as the browser, which in computing terms
>> is glacially slow.
>
> I'm not sure that I understand you, and that you're familiar with the
> serf bucket model.

You are 100% right, I am not completely familiar with the serf bucket
model, which is why I'm asking these questions.

I figure there are no better people to explain how serf works than they
who wrote serf ;)

> The bucket can certainly cache data as it flows through. No problem
> there. Once the bucket has returned all of its data, it can close its
> file handle or socket or whatever resources it may have.
>
> Buckets are one-time use, so once it has returned all of its data, it
> can throw out any resources.
>
> And no... the server does NOT run as slow as the browser. There are N
> browsers connected, and the server is processing ALL of them. One
> single response bucket is running as fast as its client, sure, but the
> server certainly is not idle.

That isn't what I meant.

Imagine a big, bloated, expensive application server, the kind that's
typically built by the lowest bidder.

Imagine this server is fronted by an httpd reverse proxy.

Imagine at the end of the chain, there is a glacially slow (in computing
terms) browser waiting to consume the response.

A request is processed, and the httpd proxy receives an EOS from the big
bloated application server. Ideally it wants to drop the backend
connection ASAP, no point hanging around, but it can't, because the
cleanup for the backend connection is tied to the pool from the request.
And the request pool is only complete when the last byte of the request
has been finally acknowledged by the glacially slow browser.

So httpd, and the big bloated expensive application server, sit around
waiting, waiting and waiting with memory allocated, database connections
left open, for the browser to finally say "got it, gimme some more"
before httpd's event loop goes "that was it,
apr_pool_destroy(serf_bucket->pool), next!".

And the reason why this happened was that all of this was driven by the
core's event loop, timed against the speed of the glacially slow browser.

Obviously a second browser next door is being serviced at the same time as
you pointed out, but it too waits, waits, waits for that browser to
eventually acknowledge the end of the request.

This is the reason why people are sticking things like varnish caches
between their servers and the browsers - because the backend can't
terminate early.

I don't believe httpd v3.0 gives us any value if it suffers this same
limitation suffered by httpd v2.x.

I can see us solving this problem simply by making the filter stack
non-blocking, and by making content generators event driven. I don't see a
need to rewrite the server.

>> One event loop handling many requests each == event MPM (speed and
>> resource efficient, but we'd better be bug free).
>> Many event loops handling many requests each == worker MPM (compromise).
>> Many event loops handling one request each == prefork (reliable old
>> workhorse).
>
> These have no bearing. The current MPM model is based on
> content-generators writing/pushing data into the network.
>
> A serf-based model reads from content-generators.

So httpd's event loop reads from a buggy leaky interior bucket.

How does the server protect itself from becoming unstable?

Regards,
Graham
--


gstein at gmail

Nov 10, 2009, 10:56 AM

Post #47 of 69 (931 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Tue, Nov 10, 2009 at 12:54, Graham Leggett <minfrin [at] sharp> wrote:
> Greg Stein wrote:
>
>>> Who is "you"?
>>
>> Anybody who reads from a bucket. In this case, the core network loop
>> when a client connection is ready for writing.
>
> So would it be correct to say that in this theoretical httpd, the httpd
> core, and nobody else, would read from the serf bucket?

Correct. That bucket represents the response to the client, and only
the core reads that.

>...
>> No module *anywhere* ever writes to the network.
>>
>> The core loop reads/pulls from a bucket when it needs more data (for
>> writing to the network).
>>
>> When your cache bucket reads from its interior bucket, it can also
>> drop the content into a file, off to the side. Think of this bucket as
>> a filter. All content that is read through it will be dumped into a
>> file, too.
>
> Makes sense, but what happens when the cache has finished reading the
> interior bucket after the first pass through the code?

If the interior has returned EOF, then the caching bucket can destroy
it, if it likes.

> At this point, my cache needs to make a decision, and before it can make
> that decision it wants to know whether upstream is capable of swallowing
> the data right now without blocking.

No no... the core only asked for as much as it can handle. You return
*no more* than that. It isn't your problem to make blocking decisions
for the reader of your bucket.

If you read more from the interior than the caller wants from you,
then that's your problem :-) You need to hold that in memory, dump it
to disk, or ... dunno.
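
A small sketch of a caching bucket that follows this rule, under
stated assumptions: the read(max_bytes) signature, the status names
OK/AGAIN/EOF and the classes BytesBucket and CacheBucket are made up
for illustration and are not the serf API. The bucket asks its
interior for no more than its own caller asked for, so it never ends
up holding surplus data.

```python
import io

OK, AGAIN, EOF = "ok", "again", "eof"  # hypothetical status names


class BytesBucket:
    """Interior bucket backed by an in-memory byte string."""

    def __init__(self, data):
        self._data = data

    def read(self, max_bytes):
        chunk, self._data = self._data[:max_bytes], self._data[max_bytes:]
        return chunk, (OK if self._data else EOF)


class CacheBucket:
    """Wraps an interior bucket and copies everything read through it to a cache."""

    def __init__(self, interior, cache_file):
        self._interior = interior
        self._cache = cache_file

    def read(self, max_bytes):
        if self._interior is None:
            return b"", EOF
        # Ask the interior for no more than the caller asked of us.
        chunk, status = self._interior.read(max_bytes)
        self._cache.write(chunk)
        if status == EOF:
            # The interior is exhausted; drop it right away.
            self._interior = None
        return chunk, status
```

Usage: wrap the backend bucket once, then let the reader drive it,
for example `CacheBucket(BytesBucket(b"abcdefgh"), io.BytesIO())`.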

> If the answer is yes, I cache the data and pass the data upstream and
> wait to be called again immediately, because I know upstream won't block.
>
> If the answer is no, I *don't* pass data upstream (because it would
> block from my perspective), and I read from the interior bucket again,
> cache some more, and then ask again whether to pass the two data chunks
> upstream.

Again: you don't make that decision. You just return what the caller
asked you for. It may decide to call you again, but that isn't up to
you.

If you return "this is all I have for you right now", then it won't
call you again until some (network) event occurs which may provide
more data for reading.

If you return EOF, then it shouldn't call you again, tho I believe our
rules state that if it *does*, then just return EOF again.

> How does my cache get the answer to its question?
>
> And how does my cache code know when it is safe to read from the
> interior bucket without blocking?

Buckets *never* block. The interior bucket will give you data saying
"I have more", give you data saying "I have no more right now", or say
"no more" (EOF). But in no case should it ever block.

(note: we do "block" on reading a file, but if we had portable async
I/O file operations, then we'd switch to those)
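
The three outcomes can be sketched from the reader's side as follows.
This is an illustration only: OK/AGAIN/EOF are hypothetical names for
"I have more", "I have no more right now" and "no more", and
ScriptedBucket and pump are invented for the example.

```python
OK, AGAIN, EOF = "ok", "again", "eof"  # hypothetical status names


class ScriptedBucket:
    """Test bucket that replays a fixed list of (data, status) results."""

    def __init__(self, steps):
        self._steps = list(steps)

    def read(self, max_bytes):
        # max_bytes is ignored here for brevity; a real bucket would honor it.
        if not self._steps:
            return b"", EOF  # reading past EOF just reports EOF again
        return self._steps.pop(0)


def pump(bucket, sink, max_bytes=8192):
    """Read until the bucket has nothing more right now.

    Returns True once the bucket has reached EOF, False if the caller
    should go back to the event loop and wait for another event.
    """
    while True:
        data, status = bucket.read(max_bytes)
        if data:
            sink.append(data)
        if status == EOF:
            return True
        if status == AGAIN:
            return False
```

Note that data may arrive together with AGAIN or EOF, so the reader
has to consume the data before acting on the status.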

>...
> I figure there are no better people to explain how serf works than they
> who wrote serf ;)

Happy to. Unfortunately, we have a dearth of documentation :-(

Hopefully, this thread will help to educate several (httpd) developers
on the serf model.

>...
> Imagine a big, bloated, expensive application server, the kind that's
> typically built by the lowest bidder.
>
> Imagine this server is fronted by an httpd reverse proxy.
>
> Imagine at the end of the chain, there is a glacially slow (in computing
> terms) browser waiting to consume the response.
>
> A request is processed, and the httpd proxy receives an EOS from the big
> bloated application server. Ideally it wants to drop the backend
> connection ASAP, no point hanging around, but it can't, because the
> cleanup for the backend connection is tied to the pool from the request.
> And the request pool is only complete when the last byte of the request
> has been finally acknowledged by the glacially slow browser.
>
> So httpd, and the big bloated expensive application server, sit around
> waiting, waiting and waiting with memory allocated, database connections
> left open, for the browser to finally say "got it, gimme some more"
> before httpd's event loop goes "that was it,
> apr_pool_destroy(serf_bucket->pool), next!".

Okay. The bucket system is different. We have a somewhat-confusing
blend between explicit and region-based freeing. If you're done with a
bucket, then kill it. Don't wait for the pool to be cleared.

In your above scenario, the reverse-proxy-bucket can kill the
socket-bucket once the latter returns EOF, and that will drop the
connection.

Now... all that said, the above scenario is a bit problematic. If the
appserver returns 2G of content to the frontend server, then where does
it go? Any type of bucket that reads-to-EOF is going to have to spool
its results somewhere (memory or disk). Otherwise, you keep a
small-ish read buffer in memory and you stream through the buffer at
whatever read-rate your caller is providing (potentially the client
browser's speed).
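
A sketch of the read-to-EOF option, with the trade-off visible in the
code. Everything here is an assumption made for illustration:
BackendBucket, SpoolBucket, fill, destroy and the OK/AGAIN/EOF
statuses are invented names, and Python's
tempfile.SpooledTemporaryFile stands in for "memory or disk". The
backend is released as soon as it reports EOF, while the client keeps
reading from the spool at its own pace.

```python
import io
import tempfile

OK, AGAIN, EOF = "ok", "again", "eof"  # hypothetical status names


class BackendBucket:
    """Hypothetical backend connection; `closed` flips when it is released."""

    def __init__(self, data):
        self._data = data
        self.closed = False

    def read(self, max_bytes):
        chunk, self._data = self._data[:max_bytes], self._data[max_bytes:]
        return chunk, (OK if self._data else EOF)

    def destroy(self):
        self.closed = True


class SpoolBucket:
    """Drains the backend eagerly into a spool, then serves reads from the spool."""

    def __init__(self, interior, memory_limit=64 * 1024):
        self._interior = interior
        # Held in memory up to memory_limit bytes, then rolled over to disk.
        self._spool = tempfile.SpooledTemporaryFile(max_size=memory_limit)
        self._size = 0
        self._read_pos = 0

    def fill(self, chunk_size=8192):
        """Pull from the backend until EOF (or no data right now), then release it."""
        while self._interior is not None:
            data, status = self._interior.read(chunk_size)
            self._spool.seek(0, io.SEEK_END)
            self._spool.write(data)
            self._size += len(data)
            if status == EOF:
                self._interior.destroy()  # backend is free long before the client is done
                self._interior = None
            elif status == AGAIN:
                break

    def read(self, max_bytes):
        self._spool.seek(self._read_pos)
        chunk = self._spool.read(max_bytes)
        self._read_pos += len(chunk)
        if self._read_pos < self._size:
            return chunk, OK
        return chunk, (EOF if self._interior is None else AGAIN)

    def close(self):
        self._spool.close()
```

For a 2G response the spool is exactly the cost described above; the
alternative is to skip fill() entirely and stream through a small
buffer at the reader's speed.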

>...
> I can see us solving this problem simply by making the filter stack
> non-blocking, and by making content generators event driven. I don't see a
> need to rewrite the server.

I'd like to see the serf model right at the core. A fully async model
can work with synchronous models (such as the current "handler"
push/write mechanism), but it is much harder to start synchronous and
somehow get async benefits.

>>> One event loop handling many requests each == event MPM (speed and
>>> resource efficient, but we'd better be bug free).
>>> Many event loops handling many requests each == worker MPM (compromise).
>>> Many event loops handling one request each == prefork (reliable old
>>> workhorse).
>>
>> These have no bearing. The current MPM model is based on
>> content-generators writing/pushing data into the network.
>>
>> A serf-based model reads from content-generators.
>
> So httpd's event loop reads from a buggy leaky interior bucket.
>
> How does the server protect itself from becoming unstable?

For robustness, I think we'd continue to have an N x M process/thread model.

But it gets a little bit more complicated. There are two types of
threads: network threads, and response-creation threads. The latter is
more like the worker threads we see today. A request comes in, and is
passed to a handy/available response-creation thread. That creates the
nest of buckets, resulting in a single response-bucket. The
response-bucket is passed off to a network thread, and the
response-creation thread returns to the pool.

On the network threads, CPU is used as the buckets build/compute the
response in realtime. Sure, some might be "read from this file
descriptor" or "here is some static text" and will have little CPU.
But some buckets might be performing gzip or SSL encryption. That
consumes CPU within the network thread. Thus, you really want a pool
of network threads, too. I'm not really sure what the balancing
algorithm is. Maybe as the core iterates over available client sockets
to write, it pulls a thread off the pool and has it do the read/write.
That allows a bucket-read to consume CPU without blocking the core
network loop.
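
The two kinds of thread pool could look roughly like this. It is a
toy model, not a proposal for the balancing algorithm: the names
DeflateBucket, create_response, write_response and serve are
hypothetical, zlib compression stands in for "gzip or SSL", and
write_response drains the bucket to completion where a real network
thread would write only as much as the client socket accepts.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class DeflateBucket:
    """CPU-heavy bucket: compresses the body chunks as they are read."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self._z = zlib.compressobj()
        self._done = False

    def read(self):
        """Return (data, at_eof). Compression runs on the reader's thread."""
        if self._done:
            return b"", True
        if self._chunks:
            return self._z.compress(self._chunks.pop(0)), False
        self._done = True
        return self._z.flush(), True


def create_response(body_chunks):
    # Runs on a response-creation thread: only builds the bucket nest.
    return DeflateBucket(body_chunks)


def write_response(bucket):
    # Runs on a network thread: pulls from the bucket (spending CPU on
    # compression) and "writes" the result; here it just collects bytes.
    out = bytearray()
    at_eof = False
    while not at_eof:
        data, at_eof = bucket.read()
        out += data
    return bytes(out)


def serve(requests, creators=2, writers=2):
    """Build every response on one pool, then read/write them on another."""
    with ThreadPoolExecutor(creators) as create_pool, \
            ThreadPoolExecutor(writers) as network_pool:
        buckets = list(create_pool.map(create_response, requests))
        return list(network_pool.map(write_response, buckets))
```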

Cheers,
-g


svnlgo at mobsol

Nov 10, 2009, 1:33 PM

Post #48 of 69 (925 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Tue, Nov 10, 2009 at 6:10 PM, Greg Stein <gstein [at] gmail> wrote:
> On Tue, Nov 10, 2009 at 11:14, Akins, Brian <Brian.Akins [at] turner> wrote:
>> On 11/9/09 3:08 PM, "Greg Stein" <gstein [at] gmail> wrote:
>>
>>> 2) If you have 10,000 client connections, and some number of sockets
>>> in the system ready for read/write... how do you quickly determine
>>> *which* buckets to poll to get those sockets processed? You don't want
>>> to poll 9999 idle connections/buckets if only one is ready for
>>> read/write.
>>
>> Epoll/kqueue/etc. Takes care of that for you.
>
> Sorry. I wasn't clear.
>
> You have 10k buckets representing the response for 10k clients. The
> core loop reads the response from the bucket, and writes that to the
> network.
>
> Now. A client socket wakes up as writable. I think it is pretty easy
> to say "read THAT bucket" to get data for writing.
>
> Consider the scenario where one of those responses is proxied -- it is
> arriving from a backend origin server. That underlying read-socket is
> stuffed into the core loop. When that read-socket becomes available
> for reading, *which* client response bucket do you start reading from?
> And what happens if the client socket is not writable?
>
> You could just zip thru the 10k response buckets and poll each one for
> data to read, and the serf design states that the underlying
> read-socket *will* get read. But you've gotta do a lot of polling to
> get there.
>
> I think that will be an interesting problem to solve. I believe it
> would be something like this:
>
> Consider when a request arrives. The core looks at the Request-URI and
> the Headers. From these inputs, it determines the appropriate
> response. In this case, that response is identified by a bucket,
> configured with those inputs. (and somewhere in here, any Request-Body
> is managed; but ignore that for now)  As that response bucket is
> constructed, along with all interior/nested buckets, that construction
> can say "I've got an FD here. Please add this to the core loop." The
> FD would be added, and would then be associated with the response
> bucket, so we know which to read when the FD wakes up.
>
Suppose this is the diagram of the proxy scenario, where A and B are
buckets wrapping the socket bucket:

browser --> (client fd) [core loop] [A [B [socket bucket (server fd)]]] <-- server

If there's an event on the client fd, the core loop can read bytes
from bucket A - as much as the client socket can handle.

But if only the server fd wakes up, the core loop can't really read
anything as it has nowhere to forward the data to.
The best thing it can do, is tell bucket A: somewhere deep down
there's data to read and considering I (the core loop) was alerted of
that fact there must be one of the other buckets B, C.. interested in
buffering/proactively transforming that data, so please forward this
trigger.

I don't think the buckets interface already has a function for that,
but something similar to 'read 0 bytes' would do.
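
To make the idea concrete, here is a sketch of such a trigger being
forwarded down the chain. The method name nudge() is invented for
this example (the text above only says "something similar to 'read 0
bytes'"), as are the three bucket classes; an empty read result means
"nothing right now".

```python
class SocketBucket:
    """Innermost bucket; `incoming` stands in for bytes waiting on the server fd."""

    def __init__(self):
        self.incoming = []

    def read(self):
        return self.incoming.pop(0) if self.incoming else b""

    def nudge(self):
        pass  # nothing below this bucket to forward the trigger to


class BufferingBucket:
    """Bucket B: wants to soak up backend data even when nobody is reading."""

    def __init__(self, interior):
        self._interior = interior
        self.buffered = bytearray()

    def nudge(self):
        # Proactively drain whatever the interior has right now.
        while True:
            data = self._interior.read()
            if not data:
                break
            self.buffered += data

    def read(self):
        self.nudge()
        data, self.buffered = bytes(self.buffered), bytearray()
        return data


class PassThroughBucket:
    """Bucket A: not interested itself, so it only forwards the trigger."""

    def __init__(self, interior):
        self._interior = interior

    def nudge(self):
        self._interior.nudge()

    def read(self):
        return self._interior.read()
```

When only the server fd wakes up, the core loop would call nudge() on
A without reading anything itself; B then buffers the data until the
client fd becomes writable.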

So, did I understand your proposal correctly?

Lieven


Brian.Akins at turner

Nov 10, 2009, 2:30 PM

Post #49 of 69 (918 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On 11/10/09 1:56 PM, "Greg Stein" <gstein [at] gmail> wrote:


> But some buckets might be performing gzip or SSL encryption. That
> consumes CPU within the network thread.

You could just run some multiple of the number of CPU cores as "network"
threads. You can't use more than 100% of a CPU anyway.

The model that some of us discussed -- Greg, you may have invented it ;) --
was to have a small pool of acceptor threads (maybe just one) and a pool of
"worker" threads. The acceptor threads accept connections and move them into
worker threads - that's it. A single fd is then entirely owned by that
worker thread until it (the fd) goes away - network/disk io, gzip, ssl, etc.
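
A toy version of that acceptor/worker split, for illustration only.
Strings stand in for accepted connections, upper-casing stands in for
the per-connection work (network/disk io, gzip, ssl), and the
round-robin hand-off and the names Worker and acceptor are
assumptions, not part of the model as discussed.

```python
import queue
import threading


class Worker(threading.Thread):
    """Owns every connection handed to it until that connection goes away."""

    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.inbox = queue.Queue()
        self.handled = []

    def run(self):
        while True:
            conn = self.inbox.get()
            if conn is None:  # shutdown marker from the acceptor
                return
            # All work for this connection happens on this thread.
            self.handled.append(conn.upper())


def acceptor(connections, workers):
    """Accept connections and move them into worker threads - that's it."""
    for i, conn in enumerate(connections):
        workers[i % len(workers)].inbox.put(conn)
    for w in workers:
        w.inbox.put(None)
```

Because a connection never changes threads after the hand-off, the
per-connection state needs no locking in this sketch.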


--
Brian Akins


gstein at gmail

Nov 10, 2009, 3:13 PM

Post #50 of 69 (918 views)
Permalink
Re: Httpd 3.0 or something else [In reply to]

On Tue, Nov 10, 2009 at 16:33, Lieven Govaerts <svnlgo [at] mobsol> wrote:
> On Tue, Nov 10, 2009 at 6:10 PM, Greg Stein <gstein [at] gmail> wrote:
>...
>> You have 10k buckets representing the response for 10k clients. The
>> core loop reads the response from the bucket, and writes that to the
>> network.
>>
>> Now. A client socket wakes up as writable. I think it is pretty easy
>> to say "read THAT bucket" to get data for writing.
>>
>> Consider the scenario where one of those responses is proxied -- it is
>> arriving from a backend origin server. That underlying read-socket is
>> stuffed into the core loop. When that read-socket becomes available
>> for reading, *which* client response bucket do you start reading from?
>> And what happens if the client socket is not writable?
>>
>> You could just zip thru the 10k response buckets and poll each one for
>> data to read, and the serf design states that the underlying
>> read-socket *will* get read. But you've gotta do a lot of polling to
>> get there.
>>
>> I think that will be an interesting problem to solve. I believe it
>> would be something like this:
>>
>> Consider when a request arrives. The core looks at the Request-URI and
>> the Headers. From these inputs, it determines the appropriate
>> response. In this case, that response is identified by a bucket,
>> configured with those inputs. (and somewhere in here, any Request-Body
>> is managed; but ignore that for now)  As that response bucket is
>> constructed, along with all interior/nested buckets, that construction
>> can say "I've got an FD here. Please add this to the core loop." The
>> FD would be added, and would then be associated with the response
>> bucket, so we know which to read when the FD wakes up.
>>
> Suppose this is the diagram of the proxy scenario, where A and B are
> buckets wrapping the socket bucket:
>
> browser --> (client fd) [core loop] [A [B [socket bucket (server fd)]]] <-- server
>
> If there's an event on the client fd, the core loop can read bytes
> from bucket A - as much as the client socket can handle.

Right, and right.

> But if only the server fd wakes up,  the core loop can't really read
> anything as it has nowhere to forward the data to.
> The best thing it can do, is tell bucket A: somewhere deep down
> there's data to read and considering I (the core loop) was alerted of
> that fact there must be one of the other buckets B, C.. interested in
> buffering/proactively transforming that data, so please forward this
> trigger.

Buckets have a peek() function.

Hmm. Theoretically, the bucket is *empty* of contents, or you would
not have returned to the event loop. Thus, when the peek() rolls
around, the bucket is going to figure out what it can provide without
blocking.

But... the buckets were designed for client-side operation. Buckets
are supposed to be emptied completely. That isn't true on the server:
the client socket might not be available for writing, so we don't
empty a response bucket to completion.

It does sound like something more may be needed, in order to propagate
some reading down the stack of buckets. But there is also a worry of:
if we read, then where do we put that, if the network isn't ready for
writing?

These read/status/nesting/etc. concepts exist in order to prevent
deadlocks. Ideally, *everything* is read and written to completion. An
appserver might not be able to provide you with more content, until
you give it something first. So the trick is to flush all writes, and
to flush all reads (because the latter might signal another write in
order to continue generating content... ad nauseam).

> I don't think the buckets interface already has a function for that,
> but something similar to 'read 0 bytes' would do.
>
> So, did I understand your proposal correctly?

Yes. But we may have some refining to do, as you've raised, and
looking more closely at the flows.

Cheers,
-g
