
phk at phk
Feb 16, 2006, 3:09 AM
Post #3 of 8
In message <65058.193.213.34.102.1140050754.squirrel at denise.vg.no>, "Anders Berg" writes:

Let me just try to see if I can express the overall threading strategy
I have formed without using a whiteboard:

The [...] indicates which thread we're in.

[acceptor] Incoming connections are handled by acceptfilters in a
single thread or, if acceptfilters are not available, with a
single-threaded poll loop.

[acceptor] Once a full HTTP request has been gathered, the URL is
hashed and looked up to see if we have a hit or not.

[acceptor] If we have a hit, and the object is in a "ready" state,
a thread is pulled off the "sender" queue and given the request to
complete.

[sender] The object will be shipped out according to its state (it
may still be arriving from the backend) and the HTTP headers.
sendfile will be used if at all possible. Once done, the fd will be
sent back to the acceptor if not closed {can we engage acceptfilters
again ?}
{We may ($config) engage in compression here and in such case we
would embellish the object with the compressed version (up front) so
it can be reused by other senders.}

[acceptor] If we have a hit, but the object is not in a "ready"
state (for instance we are trying to get the object from the backend,
but haven't received any of it yet), the request is parked on the
object.

[acceptor] If we have no hit, the header needs to be analyzed (URL
cleanup, rewriting, negative lookup etc etc). We could use a "sender"
thread to do this, but I would rather not, in order to limit the
amount of potentially expensive work we do here. My initial thought
therefore is to put the request into a queue to be dealt with by the
"backend" threads.

[backend] These threads will look for two kinds of work in order of
priority: requests that need analysing and objects nearing
expiration.

[backend] Requests needing analysis are chewed upon according to the
configured rules and one of four outcomes is possible:

[backend] Invalid request.
Grab a "sender" and ship out a static error-object.

[backend] Rematched request (after analysis it matches an existing
object): treat like the acceptor would for a hash hit. If
configuration allows: add new hash entry to put this URL on fast
track in the future.

[backend] Unmatched request, cacheable (glob/regexp matching).
Create object, queue request on it. Add hash entry. Initiate fetch
from backend. When HTTP header arrives, set expiry on object
accordingly. Once some data has arrived, grab sender and pass it the
object (NB: not the request). Receive full object.

[backend] Unmatched request, uncacheable (glob/regexp matching).
Create (transient) object. Initiate fetch from backend. Once some
data has arrived, grab sender thread and pass it object. Receive
full object.

[backend] Near-expiry objects: Once an object nears expiry (defined
by config) it is eligible for refresh. A backend thread will
determine if the object is important enough (defined by config)
compared to current backend responsiveness to be refreshed. If it
is, a GET request is sent to the backend. (I'm not sure optimizing
with a HEAD is worth much here, maybe a hybrid strategy: if the
object has been refreshed before and a GET was necessary more often
than not, then do GET, otherwise try HEAD first.)

[sender] When passed object: If only one request is queued on the
object, behave as if passed that request. If more than one request
is queued, grab a sender for each and pass that request.

[sender] On transient object: Destroy object after transmission.

[any] If on attempting to pull a sender off the queue, none is
available, the request or object is queued instead.

[overseer] Monitor number of sender threads and create/destroy them
as appropriate. Sender threads go back to the front of the queue
(for cache efficiency reasons) and if they linger in the tail of the
queue doing nothing for more than $config seconds, they get killed
off.
[overseer] Monitor backend responsiveness based on backend thread
statistics. Switch between various policy states accordingly.

[master] Handle requests coming in via $channel from janitor
process.

... or something like that.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
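
As a rough illustration of the acceptor's dispatch decision described in the message, here is a minimal Python sketch. Everything named here (`CacheObject`, `Acceptor`, `dispatch`, `release`, the string return values) is hypothetical and not taken from any actual code; plain in-memory queues stand in for real threads and sockets. It only shows the hit / parked / miss routing, the rule that work is queued when no sender is free, and senders being reused most-recently-returned first.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CacheObject:
    """Hypothetical cache entry; `ready` means it can be shipped to clients."""
    url: str
    ready: bool = False
    parked: list = field(default_factory=list)  # requests waiting on this object


class Acceptor:
    """Toy model of the [acceptor] routing rules (assumed structure)."""

    def __init__(self, idle_senders):
        self.cache = {}                      # URL -> CacheObject (dict does the hashing)
        self.idle_senders = deque(idle_senders)
        self.sender_backlog = deque()        # work waiting for a free sender
        self.backend_queue = deque()         # misses awaiting analysis by [backend]

    def release(self, sender):
        # A finished sender goes back to the end we pop from, so the most
        # recently used sender is handed out again first.
        self.idle_senders.append(sender)

    def dispatch(self, url):
        obj = self.cache.get(url)
        if obj is None:
            # No hit: defer the potentially expensive analysis to [backend].
            self.backend_queue.append(url)
            return "backend"
        if not obj.ready:
            # Hit, but nothing usable has arrived yet: park the request.
            obj.parked.append(url)
            return "parked"
        if self.idle_senders:
            # Hit on a ready object: hand the request to a sender.
            return f"sender:{self.idle_senders.pop()}"
        # [any] No sender available: queue the request instead.
        self.sender_backlog.append(url)
        return "queued"


acc = Acceptor(["s1"])
acc.cache["/a"] = CacheObject("/a", ready=True)
acc.cache["/b"] = CacheObject("/b")
results = [acc.dispatch(u) for u in ("/a", "/a", "/b", "/c")]
```

In this sketch the second request for `/a` is queued because the only sender is busy; calling `acc.release("s1")` would make it available again.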