
gunther at extropia
Feb 16, 2000, 3:41 AM
Post #1 of 1
Apache::Session-- how does it protect session data?
I am trying to understand the locking model that Apache::Session uses to protect a session from being trampled by multiple processes. The documentation talks about implementing it in a way that people would expect. Unfortunately, when I think about it, there are quite a few ways I could expect locking to work to make sure the session data is saved.

It looks to me like the default is:

1) Read lock upon loading an existing session
2) Exclusive lock upon creating a session from scratch
3) Exclusive lock upon writing data (not upon storage of data in the session hash!)
4) Release locks only upon destruction of the session object

It's very possible I could be wrong, but this model strikes me as promoting deadlock and not preventing session corruption. Let's consider a couple of possible scenarios:

1. Let's suppose that two processes (1 and 2) both want to modify a variable in the session. For argument's sake, let's say that one process is modifying an "age" key while the other one is modifying a "firstname" key. Let's also assume the session was previously created.

Process 1 loads the session and obtains a read (non-exclusive) lock.

Process 1 modifies the "age" value. This is done to a memory-resident cache, so no exclusive lock is obtained yet, as no write is performed to the data store.

Process 2 loads the same session, perhaps as the result of a submission from a different frame of a web app or from a different browser window. It also obtains a read (non-exclusive) lock.

Process 2 modifies the "firstname" key/value. Again, this is done to a memory-resident cache (at least this is how I read Apache::Session).

Now, let's assume process 1 (running at about the same time) gains control of the CPU and the program completes. At this point the session object goes out of scope. When process 1's session object goes out of scope, it attempts to write the session data that it could not write previously. Before doing this, however, it must obtain an exclusive lock.
But it can't get an exclusive lock, because process 2 still holds a read lock on the same session. So process 1 blocks until process 2 gains CPU time again.

Process 2 then ends up exiting, and the destruction of process 2's CGI object demands that it, too, get an exclusive lock on the session to write the "firstname" data out to the persistent data store. Unfortunately, process 2 must wait for process 1 to release the read lock it has.

Deadlock. Process 1 wants process 2 to release its read lock, and process 2 wants process 1 to release its read lock.

The other alternative I see is that the locks would be freed after a timeout period. But if this is the case, then one process's write of the session data to the data store will overwrite the other's. The session file will not be corrupted, because the entire write operation will be surrounded by an exclusive lock, but the data will be "logically" corrupted, because the application author will have an application state he did not expect.

In conclusion, the locking workflow in Apache::Session confuses me. I suspect people haven't run into this problem before because most people do not share sessions among many different apps, and the likelihood that two scripts will be writing the same session's data to disk at the same time is extremely low.

However, I imagine the locking was put in place to prevent data corruption in exactly these extreme cases, so I am wondering whether it really does work in these cases. I have tried going through the Apache::Session logic myself to figure this out, and I might be missing some piece of the puzzle. Hopefully Jeffrey or someone else can shed some light on this for me?

At the very minimum, I would expect the act of storing any one data value in the session hash (the case that step 3 above excludes) to cause the session to obtain an exclusive lock. I believe this would prevent the deadlock scenario I outlined above.
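
To make the two outcomes described above concrete, here is a minimal sketch, written in Python rather than Perl. It is a toy model and not Apache::Session's actual code: the SessionLock class, the process names "p1" and "p2", and the sample session values are all hypothetical. The lock is simulated in memory, so "would block" shows up as a False return value instead of a hung process.

```python
class SessionLock:
    """Toy shared/exclusive lock for one session (hypothetical model)."""

    def __init__(self):
        self.readers = set()   # processes holding a shared (read) lock
        self.writer = None     # process holding the exclusive lock, if any

    def acquire_shared(self, pid):
        # A shared lock is refused only while another process holds
        # the exclusive lock.
        if self.writer not in (None, pid):
            return False
        self.readers.add(pid)
        return True

    def upgrade_to_exclusive(self, pid):
        # The upgrade succeeds only if no OTHER process holds any lock.
        # False means "this process would block here".
        others = self.readers - {pid}
        if others or self.writer not in (None, pid):
            return False
        self.writer = pid
        return True


def deadlock_scenario():
    """Both processes hold read locks, then both ask for an exclusive lock."""
    lock = SessionLock()
    lock.acquire_shared("p1")                   # process 1 loads the session
    lock.acquire_shared("p2")                   # process 2 loads the session
    p1_ok = lock.upgrade_to_exclusive("p1")     # p1's object is destroyed
    p2_ok = lock.upgrade_to_exclusive("p2")     # p2's object is destroyed
    return p1_ok, p2_ok                         # (False, False): both wait


def lost_update_scenario():
    """Locks are dropped (e.g. by timeout) and each process writes its copy."""
    store = {"age": 30, "firstname": "Ann"}     # assumed starting session
    p1_copy = dict(store)                       # process 1 loads the session
    p2_copy = dict(store)                       # process 2 loads the session
    p1_copy["age"] = 31                         # in-memory change only
    p2_copy["firstname"] = "Bob"                # in-memory change only
    store = dict(p1_copy)                       # p1 writes the whole session
    store = dict(p2_copy)                       # p2 overwrites p1's write
    return store                                # "age" is back to 30


if __name__ == "__main__":
    print(deadlock_scenario())
    print(lost_update_scenario())
```

In the first function neither upgrade can succeed while the other process keeps its read lock, which is the deadlock. In the second, each individual write is intact, yet process 1's "age" change is gone, which is the "logical" corruption.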

The problem with taking the exclusive lock that early is that concurrency goes way down for an application that might need to read the session data in multiple processes at the same time. For example, if an app uses frames, session data that might be read in other frames at the same time will definitely force those framed scripts to wait. This will cause the frames to look like they are loading in order instead of all at once... a bit of an ugly sight.

Please, no comments on the merits (or lack thereof) of using frames. :) I am merely focusing on why the locking logic in Apache::Session exists the way it does, and on the cases where I am not sure the locking will actually work as intended.

Thanks,
Gunther