
Mailing List Archive: Bricolage: devel

Bricolage State Errors

 

 



rolfm at denison

Oct 29, 2008, 3:26 PM

Post #1 of 10 (2204 views)
Permalink
Bricolage State Errors

Since I reported the state errors I received in 1.11.1, I've been
trying to at least track down where the errors are coming from, if not
get something reproducible. Here are some examples of things I'm
talking about, which Sarah found by looking through the archives:

http://www.gossamer-threads.com/lists/bricolage/users/9696
http://www.gossamer-threads.com/lists/bricolage/users/8002
http://marc.info/?l=bricolage-general&m=116539588811694&w=2
http://bugs.bricolage.cc/show_bug.cgi?id=1377
http://www.gossamer-threads.com/lists/bricolage/users/35053

Another example is duplicate stories on desks, which could occur when
notes were edited, and at other random times.

The common thread in all of these is a strange story state that needs
to be fixed in the db or fixes itself after a check in/out, and which
has an unclear cause.

Due to the fairly mature level of API testing in Bric, I strongly
suspect bugs of this type have their roots in the gui. Furthermore,
as stated by David in this thread:

http://www.gossamer-threads.com/lists/bricolage/devel/34969

we know the state of stories gets messed up regularly by opening them
up in more than one window, but that it doesn't account for all story
state corruption issues.

Based on all this, I'm going to posit that the cause for most of these
errors lies in the session handling code, most likely in
Apache::Session. Chris pointed out last year

http://www.gossamer-threads.com/lists/bricolage/devel/19593#19593

that Apache::Session hadn't been the best maintained module. More
disturbing is this review:

http://cpanratings.perl.org/user/ti

I think points two and three are the most relevant here. For those
that don't want to click, it's possible to end up with non-exclusive
locks and inconsistent sessions fairly easily. Looking at the
changelog, it doesn't look like much has changed since the review.

And as was pointed out earlier, it is possible for Apache::Session to
melt down in horrible, insecure ways:

http://www.gossamer-threads.com/lists/bricolage/users/34804

So in short, I think the way Bricolage handles sessions is broken in
some fundamental ways. It doesn't recognize when a user has logged in
from more than one window and prevent them from doing so (and that is
clearly tied into the masquerade functionality), it is possible for
users and sessions to decouple and associate at random, and from the
evidence presented here it looks like sessions can become inconsistent
during normal user behavior and cause the insertion of bad data into
the db.

This idea is not much more than a hypothesis at the moment, but it at
least gives us an area to focus our testing and code checking for this
issue.

I'd like to hear what others think.

-Matt


D-Beaudet at NGA

Oct 29, 2008, 4:20 PM

Post #2 of 10 (2133 views)
Permalink
RE: Bricolage State Errors [In reply to]

I think it all comes down to developing a test case that consistently reproduces the error where stories lose their desk association, but where desks still have a story association (or vice versa). I was close to a reproducible test case at some point last year and left myself some notes about it, so I'll go back and see what I can dig up... probably not until next week though.



bret at pectopah

Oct 29, 2008, 4:37 PM

Post #3 of 10 (2138 views)
Permalink
RE: Bricolage State Errors [In reply to]

I'm reasonably close to reproducibility too.

On the Sportsnet Bricolage install, we sometimes run into it when a
template edits a story at (almost) the same time that a person edits it.

Specifically, there's an element called "lineup manipulator" that
producers can add to news stories. When a news story is published
containing a lineup manipulator, lineup_manipulator.mc edits one or more
covers to add the news story as a related story.

The code select that the lineup manipulator element offers to the
producer looks at the checkin/checkout state of the covers, and only
offers to edit those covers if they're not checked out.

But if somebody checks out one of those covers in between the time when
the code select is displayed and the time the story is published and
lineup_manipulator.mc runs, the state of the cover can get very wobbly
indeed.

When this happens, attempting to edit that cover directly throws an
error, because (according to the error message) something in Bricolage
can't call "get_desk" on an undefined value.

It's possible to check the cover out, although the error displays on
checkout, and it's also possible to check it back in and move it from
desk to desk and even to republish it, but not to edit it, because
Bricolage throws the error instead of displaying the story profile.

It does seem possible to resolve the situation by having a different
user (one who did not see the error in the UI) check it out and publish
it. Somehow, the poison seems to be associated with a user, and having
another user come to the rescue fixes the problem. The new user doesn't
ever see the error.

Anyway. It's a tough thing to reproduce, because of all the timing
involved. But I'll keep y'all posted.


Cheers,

Bret



--
Bret Dawson
Producer
Pectopah Productions Inc.
(416) 895-7635
bret [at] pectopah
www.pectopah.com


david at kineticode

Oct 29, 2008, 5:18 PM

Post #4 of 10 (2137 views)
Permalink
Re: Bricolage State Errors [In reply to]

On Oct 29, 2008, at 16:20, Beaudet, David wrote:

> I think it all comes down to developing a test case that
> consistently reproduces the error where stories lose their desk
> association, but where desks still have a story association (or vice
> versa). I was close to a reproducible test case at some point last
> year and left myself some notes about it, so I'll go back and see
> what I can dig up... probably not until next week though

Agreed.

FWIW, with all of the bugs I've dealt with over the years, I don't
recall a single one coming down to an issue with Apache::Session. It's
an ancient, crappy module, to be sure, but I've not seen any problems
with it at all aside from file permission or disk space issues. The
inability to have multiple windows sucks, but other than that
drawback, I seriously doubt that the state issues are because of
Apache::Session.

But a reproducible example will tell us all.

Thanks,

David


david at kineticode

Oct 29, 2008, 5:21 PM

Post #5 of 10 (2135 views)
Permalink
Re: Bricolage State Errors [In reply to]

On Oct 29, 2008, at 16:37, Bret Dawson wrote:

> Specifically, there's an element called "lineup manipulator" that
> producers can add to news stories. When a news story is published
> containing a lineup manipulator, lineup_manipulator.mc edits one or
> more
> covers to add the news story as a related story.

I try to avoid hacks like this. Better is to have the covers look for
all stories with lineup manipulator subelements and to list them,
without any element-level relationships.

> The code select that the lineup manipulator element offers to the
> producer looks at the checkin/checkout state of the covers, and only
> offers to edit those covers if they're not checked out.

That's good.

> But if somebody checks out one of those covers in between the time
> when
> the code select is displayed and the time the story is published and
> lineup_manipulator.mc runs, the state of the cover can get very wobbly
> indeed.

The code select should double-check that the story in question has not
become checked-out.
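
The double-check suggested here closes a check-then-act gap: the state
is read when the code select is displayed, but acted on later at
publish time. A minimal sketch of re-checking at the moment of the
edit, in Python rather than Bricolage's Perl. The Cover class and every
name in it are hypothetical stand-ins, not the Bric API.

```python
import threading


class Cover:
    """Hypothetical stand-in for a Bricolage cover; not the real Bric API."""

    def __init__(self, name):
        self.name = name
        self.checked_out_by = None   # None means the cover is checked in
        self.related = []
        self._lock = threading.Lock()

    def check_out(self, user):
        with self._lock:
            if self.checked_out_by is not None:
                return False
            self.checked_out_by = user
            return True

    def add_related_if_free(self, story):
        """Re-check the checkout state at edit time, under the same lock."""
        with self._lock:
            if self.checked_out_by is not None:
                return False         # checked out since the form was shown
            self.related.append(story)
            return True


def offered_covers(covers):
    """What the code select shows: covers that are free right now."""
    return [c for c in covers if c.checked_out_by is None]


def publish(story, chosen_covers):
    """Edit only the covers that are still free; report the ones skipped."""
    skipped = []
    for cover in chosen_covers:
        if not cover.add_related_if_free(story):
            skipped.append(cover.name)
    return skipped


covers = [Cover("front"), Cover("hockey")]
chosen = offered_covers(covers)        # both covers are offered
covers[1].check_out("producer_b")      # checked out before publish runs
skipped = publish("news-123", chosen)
print(skipped)
```

The point of the sketch is only that the test and the edit happen
together; in a real install the equivalent guard would have to live in
the template or callback that performs the edit.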

> When this happens, attempting to edit that cover directly throws an
> error, because (according to the error message) something in Bricolage
> can't call "get_desk" on an undefined value.

Yeah, the code callback probably mucks up the checkout state.

> It's possible to check the cover out, although the error displays on
> checkout, and it's also possible to check it back in and move it from
> desk to desk and even to republish it, but not to edit it, because
> Bricolage throws the error instead of displaying the story profile.
>
> It does seem possible to resolve the situation by having a different
> user (one who did not see the error in the UI) check it out and
> publish
> it. Somehow, the poison seems to be associated with a user, and having
> another user come to the rescue fixes the problem. The new user
> doesn't
> ever see the error.

Bleh.

> Anyway. It's a tough thing to reproduce, because of all the timing
> involved. But I'll keep y'all posted.

Thanks.

Best,

David


rolfm at denison

Nov 20, 2008, 11:11 AM

Post #6 of 10 (1983 views)
Permalink
RE: Bricolage State Errors [In reply to]

Quoting "Beaudet, David" <D-Beaudet [at] NGA>:

>
> I think it all comes down to developing a test case that
> consistently reproduces the error where stories lose their desk
> association, but where desks still have a story association (or vice
> versa). I was close to a reproducible test case at some point last
> year and left myself some notes about it, so I'll go back and see
> what I can dig up... probably not until next week though.

I've been going through our story_instances table and found a couple
of issues. I reported a SOAP bug as one of them.

As for the other issue, I've seen multiple occurrences where people
will be editing a document, checking it in and out, creating new
versions. Then all of a sudden, the versions will drop back down
before incrementing again. So a story will go version 66, 67, 68, 66,
67, 68, 69.
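
One way to look for this pattern is to scan each story's version
numbers in creation order and flag every point where the number fails
to increase. A small sketch in Python; it assumes the versions have
already been pulled out of the database in creation order, and the
function name is made up for illustration.

```python
def find_version_regressions(versions):
    """Return (index, previous, current) wherever the version did not increase."""
    regressions = []
    for i in range(1, len(versions)):
        if versions[i] <= versions[i - 1]:
            regressions.append((i, versions[i - 1], versions[i]))
    return regressions


# The sequence from the example above.
history = [66, 67, 68, 66, 67, 68, 69]
regressions = find_version_regressions(history)
print(regressions)
```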

Has anyone else observed this? Might it be indicative of users
switching browsers midstream, or something else?


rolfm at denison

Nov 20, 2008, 12:31 PM

Post #7 of 10 (1984 views)
Permalink
RE: Bricolage State Errors [In reply to]

Quoting rolfm [at] denison:

> As for the other issue, I've seen multiple occurrences where people
> will be editing a document, checking it in and out, creating new
> versions. Then all of a sudden, the versions will drop back down
> before incrementing again. So a story will go version 66, 67, 68,
> 66, 67, 68, 69.

More information on this. We've seen that in the database, when
something like this happens, the first version that repeats has only
one entry in the db, and it appears to be the first time the version
appears.

So why would it not allow that version to be repeated, but it will
then increment subsequent versions starting from the failed version?

Our current thought is that the checkin mechanism should be wrapped in
a transaction block, and if one part fails, the whole thing gets
rolled back and an error gets sent to the user. Thoughts?
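
A rough sketch of that idea, using Python and an in-memory SQLite
database as a stand-in. The story table, its columns, and the checkin
function are invented for illustration and are not Bricolage's schema
or code.

```python
import sqlite3


def checkin(conn, story_id, fail_midway=False):
    """Bump the version and clear the checkout flag as one atomic unit."""
    try:
        conn.execute("BEGIN")
        conn.execute(
            "UPDATE story SET version = version + 1 WHERE id = ?", (story_id,)
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        conn.execute(
            "UPDATE story SET checked_out = 0 WHERE id = ?", (story_id,)
        )
        conn.execute("COMMIT")
        return True
    except Exception as err:
        conn.execute("ROLLBACK")       # undo the partial work
        print(f"checkin failed, nothing saved: {err}")
        return False


# isolation_level=None: the sqlite3 module issues no implicit BEGIN/COMMIT,
# so the transaction boundaries are exactly the ones written above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE story (id INTEGER PRIMARY KEY, version INTEGER, checked_out INTEGER)"
)
conn.execute("INSERT INTO story VALUES (1, 68, 1)")

query = "SELECT version, checked_out FROM story WHERE id = 1"
checkin(conn, 1, fail_midway=True)
after_failure = conn.execute(query).fetchone()
checkin(conn, 1)
after_success = conn.execute(query).fetchone()
print(after_failure, after_success)
```

After the failed attempt the row is unchanged, so the next checkin
starts from the last good version rather than from a half-written one.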

-Matt


D-Beaudet at NGA

Nov 20, 2008, 12:46 PM

Post #8 of 10 (1987 views)
Permalink
RE: Bricolage State Errors [In reply to]

Wrapping every db change in a transaction is best practice, in my
opinion. Why the transaction fails is worth looking into and worth
logging, but is secondary to database integrity.




david at kineticode

Nov 20, 2008, 12:48 PM

Post #9 of 10 (1983 views)
Permalink
Re: Bricolage State Errors [In reply to]

On Nov 20, 2008, at 12:31 PM, rolfm [at] denison wrote:

> More information on this. We've seen that in the database, when
> something like this happens, the first version that repeats has only
> one entry in the db, and it appears to be the first time the version
> appears.
>
> So why would it not allow that version to be repeated, but it will
> then increment subsequent versions starting from the failed version?

Sounds vaguely like a SELECT issue.

> Our current thought is that the checkin mechanism should be wrapped
> in a transaction block, and if one part fails, the whole thing gets
> rolled back and an error gets sent to the user. Thoughts?

They are. Or they should be, anyway.

Best,

David


lannings at who

Nov 21, 2008, 1:54 AM

Post #10 of 10 (1989 views)
Permalink
Re: Bricolage State Errors [In reply to]

On Thu, 20 Nov 2008, David E. Wheeler wrote:
> On Nov 20, 2008, at 12:31 PM, rolfm [at] denison wrote:
>> Our current thought is that the checkin mechanism should be wrapped in a
>> transaction block, and if one part fails, the whole thing gets rolled back
>> and an error gets sent to the user. Thoughts?
>
> They are. Or they should be, anyway.

I don't remember for checkin in particular,
but assets are not saved in a single transaction.
There are (attempted) "nested" transactions
(in Bric::App::Handler, but also in Story->save).
I've brought that up before, I think regarding
the SOAP interface, but couldn't figure out a good solution.

Actually, I took a peek in lib/Bric/App/Callback/Profile/Story.pm:
the sub checkin has a commit in it.
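
To illustrate why a commit inside an inner step matters, here is a
sketch in Python against an in-memory SQLite database. It is a
stand-in, not Bricolage code; the log table and the function names are
invented. An inner helper that commits ends the caller's transaction,
so a later failure can no longer roll the earlier work back. A
savepoint leaves the final commit or rollback to the caller.

```python
import sqlite3


def save_with_commit(conn):
    """Inner helper that commits on its own, ending the caller's transaction."""
    conn.execute("INSERT INTO log VALUES ('inner save')")
    conn.execute("COMMIT")


def save_with_savepoint(conn):
    """Inner helper that uses a savepoint and leaves the commit to the caller."""
    conn.execute("SAVEPOINT inner_save")
    conn.execute("INSERT INTO log VALUES ('inner save')")
    conn.execute("RELEASE inner_save")


def outer(conn, inner):
    """Outer unit of work that fails after calling the inner helper."""
    conn.execute("BEGIN")
    conn.execute("INSERT INTO log VALUES ('outer step')")
    inner(conn)
    try:
        raise RuntimeError("simulated failure after the inner save")
    except RuntimeError:
        # After an inner COMMIT there is no open transaction left to undo.
        if conn.in_transaction:
            conn.execute("ROLLBACK")


def rows_left(inner):
    """Run the failing outer step and return whatever rows survived."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE log (msg TEXT)")
    outer(conn, inner)
    return [row[0] for row in conn.execute("SELECT msg FROM log")]


leaked = rows_left(save_with_commit)
clean = rows_left(save_with_savepoint)
print(leaked, clean)
```

With the inner commit, both rows survive the failure; with the
savepoint, the rollback removes everything.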
