
Mailing List Archive: OpenStack: Dev

[swift] Operational knowledge sharing

 

 



me at not

Aug 10, 2012, 9:31 AM

Post #1 of 6
[swift] Operational knowledge sharing

In a standard swift deployment, the proxy server is running behind a load balancer and/or an SSL terminator. At SwiftStack, we discovered an issue that may arise from some config parameters in this layer, and we'd like to share it with other swift deployers.

Symptom:

Users updating metadata (i.e., with a POST) on larger objects get 503 error responses. However, there are no error responses logged by swift.

Cause:

Since POSTs are implemented, by default, as a server-side copy in swift and there is no traffic between the user and swift during the server-side copy, the LB or SSL terminator times out before the operation is done.

Solution:

Two options:

1) Raise the timeout in the LB/SSL terminator config. For example, with pound, change the "TimeOut" for the swift backend; pound defaults to 15 seconds. The appropriate value is however long it takes to do a server-side copy of your largest object. If you have a 1 Gbps network, it will take about 160 seconds to copy a 5 GB object: (8*5*2**30)/((2**30)/4). The divide by 4 is because the 1 Gbps link is shared by one read stream (the original) and three write streams (the 3 replicas of the new copy).

2) Change the behavior of POSTs to not do a server-side copy. This will make POSTs faster, but some metadata values can then no longer be updated (notably, Content-Type cannot be modified with a POST). Also, this will not make the issue go away for user-initiated server-side copies.
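
For the second option, a rough way to check which POST behavior a proxy config selects is sketched below. It assumes the switch is the `object_post_as_copy` option in the `[app:proxy-server]` section of proxy-server.conf, defaulting to true when unset; that matches my recollection of Swift's sample config from this period, but verify it against the sample config shipped with your version. The function name, the set of accepted "true" spellings, and the sample text are made up for illustration.

```python
import configparser

# Spellings treated as "true" in this sketch (an assumption, not Swift's own parser).
TRUE_VALUES = {"true", "1", "yes", "on", "t", "y"}


def post_as_copy_enabled(conf_text, section="app:proxy-server"):
    """Return True if POSTs would be handled as server-side copies.

    conf_text is the text of a proxy-server.conf style file. The option name
    object_post_as_copy and its default of "true" are assumptions to verify
    against your Swift release.
    """
    # Disable interpolation so literal '%' characters in the file are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    raw = parser.get(section, "object_post_as_copy", fallback="true")
    return raw.strip().lower() in TRUE_VALUES


# Hypothetical config text with the copy behavior switched off.
SAMPLE = """
[app:proxy-server]
use = egg:swift#proxy
object_post_as_copy = false
"""

print(post_as_copy_enabled(SAMPLE))
```

If this reports True, metadata POSTs on large objects are subject to the LB timeout described above.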

I would recommend the first solution, unless your workload makes heavy use of POSTs.
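
To size the timeout for the first option, the arithmetic above can be sketched as a small helper. The function names and the 25% headroom factor are assumptions for illustration only; the model is the same simple one used in the example (one link shared evenly by one read stream and one write stream per replica), and it ignores disk speed and other traffic.

```python
import math


def copy_seconds(object_bytes, link_bits_per_sec, replicas=3):
    """Estimate the duration of a server-side copy on a shared link.

    The link carries one read stream (the original) plus one write
    stream per replica, so each stream gets 1/(1 + replicas) of it.
    """
    streams = 1 + replicas
    per_stream_bits_per_sec = link_bits_per_sec / streams
    return 8 * object_bytes / per_stream_bits_per_sec


def suggested_timeout(seconds, headroom=1.25):
    """Round up with some headroom (the 1.25 factor is an arbitrary assumption)."""
    return math.ceil(seconds * headroom)


# The example from the post: 5 GB object, 1 Gbps link, 3 replicas.
estimate = copy_seconds(5 * 2**30, 2**30)
print(estimate)                     # 160.0
print(suggested_timeout(estimate))  # 200
```

Plug in your own largest object size and link speed; the result is a lower bound for the LB or SSL terminator timeout, not a guarantee.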

Hope this helps.

--John


acs at parvuscaptus

Aug 10, 2012, 10:50 AM

Post #2 of 6
Re: [swift] Operational knowledge sharing [In reply to]

Thanks for sharing.





john at openstack

Aug 10, 2012, 12:20 PM

Post #3 of 6
Re: [swift] Operational knowledge sharing [In reply to]

This is great info, John. Thanks.

John

John Purrier
john [at] openstack
(206) 930-0788
http://www.linkedin.com/in/johnpur








_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack [at] lists
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp


z-launchpad at brim

Aug 10, 2012, 2:06 PM

Post #4 of 6
Re: [swift] Operational knowledge sharing [In reply to]

Followup note: Though briefly mentioned by John, I'd like to emphasize that this also affects COPY (or PUT with X-Copy-From) requests, and #1 (upping the LB timeout) is really the only solution unless we go crazy and implement async requests with status checks. Well, another weird solution is to have Swift return useless response bodies very slowly as a keep-alive. :)
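
To make the keep-alive idea concrete, here is a minimal sketch of a response-body generator that trickles filler bytes while a slow operation runs in a background thread. This is not how Swift behaves and not a proposed patch; `keepalive_body`, `slow_copy`, the interval, and the filler byte are all hypothetical.

```python
import threading
import time


def keepalive_body(work, interval=0.05, filler=b" "):
    """Yield filler bytes while work() runs, then yield its result.

    work is any zero-argument callable returning bytes. It runs in a
    background thread so the generator can keep the connection busy.
    """
    outcome = {}

    def runner():
        try:
            outcome["value"] = work()
        except Exception as exc:  # hand the failure back to the caller
            outcome["error"] = exc

    worker = threading.Thread(target=runner)
    worker.start()
    while True:
        worker.join(timeout=interval)  # wait up to one interval
        if not worker.is_alive():
            break
        yield filler  # still working: send a byte so the LB sees traffic
    if "error" in outcome:
        raise outcome["error"]
    yield outcome["value"]


def slow_copy():
    """Stand-in for a long server-side copy."""
    time.sleep(0.2)
    return b"done"


chunks = list(keepalive_body(slow_copy))
print(len(chunks) - 1, "filler chunks, then", chunks[-1])
```

One reason this counts as weird: the status line and headers have to go out before the first filler byte, so the server has already committed to a success status before it knows whether the copy worked.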




Caitlin.Bestler at nexenta

Aug 13, 2012, 9:36 AM

Post #5 of 6
Re: [swift] Operational knowledge sharing [In reply to]

Greg Holt wrote:

> Followup note: Though briefly mentioned by John, I like to emphasize this also affects COPY (or PUT with X-Copy-From) requests,
> and #1 (upping the lb timeout) is really the only solution unless we go crazy and implement async requests with status checks.
> Well, another weird solution is to have Swift return useless response bodies very slowly as a keep alive. :)

I'm not sure it's worth the compatibility hassles, but why would periodic "Progress" returns that could be translated into a client status bar be "useless"?
If the operation takes long enough for network elements to forget about the connection then any human user will certainly be wondering what's going on as well.
Of course the challenge would be to introduce periodic feedback in a way that did not break existing automated clients and scripts.

Perhaps an option for periodic status reports?





z-launchpad at brim

Aug 13, 2012, 10:03 AM

Post #6 of 6
Re: [swift] Operational knowledge sharing [In reply to]

On Aug 13, 2012, at 11:36 AM, Caitlin Bestler <Caitlin.Bestler [at] nexenta> wrote:

> I'm not sure it's worth the compatibility hassles, but why would periodic "Progress" returns that could be translated into a client status bar be "useless"?

Sorry, poor choice of word I guess.

