
Mailing List Archive: Zope: CMF

Weird UnicodeDecodeError with zope.formlib

 

 



charlie.clark at clark-consulting

Nov 28, 2012, 10:12 AM

Post #1 of 9 (577 views)
Weird UnicodeDecodeError with zope.formlib

Hi,

one of my sites has (hopefully) started behaving funny. I have a formlib
driven contact form that is rejecting any input that is not ascii as part
of the validation step of the form:

UnicodeWarning: Unicode equal comparison failed to convert both arguments
to Unicode - interpreting them as being unequal

I may have got this wrong but I thought inputs into forms could be
considered as unicode and we only had to worry about them when storing
them in case they were being accessed by non-unicode-aware code.

What's really puzzling is that I have almost identical forms on other
sites that don't exhibit this behaviour which makes me think it must be a
configuration error such as the default encoding which is set to utf-8 for
this site.

Any ideas?

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226
_______________________________________________
Zope-CMF maillist - Zope-CMF [at] zope
https://mail.zope.org/mailman/listinfo/zope-cmf

See https://bugs.launchpad.net/zope-cmf/ for bug reports and feature requests


patrick.gerken at computer

Nov 28, 2012, 3:12 PM

Post #2 of 9 (537 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

With the information you provided, I'd first try this at a Python prompt on
a working machine: "Köln" == u"Bonn"
If this does not produce the same warning, somebody has changed the Python
default encoding. Then I'd check whether any of my validators get constraints
with umlauts.
But I guess you tried that already?
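
A minimal sketch of what that comparison does, written in Python 3 syntax for illustration (the thread itself is on Python 2). On Python 2, comparing a byte string with a unicode string implicitly decodes the byte string using the interpreter's default encoding, which is ASCII unless it has been changed. The helper name py2_style_equal and its default_encoding parameter are made up here to make that hidden step explicit:

```python
def py2_style_equal(raw, text, default_encoding="ascii"):
    """Emulate Python 2's `str == unicode` comparison (hypothetical helper)."""
    try:
        decoded = raw.decode(default_encoding)
    except UnicodeDecodeError:
        # Python 2 emits a UnicodeWarning at this point and
        # treats the two values as unequal.
        return False
    return decoded == text


koeln_utf8 = "K\xf6ln".encode("utf-8")  # the UTF-8 bytes for "Köln"

print(py2_style_equal(koeln_utf8, "Bonn"))              # False
print(py2_style_equal(koeln_utf8, "K\xf6ln"))           # False: ASCII decode fails
print(py2_style_equal(koeln_utf8, "K\xf6ln", "utf-8"))  # True
```

With an ASCII default, any non-ASCII byte string compares unequal to every unicode string, which would match a form that rejects all non-ASCII input.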




charlie.clark at clark-consulting

Nov 29, 2012, 12:43 AM

Post #3 of 9 (538 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

Hiya Patrick,

On 29.11.2012 at 00:12, Patrick Gerken
<patrick.gerken [at] computer> wrote:

> With the information you provided I'd first try this on a python prompt
> on a working machine : "Köln" == u"Bonn"

>>> "Köln" == u"Bonn"
bin/zopepy:1: UnicodeWarning: Unicode equal comparison failed to convert
both arguments to Unicode - interpreting them as being unequal


> If this does not throw the same error, somebody changed the python
> default encoding. Then I'd look if some of my validators get constraints
> with umlauts.

There are no fancy constraints just a bundle of TextLine fields.

> But I guess, you tried that already?

No. I guess I'll have to look more closely at the widgets' data. As I said,
a different site on the same machine doesn't exhibit these problems, so I
should have a point of comparison.

Charlie


charlie.clark at clark-consulting

Nov 29, 2012, 2:52 AM

Post #4 of 9 (537 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

On 29.11.2012 at 09:43, Charlie Clark
<charlie.clark [at] clark-consulting> wrote:

> No. I guess I'll have to look more closely at the widgets' data. As I
> said, a different site on the same machine doesn't exhibit these problems,
> so I should have a point of comparison.

Definitely weird. From site 1:

(Pdb) t = self.widgets['town']
(Pdb) t._getFormInput()
u'D\xfcsseldorf'

as expected but from site 2:

(Pdb) t = self.widgets['town']
(Pdb) t._getFormInput()
'D\xc3\xbcsseldorf'

Note the similarity of the field name as one of these forms started out as
a copy of the other. Need to check what is causing this. I think I might
have set a default encoding for Zope to UTF8 ostensibly to reduce IE /
Safari errors. Oh, isn't this fun!
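
The two values are the same text in different representations: site 1 returns decoded unicode, site 2 the raw UTF-8 bytes. A short check, in Python 3 syntax for illustration (the variable names are only labels for the pdb output above):

```python
site1_value = u'D\xfcsseldorf'       # unicode text, as returned on site 1
site2_value = b'D\xc3\xbcsseldorf'   # UTF-8 encoded bytes, as returned on site 2

# Decoding the bytes as UTF-8 yields exactly the text seen on site 1.
decoded = site2_value.decode('utf-8')
print(decoded == site1_value)                        # True

# Going the other way, encoding the text yields the bytes seen on site 2.
print(site1_value.encode('utf-8') == site2_value)    # True
```

So nothing is corrupted on site 2; the form input simply has not been decoded before it reaches the widget.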

Charlie


charlie.clark at clark-consulting

Nov 30, 2012, 2:19 AM

Post #5 of 9 (528 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

Hi Patrick,

On 30.11.2012 at 09:50, Patrick Gerken <do3ccqrv [at] googlemail> wrote:

> Add sentry logging with raven to the sites. Trigger an exception in both
> sites. With sentry you can not only see the traceback, but check the
> local
> variable of each frame. You can do the same with pdb of course but not so
> easily side by side to see where the local vars start to differ.
> I can give you access to my sentry server to send the logs to.

thanks for the tip. I've got Sentry and Raven running and reporting but
I'm afraid I still can't see the difference. The posted form looks
identical in both cases. I can only assume that, as you first suggested,
there is a difference lower down the stack which is causing one instance
to decode the URL-encoded form to unicode and the other to encode it as
UTF-8. How can I check this? locale.getdefaultlocale() reports ('de_DE',
'UTF8') for both.
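
One way to compare the two instances, sketched under the assumption that the difference might sit at the Python level rather than in Zope's own configuration: the locale is separate from the default encoding Python uses for implicit conversions, so it can help to print both in each instance (for example from bin/zopepy). The function name encoding_report is hypothetical:

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python itself reports (hypothetical helper)."""
    return {
        # Used for implicit str/unicode conversions on Python 2.
        "sys.getdefaultencoding": sys.getdefaultencoding(),
        # Derived from the locale settings, independent of the above.
        "locale.getpreferredencoding": locale.getpreferredencoding(False),
        "sys.getfilesystemencoding": sys.getfilesystemencoding(),
    }


for name, value in sorted(encoding_report().items()):
    print("%s: %s" % (name, value))
```

If both instances report the same values here, the difference is more likely in the Zope or form configuration than in the interpreter.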

Charlie


patrick.gerken at computer

Nov 30, 2012, 9:14 AM

Post #6 of 9 (529 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

On Fri, Nov 30, 2012 at 11:19 AM, Charlie Clark <
charlie.clark [at] clark-consulting> wrote:

> Hi Patrick,
>
> On 30.11.2012 at 09:50, Patrick Gerken <do3ccqrv [at] googlemail> wrote:
>
>> Add sentry logging with raven to the sites. Trigger an exception in both
>> sites. With sentry you can not only see the traceback, but check the local
>> variable of each frame. You can do the same with pdb of course but not so
>> easily side by side to see where the local vars start to differ.
>> I can give you access to my sentry server to send the logs to.
>
> thanks for the tip. I've got Sentry and Raven running and reporting but
> I'm afraid I still can't see the difference. The posted form looks
> identical in both cases. I can only assume that, as you first suggested,
> there is a difference lower down the stack which is causing one instance to
> decode the URL-encoded form to unicode and the other to encode it as UTF-8.
> How can I check this? locale.getdefaultlocale() reports ('de_DE', 'UTF8')
> for both.


I don't understand why you see no difference in the stacktrace, but a
difference with pdb in the end. Doesn't one instance show that the input is
a string and the other that it's unicode?
Do you see this until you extract it first from the request object?

You are not having one form saying fieldname:string and the other just
fieldname?


patrick.gerken at computer

Nov 30, 2012, 9:21 AM

Post #7 of 9 (525 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

Did you try putting a pdb breakpoint in processInputs of
ZPublisher/HTTPRequest, around line 642, where my code shows something like
this:

640         if flags & CONVERTED:
641             try:
642                 if character_encoding:
643                     # We have a string with a specified character
644                     # encoding. This gets passed to the converter
645                     # either as unicode, if it can handle it, or
646                     # crunched back down to latin-1 if it can not.
647                     item = unicode(item,character_encoding)
648                     if hasattr(converter,'convert_unicode'):
649                         item = converter.convert_unicode(item)
650                     else:
651                         item = converter(
652                             item.encode(default_encoding))
653                 else:
654                     item = converter(item)
655
656                 # Flag potentially unsafe values
657                 if converter_type in ('string', 'required', 'text',
658                                       'ustring', 'utext'):



...

The only place I can see where a default encoding gets changed is by the
default-zpublisher-encoding from zope.conf
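
To make the branch quoted above easier to follow, here is a rough standalone sketch of the same decision, in Python 3 syntax and not the actual ZPublisher code. The function name convert_item, the UnicodeAwareConverter class and the "utf-8" default are assumptions for illustration only:

```python
def convert_item(item, converter, character_encoding=None,
                 default_encoding="utf-8"):
    """Mimic the quoted branch for one form value (hypothetical helper)."""
    if character_encoding:
        # The field declared an encoding, so decode the raw bytes first.
        text = item.decode(character_encoding)
        if hasattr(converter, "convert_unicode"):
            # The converter can handle text directly.
            return converter.convert_unicode(text)
        # Otherwise re-encode to the default encoding before converting.
        return converter(text.encode(default_encoding))
    # No declared encoding: the raw bytes are passed through unchanged.
    return converter(item)


class UnicodeAwareConverter(object):
    """Stand-in for a converter that offers convert_unicode."""

    def __call__(self, value):
        return value

    def convert_unicode(self, value):
        return value


raw = b'D\xc3\xbcsseldorf'
print(repr(convert_item(raw, UnicodeAwareConverter(), "utf-8")))
print(repr(convert_item(raw, bytes, "utf-8", "latin-1")))
print(repr(convert_item(raw, bytes)))
```

The last call shows the case relevant to this thread: without a declared character encoding the value stays an undecoded byte string.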




charlie.clark at clark-consulting

Nov 30, 2012, 9:21 AM

Post #8 of 9 (527 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

On 30.11.2012 at 18:14, Patrick Gerken
<patrick.gerken [at] computer> wrote:

Hi Patrick,

thanks for your patience in attempting to help me on this!

> I don't understand why you see no difference in the stacktrace, but a
> difference with pdb in the end. Doesn't one instance show that the input
> is a string and the other that it's unicode?

Let me explain: in pdb I have access to request.form which is where I can
see the difference. With Sentry I can only see the raw body of the
request. I may simply have not understood well enough how to use it to
inspect what's happening.

I raise an exception in both cases in the forms' validate method.

> Do you see this until you extract it first from the request object?


> You are not having one form saying fieldname:string and the other just
> fieldname?

No, they are all zope.formlib/zope.schema fields so there is no additional
marshalling.

Charlie


charlie.clark at clark-consulting

Dec 2, 2012, 7:08 AM

Post #9 of 9 (477 views)
Re: Weird UnicodeDecodeError with zope.formlib [In reply to]

On 30.11.2012 at 18:21, Charlie Clark
<charlie.clark [at] clark-consulting> wrote:

> Let me explain: in pdb I have access to request.form which is where I
> can see the difference. With Sentry I can only see the raw body of the
> request. I may simply have not understood well enough how to use it to
> inspect what's happening.
>
> I raise an exception in both cases in the forms' validate method.
>
>> Do you see this until you extract it first from the request object?
>
>> You are not having one form saying fieldname:string and the other just
>> fieldname?
>
> No, they are all zope.formlib/zope.schema fields so there is no
> additional marshalling.

I have finally tracked down the problem: I seem to have been bitten by a
change in zope.formlib 4.1.

There are two solutions: either extend a form's update method with
something like the following:

    def update(self):
        from Products.Five.browser.decode import processInputs
        from ZPublisher import HTTPRequest
        # XXX: if we don't set default_encoding explicitly, main_template
        # might set a different charset
        self.request.RESPONSE.setHeader('Content-Type',
            'text/html; charset=%s' % HTTPRequest.default_encoding)
        # BBB: for Zope < 2.14
        if not getattr(self.request, 'postProcessInputs', False):
            processInputs(self.request, [HTTPRequest.default_encoding])
        super(_EditFormMixin, self).update()

Or, more simply, base forms on those provided by five.formlib.formbase
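
For readers wondering what the processInputs call in the first option achieves, the idea is roughly to walk over the submitted form values and decode any byte strings to unicode before formlib sees them. The following is a simplified stand-in that works on a plain dict, in Python 3 syntax, and is not the Products.Five implementation; the name decode_form_values and its charsets parameter are hypothetical:

```python
def decode_form_values(form, charsets=("utf-8",)):
    """Return a copy of `form` with byte strings decoded (hypothetical helper)."""

    def _decode(value):
        if isinstance(value, bytes):
            # Try each candidate charset in turn; keep the raw value
            # if none of them can decode it.
            for charset in charsets:
                try:
                    return value.decode(charset)
                except UnicodeDecodeError:
                    continue
            return value
        if isinstance(value, (list, tuple)):
            # Decode sequence members, preserving the container type.
            return type(value)(_decode(v) for v in value)
        return value

    return {name: _decode(value) for name, value in form.items()}


form = {'town': b'D\xc3\xbcsseldorf', 'tags': [b'a', 'b'], 'count': 1}
decoded_form = decode_form_values(form)
print(decoded_form['town'] == 'D\xfcsseldorf')   # True
```

After this step every text value in the form is unicode, which is the state the working site was already in.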

Thanks to yuppie for fixing this in the CMF.

I can confirm that this also works with Internet Explorer.

Charlie
