Mailing List Archive: Python: Python

decoding a byte array that is unicode escaped?

 

 


samuelrobertson at gmail

Nov 6, 2009, 12:48 AM

decoding a byte array that is unicode escaped?

I have a byte stream read over the internet:

responseByteStream = urllib.request.urlopen(httpRequest)
responseByteArray = responseByteStream.read()

The characters are encoded with unicode escape sequences, for example
a copyright symbol appears in the stream as the bytes:

5C 75 30 30 61 39

which translates to:
\u00a9

which is the Unicode escape for the copyright symbol (U+00A9).
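
To make the byte values above concrete, here is a small sketch (not part of the original post) that rebuilds the six bytes from their hex form and shows that they are plain ASCII text: a backslash, the letter u, and four hex digits. The variable name `raw` is only for illustration.

```python
# The six bytes quoted in the post, written as hex.
raw = bytes.fromhex("5C 75 30 30 61 39")

# They are ordinary ASCII characters, not the copyright symbol itself.
print(raw)                  # b'\\u00a9'
print(raw.decode("ascii"))  # \u00a9
print(len(raw))             # 6
```

So the stream carries the escape sequence as text, and it still has to be interpreted before a single copyright character appears.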

I am simply trying to display this copyright symbol on a webpage, so
how do I convert the byte array to UTF-8, given that it is 'escape
encoded' in the above way? I tried:

responseByteArray.decode('utf-8')
and responseByteArray.decode('unicode_escape')
and str(responseByteArray).

I am using Python 3.1.

--
http://mail.python.org/mailman/listinfo/python-list


__peter__ at web

Nov 6, 2009, 12:59 AM

Re: decoding a byte array that is unicode escaped?

Decode the bytes to a Unicode string (str) first:

>>> u = b"\\u00a9".decode("unicode-escape")
>>> u
'©'

Then encode the string as UTF-8 bytes:

>>> u.encode("utf-8")
b'\xc2\xa9'
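
The two steps above can be combined into one small function. This is only a sketch: `unescape_to_utf8` and the sample input are hypothetical names made up for illustration, not something from the thread, and it assumes the response body is pure ASCII with backslash-u escapes, as in the original question.

```python
def unescape_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: interpret backslash-u escapes, return UTF-8 bytes.

    Assumes the input is ASCII text containing escape sequences.
    """
    # Step 1: bytes -> str, turning each escape into the real character.
    text = raw.decode("unicode_escape")
    # Step 2: str -> bytes, using UTF-8 as the target encoding.
    return text.encode("utf-8")


# Sample input made up for this example.
escaped = b"Copyright \\u00a9 2009"
utf8_bytes = unescape_to_utf8(escaped)
print(utf8_bytes)  # b'Copyright \xc2\xa9 2009'
```

One caveat worth knowing: the unicode_escape codec treats any non-ASCII byte in the input as Latin-1, so if the stream already contains raw UTF-8 multi-byte characters mixed in with the escapes, those characters would likely come out wrong with this approach.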


