
Mailing List Archive: Python: Bugs

[issue14455] plistlib unable to read json and binary plist files

 

 



report at bugs

Mar 30, 2012, 2:56 PM

Post #1 of 12 (223 views)

New submission from d9pouces <python [at] 19pouces>:

Hi,

Plist files actually come in three flavors: XML, binary, and now (starting with Mac OS X 10.7 Lion) JSON. The plistlib.readPlist function can only read XML plist files, so it cannot read binary or JSON ones.

The binary format is open and described by Apple (http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c).

Here is the diff (against the Python 2.7 implementation of plistlib) that transparently reads both the binary and JSON formats.

The plistlib API remains unchanged, since format detection is done by plistlib.readPlist.
An InvalidFileException is raised for a malformed binary file.
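
The detection described above can be sketched in Python 3 as follows. This is only an illustration of header sniffing, not the code from the patch: the function name `sniff_plist_format` is hypothetical, and it assumes the caller has read the first 8 bytes of the file in binary mode and only handles a UTF-8 byte order mark.

```python
def sniff_plist_format(header: bytes) -> str:
    """Guess the plist flavor from the first bytes of a file.

    Hypothetical helper: returns "binary", "xml" or "json".
    """
    # Binary plists start with the magic string "bplist00".
    if header.startswith(b"bplist00"):
        return "binary"
    # XML plists start with an XML declaration, optionally preceded
    # by a UTF-8 byte order mark (an assumption of this sketch).
    if header.startswith(b"<?xml") or header.startswith(b"\xef\xbb\xbf<?xml"):
        return "xml"
    # Anything else is treated as JSON, as in the description above.
    return "json"
```

A caller would read the header, rewind with `seek(0)`, and then hand the file to the matching parser.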


57,58c57
<     "Plist", "Data", "Dict",
<     "InvalidFileException",
---
>     "Plist", "Data", "Dict"
64d62
< import json
66d63
< import os
68d64
< import struct
81,89c77,78
<     header = pathOrFile.read(8)
<     pathOrFile.seek(0)
<     if header == '<?xml ve' or header[2:] == '<?xml ': #XML plist file, without or with BOM
<         p = PlistParser()
<         rootObject = p.parse(pathOrFile)
<     elif header == 'bplist00': #binary plist file
<         rootObject = readBinaryPlistFile(pathOrFile)
<     else: #json plist file
<         rootObject = json.load(pathOrFile)
---
>     p = PlistParser()
>     rootObject = p.parse(pathOrFile)
195,285d183
<
< # timestamp 0 of binary plists corresponds to 1/1/2001 (year of Mac OS X 10.0), instead of 1/1/1970.
< MAC_OS_X_TIME_OFFSET = (31 * 365 + 8) * 86400
<
< class InvalidFileException(ValueError):
<     def __str__(self):
<         return "Invalid file"
<     def __unicode__(self):
<         return "Invalid file"
<
< def readBinaryPlistFile(in_file):
<     """
<     Read a binary plist file, following the description of the binary format: http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c
<     Raise InvalidFileException in case of error, otherwise return the root object, as usual
<     """
<     in_file.seek(-32, os.SEEK_END)
<     trailer = in_file.read(32)
<     if len(trailer) != 32:
<         return InvalidFileException()
<     offset_size, ref_size, num_objects, top_object, offset_table_offset = struct.unpack('>6xBB4xL4xL4xL', trailer)
<     in_file.seek(offset_table_offset)
<     object_offsets = []
<     offset_format = '>' + {1: 'B', 2: 'H', 4: 'L', 8: 'Q', }[offset_size] * num_objects
<     ref_format = {1: 'B', 2: 'H', 4: 'L', 8: 'Q', }[ref_size]
<     int_format = {0: (1, '>B'), 1: (2, '>H'), 2: (4, '>L'), 3: (8, '>Q'), }
<     object_offsets = struct.unpack(offset_format, in_file.read(offset_size * num_objects))
<     def getSize(token_l):
<         """ return the size of the next object."""
<         if token_l == 0xF:
<             m = ord(in_file.read(1)) & 0x3
<             s, f = int_format[m]
<             return struct.unpack(f, in_file.read(s))[0]
<         return token_l
<     def readNextObject(offset):
<         """ read the object at offset. May recursively read sub-objects (content of an array/dict/set) """
<         in_file.seek(offset)
<         token = in_file.read(1)
<         token_h, token_l = ord(token) & 0xF0, ord(token) & 0x0F #high and low parts
<         if token == '\x00':
<             return None
<         elif token == '\x08':
<             return False
<         elif token == '\x09':
<             return True
<         elif token == '\x0f':
<             return ''
<         elif token_h == 0x10: #int
<             result = 0
<             for k in xrange((2 << token_l) - 1):
<                 result = (result << 8) + ord(in_file.read(1))
<             return result
<         elif token_h == 0x20: #real
<             if token_l == 2:
<                 return struct.unpack('>f', in_file.read(4))[0]
<             elif token_l == 3:
<                 return struct.unpack('>d', in_file.read(8))[0]
<         elif token_h == 0x30: #date
<             f = struct.unpack('>d', in_file.read(8))[0]
<             return datetime.datetime.utcfromtimestamp(f + MAC_OS_X_TIME_OFFSET)
<         elif token_h == 0x80: #data
<             s = getSize(token_l)
<             return in_file.read(s)
<         elif token_h == 0x50: #ascii string
<             s = getSize(token_l)
<             return in_file.read(s)
<         elif token_h == 0x60: #unicode string
<             s = getSize(token_l)
<             return in_file.read(s * 2).decode('utf-16be')
<         elif token_h == 0x80: #uid
<             return in_file.read(token_l + 1)
<         elif token_h == 0xA0: #array
<             s = getSize(token_l)
<             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
<             return map(lambda x: readNextObject(object_offsets[x]), obj_refs)
<         elif token_h == 0xC0: #set
<             s = getSize(token_l)
<             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
<             return set(map(lambda x: readNextObject(object_offsets[x]), obj_refs))
<         elif token_h == 0xD0: #dict
<             result = {}
<             s = getSize(token_l)
<             key_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
<             obj_refs = struct.unpack('>' + ref_format * s, in_file.read(s * ref_size))
<             for k, o in zip(key_refs, obj_refs):
<                 key = readNextObject(object_offsets[k])
<                 obj = readNextObject(object_offsets[o])
<                 result[key] = obj
<             return result
<         raise InvalidFileException()
<     return readNextObject(object_offsets[top_object])
<
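
For readers following the diff, here is a small Python 3 sketch of the first step it performs: unpacking the 32-byte trailer and then the offset table. It reuses the struct format string from the patch; the names `Trailer`, `read_trailer` and `read_offsets` are hypothetical and exist only for this illustration, which works on an in-memory `bytes` object rather than a file.

```python
import struct
from typing import NamedTuple, Tuple

# Same layout as in the patch: 6 pad bytes, two 1-byte sizes, then three
# big-endian values of which only the low 4 bytes are read.
TRAILER_FORMAT = ">6xBB4xL4xL4xL"
_UINT_CODES = {1: "B", 2: "H", 4: "L", 8: "Q"}


class Trailer(NamedTuple):
    offset_size: int
    ref_size: int
    num_objects: int
    top_object: int
    offset_table_offset: int


def read_trailer(data: bytes) -> Trailer:
    """Unpack the last 32 bytes of a binary plist."""
    if len(data) < 32:
        raise ValueError("too short to contain a binary plist trailer")
    return Trailer(*struct.unpack(TRAILER_FORMAT, data[-32:]))


def read_offsets(data: bytes, trailer: Trailer) -> Tuple[int, ...]:
    """Return the file offset of every object listed in the offset table."""
    fmt = ">" + _UINT_CODES[trailer.offset_size] * trailer.num_objects
    start = trailer.offset_table_offset
    end = start + trailer.offset_size * trailer.num_objects
    return struct.unpack(fmt, data[start:end])
```

Each offset returned by `read_offsets` points at the marker byte of one object, which is where a reader such as `readNextObject` starts decoding.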

----------
assignee: ronaldoussoren
components: Library (Lib), Macintosh
files: plistlib.py
messages: 157152
nosy: d9pouces, ronaldoussoren
priority: normal
severity: normal
status: open
title: plistlib unable to read json and binary plist files
type: enhancement
versions: Python 2.7
Added file: http://bugs.python.org/file25075/plistlib.py

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue14455>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

Mar 30, 2012, 3:14 PM

Post #2 of 12 (209 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

R. David Murray <rdmurray [at] bitdance> added the comment:

Thanks for the patch. Could you upload it as a context diff?

----------
nosy: +r.david.murray
stage: -> patch review
versions: +Python 3.3 -Python 2.7



report at bugs

Mar 30, 2012, 3:50 PM

Post #3 of 12 (201 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

d9pouces <python [at] 19pouces> added the comment:

Here is the new patch. I assumed that you meant to use diff -c instead of the raw diff command.

----------
keywords: +patch
Added file: http://bugs.python.org/file25076/context.diff



report at bugs

Mar 30, 2012, 4:31 PM

Post #4 of 12 (202 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

R. David Murray <rdmurray [at] bitdance> added the comment:

Hmm. Apparently what I meant was -u instead of -c (unified diff). I just use the 'hg diff' command myself, which does the right thing :) Of course, to do that you need to have a checkout. (We can probably use the context diff.)

----------



report at bugs

Mar 30, 2012, 9:19 PM

Post #5 of 12 (197 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

Changes by Ned Deily <nad [at] acm>:


----------
nosy: +ned.deily



report at bugs

Mar 31, 2012, 12:55 AM

Post #6 of 12 (199 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

Serhiy Storchaka <storchaka [at] gmail> added the comment:

This patch is for Python 2. New features are accepted only for Python 3.3+. I ported the patch, but since I have no Mac, I can't test it.

The code for dates was specified incorrectly.

The length of integers was calculated incorrectly. To convert integers, you can use int.from_bytes.
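
To illustrate the point about integers: in the binary format the low nibble of an integer marker gives the byte count as a power of two (1, 2, 4 or 8 bytes), whereas the loop in the original patch runs `(2 << token_l) - 1` times, that is 1, 3, 7 or 15. A minimal Python 3 sketch using `int.from_bytes` follows. The function name `read_int` is hypothetical, and treating 8-byte values as signed is an assumption of this sketch.

```python
def read_int(payload: bytes, token_l: int) -> int:
    """Decode a binary plist integer whose marker has low nibble token_l.

    payload holds the bytes that follow the marker byte.
    """
    size = 1 << token_l  # 1, 2, 4 or 8 bytes
    if len(payload) < size:
        raise ValueError("truncated integer")
    # Assumption: values of 8 bytes or more are signed, shorter ones unsigned.
    return int.from_bytes(payload[:size], "big", signed=size >= 8)
```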

Object identity was not preserved.
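
One way to preserve identity is to cache each decoded object under its reference index, so that two references to the same entry yield the same Python object. The sketch below is Python 3 and uses a toy object table, not real plist bytes: the `("array", refs)` tuples and the name `make_resolver` are assumptions made up for this illustration.

```python
from typing import Any, Callable, Dict, List


def make_resolver(raw_objects: List[Any]) -> Callable[[int], Any]:
    """Return a function that resolves a reference index to an object.

    raw_objects[i] is either a plain value or ("array", [ref, ...]).
    """
    cache: Dict[int, Any] = {}

    def resolve(ref: int) -> Any:
        if ref in cache:
            return cache[ref]  # same index, same object
        raw = raw_objects[ref]
        if isinstance(raw, tuple) and raw[0] == "array":
            result: Any = []
            cache[ref] = result  # register before reading the children
            result.extend(resolve(child) for child in raw[1])
        else:
            result = raw
            cache[ref] = result
        return result

    return resolve
```

With the table `[("array", [1, 1]), ("array", [2]), "x"]`, resolving index 0 gives a list whose two elements are the very same inner list.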

I'm not sure the XML detection is sufficient. It should consider UTF-16 and UTF-32, with and without a BOM.
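
A possible shape for such a check, as a Python 3 sketch: it looks for the `<?xml` prefix in UTF-8, UTF-16 and UTF-32, with or without a byte order mark. The name `looks_like_xml` is hypothetical, the caller is assumed to pass at least the first 24 bytes of the file, and XML documents without a declaration are not recognized.

```python
import codecs

# UTF-32 BOMs come first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)
_ENCODINGS = ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")


def looks_like_xml(header: bytes) -> bool:
    """Return True if header starts with an XML declaration."""
    for bom, encoding in _BOMS:
        if header.startswith(bom):
            return header[len(bom):].startswith("<?xml".encode(encoding))
    # No BOM: compare against the prefix as spelled in each encoding.
    return any(header.startswith("<?xml".encode(enc)) for enc in _ENCODINGS)
```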

Tests are needed.

I also cleaned up and modernized the code a bit. I believe it should be rewritten in a more object-oriented style. It is also worth implementing a writer.

----------
nosy: +storchaka
Added file: http://bugs.python.org/file25077/plistlib_ext.patch



report at bugs

Apr 2, 2012, 8:37 AM

Post #7 of 12 (192 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

Changes by Georges Martin <jrjsmrtn [at] gmail>:


----------
nosy: +jrjsmrtn



report at bugs

Apr 4, 2012, 2:06 PM

Post #8 of 12 (218 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

d9pouces <python [at] 19pouces> added the comment:

storchaka > I'm trying to address your remarks.
So, I'm working on more object-oriented code, with both read and write functions. I just need to write some test cases.
IMHO, we should add a new parameter to the writePlist function, to allow using the binary or JSON plist format instead of the default XML one.

----------



report at bugs

Apr 6, 2012, 9:30 AM

Post #9 of 12 (187 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

Éric Araujo <merwok [at] netwok> added the comment:

Keep it simple: if a few functions work, there is no need at all to add classes. Before doing more work though I suggest you wait for the feedback of the Mac maintainers.

----------
nosy: +eric.araujo



report at bugs

Apr 6, 2012, 9:44 AM

Post #10 of 12 (194 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

Ronald Oussoren <ronaldoussoren [at] mac> added the comment:

I (as one of the Mac maintainers) like the new functionality, but would like to see some changes:

1) as others have noted it is odd that binary and json plists can be read but not written

2) there need to be tests, and I'd add two or even three sets of tests:

a. tests that read pre-generated files in the various formats
(tests that we're compatible with the format generated by Apple)

b. tests that use Apple tools to generate plists in various formats,
and check that the library can read them
(these tests would be skipped on platforms other than OS X)

c. if there are read and write functions: check that the writer
generates files that can be read back in.

3) there is a new public function for reading binary plist files,
I'd keep that private and add a "format" argument to readPlist
when there is a need for forcing the usage of a specific format
(and to mirror the (currently hypothetical) format argument for
writePlist).
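
As an illustration of test type (c), here is a round-trip sketch in Python 3. Note that it is written against the `plistlib` module as it exists in current Python 3 releases (`dumps`, `loads`, `FMT_XML`, `FMT_BINARY`, available since 3.4), which postdates this discussion; it is not the API of the patch under review. The sample data and the class name are made up for the example.

```python
import datetime
import plistlib
import unittest

# Hypothetical sample covering the main plist value types.
SAMPLE = {
    "name": "example",
    "count": 3,
    "ratio": 0.5,
    "enabled": True,
    "blob": b"\x00\x01",
    "created": datetime.datetime(2012, 3, 30, 14, 56),
    "items": [1, "two", {"three": 3}],
}


class RoundTripTest(unittest.TestCase):
    def test_round_trip(self):
        # Whatever the writer produces, the reader must give back.
        for fmt in (plistlib.FMT_XML, plistlib.FMT_BINARY):
            with self.subTest(fmt=fmt):
                data = plistlib.dumps(SAMPLE, fmt=fmt)
                self.assertEqual(plistlib.loads(data), SAMPLE)
```

Tests of type (a) would follow the same pattern with pre-generated files in place of `plistlib.dumps`.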

Don't worry about rearchitecting plistlib. It might need work in that regard, but that need not be part of this issue, and it would make the changes harder to review. I'm also far from convinced that a redesign of the code is needed.

----------



report at bugs

Apr 6, 2012, 1:34 PM

Post #11 of 12 (197 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

d9pouces <python [at] 19pouces> added the comment:

I'm working on a class, BinaryPlistParser, which can both read and write binary files.

I've also added a parameter fmt to writePlist and readPlist, to specify the format ('json', 'xml1' or 'binary1', using XML by default). These constants are used by Apple for its plutil program.
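
A rough idea of how such a `fmt` argument could dispatch, sketched in Python 3. The function `dumps_plist` is hypothetical and is not the code of the patch; it maps the plutil-style names onto the `plistlib.FMT_XML` and `plistlib.FMT_BINARY` constants of current Python 3 releases and falls back to the `json` module for JSON.

```python
import json
import plistlib

# plutil-style format names mapped to today's plistlib constants.
_PLISTLIB_FORMATS = {
    "xml1": plistlib.FMT_XML,
    "binary1": plistlib.FMT_BINARY,
}


def dumps_plist(value, fmt: str = "xml1") -> bytes:
    """Serialize value in the requested plist flavor (XML by default)."""
    if fmt == "json":
        return json.dumps(value).encode("utf-8")
    if fmt not in _PLISTLIB_FORMATS:
        raise ValueError("unknown plist format: %r" % (fmt,))
    return plistlib.dumps(value, fmt=_PLISTLIB_FORMATS[fmt])
```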

I'm now working on integrating these three formats into test_plistlib.py. However, JSON is less expressive than the other two, since it cannot handle dates.
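
The limitation is easy to reproduce with the standard `json` module in Python 3: dates (and raw bytes) have no JSON representation, so encoding them raises `TypeError`. The helper name `json_can_encode` is hypothetical.

```python
import datetime
import json


def json_can_encode(value) -> bool:
    """Return True if the json module can serialize value as-is."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


# Strings, numbers, booleans, lists and dicts are fine.
plain = {"name": "example", "items": [1, 2.5, True, None]}
# A date value is not.
with_date = {"created": datetime.datetime(2012, 4, 6, 13, 34)}
```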

----------



report at bugs

Apr 8, 2012, 1:31 AM

Post #12 of 12 (185 views)
[issue14455] plistlib unable to read json and binary plist files [In reply to]

d9pouces <python [at] 19pouces> added the comment:

Here is the new patch, which can read and write binary, JSON and XML plist files.

It includes both the plistlib.py and test/test_plistlib.py patches.
The JSON format does not allow dates and data, so XML is used by default to write files.
I use the json library to write JSON plist files, but its output is slightly different from Apple's default output: dictionary keys are in a different order. Thus, I removed the test_appleformattingfromliteral test for JSON files.
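
When only the key order differs, one alternative to a byte-for-byte comparison is to compare the parsed content, or to serialize with sorted keys so the output is deterministic. This is a Python 3 sketch of that idea, not something the patch does; the function name `same_json_content` is hypothetical.

```python
import json


def same_json_content(first: str, second: str) -> bool:
    """Compare two JSON documents by value, ignoring key order."""
    return json.loads(first) == json.loads(second)


def canonical_json(value) -> str:
    """Serialize with sorted keys so equal values give equal text."""
    return json.dumps(value, sort_keys=True)
```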

Similarly, my binary writer does not write the same binary files as the Apple library: my library writes the content of compound objects (dicts, lists and sets) before the object itself, while Apple writes the object before its content. Copying the Apple behavior results in some additional weird lines of code, for little benefit. Thus, I also removed the test_appleformattingfromliteral test for binary files.

The other tests are run for all three formats.

----------
Added file: http://bugs.python.org/file25156/plistlib_with_test.diff
