
jpiitula at ling
Apr 30, 2012, 11:14 PM
Re: Trouble splitting strings with consecutive delimiters
deuteros writes:

> I'm using regular expressions to split a string using multiple
> delimiters. But if two or more of my delimiters occur next to each
> other in the string, it puts an empty string in the resulting
> list. For example:
>
> re.split(':|;|px', "width:150px;height:50px;float:right")
>
> Results in
>
> ['width', '150', '', 'height', '50', '', 'float', 'right']
>
> Is there any way to avoid getting '' in my list without adding px;
> as a delimiter?

You could use a sequence of such delimiters.

    >>> re.split('(?::|;|px)+', "width:150px;height:50px;float:right")
    ['width', '150', 'height', '50', 'float', 'right']

Consider splitting twice instead: first into key-value substrings at
semicolons, and those into key-value pairs at colons. Here as a dict.
Better handle the units after that.

    >>> dict(kv.split(':') for kv in "width:150px;height:50px;float:right".split(';'))
    {'width': '150px', 'float': 'right', 'height': '50px'}

You might also want to accept whitespace as part of the delimiters.
(There might be a parser for such data formats somewhere in the
library already. CSV?)

-- 
http://mail.python.org/mailman/listinfo/python-list
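
The suggestions above (a run of delimiters, splitting twice, tolerating whitespace, handling the units afterwards) could be combined roughly as follows. This is only a sketch, assuming Python 3.4 or later for re.fullmatch. The helper names split_fields, parse_declarations and strip_px, and the sample string with extra whitespace and a trailing semicolon, are made up for illustration and do not come from the thread.

```python
import re

def split_fields(text):
    """Split on runs of ':', ';' or 'px', allowing whitespace around them."""
    # The (?:...)+ group treats consecutive delimiters such as 'px;' as
    # a single separator, so no '' appears between them.
    parts = re.split(r'\s*(?:(?::|;|px)\s*)+', text.strip())
    # A trailing delimiter still produces a final '', so drop empties.
    return [p for p in parts if p]

def parse_declarations(text):
    """Split at semicolons first, then at the first colon of each piece."""
    result = {}
    for item in text.split(';'):
        if not item.strip():
            continue  # skip empty pieces, e.g. after a trailing ';'
        key, _, value = item.partition(':')
        result[key.strip()] = value.strip()
    return result

def strip_px(value):
    """Turn '150px' into 150; leave any other value unchanged."""
    m = re.fullmatch(r'(\d+)px', value)
    return int(m.group(1)) if m else value

# Hypothetical input, messier than the one in the question.
style = "width: 150px; height:50px ;float:right;"

print(split_fields(style))
# ['width', '150', 'height', '50', 'float', 'right']

print({k: strip_px(v) for k, v in parse_declarations(style).items()})
# {'width': 150, 'height': 50, 'float': 'right'}
```

The two-stage version keeps each key attached to its value and leaves the unit in place until you decide what to do with it. The single regex, by contrast, would also split inside any key or value that happens to contain "px".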