Regexp and multiple groups (with repeats)

 

 

mrkafk at gmail

Nov 20, 2009, 4:03 AM

Post #1 of 3 (122 views)

Hello,

>>> r=re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')

>>> r.search(r'c:/tmp/spam/eggs').groups()
('/eggs',)

Obviously, I would like to capture all groups:
('/tmp', '/spam', '/eggs')

But it seems that re keeps only the last match of a repeated group. Is there
any way to capture every repetition of a group that is followed by a repeat,
i.e. (...)+ or (...)* ?

Even better would be:

('tmp', 'spam', 'eggs')

Yes, I know about re.split:

>>> re.split( r'(?:\w:)?[/\\]', r'c:/tmp/spam\\eggs/' )
['', 'tmp', 'spam', '', 'eggs', '']

My interest is more general in this case: how to capture many groups
with a repeat?

Regards,
mk


--
http://mail.python.org/mailman/listinfo/python-list


neilc at norwich

Nov 20, 2009, 7:52 AM

Post #2 of 3 (111 views)
Re: Regexp and multiple groups (with repeats)

On 2009-11-20, mk <mrkafk [at] gmail> wrote:
> Hello,
>
> >>> r=re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')
>
> >>> r.search(r'c:/tmp/spam/eggs').groups()
> ('/eggs',)
>
> Obviously, I would like to capture all groups:
> ('/tmp', '/spam', '/eggs')

You'll have to do something else, for example:

>>> s = re.compile(r'(?:[a-zA-Z]:)')
>>> n = re.compile(r'[\\/]\w+')
>>> m = s.match('c:/tmp/spam/eggs')
>>> n.findall(m.string[m.end():])
['/tmp', '/spam', '/eggs']
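
The same two-step idea can be wrapped in a function. This is only a minimal sketch: the names DRIVE, COMPONENT and path_components are hypothetical, and a capturing group is added around \w+ so that findall returns the components without the leading separator.

```python
import re

# Hypothetical names for illustration; the patterns mirror the session above.
DRIVE = re.compile(r'[a-zA-Z]:')        # drive prefix such as "c:"
COMPONENT = re.compile(r'[\\/](\w+)')   # one separator plus one captured name

def path_components(path):
    """Return the names after a drive prefix, or None if there is no prefix."""
    m = DRIVE.match(path)
    if m is None:
        return None
    # With a single capturing group, findall returns only that group's text,
    # so the separators are dropped. The second argument is the start position.
    return COMPONENT.findall(path, m.end())
```

With this sketch, path_components('c:/tmp/spam/eggs') returns ['tmp', 'spam', 'eggs'], and a string without a drive prefix returns None.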

--
Neil Cerutti


metolone+gmane at gmail

Nov 20, 2009, 8:03 AM

Post #3 of 3 (105 views)
Re: Regexp and multiple groups (with repeats)

"mk" <mrkafk [at] gmail> wrote in message news:he60ha$ivv$1 [at] ger
> Hello,
>
> >>> r=re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')
>
> >>> r.search(r'c:/tmp/spam/eggs').groups()
> ('/eggs',)
>
> Obviously, I would like to capture all groups:
> ('/tmp', '/spam', '/eggs')
>
> But it seems that re keeps only the last match of a repeated group. Is there
> any way to capture every repetition of a group that is followed by a repeat,
> i.e. (...)+ or (...)* ?
>
> Even better would be:
>
> ('tmp', 'spam', 'eggs')
>
> Yes, I know about re.split:
>
> >>> re.split( r'(?:\w:)?[/\\]', r'c:/tmp/spam\\eggs/' )
> ['', 'tmp', 'spam', '', 'eggs', '']
>
> My interest is more general in this case: how to capture many groups with
> a repeat?

re.findall is what you're looking for. Here are all the words not followed by
a colon:

>>> import re
>>> re.findall(r'(\w+)(?!:)', r'c:\tmp\spam/eggs')
['tmp', 'spam', 'eggs']
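
For the more general question, one possible approach with the standard re module is to capture the whole repeated run in a single group and then run findall over just that run. The sketch below assumes the drive-path example from this thread; WHOLE, PIECE and repeated_captures are hypothetical names.

```python
import re

# Hypothetical names for illustration.
# Group 1 captures the entire repeated run; the inner group is non-capturing.
WHOLE = re.compile(r'[a-zA-Z]:((?:[\\/]\w+)+)')
# One repetition: a separator followed by a captured name.
PIECE = re.compile(r'[\\/](\w+)')

def repeated_captures(text):
    """Return every repetition inside the first drive path found in text."""
    m = WHOLE.search(text)
    if m is None:
        return []
    # Re-scan only the text matched by the repeated part.
    return PIECE.findall(m.group(1))
```

Unlike the lookahead version, this only collects words that belong to a matched drive path, so surrounding text is ignored: repeated_captures('see c:/tmp/spam here') returns ['tmp', 'spam'].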

-Mark

