
public at khwilliamson
Aug 4, 2012, 9:23 PM
Post #4 of 4
(52 views)
Permalink
|
|
Re: Why should these two regexes behave differently?
[In reply to]
|
|
Never mind. My folly is becoming apparent to me. On 08/04/2012 09:10 PM, Karl Williamson wrote: > On 08/04/2012 05:24 PM, Lukas Mai wrote: >> On 2012-08-04 Karl Williamson wrote: >> >>> I'm trying to start to understand the backtracking control verbs. I >>> don't understand why these two things differ: >>> >>> blead -E 'say "foo" =~ / (?: (?! foo | bar ) ) . /x || 0;' >>> 1 >>> >>> blead -E 'say "foo" =~ / (?: (?= foo | bar ) (*F) ) . /x || 0;' >>> 0 >> >> I don't understand why you think these two regexes should be similar. >> >> >> [Regex #1] >> >> Let's simplify a bit: >> >> / (?: (?! foo | bar ) ) . /x >> >> # eliminate /x >> /(?:(?!foo|bar))./ >> >> # remove redundant (?: ) >> /(?!foo|bar)./ >> >> => "at a position not followed by foo or bar, match a non-newline >> character" >> >> Our string is "foo". >> >> First we attempt to match at offset 0. This fails because offset 0 is >> indeed followed by "foo". >> >> Next we attempt to match at offset 1. This succeeds because offset 1 is >> followed by "oo", which is matched by neither "foo" nor "bar". Then we >> match a non-newline character ("o"). We've run out of regex: success. >> >> Complete success at offset 1. >> >> >> [Regex #2] >> >> Simplify: >> >> / (?: (?= foo | bar ) (*F) ) . /x >> >> # eliminate /x >> /(?:(?=foo|bar)(*F))./ >> >> # remove redundant (?: ) >> /(?=foo|bar)(*F)./ > > I don't believe the (?: ) is redundant. The *F applies to its subpattern. > >> >> => "at a position followed by foo or bar, fail, then match a >> non-newline character" >> >> This regex can never succeed because it contains a non-optional (*F) >> (and by that I mean every possible path through the regex reaches >> (*F)). Reaching (*F) makes the current match attempt fail. >> >> With our string of "foo": >> >> Attempt to match at offset 0. (?=foo|bar) succeeds because offset 0 is >> indeed followed by "foo". Then we hit (*F) and fail. There are no >> recorded backtracking positions, so this offset fails completely. >> >> Attempt to match at offset 1, 2, 3. We immediately fail because >> (?=foo|bar) fails. >> >> We've run out of possible match positions and fail the whole match. >> >
|