
Mailing List Archive: Perl: porters

[perl #114452] [adi@hexapodia.org: Bug#681849: perl: regex negative lookbehind does not work before $]

 

 



perlbug-followup at perl

Aug 11, 2012, 4:57 AM

Post #1 of 3 (66 views)
[perl #114452] [adi@hexapodia.org: Bug#681849: perl: regex negative lookbehind does not work before $]

# New Ticket Created by Dominic Hargreaves
# Please include the string: [perl #114452]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=114452 >


I can confirm that the behaviour is the same on current blead, so
forwarding here initially.

Thanks,
Dominic.

----- Forwarded message from Andy Isaacson <adi [at] hexapodia> -----

Date: Mon, 16 Jul 2012 23:06:05 -0700
From: Andy Isaacson <adi [at] hexapodia>
To: Debian Bug Tracking System <submit [at] bugs>
Subject: Bug#681849: perl: regex negative lookbehind does not work before $

Package: perl
Version: 5.14.2-6
Severity: normal

Dear Maintainer,

The negative look-behind assertion does not work correctly before $ (the
end-of-line assertion).

I expect to be able to say "match lines that do not end in bar"
using the regex /(?<!bar)$/ . However this does not work:

% (echo foo; echo bar; echo foobaz) | perl -ne 'print if(/(?<!bar)$/)'
foo
bar
foobaz
%

It should not have printed "bar" above.

A similar pattern using /^(?!bar)/ works to say "lines that do not
start with bar", and negative look-behind works before a string:

# negative lookahead
% (echo foo; echo bar; echo foobaz) | perl -ne 'print if(/^(?!foo)/)'
bar
%

# negative lookbehind before string
% (echo foo; echo bar; echo foobaz) | sed 's/$/x/g' | \
perl -ne 'print if(/(?<!bar)x$/)'
foox
foobazx
%

I found a workaround that may shed light on the root cause of the
problem. Normally /$/ matches the end of a string or the line-ending
character at the end of a <> string, and regex behavior with $ is not
changed by chomp()ing the line-ending-character away. But in this case,
there is a difference. If I "chomp;" before matching, the negative
look-behind assertion works correctly:

% (echo foo; echo bar; echo foobaz) | perl -ne 'chomp; print if(/(?<!bar)$/)'
foofoobaz
%

Note that it did not print "bar" above, correctly implementing the behavior
documented in perlre(1).


-- System Information:
Debian Release: wheezy/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.4.0-rc4-00095-g95f7147 (SMP w/4 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages perl depends on:
ii libbz2-1.0 1.0.6-1
ii libc6 2.13-26
ii libdb5.1 5.1.29-1
ii libgdbm3 1.8.3-10
ii perl-base 5.14.2-6
ii perl-modules 5.14.2-6
ii zlib1g 1:1.2.3.4.dfsg-3

Versions of packages perl recommends:
ii netbase 4.47

Versions of packages perl suggests:
ii libterm-readline-gnu-perl | libterm-readline-perl-perl <none>
ii make 3.81-8.1
ii perl-doc 5.14.2-6

-- no debconf information

----- End forwarded message -----

--
Dominic Hargreaves | http://www.larted.org.uk/~dom/
PGP key 5178E2A5 from the.earth.li (keyserver,web,email)


perlbug-followup at perl

Aug 11, 2012, 12:53 PM

Post #2 of 3 (57 views)
[perl #114452] [adi@hexapodia.org: Bug#681849: perl: regex negative lookbehind does not work before $]

On Sat Aug 11 04:57:51 2012, dom wrote:
> I found a workaround that may shed light on the root cause of the
> problem. Normally /$/ matches the end of a string

And since it can match the end of a string, "bar\n" =~ /(?<!bar)$/ will
match successfully after the \n. So this is not a bug.

> or the line-ending
> character at the end of a <> string,

No. When it does not match at the very end of a string, it matches the
position right before the final newline. /$/ is equivalent to /(?=\n?\z)/.

If you don’t want it to match at the very end of an unchomped string,
use /(?<!bar)(?<!\n)$/.

> and regex behavior with $ is not
> changed by chomp()ing the line-ending-character away. But in this
> case,
> there is a difference. If I "chomp;" before matching, the negative
> look-behind assertion works correctly:

As I would expect.

--

Father Chrysostomos


---
via perlbug: queue: perl5 status: new
https://rt.perl.org:443/rt3/Ticket/Display.html?id=114452


davem at iabyn

Aug 11, 2012, 12:59 PM

Post #3 of 3 (60 views)
Re: [perl #114452] [adi@hexapodia.org: Bug#681849: perl: regex negative lookbehind does not work before $]

On Sat, Aug 11, 2012 at 04:57:51AM -0700, Dominic Hargreaves wrote:
> % (echo foo; echo bar; echo foobaz) | perl -ne 'print if(/(?<!bar)$/)'
> foo
> bar
> foobaz
> %
>
> It should not have printed "bar" above.

Not a bug.

'$' is documented to match either at the end of a string, or before the \n
at the end of a string:

From perlre.pod:

$ Match the end of the line (or before newline at the end)

e.g.

$ perl -E 'say "match" if "abc\n" =~ /abc$/'
match
$ perl -E 'say "match" if "abc\n" =~ /abc\n$/'
match
$

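In the case of "bar\n" =~ /(?<!bar)$/, the match happens at the very end of
the string: the three characters before that position are "ar\n", which is
not equal to "bar", so the lookbehind succeeds, and $ then matches at
end-of-string.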

--
The crew of the Enterprise encounter an alien life form which is
surprisingly neither humanoid nor made from pure energy.
-- Things That Never Happen in "Star Trek" #22
