
abigail at abigail
May 20, 2008, 6:39 AM
Post #20 of 23
On Tue, May 20, 2008 at 03:09:58PM +0200, Aristotle Pagaltzis wrote:
> * Abigail <abigail[at]abigail.be> [2008-05-20 01:25]:
> > On Tue, May 20, 2008 at 12:22:00AM +0200, demerphq wrote:
> > > 2008/5/19 Aristotle Pagaltzis <pagaltzis[at]gmx.de>:
> > > > I tried to contrive such an example myself, but failed. I
> > > > was unable to think of a single task in which I could use
> > > > the empty pattern where I didn't also clearly prefer some
> > > > other way of achieving the same intent.
> > >
> > > Erm, so what did you come up with for my example?
> > >
> > >     if (/$pat1/ or /$pat2/ or /$pat3/) {
> > >         $hash{$1}=$_;
> > >         s///;
> > >     }
> >
> > I'd write that as:
> >
> >     my $copy = $_;
> >     $hash {$1} = $copy if s/$pat1// or
> >                           s/$pat2// or
> >                           s/$pat3//;
>
> Exactly. And if the behaviour of only creating a copy when a
> match is known to exist is desired, empty pattern can be emulated
> directly with qr//, as I said:
>
>     my $str = \$_;
>     my @rx = ( qr/$pat1/, qr/$pat2/, qr/$pat3/ );
>     if ( my $m = List::Util::first { $$str =~ $_ } @rx ) {
>         $hash{$1} = $_;
>         s/$m//;
>     }
>
> Again, it takes more work, but I consider this a feature. (Less
> so in this case than in my previous mail, but still.)

My version has the disadvantage of making a potentially unnecessary copy,
but it has the advantage of not running the same pattern twice (and hence
eliminating the need for m//). Which is more efficient will depend on the
size of the string and the complexity of the pattern.

I guess the following neither makes an unnecessary copy, nor runs a
pattern twice:

    if (/$pat1/ or /$pat2/ or /$pat3/) {
        $hash {$1} = $_;
        substr $hash {$1}, $- [0], $+ [0] - $- [0], "";
    }

> [...] This would be *trivial*, btw, if we had a flag to ask `s///`
> to return a modified copy instead of modifying in situ. Assuming
> it was called `/R` and returned undef on failure (which I would
> advocate, as it's very easy to check for this and use the
> original string instead if that's what you want (particularly
> since 5.10)), this could be written thus:
>
>     if ( my $cleaned = s!$pat1!!R // s!$pat2!!R // s!$pat3!!R ) {
>         ( $hash{$1}, $_ ) = ( $_, $cleaned );
>     }
>
> I *far* prefer this over any other variant. It's shorter and
> expresses the entire intent directly, and yet it will still
> allocate memory for a copy only if the substitution succeeds. And
> not only is it cleaner, it's also potentially much faster as it
> will only ever attempt any one match once, regardless of success
> or failure.

It may be shorter, but as written, it's broken; it will fail if one of
the patterns matches the entire string - or the entire string except for
a leading or trailing 0 - because the resulting "" or "0" is false.
You ought to write it as:

    if (defined (my $cleaned = s!$pat1!!R // s!$pat2!!R // s!$pat3!!R)) {
        ($hash {$1}, $_) = ($_, $cleaned);
    }

or

    {
        my $cleaned = s!$pat1!!R // s!$pat2!!R // s!$pat3!!R // last;
        ($hash {$1}, $_) = ($_, $cleaned);
    }

neither of which is that nice.

And then there's this, where s///R doesn't help:

    for (my $i = 0; $i < @array1; $i ++) {
        say "Match on index $i" if $array1 [$i] =~ /pat/ &&
                                   $array2 [$i] =~ //;
    }

Of course, here the only gain of m// is the number of characters typed
(which is mostly useful for the command line and short scripts).


Abigail
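
The truthiness problem with the proposed `/R` can be sketched in runnable form. Since `/R` is only a proposal in this thread, the sketch assumes a hypothetical helper, `subst_copy`, that returns a modified copy or undef when the pattern fails to match. The two patterns and the wrapper subs `cleaned_by_truth` and `cleaned_by_defined` are likewise made up for illustration.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical stand-in for the proposed s///R: returns a modified copy
# of $str, or undef if $pat does not match. Not a core function.
sub subst_copy {
    my ($str, $pat) = @_;
    my $copy = $str;
    return $copy =~ s/$pat// ? $copy : undef;
}

# Illustrative patterns only; pass qr// objects so the pattern is never empty.
my ($pat1, $pat2) = (qr/foo/, qr/bar/);

# Plain truth test: a result of "" or "0" is mistaken for failure.
sub cleaned_by_truth {
    my ($str) = @_;
    if (my $cleaned = subst_copy($str, $pat1) // subst_copy($str, $pat2)) {
        return "matched, left '$cleaned'";
    }
    return "no match";
}

# defined() test: only undef (no pattern matched) counts as failure.
sub cleaned_by_defined {
    my ($str) = @_;
    if (defined(my $cleaned = subst_copy($str, $pat1) // subst_copy($str, $pat2))) {
        return "matched, left '$cleaned'";
    }
    return "no match";
}

# Compare both tests on an ordinary match, a whole-string match,
# a match that leaves a lone "0", and a string that matches nothing.
for my $str ("xfoo", "foo", "0foo", "baz") {
    printf "%-7s truth: %-20s defined: %s\n",
        "'$str'", cleaned_by_truth($str), cleaned_by_defined($str);
}
```

For "foo" and "0foo" the two subs disagree: the substitution succeeds, but what is left over is "" or "0", which the plain truth test treats as a failed match.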