
Mailing List Archive: Perl: porters

[perl #54040] Misparsing of sort comparison subroutine intention

 

 



perlbug-followup at perl

May 12, 2008, 8:44 AM

Post #1 of 11 (220 views)
[perl #54040] Misparsing of sort comparison subroutine intention

# New Ticket Created by Ken Williams
# Please include the string: [perl #54040]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=54040 >


To: perlbug[at]perl.org
Subject: Misparsing of sort subroutine intention
Reply-To: ken.williams[at]thomson.com
Message-Id: <5.8.6_1048_1210604177[at]mailbo2.westgroup.com>

This is a bug report for perl from ken.williams[at]thomson.com,
generated with the help of perlbug 1.35 running under perl v5.8.6.


-----------------------------------------------------------------
[Please enter your report here]

I was surprised by the following behavior of the parser, which
misinterprets the union() subroutine call as a SUBNAME comparison
routine for the sort() function:

==================
% cat c2.pl
use strict;
my %one = qw(a 1 b 2 c 3);
my %two = qw(a 1 c 2 e 3);
sub union {
my %h;
$h{$_}++ for @_;
keys %h;
}
foreach my $k (sort union(keys(%one), keys(%two)) ) {
print "$k\n";
}

% perl c2.pl
c
a
b
e
c
a

% perl -MO=Deparse,-p c2.pl
use strict 'refs';
(my(%one) = ('a', '1', 'b', '2', 'c', '3'));
(my(%two) = ('a', '1', 'c', '2', 'e', '3'));
sub union {
my(%h);
(++$h{$_}) foreach (@_);
keys(%h);
}
foreach my($k) (sort union keys(%one), keys(%two)) {
print("$k\n");
}
c2.pl syntax OK
==================

Of course, I can use explicit parens for the sort() function, but I
thought the parens I'm using for the union() function should have been
sufficient.

I verified that this same behavior exists with my copy of perl 5.10.0
also.
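
A minimal sketch of the workaround mentioned above, assuming the same %one, %two and union() as in c2.pl. Giving sort() its own parentheses makes the parser read union(...) as an ordinary call. The @sorted variable is introduced here only for illustration.

```perl
use strict;
use warnings;

my %one = qw(a 1 b 2 c 3);
my %two = qw(a 1 c 2 e 3);

# Return each distinct argument once (order unspecified).
sub union {
    my %h;
    $h{$_}++ for @_;
    return keys %h;
}

# Parentheses directly after "sort" mean union(...) is parsed as a
# function call whose return list is sorted, not as a SUBNAME.
my @sorted = sort(union(keys(%one), keys(%two)));

print "$_\n" for @sorted;
```

With this spelling each key is printed once, in sorted order, instead of the six unsorted lines shown above.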

-Ken

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl v5.8.6:

Configured by root at Wed Nov 1 16:59:38 PST 2006.

Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
Platform:
osname=darwin, osvers=8.0, archname=darwin-thread-multi-2level
uname='darwin b19.apple.com 8.0 darwin kernel version 8.3.0: mon oct 3
20:04:04 pdt 2005; root:xnu-792.6.22.obj~2release_ppc power macintosh
powerpc '
config_args='-ds -e -Dprefix=/usr -Dccflags=-g -pipe
-Dldflags=-Dman3ext=3pm -Duseithreads -Duseshrplib'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp
-fno-strict-aliasing -I/usr/local/include',
optimize='-O3',
cppflags='-no-cpp-precomp -g -pipe -fno-common -DPERL_DARWIN
-no-cpp-precomp -fno-strict-aliasing -I/usr/local/include'
ccversion='', gccversion='4.0.1 (Apple Computer, Inc. build 5363)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='-L/usr/local/lib'
libpth=/usr/local/lib /usr/lib
libs=-ldbm -ldl -lm -lc
perllibs=-ldl -lm -lc
libc=/usr/lib/libc.dylib, so=dylib, useshrplib=true,
libperl=libperl.dylib
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags='-bundle -undefined dynamic_lookup
-L/usr/local/lib'

Locally applied patches:
23953 - fix for File::Path::rmtree CAN-2004-0452 security issue
33990 - fix for setuid perl security issues
SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962

---
@INC for perl v5.8.6:
/System/Library/Perl/5.8.6/darwin-thread-multi-2level
/System/Library/Perl/5.8.6
/Library/Perl/5.8.6/darwin-thread-multi-2level
/Library/Perl/5.8.6
/Library/Perl
/Network/Library/Perl/5.8.6/darwin-thread-multi-2level
/Network/Library/Perl/5.8.6
/Network/Library/Perl
/System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level
/System/Library/Perl/Extras/5.8.6
/Library/Perl/5.8.1
.

---
Environment for perl v5.8.6:
DYLD_LIBRARY_PATH (unset)
HOME=/Users/u0048513
LANG (unset)
LANGUAGE (unset)
LC_ALL=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)

PATH=/bin:/etc:/usr/bin:/usr/etc:/usr/sbin:/sbin:/Users/u0048513/bin:/usr/lo
cal/bin:/sw/bin:/Users/u0048513/p4/tools/FileTools:/Users/u0048513/p4/tools/
perforce:.
PERL_BADLANG (unset)
SHELL=/bin/zsh

--
Ken Williams
Research Scientist
The Thomson Corporation
Eagan, MN


davidnicol at gmail

May 12, 2008, 2:38 PM

Post #2 of 11 (211 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

On Mon, May 12, 2008 at 3:44 PM, via RT Ken Williams <
perlbug-followup[at]perl.org> wrote:
>
> (sort union(keys(%one), keys(%two)) ) {
>


Considering that your didn't-do-what-you-expected example syntactically
matches the examples in perldoc -f sort such as

@sortedclass = sort byage @class;

it would be very difficult to consider this a bug, even a documentation bug.
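
For comparison, here is a sketch of the documented SUBNAME form that the report collides with. The %age data and the body of byage are made up for illustration; only the "sort byage @class" line comes from perldoc.

```perl
use strict;
use warnings;

# Hypothetical data: ages keyed by student name.
my %age   = (alice => 31, bob => 25, carol => 28);
my @class = keys %age;

# Comparison routine: sort calls it with $a and $b set, without arguments.
sub byage { $age{$a} <=> $age{$b} }

# SUBNAME LIST: "byage" names the comparator, @class is the list.
my @sortedclass = sort byage @class;

# Parentheses around the list do not change that reading.
my @same = sort byage(@class);

print "@sortedclass\n";
print "@same\n";
```

Both statements use byage as the comparator, which is why the parser cannot tell "sort union(...)" apart from them.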

We could add a warning in perlfunc.pod:

depending on how the elements of the list are to be ordered. (The C<< <=>
>> and C<cmp> operators are extremely useful in such routines.) SUBNAME may
be a scalar variable name (unsubscripted), in which case the value provides
the name of (or a reference to) the actual subroutine to use. In place of a
SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.

(BEGIN PROPOSED ADDITION)
Warning: the arguments to C<sort> violate the "looks like a function call"
rule for associating a subroutine name and its arguments, as C<sort> will
bind the subroutine as SUBNAME rather than running it as a list generator.
To have a list generated by a subroutine as your sorted list, wrap the whole
expression in parentheses.
(END PROPOSED ADDITION)

If the subroutine's prototype is C<($$)>, the elements to be compared are
passed by reference in C<@_>, as for a normal subroutine. This is slower
than unprototyped subroutines, where the elements to be compared are passed
into the subroutine






Furthermore, is anyone for modifying the sort section in perlfunc.pod to
warn that the consequence of using sort in scalar context is that Randal
Schwartz will appear in your dreams?


pagaltzis at gmx

May 14, 2008, 4:52 PM

Post #3 of 11 (198 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

* David Nicol <davidnicol[at]gmail.com> [2008-05-12 23:40]:
> (BEGIN PROPOSED ADDITION)
> Warning: the arguments to C<sort> violate the "looks like a
> function call" rule for associating a subroutine name and its
> arguments, as C<sort> will bind the subroutine as SUBNAME
> rather than running it as a list generator. To have a list
> generated by a subroutine as your sorted list, wrap the whole
> expression in parentheses.
> (END PROPOSED ADDITION)

++ on the idea, but the wording is awkward. This has bitten me
badly before.

> Furthermore, is anyone for modifying the sort section in
> perlfunc.pod to warn that the consequence of using sort in
> scalar context is that Randal Schwartz will appear in your
> dreams?

++

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>


davidnicol at gmail

May 15, 2008, 12:07 PM

Post #4 of 11 (192 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

On Wed, May 14, 2008 at 6:52 PM, Aristotle Pagaltzis <pagaltzis[at]gmx.de> wrote:
> * David Nicol <davidnicol[at]gmail.com> [2008-05-12 23:40]:
>> (BEGIN PROPOSED ADDITION)
>> Warning: the arguments to C<sort> violate the "looks like a
>> function call" rule for associating a subroutine name and its
>> arguments, as C<sort> will bind the subroutine as SUBNAME
>> rather than running it as a list generator. To have a list
>> generated by a subroutine as your sorted list, wrap the whole
>> expression in parentheses.
>> (END PROPOSED ADDITION)
>
> ++ on the idea, but the wording is awkward. This has bitten me
> badly before.

would an example help?

expression in parentheses.

@Contactlist = sort (TubaPlayersIn("Holland"));
# alphabetize all tuba players in Holland


pagaltzis at gmx

May 15, 2008, 2:40 PM

Post #5 of 11 (188 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

* David Nicol <davidnicol[at]gmail.com> [2008-05-15 21:10]:
> On Wed, May 14, 2008 at 6:52 PM, Aristotle Pagaltzis <pagaltzis[at]gmx.de> wrote:
> > * David Nicol <davidnicol[at]gmail.com> [2008-05-12 23:40]:
> >> (BEGIN PROPOSED ADDITION)
> >> Warning: the arguments to C<sort> violate the "looks like a
> >> function call" rule for associating a subroutine name and
> >> its arguments, as C<sort> will bind the subroutine as
> >> SUBNAME rather than running it as a list generator. To have
> >> a list generated by a subroutine as your sorted list, wrap
> >> the whole expression in parentheses.
> >> (END PROPOSED ADDITION)
> >
> > ++ on the idea, but the wording is awkward. This has bitten
> > me badly before.
>
> would an example help?
>
> expression in parentheses.
>
> @Contactlist = sort (TubaPlayersIn("Holland"));
> # alphabetize all tuba players in Holland

Yes, though not by itself. Your explanation is written
deductively with a touch of jargon; I think it needs to be
task-oriented and worded more simply:

B<Warning:> if you intend to sort the list returned from
a function, you need to use either two sets of parentheses or
the function call sigil:

@contact = sort(find_records(@key));
@contact = sort &find_records(@key);

In contrast, both of the following will do something
completely different:

@contact = sort find_records(@key); # WRONG
@contact = sort(find_records @key); # WRONG

Both of these might look like they contain function calls
like the previous examples, but in fact the parser will grab
the bareword following the "C<sort>" token to use it as a
comparison function. So instead of sorting the results of
C<find_records> called with the contents of C<@key> as its
arguments, these will sort C<@key> using C<find_records> as
a comparator function.

If anyone can condense this a little, that would help, though
I fear it’s not possible to remove too much without loss of
clarity.
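
The difference can be seen by running both spellings side by side. This is a sketch only: find_records here is a stand-in that upper-cases its arguments, and the $calls_with_args counter exists just to show whether it was called as a function.

```perl
use strict;
use warnings;

my $calls_with_args = 0;

# Stand-in for a record lookup: returns one "record" per key.
# Used as a comparator it receives no arguments, so the map is empty
# and, in scalar context, yields 0 (meaning "equal").
sub find_records {
    $calls_with_args++ if @_;
    return map { uc $_ } @_;
}

my @key = qw(pear apple fig);

my @right = sort(find_records(@key));   # sorts the returned records
my @wrong = sort find_records(@key);    # sorts @key; find_records compares

print "right: @right\n";   # upper-cased records, sorted
print "wrong: @wrong\n";   # the original keys, never upper-cased
print "called with arguments $calls_with_args time(s)\n";
```

Only the first statement passes @key to find_records; in the second the keys themselves come back, because the stand-in comparator reports every pair as equal.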

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>


p5p at perl

May 15, 2008, 2:47 PM

Post #6 of 11 (188 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

> B<Warning:> if you intend to sort the list returned from
> a function, you need to use either two sets of parentheses or
> the function call sigil:
>
> @contact = sort(find_records(@key));
> @contact = sort &find_records(@key);

TMTOWTDI:
@contact = sort { $a cmp $b } find_records(@key);


davidnicol at gmail

May 15, 2008, 2:55 PM

Post #7 of 11 (188 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

On Thu, May 15, 2008 at 4:40 PM, Aristotle Pagaltzis <pagaltzis[at]gmx.de> wrote:

B<Warning:> if you intend to sort the list returned from
a function, you need to use either two sets of parentheses or
the function call sigil or make the sort function explicit:

@contact = sort(find_records(@key));
@contact = sort &find_records(@key);
@contact = sort { $a cmp $b } find_records(@key);

In contrast, both of the following

@contact = sort find_records(@key); # WRONG
@contact = sort(find_records @key); # WRONG

will do something completely different, because the parser
interprets the bareword following the "C<sort>" token
as SUBNAME.


> If anyone can condense this a little, that would help, though
> I fear it's not possible to remove too much without loss of
> clarity.

I took out the redundant description of what SUBNAME is all about,
and added Bram's third method.


abigail at abigail

May 15, 2008, 3:21 PM

Post #8 of 11 (188 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

On Thu, May 15, 2008 at 11:40:10PM +0200, Aristotle Pagaltzis wrote:
> * David Nicol <davidnicol[at]gmail.com> [2008-05-15 21:10]:
> > On Wed, May 14, 2008 at 6:52 PM, Aristotle Pagaltzis <pagaltzis[at]gmx.de> wrote:
> > > * David Nicol <davidnicol[at]gmail.com> [2008-05-12 23:40]:
> > >> (BEGIN PROPOSED ADDITION)
> > >> Warning: the arguments to C<sort> violate the "looks like a
> > >> function call" rule for associating a subroutine name and
> > >> its arguments, as C<sort> will bind the subroutine as
> > >> SUBNAME rather than running it as a list generator. To have
> > >> a list generated by a subroutine as your sorted list, wrap
> > >> the whole expression in parentheses.
> > >> (END PROPOSED ADDITION)
> > >
> > > ++ on the idea, but the wording is awkward. This has bitten
> > > me badly before.
> >
> > would an example help?
> >
> > expression in parentheses.
> >
> > @Contactlist = sort (TubaPlayersIn("Holland"));
> > # alphabetize all tuba players in Holland
>
> Yes, though not by itself. Your explanation is written
> deductively with a touch of jargon; I think it needs to be
> task-oriented and worded more simply:
>
> B<Warning:> if you intend to sort the list returned from
> a function, you need to use either two sets of parentheses or
> the function call sigil:
>
> @contact = sort(find_records(@key));
> @contact = sort &find_records(@key);

The simplest way may be to use unary +:

@contact = sort +find_record (@key);


Alternatively, one may be explicit by mentioning the sorting block:

@contact = sort {$a <=> $b} find_record (@key);
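
Note that the two explicit blocks seen in this thread are not interchangeable: cmp compares as strings (the same order plain sort uses), while <=> compares numerically. A small sketch, assuming a hypothetical find_record that returns numeric IDs and ignores its arguments:

```perl
use strict;
use warnings;

# Hypothetical lookup returning numeric record IDs.
sub find_record { return (10, 9, 2, 33) }

my @key = ();   # unused by this stand-in

my @as_strings = sort { $a cmp $b } find_record(@key);
my @as_numbers = sort { $a <=> $b } find_record(@key);

print "@as_strings\n";   # 10 2 33 9
print "@as_numbers\n";   # 2 9 10 33
```

Either block removes the parsing ambiguity; which operator to use depends on the data being sorted.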

>
> In contrast, both of the following will do something
> completely different:
>
> @contact = sort find_records(@key); # WRONG
> @contact = sort(find_records @key); # WRONG
>
> Both of these might look like they contain function calls
> like the previous examples, but in fact the parser will grab
> the bareword following the "C<sort>" token to use it as a
> comparison function. So instead of sorting the results of
> C<find_records> called with the contents of C<@key> as its
> arguments, these will sort C<@key> using C<find_records> as
> a comparator function.
>
> If anyone can condense this a little, that would help, though
> I fear it's not possible to remove too much without loss of
> clarity.


Abigail


pagaltzis at gmx

May 16, 2008, 2:56 AM

Post #9 of 11 (176 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

* Bram <p5p[at]perl.wizbit.be> [2008-05-15 23:50]:
>
>> B<Warning:> if you intend to sort the list returned from
>> a function, you need to use either two sets of parentheses or
>> the function call sigil:
>>
>> @contact = sort(find_records(@key));
>> @contact = sort &find_records(@key);
>
> TMTOWTDI:
> @contact = sort { $a cmp $b } find_records(@key)

Yeah, but I think of that as a workaround rather than a
disambiguation. Still, it could be mentioned.


* Abigail <abigail[at]abigail.be> [2008-05-16 00:25]:
> The simplest way may be to use unary +:
>
> @contact = sort +find_record (@key);

Ohh, yes, hadn’t thought of that; much nicer.


* David Nicol <davidnicol[at]gmail.com> [2008-05-16 00:00]:
> I took out the redundant description of what SUBNAME is all
> about, and added Bram's third method.

Good point, I hadn’t considered the context.

> B<Warning:> if you intend to sort the list returned from
> a function, you need to use either two sets of parentheses or
> the function call sigil or make the sort function explicit:
>
> @contact = sort(find_records(@key));
> @contact = sort &find_records(@key);
> @contact = sort { $a cmp $b } find_records(@key)
>
> In contrast, both of the following
>
> @contact = sort find_records(@key); # WRONG
> @contact = sort(find_records @key); # WRONG
>
> will do something completely different, because the parser
> interprets the bareword following the "C<sort>" token
> as SUBNAME.

B<Warning:> if you intend to sort the list returned from a
function, you need to either use two sets of parentheses or
otherwise disambiguate the function call:

@contact = sort(find_records(@key));
@contact = sort +find_records(@key);
@contact = sort &find_records(@key);

Alternatively you can use an explicit comparator block. In
contrast, both of the following will do something completely
different:

@contact = sort find_records(@key); # WRONG
@contact = sort(find_records @key); # WRONG

Here the parser interprets the bareword following the
"C<sort>" token as SUBNAME and takes C<@key> as the list to
sort.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>


davidnicol at gmail

May 16, 2008, 11:46 AM

Post #10 of 11 (173 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

B<Warning:> Care is required when sorting the list returned
from a function. Here are four ways to do it:

@contact = sort(find_records(@key));
@contact = sort +find_records(@key);
@contact = sort &find_records(@key);
@contact = sort { $a cmp $b } find_records @key;

In both of the following, the parser will interpret
the bareword following the "C<sort>" token as SUBNAME
and C<@key> as LIST.

@contact = sort find_records(@key); # WRONG
@contact = sort(find_records @key); # WRONG


davidnicol at gmail

May 16, 2008, 1:43 PM

Post #11 of 11 (173 views)
Re: [perl #54040] Misparsing of sort comparison subroutine intention [In reply to]

We could put a double-parenthesized version in the WRONG section too, or
advise against using parens.
Is this supposed to happen?

Whitespace sensitivity in Cygwin perl 5.8.8

$ perl -lwe '@A = 1 .. 9; sub shuffle { @_[0,3,1] = @_[3,1,0]; @_; }
print shuffle @A; print sort ( shuffle( @A))'
sort (...) interpreted as function at -e line 1.
413256789
123456789

$ perl -lwe '@A = 1 .. 9; sub shuffle { @_[0,3,1] = @_[3,1,0]; @_; }
print shuffle @A; print sort ( shuffle ( @A))'
sort (...) interpreted as function at -e line 1.
Unquoted string "shuffle" may clash with future reserved word at -e line 1.
413256789
987652314

The second result is reversed because that is what happens when the sort
comparison function always returns 2.
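
A sketch of the same whitespace effect with a comparator-safe stand-in, so the outcome does not depend on what the comparison happens to return. find_records and @key are hypothetical; the parse shown matches what the one-liners above report for perl 5.8.8 and is worth re-checking on other versions.

```perl
use strict;
use warnings;

# Stand-in: returns upper-cased copies of its arguments. With no
# arguments (as when used as a comparator) it yields 0 in scalar context.
sub find_records { return map { uc $_ } @_ }

my @key = qw(pear apple fig);

# No space before the inner "(": parsed as a call to find_records(@key).
my @called = sort(find_records(@key));

# Space before the inner "(": find_records is taken as SUBNAME
# and (@key) as the list to sort.
my @compared = sort(find_records (@key));

print "called:   @called\n";
print "compared: @compared\n";
```

The first line prints the upper-cased records; the second prints the original keys, showing that a single space changes which reading the parser picks.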
