
Mailing List Archive: Perl: porters

[perl #47906] Split does not always strip trailing empty fields

 

 



perlbug-followup at perl

Nov 27, 2007, 9:30 PM

Post #1 of 4
[perl #47906] Split does not always strip trailing empty fields

# New Ticket Created by John Wiersba
# Please include the string: [perl #47906]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=47906 >


[Please enter your report here]

Question: Under which conditions does split() strip trailing null (empty)
fields? Apparently only when at least one of these conditions holds:

1) When the return value of split is used directly as the return value of
a function.
2) When the return value of split is assigned to a list of variables which
includes an array variable.

Test case showing that split() strips trailing null fields in some cases
but not in others:


#!/usr/bin/perl -w
use strict;

sub fn {
    split /:/, ":x:";
}
sub printit {
    $_ = defined $_ ? $_ : "UNDEF" for @_;
    local $" = "><";
    print "<$_>" for @_;
    print "\n";
}

my ($a, $b, $c, $d, @out);

($a, $b, $c ) = fn; printit $a, $b, $c;
($a, $b, $c ) = split /:/, ":x:"; printit $a, $b, $c;

($a, $b, $c, $d ) = fn; printit $a, $b, $c, $d;
($a, $b, $c, $d ) = split /:/, ":x:"; printit $a, $b, $c, $d;

($a, @out) = fn; printit $a, $#out, @out;
($a, @out) = split /:/, ":x:"; printit $a, $#out, @out;

($a, $b, @out) = fn; printit $a, $b, $#out, @out;
($a, $b, @out) = split /:/, ":x:"; printit $a, $b, $#out, @out;

($a, $b, $c, @out) = fn; printit $a, $b, $c, $#out, @out;
($a, $b, $c, @out) = split /:/, ":x:"; printit $a, $b, $c, $#out, @out;


Output:
<><x><UNDEF>
<><x><> <= NOTE: unexpected empty field after "x"
<><x><UNDEF><UNDEF>
<><x><><UNDEF> <= NOTE: unexpected empty field after "x"
<><0><x>
<><0><x>
<><x><-1>
<><x><-1>
<><x><UNDEF><-1>
<><x><UNDEF><-1>


[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl v5.8.8:

Configured by Debian Project at Tue Mar 6 01:52:23 UTC 2007.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Platform:
osname=linux, osvers=2.6.15.7, archname=i486-linux-gnu-thread-multi
uname='linux rothera 2.6.15.7 #1 smp sat sep 30 10:21:42 utc 2006 i686 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
ccversion='', gccversion='4.1.2 (Ubuntu 4.1.2-0ubuntu4)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
libc=/lib/libc-2.5.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
gnulibc_version='2.5'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:


---
@INC for perl v5.8.8:
/jrw/mdst/sh/../lib
/usr/local/lib/perl5/site_perl/5.8.0
/etc/perl
/usr/local/lib/perl/5.8.8
/usr/local/share/perl/5.8.8
/usr/lib/perl5
/usr/share/perl5
/usr/lib/perl/5.8
/usr/share/perl/5.8
/usr/local/lib/site_perl
.

---
Environment for perl v5.8.8:
HOME=/home/jrw
LANG (unset)
LANGUAGE (unset)
LC_ALL=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/jrw/sh:/home/jrw/bin:/jrw/mdst/sh:/jrw/mdst/sh/GAMES:/jrw/mdst/binu:/jrw/mdst/binw:.
PERL5LIB=/jrw/mdst/sh/../lib:/usr/local/lib/perl5/site_perl/5.8.0
PERLDB_OPTS=NonStop frame=2
PERL_BADLANG (unset)
SHELL=/bin/ksh






rgarciasuarez at gmail

Nov 28, 2007, 4:00 AM

Post #2 of 4
Re: [perl #47906] Split does not always strip trailing empty fields

On 28/11/2007, via RT John Wiersba <perlbug-followup [at] perl> wrote:
> use strict;
>
> sub fn {
> split /:/, ":x:";
> }
> sub printit {
> $_ = defined $_ ? $_ : "UNDEF" for @_;
> local $" = "><";
> print "<$_>" for @_;
> print "\n";
> }
>
> my ($a, $b, $c, $d, @out);
>
> ($a, $b, $c ) = fn; printit $a, $b, $c;
> ($a, $b, $c ) = split /:/, ":x:"; printit $a, $b, $c;

That 2nd line falls under the case "When assigning to a list" documented
in perlfunc/split. It's thus equivalent to specifying a LIMIT of 4, and
the result you get is expected.

> ($a, $b, $c, $d ) = fn; printit $a, $b, $c, $d;
> ($a, $b, $c, $d ) = split /:/, ":x:"; printit $a, $b, $c, $d;

Likewise (because empty trailing fields are deleted).
I thus see no bug with regard to the documented behavior.

> ($a, @out) = fn; printit $a, $#out, @out;
> ($a, @out) = split /:/, ":x:"; printit $a, $#out, @out;
>
> ($a, $b, @out) = fn; printit $a, $b, $#out, @out;
> ($a, $b, @out) = split /:/, ":x:"; printit $a, $b, $#out, @out;
>
> ($a, $b, $c, @out) = fn; printit $a, $b, $c, $#out, @out;
> ($a, $b, $c, @out) = split /:/, ":x:"; printit $a, $b, $c, $#out, @out;
>
>
> Output:
> <><x><UNDEF>
> <><x><> <= NOTE: unexpected empty field after "x"
> <><x><UNDEF><UNDEF>
> <><x><><UNDEF> <= NOTE: unexpected empty field after "x"
> <><0><x>
> <><0><x>
> <><x><-1>
> <><x><-1>
> <><x><UNDEF><-1>
> <><x><UNDEF><-1>


robin at cpan

Nov 28, 2007, 4:08 AM

Post #3 of 4
Re: [perl #47906] Split does not always strip trailing empty fields

What you're seeing is documented behaviour. The relevant parts of the
split documentation are:

> split /PATTERN/,EXPR,LIMIT
> split /PATTERN/,EXPR
> split /PATTERN/
> split   Splits the string EXPR into a list of strings and returns that
>         list. By default, empty leading fields are preserved, and
>         empty trailing ones are deleted.
>
> [...]
>
>         If LIMIT is unspecified or zero, trailing null fields are
>         stripped
>
> [...]
>
>         The LIMIT parameter can be used to split a line partially
>
>             ($login, $passwd, $remainder) = split(/:/, $_, 3);
>
>         When assigning to a list, if LIMIT is omitted, or zero, Perl
>         supplies a LIMIT one larger than the number of variables in
>         the list, to avoid unnecessary work. For the list above LIMIT
>         would have been 4 by default.


In other words, where you explicitly write

($a, $b, $c ) = split /:/, ":x:";

or

($a, $b, $c, $d ) = split /:/, ":x:";

the LIMIT parameter is taken to be 4 or 5 respectively, and trailing
null fields are not stripped. In the other cases, the LIMIT is taken
to be unspecified and trailing nulls are stripped.
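
As a rough sketch of that rule (the variable names and the show() helper
below are illustrative assumptions, not part of the original report), the
implicit LIMIT can be compared against an explicit one:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Render a list of fields as <a><b>..., marking undefined values.
sub show {
    return join '', map { defined $_ ? "<$_>" : '<UNDEF>' } @_;
}

# Three scalars on the left: Perl supplies an implicit LIMIT of 4,
# so the trailing empty field after "x" is kept.
my ($p, $q, $r) = split /:/, ':x:';

# The same split with the LIMIT of 4 written out, collected into an array.
my @explicit = split /:/, ':x:', 4;

# No LIMIT and an array on the left: trailing empty fields are stripped.
my @stripped = split /:/, ':x:';

print show($p, $q, $r), "\n";   # <><x><>
print show(@explicit),  "\n";   # <><x><>
print show(@stripped),  "\n";   # <><x>
```

The first two lines of output should match, which is the equivalence the
documentation describes; only the array assignment drops the trailing field.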


(I'm not suggesting that this is particularly reasonable behaviour,
only that it's longstanding and documented.)

Robin


ikegami at adaelis

Nov 28, 2007, 2:49 PM

Post #4 of 4
Re: [perl #47906] Split does not always strip trailing empty fields

>
> That 2nd line falls under the case "When assigning to a list" documented
> in perlfunc/split. It's thus equivalent to specifying a LIMIT of 4, and
> the result you get is expected.
>

Is the right-most assignment in
($a, $b, $c) = (@a = split /:/, ":x:");
considered a list assignment? split thinks so.

> (I'm not suggesting that this is particularly reasonable behaviour,
> only that it's longstanding and documented.)
>

Workarounds:
@a = split /:/, ":x:"; ($a, $b, $c) = @a;
($a, $b, $c) = @{[ split /:/, ":x:" ]};
($a, $b, $c) = do { split /:/, ":x:" };
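
If the goal is behaviour that does not depend on what the caller assigns
to, one option is to hide the choice inside a small wrapper. The following
is only a sketch: split_stripped() and split_keeping() are hypothetical
helper names, and it assumes a fixed ":" separator as in the test case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: always drop trailing empty fields.
# Splitting into an array first means no implicit LIMIT is supplied.
sub split_stripped {
    my ($string) = @_;
    my @fields = split /:/, $string;
    return @fields;
}

# Hypothetical helper: always keep trailing empty fields.
# A negative LIMIT asks split for every field, empty or not.
sub split_keeping {
    my ($string) = @_;
    return split /:/, $string, -1;
}

my ($s1, $s2, $s3) = split_stripped(':x:');
my ($k1, $k2, $k3) = split_keeping(':x:');

printf "stripped: third field is %s\n", defined $s3 ? 'defined' : 'undefined';
printf "keeping:  third field is %s\n", defined $k3 ? 'defined' : 'undefined';
```

With the stripping helper the third variable should come back undefined,
and with the keeping helper it should be an empty string, whatever the
shape of the list on the left-hand side.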
