
Mailing List Archive: Perl: porters

[perl #54198] readlink() returns result along with garbage



perlbug-followup at perl

May 15, 2008, 4:37 AM

Post #1 of 6 (176 views)
[perl #54198] readlink() returns result along with garbage

# New Ticket Created by "Denis Melnikov"
# Please include the string: [perl #54198]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=54198 >


To: perlbug[at]perl.org
Subject: readlink() returns result along with garbage
Reply-To: dmelnik[at]regent.ru
Message-Id: <5.8.8_16640_1210847066[at]hq-data.msk.regent.ru>

This is a bug report for perl from dmelnik[at]regent.ru,
generated with the help of perlbug 1.35 running under perl v5.8.8.


-----------------------------------------------------------------
[Please enter your report here]
Hi,
There's a problem with readlink().

print readlink("/proc/13917/exe");
results in:
/usr/sbin/squidr.pyo (deleted)

While it should be '/usr/sbin/squid'.
Actually there are \0's in the resulting string:

00000000 2f 75 73 72 2f 73 62 69 6e 2f 73 71 75 69 64 00 |/usr/sbin/squid.|
00000010 72 2e 70 79 6f 00 00 00 00 00 00 00 00 00 00 00 |r.pyo...........|
00000020 20 28 64 65 6c 65 74 65 64 29 | (deleted)|

As far as I understand, C's readlink() and shell's readlink(1) work
fine 'cause they see \0 termination, while Perl doesn't use it.

The problem has appeared today, yesterday the code worked fine.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
This perlbug was built using Perl v5.8.8 in the Red Hat build system.
It is being executed now by Perl v5.8.8 - Wed Jan 9 11:30:38 CST 2008.

Site configuration information for perl v5.8.8:

Configured by Red Hat, Inc. at Wed Jan 9 11:30:38 CST 2008.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Platform:
osname=linux, osvers=2.6.18-8.1.15.el5, archname=x86_64-linux-thread-multi
uname='linux linux55.fnal.gov 2.6.18-8.1.15.el5 #1 smp mon oct 22 09:47:50 edt 2007 x86_64 x86_64 x86_64 gnulinux '
config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Dversion=5.8.8 -Dmyhostname=localhost -Dperladmin=root[at]localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Dprivlib=/usr/lib/perl5/5.8.8 -Dsitelib=/usr/lib/perl5/site_perl/5.8.8 -Dvendorlib=/usr/lib/perl5/vendor_perl/5.8.8 -Darchlib=/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi -Dsitearch=/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi -Dvendorarch=/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi -Darchname=x86_64-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dinc_version_list=5.8.7 5.8.6 5.8.5 -Dscriptdir=/usr/bin'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=define use64bitall=define uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
ccversion='', gccversion='4.1.1 20070105 (Red Hat 4.1.1-52)', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='gcc', ldflags =''
libpth=/usr/local/lib64 /lib64 /usr/lib64
libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.5.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.5'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'

Locally applied patches:


---
@INC for perl v5.8.8:
/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi
/usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi
/usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8
/usr/lib/perl5/site_perl/5.8.7
/usr/lib/perl5/site_perl/5.8.6
/usr/lib/perl5/site_perl/5.8.5
/usr/lib/perl5/site_perl
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi
/usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi
/usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8
/usr/lib/perl5/vendor_perl/5.8.7
/usr/lib/perl5/vendor_perl/5.8.6
/usr/lib/perl5/vendor_perl/5.8.5
/usr/lib/perl5/vendor_perl
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.8
.

---
Environment for perl v5.8.8:
HOME=/root
LANG=en_US.UTF-8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
PERL_BADLANG (unset)
SHELL=/bin/bash


maddingue at free

May 16, 2008, 5:28 PM

Post #2 of 6 (154 views)
Re: [perl #54198] readlink() returns result along with garbage [In reply to]

Hello,


Denis Melnikov wrote:

> There's a problem with readlink().

I was bitten by the same thing in a program I wrote at $work. AFAIK,
it's a Linux only thing.

> print readlink("/proc/13917/exe");
> results in:
> /usr/sbin/squidr.pyo (deleted)
>
> While it should be '/usr/sbin/squid'.
> Actually there are \0's in the resulting string:
>
> 00000000 2f 75 73 72 2f 73 62 69 6e 2f 73 71 75 69 64 00 |/usr/sbin/squid.|
> 00000010 72 2e 70 79 6f 00 00 00 00 00 00 00 00 00 00 00 |r.pyo...........|
> 00000020 20 28 64 65 6c 65 74 65 64 29 | (deleted)|
>
> As far as I understand, C's readlink() and shell's readlink(1) work
> fine 'cause they see \0 termination, while Perl doesn't use it.
>
> The problem has appeared today, yesterday the code worked fine.


Read the full string: it only appears when the target file has been
deleted. IIRC, the buffer contains the target path, a fixed number of
bytes (I don't know their meaning), then the string "(deleted)". I
think it's used by utilities like lsof.

Note that Perl simply uses the C readlink(2) function, so any C
program will show the same thing. Except you usually won't see it
because if you printf("%s\n", buf) (where buf has been filled by
readlink(2)), you'll only see the target path because printf(3) will
stop at the first \0. In Perl, you see everything because Perl
strings don't end at \0.
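
To make the difference concrete, here is a minimal sketch (not taken from the thread). It uses a shortened stand-in for the bytes in the report, and the helper names `render_as_c_string` and `render_counted` are made up for illustration. The first mimics what printf("%s") shows; the second mimics a consumer that trusts the returned length, as a Perl scalar does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Shortened stand-in for the reported buffer: a path, one NUL, then trailing
 * data. sizeof(SAMPLE) - 1 plays the role of readlink()'s return value. */
static const char SAMPLE[] = "/usr/sbin/squid\0 (deleted)";

/* What printf("%s") would show: formatting stops at the first NUL.
 * Returns the number of characters the format produced. */
static size_t render_as_c_string(const char *buf, char *out, size_t out_size)
{
    int written = snprintf(out, out_size, "%s", buf);
    return written < 0 ? 0 : (size_t)written;
}

/* What a length-counted consumer keeps: all n bytes, NULs included. */
static size_t render_counted(const char *buf, size_t n, char *out, size_t out_size)
{
    assert(n <= out_size);
    memcpy(out, buf, n);
    return n;
}
```

With this sample, the C-string view yields the 15 bytes of "/usr/sbin/squid", while the counted view keeps all 26 bytes, including the NUL and the trailing " (deleted)".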

It can be demonstrated (on Linux) with a short C program. I don't
have at hand the one I wrote when I discovered this, but I think it
was something like this:

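$ cat myreadlink.c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 128

int main(int argc, char *argv[]) {
    char buf[BUF_SIZE];
    ssize_t n, i;

    if (argc < 2) {
        fprintf(stderr, "usage: %s LINK\n", argv[0]);
        return 1;
    }

    /* zero the buffer and leave room for a NUL: readlink(2) adds none */
    memset(buf, 0, sizeof buf);
    n = readlink(argv[1], buf, BUF_SIZE - 1);
    if (n < 0) {
        perror("readlink");
        return 1;
    }

    printf("readlink() returned %zd\n", n);
    printf(" buf=\"%s\"\n", buf);   /* stops at the first \0 */
    printf(" buf: ");

    /* dump exactly the n bytes that readlink() reported */
    for (i = 0; i < n; i++) {
        printf("%02hhx ", buf[i]);
    }

    puts("");

    return 0;
}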

Compile it, then call it with "myreadlink /proc/13917/exe". IIRC,
readlink(2) on Linux returns the total number of bytes it put in the
buffer, up to and including the "(deleted)" string, which Perl uses
as the length of the string.
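
For comparison, here is a rough sketch of the same call on an ordinary symlink, where the returned count is simply the length of the target. The helper name `read_back_symlink` and the scratch directory under /tmp are assumptions made for this illustration; nothing here touches /proc.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a throwaway symlink pointing at `target` (which need not exist),
 * read it back with readlink(2) and return readlink's result.
 * Only the first n bytes of buf are meaningful; no NUL is appended.
 * Returns -1 if the scratch directory or the link cannot be created. */
static ssize_t read_back_symlink(const char *target, char *buf, size_t buf_size)
{
    char dir[] = "/tmp/readlink-demo-XXXXXX";
    char link_path[64];
    ssize_t n;

    assert(buf != NULL && buf_size > 0);

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(link_path, sizeof link_path, "%s/link", dir);

    if (symlink(target, link_path) != 0) {
        rmdir(dir);
        return -1;
    }
    n = readlink(link_path, buf, buf_size);

    /* clean up the scratch link and directory */
    unlink(link_path);
    rmdir(dir);
    return n;
}
```

Calling read_back_symlink("/usr/sbin/squid", buf, sizeof buf) should return 15 when /tmp is writable. The bytes after that count are not specified, so a caller has to keep the length rather than rely on a terminator.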

If we consider this a bug, the following (untested) patch should
solve it:

--- pp_sys.c.old 2008-04-30 13:51:55.000000000 +0200
+++ pp_sys.c 2008-05-17 02:25:59.000000000 +0200
@@ -3586,7 +3586,8 @@
EXTEND(SP, 1);
if (len < 0)
RETPUSHUNDEF;
- PUSHp(buf, len);
+ buf[len] = '\0';
+ PUSHp(buf, strlen(buf));
RETURN;
#else
EXTEND(SP, 1);


--
Sébastien Aperghis-Tramoni

Close the world, txEn eht nepO.


gbarr at pobox

May 17, 2008, 4:24 AM

Post #3 of 6 (149 views)
Re: [perl #54198] readlink() returns result along with garbage [In reply to]

On May 16, 2008, at 7:28 PM, Sébastien Aperghis-Tramoni wrote:
> Hello,
>
> Denis Melnikov wrote:
>
>> There's a problem with readlink().
>
> I was bitten by the same thing in a program I wrote at $work.
> AFAIK, it's a Linux only thing.

It is a "trick" that procfs uses, which assumes anyone using readlink
on a link in procfs will stop at the NUL.

>> print readlink("/proc/13917/exe");
>> results in:
>> /usr/sbin/squidr.pyo (deleted)
>>
>> While it should be '/usr/sbin/squid'.
>> Actually there are \0's in the resulting string:
>>
>> 00000000 2f 75 73 72 2f 73 62 69 6e 2f 73 71 75 69 64 00 |/usr/sbin/squid.|
>> 00000010 72 2e 70 79 6f 00 00 00 00 00 00 00 00 00 00 00 |r.pyo...........|
>> 00000020 20 28 64 65 6c 65 74 65 64 29 | (deleted)|
>>
>> As far as I understand, C's readlink() and shell's readlink(1) work
>> fine 'cause they see \0 termination, while Perl doesn't use it.
>>
>> The problem has appeared today, yesterday the code worked fine.
>
>
> Read the full string: it only appears when the target file has been
> deleted. IIRC, the buffer contains the target path, a fixed number
> of bytes (I don't know their meaning), then the string "(deleted)".
> I think it's used by utilities like lsof.

> If we consider this a bug, the following (untested) patch should
> solve it:

I do not consider it a bug. With the patch below anyone attempting to
write utilities to use that information cannot.

Graham.

>
> --- pp_sys.c.old 2008-04-30 13:51:55.000000000 +0200
> +++ pp_sys.c 2008-05-17 02:25:59.000000000 +0200
> @@ -3586,7 +3586,8 @@
> EXTEND(SP, 1);
> if (len < 0)
> RETPUSHUNDEF;
> - PUSHp(buf, len);
> + buf[len] = '\0';
> + PUSHp(buf, strlen(buf));
> RETURN;
> #else
> EXTEND(SP, 1);
>
>
> --
> Sébastien Aperghis-Tramoni
>
> Close the world, txEn eht nepO.
>


nick at ccl4

May 17, 2008, 4:42 AM

Post #4 of 6 (147 views)
Re: [perl #54198] readlink() returns result along with garbage [In reply to]

On Sat, May 17, 2008 at 06:24:16AM -0500, Graham Barr wrote:
> On May 16, 2008, at 7:28 PM, Sébastien Aperghis-Tramoni wrote:

> >Read the full string: it only appears when the target file has been
> >deleted. IIRC, the buffer contains the target path, a fixed number
> >of bytes (I don't know their meaning), then the string "(deleted)".
> >I think it's used by utilities like lsof.
>
> >If we consider this a bug, the following (untested) patch should
> >solve it:
>
> I do not consider it a bug. With the patch below anyone attempting to
> write utilities to use that information cannot.

My view too. readlink is working as documented.

I'm curious whether this feature of the Linux proc filing system is
documented. :-)

Nicholas Clark


maddingue at free

May 17, 2008, 5:36 PM

Post #5 of 6 (137 views)
Re: [perl #54198] readlink() returns result along with garbage [In reply to]

Nicholas Clark wrote:

> On Sat, May 17, 2008 at 06:24:16AM -0500, Graham Barr wrote:
>> On May 16, 2008, at 7:28 PM, Sébastien Aperghis-Tramoni wrote:
>
>>> Read the full string: it only appears when the target file has been
>>> deleted. IIRC, the buffer contains the target path, a fixed number
>>> of bytes (I don't know their meaning), then the string "(deleted)".
>>> I think it's used by utilities like lsof.
>>
>>> If we consider this a bug, the following (untested) patch should
>>> solve it:
>>
>> I do not consider it a bug. With the patch below anyone attempting to
>> write utilities to use that information cannot.
>
> My view too. readlink is working as documented.

If you allow me to be a little pedantic, it's not exactly working as
documented:

readlink EXPR
readlink
    Returns the value of a symbolic link, if symbolic links are
    implemented. If not, gives a fatal error. If there is some
    system error, returns the undefined value and sets $! (errno).
    If EXPR is omitted, uses $_.

i.e., it should return the target of the symbolic link, and that's what
it does on all systems, including Linux. It's only when the target
file doesn't exist that it returns this additional, undocumented
information, which doesn't exist on other systems. On OSX, readlink(2)
on a broken link just returns the content of the symbolic link.
So, one could argue that Perl could/should return a consistent value
across operating systems.

Personally, I can live with it as it is, given it's just a matter of
s/\0.*//s (cutting at the first \0). Another solution is to add a
POSIX::readlink() that DWIM and only returns the target.
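
As a sketch of what "only the target" could mean without throwing the extra information away, the C below cuts at the first \0 and separately notes whether the buffer ends in " (deleted)". The `link_info` struct and `parse_link` function are hypothetical names for this illustration, not an existing Perl or libc API. The sample bytes are the 42 bytes from the hexdump in the original report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 42 bytes shown in the report's hexdump. */
static const char REPORT_BYTES[] =
    "/usr/sbin/squid\0r.pyo\0\0\0\0\0\0\0\0\0\0\0 (deleted)";

/* Hypothetical result type: where the path ends, and whether the
 * trailing " (deleted)" marker is present. */
struct link_info {
    size_t path_len;  /* bytes before the first NUL, or n if there is none */
    int    deleted;   /* 1 if the n bytes end with " (deleted)", else 0 */
};

/* Split the n bytes returned by readlink(2) at the first NUL. */
static struct link_info parse_link(const char *buf, size_t n)
{
    static const char marker[] = " (deleted)";
    const size_t marker_len = sizeof marker - 1;
    struct link_info info;
    const char *nul;

    assert(buf != NULL);
    nul = memchr(buf, '\0', n);
    info.path_len = nul ? (size_t)(nul - buf) : n;
    info.deleted = n >= marker_len &&
                   memcmp(buf + n - marker_len, marker, marker_len) == 0;
    return info;
}
```

On the reported bytes this gives a path length of 15 ("/usr/sbin/squid") with the deleted flag set. A file whose real name ends in " (deleted)" would also set the flag, so the check is a heuristic at best.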

> I'm curious whether this feature of the Linux proc filing system is
> documented. :-)


IIRC, I had searched a little back then, but didn't find anything.
Googling a little more today didn't end with more results. The man
page for proc(5) doesn't indicate this:
» http://www.kernel.org/doc/man-pages/online/pages/man5/proc.5.html


--
Sébastien Aperghis-Tramoni

Close the world, txEn eht nepO.


gbarr at pobox

May 18, 2008, 4:37 AM

Post #6 of 6 (129 views)
Re: [perl #54198] readlink() returns result along with garbage [In reply to]

On May 17, 2008, at 7:36 PM, Sébastien Aperghis-Tramoni wrote:

> Nicholas Clark wrote:
>
>> On Sat, May 17, 2008 at 06:24:16AM -0500, Graham Barr wrote:
>>> On May 16, 2008, at 7:28 PM, Sébastien Aperghis-Tramoni wrote:
>>
>>>> Read the full string: it only appears when the target file has been
>>>> deleted. IIRC, the buffer contains the target path, a fixed number
>>>> of bytes (I don't know their meaning), then the string "(deleted)".
>>>> I think it's used by utilities like lsof.
>>>
>>>> If we consider this a bug, the following (untested) patch should
>>>> solve it:
>>>
>>> I do not consider it a bug. With the patch below anyone
>>> attempting to
>>> write utilities to use that information cannot.
>>
>> My view too. readlink is working as documented.
>
> If you allow me to be a little pedantic, it's not exactly working as
> documented:
>
> readlink EXPR
> readlink
>     Returns the value of a symbolic link, if symbolic links are
>     implemented. If not, gives a fatal error. If there is some
>     system error, returns the undefined value and sets $! (errno).
>     If EXPR is omitted, uses $_.
>
> i.e., it should return the target of the symbolic link, and it's
> what it does on all systems, including Linux. It's only when the
> target file doesn't exist that it returns this additional,
> undocumented, information, which doesn't exist on other systems. On
> OSX, readlink(2) on a broken link just returns the content of the
> symbolic link. So, one could argue that Perl could/should return
> consistent value across operating systems.

Allow me to be pedantic:

SYNOPSIS
    #include <unistd.h>

    int
    readlink(const char *path, char *buf, int bufsiz);

DESCRIPTION
    Readlink() places the contents of the symbolic link path in the buffer
    buf, which has size bufsiz. Readlink does not append a NUL character to
    buf.

RETURN VALUES
    The call returns the count of characters placed in the buffer if it
    succeeds, or a -1 if an error occurs, placing the error code in the
    global variable errno.

As the man page states, the system call does not append a nul
character, but returns the number of characters placed into the
buffer. If your code is treating any embedded nul character as the
end of the link, then I would suggest that your program is broken.

> Personally, I can live with it as it is, given it's just a matter
> of s/\0//. Another solution is to add a POSIX::readlink() that DWIM
> and only returns the target.

It already does what it is supposed to do.

Graham.
