Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: ClamAV: devel

symbol visibilty, public API, and internal function useage

 

 

ClamAV devel RSS feed   Index | Next | Previous | View Threaded


steve at lobefin

Nov 20, 2007, 11:23 AM

Post #1 of 5 (324 views)
Permalink
symbol visibilty, public API, and internal function useage

Hi all,

So I've been thinking about how to reduce the breakage that occasionally
happens to other bits of software on upgrades of libclamav, and I think
I've come up with something that might be of some use. libtool lets you
use a version script to (on platforms that support it) hide symbols from
being globally visible and version those that are.

The benefits of using this sort of approach is that other applications
that try to use libclamav internal functions will fail to link instead
of just throwing a compiler warning about an implicit function
definition. This will allow the clamav team to make their public API
more stable, since it is the only thing other people will be using.

The downfall, as I have just found out by setting up a version script,
is the the clamav applications fail to link without exporting quite a
few symbols that really shouldn't be exported by the library. It also
doesn't actually help with API/ABI changing modifications in the public
API itself - it just frees you to spend more time making sure that can
be left alone while you modify the private functions to your heart's
content.

This says to me that probably the architecture for some of this shared
code is wrong at the moment. One of the following solutions is probably
useful, but I'll leave it up to you all as to which one you prefer.

1) these symbols should be useable by all applications linking to
libclamav (and they should be cl_ instead of their current naming
scheme).

2) These symbols should not be in libclamav at all. They should be code
in shared/ which gets turned into a convenience library and linked to
all the applications and the library at build time, presumably
statically.

3) These symbols should be in a second shared library built from shared/
or elsewhere that is installed on the target system. To discourage use
of this private library, it could be installed in a sub directory of
$libdir, and an rpath added to all the binaries and libclamav.

My personal preference is 2. It incurs some on disk and memory
overhead, as these symbols will be loaded once for each application,
rather than the normal shared library memory saving techniques. That
being said, it is the simplest to manage, and is the most effective at
making sure other applications do not link to private functions within
the clamav suite that change between releases.

Ok, so long preamble over.

version script:
CLAMAV_0.92 {
global:
cl_*;
cli_calloc;
cli_chomp;
cli_dbgmsg;
cli_errmsg;
cli_gentemp;
cli_gentempdesc;
cli_leavetemps_flag;
cli_malloc;
cli_md5file;
cli_md5stream;
cli_memstr;
cli_ole2_extract;
cli_readn;
cli_realloc;
cli_rmdirs;
cli_rndnum;
cli_str2hex;
cli_strbcasestr;
cli_strdup;
cli_strrcpy;
cli_strtok;
cli_strtokbuf;
cli_unlockdb;
cli_untgz;
cli_utf16toascii;
cli_versigpss;
cli_warnmsg;
cli_writelockdb;
html_normalise_fd;
ppt_vba_read;
sha256_digest;
sha256_final;
sha256_init;
sha256_update;
tableCreate;
tableDestroy;
tableFind;
tableInsert;
tableIterate;
tableRemove;
tableUpdate;
vba56_dir_read;
vba_decompress;
wm_decrypt_macro;
wm_dir_read;
local:
*;
};

Everything in global: is exported and useable at link time. Everything
in local is not. As you can see, you can use wildcards, and you can
also chain versions, so that if a symbol is introduced or removed in a
later version of the library, it gets versioned correctly. All very
nice stuff.

The automake patches:
Index: libclamav/Makefile.am
===================================================================
--- libclamav/Makefile.am (revision 572)
+++ libclamav/Makefile.am (working copy)
@@ -21,8 +21,10 @@

libclamav_la_LIBADD = @LIBCLAMAV_LIBS@ @THREAD_LIBS@

-libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined
+libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined -Wl,--version-script=@top_srcdir@/libclamav/libclamav.map

+EXTRA_DIST = libclamav.map
+
include_HEADERS = clamav.h

libclamav_la_SOURCES = \
Index: libclamav/Makefile.in
===================================================================
--- libclamav/Makefile.in (revision 572)
+++ libclamav/Makefile.in (working copy)
@@ -238,7 +238,8 @@
target_vendor = @target_vendor@
INCLUDES = -I$(top_srcdir) -I[at]srcdir@/unrar -I[at]srcdir@/nsis
libclamav_la_LIBADD = @LIBCLAMAV_LIBS@ @THREAD_LIBS@
-libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined
+libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined -Wl,--version-script=@top_srcdir@/libclamav/libclamav.map
+EXTRA_DIST = libclamav.map
include_HEADERS = clamav.h
libclamav_la_SOURCES = \
clamav.h \

That's about it. This is against current stable, although rerevving it
against svn would be fairly trivial.

Is this something people are interested in? If so, how can I help to
get the list down to

global:
cl_*;
local:
*;

?

Happy hacking,
--
--------------------------------------------------------------------------
| Stephen Gran | <Joy> that's a Kludge(TM) <knghtbrd> It |
| steve[at]lobefin.net | Works(tm) <Joy> AIX works(TM) |
| http://www.lobefin.net/~steve | <knghtbrd> no it doesn't <knghtbrd> => |
--------------------------------------------------------------------------
Attachments: signature.asc (0.18 KB)


edwintorok at gmail

Nov 20, 2007, 11:59 AM

Post #2 of 5 (304 views)
Permalink
Re: symbol visibilty, public API, and internal function useage [In reply to]

Stephen Gran wrote:
> Hi all,
>
> So I've been thinking about how to reduce the breakage that occasionally
> happens to other bits of software on upgrades of libclamav, and I think
> I've come up with something that might be of some use. libtool lets you
> use a version script to (on platforms that support it) hide symbols from
> being globally visible and version those that are.
>
> The benefits of using this sort of approach is that other applications
> that try to use libclamav internal functions will fail to link instead
> of just throwing a compiler warning about an implicit function
> definition. This will allow the clamav team to make their public API
> more stable, since it is the only thing other people will be using.
>
> The downfall, as I have just found out by setting up a version script,
> is the the clamav applications fail to link without exporting quite a
> few symbols that really shouldn't be exported by the library.

This reminds me of https://wwws.clamav.net/bugzilla/show_bug.cgi?id=272

> It also
> doesn't actually help with API/ABI changing modifications in the public
> API itself -
Ulrich Drepper says in his DSO howto
(http://people.redhat.com/drepper/dsohowto.pdf) that API/ABI stability
can be maintained via versioning. But that is stuff for post 1.0 IMHO.

> it just frees you to spend more time making sure that can
> be left alone while you modify the private functions to your heart's
> content.
>

Did any breakage occur due to changes in cli_* functions? People should
really not use those directly.

> This says to me that probably the architecture for some of this shared
> code is wrong at the moment. One of the following solutions is probably
> useful, but I'll leave it up to you all as to which one you prefer.
>
> 1) these symbols should be useable by all applications linking to
> libclamav (and they should be cl_ instead of their current naming
> scheme).
>
> 2) These symbols should not be in libclamav at all. They should be code
> in shared/ which gets turned into a convenience library and linked to
> all the applications and the library at build time, presumably
> statically.
>
> 3) These symbols should be in a second shared library built from shared/
> or elsewhere that is installed on the target system. To discourage use
> of this private library, it could be installed in a sub directory of
> $libdir, and an rpath added to all the binaries and libclamav.
>
> My personal preference is 2. It incurs some on disk and memory
> overhead, as these symbols will be loaded once for each application,
> rather than the normal shared library memory saving techniques. That
> being said, it is the simplest to manage, and is the most effective at
> making sure other applications do not link to private functions within
> the clamav suite that change between releases.
>

Actually if we have export maps the size of the .so will be cut down, so
maybe it would compensate for it.

> Ok, so long preamble over.
>
> version script:
> [.....]
> vba56_dir_read;
> vba_decompress;
>

I guess this is only used by sigtool. I wouldn't mind linking sigtool
statically with libclamav, it is not a tool used on a day-by-day basis
by end-users.
I think if you don't consider sigtool you can significantly shorten this
list.
Besides if people see this list in an export map they'll start using it,
because they'll consider it public :(

Also clamdscan has more dependencies than it should.

> wm_decrypt_macro;
> wm_dir_read;
> local:
> *;
> };
>
> Everything in global: is exported and useable at link time.

I was also thinking of using -fvisibility feature of gcc to hide
internals functions, which has the nice benefit of
avoiding an indirect call (usually symbols not declared static need an
extra indirection due to -fPIC).
However I didn't like the way -fvisibility has to be used.
I either have to explicitly declare the visibility of all functions
exported, or declare visibility on all functions being hidden.
Even on functions declared as extern! Sure, we could do an EXTERN macro
that also adds the visibility attribute ...

However linker export maps can be implemented independently of
-fivisibility :)


> Everything
> in local is not. As you can see, you can use wildcards, and you can
> also chain versions, so that if a symbol is introduced or removed in a
> later version of the library, it gets versioned correctly. All very
> nice stuff.
>
> The automake patches:
> Index: libclamav/Makefile.am
> ===================================================================
> --- libclamav/Makefile.am (revision 572)
> +++ libclamav/Makefile.am (working copy)
> @@ -21,8 +21,10 @@
>
> libclamav_la_LIBADD = @LIBCLAMAV_LIBS@ @THREAD_LIBS@
>
> -libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined
> +libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined -Wl,--version-script=@top_srcdir@/libclamav/libclamav.map
>
> +EXTRA_DIST = libclamav.map
> +
> include_HEADERS = clamav.h
>
> libclamav_la_SOURCES = \
> Index: libclamav/Makefile.in
> ===================================================================
> --- libclamav/Makefile.in (revision 572)
> +++ libclamav/Makefile.in (working copy)
> @@ -238,7 +238,8 @@
> target_vendor = @target_vendor@
> INCLUDES = -I$(top_srcdir) -I[at]srcdir@/unrar -I[at]srcdir@/nsis
> libclamav_la_LIBADD = @LIBCLAMAV_LIBS@ @THREAD_LIBS@
> -libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined
> +libclamav_la_LDFLAGS = @TH_SAFE@ -version-info @LIBCLAMAV_VERSION@ -no-undefined -Wl,--version-script=@top_srcdir@/libclamav/libclamav.map
> +EXTRA_DIST = libclamav.map
> include_HEADERS = clamav.h
> libclamav_la_SOURCES = \
> clamav.h \
>
> That's about it. This is against current stable, although rerevving it
> against svn would be fairly trivial.
>
> Is this something people are interested in?
I wanted to do something similar a long while ago, so I am definetely
interested in helping out.
> If so, how can I help to
> get the list down to
>
> global:
> cl_*;
> local:
> *;
>
> ?
>

Yes it would be nice to narrow down the list to this.
Does this work with GNU ld only? (it is supposed to work with Solaris's
ld too)

Best regards,
--Edwin
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net


sherpya at netfarm

Nov 20, 2007, 12:34 PM

Post #3 of 5 (303 views)
Permalink
Re: symbol visibilty, public API, and internal function useage [In reply to]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,
currently libclamav is used in a "static" way while
compiled as shared, I mean that all internal
functions are in the .so and binary that links
that are using them from the .so
cli_x functions should be internals and the executables
should link only the cl_xxx
On unix this is not really a problem since the linker
is smarter than the win32 linker
In my win32 port I had to export symbols that are currently
named as internals
as in:
http://clamwin.svn.sourceforge.net/viewvc/clamwin/trunk/clamav-devel/contrib/msvc/libclamav.def?revision=1537&view=markup

there are currently some of cli_ symbols exported
and other ones that I avoided by linking directly .c in the single
executables

for this reason libclamunrar introduces circular dependancies
and it's not buildable as is in win32 and standalone
on unix
(perhaps easy solvable since it uses snprintf that is public domain
and not gpl and other symbols are cli_malloc and cli_readn and friends)

I think that is not really easy right now to avoid
exporting internal functions because of the breakage of
package executables compilation, but
for external apps that are linking libclamav it would be nice
to restrict the usage to clamav.h defined symbols

Regards

- --
Gianluigi Tiesi <sherpya[at]netfarm.it>
EDP Project Leader
Netfarm S.r.l. - http://www.netfarm.it/
Free Software: http://oss.netfarm.it/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHQ0Tc3UE5cRfnO04RAvi6AJ9hlROxzvNZQyZVxdABewYgMhArngCguHCI
xRdyFEXwKE9vGB5ofDr5bks=
=eAFh
-----END PGP SIGNATURE-----
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net


steve at lobefin

Nov 20, 2007, 3:03 PM

Post #4 of 5 (294 views)
Permalink
Re: symbol visibilty, public API, and internal function useage [In reply to]

On Tue, Nov 20, 2007 at 09:59:21PM +0200, Török Edwin said:
> Stephen Gran wrote:
> > Hi all,
> >
> > So I've been thinking about how to reduce the breakage that occasionally
> > happens to other bits of software on upgrades of libclamav, and I think
> > I've come up with something that might be of some use. libtool lets you
> > use a version script to (on platforms that support it) hide symbols from
> > being globally visible and version those that are.

[...]

> This reminds me of https://wwws.clamav.net/bugzilla/show_bug.cgi?id=272

Yes, that's exactly the same issue.

> > It also
> > doesn't actually help with API/ABI changing modifications in the public
> > API itself -
> Ulrich Drepper says in his DSO howto
> (http://people.redhat.com/drepper/dsohowto.pdf) that API/ABI stability
> can be maintained via versioning. But that is stuff for post 1.0 IMHO.

Agreed. I just meant that symbol visibilty doesn't actually force you
guys to maintain a stable public API - you can still change that every
release if you want. It just makes it easier to break the private
functions as often as you want :)

> Did any breakage occur due to changes in cli_* functions? People should
> really not use those directly.

I think so. I can't remember the details now, but when we did the
libclamav1 -> libclamav2 transition in Debian, there were a few
applications that either used functions not in the public headers, or
messed about with the internals of private structs. We all know that's
bad behavior on the part of those applications - I'm just trying to
force them to use the interface that they're given.

> > My personal preference is 2. It incurs some on disk and memory
> > overhead, as these symbols will be loaded once for each application,
> > rather than the normal shared library memory saving techniques. That
> > being said, it is the simplest to manage, and is the most effective at
> > making sure other applications do not link to private functions within
> > the clamav suite that change between releases.
>
> Actually if we have export maps the size of the .so will be cut down, so
> maybe it would compensate for it.

That's a point I hadn't considered. You may be right. I sort of feel
that on most hardware capable of running clamav, we're unlikely to
notice a real difference with any of the proposed changes, but I'm
generally in favor of efficiency.

> > Ok, so long preamble over.
> >
> > version script:
> > [.....]
> > vba56_dir_read;
> > vba_decompress;
>
> I guess this is only used by sigtool. I wouldn't mind linking sigtool
> statically with libclamav, it is not a tool used on a day-by-day basis
> by end-users.
> I think if you don't consider sigtool you can significantly shorten this
> list.

I got a half dozen from each app, unfortunately. The milter is a heavy
user of several output functions as well as cli_malloc/realloc and
friends, clamd uses a bunch of tmpfile related functions, and so on. I
could make up a list of what each app uses if you like, but it's trivial
to discover. Just make a version script where cl_* is the only thing in
global: and watch what breaks :)

> Besides if people see this list in an export map they'll start using it,
> because they'll consider it public :(

No more than they do now, I'm sorry to say. I am viewing this as a step
to reducing what other projects are _able_ to use from clamav. This
will mean some changes in clamav as well, though.

> I was also thinking of using -fvisibility feature of gcc to hide
> internals functions, which has the nice benefit of
> avoiding an indirect call (usually symbols not declared static need an
> extra indirection due to -fPIC).
> However I didn't like the way -fvisibility has to be used.
> I either have to explicitly declare the visibility of all functions
> exported, or declare visibility on all functions being hidden.
> Even on functions declared as extern! Sure, we could do an EXTERN macro
> that also adds the visibility attribute ...

Yes, that's why I decided on the version script approach. I find
-fvisibility painful to use well.

> However linker export maps can be implemented independently of
> -fivisibility :)

Very true.

> Yes it would be nice to narrow down the list to this.
> Does this work with GNU ld only? (it is supposed to work with Solaris's
> ld too)

As far as I know, it work on at least Solaris, Linux and the BSDs.
I may be wrong, though - I tend not to spend a lot of time working on
platforms besides linux these days. At any rate, it'll be more than what
there is now, and people will generally be compiling 3rd party software
against clamav on one of those platforms anyway. Someone who wants to
only build an application against libclamav private functions on RiscOS
is, well, probably beyond help anyway.

Thanks,
--
--------------------------------------------------------------------------
| Stephen Gran | In the long run we are all dead. -- |
| steve[at]lobefin.net | John Maynard Keynes |
| http://www.lobefin.net/~steve | |
--------------------------------------------------------------------------
Attachments: signature.asc (0.18 KB)


edwintorok at gmail

Dec 28, 2007, 1:40 PM

Post #5 of 5 (243 views)
Permalink
Re: symbol visibilty, public API, and internal function useage [In reply to]

Stephen Gran wrote:
>> This reminds me of https://wwws.clamav.net/bugzilla/show_bug.cgi?id=272
>>
>
> Yes, that's exactly the same issue.
>

Hi Stephen,

Check out latest SVN (r3462), it uses version maps.
The cli_* symbols have a different "version", they are marked
CLAMAV_PRIVATE.
I got the idea when I've seen symbols being marked as GLIBC_PRIVATE.

The cl_* symbols are CLAMAV_PUBLIC, maybe we'll use versions in the
future, but for now bumping soname revisions is working well enough IMHO.

I've also written a configure check to test if the linker supports
version maps (GNU and Solaris's should support).

Best regards,
--Edwin
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net

ClamAV devel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact lists@gossamer-threads.com
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.