trinity-devel@lists.pearsoncomputing.net

Re: [trinity-devel] /etc/ld.so.cache

From: Nix <nix@...>
Date: Sun, 29 Apr 2012 23:43:36 +0100
On 29 Apr 2012, Darrell Anderson uttered the following:

>> > Do I need to add /opt/trinity/lib/trinity too or are
>> > subdirectories of /opt/trinity/lib automatically found by
>> > ldconfig?
>> 
>> No, you need to add any directory outside the 'standard
>> system set' (normally /usr/local/lib, /usr/lib, /lib, and any /lib32 /
>> /lib64 variants your distro may use). Subdirectories are not
>> automatically searched to satisfy DT_NEEDED entries. However, the
>> lib/trinity/ subdirectory is not loaded via DT_NEEDED entries but via
>> explicit dlopen(), which does no path searching at all because the
>> path needed is explicitly specified in the dlopen() call.
>> 
>> (And you need to run /sbin/ldconfig.)
>
> I don't understand the full technical aspects of what you wrote, :-)
> but what you wrote matches the particular error messages I'm seeing.

Sounds like I was incomprehensible as usual. Job done! :P

> Although I asked the question, I was leaning toward
> /opt/trinity/lib/trinity not being necessary in /etc/ld.so.conf
> because the equivalent /usr/lib/kde[3] never was needed with KDE3 and
> everything works as expected.

Stripping away some of my redundant geekberish, the rule is simple
enough:

If you link against a shared library via -L/some/dir/here -lfoo, then
the dynamic linker is going to be locating your library itself, at
process startup (or shared library load), so the directory in which that
library is located must be listed in /etc/ld.so.conf, and /sbin/ldconfig
must be rerun to update /etc/ld.so.cache.
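
A quick way to check the first half of that rule is to read the
configuration file and see whether the directory is listed. What follows
is only a rough sketch: the function names are made up for illustration,
and it assumes one directory per line plus glibc-style 'include' lines,
which is the common layout but not the whole grammar.

```python
import glob
import os


def read_ld_so_conf(path="/etc/ld.so.conf"):
    """Return the directories listed in an ld.so.conf-style file.

    Follows 'include' lines (glob patterns, relative to the including file).
    Assumes one directory per line; missing files yield an empty list.
    """
    dirs = []
    try:
        with open(path) as fh:
            lines = fh.read().splitlines()
    except OSError:
        return dirs
    base = os.path.dirname(path)
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        words = line.split()
        if words[0] == "include":
            for pattern in words[1:]:
                if not os.path.isabs(pattern):
                    pattern = os.path.join(base, pattern)
                for included in sorted(glob.glob(pattern)):
                    dirs.extend(read_ld_so_conf(included))
        else:
            dirs.append(line)
    return dirs


def is_listed(directory, path="/etc/ld.so.conf"):
    """True if 'directory' appears in the configuration (ignoring trailing slashes)."""
    wanted = os.path.normpath(directory)
    return any(os.path.normpath(d) == wanted for d in read_ld_so_conf(path))
```

Something like is_listed("/opt/trinity/lib") then tells you whether the
directory is configured. It says nothing about whether /sbin/ldconfig
has been rerun since the file was last changed.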

If you load a shared library via dlopen(), you have to give the path to
the library there, so there is no need to update ld.so.conf or update
ld.so.cache. (Plugins, like the stuff in /usr/lib/kde3 or
/opt/trinity/lib/trinity, are invariably opened via dlopen().)
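
The dlopen() case can be illustrated from Python, whose ctypes.CDLL is a
thin wrapper around dlopen() on POSIX systems. This is a sketch only:
the helper names are invented here. It relies on the documented dlopen()
behaviour that a name containing a slash is treated as a path and is not
searched for.

```python
import ctypes


def uses_search_path(name):
    """dlopen() only consults the library search path for names without a slash."""
    return "/" not in name


def load_plugin(path):
    """Load a plugin the way a plugin loader typically would: by explicit path.

    Refuses bare names, since those would fall back to the search path
    (ld.so.cache, LD_LIBRARY_PATH and friends).
    """
    if uses_search_path(path):
        raise ValueError("expected a path containing '/', got %r" % path)
    return ctypes.CDLL(path)  # calls dlopen(path, ...) underneath
```

A call such as load_plugin("/opt/trinity/lib/trinity/libexample.so")
(the file name is hypothetical) would work whether or not that directory
appears in ld.so.conf.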


Some extra complexities, rarely important, ludicrously overdesigned,
partly undocumented (I should fix that):

 - you can override the load path for shared libraries with the
   LD_LIBRARY_PATH environment variable. This is prepended to the list
   in ld.so.conf. Don't put a lot in here: it can't be cached and has to
   be searched on every process startup. Older Unixes with no ld.so
   cache often grew insanely long paths in their LD_LIBRARY_PATH. These
   days this is a sign of bad taste.

 - there is a 'system list', normally /lib and /usr/lib, which is
   searched anyway, even if not named in ld.so.conf. If you link with
   -z nodefaultlib, this is left out.

 - there are two ELF tags DT_RPATH and DT_RUNPATH which can also specify
   colon-separated paths for libraries linked with that executable or
   shared library. DT_RPATH is strongly deprecated because it is applied
   *before* LD_LIBRARY_PATH et al, so if it points into a dead network
   share you will freeze solid whenever you try to start the program.
   DT_RUNPATH is recommended instead: it is searched later. (In general
   it is rare to see either used, thank goodness.)

   There are a few magic tags that can be used in the paths in DT_RPATH
   and DT_RUNPATH. $ORIGIN means the directory containing the executable
   or shared library itself (and is only valid in non-setuid programs
   for hopefully obvious reasons); $PLATFORM is the platform
   name (which used to be the value returned by 'uname -p', but since
   this is almost always 'unknown' these days is now a value derived
   from the AT_PLATFORM entry in the ELF auxiliary vector supplied by
   the kernel: something like 'x86_64'); $LIB is the system library
   directory (/lib, /lib64, something like that).

 - the dynamic linker additionally searches a number of other
   directories under each directory named in ld.so.conf, in response to
   the hardware capabilities of the system (hence the glibc geek name
   for this 'hwcaps'). The underlying data for this consists of a 32-bit
   long passed down from the kernel in the ELF auxiliary vector: running
   any program with the LD_SHOW_AUXV=t environment variable set will
   show you this vector. The hwcap string is decoded by code in
   sysdeps/*/dl-procinfo.[ch] (e.g. sysdeps/i386/dl-procinfo.h): if
   present, subdirectories named after hwcaps found on the machine are
   searched before the directories named in ld.so.conf. (The hwcaps are
   often named the same as the flags in the flags string in
   /proc/cpuinfo, so you don't need to go through all this rigmarole
   just to figure out what your machine's hwcaps are). (There is an
   additional fake hwcap 'tls' which is present only if glibc is capable
   of thread-local storage, but this is rather irrelevant these days
   when virtually every system, desktop or not, has TLS support.)
   (Shared libraries can also add names to the currently-valid list of
   hwcaps, but I've never seen this feature used. You can mask out
   hwcaps you don't want the system to pay attention to using the
   LD_HWCAP_MASK environment variable, but this is really only a
   debugging aid, or at least I can't imagine another circumstance when
   you'd want to set it.)

   The upshot of all this is that if you have both e.g. MMX and
   non-MMX-capable versions of a library, you can put the MMX version in
   a subdirectory of the libdir named 'mmx' and it will be picked up
   automatically if the hardware is capable of MMX.

   This is most commonly used on 32-bit x86 to allow support of
   686-class machines that do not support the CMOV instruction (e.g.
   the Geode LX) by putting CMOV-capable libraries in a cmov/
   subdirectory, but it has other uses.

 - But that's not all! Any shared libraries named in the LD_PRELOAD
   environment variable will be loaded before *any* others. This means
   they can get in first and override symbols in those other libraries
   (often deferring to the original symbols later via
   dlsym (RTLD_NEXT, ...)). This is a useful hooking technique for all
   sorts of obscure purposes: e.g. fakeroot relies on it, as does the
   Electric Fence malloc debugger.

 - But that's not all! Any shared libraries named in /etc/ld.so.preload
   will be loaded before any others *systemwide*. Using this for
   anything at all is generally a sign of galloping insanity or being a
   toolchain or kernel developer (but I repeat myself).
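
For the hwcap item in the list above, the auxiliary vector can also be
read without LD_SHOW_AUXV. The sketch below assumes a Linux system that
exposes /proc/self/auxv, and a Python build whose native 'unsigned long'
matches the kernel's word size. The function names are made up; the AT_*
numbers are the ones from <elf.h>.

```python
import struct

AT_NULL = 0       # terminates the vector
AT_PLATFORM = 15  # address of the platform string (not the string itself)
AT_HWCAP = 16     # hardware capability bitmask


def parse_auxv(data):
    """Decode raw auxiliary-vector bytes into a {type: value} dict."""
    entry = struct.Struct("@LL")  # (type, value) pairs of native unsigned longs
    usable = len(data) - len(data) % entry.size
    result = {}
    for key, value in entry.iter_unpack(data[:usable]):
        if key == AT_NULL:
            break
        result[key] = value
    return result


def read_hwcap(path="/proc/self/auxv"):
    """Return the AT_HWCAP bitmask of this process, or None if unavailable."""
    try:
        with open(path, "rb") as fh:
            return parse_auxv(fh.read()).get(AT_HWCAP)
    except OSError:
        return None
```

The result is just the raw bitmask; turning bits into names like 'mmx'
is the job of the dl-procinfo tables mentioned above and is not
attempted here.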

So library loading happens in the order

(initial executable load only)
   (executable mapped by kernel)
   /lib/ld-linux.so.2 (or whatever is specified in the PT_INTERP program
                       header of the executable: mapped by kernel)
   /etc/ld.so.preload
   from LD_PRELOAD
(executables and shared libraries below here)
   from DT_RPATH tag (with all the $ORIGIN/$PLATFORM/$LIB madness)
   from LD_LIBRARY_PATH
   from DT_RUNPATH tag (as for DT_RPATH)
   /etc/ld.so.conf (with all the hwcap searching madness)
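
The directory-search part of that ordering (the last four lines) can be
written down as a small model. This is a sketch, not the dynamic
linker's real logic: the function names and defaults are invented, hwcap
subdirectories and preloading are left out, and empty path entries are
simply dropped. One detail is added from the ld.so manual page: glibc
ignores DT_RPATH on an object that also has DT_RUNPATH.

```python
def expand_tokens(entry, origin, platform="x86_64", lib="lib64"):
    """Expand $ORIGIN, $PLATFORM and $LIB (also in ${...} form) in one path entry.

    'platform' and 'lib' are illustrative defaults, not detected values.
    """
    for token, value in (("ORIGIN", origin), ("PLATFORM", platform), ("LIB", lib)):
        entry = entry.replace("${" + token + "}", value)
        entry = entry.replace("$" + token, value)
    return entry


def search_order(origin, rpath="", ld_library_path="", runpath="",
                 conf_dirs=(), system_dirs=("/lib", "/usr/lib")):
    """Return the directories to search for a DT_NEEDED library, in order.

    'origin' is the directory holding the object that carries the tags.
    """
    def split(value):
        return [p for p in value.split(":") if p]

    order = []
    if rpath and not runpath:  # DT_RPATH is ignored when DT_RUNPATH exists
        order += [expand_tokens(p, origin) for p in split(rpath)]
    order += split(ld_library_path)
    order += [expand_tokens(p, origin) for p in split(runpath)]
    order += list(conf_dirs)    # stands in for the ld.so.cache lookup
    order += list(system_dirs)  # the 'system list'
    return order
```

For example, search_order("/opt/app/bin", rpath="$ORIGIN/../lib",
conf_dirs=["/opt/trinity/lib"]) puts /opt/app/bin/../lib first, then
/opt/trinity/lib, then /lib and /usr/lib (/opt/app is a made-up path).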

If all of these things are in use at the same time, expect to get very,
very seriously confused! Thankfully almost all you ever need to pay
attention to is /etc/ld.so.conf, and (on proprietary systems)
LD_LIBRARY_PATH.

> The peculiar thing about this problem is only kword and kpresenter are
> affected. Possibly there are other "undefined symbol" problems in my
> builds that I have not yet noticed, but I'm guessing kword and
> kpresenter are not linking correctly during my builds. I don't know
> how to debug further or what to look for.

You might find the linker flag --no-undefined (-Wl,--no-undefined in
LDFLAGS) to be useful. (I thought Trinity was passing it already, but I
could be wrong.)

Other more-or-less-obscure things you might find useful in this hunt:

 - dynamic linker symbol debugging: set LD_DEBUG=symbols before running
   the program (very verbose). See also LD_DEBUG=help, which works with
   any dynamically-linked program at all, e.g. LD_DEBUG=help ls

 - linker symbol tracing, -Wl,--trace-symbol=SYMBOL in LDFLAGS,
   which prints every file the named symbol appears in
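
Since LD_DEBUG output is so verbose, it can help to capture it and
filter it afterwards. A minimal sketch, assuming glibc's default of
writing the trace to stderr (LD_DEBUG_OUTPUT not set); the helper names
are invented for illustration.

```python
import os
import subprocess


def ld_debug_env(category="symbols", base=None):
    """Return a copy of the environment with LD_DEBUG set to 'category'."""
    env = dict(os.environ if base is None else base)
    env["LD_DEBUG"] = category
    return env


def run_with_ld_debug(argv, category="symbols"):
    """Run argv under LD_DEBUG and return the dynamic linker's trace lines.

    The program's own stdout is discarded; its stderr is mixed in with
    the trace, since both go to the same stream.
    """
    proc = subprocess.run(
        argv,
        env=ld_debug_env(category),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",
    )
    return proc.stderr.splitlines()
```

You could then do something like
[l for l in run_with_ld_debug(["kword"]) if "SomeSymbol" in l]
to narrow the trace down to one symbol (SomeSymbol being a placeholder
for whatever the "undefined symbol" error names).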

-- 
NULL && (void)