Linkage order

In short: when you invoke the linker, the order of object files does not matter, but the order of static libraries does. To illustrate this we construct some simple toy examples in which a test program xtest uses two libraries, libfoo.a and libbar.a.

To show you that this is relevant in the real world, these examples are first realized in C, using the GNU compiler and linker. You will also see that in the real world many technical and platform-dependent details are hidden.

We will not reveal all of them, but we will at least point to them. The intention is that you see enough of the real world to understand that the essential things are reflected by the ULM tools.

The quiz at the end of this page refers to this question on stackoverflow. This is supposed to show you that we are dealing with things that actually have practical relevance. And most of all, understanding what these guys are talking about should give you some confidence ;-)

GNU Linker: Example from the real world

From Session 10 you already got the idea that C code is just a more abstract, platform-independent description of assembly code. So when we use examples in C we merely hide details about the instruction set of the underlying architecture and about the calling conventions in use. The C compiler takes care of these details and produces the proper assembly code. So looking at the following C code (the source files xtest.c, foo.c and bar.c) you know how to implement it equivalently in assembly for the ULM. Now just believe that for any other architecture the assembly code is essentially the same:

/* xtest.c */
extern int
puts(const char *);

extern void
foo();

int
main()
{
    puts("main");
    foo();
    puts("MAIN");
}

/* foo.c */
extern int
puts(const char *);

extern void
bar();

void
foo()
{
    puts("foo");
    bar();
    puts("FOO");
}

/* bar.c */
extern int
puts(const char *);

void
bar()
{
    puts("bar");
}

With gcc you can generate an executable from these source files with a single command:

theon$ gcc -o xtest xtest.c foo.c bar.c
theon$ 

This command appears harmless but hides quite a few things. With the option -v you can see that gcc is used here as a convenient wrapper that calls a C compiler, an assembler and a linker. Don't be shy: go through the following shell box and breathe in some of the details:

theon$ gcc -v -o xtest xtest.c foo.c bar.c
Using built-in specs.
COLLECT_GCC=/opt/ulm/ballinrobe/bin/gcc
COLLECT_LTO_WRAPPER=/opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/lto-wrapper
Target: x86_64-pc-solaris2.11
Configured with: ../gcc-7.3.0/configure --target=x86_64-pc-solaris2.11 --disable-multilib --disable-bootstrap --enable-languages=c,c++,fortran --prefix=/opt/ulm/ballinrobe --with-local-prefix=/opt/ulm/ballinrobe --with-ld=/opt/ulm/ballinrobe/bin/ld --with-as=/opt/ulm/ballinrobe/bin/as --with-gnu-as --disable-nls --with-gmp=/opt/ulm/ballinrobe --with-mpfr=/opt/ulm/ballinrobe --with-mpc=/opt/ulm/ballinrobe --with-cloog=/opt/ulm/ballinrobe --with-ppl=/opt/ulm/ballinrobe --with-isl=/opt/ulm/ballinrobe --with-libiconv-prefix=/usr --with-build-time-tools=/opt/ulm/ballinrobe/bin --with-system-zlib
Thread model: posix
gcc version 7.3.0 (GCC) 
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/cc1 -quiet -v xtest.c -quiet -dumpbase xtest.c -mtune=generic -march=x86-64 -auxbase xtest -version -o /tmp/cc5DHe9b.s
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../../x86_64-pc-solaris2.11/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include
 /opt/ulm/ballinrobe/include
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include-fixed
 /usr/include
End of search list.
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 8809304f116acd2a0d5629041a7b213c
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/bin/as -v -V -Qy -s --64 -o /tmp/cc7CYDwd.o /tmp/cc5DHe9b.s
GNU assembler version 2.30 (i386-pc-solaris2.11) using BFD version (GNU Binutils) 2.30
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/cc1 -quiet -v foo.c -quiet -dumpbase foo.c -mtune=generic -march=x86-64 -auxbase foo -version -o /tmp/cc5DHe9b.s
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../../x86_64-pc-solaris2.11/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include
 /opt/ulm/ballinrobe/include
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include-fixed
 /usr/include
End of search list.
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 8809304f116acd2a0d5629041a7b213c
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/bin/as -v -V -Qy -s --64 -o /tmp/ccaGGMgc.o /tmp/cc5DHe9b.s
GNU assembler version 2.30 (i386-pc-solaris2.11) using BFD version (GNU Binutils) 2.30
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/cc1 -quiet -v bar.c -quiet -dumpbase bar.c -mtune=generic -march=x86-64 -auxbase bar -version -o /tmp/cc5DHe9b.s
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../../x86_64-pc-solaris2.11/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include
 /opt/ulm/ballinrobe/include
 /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/include-fixed
 /usr/include
End of search list.
GNU C11 (GCC) version 7.3.0 (x86_64-pc-solaris2.11)
        compiled by GNU C version 7.3.0, GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.15-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 8809304f116acd2a0d5629041a7b213c
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/bin/as -v -V -Qy -s --64 -o /tmp/cch6IUja.o /tmp/cc5DHe9b.s
GNU assembler version 2.30 (i386-pc-solaris2.11) using BFD version (GNU Binutils) 2.30
COMPILER_PATH=/opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/:/opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/:/opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/:/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/:/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/:/usr/ccs/bin/
LIBRARY_PATH=/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/:/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../amd64/:/lib/amd64/:/usr/lib/amd64/:/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/collect2 -plugin /opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/liblto_plugin.so -plugin-opt=/opt/ulm/ballinrobe/libexec/gcc/x86_64-pc-solaris2.11/7.3.0/lto-wrapper -plugin-opt=-fresolution=/tmp/cch4u8ha.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_eh -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_eh --eh-frame-hdr -V -m elf_x86_64_sol2 -Y P,/lib/amd64:/usr/lib/amd64 -Qy -o xtest /usr/lib/amd64/crt1.o /usr/lib/amd64/crti.o /usr/lib/amd64/values-Xa.o /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/crtbegin.o -L/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0 -L/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../../amd64 -L/lib/amd64 -L/usr/lib/amd64 -L/opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/../../.. /tmp/cc7CYDwd.o /tmp/ccaGGMgc.o /tmp/cch6IUja.o -lgcc -lgcc_eh -lc -lgcc -lgcc_eh /opt/ulm/ballinrobe/lib/gcc/x86_64-pc-solaris2.11/7.3.0/crtend.o /usr/lib/amd64/crtn.o
GNU ld (GNU Binutils) 2.30
  Supported emulations:
   elf_i386_sol2
   elf_i386_ldso
   elf_i386
   elf_iamcu
   elf_x86_64_sol2
   elf_x86_64
   elf_l1om
   elf_k1om
COLLECT_GCC_OPTIONS='-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
theon$ 

Simplified (we neglect the C preprocessor), what is going on behind the scenes can be described by this flowchart:

What is denoted as “some libraries” hides code for communicating with the operating system. For instance, these libraries contain a function _start, which calls the function main and passes its return value on as exit code, and a function puts for printing some text.

Some truths just for completeness

Truth be told, these libraries are so-called shared libraries; on Windows they are called dynamic-link libraries (DLLs). Compared to static libraries, the details behind shared libraries are a bit harder to explain, and they don't solve any problem that is relevant for us. Hence we only scratch the surface of this topic by mentioning it.

On Linux (try the following on heim) the topic of shared libraries can be postponed until there is an actual need for them. If you use the option -static, all libraries involved are actually static libraries:

heim$ gcc -static -v -o xtest xtest.c foo.c bar.c
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) 
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/6/cc1 -quiet -v -imultiarch x86_64-linux-gnu xtest.c -quiet -dumpbase xtest.c -mtune=generic -march=x86-64 -auxbase xtest -version -o /tmp/cc5VgL7Q.s
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/6/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/6/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: b8e5d7f3c4236757ee0871869b8330f3
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 as -v --64 -o /tmp/cc8UKBju.o /tmp/cc5VgL7Q.s
GNU assembler version 2.28 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.28
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/6/cc1 -quiet -v -imultiarch x86_64-linux-gnu foo.c -quiet -dumpbase foo.c -mtune=generic -march=x86-64 -auxbase foo -version -o /tmp/cc5VgL7Q.s
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/6/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/6/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: b8e5d7f3c4236757ee0871869b8330f3
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 as -v --64 -o /tmp/cc5zuJA7.o /tmp/cc5VgL7Q.s
GNU assembler version 2.28 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.28
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/6/cc1 -quiet -v -imultiarch x86_64-linux-gnu bar.c -quiet -dumpbase bar.c -mtune=generic -march=x86-64 -auxbase bar -version -o /tmp/cc5VgL7Q.s
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/6/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/6/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C11 (Debian 6.3.0-18+deb9u1) version 6.3.0 20170516 (x86_64-linux-gnu)
        compiled by GNU C version 6.3.0 20170516, GMP version 6.1.2, MPFR version 3.1.5, MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: b8e5d7f3c4236757ee0871869b8330f3
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 as -v --64 -o /tmp/cci12iWK.o /tmp/cc5VgL7Q.s
GNU assembler version 2.28 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.28
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/6/:/usr/lib/gcc/x86_64-linux-gnu/6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/6/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/6/:/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/6/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/6/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/6/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/6/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper -plugin-opt=-fresolution=/tmp/ccZSXZio.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_eh -plugin-opt=-pass-through=-lc --sysroot=/ --build-id -m elf_x86_64 --hash-style=gnu -static -o xtest /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/6/crtbeginT.o -L/usr/lib/gcc/x86_64-linux-gnu/6 -L/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/6/../../.. /tmp/cc8UKBju.o /tmp/cc5zuJA7.o /tmp/cci12iWK.o --start-group -lgcc -lgcc_eh -lc --end-group /usr/lib/gcc/x86_64-linux-gnu/6/crtend.o /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/crtn.o
COLLECT_GCC_OPTIONS='-static' '-v' '-o' 'xtest' '-mtune=generic' '-march=x86-64'
heim$ 

On other systems like Windows, MacOS or Solaris (which runs on theon) you would get a linker error because the vendors don't ship all required static libraries. The article “Static Linking - where did it go” explains some of the background.

Generate the object files

With the option -c the tool chain behind gcc stops after the C code has been translated into an object file:

theon$ gcc -c xtest.c
theon$ gcc -c foo.c
theon$ gcc -c bar.c
theon$ 

In the following we use these object files to illustrate how the GNU linker processes object files and static libraries.

Linking object files

Like the ULM linker, the GNU linker combines all object files it receives, regardless of whether they are required to resolve undefined symbols. The advantage is that the order of the object files does not matter:

theon$ gcc -o xtest xtest.o foo.o bar.o
theon$ xtest
main
foo
bar
FOO
MAIN
theon$ gcc -o xtest xtest.o bar.o foo.o
theon$ xtest
main
foo
bar
FOO
MAIN
theon$ gcc -o xtest bar.o foo.o xtest.o
theon$ xtest
main
foo
bar
FOO
MAIN
theon$ 

The disadvantage is that the executables might contain object files that are unused.

Generating and linking static libraries

Next we generate two static libraries that store foo.o and bar.o respectively:

theon$ ar cru libfoo.a foo.o
theon$ ranlib libfoo.a
theon$ ar cru libbar.a bar.o
theon$ ranlib libbar.a
theon$ 

Besides the object file, each archive contains an index as a member. However, the ar command on Solaris does not list it when you use the t option:

theon$ ar t libfoo.a
foo.o
theon$ ar t libbar.a
bar.o
theon$ 

Note that there are actually different ranlib implementations in the real world. On Linux and Solaris usually GNU ranlib is installed (which hides the index); on MacOS it is by default BSD ranlib (which does not hide the index). Read about ranlib on wikipedia to get an overview of popular implementations and how they differ. On Solaris and Linux you can see the index with nm -s:

theon$ nm -s libfoo.a

Archive index:
foo in foo.o

foo.o:
                 U bar
0000000000000000 T foo
                 U puts
theon$ nm -s libbar.a

Archive index:
bar in bar.o

bar.o:
0000000000000000 T bar
                 U puts
theon$ 
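To read such a listing the way a linker would, it helps to split the symbols of a library member into those it defines (type T) and those it leaves undefined (type U). The following is only a minimal sketch: it works on the nm lines for foo.o copied from the shell box above into a shell variable, so it runs without the object file, and the variable names are made up for this illustration.

```shell
#!/bin/sh
# nm lines for foo.o, copied from the listing above (not produced here).
nm_listing='                 U bar
0000000000000000 T foo
                 U puts'

# Undefined symbols: only two fields, the type "U" and the name.
undefined=$(printf '%s\n' "$nm_listing" | awk '$1 == "U" { print $2 }')

# Defined text symbols: three fields, the address, the type "T" and the name.
defined=$(printf '%s\n' "$nm_listing" | awk '$2 == "T" { print $3 }')

echo "foo.o defines: $defined"
# $undefined is left unquoted on purpose so that the names end up on one line.
echo "foo.o needs:" $undefined
```

So foo.o provides foo and needs bar and puts. The symbol puts is later resolved by the C library, whereas bar has to come from one of our own libraries.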

With these two libraries the following is the only order that works:

theon$ gcc -o xtest xtest.o libfoo.a libbar.a
theon$ xtest
main
foo
bar
FOO
MAIN
theon$ 

Any other order results in a linker error. For example:

theon$ gcc -o xtest xtest.o libbar.a libfoo.a
libfoo.a(foo.o): In function `foo':
foo.c:(.text+0x14): undefined reference to `bar'
collect2: error: ld returned 1 exit status
theon$ 

or

theon$ gcc -o xtest libfoo.a libbar.a xtest.o
xtest.o: In function `main':
xtest.c:(.text+0x14): undefined reference to `foo'
collect2: error: ld returned 1 exit status
theon$ 

ULM Linker: Example for non-circular dependencies

The real world is ugly and harsh, and in teaching you either have to hide some details (the “some library” or “shared library” stuff) or you can just mention them (but there is no time to reveal all the details). Therefore we reproduce the real-world example with the ULM tools. The advantage is that we know exactly what is going on here. There will be no “some libraries” involved for “doing something you cannot (and don't have to) understand in detail”.

The adapted example consists of three source files and the following call tree:

In the implementation the calling convention from Session 11 is used, as we have non-leaf functions here. In reality we would have at least 5 functions (_start, puts, main, foo and bar); here we simplify things:

  • Think of xtest.s as a merged combination of the functions _start and main:

            .equ    FP,             1
            .equ    SP,             2
            .equ    RET,            3
    
    //------------------------------------------------------------------------------
    // Function _start()
    //------------------------------------------------------------------------------
            .equ    ret,            0
            .equ    fp,             8
            .equ    rval,           16
    
            .text
            .globl _start
    _start:
            // begin of the function body
    
            ldzwq   0,              %SP
    
            // call function foo()
            subq    16,             %SP,            %SP
            ldzwq   foo,            %4
            jmp     %4,             %RET
            addq    16,             %SP,            %SP
    
            halt    0
    
  • Instead of calling an extra function puts, the strings are printed character by character using hard-coded putc instructions. This is foo.s, followed by bar.s:

            .equ    FP,             1
            .equ    SP,             2
            .equ    RET,            3
    
    //------------------------------------------------------------------------------
    // Function foo()
    //------------------------------------------------------------------------------
            .equ    ret,            0
            .equ    fp,             8
            .equ    rval,           16
    
            .text
            .globl  foo
    foo:
            // function prologue
            movq    %RET,           ret(%SP)
            movq    %FP,            fp(%SP)
            addq    0,              %SP,            %FP
            // begin of the function body
    
            /*
               print "foo\n"
            */
            putc    'f'
            putc    'o'
            putc    'o'
            putc    '\n'
    
            /*
               bar();
            */
            // call procedure bar()
            subq    16,             %SP,            %SP
            // bar() takes no arguments, so nothing has to be stored in 16(%SP)
            ldzwq   bar,            %4
            jmp     %4,             %RET
            addq    16,             %SP,            %SP
    
            /*
               print "FOO\n"
            */
            putc    'F'
            putc    'O'
            putc    'O'
            putc    '\n'
    
            // end of the function body
            // function epilogue
    foo.leave:
            addq    0,              %FP,            %SP
            movq    fp(%SP),        %FP
            movq    ret(%SP),       %RET
            jmp     %RET,           %0
    
            .equ    FP,             1
            .equ    SP,             2
            .equ    RET,            3
    
    //------------------------------------------------------------------------------
    // Function bar()
    //------------------------------------------------------------------------------
            .equ    ret,            0
            .equ    fp,             8
            .equ    rval,           16
    
            .text
            .globl  bar
    bar:
            // function prologue
            movq    %RET,           ret(%SP)
            movq    %FP,            fp(%SP)
            addq    0,              %SP,            %FP
            // begin of the function body
    
            /*
               print "bar\n"
            */
            putc    'b'
            putc    'a'
            putc    'r'
            putc    '\n'
    
            // end of the function body
            // function epilogue
    bar.leave:
            addq    0,              %FP,            %SP
            movq    fp(%SP),        %FP
            movq    ret(%SP),       %RET
            jmp     %RET,           %0
    

As in the examples with gcc we generate object files and static libraries. Using the ULM tools we can understand in more detail how linkage is done.

Order of object files does not matter

First create the object files:

theon$ ulmas -o xtest.o xtest.s
theon$ ulmas -o foo.o foo.s
theon$ ulmas -o bar.o bar.s
theon$ 

Now pass the object files to the linker in an arbitrary order. Here you see 3 of the 6 possibilities:

theon$ ulmld -o xtest xtest.o foo.o bar.o
theon$ ulm xtest
foo
bar
FOO
theon$ ulmld -o xtest xtest.o bar.o foo.o
theon$ ulm xtest
foo
bar
FOO
theon$ ulmld -o xtest bar.o foo.o xtest.o
theon$ ulm xtest
foo
bar
FOO
theon$ 

The order does not matter because an object file is always imported. This has the drawback that the executable might contain unused code. But the advantage is that we don't have to think about the order: as long as all symbols can be resolved in the union of all object files, we get an executable.

Order of static libraries does matter

Now we create the static libraries:

theon$ ar cru libfoo.a foo.o
theon$ ulmranlib libfoo.a
theon$ ar cru libbar.a bar.o
theon$ ulmranlib libbar.a
theon$ 

The linker processes its arguments in the order in which they appear. Recall that when a static library is passed, an object file is picked from it only if it resolves at least one symbol that is undefined at that point.

  • The only order that allows the linker to resolve all symbols is as follows:

    theon$ ulmld -o xtest xtest.o libfoo.a libbar.a
    theon$ 

    The linker first reads in the complete object file xtest.o. It defines (and exports) the symbol _start, which is required for an executable, but leaves one symbol unresolved, namely foo:

    theon$ grep "^T" xtest.o | grep _start
    T _start                      0x0000000000000000
    theon$ grep "^U" xtest.o
    U foo                         0x0000000000000000
    theon$ 

    Then the linker processes the static library libfoo.a. That means it checks whether the library contains an object file that can resolve the undefined symbol foo. For that the index of the library is searched. We can do this manually with:

    theon$ ar x libfoo.a __SYMTAB_INDEX
    theon$ grep foo __SYMTAB_INDEX
    T foo                         foo.o
    theon$ 

    Because of that the linker will import the object file foo.o from the library. This leads to another unresolved symbol bar:

    theon$ ar x libfoo.a foo.o
    theon$ grep "^U" foo.o
    U bar                         0x0000000000000000
    theon$ 

    This symbol cannot be resolved by an object file stored in libfoo.a:

    theon$ ar x libfoo.a __SYMTAB_INDEX
    theon$ grep bar __SYMTAB_INDEX
    theon$ 

    Hence the next argument gets processed, which is libbar.a. Again the linker checks whether it contains objects that resolve an undefined symbol, in this case bar:

    theon$ ar x libbar.a __SYMTAB_INDEX
    theon$ grep bar __SYMTAB_INDEX
    T bar                         bar.o
    theon$ 

    After picking bar.o from the library no more undefined symbols exist:

    theon$ ar x libbar.a bar.o
    theon$ grep "^U" bar.o
    theon$ 

    Hence linkage is successful and an executable gets created.

  • Let's reproduce why other orders fail. For example this order:

    theon$ ulmld -o xtest xtest.o libbar.a libfoo.a
    ulmld: execution aborted
    Unresolved symbol bar
    theon$ 

    After processing xtest.o the list of undefined symbols again just consists of foo. As libbar.a does not define foo, nothing gets extracted from it. The next argument is libfoo.a, which resolves foo but introduces the undefined symbol bar. The linker does not go back to libbar.a, which has already been processed, so bar remains unresolved.

  • Also consider this order:

    theon$ ulmld -o xtest libfoo.a libbar.a xtest.o
    ulmld: execution aborted
    Unresolved symbol foo
    theon$ 

    In this example neither libfoo.a nor libbar.a contains an object file that defines _start, and at the time they are processed no other symbol is requested. Hence nothing gets imported from them, and the linker only reads in xtest.o, which leaves the symbol foo unresolved.
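The three cases above can be replayed with a small shell sketch of the left-to-right rule. This is not how ulmld is implemented. It is a toy model under stated assumptions: the symbol tables are hard coded from the grep output shown above instead of being read from real files, _start is treated as the one symbol requested from the beginning, and all function names (defines_of, needs_of, members_of, simulate_link, ...) are made up for this illustration.

```shell
#!/bin/sh
# Hard-coded toy model of the example: which global symbols an object file
# defines and needs, and which object files a library contains.
defines_of() {
    case "$1" in
        xtest.o) echo "_start" ;;
        foo.o)   echo "foo" ;;
        bar.o)   echo "bar" ;;
    esac
}
needs_of() {
    case "$1" in
        xtest.o) echo "foo" ;;
        foo.o)   echo "bar" ;;
    esac
}
members_of() {
    case "$1" in
        libfoo.a) echo "foo.o" ;;
        libbar.a) echo "bar.o" ;;
    esac
}

# has LIST WORD: succeeds if WORD occurs in the space-separated LIST.
has() {
    case " $1 " in *" $2 "*) return 0 ;; esac
    return 1
}

# import_obj OBJ: record what the object defines and what it needs.
import_obj() {
    defined="$defined $(defines_of "$1")"
    wanted="$wanted $(needs_of "$1")"
    loaded="$loaded $1"
}

# unresolved: print every wanted symbol that is not defined so far.
unresolved() {
    for sym in $wanted; do
        has "$defined" "$sym" || echo "$sym"
    done
}

# simulate_link ARG...: process the arguments strictly from left to right.
simulate_link() {
    defined="" wanted="_start" loaded=""
    for arg in "$@"; do
        case "$arg" in
        *.o)    # object files are always imported
            import_obj "$arg" ;;
        *.a)    # library members are imported only if they resolve something
            changed=yes
            while [ "$changed" = yes ]; do
                changed=no
                for obj in $(members_of "$arg"); do
                    has "$loaded" "$obj" && continue
                    for sym in $(unresolved); do
                        if has "$(defines_of "$obj")" "$sym"; then
                            import_obj "$obj"; changed=yes; break
                        fi
                    done
                done
            done ;;
        esac
    done
    missing=$(unresolved)
    if [ -z "$missing" ]; then echo "ok"; else echo "Unresolved symbol" $missing; fi
}

simulate_link xtest.o libfoo.a libbar.a    # prints: ok
simulate_link xtest.o libbar.a libfoo.a    # prints: Unresolved symbol bar
simulate_link libfoo.a libbar.a xtest.o    # prints: Unresolved symbol foo
```

Passing only object files, for example simulate_link bar.o foo.o xtest.o, prints ok for every order, because the branch for object files never looks at the list of unresolved symbols.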

Quiz13: Circular dependencies

In general static libraries can have circular dependencies, e.g. libfoo.a depends on libbar.a, and libbar.a depends on libfoo.a. We will not discuss whether this form of dependency indicates that something is wrong with the overall design of the libraries (most likely it can be avoided by rethinking things). Just read this question on stackoverflow to see that you have to face these things in the real world whether you like it or not.

For the quiz you have to submit 4 files:

submit hpc quiz14 notes.txt xtest.o libfoo.a libbar.a

Here is a brief description of these files:

  • xtest.o is an object file for a test program,

  • libfoo.a and libbar.a are static libraries and

  • notes.txt is a file in which you can give some free-style answers to different questions.

The details about these files are as follows:

  • You have to write 7 functions in separate translation units as outlined in this flow chart:

  • In the implementation the “do something” should print some text so that you can trace the call tree by running the program without using a debugger (we want something simple).

  • Use the ULM assembler to generate object files for each translation unit. You don't have to write a makefile for that (but you can), i.e. you can do that manually:

    ulmas -o xtest.o xtest.s
    ulmas -o foo1.o foo1.s
    ulmas -o foo2.o foo2.s
    ulmas -o foo3.o foo3.s
    ulmas -o bar1.o bar1.s
    ulmas -o bar2.o bar2.s
    ulmas -o bar3.o bar3.s
    

    Or by using bash as follows (I hope such examples motivate you to learn more about bash):

    for i in *.s; do ulmas -o "${i%.s}.o" "$i"; done
    
  • Create the static libraries libfoo.a and libbar.a as follows:

    ar cru libfoo.a foo1.o foo2.o foo3.o
    ulmranlib libfoo.a
    ar cru libbar.a bar1.o bar2.o bar3.o
    ulmranlib libbar.a
    
  • In notes.txt give short answers to these questions. Just assume you would post the answer on stackoverflow, so keep it short, simple and essential (or assume you have to give the answer in an oral exam):

    • Why do you get a linker error when you try this:

      ulmld -o xtest xtest.o libfoo.a libbar.a
      
    • In the question on stackoverflow two possibilities for linkage are given. One was that circular dependencies can be handled with

      ulmld -o xtest xtest.o libfoo.a libbar.a libfoo.a
      

      Why is this not working here?

  • Does the following work?

    ulmld -o xtest xtest.o libfoo.a libbar.a libfoo.a libbar.a
    

    And are we closer to a working solution? How can this be extended so that we finally have a working solution?

  • How would you rank the answers given on stackoverflow? And why?
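A side remark on the bash one-liner for assembling all *.s files mentioned in the list above: such a loop relies on parameter expansion to turn a source file name into an object file name, and the operators % and # are easy to mix up. The following small sketch shows the difference; the file name foo1.s merely stands in for one of the source files.

```shell
#!/bin/sh
# Example file name, standing in for one of the *.s files.
i=foo1.s

# ${i%.s} removes a match of ".s" from the END of the value of i.
echo "${i%.s}.o"     # prints foo1.o

# ${i#.s} removes a match of ".s" from the BEGINNING of the value of i.
# "foo1.s" does not start with ".s", so nothing is removed.
echo "${i#.s}.o"     # prints foo1.s.o
```

Only the % form yields the object file names xtest.o, foo1.o and so on that are used when the libraries are created.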

I had no good idea about how to make an exercise for the following. But the most important things in life you don't learn because you were asked about them in an exercise. So just do it for fun (and don't take the “RTFM” seriously: I just want to prepare you for the slang used by coders in the wild, and I think it is funny):

In the question on stackoverflow the other solution was based on the linker options --start-group and --end-group. RTFM of GNU ld to find out what they mean. These options are also supported by the ULM linker. So try this:

ulmld -o xtest xtest.o --start-group libfoo.a libbar.a --end-group

and also this

ulmld -o xtest xtest.o -\( libfoo.a libbar.a -\)