Static libraries

So far each test program was linked against all the object files from our library. This means that the executables in general contain code that is never used. For example, the program xhello contained the code from putui.o. We will overcome this problem by creating a static library libulm.a:

  • Using the Unix command ar (note that we are using the “real thing”, not some reimplemented tool for the ULM) all object files will be bundled in a single file libulm.a.

  • A second tool ulmranlib (which is an adaptation of the Unix command ranlib) creates an index for the archive so that the linker can pick only those object files that are needed to resolve undefined symbols.

The build system needs to be able to update the library. That means if new source files are added, the corresponding object files need to be created and added to the library. If source files are deleted (or renamed), the corresponding object files have to be removed from the library.

Using ar for creating and maintaining an archive

Although ar is nowadays used almost exclusively for static libraries, that is just one possible application of this command. Here you will see some examples of adding files to an archive, as well as extracting and removing files from an archive. Think of creating backups as an alternative use case, or of sending several files by email after combining them into a single file (note that ar is much older than zip, which also does compression).

Creating an archive for two files

Files are not compressed by ar when you add them. So you can use an editor or cat to see the content of an archive. Here is an example with two text files that will be added to an archive:

file1:

This is 'file1'.

Just a simple text file.

file2:

This is 'file2'.

Just another text file.

In the following, ar is called with the options c, r and u to add file1 and file2 to an archive named my_archive:

theon$ ar cru my_archive file1 file2
theon$ cat my_archive
!<arch>
file1/          1594733838  13075 210   100664  43        `
This is 'file1'.

Just a simple text file.

file2/          1594733838  13075 210   100664  42        `
This is 'file2'.

Just another text file.
theon$ 
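The layout visible in this dump can be read back with a few lines of code. The following Python sketch assumes the common ar format: the magic line `!<arch>`, then for each member a 60 byte header (name, modification time, user id, group id, mode, size, terminated by a backquote and a newline) followed by the content, padded with a newline to an even length. The function names `make_member` and `parse_archive` are made up for this illustration, and special members (such as tables for long file names) are not handled.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # 16 + 12 + 6 + 6 + 8 + 10 + 2 bytes


def make_member(name, content, mtime=0, uid=0, gid=0, mode=0o100664):
    """Encode one member (hypothetical helper, only used to get test data)."""
    header = "%-16s%-12d%-6d%-6d%-8o%-10d`\n" % (
        name + "/", mtime, uid, gid, mode, len(content))
    data = header.encode("ascii") + content
    if len(content) % 2:      # members start at even offsets
        data += b"\n"
    return data


def parse_archive(data):
    """Return a list of (name, content) pairs for a simple ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip()
        if name.endswith("/"):            # names are stored as 'file1/'
            name = name[:-1]
        size = int(header[48:58].decode("ascii"))
        start = pos + HEADER_SIZE
        members.append((name, data[start:start + size]))
        pos = start + size + (size % 2)   # skip the padding newline
    return members


file1 = b"This is 'file1'.\n\nJust a simple text file.\n"
file2 = b"This is 'file2'.\n\nJust another text file.\n"
archive = AR_MAGIC + make_member("file1", file1) + make_member("file2", file2)

for name, content in parse_archive(archive):
    print(name, len(content))    # prints "file1 43" and "file2 42"
```

The padding rule explains the empty line after the content of file1 in the dump above: its size 43 is odd, so one newline is appended before the header of file2.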

The options are explained in more detail in the man page of ar. Here is a brief summary:

  • Because of c you don't get a warning if my_archive does not already exist. But even without this option an archive always gets created if it does not already exist.

  • From the content of my_archive you can see that ar also stores the names of the archived files. Using the option r means that if you add a file whose filename is already in the archive this entry gets overwritten.

  • With u such an existing entry only gets replaced if you add a file that is newer.
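The effect of r and u can be modeled in a few lines. This is only a sketch of the behaviour described above, not of how ar is implemented: the archive is represented as a Python dictionary that maps a member name to a pair of modification time and content, and the function name `ar_replace` is made up for this illustration.

```python
def ar_replace(archive, name, mtime, content, only_if_newer=False):
    """Model of 'ar r' (and 'ar ru' if only_if_newer is set).

    archive maps member name -> (mtime, content). Returns True if the
    member was added or replaced.
    """
    existing = archive.get(name)
    if existing is not None and only_if_newer and mtime <= existing[0]:
        return False                 # 'u': keep the entry, the file is not newer
    archive[name] = (mtime, content)  # 'r': add or overwrite the entry
    return True


archive = {}
ar_replace(archive, "file1", 100, b"first version")

# With 'u' an older file does not replace the archived one ...
print(ar_replace(archive, "file1", 50, b"older", only_if_newer=True))   # False

# ... but without 'u' the entry is always overwritten.
print(ar_replace(archive, "file1", 50, b"older"))                       # True
```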

Add another file to the archive

Let's add another file to the archive:

This is 'file3'.

And I am running out of ideas for some demo text.

You just specify to which archive this file gets added:

theon$ ar cru my_archive file3
theon$ cat my_archive
!<arch>
file1/          1594733838  13075 210   100664  43        `
This is 'file1'.

Just a simple text file.

file2/          1594733838  13075 210   100664  42        `
This is 'file2'.

Just another text file.
file3/          1594733838  13075 210   100664  68        `
This is 'file3'.

And I am running out of ideas for some demo text.
theon$ 

Get a table of contents

With the option t you can get a table of files stored in an archive:

theon$ ar t my_archive
file1
file2
file3
theon$ 

Extracting a file from an archive

In conjunction with static libraries we never have to extract files from an archive ourselves (the linker does that internally). Nevertheless it is possible. Here we delete a file that is stored in the archive and afterwards restore it from the archive:

theon$ ls
file1       file2       file3       my_archive
theon$ rm file1
theon$ ls
file2       file3       my_archive
theon$ ar x my_archive file1
theon$ ls
file1       file2       file3       my_archive
theon$ cat file1
This is 'file1'.

Just a simple text file.
theon$ 

Deleting a file from an archive

With the option d a member of the archive can be deleted:

theon$ ar d my_archive file1
theon$ ar t my_archive
file2
file3
theon$ 

ulmranlib: Creating an index for an archive

Before we consider makefiles for automating the build, we build a static library manually. Along the way the ulmranlib command will be introduced.

In a first step all object files are created and added to the archive libulm.a:

theon$ ulmas -o crt0.o crt0.s
theon$ ulmas -o putui.o putui.s
theon$ ulmas -o puts.o puts.s
theon$ ar cru libulm.a crt0.o putui.o puts.o
theon$ 

This result could already be used to create an executable. You simply pass it to the linker like an object file:

theon$ ulmas   -o xhello.o xhello.s
theon$ ulmld -o xhello xhello.o libulm.a
theon$ xhello
hello, world!
theon$ 

In its current form the archive libulm.a is treated by the ULM linker as a plain sequence of object files. So while we have to pass fewer files to the linker, we still have the problem that an executable contains the code of unused object files. In the case of xhello you can see that it still contains the code from putui.o:

theon$ grep libulm xhello
# from: libulm.a(crt0.o)
# from: libulm.a(puts.o)
# from: libulm.a(putui.o)
theon$ 

In order to pick only those object files that are needed to resolve undefined symbols, the archive needs an additional entry that gets created by ulmranlib (which, as mentioned above, resembles ranlib). The command expects an archive as argument, for which it creates and adds such an entry:

theon$ ulmranlib libulm.a
theon$ 

This created the archive member __SYMTAB_INDEX:

theon$ ar t libulm.a
crt0.o
putui.o
puts.o
__SYMTAB_INDEX
theon$ 

In this form libulm.a can finally be called a static library. As you can see, the index is just a table with entries stating which symbols are defined in which of the stored object files:

theon$ ar x libulm.a __SYMTAB_INDEX
theon$ cat __SYMTAB_INDEX
T _start                      crt0.o
T puts                        puts.o
T putui                       putui.o
theon$ rm __SYMTAB_INDEX
theon$ 
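Because the index is plain text, it is easy to turn it into a lookup table from symbol names to object files. The following Python sketch assumes that each line consists of three whitespace-separated fields, as in the output above. That the leading T marks a symbol in the text segment (as in the output of nm) is an assumption; the function name `parse_index` is made up for this illustration.

```python
def parse_index(text):
    """Map each symbol name to the archive member that defines it."""
    index = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue                  # ignore lines that do not fit the pattern
        kind, symbol, member = fields
        index[symbol] = member
    return index


index_text = """\
T _start                      crt0.o
T puts                        puts.o
T putui                       putui.o
"""

index = parse_index(index_text)
print(index["puts"])      # prints "puts.o"
```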

If the ULM linker finds such an index in the archive, it uses it to extract only those object files that resolve at least one unresolved symbol. Check the executable generated in the following to see that it no longer contains code from putui.o:

theon$ ulmas   -o xhello.o xhello.s
theon$ ulmld -o xhello xhello.o libulm.a
theon$ grep libulm xhello
# from: libulm.a(crt0.o)
# from: libulm.a(puts.o)
theon$ 

In general, adding another object file can lead to further undefined symbols, for example if an object file calls a function from another object file. Therefore, each time an object file has been added, the linker checks whether more object files can be extracted to resolve at least one unresolved symbol.
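This repeated check can be sketched as a simple loop. The Python code below is not the implementation of ulmld; it only illustrates the idea under some assumptions: the index is given as a dictionary from symbol to member, and for each member we know the sets of symbols it defines and references. Which symbols crt0.o and xhello.o actually reference is made up for the example, as is the assumption that the linker initially looks for the entry point `_start`.

```python
def select_members(undefined, defined, index, objects):
    """Pick archive members until no undefined symbol can be resolved.

    index:   symbol -> member that defines it
    objects: member -> (set of defined symbols, set of referenced symbols)
    Returns the selected members (in order) and the symbols left undefined.
    """
    defined = set(defined)
    undefined = set(undefined) - defined
    selected = []
    progress = True
    while progress:
        progress = False
        for symbol in sorted(undefined):
            member = index.get(symbol)
            if member is None or member in selected:
                continue
            member_defined, member_referenced = objects[member]
            selected.append(member)
            defined |= member_defined
            # the new member may introduce further undefined symbols
            undefined = (undefined | member_referenced) - defined
            progress = True
            break                      # start over with the updated set
    return selected, undefined


index = {"_start": "crt0.o", "puts": "puts.o", "putui": "putui.o"}
objects = {                            # hypothetical symbol sets
    "crt0.o": ({"_start"}, {"main"}),
    "puts.o": ({"puts"}, set()),
    "putui.o": ({"putui"}, set()),
}

# Assumption: xhello.o defines main and references puts.
selected, unresolved = select_members({"_start", "puts"}, {"main"},
                                      index, objects)
print(selected)       # ['crt0.o', 'puts.o']
print(unresolved)     # set()
```

With these assumed symbol sets putui.o is never selected, because no selected object file references putui.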

Makefile

In a first approach we make some minor modifications to the previous makefile. The name of the library is stored in the variable Lib:

Lib         := libulm.a

and this becomes in addition to the test programs a primary target:

all:    $(TestTargets) $(Lib)

Each test program now depends on this library instead of the list of object files in LibObjects:

$(TestTargets): % : %.o $(Lib)
        $(LD) -o $@ $^

This list of object files now defines the dependencies of the library, which gets created by ar and ulmranlib:

$(Lib): $(LibObjects)
        ar cru $@ $^
        ulmranlib $@

Altogether this leads to:

AS := ulmas
LD := ulmld

TestTargets := $(patsubst %.s,%,$(wildcard x*.s))
LibSources  := $(filter-out x%.s,$(wildcard *.s))
LibObjects  := $(patsubst %.s,%.o,$(LibSources))
Lib         := libulm.a

.PHONY: all clean

all:    $(TestTargets) $(Lib)

clean:
        $(RM) $(TestTargets) *.o $(Lib)

$(TestTargets): % : %.o $(Lib)
        $(LD) -o $@ $^

$(Lib): $(LibObjects)
        ar cru $@ $^
        ulmranlib $@

Now let's test that the build system only updates what is necessary:

  • First we build everything from scratch:

    theon$ make clean
    rm -f xanswer xhello *.o libulm.a
    theon$ make
    ulmas   -o xanswer.o xanswer.s
    ulmas   -o crt0.o crt0.s
    ulmas   -o putui.o putui.s
    ulmas   -o puts.o puts.s
    ar cru libulm.a crt0.o putui.o puts.o
    ulmranlib libulm.a
    ulmld -o xanswer xanswer.o libulm.a
    ulmas   -o xhello.o xhello.s
    ulmld -o xhello xhello.o libulm.a
    theon$ 
  • Next we simulate that some source file that is part of the library, e.g. puts.s, gets modified:

    theon$ touch puts.s
    theon$ make
    ulmas   -o puts.o puts.s
    ar cru libulm.a crt0.o putui.o puts.o
    ulmranlib libulm.a
    ulmld -o xanswer xanswer.o libulm.a
    ulmld -o xhello xhello.o libulm.a
    theon$ 

    Only for puts.s was the object file regenerated. Then the library was updated and the test programs were rebuilt by relinking the existing object files.

  • If the source file of a test program gets modified only its object file gets generated and linked against the library:

    theon$ touch xhello.s
    theon$ make
    ulmas   -o xhello.o xhello.s
    ulmld -o xhello xhello.o libulm.a
    theon$ 
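The decisions make took in these three experiments follow one rule: a target gets rebuilt if it does not exist or if one of its prerequisites is newer. The following Python sketch models this rule for the dependencies of the makefile above. It is a strongly simplified model, not how GNU make is implemented: modification times are plain integers in a dictionary, and the names `make`, `touch` and `build_all` are made up for this illustration.

```python
def make(target, rules, mtime, rebuilt):
    """Rebuild target if it is missing or older than a prerequisite."""
    if target not in rules:
        return                          # a source file, nothing to build
    prerequisites = rules[target]
    for p in prerequisites:
        make(p, rules, mtime, rebuilt)  # bring prerequisites up to date first
    if target not in mtime or any(mtime[p] > mtime[target]
                                  for p in prerequisites):
        mtime[target] = max(mtime.values()) + 1   # "now"
        rebuilt.append(target)


def touch(name, mtime):
    mtime[name] = max(mtime.values()) + 1


def build_all(rules, mtime):
    rebuilt = []
    for target in ["xanswer", "xhello", "libulm.a"]:
        make(target, rules, mtime, rebuilt)
    return rebuilt


rules = {
    "xanswer": ["xanswer.o", "libulm.a"],
    "xhello": ["xhello.o", "libulm.a"],
    "libulm.a": ["crt0.o", "putui.o", "puts.o"],
    "xanswer.o": ["xanswer.s"], "xhello.o": ["xhello.s"],
    "crt0.o": ["crt0.s"], "putui.o": ["putui.s"], "puts.o": ["puts.s"],
}
sources = ["xanswer.s", "xhello.s", "crt0.s", "putui.s", "puts.s"]
mtime = {s: 1 for s in sources}

print(build_all(rules, mtime))   # everything, in the order shown above
touch("puts.s", mtime)
print(build_all(rules, mtime))   # ['puts.o', 'libulm.a', 'xanswer', 'xhello']
touch("xhello.s", mtime)
print(build_all(rules, mtime))   # ['xhello.o', 'xhello']
```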

This makefile still has a minor flaw. If you rename or delete a source file of the library, its object code does not get removed from the library. In its current form our build system would require that you manually delete libulm.a.

More tweaks for the makefile

The following makefile takes care of renamed or deleted source files (you don't have to understand all the details, but I hope it encourages you to learn more about GNU make):

Lib         := libulm.a
TestTargets := $(patsubst %.s,%,$(wildcard x*.s))

LibSources  := $(filter-out x%.s,$(wildcard *.s))
LibObjects  := $(patsubst %.s,%.o,$(LibSources))

LibContent  = $(if $(wildcard $(Lib)),$(shell ar t $(Lib) | grep -v "^__"),)
LibRemoves  = $(filter-out $(LibObjects),$(LibContent))
SrcRemoves  = $(patsubst %.o,%.s,$(LibRemoves))
ArDelete    = $(if $(LibRemoves),ar d $(Lib) $(LibRemoves),)

AS := ulmas
LD := ulmld
LDFLAGS := $(Lib)
RANLIB := ulmranlib

.PHONY: all clean

all:    $(TestTargets) $(Lib)

clean:
        $(RM) $(TestTargets) *.o $(Lib)

$(TestTargets): % : %.o $(Lib)
        $(LD) -o $@ $^

$(Lib)(%.o) : %.o
        $(AR) cr $@ $^

$(SrcRemoves) :
        $(ArDelete)

$(Lib) : $(Lib)($(LibObjects)) $(SrcRemoves)
        $(RANLIB) $(Lib)
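What the variables LibContent and LibRemoves compute can be illustrated outside of make. The following Python sketch mirrors the two definitions under the assumption that the member list is what `ar t` prints for the library: members whose names begin with two underscores (like the index) are ignored, and every remaining member without a matching object file in LibObjects is considered stale. The function name `stale_members` is made up for this illustration.

```python
def stale_members(archive_members, lib_objects):
    """Members of the archive that no longer correspond to a source file."""
    # LibContent: everything in the archive except entries like __SYMTAB_INDEX
    content = [m for m in archive_members if not m.startswith("__")]
    # LibRemoves: what remains after filtering out the current object files
    return [m for m in content if m not in lib_objects]


# State after puts.s has been renamed to new_puts.s:
members = ["crt0.o", "putui.o", "puts.o", "__SYMTAB_INDEX"]
lib_objects = ["crt0.o", "new_puts.o", "putui.o"]

print(stale_members(members, lib_objects))   # ['puts.o']
```

For each stale member the makefile then derives a pseudo target in SrcRemoves, and the rule for these targets runs the `ar d` command stored in ArDelete.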

The makefile uses GNU make's features for rules that address archive members, conditional functions and the shell function. The details will not be covered here (but you can look them up in the documentation). Let's just demonstrate that it actually solves the problem mentioned above. We begin by building everything from scratch:

theon$ make clean
rm -f xanswer xhello *.o libulm.a
theon$ make
ulmas   -o xanswer.o xanswer.s
ulmas   -o crt0.o crt0.s
ar cr libulm.a crt0.o
ulmas   -o putui.o putui.s
ar cr libulm.a putui.o
ulmas   -o puts.o puts.s
ar cr libulm.a puts.o
ulmranlib libulm.a
ulmld -o xanswer xanswer.o libulm.a
ulmas   -o xhello.o xhello.s
ulmld -o xhello xhello.o libulm.a
rm puts.o crt0.o putui.o
theon$ 

The archive contains the objects we expect:

theon$ ar t libulm.a
crt0.o
putui.o
puts.o
__SYMTAB_INDEX
theon$ 

Now let's rename a file from the library. For example, puts.s becomes new_puts.s:

theon$ mv puts.s new_puts.s
theon$ make
ulmas   -o new_puts.o new_puts.s
ar cr libulm.a new_puts.o
ar d libulm.a puts.o
ulmranlib libulm.a
ulmld -o xanswer xanswer.o libulm.a
ulmld -o xhello xhello.o libulm.a
rm new_puts.o
theon$ 

The object file puts.o was removed, and new_puts.o was generated with the assembler and inserted:

theon$ ar t libulm.a
crt0.o
putui.o
__SYMTAB_INDEX
new_puts.o
theon$ 

Also note that all executables were generated by relinking only.