ld -ztype, and Kernel Modules That Know What They Are

Ali Bahrami — Wednesday December 23, 2015

Surfing with the Linker-Aliens

There are 3 basic types of ELF object:

ET_REL

A relocatable object, commonly called a "dot O" file because of the ".o" file extension commonly used for it. Compilers take in source files and produce relocatable objects. Relocatable objects can subsequently be combined to create executables or shared objects. A relocatable object cannot be executed without first being processed by another link-edit (ld) step to create an executable or shared object.

ET_EXEC

An executable, often referred to as an a.out or simply as a "program". It is the first object in a process, and contains the main() function where execution begins. Executables are typically built to be mapped at a specific fixed virtual memory address.

ET_DYN

A shared object, sometimes called a "dynamic library". A shared object can be loaded dynamically at runtime to add executable code or data to a process. Shared objects are typically built with position independent code, and are intended to be mapped at any arbitrary address in the process where the operating system finds enough room.

We call executables and shared objects final objects, because they are in their final form and cannot be merged with other objects to create a new output object. Relocatable objects are not final, and can be combined with other relocatable objects to create a new executable, shared object, or even another relocatable object.
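The object type described above is recorded in the e_type field of the ELF header, a 2-byte value that follows the 16-byte e_ident array. As a minimal sketch (the function name elf_type is made up for illustration, and this is not how ld or file are implemented), the field can be read like this:

```python
import struct

# Standard ELF e_type values for the three basic object types.
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN"}


def elf_type(header: bytes) -> str:
    """Return the symbolic e_type given the first bytes of an ELF file.

    Hypothetical helper for illustration only.
    """
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF object")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        order = "<"
    elif header[5] == 2:
        order = ">"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type sits at offset 16 in both 32-bit and 64-bit ELF headers.
    (e_type,) = struct.unpack_from(order + "H", header, 16)
    return ET_NAMES.get(e_type, "other ({})".format(e_type))


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(elf_type(f.read(18)))
```

Run against a ".o" file, a sketch like this should report ET_REL. It also reports ET_REL for a traditional kernel module, and the basic type field alone cannot tell the two apart.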

These three ELF types correspond fairly directly to the types of objects we normally create and use. However, there are notable special cases:

Position Independent Executables (PIE) are implemented as ET_DYN objects that have an interpreter (runtime linker) specified. They play the role of an executable in the process, but they can be mapped at arbitrary addresses like a shared object. This facilitates security features such as ASLR.
Kernel modules have traditionally been implemented as relocatable objects, but within the kernel, they operate like shared objects and not normal "dot O" files. They are final objects, and would benefit from the link-editor knowing that at the time they're being built, and doing some kernel module specific finalization.
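The PIE case above suggests a simple heuristic: an ET_DYN object whose program headers include a PT_INTERP entry is probably playing the role of an executable. The sketch below assumes the whole file is available in memory, and the name looks_like_pie is hypothetical. It is only a heuristic, because nothing prevents a shared object from also naming an interpreter.

```python
import struct

ET_DYN = 3      # e_type value for shared objects and PIEs
PT_INTERP = 3   # program header type naming the interpreter


def looks_like_pie(image: bytes) -> bool:
    """Heuristic: ET_DYN object that carries a PT_INTERP program header.

    Hypothetical helper for illustration only.
    """
    if len(image) < 52 or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF object")
    is64 = image[4] == 2                      # e_ident[EI_CLASS]
    order = "<" if image[5] == 1 else ">"     # e_ident[EI_DATA]
    (e_type,) = struct.unpack_from(order + "H", image, 16)
    if e_type != ET_DYN:
        return False
    # Field offsets differ between the 32-bit and 64-bit ELF headers.
    if is64:
        (e_phoff,) = struct.unpack_from(order + "Q", image, 32)
        e_phentsize, e_phnum = struct.unpack_from(order + "HH", image, 54)
    else:
        (e_phoff,) = struct.unpack_from(order + "I", image, 28)
        e_phentsize, e_phnum = struct.unpack_from(order + "HH", image, 42)
    # p_type is the first 4-byte word of every program header.
    for i in range(e_phnum):
        (p_type,) = struct.unpack_from(order + "I", image, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Having to infer the answer from indirect evidence like this is the sort of guesswork that goes away once the object records what it is.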

In the past, we've dealt with these special cases by setting extra link-editor options to get the effects we were after. We realized, however, that it would be far more effective to give the link-editor a high-level option describing what we're trying to build. And so, in Solaris 11 Update 4, the ld command accepts a new -z type option. From the ld manpage:

    -z type=object-type

        Specify the type of object to create. The following
        object-types are available.

        exec
            A dynamic executable.

        kmod
            A kernel module.

        pie
            A position-independent executable. This option also
            asserts the -ztext option.

        reloc
            A relocatable object. This is equivalent to specifying
            the -r option.

        shared
            A shared object. This is equivalent to specifying the
            -G option. This option also asserts the -ztext option.

When -z type is used instead of more basic options such as -r or -G, ld is better able to provide the proper features without requiring the user to specify them. Better error handling also becomes possible, since incompatible options are easier to recognize. A very nice side effect of this high-level specification is that ld is also able to tag the objects with their PIE or KMOD status, so the file command is able to identify them accurately. For instance:

    % cc -Kpic -ztype=pie hello.c -o hello
    % ./hello
    hello
    % file hello
    hello: ELF 32-bit LSB dynamic lib 80386 Version 1 [SSE],
        position-independent executable, dynamically linked,
        not stripped

I find this most satisfying in the realm of kernel modules. Prior to joining Sun, I spent 2 decades writing userland code that hovered somewhere just above the system call level. I was familiar with kernel programming and concepts, but I had never actually understood that kernel modules are nothing more than dot O files that someone decided could be safely loaded into the kernel. I have to admit that I found the idea that this was reasonable somewhat baffling. And so, on Solaris 11.3 and older, you'll see output like this:

    % file /kernel/fs/amd64/zfs
    /kernel/fs/amd64/zfs:  ELF 64-bit LSB relocatable AMD64 Version 1 [SSE]

Whereas on Solaris 11 Update 4, where all kernel modules are built with -ztype=kmod, kernel modules know what they are:

    % file /kernel/fs/amd64/zfs
    /kernel/fs/amd64/zfs:   ELF 64-bit LSB relocatable AMD64 Version 1 [SSE],
        kernel module

For me, that would be reason enough to do it, but there are other benefits:

The link-editor can detect attempts to link kernel modules into userland code, and issue an error.
The link-editor can merge sections and do other operations that normally do not apply to relocatable objects.
Kernel modules can be linked against each other to satisfy external symbol references, with the NEEDED dynamic section elements generated automatically, rather than specified manually via the -dy -N options that used to be required.

Published Elsewhere

https://blogs.oracle.com/ali/explicit_kernel_modules/
