Using Stub Objects |
Ali Bahrami Friday November 11, 2011
Having told the long and winding tale of where stub objects came from and how we use them to build Solaris, I'd like to focus now on the the nuts and bolts of building and using them. The following new features were added to the Solaris link-editor (ld) to support the production and use of stub objects:
- -z stub
- This new command line option informs ld that it is to build a stub object rather than a normal object. In this mode, it accepts the same command line arguments as usual, but will quietly ignore any objects and sharable object dependencies.
- STUB_OBJECT Mapfile Directive
- In order to build a stub version of an object, its mapfile must specify the STUB_OBJECT directive. When producing a non-stub object, the presence of STUB_OBJECT causes the link-editor to perform extra validation to ensure that the stub and non-stub objects will be compatible.
- ASSERT Mapfile Directive
- All data symbols exported from the object must have an ASSERT symbol directive in the mapfile that declares them as data and supplies the size, binding, bss attributes, and symbol aliasing details. When building the stub objects, the information in these ASSERT directives is used to create the data symbols. When building the real object, these ASSERT directives will ensure that the real object matches the linking interface presented by the stub.
Although ASSERT was added to the link-editor in order to support stub objects, they are a general purpose feature that can be used independently of stub objects. For instance you might choose to use an ASSERT directive if you have a symbol that must have a specific address in order for the object to operate properly and you want to automatically ensure that this will always be the case.
The material presented here is derived from a document I originally wrote during the development effort, which had the dual goals of providing supplemental materials for the stub object PSARC case, and as a set of edits that were eventually applied to the Oracle Solaris Linker and Libraries Manual (LLM). The Solaris 11 LLM contains this information in a more polished form.
When building a stub object, the link-editor ignores any object or library files specified on the command line, and these files need not exist in order to build a stub. Since the compilation step can be omitted, and because the link-editor has relatively little work to do, stub objects can be built very quickly.
Stub objects can be used to solve a variety of build problems:
Stub shared objects offer an alternative method for building code that sidesteps the above issues. Stub objects can be quickly built for all the shared objects produced by the build. Then, all the real shared objects and executables can be built in parallel, in any order, using the stub objects to stand in for the real objects at link-time. Afterwards, the executables and real shared objects are kept, and the stub shared objects are discarded.
- Speed
- Modern machines, using a version of make with the ability to parallelize operations, are capable of compiling and linking many objects simultaneously, and doing so offers significant speedups. However, it is typical that a given object will depend on other objects, and that there will be a core set of objects that nearly everything else depends on. It is necessary to impose an ordering that builds each object before any other object that requires it. This ordering creates bottlenecks that reduce the amount of parallelization that is possible and limits the overall speed at which the code can be built.
- Complexity/Correctness
- In a large body of code, there can be a large number of dependencies between the various objects. The makefiles or other build descriptions for these objects can become very complex and difficult to understand or maintain. The dependencies can change as the system evolves. This can cause a given set of makefiles to become slightly incorrect over time, leading to race conditions and mysterious rare build failures.
- Dependency Cycles
- It might be desirable to organize code as cooperating shared objects, each of which draw on the resources provided by the other. Such cycles cannot be supported in an environment where objects must be built before the objects that use them, even though the runtime linker is fully capable of loading and using such objects if they could be built.
Stub objects are built from a mapfile, which must satisfy the following requirements.
Given such a mapfile, the stub and real versions of the shared object can be built using the same command line for each, adding the '-z stub' option to the link for the stub object, and omiting the option from the link for the real object.
To demonstrate these ideas, the following code implements a shared object named idx5, which exports data from a 5 element array of integers, with each element initialized to contain its zero-based array index. This data is available as a global array, via an alternative alias data symbol with weak binding, and via a functional interface.
A mapfile is required to describe the interface provided by this shared object.% cat idx5.c int _idx5[5] = { 0, 1, 2, 3, 4 }; #pragma weak idx5 = _idx5 int idx5_func(int index) { if ((index < 0) || (index > 4)) return (-1); return (_idx5[index]); }
The following main program is used to print all the index values available from the idx5 shared object.% cat mapfile $mapfile_version 2 STUB_OBJECT; SYMBOL_SCOPE { _idx5 { ASSERT { TYPE=data; SIZE=4[5] }; }; idx5 { ASSERT { BINDING=weak; ALIAS=_idx5 }; }; idx5_func; local: *; };
The following commands create a stub version of this shared object in a subdirectory named stublib. elfdump is used to verify that the resulting object is a stub. The command used to build the stub differs from that of the real object only in the addition of the -z stub option, and the use of a different output file name. This demonstrates the ease with which stub generation can be added to an existing makefile.% cat main.c #include <stdio.h> extern int _idx5[5], idx5[5], idx5_func(int); int main(int argc, char **argv) { int i; for (i = 0; i < 5; i++) (void) printf("[%d] %d %d %d\n", i, _idx5[i], idx5[i], idx5_func(i)); return (0); }
The main program can now be built, using the stub object to stand in for the real shared object, and setting a runpath that will find the real object at runtime. However, as we have not yet built the real object, this program cannot yet be run. Attempts to cause the system to load the stub object are rejected, as the runtime linker knows that stub objects lack the actual code and data found in the real object, and cannot execute.% cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -zstub % ln -s libidx5.so.1 stublib/libidx5.so % elfdump -d stublib/libidx5.so | grep STUB [11] FLAGS_1 0x4000000 [ STUB ]
We build the real object using the same command as we used to build the stub, omitting the -z stub option, and writing the results to a different file.% cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc % ./a.out ld.so.1: a.out: fatal: libidx5.so.1: open failed: No such file or directory Killed % LD_PRELOAD=stublib/libidx5.so.1 ./a.out ld.so.1: a.out: fatal: stublib/libidx5.so.1: stub shared object cannot be used at runtime Killed
% cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o lib/libidx5.so.1
Once the real object has been built in the lib subdirectory, the program can be run.
% ./a.out [0] 0 0 0 [1] 1 1 1 [2] 2 2 2 [3] 3 3 3 [4] 4 4 4
The link-editor maintains an internal table of names that can be used in the logical expressions evaluated by $if and $elif. At startup, this table is initialized with items that describe the class of object (_ELF32 or _ELF64) and the type of the target machine (_sparc or _x86). We found that there were a small number of cases in the Solaris code base in which we needed to know what kind of object we were producing, so we added the following new predefined items in order to address that need:
Name Meaning ... ... _ET_DYN shared object _ET_EXEC executable object _ET_REL relocatable object ... ...
A stub shared object is built entirely from the information in the mapfiles supplied on the command line. When the -z stub option is specified to build a stub object, the presence of the STUB_OBJECT directive in a mapfile is required, and the link-editor uses the information in symbol ASSERT attributes to create global symbols that match those of the real object.STUB_OBJECT;
When the real object is built, the presence of STUB_OBJECT causes the link-editor to verify that the mapfiles accurately describe the real object interface, and that a stub object built from them will provide the same linking interface as the real object it represents.
These rules ensure that the mapfiles contain a description of the real shared object's linking interface that is sufficient to produce a stub object with a completely compatible linking interface.
ASSERT { ALIAS = symbol_name; BINDING = symbol_binding; TYPE = symbol_type; SH_ATTR = section_attributes; SIZE = size_value; SIZE = size_value[count]; };
The ASSERT attribute is used to specify the expected characteristics of the symbol. The link-editor compares the symbol characteristics that result from the link to those given by ASSERT attributes. If the real and asserted attributes do not agree, a fatal error is issued and the output object is not created.
In normal use, the link editor evaluates the ASSERT attribute when present, but does not require them, or provide default values for them. The presence of the STUB_OBJECT directive in a mapfile alters the interpretation of ASSERT to require them under some circumstances, and to supply default assertions if explicit ones are not present. See the definition of the STUB_OBJECT Directive for the details.
When the -z stub command line option is specified to build a stub object, the information provided by ASSERT attributes is used to define the attributes of the global symbols provided by the object.
ASSERT accepts the following:
- ALIAS
- Name of a previously defined symbol that this symbol is an alias for. An alias symbol has the same type, value, and size as the main symbol. The ALIAS attribute is mutually exclusive to the TYPE, SIZE, and SH_ATTR attributes, and cannot be used with them. When ALIAS is specified, the type, size, and section attributes are obtained from the alias symbol.
- BIND
- Specifies an ELF symbol binding, which can be any of the STB_ constants defined in <sys/elf.h>, with the STB_ prefix removed (e.g. GLOBAL, WEAK).
- TYPE
- Specifies an ELF symbol type, which can be any of the STT_ constants defined in <sys/elf.h>, with the STT_ prefix removed (e.g. OBJECT, COMMON, FUNC). In addition, for compatibility with other mapfile usage, FUNCTION and DATA can be specified, for STT_FUNC and STT_OBJECT, respectively. TYPE is mutually exclusive to ALIAS, and cannot be used in conjunction with it.
- SH_ATTR
- Specifies attributes of the section associated with the symbol. The section_attributes that can be specified are given in the following table:
SH_ATTR is mutually exclusive to ALIAS, and cannot be used in conjunction with it.
Section Attribute Meaning BITS Section is not of type SHT_NOBITS NOBITS Section is of type SHT_NOBITS - SIZE
- Specifies the expected symbol size. SIZE is mutually exclusive to ALIAS, and cannot be used in conjunction with it. The syntax for the size_value argument is as described in the discussion of the SIZE attribute below.
Prior to the introduction of the ASSERT SIZE attribute, the value of a SIZE attribute was always numeric. While attempting to apply ASSERT SIZE to the objects in the Solaris ON consolidation, I found that many data symbols have a size based on the natural machine wordsize for the class of object being produced. Variables declared as long, or as a pointer, will be 4 bytes in size in a 32-bit object, and 8 bytes in a 64-bit object. Initially, I employed the conditional $if directive to handle these cases as follows:
$if _ELF32 foo { ASSERT { TYPE=data; SIZE=4 } }; bar { ASSERT { TYPE=data; SIZE=20 } }; $elif _ELF64 foo { ASSERT { TYPE=data; SIZE=8 } }; bar { ASSERT { TYPE=data; SIZE=40 } }; $else $error UNKNOWN ELFCLASS $endif
I found that the situation occurs frequently enough that this is cumbersome. To simplify this case, I introduced the idea of the addrsize symbolic name, and of a repeat count, which together make it simple to specify machine word scalar or array symbols. Both the SIZE, and ASSERT SIZE attributes support this syntax:
The size_value argument can be a numeric value, or it can be the symbolic name addrsize. addrsize represents the size of a machine word capable of holding a memory address. The link-editor substitutes the value 4 for addrsize when building 32-bit objects, and the value 8 when building 64-bit objects. addrsize is useful for representing the size of pointer variables and C variables of type long, as it automatically adjusts for 32 and 64-bit objects without requiring the use of conditional input.Using this feature, the example above can be written more naturally as:The size_value argument can be optionally suffixed with a count value, enclosed in square brackets. If count is present, size_value and count are multiplied together to obtain the final size value.
foo { ASSERT { TYPE=data; SIZE=addrsize } }; bar { ASSERT { TYPE=data; SIZE=addrsize[5] } };
We have long advised against global data exported from shared objects. There are many ways in which global data does not fit well with dynamic linking. Stub objects simply provide one more reason to avoid this practice. It is always better to export all data via a functional interface. You should always hide your data, and make it available to your users via a function that they can call to acquire the address of the data item. However, If you do have to support global data for a stub, perhaps because you are working with an already existing object, it is still easilily done, as shown above.
Oracle does not like us to discuss hypothetical new features that don't exist in shipping product, so I'll end this section with a speculation. It might be possible to do more in this area to ease the difficulty of dealing with objects that have global data that the users of the library don't need. Perhaps someday...
and then to use the same link command you're currently using, with the addition of the -z stub option.STUB_OBJECT;
Happy Stubbing!
[20] Stub Objects | [22] Stub Proto: Not Just For Stub Objects |