Nagging As A Strategy For Better Linking: -z guidance |
Ali Bahrami Friday November 11, 2011
The link-editor (ld) in Solaris 11 has a new feature that we call guidance that is intended to help you build better objects. The basic idea behind guidance is that if (and only if) you request it, the link-editor will issue messages suggesting better options and other changes you might make to your ld command to get better results. You can choose to take the advice, or you can disable specific types of guidance while acting on others. In some ways, this works like an experienced friend leaning over your shoulder and giving you advice you're free to take it or leave it as you see fit, but you get nudged to do a better job than you might have otherwise.
We use guidance to build the core Solaris OS, and it has proven to be useful, both in improving our objects, and in making sure that regressions don't creep back in later. In this article, I'm going to describe the evolution in thinking and design that led to the implementation of the -z guidance option, as well as give a brief description of how it works.
The guidance feature issues non-fatal warnings. However, experience shows that once developers get used to ignoring warnings, it is inevitable that real problems will be lost in the noise and ignored or missed. This is why we have a zero tolerance policy against build noise in the core Solaris OS. In order to get maximum benefit from -z guidance while maintaining this policy, I added the -z fatal-warnings option at the same time.
Much of the material presented here is adapted from the arc case:
PSARC 2010/312 Link-editor guidance
Over the years, applications have grown in size and complexity. Important concepts like dynamic linking that didn't exist in the original Unix system were invented. Object file formats changed. In the case of System V Release 4 Unix derivatives like Solaris, the ELF (Extensible Linking Format) was adopted. Since then, the ELF system has evolved to provide tools needed to manage today's larger and more complex environments. Features such as lazy loading, and direct bindings have been added. In an ideal world, many of these options would be defaults, with rarely used options that allow the user to turn them off. However, the reality is exactly the reverse: For backward compatibility, these features are all options that must be explicitly turned on by the user. This has led to a situation in which most applications do not take advantage of the many improvements that have been made in linking over the last 20 years. If their code seems to link and run without issue, what motivation does a developer have to read a complex manpage, absorb the information provided, choose the features that matter for their application, and apply them? Experience shows that only the most motivated and diligent programmers will make that effort.
We know that most programs would be improved if we could just get you to use the various whizzy features that we provide, but the defaults conspire against us. We have long wanted to do something to make it easier for our users to use the linkers more effectively. There have been many conversations over the years regarding this issue, and how to address it. They always break down along the following lines:
This idea is simple, elegant, and impossible. Doing so would break a large number of existing applications, including those of ISVs, big customers, and a plethora of existing open source packages. In each case, the owner of that code may choose to follow our lead and fix their code, or they may view it as an invitation to reconsider their commitment to our platform. Backward compatibility, and our installed base of working software, is one of our greatest assets, and not something to be lightly put at risk. Breaking backward compatibility at this level of the system is likely to do more harm than good.
But, it sure is tempting.
The resulting link-editor would be a pleasure to use. However, the approach is doomed to niche status. There is a vast pile of exiting code in the world built around the existing ld command, that reaches back to the 1970's. ld use is embedded in large and unknown numbers of makefiles, and is used by name by compilers that execute it. A Unix link-editor that is not named ld will not find a majority audience no matter how good it might be.
Finally, a new linker command will eventually cease to be new, and will accumulate its own burden of backward compatibility issues.
6239804 make it easier for ld(1) to do what's bestThe idea is to have a '-z best' option that unchains ld from its backward compatibility commitment, and allows it to turn on the "best" set of features, as determined by the authors of ld. The specific set of features enabled by -z best would be subject to change over time, as requirements change.
This idea is more realistic than the other two, but was never implemented because it has some important issues that we could never answer to our satisfaction:
The idea for -z guidance came from consideration of these points. By decoupling the advice from the act of taking the advice, we can retain the good aspects of -z best while avoiding its pitfalls:
% ld -Dargs -z guidance=foo,nodefs debug: debug: Solaris Linkers: 5.11-1.2275 debug: debug: arg[1] option=-D: option-argument: args debug: arg[2] option=-z: option-argument: guidance=foo,nodefs debug: warning: unrecognized -z guidance item: foo
-z fatal-warnings | nofatal-warnings
--fatal-warnings | --no-fatal-warningsThe -z fatal-warnings and the --fatal-warnings option cause the link-editor to treat warnings as fatal errors.The -z nofatal-warnings and the --no-fatal-warnings option cause the link-editor to treat warnings as non-fatal. This is the default behavior.
The -z guidance option is defined as follows:
-z guidance[=item1,item2,...]Provide guidance messages to suggest ld options that can improve the quality of the resulting object, or which are otherwise considered to be beneficial. The specific guidance offered is subject to change over time as the system evolves. Obsolete guidance offered by older versions of ld may be dropped in new versions. Similarly, new guidance may be added to new versions of ld. Guidance therefore always represents current best practices.It is possible to enable guidance, while preventing specific guidance messages, by providing a list of item tokens, representing the class of guidance to be suppressed. In this way, unwanted advice can be suppressed without losing the benefit of other guidance. Unrecognized item tokens are quietly ignored by ld, allowing a given ld command line to be executed on a variety of older or newer versions of Solaris.
The guidance offered by the current version of ld, and the item tokens used to disable these messages, are as follows.
Specify Required Dependencies
Dynamic executables and shared objects should explicitly define all of the dependencies they require. Guidance recommends the use of the -z defs option, should any symbol references remain unsatisfied when building dynamic objects. This guidance can be disabled with -z guidance=nodefs.Do Not Specify Non-Required DependenciesDynamic executables and shared objects should not define any dependencies that do not satisfy the symbol references made by the dynamic object. Guidance recommends that unused dependencies be removed. This guidance can be disabled with -z guidance=nounused.Lazy LoadingDependencies should be identified for lazy loading. Guidance recommends the use of the -z lazyload option should any dependency be processed before either a -z lazyload or -z nolazyload option is encountered. This guidance can be disabled with -z guidance=nolazyload.Direct BindingsDependencies should be referenced with direct bindings. Guidance recommends the use of the -B direct, or -z direct options should any dependency be processed before either of these options, or the -z nodirect option is encountered. This guidance can be disabled with -z guidance=nodirect.Pure Text SegmentDynamic objects should not contain relocations to non-writable, allocable sections. Guidance recommends compiling objects with Position Independent Code (PIC) should any relocations against the text segment remain, and neither the -z textwarn or -z textoff options are encountered. This guidance can be disabled with -z guidance=notext.Mapfile SyntaxAll mapfiles should use the version 2 mapfile syntax. Guidance recommends the use of the version 2 syntax should any mapfiles be encountered that use the version 1 syntax. This guidance can be disabled with -z guidance=nomapfile.Library Search PathInappropriate dependencies that are encountered by ld are quietly ignored. For example, a 32-bit dependency that is encountered when generating a 64-bit object is ignored. These dependencies can result from incorrect search path settings, such as supplying an incorrect -L option. Although benign, this dependency processing is wasteful, and might hide a build problem that should be solved. Guidance recommends the removal of any inappropriate dependencies. This guidance can be disabled with -z guidance=nolibpath.In addition, -z guidance=noall can be used to entirely disable the guidance feature. See Chapter 7, Link-Editor Quick Reference, in the Linker and Libraries Guide for more information on guidance and advice for building better objects.Example
The following example demonstrates how the guidance feature is intended to work. We will build a shared object that has a variety of shortcomings:
- Does not specify all it's dependencies
- Specifies dependencies it does not use
- Does not use direct bindings
- Uses a version 1 mapfile
- Contains relocations to the readonly allocable text (not PIC)
This scenario is sadly very common many shared objects have one or more of these issues.
% cat hello.c #include <stdio.h> #include <unistd.h> void hello(void) { printf("hello user %d\n", getpid()); } % cat mapfile.v1 # This version 1 mapfile will trigger a guidance message % cc hello.c -o hello.so -G -M mapfile.v1 -lelfAs you can see, the operation completes without error, resulting in a usable object. However, turning on guidance reveals a number of things that could be better:
% cc hello.c -o hello.so -G -M mapfile.v1 -lelf -zguidance ld: guidance: version 2 mapfile syntax recommended: mapfile.v1 ld: guidance: -z lazyload option recommended before first dependency ld: guidance: -B direct or -z direct option recommended before first dependency Undefined first referenced symbol in file getpid hello.o (symbol belongs to implicit dependency /lib/libc.so.1) printf hello.o (symbol belongs to implicit dependency /lib/libc.so.1) ld: warning: symbol referencing errors ld: guidance: -z defs option recommended for shared objects ld: guidance: removal of unused dependency recommended: libelf.so.1 warning: Text relocation remains referenced against symbol offset in file .rodata1 (section) 0xa hello.o getpid 0x4 hello.o printf 0xf hello.o ld: guidance: position independent (PIC) code recommended for shared objects ld: guidance: see ld(1) -z guidance for more informationGiven the explicit advice in the above guidance messages, it is relatively easy to modify the example to do the right things:
% cat mapfile.v2 # This version 2 mapfile will not trigger a guidance message $mapfile_version 2 % cc hello.c -o hello.so -Kpic -G -Bdirect -M mapfile.v2 -lc -zguidanceThere are situations in which the guidance does not fit the object being built. For instance, you want to build an object without direct bindings:
% cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance ld: guidance: -B direct or -z direct option recommended before first dependency ld: guidance: see ld(1) -z guidance for more informationIt is easy to disable that specific guidance warning without losing the overall benefit from allowing the remainder of the guidance feature to operate:
% cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance=nodirectConclusions
The linking guidelines enforced by the ld guidance feature correspond rather directly to our standards for building the core Solaris OS. I'm sure that comes as no surprise. It only makes sense that we would want to build our own product as well as we know how. Solaris is usually the first significant test for any new linker feature. We now enable guidance by default for all builds, and the effect has been very positive.Guidance helps us find suboptimal objects more quickly. Programmers get concrete advice for what to change instead of vague generalities. Even in the cases where we override the guidance, the makefile rules to do so serve as documentation of the fact.
Deciding to use guidance is likely to cause some up front work for most code, as it forces you to consider using new features such as direct bindings. Such investigation is worthwhile, but does not come for free. However, the guidance suggestions offer a structured and straightforward way to tackle modernizing your objects, and once that work is done, for keeping them that way. The investment is often worth it, and will replay you in terms of better performance and fewer problems. I hope that you find guidance to be as useful as we have.
Surfing with the Linker-Aliens
Published Elsewhere
https://blogs.oracle.com/ali/entry/nagging_as_a_strategy_for/
https://blogs.oracle.com/ali/nagging-as-a-strategy-for-better-linking:-z-guidance/
Surfing with the Linker-Aliens
[18] 64-bit Archives Blog Index (ali) [20] Stub Objects