Inside ELF Symbol Tables |
Ali Bahrami Saturday September 23, 2006
ELF files are full of things we need to keep track of for later access: Names, addresses, sizes, and intended purpose. Without this information, an ELF file would not be very useful. We would have no way to make sense of the impenetrable mass of octal or hexidecimal numbers.
Consider: When you write a program in any language above direct machine code, you give symbolic names to functions and data. The compiler turns these things into code. At the machine level, they are known only by their address (offset within the file) and their size. There are no names in this machine code. How then, can a linker combine multiple object files, or a symbolic debugger know what name to use for a given address? How do we make sense of these files?
Symbols are the way we manage this information. Compilers generate symbol information along with code. Linkers manipulate symbols, reading them in, matching them up, and writing them out. Almost everything a linker does is driven by symbols. Finally, debuggers use them to figure out what they are looking at and to provide you with a human readable view of that information.
It is therefore a rare ELF file that doesn't have a symbol table. However, most programmers have only an abstract knowledge that symbol tables exist, and that they loosely correspond to their functions and data, and some "other stuff". Protected by the abstractions of compiler, linker, and debugger, we don't usually need to know too much about the details of how a symbol table is organized. I've recently completed a project that required me to learn about symbol tables in great detail. Today, I'm going to write about the symbol tables used by the linker.
The dynsym is a smaller version of the symtab that only contains global symbols. The information found in the dynsym is therefore also found in the symtab, while the reverse is not necessarily true. You are almost certainly wondering why we complicate the world with two symbol tables. Won't one table do? Yes, it would, but at the cost of using more memory than necessary in the running process.
To understand how this works, we need to understand the difference between allocable and a non-allocable ELF sections. ELF files contain some sections (e.g. code and data) needed at runtime by the process that uses them. These sections are marked as being allocable. There are many other sections that are needed by linkers, debuggers, and other such tools, but which are not needed by the running program. These are said to be non-allocable. When a linker builds an ELF file, it gathers all of the allocable sections together in one part of the file, and all of the non-allocable sections are placed elsewhere. When the operating system loads the resulting file, only the allocable part is mapped into memory. The non-allocable part remains in the file, but is not visible in memory. strip(1) can be used to remove certain non-allocable sections from a file. This reduces file size by throwing away information. The program is still runnable, but debuggers may be hampered in their ability to tell you what the program is doing.
The full symbol table contains a large amount of data needed to link or debug our files, but not needed at runtime. In fact, in the days before sharable libraries and dynamic linking, none of it was needed at runtime. There was a single, non-allocable symbol table (reasonably named "symtab"). When dynamic linking was added to the system, the original designers faced a choice: Make the symtab allocable, or provide a second smaller allocable copy. The symbols needed at runtime are a small subset of the total, so a second symbol table saves virtual memory in the running process. This is an important consideration. Hence, a second symbol table was invented for dynamic linking, and consequently named "dynsym".
And so, we have two symbol tables. The symtab contains everything, but it is non-allocable, can be stripped, and has no runtime cost. The dynsym is allocable, and contains the symbols needed to support runtime operation. This division has served us well over the years.
- STT_NOTYPE
- Used when we don't know what a symbol is, or to indicate the absence of a symbol.
- STT_OBJECT / STT_COMMON
- These are both used to represent data. (The word OBJECT in this context should not interpreted as having anything to do with object orientation. STT_DATA might have been a better name.)
STT_OBJECT is used for normal variable definitions, while STT_COMMON is used for tentative definitions. See my earlier blog entry about tentative symbols for more information on the differences between them.
- STT_FUNC
- A function, or other executable code.
- STT_SECTION
- When I first started learning about ELF, and someone would say something about "section symbols", I thought they meant a symbol from some given section. That's not it though: A section symbol is a symbol that is used to refer to the section itself. They are used mainly when performing relocations, which are often specified in the form of "modify the value at offset XXX relative to the start of section YYY".
- STT_FILE
- The name of a file, either of an input file used to construct the ELF file, or of the ELF file itself.
- STT_TLS
- A third type of data symbol, used for thread local data. A thread local variable is a variable that is unique to each thread. For instance, if I declare the variable "foo" to be thread local, then every thread has a separate foo variable of their own, and they do not see or share values from the other threads. Thread local variables are created for each thread when the thread is created. As such, their number (one per thread) and addresses (depends on when the thread is created, and how many threads there are) are unknown until runtime. An ELF file cannot contain an address for them. Instead, a STT_TLS symbol is used. The value of a STT_TLS symbol is an offset, which is used to calculate a TLS offset relative to the thread pointer. You can read more about TLS in the Linker And Libraries Guide.
- STT_REGISTER
- The Sparc architecture has a concept known as a "register symbol". These symbols are used to validate symbol/register usage, and can also be used to initialize global registers. Other architectures don't use these.
In addition to symbol type, each symbols has other attributes:
Machines are much larger than they used to be. The memory saved by the symtab/dynsym division is still a good thing, but there are times when we wish that the dynsym contained a bit more data. This is harder than it sounds. The layout of dynsym interacts with the rest of an ELF file in ways that are set in stone by years of existing practice. Backward compatibility is a critical feature of Solaris. We try extremely hard to keep those old programs running. And yet, the needs of observability, spearheaded by important new features like DTrace, put pressure on us in the other direction.
This discussion is prelude to work I recently did to augment the dynsym to contain local symbols, while preserving full backward compatibility with older versions of Solaris. I plan to cover that in a future blog entry. ELF is old, and much of how it works cannot be changed. Its original designers (our "Founding Fathers", as Rod calls them) anticipated that this would be the case, based no doubt on hard experience with earlier systems. The ELF design is therefore uniquely flexible, which explains why it has survived as long as it has. There is always a way to add something new. Sometimes, it takes several tries to find the best way.
[3] "Tentative" Symbols? | [5] What Is .SUNW_ldynsym? |