ELF Section Compression
Ali Bahrami Thursday December 24, 2015
Surfing with the Linker-Aliens
In December of 2015, we added public and documented ELF functions for
section compression to libelf, in cooperation with the GNU elfutils project,
whose libelf delivers the same routines. This means that the ELF world
now has a stable and portable way to compress non-allocable (debug) data
in an ELF object, which is a big deal in ELF circles. For us,
this was the culmination of a 4-year journey. In this article, I'll tell
that story, and describe the new ELF functionality, as well as the older GNU
compression scheme that it supersedes.
Part 1: A Better Format, Private APIs, and Public Functionality
Back in 2012, we introduced a new ELF object type in
Solaris 11 Update 1 called the Ancillary Object, which was
the topic of
a previous blog entry. Ancillary objects let you
push all of the debug data associated with an object into a separate
file. As previously discussed, modern DWARF debug data can be huge, up
to 10X the size of the code it describes, making objects too large for some
applications. Separate debug files are one way to address this. The other
obvious approach is compression.
One thing that I didn't mention in that 2012 article is that I initially
proposed ancillary objects hoping in part that they would obviate the
need to provide compression. I had studied both options, and ancillary
objects were far and away the easier option, and potentially more efficient as
they don't entail the overhead of compression on creation, and decompression
on access. I suggested that we do ancillary objects instead of compression, and
everyone agreed that would be fine. Predictably, that bargain didn't hold,
and the first request for compression arrived shortly after support for
ancillary objects was delivered. And in fact, that's completely
reasonable, as the two techniques are orthogonal, and can even be profitably
combined to create smaller separate debug files. Compression may be less
efficient in terms of CPU, but is more efficient in terms of
network bandwidth (which matters) and disk space (which these days largely
does not). As is always the case in software, context matters.
The problem was that there was no standard ELF format for representing
compressed sections. The GNU folks did have an extension for doing compression:
- Uncompressed sections all start with a ".debug" prefix, and when
compressed, are renamed to start with ".zdebug".
- The first 4 bytes of the data area contain the bytes
[ 'Z', 'L', 'I', 'B'].
- The next 8 bytes encode a 64-bit integer in big endian byte
order, encoding the size of the uncompressed data. This is true
regardless of the type of machine the object describes (for
instance, a 32-bit little endian machine). This integer is
potentially unaligned, and must be read byte by byte.
You can think of this in terms of a header, followed by an uninterpreted
stream of compressed bytes, where the header looks like:
    typedef struct {
        uchar_t gch_magic[4];   /* [ 'Z', 'L', 'I', 'B'] */
        uchar_t gch_size[8];    /* unaligned 64-bit ELFDATAMSB integer */
    } Chdr_GNU;
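
As a minimal sketch of how a reader might interpret this header, the function below checks the magic bytes and then assembles the size one byte at a time, since the field is big endian and possibly unaligned. The names gnu_chdr_parse and GNU_CHDR_SIZE are invented for this illustration and are not part of any libelf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GNU_CHDR_SIZE 12    /* 4 magic bytes + 8 size bytes */

/*
 * Parse the old GNU compression header at the start of a section's data.
 * Returns 0 and stores the uncompressed size in *size_out on success,
 * or -1 if the buffer is too short or the "ZLIB" magic is missing.
 * (Hypothetical helper, for illustration only.)
 */
static int
gnu_chdr_parse(const unsigned char *data, size_t len, uint64_t *size_out)
{
    uint64_t size = 0;
    size_t i;

    if (len < GNU_CHDR_SIZE || memcmp(data, "ZLIB", 4) != 0)
        return -1;

    /* Big endian, possibly unaligned: assemble one byte at a time. */
    for (i = 4; i < GNU_CHDR_SIZE; i++)
        size = (size << 8) | data[i];

    *size_out = size;
    return 0;
}
```

The compressed ZLIB stream begins immediately after these 12 bytes.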
There are some good things about this format: It extends ELF to provide
compression without creating a blizzard of new section types, and it is
simple to describe. And ZLIB is clearly a solid choice for compression.
However, there are a number of significant problems:
- Limiting compressible section names to those starting with a
".debug" prefix artificially limits sections that might otherwise
be compressed, such as the stab and annotate sections produced by
the non-GNU compilers, including the Sun/Oracle Studio compilers.
The changing names are confusing to users, but more fundamentally,
ELF features of this nature should not be tied to section names at all.
- ELF files start with a magic string ("\177ELF"), but this is the only place in the
format where a character based "magic number" is employed. This serves
as a bootstrap, to allow the reader to properly interpret the ELF header
and then through it, the rest of the file. Everywhere else, attributes
are identified as well known integers, or bit valued flag assignments.
The "ZLIB" magic number is contrary to this. Compression should have
been identified with a section header flag, or a specific section type.
- ELF objects encode addresses, sizes, and other such integers as
machine sized words in the byte order described by the ELF header.
The use of a 64-bit big endian integer "no matter what", and the fact
that it's not properly aligned, are contrary to this intent. It's just
not very ELF-like.
- This format does not capture the fact that the compressed and
uncompressed data may have differing alignment requirements.
The value found in the section header sh_addralign field is instead
set to the more restrictive of the two. A better format would explicitly
capture the uncompressed alignment requirement along with the
uncompressed data size.
I do not know the real history behind this, but the design is clearly
aimed at avoiding changes to existing ELF tools, and in particular,
to avoid changing libelf. That seems like the sort of thing one might
do as an initial experiment, intending to come back and do it properly
once the experiment has been shown to be a success. My hacker intuition,
not based on any verified facts, says that this was a prototype that
escaped from the lab. Whatever the history, this original GNU format
is the starting point for any discussion of ELF compression.
It was obvious to me that if we were to support ELF compression in Solaris
at all, we'd want to support this format for compatibility with external
tool chains such as gcc. However, it was equally clear that it wasn't
going to be sufficient for our purposes. One option was to invent an alternative
Solaris-only feature. A better, but harder, path would be to make an
attempt to extend the generic ELF ABI to provide the necessary functionality.
I decided to give it a shot.
I asked some questions in the ELF community, and learned that there was
willingness to cooperate on an improved format. The GNU feature wasn't widely
used yet, so perhaps it wasn't too late for an alternative to gain traction.
And so,
I proposed an improved format in July of 2012 for the generic ELF ABI (gABI).
This replacement format intentionally retained much of the original design,
hoping to make it easy for existing code to transition to it.
In particular, the basic theory of operation is the same:
- There is still a compression header written at the head of the data.
- ZLIB, which is unique in the industry for its stability
and longevity, is still used. In fact, the actual ZLIB byte streams
are identical in the GNU and gABI formats. The differences are all
in the section metadata.
The important differences are:
- Compression is indicated by a section header flag (SHF_COMPRESSED)
rather than by section name.
- Section names are not changed when sections are compressed or
decompressed.
- The compression header contains three items:
- compression type
- uncompressed data size
- uncompressed data alignment
- All of these fields are properly aligned machine sized words that
employ the encoding specified by the object ELF header.
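
To make the layout concrete, here is a hedged sketch that decodes the 64-bit form of this header from raw section data. The struct mirrors the gABI Elf64_Chdr as I understand it (the 64-bit variant carries a reserved word after the type so that the size stays 8-byte aligned), and the constants repeat the gABI values of SHF_COMPRESSED and ELFCOMPRESS_ZLIB. Every name prefixed with sketch_ or SKETCH_, along with read_uint and chdr64_decode, is an assumption made for this example. Real code would use the definitions from the system ELF headers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SHF_COMPRESSED   0x800u  /* gABI value of SHF_COMPRESSED */
#define SKETCH_ELFCOMPRESS_ZLIB 1u      /* gABI value of ELFCOMPRESS_ZLIB */

/* Local mirror of the 64-bit gABI compression header (Elf64_Chdr). */
typedef struct {
    uint32_t ch_type;       /* compression type */
    uint32_t ch_reserved;   /* padding so ch_size is 8-byte aligned */
    uint64_t ch_size;       /* uncompressed data size */
    uint64_t ch_addralign;  /* uncompressed data alignment */
} sketch_chdr64_t;

/* Read an n-byte unsigned integer in the byte order given by the ELF header. */
static uint64_t
read_uint(const unsigned char *p, size_t n, int big_endian)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < n; i++)
        v = (v << 8) | p[big_endian ? i : n - 1 - i];
    return v;
}

/*
 * Decode a 64-bit compression header from the start of a section's data.
 * Returns 0 on success, or -1 if the section lacks SHF_COMPRESSED or is
 * too short to hold the 24-byte header.
 */
static int
chdr64_decode(uint64_t sh_flags, const unsigned char *data, size_t len,
    int big_endian, sketch_chdr64_t *out)
{
    if ((sh_flags & SKETCH_SHF_COMPRESSED) == 0 || len < 24)
        return -1;
    out->ch_type = (uint32_t)read_uint(data, 4, big_endian);
    out->ch_reserved = (uint32_t)read_uint(data + 4, 4, big_endian);
    out->ch_size = read_uint(data + 8, 8, big_endian);
    out->ch_addralign = read_uint(data + 16, 8, big_endian);
    return 0;
}
```

Note that nothing here looks at the section name. The flag alone says the section is compressed, and the header says how.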
Going into this effort, I did not know whether it would be welcome
or not, but in fact, I received considerable help from everyone I
encountered along the way.
For various reasons, it took about 6-9 months to get this proposal finalized
and officially added to the gABI, which was maintained at that time by
the SCO group. The gABI maintainer at the time was John Wolfe at SCO,
and John worked hard with us to finish and polish the proposal into
its final form. Among the reasons for the delay was Hurricane Sandy, which
flooded both his home and workplace, and the bankruptcy of the SCO group
and the rebirth of its Unix operation as UnXis. That was a very full year
for John, but he persevered, and the new compression format became part
of the gABI in the spring of 2013.
The public discussion is visible at:
https://groups.google.com/forum/?fromgroups=#!topic/generic-abi/dBOS1H47Q64
The gABI description can be found at:
http://www.sco.com/developers/gabi/latest/ch4.sheader.html
With a usable format established, we delivered section compression
in Solaris 11.2, supporting both the old GNU format, and the new gABI format:
- libelf provided support for compressing and decompressing sections
on demand. However, these functions were uncommitted, and undocumented,
and for internal use only.
- Enhancements to the ELF tools (elfdump, etc) to display compression
related information.
- Support in ld to decompress input sections, and to produce
compressed output sections. Input sections are automatically
decompressed in a manner that's transparent to the user. Output
sections are compressed when the -z compress-sections option is
specified by the user.
- A new utility, /usr/bin/elfcompress, was introduced, which can
compress and decompress sections in existing objects (including
old ones), in any supported format.
Solaris documentation for these features can be found at:
That was pretty significant, and I intended to blog about
it when Solaris 11.2 shipped, but frankly, I considered the overall project
to be unfinished. Lacking a sense of closure, I repeatedly pushed off writing
about it. The remaining problems were:
- Without standard public functions for doing compression and
decompression in libelf, each compiler and debugger that wants
to play has to provide its own implementation. This is a waste
of time. It also can lead to minor incompatibilities between
implementations.
- Although the GNU community had been very supportive of the
effort to augment the gABI with a standard compression scheme,
it was anyone's guess as to whether or not they would adopt it.
After all, they already had a working "good enough" solution, and
it's often the case that working systems never progress beyond that
stage.
That's where things sat from the summer of 2013, to the spring of 2015.
The dbx debugger folks, who were the original requesters for these
abilities, wrote their own decompression code and
added support for compressed debug sections on both Solaris and Linux.
As far as I know, no one else did so. The link-editor
(ld) and elfcompress utility were the main tools for compressing and
decompressing debug sections. The feature probably wasn't used much, and I
had no idea whether things would end there or not.
Part 2: Public APIs, Portable Code
Two years went by, with little news on the ELF compression front.
I had a goal of publishing APIs from libelf so that compilers and
others could leverage our existing functionality, but without
any standard, any such APIs would have seen little use, and it was
hard to see it as a priority. For a couple of years, it sat in the
middle of my TODO list, and saw no action.
That changed in April of 2015, when H.J. Lu asked a few questions
in the Generic System V Application Binary Interface
(generic-abi@googlegroups.com) mailing list. It turned out that he
was adding support for the gABI compression format to the GNU binutils,
a necessary first step for any real adoption to occur. I was happy
to see that, but the overall situation seemed unchanged. But then, this
arrived:
On 04/20/15 06:34, Mark Wielaard wrote:
I would like to see compressed section support in elfutils
libelf and would like to make sure it is source compatible
with the Solaris libelf if possible.
That was what I had been waiting for! Standard libelf
interfaces, coupled with a standard gABI format, are the only way to
forestall cross platform fragmentation. I immediately replied, and then
rushed to propose a set of interfaces, based on what I had learned
2 years previously in creating the Solaris implementation, which supports
both the old GNU and new gABI formats.
The simplest design, and arguably best, would have been to ignore the
old GNU format entirely and only support the new one. However, I assumed
that a proposal that didn't fully support the original GNU format would
not be well received, so I instead focused on putting together the simplest
proposal possible that could do both. There were essentially 4 basic
operations:
- identify: Determine if a target section is compressed, and if so, in
which format.
- compressible: Determine if a target section can be compressed in a specified
format, taking any name changes required by the
old GNU format into account.
- compress: Compress a target section.
- decompress: Decompress a target section.
The need to handle the old format complicated matters, but it seemed
like the minimal solution to the problem, and after coding it up to
ensure it would work, I proposed it.
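
To illustrate why the old format complicates even the identify operation, here is a rough sketch of what such a check might look like. It is not the interface that was proposed. The function name, the enum, and the SKETCH_ constant are all invented for this example. The gABI case needs only the section flag, while the GNU case has to look at both the section name and the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SHF_COMPRESSED   0x800u  /* gABI value of SHF_COMPRESSED */

typedef enum {
    SKETCH_COMP_NONE,   /* not compressed */
    SKETCH_COMP_GABI,   /* gABI format: flagged by SHF_COMPRESSED */
    SKETCH_COMP_GNU     /* old GNU format: ".zdebug" name + "ZLIB" magic */
} sketch_comp_t;

/* Classify a section from its flags, name, and leading data bytes. */
static sketch_comp_t
identify_compression(uint64_t sh_flags, const char *name,
    const unsigned char *data, size_t len)
{
    /* The gABI format is identified by the section flag alone. */
    if (sh_flags & SKETCH_SHF_COMPRESSED)
        return SKETCH_COMP_GABI;

    /* The GNU format needs both the name prefix and the 12-byte header. */
    if (strncmp(name, ".zdebug", 7) == 0 && len >= 12 &&
        memcmp(data, "ZLIB", 4) == 0)
        return SKETCH_COMP_GNU;

    return SKETCH_COMP_NONE;
}
```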
Mark took that proposal, and we started an ongoing discussion, which
stretched out over the next half year, with some large gaps where we
were both busy with other things. The public part of the discussion
can be seen at:
https://groups.google.com/forum/#!topic/generic-abi/9ZgNqIC2X-4
The response was polite and constructive, but there were questions
and discussion around the details that supported the old format, and in
particular, the section renaming it requires, that we found hard to resolve.
As we discussed the issues and went back and forth, I realized that I
had not properly understood that the GNU folks were willing to throw out much of
the complexity that supported the old format, and focus instead on clean
support for the gABI format. I had been pushing for features to support their
old format that they didn't really want. That was a very
welcome development indeed, and I quickly retrenched and came back with
a greatly simplified proposal that threw out most of the complications
associated with supporting the old format. This proposal was received
with considerably more enthusiasm, and after some small iteration, we
quickly settled on 2 basic operations, compress, and getchdr, and converged
on the final APIs, finishing in December 2015:
- elf_compress(): Compress or decompress ELF sections in the standard ELF format
described by the generic ELF ABI (gABI).
- elf_compress_gnu(): Compress or decompress ELF sections in the deprecated
original GNU format. This function provides access
to the compression engine found within libelf, but the
details of section renaming are left to the caller.
- elf32_getchdr() / elf64_getchdr() / gelf_getchdr(): Convenience functions
used to get access to the compression
header for a compressed ELF section.
This is far simpler and better than what I would have put in
the Solaris libelf, had I decided to move on that TODO item in
my list, rather than wait. Knowing when not to act can sometimes
be a powerful technique, but sadly, one that only works in hindsight.
I'm happy to say it worked out in this case.
These functions, and the manpages describing them, are a standard part
of Solaris 11 Update 4. Our support for the original GNU compression format remains.
However, we now look forward to seeing the use of that format fade away
rapidly, and to seeing code that does ELF section compression build
seamlessly across platforms, and just work.
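
Since elf_compress_gnu() leaves section renaming to the caller, code that still has to handle the old format needs a small helper of its own. The sketch below shows one way that might look. The function gnu_rename_section is hypothetical and is not part of libelf. It only computes the new name, and updating the section header string table is a separate step not shown here.

```c
#include <stdio.h>
#include <string.h>

/*
 * Map a section name between ".debug*" and ".zdebug*".
 *   to_compressed != 0:  ".debug_info"  -> ".zdebug_info"
 *   to_compressed == 0:  ".zdebug_info" -> ".debug_info"
 * Returns 0 on success, or -1 if the name lacks the expected prefix or
 * the result does not fit in out.
 * (Hypothetical caller-side helper, for illustration only.)
 */
static int
gnu_rename_section(const char *name, int to_compressed, char *out,
    size_t outlen)
{
    const char *from = to_compressed ? ".debug" : ".zdebug";
    const char *to = to_compressed ? ".zdebug" : ".debug";
    size_t fromlen = strlen(from);
    int n;

    if (strncmp(name, from, fromlen) != 0)
        return -1;
    n = snprintf(out, outlen, "%s%s", to, name + fromlen);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the gABI format none of this is needed, because section names never change.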
Thank You
I am grateful to the GNU community, and in particular to John Wolfe and
Mark Wielaard. Everything described here was greatly improved by their
involvement.
Published Elsewhere
https://blogs.oracle.com/ali/elf_section_compression/