When a binary has a PT_GNU_RELRO segment, the elfhack injected code
uses mprotect to add the writable flag to relocated pages before
applying relocations, removing it afterwards. To do so, the elfhack
program uses the location and size of the PT_GNU_RELRO segment, and
adjusts it to be aligned according to the PT_LOAD alignment.
The problem here is that the PT_LOAD alignment doesn't necessarily match
the actual page alignment, and the resulting mprotect may end up not
covering the full extent of what the dynamic linker has protected
read-only according to the PT_GNU_RELRO segment. In turn, this can lead
to a crash on startup when trying to apply relocations to the still
read-only locations.
Practically speaking, this doesn't end up being a problem on x86, where
the PT_LOAD alignment is usually 4096, which happens to be the page
size, but on Debian armhf, it is 64k, while the run time page size can be
4k.
Bug 1385783 changed things such that the two elfhack sections are not
adjacent anymore. They can even be in different segments in some cases,
but the undo code doesn't know how to actually handle that case.
So for now, allow non adjacent sections, but still verify that they are
in the same segment.
In bug 635961, elfhack was made to (ab)use the bss section as a
temporary space for a pointer. To find it, it scanned writable PT_LOAD
segments to find one that has a different file and memory size,
indicating the presence of .bss. This usually works fine, but when
the binary is linked with lld and relro is enabled, the end of the
file-backed part of the PT_LOAD segment containing the .bss section
ends up in the RELRO segment, making that location read-only and
subsequently making the elfhacked binary crash when it tries to restore
the .bss to a clean state, because it's not actually writing in the .bss
section: lld page aligns it after the RELRO segment.
So instead of scanning PT_LOAD segments, we scan for SHT_NOBITS
sections that are not SHF_TLS (i.e. not .tbss).
The lld linker creates separate segments for purely executable sections
(such as .text) and sections preceding those (such as .rel.dyn). Neither
gold nor bfd ld do that, and just put all those sections in the same
executable segment.
Since elfhack is putting its executable code between the two relocation
sections, it ends up in a non-executable segment, leading to a crash
when it's time to run that code.
We thus insert the elfhack code before the first executable section
instead of between the two relocation sections (which is where the
elfhack data lies, and stays).
Somehow, with the Android toolchain, we end up with
non-empty-but-really-empty DT_INIT_ARRAYs.
In practical terms, they are arrays with no relocations, and content
that is meaningless:
$ objdump -s -j .init_array libnss3.so
libnss3.so: file format elf32-little
Contents of section .init_array:
1086e0 00000000 ....
$ readelf -r libnss3.so | grep 1086e0
$ objdump -s -j .init_array libplugin-container-pie.so
libplugin-container-pie.so: file format elf32-little
Contents of section .init_array:
4479c ffffffff 00000000 ffffffff 00000000 ................
$ readelf -r libplugin-container-pie.so | grep 4479c
Because so far, elfhack expected meaningful DT_INIT_ARRAYs, it bailed out
early in that case.
In c++0x it is not valid to use a negative number in a unsigned
position in an initializer list. Add explicit casts and change
the size method to return an unsigned int.