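Preface
While browsing recently I came across an expert's blog post on the ELF file format and, on a whim, decided to study it myself.
There are plenty of articles online about the ELF file structure, but most cover only the theory; concrete code implementations are rare (perhaps because open-source code already exists).
Reading open-source code is not my strong suit (it gives me a headache), so I studied the ELF format the same way I once learned the PE file structure:
I wrote a parser that imitates the output of readelf, and finally a simple ELF loader.
The code supports both x86 and x64 ELF files:
- The parser has two implementations, one for x86 and one for x64, so it can parse ELF files for either platform
- The loader depends on the build environment and can only load ELF files for the matching platform, so the x86 and x64 loaders must be compiled separately
- The explanations and demonstrations focus mainly on x86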
Environment & tools:
- VMware pro 17.6.1
- Kali Linux 2023.4 vmware amd64
- gcc (Debian 14.2.0-8) 14.2.0
- CLion 2024.2.3
- 010 Editor 13.0.1
- IDA Pro 7.7
Attachments:
- CompiledTools.zip (21.78 KB)
- Sources.zip (12.45 KB)
- TestFiles.zip (1.15 MB)
My knowledge is limited, so corrections and criticism of any mistakes are welcome.
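ELF File Structure Overview
ELF was developed and published by UNIX System Laboratories (USL) as part of the Application Binary Interface (ABI), and it is the main executable file format on Linux. Its full name is Executable and Linking Format. The name matters because it captures the two functions ELF has to support: execution and linking.
An ELF file consists of three major parts: the ELF header, sections, and segments:
- The section header table points to the sections; like the PE section table, it describes each section
- The program header table describes segments; a segment can contain multiple sections, and the table tells the loader how to map the ELF file into memory
- In object files segments are optional, and in executables sections are optional, but ELF files built with the NDK contain both segments and sections
elf.h wraps a number of basic data types: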
#include <stdint.h>
typedef uint16_t Elf32_Half;
typedef uint16_t Elf64_Half;
typedef uint32_t Elf32_Word;
typedef int32_t Elf32_Sword;
typedef uint32_t Elf64_Word;
typedef int32_t Elf64_Sword;
typedef uint64_t Elf32_Xword;
typedef int64_t Elf32_Sxword;
typedef uint64_t Elf64_Xword;
typedef int64_t Elf64_Sxword;
typedef uint32_t Elf32_Addr;
typedef uint64_t Elf64_Addr;
typedef uint32_t Elf32_Off;
typedef uint64_t Elf64_Off;
typedef uint16_t Elf32_Section;
typedef uint16_t Elf64_Section;
typedef Elf32_Half Elf32_Versym;
typedef Elf64_Half Elf64_Versym;
As you can see, between the 32-bit and 64-bit definitions only Addr and Off differ in width, so we can define corresponding generic types.
| ELF type | Underlying type | Notes |
| --- | --- | --- |
| Elfn_Half | uint16_t | |
| Elfn_Word | uint32_t | |
| Elfn_Sword | int32_t | |
| Elfn_Xword | uint64_t | |
| Elfn_Sxword | int64_t | |
| Elf32_Addr | uint32_t | Address |
| Elf64_Addr | uint64_t | |
| Elf32_Off | uint32_t | File offset |
| Elf64_Off | uint64_t | |
| Elfn_Section | uint16_t | Section index |
| Elfn_Versym | uint16_t | |
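The generic types mentioned above could be declared roughly as follows. This is only a sketch under assumptions: later code in this post uses names such as Elf_Half, Elf_Word and Elf_Xword, but their actual definitions live in Sources.zip and are not shown here, and Elf_Addr and Elf_Off are names made up for illustration. Selecting the width from the pointer size only makes sense for the loader, which is built per platform; the parser still needs the explicit Elf32_/Elf64_ types.

```c
#include <stdint.h>

/* Width-independent types: identical in ELF32 and ELF64. */
typedef uint16_t Elf_Half;
typedef uint32_t Elf_Word;
typedef int32_t  Elf_Sword;
typedef uint64_t Elf_Xword;
typedef int64_t  Elf_Sxword;

/* Only addresses and file offsets change width (hypothetical names). */
#if UINTPTR_MAX == 0xffffffffu
typedef uint32_t Elf_Addr;   /* 32-bit build */
typedef uint32_t Elf_Off;
#else
typedef uint64_t Elf_Addr;   /* 64-bit build */
typedef uint64_t Elf_Off;
#endif
```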
Use gcc to build 32-bit and 64-bit ELF executables for testing
#include <stdio.h>
int main(int argc, char* argv[]){
printf("Hello ELF!\n");
return 0;
}
gcc -m32 -O0 main.c -o HelloELF32
gcc -m64 -O0 main.c -o HelloELF64
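Before writing the ELF parser/loader, define a file-reading helper
It reads the file at the given path and returns a byte pointer together with the number of bytes read
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
uint8_t* readFileToBytes(const char *fileName, size_t* readSize) {
    FILE *file = fopen(fileName, "rb");
    if (file == NULL) {
        printf("Error opening file\n");
        return NULL; // no fclose here: fclose(NULL) is undefined behavior
    }
    // seek to the end to learn the file size, then rewind
    long fileSize = -1;
    if (fseek(file, 0, SEEK_END) == 0) {
        fileSize = ftell(file);
    }
    // an empty file cannot be a valid ELF, so it is rejected as well
    if (fileSize <= 0 || fseek(file, 0, SEEK_SET) != 0) {
        printf("Error getting file size\n");
        fclose(file);
        return NULL;
    }
    uint8_t *buffer = (uint8_t *) malloc((size_t) fileSize);
    if (buffer == NULL) {
        printf("Error allocating memory\n");
        fclose(file);
        return NULL;
    }
    size_t bytesRead = fread(buffer, 1, (size_t) fileSize, file);
    fclose(file);
    if (bytesRead != (size_t) fileSize) {
        printf("Read bytes not equal file size!\n");
        free(buffer);
        return NULL;
    }
    if (readSize)
        *readSize = bytesRead;
    return buffer; // the caller owns the buffer and must free() it
}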
The ELF header is defined in elf.h as follows
#define EI_NIDENT (16)
typedef struct
{
unsigned char e_ident[EI_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32_Ehdr;
typedef struct
{
unsigned char e_ident[EI_NIDENT];
Elf64_Half e_type;
Elf64_Half e_machine;
Elf64_Word e_version;
Elf64_Addr e_entry;
Elf64_Off e_phoff;
Elf64_Off e_shoff;
Elf64_Word e_flags;
Elf64_Half e_ehsize;
Elf64_Half e_phentsize;
Elf64_Half e_phnum;
Elf64_Half e_shentsize;
Elf64_Half e_shnum;
Elf64_Half e_shstrndx;
} Elf64_Ehdr;
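It can be inspected with readelf (readelf -h)
e_ident
16-byte ELF identification. The first 4 bytes are the ELF magic "\x7fELF" and must not be modified
010 Editor parses it as follows
- e_ident[EI_CLASS]
  This byte gives the file class (32-bit or 64-bit)
  Android does not check this byte; it decides between 32-bit and 64-bit from the instruction set (v7a/v8a)
  IDA checks this byte; if it is modified, IDA cannot disassemble the file
- e_ident[EI_DATA]
  This byte gives the data encoding of the object file (little or big endian)
  Android does not check this byte and assumes little endian; IDA checks it, and if it is modified IDA cannot disassemble correctly
- e_ident[EI_VERSION]
  Version of the ELF header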
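As a small illustration of the identification bytes described above, a parser can validate e_ident before trusting anything else in the file. The function name checkElfIdent is made up for this sketch, and rejecting big-endian files is an assumption that matches the x86/x64 scope of this post rather than a requirement of the format.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns ELFCLASS32 or ELFCLASS64 for a plausible little-endian ELF image,
 * or ELFCLASSNONE if the identification bytes are not acceptable. */
static int checkElfIdent(const uint8_t *buf, size_t size) {
    if (buf == NULL || size < EI_NIDENT) return ELFCLASSNONE;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0) return ELFCLASSNONE;   /* "\x7fELF" */
    if (buf[EI_DATA] != ELFDATA2LSB) return ELFCLASSNONE;         /* x86/x64 are little endian */
    if (buf[EI_VERSION] != EV_CURRENT) return ELFCLASSNONE;
    if (buf[EI_CLASS] == ELFCLASS32 || buf[EI_CLASS] == ELFCLASS64)
        return buf[EI_CLASS];
    return ELFCLASSNONE;
}
```

The returned class can then be used to choose between the 32-bit and 64-bit parsing paths.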
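e_type
2 bytes, indicates the type of the object file
Since Android 5.0 all executables are built as shared objects, so on Android this field must be 03 (ET_DYN) and cannot be changed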
#define ET_NONE 0
#define ET_REL 1
#define ET_EXEC 2
#define ET_DYN 3
#define ET_CORE 4
#define ET_NUM 5
#define ET_LOOS 0xfe00
#define ET_HIOS 0xfeff
#define ET_LOPROC 0xff00
#define ET_HIPROC 0xffff
e_machine
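2 bytes, specifies the processor architecture the ELF file targets. Some of the definitions are listed below; for Intel x86 the value is EM_386 (x86-64 files use EM_X86_64, value 62)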
#define EM_NONE 0
#define EM_M32 1
#define EM_SPARC 2
#define EM_386 3
#define EM_68K 4
#define EM_88K 5
#define EM_IAMCU 6
#define EM_860 7
#define EM_MIPS 8
#define EM_S370 9
#define EM_MIPS_RS3_LE 10
#define EM_PARISC 15
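e_version
4 bytes, object file version
Android does not check this field; IDA does, but it has no effect on disassembly
e_entry
4 or 8 bytes, program entry point (OEP). If e_type is ET_EXEC (2) this is an absolute virtual address; for ET_DYN files (shared objects and position-independent executables) it is an offset from the load base, and it may be 0 for a shared library that has no entry point
e_phoff
4 or 8 bytes, file offset (FOA) of the program header table; 0 if there is no program header table
e_shoff
4 or 8 bytes, file offset (FOA) of the section header table; 0 if there is no section header table
The section header table is often stripped in Android anti-analysis work
e_flags
4 bytes of processor-specific flags; unused (0) on x86
e_ehsize
2 bytes, size of the ELF header
Android does not check it and assumes a 52-byte header (the ELF32 size; an ELF64 header is 64 bytes); IDA checks it, but changing it only produces a warning and does not affect disassembly
e_phentsize
2 bytes, size of each program header table entry
e_phnum
2 bytes, number of program header table entries
e_shentsize
2 bytes, size of each section header table entry
e_shnum
2 bytes, number of section header table entries
e_shstrndx
2 bytes, index of the section header table entry that describes the section-name string table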
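Because e_phoff, e_shoff and the entry counts come straight from the file, it is worth checking them against the file size before walking either table. The following is a minimal sketch for ELF32; the function name headerTablesInBounds32 is made up, and as a simplification it rejects files that use the extended section numbering (e_shstrndx == SHN_XINDEX).

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the program header table and section header table described
 * by the ELF32 header lie completely inside the file, 0 otherwise. */
static int headerTablesInBounds32(const Elf32_Ehdr *eh, size_t fileSize) {
    if (fileSize < sizeof(Elf32_Ehdr)) return 0;
    if (eh->e_phnum != 0) {
        if (eh->e_phentsize != sizeof(Elf32_Phdr)) return 0;
        /* 64-bit arithmetic so that offset + size cannot wrap around */
        uint64_t end = (uint64_t) eh->e_phoff + (uint64_t) eh->e_phnum * eh->e_phentsize;
        if (end > fileSize) return 0;
    }
    if (eh->e_shnum != 0) {
        if (eh->e_shentsize != sizeof(Elf32_Shdr)) return 0;
        uint64_t end = (uint64_t) eh->e_shoff + (uint64_t) eh->e_shnum * eh->e_shentsize;
        if (end > fileSize) return 0;
        if (eh->e_shstrndx >= eh->e_shnum) return 0;   /* name table index must exist */
    }
    return 1;
}
```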
Printing the file header
Define string arrays that match the enumeration values so the fields can be printed by name
char ELF_Class[3][6] = {"NONE", "ELF32", "ELF64"};
char ELF_Data[3][14] = {"NONE", "Little Endian", "Big Endian"};
// only the five standard types (below ET_NUM) can be used as a table index;
// ET_LOOS..ET_HIPROC are value ranges, not consecutive indices
char objectFileType[5][5] = {"NONE", "REL", "EXEC", "DYN", "CORE"};
void printELFHeader32(const Elf32_Ehdr* pElfHeader) {
    // clamp every value used as a table index so a malformed file cannot read out of bounds
    unsigned char elfClass = pElfHeader->e_ident[EI_CLASS] < 3 ? pElfHeader->e_ident[EI_CLASS] : 0;
    unsigned char elfData = pElfHeader->e_ident[EI_DATA] < 3 ? pElfHeader->e_ident[EI_DATA] : 0;
    const char* fileType = pElfHeader->e_type < ET_NUM ? objectFileType[pElfHeader->e_type] : "OS/PROC";
    printf("ELF Header:\n");
    printf("\tMagic:\t");
    for (int i = 0; i < EI_NIDENT; i++) {
        // index the identification bytes of this one header, not an array of headers
        printf("%02x ", pElfHeader->e_ident[i]);
    }
    printf("\n");
    printf("\t%-36s%s\n", "Class:", ELF_Class[elfClass]);
    printf("\t%-36s%s\n", "Data:", ELF_Data[elfData]);
    printf("\t%-36s%#x\n", "Version:", pElfHeader->e_version);
    printf("\t%-36s%#x\n", "Machine:", pElfHeader->e_machine);
    printf("\t%-36s%s\n", "Type:", fileType);
    printf("\t%-36s%#x\n", "Size Of ELF Header:", pElfHeader->e_ehsize);
    printf("\t%-36s%#x\n", "Entry point:", pElfHeader->e_entry);
    printf("\t%-36s%#x\n", "Start Of Program Headers:", pElfHeader->e_phoff);
    printf("\t%-36s%#x\n", "Start Of Section Headers:", pElfHeader->e_shoff);
    printf("\t%-36s%#x\n", "Size Of Program Headers:", pElfHeader->e_phentsize);
    printf("\t%-36s%#x\n", "Number Of Program Headers:", pElfHeader->e_phnum);
    printf("\t%-36s%#x\n", "Size Of Section Headers:", pElfHeader->e_shentsize);
    printf("\t%-36s%#x\n", "Number Of Sections:", pElfHeader->e_shnum);
    printf("\t%-36s%d\n", "Section Header String Table Index:", pElfHeader->e_shstrndx);
    printf("ELF Header End\n");
}
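The output looks like this
The section header table is similar to the PE section table (IMAGE_SECTION_HEADER)
It stores the basic attributes of every section and is the most important structure in an ELF file after the file header; the compiler, linker and loader all rely on it to locate sections and read their attributes
Entry 0 of the section header table (index SHN_UNDEF) is reserved and is always a null entry. The entry structure is defined as follows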
typedef struct
{
Elf32_Word sh_name;
Elf32_Word sh_type;
Elf32_Word sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
Elf32_Word sh_size;
Elf32_Word sh_link;
Elf32_Word sh_info;
Elf32_Word sh_addralign;
Elf32_Word sh_entsize;
} Elf32_Shdr;
typedef struct
{
Elf64_Word sh_name;
Elf64_Word sh_type;
Elf64_Xword sh_flags;
Elf64_Addr sh_addr;
Elf64_Off sh_offset;
Elf64_Xword sh_size;
Elf64_Word sh_link;
Elf64_Word sh_info;
Elf64_Xword sh_addralign;
Elf64_Xword sh_entsize;
} Elf64_Shdr;
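Viewing the section header table with readelf
sh_name
4 bytes, an offset. e_shstrndx in the ELF file header gives the index of the section-name string table's entry in the section header table
Look up that entry and take its sh_offset; sh_name + sh_offset is then the file offset (FOA) of this section's name string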
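The lookup just described can be sketched as below. Later code in this post calls a helper named getSectionName whose source is not shown here, so this is not that function: sectionName32 is a hypothetical stand-in that works on the whole file buffer and adds bounds checks.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the name of section `index` in an ELF32 file image,
 * or NULL if any offset or index falls outside the file. */
static const char *sectionName32(const uint8_t *file, size_t fileSize, Elf32_Half index) {
    const Elf32_Ehdr *eh = (const Elf32_Ehdr *) file;
    if (fileSize < sizeof *eh || index >= eh->e_shnum || eh->e_shstrndx >= eh->e_shnum)
        return NULL;
    if ((uint64_t) eh->e_shoff + (uint64_t) eh->e_shnum * sizeof(Elf32_Shdr) > fileSize)
        return NULL;
    const Elf32_Shdr *sh = (const Elf32_Shdr *) (file + eh->e_shoff);
    const Elf32_Shdr *strtab = &sh[eh->e_shstrndx];          /* the .shstrtab header */
    if ((uint64_t) strtab->sh_offset + strtab->sh_size > fileSize) return NULL;
    if (sh[index].sh_name >= strtab->sh_size) return NULL;
    /* FOA of the name = sh_offset of .shstrtab + sh_name of the section */
    const char *name = (const char *) file + strtab->sh_offset + sh[index].sh_name;
    size_t room = strtab->sh_size - sh[index].sh_name;
    /* the string must be NUL-terminated inside the table */
    return memchr(name, '\0', room) != NULL ? name : NULL;
}
```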
sh_type
4 bytes, the section type, defined as follows
#define SHT_NULL 0
#define SHT_PROGBITS 1
#define SHT_SYMTAB 2
#define SHT_STRTAB 3
#define SHT_RELA 4
#define SHT_HASH 5
#define SHT_DYNAMIC 6
#define SHT_NOTE 7
#define SHT_NOBITS 8
#define SHT_REL 9
#define SHT_SHLIB 10
#define SHT_DYNSYM 11
#define SHT_INIT_ARRAY 14
#define SHT_FINI_ARRAY 15
#define SHT_PREINIT_ARRAY 16
#define SHT_GROUP 17
#define SHT_SYMTAB_SHNDX 18
#define SHT_RELR 19
#define SHT_NUM 20
#define SHT_LOOS 0x60000000
#define SHT_GNU_ATTRIBUTES 0x6ffffff5
#define SHT_GNU_HASH 0x6ffffff6
#define SHT_GNU_LIBLIST 0x6ffffff7
#define SHT_CHECKSUM 0x6ffffff8
#define SHT_LOSUNW 0x6ffffffa
#define SHT_SUNW_move 0x6ffffffa
#define SHT_SUNW_COMDAT 0x6ffffffb
#define SHT_SUNW_syminfo 0x6ffffffc
#define SHT_GNU_verdef 0x6ffffffd
#define SHT_GNU_verneed 0x6ffffffe
#define SHT_GNU_versym 0x6fffffff
#define SHT_HISUNW 0x6fffffff
#define SHT_HIOS 0x6fffffff
#define SHT_LOPROC 0x70000000
#define SHT_HIPROC 0x7fffffff
#define SHT_LOUSER 0x80000000
#define SHT_HIUSER 0x8fffffff
The more common section types are
SHT_NULL
SHT_STRTAB
SHT_RELA
SHT_HASH
SHT_DYNAMIC
SHT_NOBITS
SHT_REL
SHT_DYNSYM
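sh_flags
4 bytes, a set of flag bits (the sizes given in this list are for Elf32_Shdr; in Elf64_Shdr sh_flags, sh_addr, sh_offset, sh_size, sh_addralign and sh_entsize are 8 bytes)
- SHF_WRITE: the section is writable in the process
- SHF_ALLOC: the section occupies memory at run time
  Not every section occupies real memory; some control sections are not loaded when the file is mapped into memory
- SHF_EXECINSTR: the section contains instruction code
- SHF_MASKPROC: the bits covered by this mask are reserved for processor-specific extensions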
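Since sh_flags is a bit set, testing each bit separately handles every combination, including values that carry extra bits such as SHF_MERGE. The sketch below shows that idea; the name sectionFlagsToString and the caller-supplied buffer are choices made for this example only.

```c
#include <elf.h>
#include <stdint.h>

/* Writes one letter per recognised flag into out (at least 4 bytes)
 * and returns out. Unrecognised bits are simply ignored. */
static const char *sectionFlagsToString(uint64_t flags, char out[4]) {
    int n = 0;
    if (flags & SHF_WRITE)     out[n++] = 'W';
    if (flags & SHF_ALLOC)     out[n++] = 'A';
    if (flags & SHF_EXECINSTR) out[n++] = 'X';
    out[n] = '\0';
    return out;
}
```

Taking the flags as a 64-bit value lets the same function serve both Elf32_Shdr and Elf64_Shdr.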
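sh_addr
4 bytes, virtual address of the section in memory
sh_offset
4 bytes, file offset (FOA) of the section
sh_size
4 bytes, size of the section
sh_link
4 bytes, an index value
sh_info
4 bytes, extra information about the section
The meaning of sh_info and sh_link depends on the section type
sh_addralign
4 bytes, the section's address alignment. The value is the alignment itself and must be 0 or a power of two: 0 or 1 means no alignment requirement, 8 means 8-byte alignment
sh_addr must be divisible by sh_addralign, i.e. sh_addr % sh_addralign == 0
sh_entsize
4 bytes. Some sections hold a table of fixed-size entries (for example the symbol table); this field gives the size of each entry
0 means the section is not such a table
Printing the section headers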
const char *getSectionTypeString(Elf_Word sectionType) {
switch (sectionType) {
case SHT_NULL: return "NULL";
case SHT_PROGBITS: return "PROGBITS";
case SHT_SYMTAB: return "SYMTAB";
case SHT_STRTAB: return "STRTAB";
case SHT_RELA: return "RELA";
case SHT_HASH: return "HASH";
case SHT_DYNAMIC: return "DYNAMIC";
case SHT_NOTE: return "NOTE";
case SHT_NOBITS: return "NOBITS";
case SHT_REL: return "REL";
case SHT_SHLIB: return "SHLIB";
case SHT_DYNSYM: return "DYNSYM";
case SHT_INIT_ARRAY: return "INIT_ARRAY";
case SHT_FINI_ARRAY: return "FINI_ARRAY";
case SHT_PREINIT_ARRAY: return "PREINIT_ARRAY";
case SHT_GROUP: return "GROUP";
case SHT_SYMTAB_SHNDX: return "SYMTAB_SHNDX";
case SHT_RELR: return "RELR";
case SHT_NUM: return "NUM";
case SHT_LOOS: return "LOOS";
case SHT_GNU_ATTRIBUTES: return "GNU_ATTRIBUTES";
case SHT_GNU_HASH: return "GNU_HASH";
case SHT_GNU_LIBLIST: return "GNU_LIBLIST";
case SHT_CHECKSUM: return "CHECKSUM";
case SHT_LOSUNW: return "LOSUNW";
case SHT_SUNW_COMDAT: return "SUNW_COMDAT";
case SHT_SUNW_syminfo: return "SUNW_syminfo";
case SHT_GNU_verdef: return "GNU_verdef";
case SHT_GNU_verneed: return "GNU_verneed";
case SHT_GNU_versym: return "GNU_versym";
case SHT_LOPROC: return "LOPROC";
case SHT_HIPROC: return "HIPROC";
case SHT_LOUSER: return "LOUSER";
case SHT_HIUSER: return "HIUSER";
default: return "UNKNOWN";
}
}
const char* getSectionFlagStr(Elf_Word flags) {
switch (flags) {
case SHF_ALLOC: return " A";
case SHF_WRITE: return " W";
case SHF_WRITE | SHF_ALLOC: return " WA";
case SHF_EXECINSTR: return " X";
case SHF_ALLOC | SHF_EXECINSTR: return " AX";
case SHF_MASKPROC: return "MKP";
default: return " ";
}
}
void printElfSectionHeader32(const Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,const char* pStringTable) {
printf("ELF Section Headers:\n");
printf("\t[Nr] Name\t\t\tType\t\t\tAddr\t\tOffset\t\tSize\t\tEntSize\tFlag\tLink\tInfo\tAlign\n");
for (int i = 0; i < sectionNum; i++) {
printf("\t[%2d] %-20s", i, (char *) &pStringTable[pSectionHeader[i].sh_name]);
printf("\t%-16s", getSectionTypeString(pSectionHeader[i].sh_type));
printf("\t%08x", pSectionHeader[i].sh_addr);
printf("\t%08x", pSectionHeader[i].sh_offset);
printf("\t%08x", pSectionHeader[i].sh_size);
printf("\t%x", pSectionHeader[i].sh_entsize);
printf("\t%s", getSectionFlagStr(pSectionHeader[i].sh_flags));
printf("\t%x", pSectionHeader[i].sh_link);
printf("\t%x", pSectionHeader[i].sh_info);
printf("\t%x\n", pSectionHeader[i].sh_addralign);
}
printf("ELF Section Headers End\n");
}
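The output looks like this
The program header table describes how the ELF file is mapped into memory, expressed in segments
It is defined as follows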
typedef struct
{
Elf32_Word p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
Elf32_Word p_filesz;
Elf32_Word p_memsz;
Elf32_Word p_flags;
Elf32_Word p_align;
} Elf32_Phdr;
typedef struct
{
Elf64_Word p_type;
Elf64_Word p_flags;
Elf64_Off p_offset;
Elf64_Addr p_vaddr;
Elf64_Addr p_paddr;
Elf64_Xword p_filesz;
Elf64_Xword p_memsz;
Elf64_Xword p_align;
} Elf64_Phdr;
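p_type
The type of segment this program header describes (or how to interpret the information in this program header)
The segment types are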
#define PT_NULL 0
#define PT_LOAD 1
#define PT_DYNAMIC 2
#define PT_INTERP 3
#define PT_NOTE 4
#define PT_SHLIB 5
#define PT_PHDR 6
#define PT_TLS 7
#define PT_NUM 8
#define PT_LOOS 0x60000000
#define PT_GNU_EH_FRAME 0x6474e550
#define PT_GNU_STACK 0x6474e551
#define PT_GNU_RELRO 0x6474e552
#define PT_GNU_PROPERTY 0x6474e553
#define PT_GNU_SFRAME 0x6474e554
#define PT_LOSUNW 0x6ffffffa
#define PT_SUNWBSS 0x6ffffffa
#define PT_SUNWSTACK 0x6ffffffb
#define PT_HISUNW 0x6fffffff
#define PT_HIOS 0x6fffffff
#define PT_LOPROC 0x70000000
#define PT_HIPROC 0x7fffffff
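p_offset
File offset of the segment
p_vaddr
Virtual address of the segment in memory
p_paddr
Physical address of the segment. Most modern operating systems make physical addresses unpredictable, so this field is reserved in most cases
p_filesz
Size of the segment in the file
p_memsz
Size of the segment in memory
p_flags
Segment attributes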
#define PF_X (1 << 0)
#define PF_W (1 << 1)
#define PF_R (1 << 2)
#define PF_MASKOS 0x0ff00000
#define PF_MASKPROC 0xf0000000
p_align
Alignment of the segment in memory
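For a loader, these fields are typically combined to work out how much address space the image needs: the lowest p_vaddr and the highest p_vaddr + p_memsz over all PT_LOAD segments, rounded to page boundaries. The following is a rough sketch for ELF32; loadSpan32 is a made-up name, the page size is passed in by the caller, and address wrap-around in a hostile file is not handled.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the page-aligned range [*lo, *hi) covered by all PT_LOAD segments.
 * pageSize must be a power of two. Returns 0 if there is no PT_LOAD segment. */
static int loadSpan32(const Elf32_Phdr *ph, size_t count, uint32_t pageSize,
                      uint32_t *lo, uint32_t *hi) {
    uint32_t minAddr = UINT32_MAX, maxAddr = 0;
    for (size_t i = 0; i < count; i++) {
        if (ph[i].p_type != PT_LOAD) continue;
        uint32_t start = ph[i].p_vaddr;
        /* use p_memsz, not p_filesz: the tail beyond p_filesz is zero-filled (.bss) */
        uint32_t end = ph[i].p_vaddr + ph[i].p_memsz;
        if (start < minAddr) minAddr = start;
        if (end > maxAddr) maxAddr = end;
    }
    if (minAddr > maxAddr) return 0;
    *lo = minAddr & ~(pageSize - 1);                      /* round down */
    *hi = (maxAddr + pageSize - 1) & ~(pageSize - 1);     /* round up */
    return 1;
}
```

The difference *hi - *lo is the amount of memory to reserve before copying each segment to its p_vaddr relative to the chosen base.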
Printing the program headers
const char *getSegmentTypeStr(Elf32_Word segmentType) {
switch (segmentType) {
case PT_NULL:return "NULL";
case PT_LOAD: return "LOAD";
case PT_DYNAMIC: return "DYNAMIC";
case PT_INTERP:return "INTERP";
case PT_NOTE: return "NOTE";
case PT_SHLIB:return "SHLIB";
case PT_PHDR: return "PHDR";
case PT_TLS:return "TLS";
case PT_NUM: return "PT_NUM";
case PT_LOOS:return "LOOS";
case PT_GNU_EH_FRAME: return "GNU_EH_FRAME";
case PT_GNU_STACK:return "GNU_STACK";
case PT_GNU_RELRO: return "GNU_RELRO";
case PT_GNU_PROPERTY: return "GNU_PROPERTY";
case PT_GNU_SFRAME: return "GNU_SFRAME";
case PT_SUNWBSS: return "SUNWBSS";
case PT_SUNWSTACK: return "SUNWSTACK";
case PT_HIOS: return "HIOS";
case PT_LOPROC: return "LOPROC";
case PT_HIPROC: return "HIPROC";
default: return "UNKNOWN";
}
}
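const char* getSegmentFlagStr(Elf_Word segmentFlags) {
    // the static buffer is rebuilt and re-terminated on every call;
    // otherwise letters left over from a previous call would still be shown
    static char segmentFlagStr[4];
    int count = 0;
    if (segmentFlags & PF_R) {
        segmentFlagStr[count++] = 'R';
    }
    if (segmentFlags & PF_W) {
        segmentFlagStr[count++] = 'W';
    }
    if (segmentFlags & PF_X) {
        segmentFlagStr[count++] = 'X';
    }
    segmentFlagStr[count] = '\0';
    return segmentFlagStr;
}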
void printElfProgramHeader32(const Elf32_Phdr *pProgramHeader,Elf_Half segmentNum,const uint8_t* pFileBuffer) {
printf("ELF ProgramHeader:\n");
printf("\t[Nr] Type\t\tFileOff\t\tVirAddr\t\tPhyAddr\t\tFileSize\tMemSize\t\tFlag\tAlign\n");
for (int i = 0; i < segmentNum; i++) {
printf("\t[%02d] %-16s", i, getSegmentTypeStr(pProgramHeader[i].p_type));
printf("\t%08x", pProgramHeader[i].p_offset);
printf("\t%08x", pProgramHeader[i].p_vaddr);
printf("\t%08x", pProgramHeader[i].p_paddr);
printf("\t%08x", pProgramHeader[i].p_filesz);
printf("\t%08x", pProgramHeader[i].p_memsz);
printf("\t%-4s", getSegmentFlagStr(pProgramHeader[i].p_flags));
printf("\t%#x\n", pProgramHeader[i].p_align);
if (pProgramHeader[i].p_type == PT_INTERP) {
printf("\t\t [Request Program Interpreter Path: %s]\n",(char *) (pFileBuffer + pProgramHeader[i].p_offset));
}
}
printf("ELF ProgramHeader End\n");
}
void printSectionToSegmentMapping32(const Elf32_Phdr* pProgramHeader,const Elf32_Shdr* pSectionHeader,Elf_Half segmentNum,Elf_Half sectionNum,const char* pSectionHeaderStringTable) {
printf("Section to Segment Mapping:\n");
printf("\tSegment\tSections\n");
for (int i = 0; i < segmentNum; i++) {
Elf32_Addr segmentStartAddr = pProgramHeader[i].p_vaddr;
Elf32_Addr segmentEndAddr = segmentStartAddr + pProgramHeader[i].p_memsz;
printf("\t%02d\t\t", i);
for (int j = 0; j < sectionNum; j++) {
Elf32_Addr sectionStartAddr = pSectionHeader[j].sh_addr;
if (sectionStartAddr >= segmentStartAddr && sectionStartAddr < segmentEndAddr) {
if (pSectionHeader[j].sh_flags & SHF_ALLOC) {
printf("%s ",(char *) pSectionHeaderStringTable + pSectionHeader[j].sh_name);
}
}
}
printf("\n");
}
}
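The output looks like this
Special sections
Some sections in an ELF file are predefined; they hold instruction code or control information
These sections are reserved for the operating system, and their types and attributes differ between operating systems
| Section | Purpose |
| --- | --- |
| .text | Code section |
| .data | Initialized global variables and local static variables |
| .bss | Uninitialized global variables and local static variables |
| .rodata | Read-only data, e.g. string constants |
| .comment | Compiler version information |
| .debug | Debug information |
| .dynamic | Dynamic linking information; the linker parses this section to load the ELF file |
| .hash | Symbol hash table (can look up imported and exported symbols) |
| .gnu.hash | GNU hash table (can look up exported symbols only; export table) |
| .line | Debug line-number table, mapping source line numbers to compiled instructions |
| .note | Extra compiler information, e.g. company name, version number |
| .rel.dyn | Dynamic-link relocation table, holds relocation entries for global variables |
| .rel.plt | Dynamic-link function-jump relocation table, holds PLT relocation entries |
| .symtab | Symbol table |
| .dynsym | Dynamic-link symbol table |
| .strtab | String table |
| .shstrtab | Section-name string table |
| .dynstr | Dynamic-link string table |
| .plt | Dynamic-link jump table |
| .got | Dynamic-link global offset table |
| .init | Program initialization code section |
| .fini | Program termination code section |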
String Table
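An ELF file contains many strings, such as section names and variable names. Because string lengths vary, describing them with a fixed-size structure is awkward
The usual approach is to collect the strings into a string table and refer to each one by its index (a byte offset into the table)
Common string tables:
- .strtab (string table, holds ordinary strings)
  Walk the section headers; every entry with sh_type == SHT_STRTAB is a string table (this includes the section header string table)
- .shstrtab (section header string table, holds the strings used by the section header table)
  It is found by using the ELF header's e_shstrndx as an index into the section header table,
  i.e. p_shstrtab = ELFSectionHeaderTable[ELFHeader.e_shstrndx]
The printing code is as follows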
void printStringTable32(const Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,const char* pSectionHeaderStringTable,const uint8_t* pFileBuffer) {
    printf("ELF String Table:\n");
    for (int i = 0; i < sectionNum; i++) {
        if (pSectionHeader[i].sh_type == SHT_STRTAB) {
            printf("\t==========String Table %s==========\n",getSectionName(pSectionHeaderStringTable,pSectionHeader[i].sh_name));
            const char *pStringTable = (const char *) (pFileBuffer + pSectionHeader[i].sh_offset);
            Elf32_Word stringTableSize = pSectionHeader[i].sh_size, pos = 0;
            while (pos < stringTableSize) {
                // find the end of the current string without leaving the table
                Elf32_Word start = pos;
                while (pos < stringTableSize && pStringTable[pos] != 0) {
                    pos++;
                }
                // skip empty strings (the table starts with a NUL byte)
                if (pos > start) {
                    printf("\t%.*s\n", (int) (pos - start), pStringTable + start);
                }
                pos++; // step over the terminating NUL
            }
        }
    }
    printf("ELF String Table End\n");
}
Symbol Table
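The symbol table describes imported and exported symbols; a symbol can be a global variable, a function, an external reference, and so on
With a symbol table and its associated string table you can obtain each symbol's name, size, address and other information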
.dynsym
.symtab
.dynstr
.strtab
Symbol table entry structure
typedef struct
{
Elf32_Word st_name;
Elf32_Addr st_value;
Elf32_Word st_size;
unsigned char st_info;
unsigned char st_other;
Elf32_Section st_shndx;
} Elf32_Sym;
typedef struct
{
Elf64_Word st_name;
unsigned char st_info;
unsigned char st_other;
Elf64_Section st_shndx;
Elf64_Addr st_value;
Elf64_Xword st_size;
} Elf64_Sym;
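st_name
Symbol name, given as an index (byte offset) into a string table; the sh_link field of the symbol table's section header says which string table
st_value
The value associated with the symbol. Depending on the symbol it may be an absolute value or an address
st_size
Symbol size. For a symbol that holds data, this is the size of the data type
For example, a symbol of type double occupies 8 bytes; 0 means the symbol has no size or the size is unknown
st_info
Symbol type and binding: the high 4 bits hold the symbol binding and the low 4 bits hold the symbol type; together they form the symbol information
Three macros extract or compose these values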
#define ELF32_ST_BIND(val) (((unsigned char) (val)) >> 4)
#define ELF32_ST_TYPE(val) ((val) & 0xf)
#define ELF32_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
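To make the packing concrete, the sketch below uses these macros to decide whether a symbol is a defined global or weak function, which is the kind of entry a loader would accept when looking up an exported function. The helper name isExportedFunction32 is made up for this example.

```c
#include <elf.h>

/* Returns 1 if the symbol is a defined (not SHN_UNDEF) function with
 * global or weak binding, 0 otherwise. */
static int isExportedFunction32(const Elf32_Sym *sym) {
    unsigned char bind = ELF32_ST_BIND(sym->st_info);   /* high 4 bits */
    unsigned char type = ELF32_ST_TYPE(sym->st_info);   /* low 4 bits */
    return sym->st_shndx != SHN_UNDEF
        && (bind == STB_GLOBAL || bind == STB_WEAK)
        && type == STT_FUNC;
}
```

For instance, ELF32_ST_INFO(STB_GLOBAL, STT_FUNC) packs to 0x12: binding 1 in the high nibble and type 2 in the low nibble.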
Symbol Binding
The valid symbol binding values are
#define STB_LOCAL 0
#define STB_GLOBAL 1
#define STB_WEAK 2
#define STB_NUM 3
#define STB_LOOS 10
#define STB_GNU_UNIQUE 10
#define STB_HIOS 12
#define STB_LOPROC 13
#define STB_HIPROC 15
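The important ones are explained below:
- STB_LOCAL
  A local symbol: it is visible only in the file that defines it and has no meaning in other files
  Different files can therefore define the same symbol name without affecting one another
- STB_GLOBAL
  A global symbol: when several files are linked together, it is visible in all of them
  A global symbol defined in one file is normally meant to be referenced from other files; otherwise there is no need to make it global
- STB_WEAK
  A weak symbol: similar to a global symbol, but with lower precedence than a global one
- STB_LOPROC~STB_HIPROC
  Reserved for processor-specific use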
Symbol Type
#define STT_NOTYPE 0
#define STT_OBJECT 1
#define STT_FUNC 2
#define STT_SECTION 3
#define STT_FILE 4
#define STT_COMMON 5
#define STT_TLS 6
#define STT_NUM 7
#define STT_LOOS 10
#define STT_GNU_IFUNC 10
#define STT_HIOS 12
#define STT_LOPROC 13
#define STT_HIPROC 15
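The important types are explained below
- STT_NOTYPE
  The symbol type is not specified
- STT_OBJECT
  The symbol is a data object, such as a variable or an array
- STT_FUNC
  The symbol is a function or other executable code
- STT_SECTION
  The symbol is associated with a section and is used for relocation; it usually has STB_LOCAL binding
- STT_FILE
  The symbol is a file symbol and has STB_LOCAL binding
- STT_LOPROC~STT_HIPROC
  Reserved for processor-specific use
st_other
The low 2 bits hold the symbol visibility
st_shndx
Index of the section the symbol is defined in (or a special value such as SHN_UNDEF or SHN_ABS)
Printing the symbol table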
const char *getSymbolBindingString(uint8_t symbolBinding) {
switch (symbolBinding) {
case STB_LOCAL: return "LOCAL";
case STB_GLOBAL: return "GLOBAL";
case STB_WEAK: return "WEAK";
case STB_NUM: return "STB_NUM";
case STB_GNU_UNIQUE: return "GNU_UNIQUE";
case STB_HIOS: return "STB_HIOS";
case STB_LOPROC: return "STB_LOPROC";
case STB_HIPROC: return "STB_HIPROC";
default: return "UNKNOWN";
}
}
const char *getSymbolTypeString(uint8_t symbolType) {
switch (symbolType) {
case STT_NOTYPE: return "NOTYPE";
case STT_OBJECT: return "OBJECT";
case STT_FUNC: return "FUNC";
case STT_SECTION: return "SECTION";
case STT_FILE: return "FILE";
case STT_COMMON: return "COMMON";
case STT_TLS: return "TLS";
case STT_NUM: return "STT_NUM";
case STT_GNU_IFUNC: return "GNU_IFUNC";
case STT_HIOS: return "HIOS";
case STT_LOPROC: return "LOPROC";
case STT_HIPROC: return "HIPROC";
default: return "UNKNOWN";
}
}
const char *getSymbolVisibility(uint8_t st_other) {
unsigned char visibility = st_other & 0x03;
switch (visibility) {
case 0: return "DEFAULT";
case 1: return "INTERNAL";
case 2: return "HIDDEN";
case 3: return "PROTECTED";
default: return "UNKNOWN";
}
}
void printSymbolTable32(const Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,const char* pSectionHeaderStringTable,const uint8_t* pFileBuffer) {
printf("ELF Symbol Tables:\n");
for (int i = 0; i < sectionNum; i++) {
if (pSectionHeader[i].sh_type == SHT_SYMTAB || pSectionHeader[i].sh_type == SHT_DYNSYM) {
Elf32_Word symbolNum = pSectionHeader[i].sh_size / pSectionHeader[i].sh_entsize;
char* pSymbolNameTable =(char*) pFileBuffer + pSectionHeader[pSectionHeader[i].sh_link].sh_offset;
printf("\tSymbol Table '%s' contains %#x entries:\n",(char*)getSectionName(pSectionHeaderStringTable,pSectionHeader[i].sh_name), symbolNum);
printf("\tNum \tValue\t\tSize\t\tType\t\tBind\t\tVisible\t\tIndex\t\tName\n");
Elf32_Sym *pSymbolTable = (Elf32_Sym *) (pFileBuffer + pSectionHeader[i].sh_offset);
for (int j = 0; j < symbolNum; j++) {
printf("\t%04d", j);
printf("\t%08x", pSymbolTable[j].st_value);
printf("\t%08x", pSymbolTable[j].st_size);
printf("\t%s\t", getSymbolTypeString(ELF32_ST_TYPE(pSymbolTable[j].st_info)));
printf("\t%s\t", getSymbolBindingString(ELF32_ST_BIND(pSymbolTable[j].st_info)));
printf("\t%-10s", getSymbolVisibility(pSymbolTable[j].st_other));
if (pSymbolTable[j].st_shndx == SHN_UNDEF) {
printf("\t%4s\t", "UDEF");
} else if (pSymbolTable[j].st_shndx == SHN_ABS) {
printf("\t%4s\t", "ABS");
} else {
printf("\t%04x\t", pSymbolTable[j].st_shndx);
}
printf("\t%s\n", pSymbolNameTable + pSymbolTable[j].st_name);
}
printf("\n");
}
}
}
Relocation Table
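There are usually two relocation tables:
- .rel.plt: fixes up external function addresses
- .rel.dyn: fixes up global variable addresses
Relocation tables come in three types, SHT_REL, SHT_RELA and SHT_RELR; the corresponding entry definitions are shown below
Note: the Intel x86 architecture uses only REL entries and x64 appears to use only RELA entries, as will be seen later when the relocations are applied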
typedef struct
{
Elf32_Addr r_offset;
Elf32_Word r_info;
} Elf32_Rel;
typedef struct
{
Elf64_Addr r_offset;
Elf64_Xword r_info;
} Elf64_Rel;
typedef struct
{
Elf32_Addr r_offset;
Elf32_Word r_info;
Elf32_Sword r_addend;
} Elf32_Rela;
typedef struct
{
Elf64_Addr r_offset;
Elf64_Xword r_info;
Elf64_Sxword r_addend;
} Elf64_Rela;
typedef Elf32_Word Elf32_Relr;
typedef Elf64_Xword Elf64_Relr;
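r_offset
The location to relocate
For a relocatable file, this is the offset of the unit to be relocated within its section
For an executable or shared library, this is the virtual address of the unit to be relocated
r_info
Gives both the symbol table index and the relocation type for the unit to be relocated
Macros for extracting the information:
R_SYM takes the high 24 bits (ELF32) or high 32 bits (ELF64), which are the symbol table index identifying the symbol
R_TYPE takes the low 8 bits (ELF32) or low 32 bits (ELF64), which are the relocation type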
#define ELF32_R_SYM(val) ((val) >> 8)
#define ELF32_R_TYPE(val) ((val) & 0xff)
#define ELF32_R_INFO(sym, type) (((sym) << 8) + ((type) & 0xff))
#define ELF64_R_SYM(i) ((i) >> 32)
#define ELF64_R_TYPE(i) ((i) & 0xffffffff)
#define ELF64_R_INFO(sym,type) ((((Elf64_Xword) (sym)) << 32) + (type))
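r_addend
The addend used to compute the value stored in the relocated field
Rela entries carry the addend explicitly in this field; for Rel entries the addend is implicit, stored at the location being modified
A relocation section refers to two other sections: a symbol table and the section to be fixed up
In the relocation section's header, sh_link gives the section index of the symbol table and sh_info gives the section index of the section the relocations apply to
The meaning of r_offset differs slightly between kinds of object file
- Relocatable file
  r_offset is the offset of the relocation unit within the section being modified
- Executable / shared object file
  r_offset is the virtual address of the unit being modified
Relocation types
A relocation entry describes how to modify an instruction or data field (the relocated field)
The following notation makes the descriptions easier: A is the addend, B is the base address at which the object is loaded, P is the address of the location being relocated (derived from r_offset), and S is the value of the symbol the relocation refers to
Common relocation types
R_386_GLOB_DAT
Sets a GOT entry to the address of the specified symbol
Fix-up: after the ELF is loaded, write the symbol's real address
R_386_JMP_SLOT
Used for PLT entries in dynamic linking
Fix-up: after the ELF is loaded, set the jump target to the symbol's address
R_386_RELATIVE
Relocation relative to the load base
Fix-up: read the value at the location given by r_offset and add the ELF load base to it
The full list of Intel x86 relocation types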
#define R_386_NONE 0
#define R_386_32 1
#define R_386_PC32 2
#define R_386_GOT32 3
#define R_386_PLT32 4
#define R_386_COPY 5
#define R_386_GLOB_DAT 6
#define R_386_JMP_SLOT 7
#define R_386_RELATIVE 8
#define R_386_GOTOFF 9
#define R_386_GOTPC 10
#define R_386_32PLT 11
#define R_386_TLS_TPOFF 14
#define R_386_TLS_IE 15
#define R_386_TLS_GOTIE 16
#define R_386_TLS_LE 17
#define R_386_TLS_GD 18
#define R_386_TLS_LDM 19
#define R_386_16 20
#define R_386_PC16 21
#define R_386_8 22
#define R_386_PC8 23
#define R_386_TLS_GD_32 24
#define R_386_TLS_GD_PUSH 25
#define R_386_TLS_GD_CALL 26
#define R_386_TLS_GD_POP 27
#define R_386_TLS_LDM_32 28
#define R_386_TLS_LDM_PUSH 29
#define R_386_TLS_LDM_CALL 30
#define R_386_TLS_LDM_POP 31
#define R_386_TLS_LDO_32 32
#define R_386_TLS_IE_32 33
#define R_386_TLS_LE_32 34
#define R_386_TLS_DTPMOD32 35
#define R_386_TLS_DTPOFF32 36
#define R_386_TLS_TPOFF32 37
#define R_386_SIZE32 38
#define R_386_TLS_GOTDESC 39
#define R_386_TLS_DESC_CALL 40
#define R_386_TLS_DESC 41
#define R_386_IRELATIVE 42
#define R_386_GOT32X 43
#define R_386_NUM 44
The x64 relocation types are defined as follows
#define R_X86_64_NONE 0
#define R_X86_64_64 1
#define R_X86_64_PC32 2
#define R_X86_64_GOT32 3
#define R_X86_64_PLT32 4
#define R_X86_64_COPY 5
#define R_X86_64_GLOB_DAT 6
#define R_X86_64_JUMP_SLOT 7
#define R_X86_64_RELATIVE 8
#define R_X86_64_GOTPCREL 9
#define R_X86_64_32 10
#define R_X86_64_32S 11
#define R_X86_64_16 12
#define R_X86_64_PC16 13
#define R_X86_64_8 14
#define R_X86_64_PC8 15
#define R_X86_64_DTPMOD64 16
#define R_X86_64_DTPOFF64 17
#define R_X86_64_TPOFF64 18
#define R_X86_64_TLSGD 19
#define R_X86_64_TLSLD 20
#define R_X86_64_DTPOFF32 21
#define R_X86_64_GOTTPOFF 22
#define R_X86_64_TPOFF32 23
#define R_X86_64_PC64 24
#define R_X86_64_GOTOFF64 25
#define R_X86_64_GOTPC32 26
#define R_X86_64_GOT64 27
#define R_X86_64_GOTPCREL64 28
#define R_X86_64_GOTPC64 29
#define R_X86_64_GOTPLT64 30
#define R_X86_64_PLTOFF64 31
#define R_X86_64_SIZE32 32
#define R_X86_64_SIZE64 33
#define R_X86_64_GOTPC32_TLSDESC 34
#define R_X86_64_TLSDESC_CALL 35
#define R_X86_64_TLSDESC 36
#define R_X86_64_IRELATIVE 37
#define R_X86_64_RELATIVE64 38
#define R_X86_64_GOTPCRELX 41
#define R_X86_64_REX_GOTPCRELX 42
#define R_X86_64_NUM 43
Printing the relocation tables
const char *getRelocationTypeString32(Elf_Word value) {
switch (value) {
case R_386_NONE: return "R_386_NONE";
case 1: return "R_386_32";
case 2: return "R_386_PC32";
case 3: return "R_386_GOT32";
case 4: return "R_386_PLT32";
case 5: return "R_386_COPY";
case 6: return "R_386_GLOB_DAT";
case 7: return "R_386_JMP_SLOT";
case 8: return "R_386_RELATIVE";
case 9: return "R_386_GOTOFF";
case 10: return "R_386_GOTPC";
case 11: return "R_386_32PLT";
case 14: return "R_386_TLS_TPOFF";
case 15: return "R_386_TLS_IE";
case 16: return "R_386_TLS_GOTIE";
case 17: return "R_386_TLS_LE";
case 18: return "R_386_TLS_GD";
case 19: return "R_386_TLS_LDM";
case 20: return "R_386_16";
case 21: return "R_386_PC16";
case 22: return "R_386_8";
case 23: return "R_386_PC8";
case 24: return "R_386_TLS_GD_32";
case 25: return "R_386_TLS_GD_PUSH";
case 26: return "R_386_TLS_GD_CALL";
case 27: return "R_386_TLS_GD_POP";
case 28: return "R_386_TLS_LDM_32";
case 29: return "R_386_TLS_LDM_PUSH";
case 30: return "R_386_TLS_LDM_CALL";
case 31: return "R_386_TLS_LDM_POP";
case 32: return "R_386_TLS_LDO_32";
case 33: return "R_386_TLS_IE_32";
case 34: return "R_386_TLS_LE_32";
case 35: return "R_386_TLS_DTPMOD32";
case 36: return "R_386_TLS_DTPOFF32";
case 37: return "R_386_TLS_TPOFF32";
case 38: return "R_386_SIZE32";
case 39: return "R_386_TLS_GOTDESC";
case 40: return "R_386_TLS_DESC_CALL";
case 41: return "R_386_TLS_DESC";
case 42: return "R_386_IRELATIVE";
case 43: return "R_386_GOT32X";
default: return "Unknown relocation type";
}
}
void printRelocationTable32(const Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,uint8_t* pFileBuffer,const char* pSectionHeaderStringTable) {
printf("Relocation Tables:\n");
for (int i = 0; i < sectionNum; i++) {
if (pSectionHeader[i].sh_type == SHT_REL) {
const Elf32_Shdr *pRelocationTableHeader = &pSectionHeader[i];
Elf32_Rel *pRelocationTable = (Elf32_Rel *) (pFileBuffer + pRelocationTableHeader->sh_offset);
Elf32_Word relocItemNum = pRelocationTableHeader->sh_size / pRelocationTableHeader->sh_entsize;
Elf32_Shdr *pSymbolTableHeader = (Elf32_Shdr *) &pSectionHeader[pSectionHeader[i].sh_link];
Elf32_Sym *pSymbolTable = (Elf32_Sym *) (pFileBuffer + pSymbolTableHeader->sh_offset);
char *pSymbolTableStringTable = (char *) pFileBuffer + pSectionHeader[pSymbolTableHeader->sh_link].sh_offset;
printf("Relocation Section '%s' at offset %#x contains %u entries\n",(const char*) pSectionHeaderStringTable + pSectionHeader[i].sh_name, pSectionHeader[i].sh_offset, relocItemNum);
printf("\tOffset\t\tInfo\t\tType\t\t\t\tSym.value\t\tSym.name\n");
for (int j = 0; j < relocItemNum; j++) {
printf("\t%08x", pRelocationTable[j].r_offset);
printf("\t%08x", pRelocationTable[j].r_info);
printf("\t%s\t", getRelocationTypeString32(ELF32_R_TYPE(pRelocationTable[j].r_info)));
printf("\t%08x\t", pSymbolTable[ELF32_R_SYM(pRelocationTable[j].r_info)].st_value);
printf("\t%s", &pSymbolTableStringTable[pSymbolTable[ELF32_R_SYM(pRelocationTable[j].r_info)].st_name]);
printf("\n");
}
}
}
}
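Applying the relocations
r_offset gives the location to patch as an RVA (relative to the load base); the value stored at that location must have the ELF load base added to it
For example, readelf shows the following relocation tables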
Relocation section '.rel.dyn' at offset 0x384 contains 8 entries:
Offset Info Type Sym.Value Sym. Name
00003ee8 00000008 R_386_RELATIVE
00003eec 00000008 R_386_RELATIVE
00003fec 00000008 R_386_RELATIVE
0000400c 00000008 R_386_RELATIVE
00003fe0 00000206 R_386_GLOB_DAT 00000000 _ITM_deregisterTM[...]
00003fe4 00000306 R_386_GLOB_DAT 00000000 __cxa_finalize@GLIBC_2.1.3
00003fe8 00000506 R_386_GLOB_DAT 00000000 __gmon_start__
00003ff0 00000606 R_386_GLOB_DAT 00000000 _ITM_registerTMCl[...]
Relocation section '.rel.plt' at offset 0x3c4 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00004000 00000107 R_386_JUMP_SLOT 00000000 __libc_start_main@GLIBC_2.34
00004004 00000407 R_386_JUMP_SLOT 00000000 puts@GLIBC_2.0
No processor specific unwind information to decode
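3ee8 and 3eec are in the init_array and fini_array sections; both are RELATIVE relocations
3fec, 3fe0, 3fe4, 3fe8 and 3ff0 are GOT entries; 3fec (main_ptr) is RELATIVE and the others are GLOB_DAT
These entries are filled with addresses of functions in the virtual extern segment, which does not actually exist in memory
4000 and 4004 are PLT entries, both of type JUMP_SLOT; 400c is dso_handle, of type RELATIVE
The got.plt entries are likewise filled with external function addresses in the virtual extern segment
IDA automatically appends an extern segment at the end of the ELF file (it does not exist in memory and is for analysis only)
To summarize, relocation falls into these cases:
- Read the value at the location to be relocated and add the ELF load base
  This case fixes up absolute-address references to variables inside the ELF file itself
  Example: the RELATIVE type
- Load the dynamic library and write in the external function's address
  This case fixes up references to external symbols
  Example: the GLOB_DAT and JUMP_SLOT types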
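The two cases summarized above can be sketched for x86 REL entries as follows. This is an illustration under assumptions, not the loader from Sources.zip: applyRel32 is a made-up name, the image is assumed to have been copied into a buffer indexed by RVA, and symbol resolution (for example via dlsym) is assumed to have happened already, with the results passed in as an array indexed by symbol table index.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* image/imageSize: buffer holding the mapped segments, indexed by RVA.
 * loadBase: address the image is (or will be) mapped at.
 * symAddr/symCount: already-resolved address of each dynamic symbol
 * (0 for unresolved weak symbols such as __gmon_start__).
 * Returns 0 on success, -1 on a bad entry or an unhandled type. */
static int applyRel32(uint8_t *image, size_t imageSize, uint32_t loadBase,
                      const Elf32_Rel *rel, size_t relCount,
                      const uint32_t *symAddr, size_t symCount) {
    for (size_t i = 0; i < relCount; i++) {
        if (imageSize < sizeof(uint32_t) || rel[i].r_offset > imageSize - sizeof(uint32_t))
            return -1;
        uint32_t value;
        memcpy(&value, image + rel[i].r_offset, sizeof value);   /* implicit addend */
        switch (ELF32_R_TYPE(rel[i].r_info)) {
        case R_386_RELATIVE:          /* case 1: base + addend */
            value += loadBase;
            break;
        case R_386_GLOB_DAT:          /* case 2: write the symbol address */
        case R_386_JMP_SLOT: {
            uint32_t symIndex = ELF32_R_SYM(rel[i].r_info);
            if (symIndex >= symCount) return -1;
            value = symAddr[symIndex];
            break;
        }
        default:
            return -1;                /* type not handled by this sketch */
        }
        memcpy(image + rel[i].r_offset, &value, sizeof value);
    }
    return 0;
}
```

Using memcpy for the reads and writes avoids assuming that r_offset is 4-byte aligned.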
Dynamic Segment
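If an object file takes part in dynamic linking, it must contain a program header entry of type PT_DYNAMIC; the corresponding section is named .dynamic (type SHT_DYNAMIC)
The dynamic segment provides the information the dynamic linker needs, such as which shared libraries the file depends on and where the dynamic symbol table and dynamic relocation tables are located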
typedef struct
{
Elf32_Sword d_tag;
union
{
Elf32_Word d_val;
Elf32_Addr d_ptr;
} d_un;
} Elf32_Dyn;
typedef struct
{
Elf64_Sxword d_tag;
union
{
Elf64_Xword d_val;
Elf64_Addr d_ptr;
} d_un;
} Elf64_Dyn;
d_tag
d_tag determines how d_un is interpreted
The valid d_tag values are defined as follows
#define DT_NULL 0
#define DT_NEEDED 1
#define DT_PLTRELSZ 2
#define DT_PLTGOT 3
#define DT_HASH 4
#define DT_STRTAB 5
#define DT_SYMTAB 6
#define DT_RELA 7
#define DT_RELASZ 8
#define DT_RELAENT 9
#define DT_STRSZ 10
#define DT_SYMENT 11
#define DT_INIT 12
#define DT_FINI 13
#define DT_SONAME 14
#define DT_RPATH 15
#define DT_SYMBOLIC 16
#define DT_REL 17
#define DT_RELSZ 18
#define DT_RELENT 19
#define DT_PLTREL 20
#define DT_DEBUG 21
#define DT_TEXTREL 22
#define DT_JMPREL 23
#define DT_BIND_NOW 24
#define DT_INIT_ARRAY 25
#define DT_FINI_ARRAY 26
#define DT_INIT_ARRAYSZ 27
#define DT_FINI_ARRAYSZ 28
#define DT_RUNPATH 29
#define DT_FLAGS 30
#define DT_ENCODING 32
#define DT_PREINIT_ARRAY 32
#define DT_PREINIT_ARRAYSZ 33
#define DT_SYMTAB_SHNDX 34
#define DT_RELRSZ 35
#define DT_RELR 36
#define DT_RELRENT 37
#define DT_NUM 38
#define DT_LOOS 0x6000000d
#define DT_HIOS 0x6ffff000
#define DT_LOPROC 0x70000000
#define DT_HIPROC 0x7fffffff
#define DT_PROCNUM DT_MIPS_NUM
#define DT_VALRNGLO 0x6ffffd00
#define DT_GNU_PRELINKED 0x6ffffdf5
#define DT_GNU_CONFLICTSZ 0x6ffffdf6
#define DT_GNU_LIBLISTSZ 0x6ffffdf7
#define DT_CHECKSUM 0x6ffffdf8
#define DT_PLTPADSZ 0x6ffffdf9
#define DT_MOVEENT 0x6ffffdfa
#define DT_MOVESZ 0x6ffffdfb
#define DT_FEATURE_1 0x6ffffdfc
#define DT_POSFLAG_1 0x6ffffdfd
#define DT_SYMINSZ 0x6ffffdfe
#define DT_SYMINENT 0x6ffffdff
#define DT_VALRNGHI 0x6ffffdff
#define DT_VALTAGIDX(tag) (DT_VALRNGHI - (tag))
#define DT_VALNUM 12
#define DT_ADDRRNGLO 0x6ffffe00
#define DT_GNU_HASH 0x6ffffef5
#define DT_TLSDESC_PLT 0x6ffffef6
#define DT_TLSDESC_GOT 0x6ffffef7
#define DT_GNU_CONFLICT 0x6ffffef8
#define DT_GNU_LIBLIST 0x6ffffef9
#define DT_CONFIG 0x6ffffefa
#define DT_DEPAUDIT 0x6ffffefb
#define DT_AUDIT 0x6ffffefc
#define DT_PLTPAD 0x6ffffefd
#define DT_MOVETAB 0x6ffffefe
#define DT_SYMINFO 0x6ffffeff
#define DT_ADDRRNGHI 0x6ffffeff
#define DT_ADDRTAGIDX(tag) (DT_ADDRRNGHI - (tag))
#define DT_ADDRNUM 11
#define DT_VERSYM 0x6ffffff0
#define DT_RELACOUNT 0x6ffffff9
#define DT_RELCOUNT 0x6ffffffa
#define DT_FLAGS_1 0x6ffffffb
#define DT_VERDEF 0x6ffffffc
#define DT_VERDEFNUM 0x6ffffffd
#define DT_VERNEED 0x6ffffffe
#define DT_VERNEEDNUM 0x6fffffff
#define DT_VERSIONTAGIDX(tag) (DT_VERNEEDNUM - (tag))
#define DT_VERSIONTAGNUM 16
#define DT_AUXILIARY 0x7ffffffd
#define DT_FILTER 0x7fffffff
#define DT_EXTRATAGIDX(tag) ((Elf32_Word)-((Elf32_Sword) (tag) <<1>>1)-1)
#define DT_EXTRANUM 3
DT_NEEDED
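This tag names a shared library the ELF file depends on; its d_val is an offset into the dynamic string table
Reading the string at that offset in .dynstr gives the library name
The sh_link field of the .dynamic section header is the section index of the dynamic string table
Alternatively, the entry with d_tag==DT_STRTAB holds the virtual address of .dynstr in d_ptr; it equals the file offset only when the containing segment has p_vaddr==p_offset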
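As a minimal sketch of the DT_NEEDED lookup described above: the function below walks a dynamic array until DT_NULL and collects the library names from a string table. DemoDyn, the DEMO_DT_* constants and collectNeeded are hypothetical stand-ins introduced for illustration (the real types are Elf32_Dyn/Elf64_Dyn), so treat it as an outline rather than the parser's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for Elf32_Dyn, so the sketch does not depend on <elf.h>. */
typedef struct {
    int32_t d_tag;
    union { uint32_t d_val; uint32_t d_ptr; } d_un;
} DemoDyn;

enum { DEMO_DT_NULL = 0, DEMO_DT_NEEDED = 1 };

/* Stores up to maxNames library-name pointers in names and returns how many were stored.
   The array is walked until DT_NULL, so no entry count is required. */
size_t collectNeeded(const DemoDyn *dyn, const char *dynstr, size_t dynstrSize,
                     const char **names, size_t maxNames) {
    size_t count = 0;
    assert(dyn != NULL && dynstr != NULL);
    for (; dyn->d_tag != DEMO_DT_NULL; dyn++) {
        if (dyn->d_tag != DEMO_DT_NEEDED) continue;
        if (dyn->d_un.d_val >= dynstrSize) continue;  /* offset outside .dynstr: skip */
        if (count < maxNames) names[count++] = dynstr + dyn->d_un.d_val;
    }
    return count;
}
```

With a string table of "\0libc.so.6\0libm.so.6" and two DT_NEEDED entries holding the offsets 1 and 11, the function returns both names in order.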
d_un
d_val holds an integer value
d_ptr holds a virtual address in the process address space
The interpretation rules are as follows
Name | Value | d_un | Executable | Shared object
DT_NULL | 0 | ignored | mandatory | mandatory
DT_NEEDED | 1 | d_val | optional | optional
DT_PLTRELSZ | 2 | d_val | optional | optional
DT_PLTGOT | 3 | d_ptr | optional | optional
DT_HASH | 4 | d_ptr | mandatory | mandatory
DT_STRTAB | 5 | d_ptr | mandatory | mandatory
DT_SYMTAB | 6 | d_ptr | mandatory | mandatory
DT_RELA | 7 | d_ptr | mandatory | optional
DT_RELASZ | 8 | d_val | mandatory | optional
DT_RELAENT | 9 | d_val | mandatory | optional
DT_STRSZ | 10 | d_val | mandatory | mandatory
DT_SYMENT | 11 | d_val | mandatory | mandatory
DT_INIT | 12 | d_ptr | optional | optional
DT_FINI | 13 | d_ptr | optional | optional
DT_SONAME | 14 | d_val | ignored | optional
DT_RPATH | 15 | d_val | optional | ignored
DT_SYMBOLIC | 16 | ignored | ignored | optional
DT_REL | 17 | d_ptr | mandatory | optional
DT_RELSZ | 18 | d_val | mandatory | optional
DT_RELENT | 19 | d_val | mandatory | optional
DT_PLTREL | 20 | d_val | optional | optional
DT_DEBUG | 21 | d_ptr | optional | ignored
DT_TEXTREL | 22 | ignored | optional | optional
DT_JMPREL | 23 | d_ptr | optional | optional
DT_BIND_NOW | 24 | ignored | optional | optional
DT_LOPROC | 0x70000000 | unspecified | unspecified | unspecified
DT_HIPROC | 0x7fffffff | unspecified | unspecified | unspecified
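Since d_ptr entries hold virtual addresses, a parser working on the raw file has to translate them to file offsets through the PT_LOAD program headers. The sketch below shows one way to do that; DemoPhdr, DEMO_PT_LOAD and vaddrToOffset are hypothetical names (the real structure is Elf32_Phdr), and only the file-backed part of each segment is considered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the Elf32_Phdr fields used here. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_filesz;
} DemoPhdr;

enum { DEMO_PT_LOAD = 1 };

/* Returns the file offset backing vaddr, or UINT32_MAX when no PT_LOAD
   segment contains vaddr within its file-backed bytes. */
uint32_t vaddrToOffset(const DemoPhdr *phdr, size_t phnum, uint32_t vaddr) {
    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != DEMO_PT_LOAD) continue;
        if (vaddr >= phdr[i].p_vaddr && vaddr - phdr[i].p_vaddr < phdr[i].p_filesz)
            return phdr[i].p_offset + (vaddr - phdr[i].p_vaddr);
    }
    return UINT32_MAX;
}
```

For a segment with p_offset=0x2ee8 and p_vaddr=0x3ee8, the address 0x3ef0 maps to file offset 0x2ef0, which is why a d_ptr value cannot in general be used as a file offset directly.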
Printing the dynamic segment
#define DT_VAL 0
#define DT_PTR 1
const char *getDynamicType(Elf_Xword value) {
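// Generic range labels are limited to tags below the named GNU/Sun tags: DT_GNU_HASH etc. lie inside DT_LOOS..DT_HIOS and must reach the switch
if (value >= DT_LOOS && value < DT_VALRNGLO)
return "OS-Specific";
// DT_AUXILIARY and DT_FILTER sit at the top of the processor range and are named in the switch
if (value >= DT_LOPROC && value < DT_AUXILIARY)
return "Processor-Specific";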
switch (value) {
case DT_NULL: return "NULL";
case DT_NEEDED: return "NEEDED";
case DT_PLTRELSZ: return "PLTRELSZ";
case DT_PLTGOT: return "PLTGOT";
case DT_HASH: return "HASH";
case DT_STRTAB: return "STRTAB";
case DT_SYMTAB: return "SYMTAB";
case DT_RELA: return "RELA";
case DT_RELASZ: return "RELASZ";
case DT_RELAENT: return "RELAENT";
case DT_STRSZ: return "STRSZ";
case DT_SYMENT: return "SYMENT";
case DT_INIT: return "INIT";
case DT_FINI: return "FINI";
case DT_SONAME: return "SONAME";
case DT_RPATH: return "RPATH";
case DT_SYMBOLIC: return "SYMBOLIC";
case DT_REL: return "REL";
case DT_RELSZ: return "RELSZ";
case DT_RELENT: return "RELENT";
case DT_PLTREL: return "PLTREL";
case DT_DEBUG: return "DEBUG";
case DT_TEXTREL: return "TEXTREL";
case DT_JMPREL: return "JMPREL";
case DT_BIND_NOW: return "BIND_NOW";
case DT_INIT_ARRAY: return "INIT_ARRAY";
case DT_FINI_ARRAY: return "FINI_ARRAY";
case DT_INIT_ARRAYSZ: return "INIT_ARRAYSZ";
case DT_FINI_ARRAYSZ: return "FINI_ARRAYSZ";
case DT_RUNPATH: return "RUNPATH";
case DT_FLAGS: return "FLAGS";
case DT_ENCODING: return "ENCODING";
case DT_SYMTAB_SHNDX: return "SYMTAB_SHNDX";
case DT_RELRSZ: return "RELRSZ";
case DT_RELR: return "RELR";
case DT_RELRENT: return "RELRENT";
case DT_NUM: return "NUM";
case DT_VALRNGLO: return "VALRNGLO";
case DT_GNU_PRELINKED: return "GNU_PRELINKED";
case DT_GNU_CONFLICTSZ: return "GNU_CONFLICTSZ";
case DT_GNU_LIBLISTSZ: return "GNU_LIBLISTSZ";
case DT_CHECKSUM: return "CHECKSUM";
case DT_PLTPADSZ: return "PLTPADSZ";
case DT_MOVEENT: return "MOVEENT";
case DT_MOVESZ: return "MOVESZ";
case DT_FEATURE_1: return "FEATURE_1";
case DT_POSFLAG_1: return "POSFLAG_1";
case DT_SYMINSZ: return "SYMINSZ";
case DT_SYMINENT: return "SYMINENT";
case DT_ADDRRNGLO: return "ADDRRNGLO";
case DT_GNU_HASH: return "GNU_HASH";
case DT_TLSDESC_PLT: return "TLSDESC_PLT";
case DT_TLSDESC_GOT: return "TLSDESC_GOT";
case DT_GNU_CONFLICT: return "GNU_CONFLICT";
case DT_GNU_LIBLIST: return "GNU_LIBLIST";
case DT_CONFIG: return "CONFIG";
case DT_DEPAUDIT: return "DEPAUDIT";
case DT_AUDIT: return "AUDIT";
case DT_PLTPAD: return "PLTPAD";
case DT_MOVETAB: return "MOVETAB";
case DT_SYMINFO: return "SYMINFO";
case DT_VERSYM: return "VERSYM";
case DT_RELACOUNT: return "RELACOUNT";
case DT_RELCOUNT: return "RELCOUNT";
case DT_FLAGS_1: return "FLAGS_1";
case DT_VERDEF: return "VERDEF";
case DT_VERDEFNUM: return "VERDEFNUM";
case DT_VERNEED: return "VERNEED";
case DT_VERNEEDNUM: return "VERNEEDNUM";
case DT_AUXILIARY: return "AUXILIARY";
case DT_FILTER: return "FILTER";
default: return "Unknown Type";
}
}
uint32_t getDynamicDunType(Elf_Xword value) {
switch (value) {
case DT_NULL:
case DT_NEEDED:
case DT_PLTRELSZ:
case DT_RELASZ:
case DT_RELAENT:
case DT_STRSZ:
case DT_SYMENT:
case DT_SONAME:
case DT_RPATH:
case DT_SYMBOLIC:
case DT_RELSZ:
case DT_RELENT:
case DT_PLTREL:
case DT_TEXTREL:
case DT_BIND_NOW:
case DT_LOPROC:
case DT_HIPROC:
return DT_VAL;
case DT_PLTGOT:
case DT_HASH:
case DT_STRTAB:
case DT_SYMTAB:
case DT_RELA:
case DT_INIT:
case DT_FINI:
case DT_JMPREL:
case DT_DEBUG:
case DT_REL:
return DT_PTR;
default:
return DT_VAL;
}
}
void printDynamicSegment32(const Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,uint8_t* pFileBuffer) {
for (int i = 0; i < sectionNum; i++) {
if (pSectionHeader[i].sh_type == SHT_DYNAMIC) {
const Elf32_Shdr *pDynamicSection = &pSectionHeader[i];
// Fall back to sizeof(Elf32_Dyn) if a malformed file reports sh_entsize == 0
Elf32_Word entrySize = pDynamicSection->sh_entsize ? pDynamicSection->sh_entsize : sizeof(Elf32_Dyn);
Elf32_Word dynamicItemNum = pDynamicSection->sh_size / entrySize;
printf("Dynamic Section At File Offset %#x Contains %u Entries:\n", pDynamicSection->sh_offset,dynamicItemNum);
printf("\tTag \t\tType\t\t\t\tName/Value\n");
const Elf32_Dyn *pDynamicTable = (const Elf32_Dyn *) (pFileBuffer + pDynamicSection->sh_offset);
const Elf32_Shdr *pDynamicStringTableHeader = &pSectionHeader[pDynamicSection->sh_link];
const char *pDynamicStringTable = (const char *) pFileBuffer + pDynamicStringTableHeader->sh_offset;
for (Elf32_Word j = 0; j < dynamicItemNum; j++) {
printf("\t%08x", pDynamicTable[j].d_tag);
printf("\t%-16s", getDynamicType(pDynamicTable[j].d_tag));
printf("\t%08x\t", pDynamicTable[j].d_un.d_val);
if (getDynamicDunType(pDynamicTable[j].d_tag) == DT_PTR)
printf("(PTR)");
switch (pDynamicTable[j].d_tag) {
case DT_NEEDED: printf("[%s]", pDynamicStringTable + pDynamicTable[j].d_un.d_val);
break;
case DT_SONAME: printf("[%s]", pDynamicStringTable + pDynamicTable[j].d_un.d_val);
break;
default: ;
}
printf("\n");
}
}
}
}
Hash Table (Export Table)
Hash tables are used to look up exported symbols. There are two kinds, and current ELF files mostly use the GNU hash table as the export table
.hash
.gnu.hash
ELF Hash
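The hash table layout is as follows
// Layout description only: C allows a single flexible array member, so real code indexes a uint32_t array instead
struct ELFHash {
uint32_t nbucket;    // number of buckets
uint32_t nchain;     // number of chain entries, equal to the number of symbols
uint32_t buckets[];  // nbucket entries
uint32_t chains[];   // nchain entries, directly after buckets
};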
The original ELF hash algorithm (specified in the System V ABI) is as follows
uint32_t elf_hash(const unsigned char* name)
{
uint32_t h = 0, g;
while (*name)
{
h = (h << 4) + *name++;
if (g = h & 0xf0000000)
h ^= g >> 24;
h &= ~g;
}
return h;
}
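Looking up a symbol by name in the ELF hash table works as follows
- Compute the hash of the symbol name with elf_hash
- index=buckets[hash%nbucket]
  index is the symbol's index in the symbol table
- If index==STN_UNDEF(0) the symbol was not found and the search ends
  Otherwise compare the name of symbol index in the symbol table with the target name
- If the names differ, take the next symbol index from the chains table and repeat step 3
  index=chains[index] (chains[index]==0 means the symbol does not exist)
In code: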
uint32_t findSymbolIndexByElfHash(const char* symbolName,
uint32_t* pHashTable,
Elf32_Sym* pSymbolTable,
const char* pSymbolStringTable)
{
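uint32_t nbucket=pHashTable[0];
if (nbucket==0) return 0; // empty table: avoid hash % 0
uint32_t* buckets=&pHashTable[2],*chains=&pHashTable[2+nbucket];
// elf_hash takes const unsigned char*, so cast explicitly
uint32_t hash = elf_hash((const unsigned char*)symbolName);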
for (uint32_t index=buckets[hash % nbucket]; index; index = chains[index]) {
if (strcmp(symbolName, &pSymbolStringTable[pSymbolTable[index].st_name]) == 0) {
return index;
}
}
return 0;
}
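To see the bucket/chain mechanics end to end, here is a small self-contained sketch that builds a SysV-style hash table in memory and then searches it. For brevity it compares against a plain array of names instead of a real symbol table plus string table; demoElfHash, buildSysvHash and lookupSysvHash are hypothetical helper names, and inserting each symbol at the head of its bucket is just one possible construction order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t demoElfHash(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 0, g;
    while (*p) {
        h = (h << 4) + *p++;
        g = h & 0xf0000000u;
        if (g) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Fills table (2 + nbucket + nsym words) for names[0..nsym-1].
   names[0] stands for the reserved null symbol and is never hashed. */
void buildSysvHash(uint32_t *table, uint32_t nbucket,
                   const char *const *names, uint32_t nsym) {
    assert(nbucket > 0);
    uint32_t *buckets = table + 2, *chains = table + 2 + nbucket;
    table[0] = nbucket;
    table[1] = nsym;                      /* nchain equals the symbol count */
    memset(buckets, 0, (nbucket + nsym) * sizeof(uint32_t));
    for (uint32_t i = 1; i < nsym; i++) {
        uint32_t b = demoElfHash(names[i]) % nbucket;
        chains[i] = buckets[b];           /* old head becomes the next link */
        buckets[b] = i;                   /* new symbol becomes the head */
    }
}

/* Returns the symbol index of name, or 0 (STN_UNDEF) if it is absent. */
uint32_t lookupSysvHash(const uint32_t *table, const char *const *names,
                        const char *name) {
    uint32_t nbucket = table[0];
    const uint32_t *buckets = table + 2, *chains = table + 2 + nbucket;
    for (uint32_t i = buckets[demoElfHash(name) % nbucket]; i != 0; i = chains[i])
        if (strcmp(names[i], name) == 0) return i;
    return 0;
}
```

Because chain entry 0 is always 0, the loop terminates as soon as it runs off the end of a bucket's list.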
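Manual lookup example:
gcc on x86_64 emits only .gnu.hash by default, so a 64-bit .so built with the Android NDK is used here
Locate the .hash section: nbucket=nchain=0x36
Compute the bucket index with elf_hash: index=hash%nbucket=48
Each bucket entry is 4 bytes and the buckets array starts at 0x960, so the entry is at 0x960+48*4=0xA20
The value stored there is dynamic symbol index 0xE(14), which is exactly dlopen in the symbol table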
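The arithmetic of the manual walk can be reproduced in a few lines. The sketch below hashes "dlopen" and computes the file offset of its bucket slot; bucketSlotOffset and demoElfHash are hypothetical helper names, and 0x960 is taken from the example above as the start of the buckets array.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t demoElfHash(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 0, g;
    while (*p) {
        h = (h << 4) + *p++;
        g = h & 0xf0000000u;
        if (g) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* File offset of the bucket slot that name hashes to.
   bucketsOffset is the file offset of buckets[0]. */
uint32_t bucketSlotOffset(uint32_t bucketsOffset, uint32_t nbucket, const char *name) {
    assert(nbucket > 0);
    return bucketsOffset + (demoElfHash(name) % nbucket) * (uint32_t)sizeof(uint32_t);
}
```

elf_hash("dlopen") is 0x06b366be, and 0x06b366be % 0x36 is 48, so bucketSlotOffset(0x960, 0x36, "dlopen") yields 0xA20, matching the offset found by hand.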
Android Elf Hash
Android's elfhash is written slightly differently but is equivalent to the original elf_hash
See https://cs.android.com/android/platform/superproject/+/android-4.1.2_r2.1:bionic/linker/linker.c
static unsigned elfhash(const char *_name)
{
const unsigned char *name = (const unsigned char *) _name;
unsigned h = 0, g;
while(*name) {
h = (h << 4) + *name++;
g = h & 0xf0000000;
h ^= g;
h ^= g >> 24;
}
return h;
}
static Elf32_Sym *_elf_lookup(soinfo *si, unsigned hash, const char *name)
{
Elf32_Sym *s;
Elf32_Sym *symtab = si->symtab;
const char *strtab = si->strtab;
unsigned n;
TRACE_TYPE(LOOKUP, "%5d SEARCH %s in %s@0x%08x %08x %d\n", pid,
name, si->name, si->base, hash, hash % si->nbucket);
n = hash % si->nbucket;
for(n = si->bucket[hash % si->nbucket]; n != 0; n = si->chain[n]){
s = symtab + n;
if(strcmp(strtab + s->st_name, name)) continue;
switch(ELF32_ST_BIND(s->st_info)){
case STB_GLOBAL:
case STB_WEAK:
if(s->st_shndx == 0) continue;
TRACE_TYPE(LOOKUP, "%5d FOUND %s in %s (%08x) %d\n", pid,
name, si->name, s->st_value, s->st_size);
return s;
}
}
return NULL;
}
Sysv Hash
The ELF hash is also called the SysV hash; the copy of musl in the Android source tree implements it as sysv_hash, see https://cs.android.com/android/platform/superproject/+/android14-qpr3-release:external/musl/ldso/dynlink.c
static uint32_t sysv_hash(const char *s0)
{
const unsigned char *s = (void *)s0;
uint_fast32_t h = 0;
while (*s) {
h = 16*h + *s++;
h ^= h>>24 & 0xf0;
}
return h & 0xfffffff;
}
static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
{
size_t i;
Sym *syms = dso->syms;
Elf_Symndx *hashtab = dso->hashtab;
char *strings = dso->strings;
for (i=hashtab[2+h%hashtab[0]]; i; i=hashtab[2+hashtab[0]+i]) {
if ((!dso->versym || dso->versym[i] >= 0)
&& (!strcmp(s, strings+syms[i].st_name)))
return syms+i;
}
return 0;
}
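The three hash variants shown so far (the original, the Android linker one and musl's sysv_hash) look different but should produce the same value. The sketch below restates each of them and checks that they agree; hashOriginal, hashAndroid, hashMusl and hashesAgree are hypothetical names chosen for this comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: fold the top nibble into bits 4..7, then clear it. */
uint32_t hashOriginal(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 0, g;
    while (*p) {
        h = (h << 4) + *p++;
        g = h & 0xf0000000u;
        if (g) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Android linker form: clear the top nibble with XOR instead of AND-NOT. */
uint32_t hashAndroid(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 0, g;
    while (*p) {
        h = (h << 4) + *p++;
        g = h & 0xf0000000u;
        h ^= g;
        h ^= g >> 24;
    }
    return h;
}

/* musl form: never clears the top nibble inside the loop, masks once at the end. */
uint32_t hashMusl(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint_fast32_t h = 0;
    while (*p) {
        h = 16 * h + *p++;
        h ^= (h >> 24) & 0xf0;
    }
    return (uint32_t)(h & 0xfffffff);
}

/* Returns 1 when all three variants agree on s. */
int hashesAgree(const char *s) {
    uint32_t h = hashOriginal(s);
    return h == hashAndroid(s) && h == hashMusl(s);
}
```

For example, all three give 0x0006cf04 for "exit", 0x077905a6 for "printf" and 0x0b09985c for "syscall"; the last one is long enough to trigger the top-nibble fold.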
GNU Hash
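The GNU hash table layout is as follows
// Layout description only: C allows a single flexible array member
struct GnuHash {
uint32_t nbucket;      // number of buckets
uint32_t symndx;       // index of the first symbol covered by the table
uint32_t bloomSize;    // number of bloom filter words
uint32_t bloomShift;   // shift used for the second bloom filter bit
ElfW(Addr) blooms[];   // bloomSize words, 32 or 64 bits each
uint32_t buckets[];    // nbucket entries
uint32_t chains[];     // one entry per symbol from symndx onward
};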
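Note that GNU Hash has no nchain field. How is it computed?
- The chains array is preceded by the contiguous blooms and buckets arrays, so subtract the size of everything before it from the size of the hash table
- 32-bit: nchain=GNUHashTable.sh_size/sizeof(uint32_t) - (4+bloomSize+nbucket)
- 64-bit: nchain=GNUHashTable.sh_size/sizeof(uint32_t) - (4+bloomSize*2+nbucket)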
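The two formulas differ only in the width of a bloom word, so they can be folded into one helper. This is a sketch under the assumption that the section size comes from sh_size of .gnu.hash; gnuHashChainCount and its bloomWordBytes parameter are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* bloomWordBytes is 4 for ELFCLASS32 and 8 for ELFCLASS64.
   Returns 0 when the section is too small to hold the header, blooms and buckets. */
uint32_t gnuHashChainCount(uint32_t sectionSize, uint32_t nbucket,
                           uint32_t bloomSize, uint32_t bloomWordBytes) {
    assert(bloomWordBytes == 4 || bloomWordBytes == 8);
    /* 16 bytes of header + bloom words + 4-byte bucket entries */
    uint64_t fixedBytes = 16u + (uint64_t)bloomSize * bloomWordBytes
                              + (uint64_t)nbucket * 4u;
    if (sectionSize < fixedBytes) return 0;
    return (uint32_t)((sectionSize - fixedBytes) / 4u);
}
```

With nbucket=3 and bloomSize=1, a 52-byte section in a 32-bit file and a 56-byte section in a 64-bit file both leave room for 5 chain entries.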
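The GNU hash table lookup is illustrated as follows:
- The dashed part of the chain table does not exist in the file
  Chain entries are only needed for exported symbols, but chain indices must correspond one-to-one with symbol table indices
  So the notional start of the chain table is buckets+nbucket-symndx
  In the file the arrays are contiguous, and the real chains data still follows buckets directly
- Each chain entry stores the symbol's hash value, with the lowest bit used as an end marker
  Lowest bit 0 means more symbols follow in the same bucket
  Lowest bit 1 means this is the last symbol of the bucket
For details see ELF 通过 Sysv Hash & Gnu Hash 查找符号的实现及对比 and ELF解析07_哈希表, 导出表
See https://cs.android.com/android/platform/superproject/+/android14-qpr3-release:external/musl/ldso/dynlink.c
The implementation from the copy of musl in the Android source tree is as follows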
uint32_t gnu_hash(const unsigned char* str)
{
uint32_t h = 5381;
while(*str != 0)
{
h += (h<<5) +*str++;
}
return h;
}
static Sym *gnu_lookup(uint32_t h1, uint32_t *hashtab, struct dso *dso, const char *s)
{
uint32_t nbuckets = hashtab[0];
uint32_t *buckets = hashtab + 4 + hashtab[2]*(sizeof(size_t)/4);
uint32_t i = buckets[h1 % nbuckets];
if (!i) return 0;
uint32_t *hashval = buckets + nbuckets + (i - hashtab[1]);
for (h1 |= 1; ; i++) {
uint32_t h2 = *hashval++;
if ((h1 == (h2|1)) && (!dso->versym || dso->versym[i] >= 0)
&& !strcmp(s, dso->strings + dso->syms[i].st_name))
return dso->syms+i;
if (h2 & 1) break;
}
return 0;
}
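To make the chain layout concrete (entries start at symndx, and the lowest bit marks the end of a bucket), here is a simplified in-memory sketch. It leaves out the bloom filter entirely and compares against a plain name array rather than a symbol table; demoGnuHash, buildGnuChains and lookupGnu are hypothetical names, and the caller is assumed to pass the hashed symbols already grouped by bucket, as a linker would.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t demoGnuHash(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 5381;
    while (*p) h = h * 33 + *p++;
    return h;
}

/* names[0..symndx-1] are not hashed; names[symndx..nsym-1] must already be
   grouped by hash % nbucket. chainData holds nsym - symndx entries. */
void buildGnuChains(uint32_t *buckets, uint32_t *chainData, uint32_t nbucket,
                    uint32_t symndx, const char *const *names, uint32_t nsym) {
    assert(nbucket > 0 && symndx >= 1 && symndx <= nsym);
    memset(buckets, 0, nbucket * sizeof(uint32_t));
    for (uint32_t i = symndx; i < nsym; i++) {
        uint32_t h = demoGnuHash(names[i]), b = h % nbucket;
        if (buckets[b] == 0) buckets[b] = i;   /* first symbol of this bucket */
        int last = (i + 1 == nsym) || (demoGnuHash(names[i + 1]) % nbucket != b);
        chainData[i - symndx] = last ? (h | 1u) : (h & ~1u);
    }
}

/* Returns the symbol index of name, or 0 if it is absent. */
uint32_t lookupGnu(const uint32_t *buckets, const uint32_t *chainData,
                   uint32_t nbucket, uint32_t symndx,
                   const char *const *names, const char *name) {
    uint32_t h = demoGnuHash(name);
    uint32_t i = buckets[h % nbucket];
    if (i == 0) return 0;                      /* empty bucket */
    for (;; i++) {
        uint32_t stored = chainData[i - symndx];
        /* Compare hashes with the flag bit masked out before the costly strcmp */
        if ((stored | 1u) == (h | 1u) && strcmp(names[i], name) == 0) return i;
        if (stored & 1u) return 0;             /* end of this bucket's run */
    }
}
```

Indexing chainData with i - symndx is the same thing as using the notional chains pointer buckets+nbucket-symndx and indexing it with i.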
Printing the hash tables
unsigned int elf_hash(const char* _name)
{
const unsigned char* name=(const unsigned char*)_name;
unsigned int h = 0, g;
while (*name)
{
h = (h << 4) + *name++;
if (g = h & 0xf0000000)
h ^= g >> 24;
h &= ~g;
}
return h;
}
void printHashTable32(Elf32_Shdr* pSectionHeader,Elf_Half sectionNum,uint8_t* pFileBuffer,const char* pSectionHeaderStringTable) {
printf("ELF Hash Tables:\n");
for(int i=0;i<sectionNum;i++) {
if(pSectionHeader[i].sh_type==SHT_HASH) {
Elf32_Shdr* pDynamicSymbolTableHeader=&pSectionHeader[pSectionHeader[i].sh_link];
Elf32_Sym* pDynamicSymbolTable=(Elf32_Sym*)(pDynamicSymbolTableHeader->sh_offset+pFileBuffer);
const char* pDynamicSymbolStringTable=(const char*)(pSectionHeader[pDynamicSymbolTableHeader->sh_link].sh_offset+pFileBuffer);
uint32_t* pHashTable=(uint32_t*)(pSectionHeader[i].sh_offset+pFileBuffer);
uint32_t nbucket=pHashTable[0],nchain=pHashTable[1];
uint32_t* buckets=&pHashTable[2];
uint32_t* chains=&pHashTable[2+nbucket];
printf("\tHash Table '%s' contains %d entries\n",&pSectionHeaderStringTable[pSectionHeader[i].sh_name],nchain);
printf("\t\tNum\t\tHash %% Nbucket\t\tIndex\t\t\tValue\t\t\tName\n");
for(uint32_t j=0,count=0;j<nbucket;j++) {
// Walk the whole chain of bucket j; index 0 (STN_UNDEF) terminates it
for(uint32_t index=buckets[j];index!=0;index=chains[index]) {
printf("\t\t%d\t\t%08x\t\t%08x\t\t%08x\t\t%s\n",++count,elf_hash(&pDynamicSymbolStringTable[pDynamicSymbolTable[index].st_name])%nbucket,index,pDynamicSymbolTable[index].st_value,&pDynamicSymbolStringTable[pDynamicSymbolTable[index].st_name]);
}
}
}
if(pSectionHeader[i].sh_type==SHT_GNU_HASH) {
Elf32_Shdr* pDynamicSymbolTableHeader=&pSectionHeader[pSectionHeader[i].sh_link];
Elf32_Sym* pDynamicSymbolTable=(Elf32_Sym*)(pDynamicSymbolTableHeader->sh_offset+pFileBuffer);
const char* pDynamicSymbolStringTable=(const char*)(pSectionHeader[pDynamicSymbolTableHeader->sh_link].sh_offset+pFileBuffer);
uint32_t* pGNUHashTable=(uint32_t*)(pSectionHeader[i].sh_offset+pFileBuffer);
uint32_t nbucket=pGNUHashTable[0];
uint32_t symndx=pGNUHashTable[1];
uint32_t bloomSize=pGNUHashTable[2];
uint32_t bloomShift=pGNUHashTable[3];
Elf32_Addr* blooms=(Elf32_Addr*)&pGNUHashTable[4];
uint32_t* buckets=pGNUHashTable+4+bloomSize;
uint32_t* chains=buckets+nbucket-symndx;
uint32_t nchain=pSectionHeader[i].sh_size/sizeof(uint32_t)-(4+bloomSize+nbucket);
printf("\tHash Table '%s' contains %d entries, nbucket: %d, symndx: %#x \n",&pSectionHeaderStringTable[pSectionHeader[i].sh_name],nchain,nbucket,symndx);
printf("\t\tNum\t\tIndex\t\t\tValue\t\t\tName\n");
for(uint32_t j=0,count=0;j<nbucket;j++) {
uint32_t index=buckets[j];
// An empty bucket holds 0; chains[0] would lie outside the real chain data, so skip it
if(index<symndx) continue;
for(;index-symndx<nchain;index++) {
printf("\t\t%d\t\t%08x\t\t%08x\t\t%s\n",++count,index,pDynamicSymbolTable[index].st_value,&pDynamicSymbolStringTable[pDynamicSymbolTable[index].st_name]);
// Lowest bit set marks the last symbol of this bucket
if(chains[index]&1) break;
}
}
}
}
}
ELF Loader
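The ELF program headers describe which segments of the file must be mapped into memory. Loading an ELF program goes as follows:
- Read the ELF file into memory as the file buffer
- Map the file buffer into the image buffer according to the program headers
  Each segment must be given the correct permissions in this step
- Relocate: fix up global variable addresses and external reference addresses
  Global variable addresses are fixed up using the base address the ELF was loaded at
  For external references, load and walk the needed shared libraries, look up each symbol's real address and patch it in
- Jump to the entry point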
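One detail of step 2 is easy to miss: mprotect only accepts a page-aligned start address, while a PT_LOAD segment may begin in the middle of a page. The sketch below computes the smallest page-aligned range that covers a segment; PageRange and pageRangeFor are hypothetical names, and the page size is passed in rather than queried from the system.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t start, length; } PageRange;  /* hypothetical helper type */

/* Smallest page-aligned range covering [vaddr, vaddr + memsz).
   pageSize must be a power of two. */
PageRange pageRangeFor(uint64_t vaddr, uint64_t memsz, uint64_t pageSize) {
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);
    PageRange r;
    r.start = vaddr & ~(pageSize - 1);                            /* round start down */
    uint64_t end = (vaddr + memsz + pageSize - 1) & ~(pageSize - 1);  /* round end up */
    r.length = end - r.start;
    return r;
}
```

A segment at p_vaddr=0x3ee8 with p_memsz=0x130 crosses a page boundary, so with 4 KiB pages the range to protect starts at 0x3000 and is 0x2000 bytes long.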
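Build loadelf32 and loadelf64 separately to load x86 and x64 ELF files (only the .c files go on the command line; LoadELF.h is pulled in through #include)
gcc -m32 main.c LoadELF.c -o loadelf32
gcc -m64 main.c LoadELF.c -o loadelf64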
main.c
#include "LoadELF.h"
#include <stdio.h>
int main(int argc, char *argv[]) {
if (argc!= 2) {
printf("Usage: %s <filepath>\n", argv[0]);
return 1;
}
LoadAndExecElf(argv[1]);
return 0;
}
LoadELF.h
#ifndef LOADELF_H
#define LOADELF_H
#include <stddef.h>
#include <stdint.h>
uint8_t* readFileToBytes(const char *fileName,size_t* readSize);
void LoadAndExecElf(const char* filePath);
#endif
LoadELF.c
Macros are defined according to the x86/x64 build environment
#include "LoadELF.h"
#include <stdio.h>
#include <elf.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <string.h>
#include <sys/mman.h>
#include <link.h>
#ifdef __x86_64__
#define Elf_Ehdr Elf64_Ehdr
#define Elf_Phdr Elf64_Phdr
#define Elf_Shdr Elf64_Shdr
#define Elf_Addr Elf64_Addr
#define Elf_Dyn Elf64_Dyn
#define Elf_Rel Elf64_Rela
#define Elf_Sym Elf64_Sym
#define ELF_R_TYPE ELF64_R_TYPE
#define ELF_R_SYM ELF64_R_SYM
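#define DT_REL_ITEM DT_RELA
#define DT_REL_SZ DT_RELASZ
// Elf_Half and Elf_Word are used below but are not provided by <elf.h> or <link.h>
#define Elf_Half Elf64_Half
#define Elf_Word Elf64_Word
#else
#define Elf_Ehdr Elf32_Ehdr
#define Elf_Phdr Elf32_Phdr
#define Elf_Shdr Elf32_Shdr
#define Elf_Addr Elf32_Addr
#define Elf_Dyn Elf32_Dyn
#define Elf_Rel Elf32_Rel
#define Elf_Sym Elf32_Sym
#define ELF_R_TYPE ELF32_R_TYPE
#define ELF_R_SYM ELF32_R_SYM
#define DT_REL_ITEM DT_REL
#define DT_REL_SZ DT_RELSZ
#define Elf_Half Elf32_Half
#define Elf_Word Elf32_Word
#endif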
uint8_t* readFileToBytes(const char *fileName,size_t* readSize) {
FILE *file = fopen(fileName, "rb");
if (file == NULL) {
// fopen failed, so there is no stream to close; fclose(NULL) is undefined behaviour
printf("Error opening file\n");
return NULL;
}
fseek(file, 0,SEEK_END);
size_t fileSize = ftell(file);
fseek(file, 0,SEEK_SET);
uint8_t *buffer = (uint8_t *) malloc(fileSize);
if (buffer == NULL) {
printf("Error allocating memory\n");
fclose(file);
return NULL;
}
size_t bytesRead = fread(buffer, 1, fileSize, file);
if(bytesRead!=fileSize) {
printf("Read bytes not equal file size!\n");
free(buffer);
fclose(file);
return NULL;
}
fclose(file);
if(readSize)
*readSize=bytesRead;
return buffer;
}
uint64_t alignValue(uint64_t value, uint64_t alignment) {
return value % alignment ? (value / alignment + 1) * alignment : value;
}
size_t getElfMemorySize(Elf_Phdr* pProgramHeader,Elf_Half segmentNum) {
size_t size = 0;
for (int i = segmentNum - 1; i >= 0; i--) {
if (pProgramHeader[i].p_type == PT_LOAD) {
size = pProgramHeader[i].p_vaddr + pProgramHeader[i].p_memsz;
break;
}
}
return alignValue(size, 0x1000);
}
// Returns Elf_Addr so that 64-bit d_val/d_ptr values are not truncated to 32 bits
Elf_Addr getDynamicTableValueByType(Elf_Dyn *dynamicTable, size_t dynamicTableSize, int type) {
for (int i = 0; i < dynamicTableSize; i++) {
if (dynamicTable[i].d_tag == type) {
return dynamicTable[i].d_un.d_val;
}
}
return 0;
}
const char** getNeededLibraryPath(uint8_t* pElfBuffer,Elf_Dyn *pDynamicTable, size_t dynamicTableSize,size_t* neededLibraryNum) {
char** buffer = NULL;
int num=0;
char* pImageStringTable=(char*)pElfBuffer+getDynamicTableValueByType(pDynamicTable,dynamicTableSize,DT_STRTAB);
for (int i = 0; i < dynamicTableSize; i++) {
if (pDynamicTable[i].d_tag == DT_NEEDED) {
num++;
buffer=(char**)realloc(buffer,num*sizeof(char*));
if(buffer==NULL) {
printf("Error reallocating memory\n");
exit(-1);
}
buffer[num-1]=pImageStringTable+ pDynamicTable[i].d_un.d_val;
}
}
*neededLibraryNum=num;
return (const char**)buffer;
}
Elf_Addr getSymbolAddress(const char** neededLibrary, size_t neededLibraryNum, const char *symbolName) {
for (int i = 0; i < neededLibraryNum; i++) {
void *handle = dlopen(neededLibrary[i],RTLD_NOW);
if (handle == NULL) {
printf("Error opening library %s\n", dlerror());
exit(1);
}
void *address = dlsym(handle, symbolName);
if (address == NULL) {
continue;
}
return (Elf_Addr)address;
}
printf("Can't find address of symbol: %s\n",symbolName);
return 0;
}
void mapSegmentToMemory(uint8_t* pImageBuffer,uint8_t* pFileBuffer,Elf_Phdr* pProgramHeader,Elf_Half segmentNum) {
for (int i = 0; i < segmentNum; i++) {
if (pProgramHeader[i].p_type == PT_LOAD) {
uint8_t *pImageAddr = pImageBuffer + pProgramHeader[i].p_vaddr;
size_t memorySize = pProgramHeader[i].p_memsz;
Elf_Word segmentFlags = pProgramHeader[i].p_flags;
int protection = 0;
memcpy(pImageAddr, pFileBuffer + pProgramHeader[i].p_offset, pProgramHeader[i].p_filesz);
if (segmentFlags & PF_R) {
protection |= PROT_READ;
}
if (segmentFlags & PF_W) {
protection |= PROT_WRITE;
}
if (segmentFlags & PF_X) {
protection |= PROT_EXEC;
}
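// mprotect needs a page-aligned start address, so round the segment start down to its page
uint8_t *pPageStart = (uint8_t *) ((uintptr_t) pImageAddr & ~(uintptr_t) 0xfff);
if (mprotect(pPageStart, alignValue((pImageAddr - pPageStart) + memorySize, 0x1000), protection) != 0) {
perror("mprotect");
}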
}
}
}
void fixRelocationItem(Elf_Rel* pRelocationTable,Elf_Word relocationItemNum,uint8_t* pImageBuffer,const char* pDynamicStringTable,Elf_Sym* pDynamicSymbolTable,const char** neededLibrary,size_t neededLibraryNum) {
Elf_Addr* fixItem=NULL;
Elf_Addr baseAddr=(Elf_Addr)pImageBuffer;
for(int i=0;i<relocationItemNum;i++) {
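// R_386_RELATIVE/GLOB_DAT/JMP_SLOT have the same numeric values (8/6/7) as their R_X86_64_* counterparts, so these cases also match in the x64 build
switch (ELF_R_TYPE(pRelocationTable[i].r_info)) {
case R_386_RELATIVE:
fixItem=(Elf_Addr*)(pImageBuffer+pRelocationTable[i].r_offset);
*fixItem+=baseAddr;
break;
case R_386_GLOB_DAT:
case R_386_JMP_SLOT: {
// Braces give the declarations their own block; a declaration directly after a case label is only valid from C23 on
const char* symbolName=&pDynamicStringTable[ pDynamicSymbolTable[ELF_R_SYM(pRelocationTable[i].r_info)].st_name ];
fixItem=(Elf_Addr*)(pImageBuffer+pRelocationTable[i].r_offset);
Elf_Addr symbolAddr=getSymbolAddress(neededLibrary,neededLibraryNum,symbolName);
*fixItem=symbolAddr;
break;
}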
}
}
}
void LoadAndExecElf(const char* filePath) {
size_t readFileSize=0;
uint8_t* pFileBuffer=readFileToBytes(filePath,&readFileSize);
if(pFileBuffer==NULL) {
printf("Error reading file\n");
return;
}
Elf_Ehdr* pElfHeader=(Elf_Ehdr*)pFileBuffer;
Elf_Phdr *pProgramHeader=(Elf_Phdr*)(pFileBuffer+pElfHeader->e_phoff);
Elf_Half segmentNum=pElfHeader->e_phnum;
uint8_t* pImageBuffer=NULL;
size_t elfMemorySize = getElfMemorySize(pProgramHeader,segmentNum);
if (elfMemorySize == 0) {
printf("ELF memory size is 0!\n");
return;
}
// posix_memalign reports failure through its return value and expects a void** argument
if (posix_memalign((void**)&pImageBuffer, 0x1000, elfMemorySize) != 0) {
printf("Error allocating memory\n");
return;
}
memset(pImageBuffer,0 ,elfMemorySize);
mapSegmentToMemory(pImageBuffer,pFileBuffer,pProgramHeader,segmentNum);
Elf_Phdr *pDynamicTableHeader=NULL;
Elf_Dyn *pDynamicTable=NULL;
for (int i = 0; i < segmentNum; i++) {
if (pProgramHeader[i].p_type == PT_DYNAMIC) {
pDynamicTableHeader = &pProgramHeader[i];
break;
}
}
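// A statically linked ELF has no PT_DYNAMIC segment; stop instead of dereferencing NULL
if (pDynamicTableHeader == NULL) {
printf("No PT_DYNAMIC segment found\n");
return;
}
pDynamicTable = (Elf_Dyn *) (pImageBuffer + pDynamicTableHeader->p_vaddr);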
size_t dynamicItemNum = pDynamicTableHeader->p_filesz / sizeof(Elf_Dyn);
Elf_Rel *pRelocationTable =NULL;
size_t relocationItemNum=0;
Elf_Rel *pJmpRelocationTable = (Elf_Rel *) (pImageBuffer + getDynamicTableValueByType(pDynamicTable, dynamicItemNum,DT_JMPREL));
size_t jmpRelocationItemNum=0;
Elf_Sym *pDynamicSymbolTable = NULL;
char *pDynamicStringTable = NULL;
for (int i = 0; i <dynamicItemNum; i++) {
switch (pDynamicTable[i].d_tag) {
case DT_REL_ITEM: pRelocationTable=(Elf_Rel*)(pImageBuffer+pDynamicTable[i].d_un.d_val); break;
case DT_JMPREL: pJmpRelocationTable=(Elf_Rel*)(pImageBuffer+pDynamicTable[i].d_un.d_val); break;
case DT_REL_SZ: relocationItemNum=pDynamicTable[i].d_un.d_val/sizeof(Elf_Rel); break;
case DT_PLTRELSZ: jmpRelocationItemNum=pDynamicTable[i].d_un.d_val/sizeof(Elf_Rel); break;
case DT_SYMTAB:pDynamicSymbolTable=(Elf_Sym*)(pImageBuffer+pDynamicTable[i].d_un.d_val);break;
case DT_STRTAB:pDynamicStringTable=(char*)(pImageBuffer+pDynamicTable[i].d_un.d_val);break;
}
}
size_t neededLibraryNum=0;
const char** neededLibrary=getNeededLibraryPath(pImageBuffer,pDynamicTable,dynamicItemNum,&neededLibraryNum);
fixRelocationItem(pRelocationTable,relocationItemNum,pImageBuffer,pDynamicStringTable,pDynamicSymbolTable,neededLibrary,neededLibraryNum);
fixRelocationItem(pJmpRelocationTable,jmpRelocationItemNum,pImageBuffer,pDynamicStringTable,pDynamicSymbolTable,neededLibrary,neededLibraryNum);
typedef void (*VoidFunctionPtr)();
VoidFunctionPtr entry=(VoidFunctionPtr)(pImageBuffer+pElfHeader->e_entry);
printf("Load ELF success! Jump to entry point: %p\n",(void*)entry);
entry();
printf("Come back\n");
}
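The loader above patches relative relocations with *fixItem+=baseAddr, which matches the REL format, where the addend is stored in the slot being patched. In the RELA format the addend travels in the relocation entry itself. The sketch below contrasts the two on a plain byte buffer; DemoRel, DemoRela and both apply functions are hypothetical and use 32-bit fields throughout for brevity (a real Elf64_Rela has 64-bit fields).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t r_offset; } DemoRel;                     /* addend lives in the slot */
typedef struct { uint32_t r_offset; int32_t r_addend; } DemoRela;  /* addend lives in the entry */

/* REL: new value = load base + value already stored at r_offset. */
void applyRelativeRel(uint8_t *image, uint32_t base, const DemoRel *rel) {
    uint32_t value;
    assert(image != NULL && rel != NULL);
    memcpy(&value, image + rel->r_offset, sizeof value);   /* memcpy avoids unaligned access */
    value += base;
    memcpy(image + rel->r_offset, &value, sizeof value);
}

/* RELA: new value = load base + r_addend; the old slot contents are ignored. */
void applyRelativeRela(uint8_t *image, uint32_t base, const DemoRela *rela) {
    assert(image != NULL && rela != NULL);
    uint32_t value = base + (uint32_t)rela->r_addend;
    memcpy(image + rela->r_offset, &value, sizeof value);
}
```

With a base of 0x56550000, a REL slot containing 0x1234 becomes 0x56551234, while a RELA entry with addend 0x20 writes 0x56550020 regardless of what the slot held before.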
The result is as follows
References
ELF文件格式
ELF文件格式解析
《程序员的自我修养》
ELF加载器的原理与实现
【内核】ELF 文件执行流程
说一下Linux可执行文件的格式,ELF格式
ELF解析07_哈希表, 导出表
ELF 通过 Sysv Hash & Gnu Hash 查找符号的实现及对比
[翻译]GNU Hash ELF Sections