I. Viewing the encoding online
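II. ELF header
ELF (Executable and Linkable Format) is a file format used for executables, object code, shared libraries, and core dumps. An ELF file is made up of four parts: the ELF header, the program header table, the sections, and the section header table. In practice a file does not have to contain all of them, and they do not have to appear in the order shown. Only the ELF header has a fixed position, at the start of the file; the location and size of every other part are given by fields in the ELF header. The rest of this article walks through a concrete example.
1. Source code
We will use the following code to analyze the ELF format: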
    #include <stdio.h>

    int global_init_var = 84;
    int global_uninit_var;

    void func1(int i)
    {
        printf("%d\n", i);
    }

    int main(void)
    {
        static int static_var = 85;
        static int static_var2;
        int a = 1;
        int b; /* left uninitialized, so its value is indeterminate */
        func1(static_var + static_var2 + a + b);
        return 0;
    }
Compile:
    # First generate the object file a.o
    gcc -c a.c
    # Build the executable
    gcc -o a a.c
    # Run it
    ./a
    86
2. Analysis
2.1 ELF header structure
An ELF file begins with the ELF header. Its 64-bit layout is:
    /* Type for a 16-bit quantity. */
    typedef uint16_t Elf64_Half;

    /* Types for signed and unsigned 32-bit quantities. */
    typedef uint32_t Elf64_Word;
    typedef int32_t  Elf64_Sword;

    /* Types for signed and unsigned 64-bit quantities. */
    typedef uint64_t Elf64_Xword;
    typedef int64_t  Elf64_Sxword;

    /* Type of addresses. */
    typedef uint64_t Elf64_Addr;

    /* Type of file offsets. */
    typedef uint64_t Elf64_Off;

    /* Type for section indices, which are 16-bit quantities. */
    typedef uint16_t Elf64_Section;

    /* Type for version symbol information. */
    typedef Elf64_Half Elf64_Versym;

    #define EI_NIDENT (16)

    typedef struct
    {
      unsigned char e_ident[EI_NIDENT]; /* Magic number and other info, 16 bytes */
      Elf64_Half    e_type;             /* Object file type, 2 bytes */
      Elf64_Half    e_machine;          /* Architecture, 2 bytes */
      Elf64_Word    e_version;          /* Object file version, 4 bytes */
      Elf64_Addr    e_entry;            /* Entry point virtual address, 8 bytes */
      Elf64_Off     e_phoff;            /* Program header table file offset, 8 bytes */
      Elf64_Off     e_shoff;            /* Section header table file offset, 8 bytes */
      Elf64_Word    e_flags;            /* Processor-specific flags, 4 bytes */
      Elf64_Half    e_ehsize;           /* ELF header size in bytes, 2 bytes */
      Elf64_Half    e_phentsize;        /* Program header table entry size, 2 bytes */
      Elf64_Half    e_phnum;            /* Program header table entry count, 2 bytes */
      Elf64_Half    e_shentsize;        /* Section header table entry size, 2 bytes */
      Elf64_Half    e_shnum;            /* Section header table entry count, 2 bytes */
      Elf64_Half    e_shstrndx;         /* Section header string table index, 2 bytes */
    } Elf64_Ehdr; /* 64 bytes in total */
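As a quick sanity check on the byte counts in the comments, the sketch below redeclares the same layout with <stdint.h> types and checks a few field offsets. The names DemoElf64Ehdr and check_ehdr_layout are made up for this example, and the checks assume a typical ABI in which the compiler inserts no padding into this struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EI_NIDENT 16

/* Same field order as Elf64_Ehdr above, written with <stdint.h> types.
   DemoElf64Ehdr is a hypothetical name used only in this sketch. */
typedef struct {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} DemoElf64Ehdr;

/* Hypothetical helper: aborts if the layout differs from the byte counts
   listed in the comments of the structure above. */
void check_ehdr_layout(void)
{
    assert(sizeof(DemoElf64Ehdr) == 64);
    assert(offsetof(DemoElf64Ehdr, e_type) == 16);
    assert(offsetof(DemoElf64Ehdr, e_entry) == 24);
    assert(offsetof(DemoElf64Ehdr, e_shoff) == 40);
    assert(offsetof(DemoElf64Ehdr, e_shentsize) == 58);
    assert(offsetof(DemoElf64Ehdr, e_shnum) == 60);
    assert(offsetof(DemoElf64Ehdr, e_shstrndx) == 62);
}
```

Calling check_ehdr_layout() once at program start is enough; it does nothing when the layout is as expected.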
2.2 readelf output
First, we use readelf to print the ELF header:
    readelf -h a.o
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              REL (Relocatable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x0
      Start of program headers:          0 (bytes into file)
      Start of section headers:          1176 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           0 (bytes)
      Number of program headers:         0
      Size of section headers:           64 (bytes)
      Number of section headers:         14
      Section header string table index: 13
From this output, the file contains 14 section headers; the section header table starts at byte 1176 (0x498) of the file, and each section header is 64 bytes.
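The Magic line in the readelf output is the 16-byte e_ident array: bytes 0 to 3 are the magic number, byte 4 is the class, and byte 5 is the data encoding. The sketch below checks those bytes by hand. The helper names are invented for this example; the index and value constants mirror the ones in <elf.h>.

```c
#include <assert.h>
#include <string.h>

/* Indices into e_ident and the values of interest (same as in <elf.h>). */
enum { EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6 };
enum { ELFCLASS64 = 2, ELFDATA2LSB = 1 };

/* Returns 1 when ident starts with the bytes 7f 45 4c 46 ("\x7f" "ELF"). */
int has_elf_magic(const unsigned char *ident)
{
    assert(ident != NULL);
    /* The literal is split so that 'E' is not parsed as a hex digit. */
    return memcmp(ident, "\x7f" "ELF", 4) == 0;
}

/* Returns 1 for a 64-bit, little-endian ELF identification block. */
int is_elf64_little_endian(const unsigned char *ident)
{
    return has_elf_magic(ident)
        && ident[EI_CLASS] == ELFCLASS64
        && ident[EI_DATA] == ELFDATA2LSB;
}
```

For the Magic bytes printed above (7f 45 4c 46 02 01 01 ...), both helpers return 1, which agrees with the ELF64 and little endian lines of the readelf output.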
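2.3 Analyzing the ELF header bytes
Parsing the raw bytes of the file against the structure above gives the result shown in the figure below:
The figure confirms that the bytes match the header and section information printed by readelf.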
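The same byte-level reading can be sketched in code. The example below pulls the section-table fields out of a raw 64-byte header, decoding little-endian values explicitly so that it does not depend on the host byte order. SectionTableInfo and read_section_table_info are hypothetical names; the offsets 40, 58, 60, and 62 follow from the field sizes of Elf64_Ehdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit value from two bytes. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a little-endian 64-bit value from eight bytes. */
static uint64_t le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical container for the fields that locate the section headers. */
typedef struct {
    uint64_t shoff;     /* e_shoff     at byte 40 */
    uint16_t shentsize; /* e_shentsize at byte 58 */
    uint16_t shnum;     /* e_shnum     at byte 60 */
    uint16_t shstrndx;  /* e_shstrndx  at byte 62 */
} SectionTableInfo;

/* ehdr must point at the first 64 bytes of a little-endian ELF64 file. */
SectionTableInfo read_section_table_info(const unsigned char *ehdr)
{
    SectionTableInfo info;
    assert(ehdr != NULL);
    info.shoff = le64(ehdr + 40);
    info.shentsize = le16(ehdr + 58);
    info.shnum = le16(ehdr + 60);
    info.shstrndx = le16(ehdr + 62);
    return info;
}
```

Fed the first 64 bytes of a.o, this should yield the values readelf reports: 0x498, 64, 14, and 13.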
III. ELF sections
3.1 Section header structure
    typedef struct
    {
      Elf64_Word  sh_name;      /* Section name (string tbl index), 4 bytes */
      Elf64_Word  sh_type;      /* Section type, 4 bytes */
      Elf64_Xword sh_flags;     /* Section flags, 8 bytes */
      Elf64_Addr  sh_addr;      /* Section virtual addr at execution, 8 bytes */
      Elf64_Off   sh_offset;    /* Section file offset, 8 bytes */
      Elf64_Xword sh_size;      /* Section size in bytes, 8 bytes */
      Elf64_Word  sh_link;      /* Link to another section, 4 bytes */
      Elf64_Word  sh_info;      /* Additional section information, 4 bytes */
      Elf64_Xword sh_addralign; /* Section alignment, 8 bytes */
      Elf64_Xword sh_entsize;   /* Entry size if section holds table, 8 bytes */
    } Elf64_Shdr; /* (4+4)+8+8+8+8+(4+4)+8+8 = 64 bytes in total */
    /* Legal values for sh_type (section type). */
    #define SHT_NULL           0          /* Section header table entry unused */
    #define SHT_PROGBITS       1          /* Program data */
    #define SHT_SYMTAB         2          /* Symbol table */
    #define SHT_STRTAB         3          /* String table */
    #define SHT_RELA           4          /* Relocation entries with addends */
    #define SHT_HASH           5          /* Symbol hash table */
    #define SHT_DYNAMIC        6          /* Dynamic linking information */
    #define SHT_NOTE           7          /* Notes */
    #define SHT_NOBITS         8          /* Program space with no data (bss) */
    #define SHT_REL            9          /* Relocation entries, no addends */
    #define SHT_SHLIB          10         /* Reserved */
    #define SHT_DYNSYM         11         /* Dynamic linker symbol table */
    #define SHT_INIT_ARRAY     14         /* Array of constructors */
    #define SHT_FINI_ARRAY     15         /* Array of destructors */
    #define SHT_PREINIT_ARRAY  16         /* Array of pre-constructors */
    #define SHT_GROUP          17         /* Section group */
    #define SHT_SYMTAB_SHNDX   18         /* Extended section indices */
    #define SHT_NUM            19         /* Number of defined types. */
    #define SHT_LOOS           0x60000000 /* Start OS-specific. */
    #define SHT_GNU_ATTRIBUTES 0x6ffffff5 /* Object attributes. */
    #define SHT_GNU_HASH       0x6ffffff6 /* GNU-style hash table. */
    #define SHT_GNU_LIBLIST    0x6ffffff7 /* Prelink library list */
    #define SHT_CHECKSUM       0x6ffffff8 /* Checksum for DSO content. */
    #define SHT_LOSUNW         0x6ffffffa /* Sun-specific low bound. */
    #define SHT_SUNW_move      0x6ffffffa
    #define SHT_SUNW_COMDAT    0x6ffffffb
    #define SHT_SUNW_syminfo   0x6ffffffc
    #define SHT_GNU_verdef     0x6ffffffd /* Version definition section. */
    #define SHT_GNU_verneed    0x6ffffffe /* Version needs section. */
    #define SHT_GNU_versym     0x6fffffff /* Version symbol table. */
    #define SHT_HISUNW         0x6fffffff /* Sun-specific high bound. */
    #define SHT_HIOS           0x6fffffff /* End OS-specific type */
    #define SHT_LOPROC         0x70000000 /* Start of processor-specific */
    #define SHT_HIPROC         0x7fffffff /* End of processor-specific */
    #define SHT_LOUSER         0x80000000 /* Start of application-specific */
    #define SHT_HIUSER         0x8fffffff /* End of application-specific */

    /* Legal values for sh_flags (section flags). */
    #define SHF_WRITE            (1 << 0)   /* Writable */
    #define SHF_ALLOC            (1 << 1)   /* Occupies memory during execution */
    #define SHF_EXECINSTR        (1 << 2)   /* Executable */
    #define SHF_MERGE            (1 << 4)   /* Might be merged */
    #define SHF_STRINGS          (1 << 5)   /* Contains nul-terminated strings */
    #define SHF_INFO_LINK        (1 << 6)   /* `sh_info' contains SHT index */
    #define SHF_LINK_ORDER       (1 << 7)   /* Preserve order after combining */
    #define SHF_OS_NONCONFORMING (1 << 8)   /* Non-standard OS specific handling required */
    #define SHF_GROUP            (1 << 9)   /* Section is member of a group. */
    #define SHF_TLS              (1 << 10)  /* Section hold thread-local data. */
    #define SHF_COMPRESSED       (1 << 11)  /* Section with compressed data. */
    #define SHF_MASKOS           0x0ff00000 /* OS-specific. */
    #define SHF_MASKPROC         0xf0000000 /* Processor-specific */
    #define SHF_ORDERED          (1 << 30)  /* Special ordering requirement (Solaris). */
    #define SHF_EXCLUDE          (1U << 31) /* Section is excluded unless referenced or allocated (Solaris). */
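readelf prints sh_flags as a short string of letters, one per set bit (W for SHF_WRITE, A for SHF_ALLOC, X for SHF_EXECINSTR, and so on). The sketch below shows one way to build that string for the generic bits listed above. The function name and the table are assumptions of this example, not readelf's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One letter per generic SHF_* bit, ordered from the lowest bit upward. */
static const struct {
    uint64_t bit;
    char letter;
} flag_letters[] = {
    {1u << 0, 'W'},  /* SHF_WRITE */
    {1u << 1, 'A'},  /* SHF_ALLOC */
    {1u << 2, 'X'},  /* SHF_EXECINSTR */
    {1u << 4, 'M'},  /* SHF_MERGE */
    {1u << 5, 'S'},  /* SHF_STRINGS */
    {1u << 6, 'I'},  /* SHF_INFO_LINK */
    {1u << 7, 'L'},  /* SHF_LINK_ORDER */
    {1u << 8, 'O'},  /* SHF_OS_NONCONFORMING */
    {1u << 9, 'G'},  /* SHF_GROUP */
    {1u << 10, 'T'}, /* SHF_TLS */
    {1u << 11, 'C'}, /* SHF_COMPRESSED */
};

/* Hypothetical helper: writes the letters for the set bits into out,
   truncating if out is too small, and returns out. */
char *section_flags_to_string(uint64_t flags, char *out, size_t out_size)
{
    size_t n = 0;
    assert(out != NULL && out_size > 0);
    for (size_t i = 0; i < sizeof flag_letters / sizeof flag_letters[0]; i++) {
        if ((flags & flag_letters[i].bit) && n + 1 < out_size)
            out[n++] = flag_letters[i].letter;
    }
    out[n] = '\0';
    return out;
}
```

With this mapping, SHF_ALLOC | SHF_EXECINSTR (0x6) becomes "AX" and SHF_WRITE | SHF_ALLOC (0x3) becomes "WA".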
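3.2 Section output from the command line
As shown above, the file has 14 section headers (e_shnum); the section header table starts at file offset 0x498 (e_shoff, 1176 in decimal), and each section header is 64 bytes (e_shentsize). Next, let's look at the section information.
First, we use readelf to print the section headers: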
    readelf -S a.o
    There are 14 section headers, starting at offset 0x498:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .text             PROGBITS         0000000000000000  00000040
           0000000000000061  0000000000000000  AX       0     0     1
      [ 2] .rela.text        RELA             0000000000000000  00000378
           0000000000000078  0000000000000018   I      11     1     8
      [ 3] .data             PROGBITS         0000000000000000  000000a4
           0000000000000008  0000000000000000  WA       0     0     4
      [ 4] .bss              NOBITS           0000000000000000  000000ac
           0000000000000004  0000000000000000  WA       0     0     4
      [ 5] .rodata           PROGBITS         0000000000000000  000000ac
           0000000000000004  0000000000000000   A       0     0     1
      [ 6] .comment          PROGBITS         0000000000000000  000000b0
           000000000000002b  0000000000000001  MS       0     0     1
      [ 7] .note.GNU-stack   PROGBITS         0000000000000000  000000db
           0000000000000000  0000000000000000           0     0     1
      [ 8] .note.gnu.propert NOTE             0000000000000000  000000e0
           0000000000000020  0000000000000000   A       0     0     8
      [ 9] .eh_frame         PROGBITS         0000000000000000  00000100
           0000000000000058  0000000000000000   A       0     0     8
      [10] .rela.eh_frame    RELA             0000000000000000  000003f0
           0000000000000030  0000000000000018   I      11     9     8
      [11] .symtab           SYMTAB           0000000000000000  00000158
           00000000000001b0  0000000000000018          12    12     8
      [12] .strtab           STRTAB           0000000000000000  00000308
           0000000000000070  0000000000000000           0     0     1
      [13] .shstrtab         STRTAB           0000000000000000  00000420
           0000000000000074  0000000000000000           0     0     1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      l (large), p (processor specific)
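.strtab (string table): the string table, which holds symbol names.
.shstrtab (section header string table): the string table that holds section names.
.text: the code section.
readelf lists 14 sections along with their details. Let's check whether the bytes at file offset 0x498 match this output: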
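3.3 Analyzing the section headers in binary
As noted above, the file has 14 section headers (e_shnum); the section header table starts at file offset 0x498 (e_shoff, 1176 in decimal), each section header is 64 bytes (e_shentsize), and the table therefore ends at 0x498 + 14*64 = 0x818.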
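Locating an individual section header, or the end of the table, is plain arithmetic on e_shoff, e_shentsize, and e_shnum. A minimal sketch, with function names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* File offset of section header number `index`. */
uint64_t section_header_offset(uint64_t shoff, uint16_t shentsize,
                               uint16_t shnum, uint16_t index)
{
    assert(index < shnum); /* the index must refer to an existing header */
    return shoff + (uint64_t)shentsize * index;
}

/* File offset one past the last byte of the section header table. */
uint64_t section_header_table_end(uint64_t shoff, uint16_t shentsize,
                                  uint16_t shnum)
{
    return shoff + (uint64_t)shentsize * shnum;
}
```

With the values from this file (0x498, 64, 14), index 13 gives 0x7d8 and the table end is 0x818.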
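From the Elf64_Shdr structure, sh_name is a byte offset into the .shstrtab section, so we look at .shstrtab first. It is section 13, so its section header starts at file offset 0x498+64*13=0x7d8.
Next, the .text section: its section header starts at file offset 0x498+64*1=0x4d8 (the table starts at 0x498 and each section header is 64 bytes).
Then the .strtab section (the string table): its section header starts at file offset 0x498+64*12=0x798 (the table starts at 0x498 and each section header is 64 bytes).
The contents of .strtab, 0x70 bytes starting at file offset 0x308, are as follows: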
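A string table is a run of NUL-terminated strings, and a name is referenced by its byte offset into that run. The sketch below looks a name up with bounds checking. The function name is an assumption of this example, and the sample table in the usage note is illustrative only; it is not the actual content of .strtab in a.o.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the NUL-terminated string at byte offset `off` of the table,
   or NULL if `off` is out of range or the string is not terminated
   inside the table. */
const char *strtab_name(const char *tab, size_t tab_size, size_t off)
{
    assert(tab != NULL);
    if (off >= tab_size)
        return NULL;
    if (memchr(tab + off, '\0', tab_size - off) == NULL)
        return NULL;
    return tab + off;
}
```

For a hypothetical 16-byte table "\0a.c\0func1\0main" (including the final NUL), offset 1 yields "a.c", offset 5 yields "func1", and offset 11 yields "main". Offset 0 conventionally holds the empty string.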
IV. Symbol table
4.1 Symbol table structure
    /* Symbol table entry. */
    typedef struct
    {
      Elf64_Word    st_name;  /* Symbol name (string tbl index), 4 bytes */
      unsigned char st_info;  /* Symbol type and binding, 1 byte */
      unsigned char st_other; /* Symbol visibility, 1 byte */
      Elf64_Section st_shndx; /* Section index, 2 bytes */
      Elf64_Addr    st_value; /* Symbol value, 8 bytes */
      Elf64_Xword   st_size;  /* Symbol size, 8 bytes */
    } Elf64_Sym;

    /* The syminfo section if available contains additional information
       about every dynamic symbol. */
    typedef struct
    {
      Elf64_Half si_boundto; /* Direct bindings, symbol bound to */
      Elf64_Half si_flags;   /* Per symbol flags */
    } Elf64_Syminfo;

    /* Possible values for si_boundto. */
    #define SYMINFO_BT_SELF       0xffff /* Symbol bound to self */
    #define SYMINFO_BT_PARENT     0xfffe /* Symbol bound to parent */
    #define SYMINFO_BT_LOWRESERVE 0xff00 /* Beginning of reserved entries */

    /* Possible bitmasks for si_flags. */
    #define SYMINFO_FLG_DIRECT   0x0001 /* Direct bound symbol */
    #define SYMINFO_FLG_PASSTHRU 0x0002 /* Pass-thru symbol for translator */
    #define SYMINFO_FLG_COPY     0x0004 /* Symbol is a copy-reloc */
    #define SYMINFO_FLG_LAZYLOAD 0x0008 /* Symbol bound to object to be lazy loaded */

    /* Syminfo version values. */
    #define SYMINFO_NONE    0
    #define SYMINFO_CURRENT 1
    #define SYMINFO_NUM     2
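The one-byte st_info field packs two values: the binding in the upper four bits and the type in the lower four bits. The sketch below unpacks them. The function names are made up for this example; the STB_* and STT_* values mirror those defined in <elf.h>.

```c
#include <assert.h>

/* Symbol bindings and types, with the same values as in <elf.h>. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2, STT_SECTION = 3, STT_FILE = 4 };

/* Binding: upper four bits of st_info. */
unsigned st_bind(unsigned char info)
{
    return info >> 4;
}

/* Type: lower four bits of st_info. */
unsigned st_type(unsigned char info)
{
    return info & 0xf;
}

/* Inverse operation: pack a binding and a type into one st_info byte. */
unsigned char st_make_info(unsigned bind, unsigned type)
{
    assert(bind <= 0xf && type <= 0xf);
    return (unsigned char)((bind << 4) | type);
}

/* Name of the symbol type, in the spelling readelf uses. */
const char *st_type_name(unsigned char info)
{
    switch (st_type(info)) {
    case STT_NOTYPE:  return "NOTYPE";
    case STT_OBJECT:  return "OBJECT";
    case STT_FUNC:    return "FUNC";
    case STT_SECTION: return "SECTION";
    case STT_FILE:    return "FILE";
    default:          return "OTHER"; /* types not covered by this sketch */
    }
}
```

An st_info byte of 0x12, for instance, decodes to binding GLOBAL and type FUNC, and 0x03 decodes to LOCAL and SECTION.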
4.2 Symbol output from the command line
    readelf -s a.o
    Symbol table '.symtab' contains 18 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS a.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
         6: 0000000000000004     4 OBJECT  LOCAL  DEFAULT    3 static_var.2321
         7: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    4 static_var2.2322
         8: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
         9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
        10: 0000000000000000     0 SECTION LOCAL  DEFAULT    9
        11: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
        12: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    3 global_init_var
        13: 0000000000000004     4 OBJECT  GLOBAL DEFAULT  COM global_uninit_var
        14: 0000000000000000    40 FUNC    GLOBAL DEFAULT    1 func1
        15: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
        16: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
        17: 0000000000000028    57 FUNC    GLOBAL DEFAULT    1 main
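The Ndx column shows st_shndx. Most values are ordinary section indices, but a few reserved values have special meanings, which readelf prints as UND, ABS, and COM. The sketch below reproduces that mapping for the three reserved values that appear in this output. format_ndx is a hypothetical helper; the SHN_* values mirror <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Reserved section indices, with the same values as in <elf.h>. */
enum { SHN_UNDEF = 0, SHN_ABS = 0xfff1, SHN_COMMON = 0xfff2 };

/* Hypothetical helper: writes the Ndx column text for st_shndx into out
   and returns out. */
char *format_ndx(uint16_t shndx, char *out, size_t out_size)
{
    assert(out != NULL && out_size >= 6); /* up to 5 digits plus NUL */
    switch (shndx) {
    case SHN_UNDEF:  snprintf(out, out_size, "UND"); break;
    case SHN_ABS:    snprintf(out, out_size, "ABS"); break;
    case SHN_COMMON: snprintf(out, out_size, "COM"); break;
    default:         snprintf(out, out_size, "%u", (unsigned)shndx); break;
    }
    return out;
}
```

In the table above, printf is UND because it is defined in another object, a.c is ABS, and global_uninit_var is COM, meaning a common symbol that has not yet been assigned to a section.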
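4.3 Analyzing the symbol table in binary
The .symtab section (the symbol table) is section 11, so its section header starts at file offset 0x498+64*11=0x758 (the table starts at 0x498 and each section header is 64 bytes).
From the figure above, the section is 0x1b0 bytes long (sh_size) and each entry is 0x18 bytes (sh_entsize), so it holds 0x1b0/0x18 = 18 symbols, which matches the 18 symbols printed by readelf.
The contents of .symtab, 0x1b0 bytes starting at file offset 0x158, are as follows: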
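The entry count and the position of each entry follow from sh_offset, sh_size, and sh_entsize. A small sketch of that arithmetic, with function names chosen for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size entries in a table section. */
uint64_t table_entry_count(uint64_t sh_size, uint64_t sh_entsize)
{
    assert(sh_entsize != 0); /* only valid for sections that hold a table */
    return sh_size / sh_entsize;
}

/* File offset of entry number `index` within the table section. */
uint64_t table_entry_offset(uint64_t sh_offset, uint64_t sh_entsize,
                            uint64_t index)
{
    return sh_offset + sh_entsize * index;
}
```

For .symtab this gives 0x1b0 / 0x18 = 18 entries, and entry 17 (main) would start at 0x158 + 17 * 0x18 = 0x2f0. The same division applied to .rela.text (0x78 / 0x18) gives 5 relocation entries.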
程序员的自我修养:链接、装载与库 (The Self-Cultivation of Programmers: Linking, Loading, and Libraries), reading notes 2: https://developer.aliyun.com/article/1597080