1. Abstract
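ELF (Executable and Linking Format) is the file format specification for executable binary files and object files (shared objects and core dumps use it as well).
An ELF file contains (1) an ELF header, (2) a program header table, and (3) a section header table. The ELF header is always at the very start of the file and records the file offsets of the other two tables; the section header table is usually near the end of the file rather than at the beginning.
The related definitions are in "/usr/include/elf.h", which has both 32-bit and 64-bit variants; the examples below all use the 32-bit ones.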
2. ELF header
To view the contents of the ELF header, use readelf:
$ readelf -h /bin/ls
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0xc3f0
Start of program headers: 52 (bytes into file)
Start of section headers: 95220 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 28
Section header string table index: 27
All readelf does here is parse the first bytes of the /bin/ls binary as an Elf32_Ehdr.
Elf32_Ehdr is defined in "/usr/include/elf.h" as:
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf32_Half e_type; /* Object file type */
Elf32_Half e_machine; /* Architecture */
Elf32_Word e_version; /* Object file version */
Elf32_Addr e_entry; /* Entry point virtual address */
Elf32_Off e_phoff; /* Program header table file offset */
Elf32_Off e_shoff; /* Section header table file offset */
Elf32_Word e_flags; /* Processor-specific flags */
Elf32_Half e_ehsize; /* ELF header size in bytes */
Elf32_Half e_phentsize; /* Program header table entry size */
Elf32_Half e_phnum; /* Program header table entry count */
Elf32_Half e_shentsize; /* Section header table entry size */
Elf32_Half e_shnum; /* Section header table entry count */
Elf32_Half e_shstrndx; /* Section header string table index */
} Elf32_Ehdr;
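Dumping the start of /bin/ls shows the raw bytes behind the fields above. The header itself is only the first 52 bytes (e_ehsize); the last 12 bytes of this 64-byte dump already belong to the first program header.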
$ hexdump -n 64 -C /bin/ls
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 02 00 28 00 01 00 00 00 f0 c3 00 00 34 00 00 00 |..(.........4...|
00000020 f4 73 01 00 02 00 00 05 34 00 20 00 09 00 28 00 |.s......4. ...(.|
00000030 1c 00 1b 00 01 00 00 70 c0 60 01 00 c0 e0 01 00 |.......p.`......|
The actual Elf32_Ehdr values are (multi-byte fields are stored little endian):
e_ident: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
e_type: 02 00 = 0x0002
e_machine: 28 00 = 0x0028
e_version: 01 00 00 00 = 0x00000001
e_entry: f0 c3 00 00 = 0x0000c3f0
e_phoff: 34 00 00 00 = 0x00000034 = 52 (bytes)
e_shoff: f4 73 01 00 = 0x000173f4 = 95220 (bytes)
e_flags: 02 00 00 05 = 0x05000002
e_ehsize: 34 00 = 0x0034 = 52 (bytes)
e_phentsize: 20 00 = 0x0020 = 32 (bytes)
e_phnum: 09 00 = 0x0009 = 9
e_shentsize: 28 00 = 0x0028 = 40 (bytes)
e_shnum: 1c 00 = 0x001c = 28
e_shstrndx: 1b 00 = 0x001b = 27
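e_ident holds the ELF magic number and other information. The first 4 bytes of an ELF file are always 7f 45 4c 46, that is 0x7f followed by the ASCII characters "ELF". In this example byte 4 (01) is the class, ELFCLASS32; byte 5 (01) is the data encoding, ELFDATA2LSB (little endian); byte 6 (01) is the version, EV_CURRENT.
e_type is one of the following: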
- ET_REL (1) : relocatable file
- ET_EXEC (2) : executable file
- ET_DYN (3) : shared object file
- ET_CORE (4) : core file
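The field-by-field decoding above can be reproduced with a few lines of Python. This is only a minimal sketch and not part of the original notes: it embeds the first 52 bytes from the hexdump output above and unpacks them with the standard struct module. The names EHDR_BYTES, EHDR_FORMAT, EHDR_FIELDS and parse_ehdr are made up for illustration, and the format string assumes a 32-bit little-endian file like this one.

```python
import struct

# First 52 bytes of /bin/ls, copied from the hexdump output above.
EHDR_BYTES = bytes.fromhex(
    "7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 "
    "02 00 28 00 01 00 00 00 f0 c3 00 00 34 00 00 00 "
    "f4 73 01 00 02 00 00 05 34 00 20 00 09 00 28 00 "
    "1c 00 1b 00"
)

# Elf32_Ehdr layout: 16-byte e_ident, then Elf32_Half (H) and
# Elf32_Word/Addr/Off (I) fields. "<" means little endian, no padding.
EHDR_FORMAT = "<16sHHIIIIIHHHHHH"
EHDR_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)

def parse_ehdr(data):
    """Unpack a 32-bit little-endian ELF header into a dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    values = struct.unpack_from(EHDR_FORMAT, data)
    return dict(zip(EHDR_FIELDS, values))

ehdr = parse_ehdr(EHDR_BYTES)
for name in EHDR_FIELDS[1:]:
    print(f"{name:12} = {ehdr[name]:#x}")
```

To try it on a real file, the same function could be fed the first 52 bytes read from disk, for example open(path, "rb").read(52), provided the file is a 32-bit little-endian ELF.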
3. Program Header
The program header table starts at the file offset given by "Start of program headers" (e_phoff) in the ELF header.
Its total size is "Size of program headers" (e_phentsize) multiplied by "Number of program headers" (e_phnum).
In the "/bin/ls" example, the table starts at offset 52 bytes, each header is 32 bytes, and there are 9 headers.
Decode it with readelf:
$ readelf --program-headers /bin/ls
Elf file type is EXEC (Executable file)
Entry point 0xc3f0
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x0160c0 0x0001e0c0 0x0001e0c0 0x00028 0x00028 R 0x4
PHDR 0x000034 0x00008034 0x00008034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x00008154 0x00008154 0x00019 0x00019 R 0x1
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
LOAD 0x000000 0x00008000 0x00008000 0x160ec 0x160ec R E 0x8000
LOAD 0x016edc 0x00026edc 0x00026edc 0x003f8 0x0108c RW 0x8000
DYNAMIC 0x016ee8 0x00026ee8 0x00026ee8 0x00118 0x00118 RW 0x4
NOTE 0x000170 0x00008170 0x00008170 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
GNU_RELRO 0x016edc 0x00026edc 0x00026edc 0x00124 0x00124 R 0x1
Section to Segment mapping:
Segment Sections...
00 .ARM.exidx
01
02 .interp
03 .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.exidx .eh_frame
04 .init_array .fini_array .jcr .dynamic .got .data .bss
05 .dynamic
06 .note.ABI-tag .note.gnu.build-id
07
08 .init_array .fini_array .jcr .dynamic
Each program header corresponds to this struct in "/usr/include/elf.h":
typedef struct
{
Elf32_Word p_type; /* Segment type */
Elf32_Off p_offset; /* Segment file offset */
Elf32_Addr p_vaddr; /* Segment virtual address */
Elf32_Addr p_paddr; /* Segment physical address */
Elf32_Word p_filesz; /* Segment size in file */
Elf32_Word p_memsz; /* Segment size in memory */
Elf32_Word p_flags; /* Segment flags */
Elf32_Word p_align; /* Segment alignment */
} Elf32_Phdr;
As before, hexdump can dump the corresponding bytes; the length argument is 288 = 32 bytes * 9 headers, and the offset is 52 bytes:
$ hexdump -n 288 -s 52 -C /bin/ls
00000034 01 00 00 70 c0 60 01 00 c0 e0 01 00 c0 e0 01 00 |...p.`..........|
00000044 28 00 00 00 28 00 00 00 04 00 00 00 04 00 00 00 |(...(...........|
00000054 06 00 00 00 34 00 00 00 34 80 00 00 34 80 00 00 |....4...4...4...|
00000064 20 01 00 00 20 01 00 00 05 00 00 00 04 00 00 00 | ... ...........|
00000074 03 00 00 00 54 01 00 00 54 81 00 00 54 81 00 00 |....T...T...T...|
00000084 19 00 00 00 19 00 00 00 04 00 00 00 01 00 00 00 |................|
00000094 01 00 00 00 00 00 00 00 00 80 00 00 00 80 00 00 |................|
000000a4 ec 60 01 00 ec 60 01 00 05 00 00 00 00 80 00 00 |.`...`..........|
000000b4 01 00 00 00 dc 6e 01 00 dc 6e 02 00 dc 6e 02 00 |.....n...n...n..|
000000c4 f8 03 00 00 8c 10 00 00 06 00 00 00 00 80 00 00 |................|
000000d4 02 00 00 00 e8 6e 01 00 e8 6e 02 00 e8 6e 02 00 |.....n...n...n..|
000000e4 18 01 00 00 18 01 00 00 06 00 00 00 04 00 00 00 |................|
000000f4 04 00 00 00 70 01 00 00 70 81 00 00 70 81 00 00 |....p...p...p...|
00000104 44 00 00 00 44 00 00 00 04 00 00 00 04 00 00 00 |D...D...........|
00000114 51 e5 74 64 00 00 00 00 00 00 00 00 00 00 00 00 |Q.td............|
00000124 00 00 00 00 00 00 00 00 06 00 00 00 04 00 00 00 |................|
00000134 52 e5 74 64 dc 6e 01 00 dc 6e 02 00 dc 6e 02 00 |R.td.n...n...n..|
00000144 24 01 00 00 24 01 00 00 04 00 00 00 01 00 00 00 |$...$...........|
For the first program header, the Elf32_Phdr values are:
p_type: 01 00 00 70 = 0x70000001
p_offset: c0 60 01 00 = 0x000160c0
p_vaddr: c0 e0 01 00 = 0x0001e0c0
p_paddr: c0 e0 01 00 = 0x0001e0c0
p_filesz: 28 00 00 00 = 0x00000028
p_memsz: 28 00 00 00 = 0x00000028
p_flags: 04 00 00 00 = 0x00000004
p_align: 04 00 00 00 = 0x00000004
Common p_type values:
- PT_LOAD (1) : loadable segment
- PT_DYNAMIC (2) : dynamic linking information
- PT_INTERP (3) : program interpreter
- PT_NOTE (4) : auxiliary information
- PT_PHDR (6) : the program header table itself
- PT_TLS (7) : thread-local storage
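The same decoding can be done for several entries at once by stepping through the table 32 bytes at a time. The following is a rough sketch, not part of the original notes: it embeds the first three program headers (96 bytes) from the hexdump output above. The names PHDR_BYTES, PT_NAMES, flags_to_str and parse_phdrs are hypothetical; 0x70000001 is the ARM-specific PT_ARM_EXIDX type from elf.h that readelf prints as EXIDX, and the flag bits are PF_X = 1, PF_W = 2, PF_R = 4.

```python
import struct

# First three program headers (3 * 32 bytes) from the hexdump output above.
PHDR_BYTES = bytes.fromhex(
    "01 00 00 70 c0 60 01 00 c0 e0 01 00 c0 e0 01 00 "
    "28 00 00 00 28 00 00 00 04 00 00 00 04 00 00 00 "
    "06 00 00 00 34 00 00 00 34 80 00 00 34 80 00 00 "
    "20 01 00 00 20 01 00 00 05 00 00 00 04 00 00 00 "
    "03 00 00 00 54 01 00 00 54 81 00 00 54 81 00 00 "
    "19 00 00 00 19 00 00 00 04 00 00 00 01 00 00 00"
)

PHDR = struct.Struct("<8I")  # Elf32_Phdr: eight little-endian 32-bit words
PHDR_FIELDS = ("p_type", "p_offset", "p_vaddr", "p_paddr",
               "p_filesz", "p_memsz", "p_flags", "p_align")

# Names for the p_type values listed above, plus PT_ARM_EXIDX.
PT_NAMES = {1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE",
            6: "PHDR", 7: "TLS", 0x70000001: "EXIDX"}

def flags_to_str(p_flags):
    """Render p_flags in readelf style, e.g. 5 becomes 'R E'."""
    return "".join(ch if p_flags & bit else " "
                   for ch, bit in (("R", 4), ("W", 2), ("E", 1)))

def parse_phdrs(data):
    """Yield one dict per 32-byte Elf32_Phdr entry in data."""
    for fields in PHDR.iter_unpack(data):
        yield dict(zip(PHDR_FIELDS, fields))

phdrs = list(parse_phdrs(PHDR_BYTES))
for ph in phdrs:
    print(PT_NAMES.get(ph["p_type"], hex(ph["p_type"])),
          hex(ph["p_offset"]), hex(ph["p_vaddr"]),
          hex(ph["p_filesz"]), flags_to_str(ph["p_flags"]))
```

For a whole file, the slice to pass to parse_phdrs would be the e_phentsize * e_phnum bytes starting at e_phoff, which is 288 bytes at offset 52 for this /bin/ls.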
In the readelf output, the data under "Section to Segment mapping" shows, for each program header, the segment it describes and the list of sections that segment contains.
For example, the 1st header has p_type EXIDX; its segment is index 00 and it contains the section ".ARM.exidx".
The 4th header has p_type LOAD; its segment is index 03 and it contains the sections ".interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.exidx .eh_frame"
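The mapping can be approximated by checking which sections fall inside each segment's address range. The sketch below makes simplifying assumptions and is not how readelf itself is implemented: it only compares virtual addresses, and it uses a subset of the sections, with addresses and sizes copied from the "readelf --section-headers" output in section 4. The names SEGMENTS, SECTIONS and sections_in_segment are made up for illustration.

```python
# (type, p_vaddr, p_memsz) for the nine program headers printed by readelf.
SEGMENTS = [
    ("EXIDX",     0x0001e0c0, 0x00028),
    ("PHDR",      0x00008034, 0x00120),
    ("INTERP",    0x00008154, 0x00019),
    ("LOAD",      0x00008000, 0x160ec),
    ("LOAD",      0x00026edc, 0x0108c),
    ("DYNAMIC",   0x00026ee8, 0x00118),
    ("NOTE",      0x00008170, 0x00044),
    ("GNU_STACK", 0x00000000, 0x00000),
    ("GNU_RELRO", 0x00026edc, 0x00124),
]

# (name, sh_addr, sh_size) for a subset of the sections of /bin/ls.
SECTIONS = [
    (".interp",            0x00008154, 0x000019),
    (".note.ABI-tag",      0x00008170, 0x000020),
    (".note.gnu.build-id", 0x00008190, 0x000024),
    (".text",              0x0000a0d0, 0x010bd0),
    (".ARM.exidx",         0x0001e0c0, 0x000028),
    (".eh_frame",          0x0001e0e8, 0x000004),
    (".init_array",        0x00026edc, 0x000004),
    (".fini_array",        0x00026ee0, 0x000004),
    (".jcr",               0x00026ee4, 0x000004),
    (".dynamic",           0x00026ee8, 0x000118),
    (".got",               0x00027000, 0x0001bc),
    (".data",              0x000271c0, 0x000114),
    (".bss",               0x000272d8, 0x000c90),
]

def sections_in_segment(vaddr, memsz):
    """Names of the sections lying entirely inside [vaddr, vaddr + memsz)."""
    return [name for name, addr, size in SECTIONS
            if vaddr <= addr and addr + size <= vaddr + memsz]

for index, (ptype, vaddr, memsz) in enumerate(SEGMENTS):
    names = " ".join(sections_in_segment(vaddr, memsz))
    print(f"{index:02d} {ptype:10} {names}")
```

For the sections included here, this range check gives the same grouping as the readelf listing, including the empty entries for PHDR and GNU_STACK.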
4. Section header
The section header table starts at the file offset given by "Start of section headers" (e_shoff).
Its size is "Size of section headers" (e_shentsize) multiplied by "Number of section headers" (e_shnum).
For "/bin/ls" the offset is 95220 bytes (0x173f4) and the size is 40 bytes * 28 entries = 1120 bytes.
Decode it with readelf:
$ readelf --section-headers /bin/ls
There are 28 section headers, starting at offset 0x173f4:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 00008154 000154 000019 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 00008170 000170 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 00008190 000190 000024 00 A 0 0 4
[ 4] .hash HASH 000081b4 0001b4 000380 04 A 6 0 4
[ 5] .gnu.hash GNU_HASH 00008534 000534 0003f8 04 A 6 0 4
[ 6] .dynsym DYNSYM 0000892c 00092c 0007d0 10 A 7 1 4
[ 7] .dynstr STRTAB 000090fc 0010fc 000592 00 A 0 0 1
[ 8] .gnu.version VERSYM 0000968e 00168e 0000fa 02 A 6 0 2
[ 9] .gnu.version_r VERNEED 00009788 001788 0000b0 00 A 7 5 4
[10] .rel.dyn REL 00009838 001838 000030 08 A 6 0 4
[11] .rel.plt REL 00009868 001868 000350 08 A 6 13 4
[12] .init PROGBITS 00009bb8 001bb8 00000c 00 AX 0 0 4
[13] .plt PROGBITS 00009bc4 001bc4 00050c 04 AX 0 0 4
[14] .text PROGBITS 0000a0d0 0020d0 010bd0 00 AX 0 0 8
[15] .fini PROGBITS 0001aca0 012ca0 000008 00 AX 0 0 4
[16] .rodata PROGBITS 0001aca8 012ca8 003418 00 A 0 0 4
[17] .ARM.exidx ARM_EXIDX 0001e0c0 0160c0 000028 00 AL 14 0 4
[18] .eh_frame PROGBITS 0001e0e8 0160e8 000004 00 A 0 0 4
[19] .init_array INIT_ARRAY 00026edc 016edc 000004 00 WA 0 0 4
[20] .fini_array FINI_ARRAY 00026ee0 016ee0 000004 00 WA 0 0 4
[21] .jcr PROGBITS 00026ee4 016ee4 000004 00 WA 0 0 4
[22] .dynamic DYNAMIC 00026ee8 016ee8 000118 08 WA 7 0 4
[23] .got PROGBITS 00027000 017000 0001bc 04 WA 0 0 4
[24] .data PROGBITS 000271c0 0171c0 000114 00 WA 0 0 8
[25] .bss NOBITS 000272d8 0172d4 000c90 00 WA 0 0 8
[26] .ARM.attributes ARM_ATTRIBUTES 00000000 0172d4 00002f 00 0 0 1
[27] .shstrtab STRTAB 00000000 017303 0000f1 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
The section header is defined as:
typedef struct
{
Elf32_Word sh_name; /* Section name (string tbl index) */
Elf32_Word sh_type; /* Section type */
Elf32_Word sh_flags; /* Section flags */
Elf32_Addr sh_addr; /* Section virtual addr at execution */
Elf32_Off sh_offset; /* Section file offset */
Elf32_Word sh_size; /* Section size in bytes */
Elf32_Word sh_link; /* Link to another section */
Elf32_Word sh_info; /* Additional section information */
Elf32_Word sh_addralign; /* Section alignment */
Elf32_Word sh_entsize; /* Entry size if section holds table */
} Elf32_Shdr;
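Combining the table location described at the start of this section with the struct above, the header of section number i sits at e_shoff + i * e_shentsize. The sketch below illustrates that arithmetic and the unpacking of one 40-byte entry. The function names are hypothetical, and because the notes do not include a raw dump of the section header table, the reader is exercised on a small synthetic image rather than on real /bin/ls bytes.

```python
import struct

SHDR = struct.Struct("<10I")  # Elf32_Shdr: ten little-endian 32-bit words
SHDR_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
               "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")

def section_header_offset(e_shoff, e_shentsize, index):
    """File offset of section header number `index`."""
    return e_shoff + index * e_shentsize

def read_section_header(image, e_shoff, e_shentsize, index):
    """Unpack one Elf32_Shdr from `image`, the raw bytes of an ELF file."""
    offset = section_header_offset(e_shoff, e_shentsize, index)
    return dict(zip(SHDR_FIELDS, SHDR.unpack_from(image, offset)))

# Values from the /bin/ls example: e_shoff = 95220, e_shentsize = 40.
print(section_header_offset(95220, 40, 27))  # where header 27 starts
print(section_header_offset(95220, 40, 28))  # first byte after the table

# Synthetic two-entry table at offset 8, only to exercise the reader.
# Entry 1 mimics .shstrtab as listed by readelf (STRTAB = 3, offset
# 0x17303, size 0xf1, alignment 1); its sh_name of 1 is an assumption.
null_entry = SHDR.pack(*[0] * 10)
strtab_entry = SHDR.pack(1, 3, 0, 0, 0x17303, 0xf1, 0, 0, 1, 0)
demo_image = bytes(8) + null_entry + strtab_entry
print(read_section_header(demo_image, 8, SHDR.size, 1))
```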
Common section types:
- SHT_PROGBITS (1) : program data
- SHT_SYMTAB (2) : symbol table
- SHT_STRTAB (3) : string table
- SHT_RELA (4) : relocation entries with addends
- SHT_HASH (5) : symbol hash table
- SHT_DYNAMIC (6) : dynamic linking information
- SHT_NOTE (7) : notes
- SHT_NOBITS (8) : occupies no space in the file (.bss)
- SHT_REL (9) : relocation entries without addends
- SHT_DYNSYM (11) : symbol table used by the dynamic linker
- SHT_INIT_ARRAY (14) : array of constructors (.init_array)
- SHT_FINI_ARRAY (15) : array of destructors (.fini_array)
In the ELF header, e_shstrndx gives the section index of the section name string table. In this example "Section header string table index" (e_shstrndx) is 27, and each section header's name (sh_name) is looked up in that section.
Section index 27 has offset = 0x017303 and size = 0xf1 (241); print it with hexdump:
$ hexdump -n 241 -s 0x017303 -C /bin/ls
00017303 00 2e 73 68 73 74 72 74 61 62 00 2e 69 6e 74 65 |..shstrtab..inte|
00017313 72 70 00 2e 6e 6f 74 65 2e 41 42 49 2d 74 61 67 |rp..note.ABI-tag|
00017323 00 2e 6e 6f 74 65 2e 67 6e 75 2e 62 75 69 6c 64 |..note.gnu.build|
00017333 2d 69 64 00 2e 67 6e 75 2e 68 61 73 68 00 2e 64 |-id..gnu.hash..d|
00017343 79 6e 73 79 6d 00 2e 64 79 6e 73 74 72 00 2e 67 |ynsym..dynstr..g|
00017353 6e 75 2e 76 65 72 73 69 6f 6e 00 2e 67 6e 75 2e |nu.version..gnu.|
00017363 76 65 72 73 69 6f 6e 5f 72 00 2e 72 65 6c 2e 64 |version_r..rel.d|
00017373 79 6e 00 2e 72 65 6c 2e 70 6c 74 00 2e 69 6e 69 |yn..rel.plt..ini|
00017383 74 00 2e 74 65 78 74 00 2e 66 69 6e 69 00 2e 72 |t..text..fini..r|
00017393 6f 64 61 74 61 00 2e 41 52 4d 2e 65 78 69 64 78 |odata..ARM.exidx|
000173a3 00 2e 65 68 5f 66 72 61 6d 65 00 2e 69 6e 69 74 |..eh_frame..init|
000173b3 5f 61 72 72 61 79 00 2e 66 69 6e 69 5f 61 72 72 |_array..fini_arr|
000173c3 61 79 00 2e 6a 63 72 00 2e 64 79 6e 61 6d 69 63 |ay..jcr..dynamic|
000173d3 00 2e 67 6f 74 00 2e 64 61 74 61 00 2e 62 73 73 |..got..data..bss|
000173e3 00 2e 41 52 4d 2e 61 74 74 72 69 62 75 74 65 73 |..ARM.attributes|
It can also be printed with readelf:
$ readelf -x27 /bin/ls
Hex dump of section '.shstrtab':
0x00000000 002e7368 73747274 6162002e 696e7465 ..shstrtab..inte
0x00000010 7270002e 6e6f7465 2e414249 2d746167 rp..note.ABI-tag
0x00000020 002e6e6f 74652e67 6e752e62 75696c64 ..note.gnu.build
0x00000030 2d696400 2e676e75 2e686173 68002e64 -id..gnu.hash..d
0x00000040 796e7379 6d002e64 796e7374 72002e67 ynsym..dynstr..g
0x00000050 6e752e76 65727369 6f6e002e 676e752e nu.version..gnu.
0x00000060 76657273 696f6e5f 72002e72 656c2e64 version_r..rel.d
0x00000070 796e002e 72656c2e 706c7400 2e696e69 yn..rel.plt..ini
0x00000080 74002e74 65787400 2e66696e 69002e72 t..text..fini..r
0x00000090 6f646174 61002e41 524d2e65 78696478 odata..ARM.exidx
0x000000a0 002e6568 5f667261 6d65002e 696e6974 ..eh_frame..init
0x000000b0 5f617272 6179002e 66696e69 5f617272 _array..fini_arr
0x000000c0 6179002e 6a637200 2e64796e 616d6963 ay..jcr..dynamic
0x000000d0 002e676f 74002e64 61746100 2e627373 ..got..data..bss
0x000000e0 002e4152 4d2e6174 74726962 75746573 ..ARM.attributes
0x000000f0 00 .
A section header's sh_name is a byte offset into this string table. For example, sh_name=1 gives the name ".shstrtab" (each name is terminated by '\0'), and sh_name=0x0b gives ".interp".
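The lookup is just "read from the offset up to the next NUL byte". The sketch below rebuilds the 241-byte table from the names visible in the dump above and resolves a few offsets. NAMES, SHSTRTAB and name_at are hypothetical names. Note that ".hash" and ".plt" do not appear as separate strings in the dump; an offset pointing into the tail of ".gnu.hash" or ".rel.plt" yields them. The offsets 0x38 and 0x77 used for that are illustrative, since the notes do not show the actual sh_name values stored in /bin/ls.

```python
# Section names in the order they appear in the .shstrtab dump above.
NAMES = [
    ".shstrtab", ".interp", ".note.ABI-tag", ".note.gnu.build-id",
    ".gnu.hash", ".dynsym", ".dynstr", ".gnu.version", ".gnu.version_r",
    ".rel.dyn", ".rel.plt", ".init", ".text", ".fini", ".rodata",
    ".ARM.exidx", ".eh_frame", ".init_array", ".fini_array", ".jcr",
    ".dynamic", ".got", ".data", ".bss", ".ARM.attributes",
]

# Rebuild the table: a leading NUL, then every name followed by a NUL.
SHSTRTAB = b"\0" + b"".join(name.encode("ascii") + b"\0" for name in NAMES)

def name_at(strtab, offset):
    """Return the NUL-terminated string that starts at `offset`."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii")

print(len(SHSTRTAB))             # total size of the rebuilt table
print(name_at(SHSTRTAB, 1))      # sh_name = 1
print(name_at(SHSTRTAB, 0x0b))   # sh_name = 0x0b
print(name_at(SHSTRTAB, 0x38))   # tail of ".gnu.hash"
print(name_at(SHSTRTAB, 0x77))   # tail of ".rel.plt"
```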
5. Symbol table
If the file has been stripped, only the dynamic symbol table (.dynsym) is left to look at.
Section header index 06 of "/bin/ls" has type DYNSYM, so that is where the dynamic symbol table is stored.
It can first be decoded with readelf:
$ readelf --syms /bin/ls
Symbol table '.dynsym' contains 125 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FUNC GLOBAL DEFAULT UND __aeabi_unwind_cpp_pr0@GCC_3.5 (2)
2: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
3: 00000000 0 NOTYPE WEAK DEFAULT UND _Jv_RegisterClasses
4: 00009c08 0 FUNC GLOBAL DEFAULT UND getpwnam@GLIBC_2.4 (3)
......
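To use hexdump instead: section header index 06 has offset 0x92c and size 0x7d0 (2000 bytes = 125 entries * 16 bytes). The command below dumps only the first two entries: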
$ hexdump -n 32 -s 0x92c -C /bin/ls
0000092c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
0000093c cd 00 00 00 00 00 00 00 00 00 00 00 12 00 00 00 |................|
The struct for a symbol table entry is:
typedef struct
{
Elf32_Word st_name; /* Symbol name (string tbl index) */
Elf32_Addr st_value; /* Symbol value */
Elf32_Word st_size; /* Symbol size */
unsigned char st_info; /* Symbol type and binding */
unsigned char st_other; /* Symbol visibility */
Elf32_Section st_shndx; /* Section index */
} Elf32_Sym;
The struct is 16 bytes. For the 2nd entry (index 1, at file offset 0x93c), the values are:
st_name: cd 00 00 00 = 0x000000cd
st_value: 00 00 00 00
st_size: 00 00 00 00
st_info: 12 = (STB_GLOBAL << 4) | STT_FUNC
st_other: 00
st_shndx: 00 00
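st_name is also an offset into a string table, but a different one: the dynamic string table (.dynstr), which is section header index 07 (the Lk column of .dynsym in the section header listing points to it).
st_info is split into the high 4 bits and the low 4 bits.
The high 4 bits are the binding:
- STB_LOCAL (0) : local symbol
- STB_GLOBAL (1) : global symbol
- STB_WEAK (2) : weak symbol
The low 4 bits are the type:
- STT_NOTYPE (0) : type not specified
- STT_OBJECT (1) : data object
- STT_FUNC (2) : function
- STT_SECTION (3) : section
- STT_FILE (4) : name of the source file associated with the object file
- STT_COMMON (5) : common (uninitialized) data block
- STT_TLS (6) : thread-local data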
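Splitting st_info and unpacking whole entries can be sketched as follows. This is an illustrative example, not part of the original notes: it embeds the two 16-byte entries from the hexdump output above, and the names DYNSYM_BYTES, SYM, BIND_NAMES, TYPE_NAMES and split_info are made up. The format string assumes the 32-bit little-endian Elf32_Sym layout shown above.

```python
import struct

# The two 16-byte .dynsym entries from the hexdump output above.
DYNSYM_BYTES = bytes.fromhex(
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "cd 00 00 00 00 00 00 00 00 00 00 00 12 00 00 00"
)

# Elf32_Sym: st_name, st_value, st_size (I), st_info, st_other (B), st_shndx (H)
SYM = struct.Struct("<IIIBBH")

BIND_NAMES = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPE_NAMES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION",
              4: "FILE", 5: "COMMON", 6: "TLS"}

def split_info(st_info):
    """Split st_info into (binding, type): high 4 bits and low 4 bits."""
    return st_info >> 4, st_info & 0x0F

for index, sym in enumerate(SYM.iter_unpack(DYNSYM_BYTES)):
    st_name, st_value, st_size, st_info, st_other, st_shndx = sym
    bind, kind = split_info(st_info)
    print(index, hex(st_name), TYPE_NAMES[kind], BIND_NAMES[bind], st_shndx)
```

The type and binding printed for entry 1 (FUNC, GLOBAL) agree with the readelf --syms listing; resolving st_name = 0xcd to a string would additionally need the bytes of .dynstr, which are not shown in these notes.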