2014/10/13

ELF (Executable and Linking Format)

1. Abstract


ELF (Executable and Linking Format) is the file format specification for executable binary files and object files.

An ELF file begins with (1) the ELF header, which records the file offsets of (2) the program header table and (3) the section header table.

The ELF-related definitions live in "/usr/include/elf.h", which provides both 32-bit and 64-bit variants. All examples below use the 32-bit ones.

2. ELF header


To inspect the contents of the ELF header, use readelf:
$ readelf -h /bin/ls
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0xc3f0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          95220 (bytes into file)
  Flags:                             0x5000002, has entry point, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         28
  Section header string table index: 27

This is really nothing more than parsing the first bytes of the /bin/ls binary as an Elf32_Ehdr.
Elf32_Ehdr is defined in "/usr/include/elf.h" as:
typedef struct
{
  unsigned char e_ident[EI_NIDENT];     /* Magic number and other info */
  Elf32_Half    e_type;                 /* Object file type */
  Elf32_Half    e_machine;              /* Architecture */
  Elf32_Word    e_version;              /* Object file version */
  Elf32_Addr    e_entry;                /* Entry point virtual address */
  Elf32_Off     e_phoff;                /* Program header table file offset */
  Elf32_Off     e_shoff;                /* Section header table file offset */
  Elf32_Word    e_flags;                /* Processor-specific flags */
  Elf32_Half    e_ehsize;               /* ELF header size in bytes */
  Elf32_Half    e_phentsize;            /* Program header table entry size */
  Elf32_Half    e_phnum;                /* Program header table entry count */
  Elf32_Half    e_shentsize;            /* Section header table entry size */
  Elf32_Half    e_shnum;                /* Section header table entry count */
  Elf32_Half    e_shstrndx;             /* Section header string table index */
} Elf32_Ehdr;

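Dumping the start of /bin/ls lets you match the raw bytes to the fields above (the header is the first 52 bytes; the last 12 bytes of this 64-byte dump already belong to the first program header):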
$ hexdump -n 64 -C /bin/ls
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 28 00 01 00 00 00  f0 c3 00 00 34 00 00 00  |..(.........4...|
00000020  f4 73 01 00 02 00 00 05  34 00 20 00 09 00 28 00  |.s......4. ...(.|
00000030  1c 00 1b 00 01 00 00 70  c0 60 01 00 c0 e0 01 00  |.......p.`......|

The Elf32_Ehdr fields map to the following values (multi-byte fields are little endian):
e_ident:     7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00
e_type:      02 00       = 0x0002
e_machine:   28 00       = 0x0028
e_version:   01 00 00 00 = 0x00000001
e_entry:     f0 c3 00 00 = 0x0000c3f0
e_phoff:     34 00 00 00 = 0x00000034 = 52 (bytes)
e_shoff:     f4 73 01 00 = 0x000173f4 = 95220 (bytes)
e_flags:     02 00 00 05 = 0x05000002
e_ehsize:    34 00       = 0x0034     = 52 (bytes)
e_phentsize: 20 00       = 0x0020     = 32 (bytes)
e_phnum:     09 00       = 0x0009
e_shentsize: 28 00       = 0x0028
e_shnum:     1c 00       = 0x001c
e_shstrndx:  1b 00       = 0x001b
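
The same field-by-field mapping can be sketched with Python's struct module. The bytes are copied from the hexdump above; the names EHDR_BYTES, EHDR_FORMAT and Elf32Ehdr are illustrative choices and not part of elf.h.

```python
import struct
from collections import namedtuple

# First 52 bytes of /bin/ls, copied from the hexdump above.
EHDR_BYTES = bytes.fromhex(
    "7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00"
    "02 00 28 00 01 00 00 00  f0 c3 00 00 34 00 00 00"
    "f4 73 01 00 02 00 00 05  34 00 20 00 09 00 28 00"
    "1c 00 1b 00"
)

# "<" = little endian, no padding; 16s = e_ident, H = Elf32_Half,
# I = Elf32_Word / Elf32_Addr / Elf32_Off (all 4 bytes in ELF32).
EHDR_FORMAT = "<16sHHIIIIIHHHHHH"

Elf32Ehdr = namedtuple(
    "Elf32Ehdr",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)

ehdr = Elf32Ehdr._make(struct.unpack(EHDR_FORMAT, EHDR_BYTES))

print("entry point:", hex(ehdr.e_entry))
print("program headers:", ehdr.e_phnum, "at offset", ehdr.e_phoff)
print("section headers:", ehdr.e_shnum, "at offset", ehdr.e_shoff)
```

The format string is 52 bytes long, which agrees with e_ehsize.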

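e_ident holds the ELF magic number and other information. The first 4 bytes of any ELF file are always 7f 45 4c 46 (0x7f followed by "ELF"). In this file the next three bytes, 01 01 01, mean ELF32, little endian and version 1, matching the Class, Data and Version lines printed by readelf.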

e_type is one of the following:

  • ET_REL (1) : relocatable file
  • ET_EXEC (2) : executable file
  • ET_DYN (3) : shared object file
  • ET_CORE (4) : core file
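
A small sketch of how the identification bytes and e_type could be checked in code. The byte positions 4, 5 and 6 are the EI_CLASS, EI_DATA and EI_VERSION indices from elf.h; the function name describe_ident and the lookup tables are made up for illustration.

```python
# e_ident of /bin/ls, copied from the hexdump of the ELF header.
E_IDENT = bytes.fromhex("7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00")

CLASS_NAMES = {1: "ELF32", 2: "ELF64"}                 # e_ident[EI_CLASS]
DATA_NAMES = {1: "little endian", 2: "big endian"}     # e_ident[EI_DATA]
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def describe_ident(e_ident):
    """Return (class, byte order, version) or raise if the magic is wrong."""
    if e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return (
        CLASS_NAMES.get(e_ident[4], "unknown"),
        DATA_NAMES.get(e_ident[5], "unknown"),
        e_ident[6],
    )


print(describe_ident(E_IDENT))
print(ET_NAMES[2])  # e_type of /bin/ls is 0x0002
```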

3. Program Header


The program header table starts at the file offset given by "Start of program headers" (e_phoff) in the ELF header.
Its total size is "Size of program headers" (e_phentsize) multiplied by "Number of program headers" (e_phnum).

In the "/bin/ls" example, the program header table starts at offset 52 bytes, each header is 32 bytes, and there are 9 headers.

Decoding it with readelf:
$ readelf --program-headers /bin/ls

Elf file type is EXEC (Executable file)
Entry point 0xc3f0
There are 9 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  EXIDX          0x0160c0 0x0001e0c0 0x0001e0c0 0x00028 0x00028 R   0x4
  PHDR           0x000034 0x00008034 0x00008034 0x00120 0x00120 R E 0x4
  INTERP         0x000154 0x00008154 0x00008154 0x00019 0x00019 R   0x1
      [Requesting program interpreter: /lib/ld-linux-armhf.so.3]
  LOAD           0x000000 0x00008000 0x00008000 0x160ec 0x160ec R E 0x8000
  LOAD           0x016edc 0x00026edc 0x00026edc 0x003f8 0x0108c RW  0x8000
  DYNAMIC        0x016ee8 0x00026ee8 0x00026ee8 0x00118 0x00118 RW  0x4
  NOTE           0x000170 0x00008170 0x00008170 0x00044 0x00044 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  GNU_RELRO      0x016edc 0x00026edc 0x00026edc 0x00124 0x00124 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     .ARM.exidx
   01
   02     .interp
   03     .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.exidx .eh_frame
   04     .init_array .fini_array .jcr .dynamic .got .data .bss
   05     .dynamic
   06     .note.ABI-tag .note.gnu.build-id
   07
   08     .init_array .fini_array .jcr .dynamic

Each program header corresponds to this struct in "/usr/include/elf.h":
typedef struct
{
  Elf32_Word    p_type;                 /* Segment type */
  Elf32_Off     p_offset;               /* Segment file offset */
  Elf32_Addr    p_vaddr;                /* Segment virtual address */
  Elf32_Addr    p_paddr;                /* Segment physical address */
  Elf32_Word    p_filesz;               /* Segment size in file */
  Elf32_Word    p_memsz;                /* Segment size in memory */
  Elf32_Word    p_flags;                /* Segment flags */
  Elf32_Word    p_align;                /* Segment alignment */
} Elf32_Phdr;

hexdump can again dump the corresponding bytes; the length argument is 288 = 32 bytes * 9 headers, and the offset is 52 bytes:
$ hexdump -n 288 -s 52 -C /bin/ls
00000034  01 00 00 70 c0 60 01 00  c0 e0 01 00 c0 e0 01 00  |...p.`..........|
00000044  28 00 00 00 28 00 00 00  04 00 00 00 04 00 00 00  |(...(...........|
00000054  06 00 00 00 34 00 00 00  34 80 00 00 34 80 00 00  |....4...4...4...|
00000064  20 01 00 00 20 01 00 00  05 00 00 00 04 00 00 00  | ... ...........|
00000074  03 00 00 00 54 01 00 00  54 81 00 00 54 81 00 00  |....T...T...T...|
00000084  19 00 00 00 19 00 00 00  04 00 00 00 01 00 00 00  |................|
00000094  01 00 00 00 00 00 00 00  00 80 00 00 00 80 00 00  |................|
000000a4  ec 60 01 00 ec 60 01 00  05 00 00 00 00 80 00 00  |.`...`..........|
000000b4  01 00 00 00 dc 6e 01 00  dc 6e 02 00 dc 6e 02 00  |.....n...n...n..|
000000c4  f8 03 00 00 8c 10 00 00  06 00 00 00 00 80 00 00  |................|
000000d4  02 00 00 00 e8 6e 01 00  e8 6e 02 00 e8 6e 02 00  |.....n...n...n..|
000000e4  18 01 00 00 18 01 00 00  06 00 00 00 04 00 00 00  |................|
000000f4  04 00 00 00 70 01 00 00  70 81 00 00 70 81 00 00  |....p...p...p...|
00000104  44 00 00 00 44 00 00 00  04 00 00 00 04 00 00 00  |D...D...........|
00000114  51 e5 74 64 00 00 00 00  00 00 00 00 00 00 00 00  |Q.td............|
00000124  00 00 00 00 00 00 00 00  06 00 00 00 04 00 00 00  |................|
00000134  52 e5 74 64 dc 6e 01 00  dc 6e 02 00 dc 6e 02 00  |R.td.n...n...n..|
00000144  24 01 00 00 24 01 00 00  04 00 00 00 01 00 00 00  |$...$...........|
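
The hexdump arguments and the offsets in the left column of the dump follow directly from three ELF header fields. A short sketch of that arithmetic, using the /bin/ls values from above (the variable names are just for illustration):

```python
# Values from the ELF header of the /bin/ls example.
e_phoff, e_phentsize, e_phnum = 52, 32, 9

table_size = e_phentsize * e_phnum      # value passed to hexdump -n
table_end = e_phoff + table_size        # first byte after the table

# File offset of each 32-byte program header entry.
entry_offsets = [e_phoff + i * e_phentsize for i in range(e_phnum)]

print(table_size, hex(table_end))
print([hex(off) for off in entry_offsets])
```

Every second line of the dump starts a new entry, so the computed offsets 0x34, 0x54, 0x74 and so on line up with its left column. The end offset 0x154 is also the file offset of the INTERP segment in the readelf output, so in this file the interpreter path directly follows the table.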

Taking the first program header, the Elf32_Phdr values are:
p_type:   01 00 00 70 = 0x70000001 (PT_ARM_EXIDX)
p_offset: c0 60 01 00 = 0x000160c0
p_vaddr:  c0 e0 01 00 = 0x0001e0c0
p_paddr:  c0 e0 01 00 = 0x0001e0c0
p_filesz: 28 00 00 00 = 0x00000028
p_memsz:  28 00 00 00 = 0x00000028
p_flags:  04 00 00 00 = 0x00000004 (PF_R)
p_align:  04 00 00 00 = 0x00000004
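
As a sketch, the first entry can be unpacked the same way in Python. The 32 bytes are copied from the first two lines of the hexdump; Elf32Phdr and flags_to_str are illustrative names. PF_X, PF_W and PF_R are the flag bits defined in elf.h, and the letters follow readelf's "Flg" column.

```python
import struct
from collections import namedtuple

# First program header of /bin/ls (32 bytes at file offset 0x34).
PHDR_BYTES = bytes.fromhex(
    "01 00 00 70 c0 60 01 00  c0 e0 01 00 c0 e0 01 00"
    "28 00 00 00 28 00 00 00  04 00 00 00 04 00 00 00"
)

Elf32Phdr = namedtuple(
    "Elf32Phdr",
    "p_type p_offset p_vaddr p_paddr p_filesz p_memsz p_flags p_align",
)

PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def flags_to_str(p_flags):
    """Render p_flags with the letters readelf uses (R, W, E)."""
    pairs = ((PF_R, "R"), (PF_W, "W"), (PF_X, "E"))
    return "".join(letter for bit, letter in pairs if p_flags & bit)


# Eight 32-bit little-endian words, in struct order.
phdr = Elf32Phdr._make(struct.unpack("<8I", PHDR_BYTES))

print(hex(phdr.p_type), hex(phdr.p_offset), hex(phdr.p_vaddr))
print(flags_to_str(phdr.p_flags))
```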

Common p_type values:

  • PT_LOAD (1) : loadable segment
  • PT_DYNAMIC (2) : dynamic linking information
  • PT_INTERP (3) : program interpreter
  • PT_NOTE (4) : auxiliary information
  • PT_PHDR (6) : the program header table itself
  • PT_TLS (7) : thread local storage

In the readelf output, the lines under "Section to Segment mapping" list, for the segment described by each program header, the sections it contains.
For example, the 1st header has p_type EXIDX; its segment is index 00, and it contains the section ".ARM.exidx".
The 4th header has p_type LOAD; its segment is index 03, and it contains the sections ".interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.exidx .eh_frame"
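
The idea behind this mapping can be sketched as an address-range check: a section belongs to a segment when its address range lies inside the segment's range in memory. This is a simplification and an assumption on my part; readelf applies more detailed rules. The segment values come from the program header output above, and the section addresses and sizes come from `readelf --section-headers /bin/ls`, shown in the section header part of this post.

```python
# index -> (type, p_vaddr, p_memsz) for the two LOAD segments.
SEGMENTS = {
    3: ("LOAD", 0x00008000, 0x160ec),
    4: ("LOAD", 0x00026edc, 0x0108c),
}

# (name, sh_addr, sh_size) for a handful of sections.
SECTIONS = [
    (".interp",   0x00008154, 0x000019),
    (".text",     0x0000a0d0, 0x010bd0),
    (".eh_frame", 0x0001e0e8, 0x000004),
    (".data",     0x000271c0, 0x000114),
    (".bss",      0x000272d8, 0x000c90),
]


def sections_in_segment(index):
    """Names of the sample sections whose address range fits in the segment."""
    _, vaddr, memsz = SEGMENTS[index]
    return [
        name
        for name, addr, size in SECTIONS
        if vaddr <= addr and addr + size <= vaddr + memsz
    ]


print(sections_in_segment(3))
print(sections_in_segment(4))
```

Memory size (p_memsz) is used rather than file size because .bss takes no space in the file but still belongs to the second LOAD segment.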

4. Section header


The section header table starts at the file offset given by "Start of section headers" (e_shoff).
Its size is "Size of section headers" (e_shentsize) multiplied by "Number of section headers" (e_shnum).

For "/bin/ls" the offset is 95220 bytes, and the size is 40 bytes * 28 entries = 1120 bytes.

Decoding it with readelf:
$ readelf --section-headers /bin/ls
There are 28 section headers, starting at offset 0x173f4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00008154 000154 000019 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            00008170 000170 000020 00   A  0   0  4
  [ 3] .note.gnu.build-i NOTE            00008190 000190 000024 00   A  0   0  4
  [ 4] .hash             HASH            000081b4 0001b4 000380 04   A  6   0  4
  [ 5] .gnu.hash         GNU_HASH        00008534 000534 0003f8 04   A  6   0  4
  [ 6] .dynsym           DYNSYM          0000892c 00092c 0007d0 10   A  7   1  4
  [ 7] .dynstr           STRTAB          000090fc 0010fc 000592 00   A  0   0  1
  [ 8] .gnu.version      VERSYM          0000968e 00168e 0000fa 02   A  6   0  2
  [ 9] .gnu.version_r    VERNEED         00009788 001788 0000b0 00   A  7   5  4
  [10] .rel.dyn          REL             00009838 001838 000030 08   A  6   0  4
  [11] .rel.plt          REL             00009868 001868 000350 08   A  6  13  4
  [12] .init             PROGBITS        00009bb8 001bb8 00000c 00  AX  0   0  4
  [13] .plt              PROGBITS        00009bc4 001bc4 00050c 04  AX  0   0  4
  [14] .text             PROGBITS        0000a0d0 0020d0 010bd0 00  AX  0   0  8
  [15] .fini             PROGBITS        0001aca0 012ca0 000008 00  AX  0   0  4
  [16] .rodata           PROGBITS        0001aca8 012ca8 003418 00   A  0   0  4
  [17] .ARM.exidx        ARM_EXIDX       0001e0c0 0160c0 000028 00  AL 14   0  4
  [18] .eh_frame         PROGBITS        0001e0e8 0160e8 000004 00   A  0   0  4
  [19] .init_array       INIT_ARRAY      00026edc 016edc 000004 00  WA  0   0  4
  [20] .fini_array       FINI_ARRAY      00026ee0 016ee0 000004 00  WA  0   0  4
  [21] .jcr              PROGBITS        00026ee4 016ee4 000004 00  WA  0   0  4
  [22] .dynamic          DYNAMIC         00026ee8 016ee8 000118 08  WA  7   0  4
  [23] .got              PROGBITS        00027000 017000 0001bc 04  WA  0   0  4
  [24] .data             PROGBITS        000271c0 0171c0 000114 00  WA  0   0  8
  [25] .bss              NOBITS          000272d8 0172d4 000c90 00  WA  0   0  8
  [26] .ARM.attributes   ARM_ATTRIBUTES  00000000 0172d4 00002f 00      0   0  1
  [27] .shstrtab         STRTAB          00000000 017303 0000f1 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

The section header is defined as:
typedef struct
{
  Elf32_Word    sh_name;                /* Section name (string tbl index) */
  Elf32_Word    sh_type;                /* Section type */
  Elf32_Word    sh_flags;               /* Section flags */
  Elf32_Addr    sh_addr;                /* Section virtual addr at execution */
  Elf32_Off     sh_offset;              /* Section file offset */
  Elf32_Word    sh_size;                /* Section size in bytes */
  Elf32_Word    sh_link;                /* Link to another section */
  Elf32_Word    sh_info;                /* Additional section information */
  Elf32_Word    sh_addralign;           /* Section alignment */
  Elf32_Word    sh_entsize;             /* Entry size if section holds table */
} Elf32_Shdr;
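
A sketch of how one entry of the table could be located and read. The struct is ten 32-bit words, so it is 40 bytes, the same as e_shentsize. read_shdr and Elf32Shdr are hypothetical names. Since the raw bytes of the table are not shown in this post, the demo runs on an in-memory stand-in whose single record is rebuilt from the readelf row for section 27; sh_name = 1 is an assumption based on ".shstrtab" sitting at offset 1 of the string table.

```python
import io
import struct
from collections import namedtuple

SHDR_FORMAT = "<10I"  # ten little-endian 32-bit words = 40 bytes

Elf32Shdr = namedtuple(
    "Elf32Shdr",
    "sh_name sh_type sh_flags sh_addr sh_offset sh_size "
    "sh_link sh_info sh_addralign sh_entsize",
)


def read_shdr(f, e_shoff, e_shentsize, index):
    """Seek to entry `index` of the section header table and unpack it."""
    f.seek(e_shoff + index * e_shentsize)
    raw = f.read(struct.calcsize(SHDR_FORMAT))
    return Elf32Shdr._make(struct.unpack(SHDR_FORMAT, raw))


# Stand-in for /bin/ls: zeros everywhere except a rebuilt entry 27
# (type 3 = SHT_STRTAB, offset 0x17303, size 0xf1, alignment 1).
fake_file = io.BytesIO()
fake_file.seek(95220 + 27 * 40)
fake_file.write(struct.pack(SHDR_FORMAT, 1, 3, 0, 0, 0x17303, 0xf1, 0, 0, 1, 0))

shstrtab = read_shdr(fake_file, 95220, 40, 27)
print(hex(shstrtab.sh_offset), hex(shstrtab.sh_size))
```

Note that the field order in the struct differs from the column order readelf prints (for example, sh_entsize is last in the struct but "ES" comes before the flags in the table). With a real binary, the same function would take a file opened in "rb" mode.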


Common section types:

  • SHT_PROGBITS (1) : program data
  • SHT_SYMTAB (2) : symbol table
  • SHT_STRTAB (3) : string table
  • SHT_RELA (4) : relocation entries with addends
  • SHT_HASH (5) : symbol hash table
  • SHT_DYNAMIC (6) : dynamic linking information
  • SHT_NOTE (7) : notes
  • SHT_NOBITS (8) : occupies no space in the file (.bss)
  • SHT_REL (9) : relocation entries without addends
  • SHT_DYNSYM (11) : symbol table used by the dynamic linker
  • SHT_INIT_ARRAY (14) : array of constructors (.init_array)
  • SHT_FINI_ARRAY (15) : array of destructors (.fini_array)

In the ELF header, e_shstrndx gives the index of the section that holds the section name string table. In this example "Section header string table index" (e_shstrndx) is 27, and each section header's name (sh_name) is looked up in that section.

Section 27 has offset = 0x017303 and size = 0xf1 (241); print it with hexdump:
$ hexdump -n 241 -s 0x017303 -C /bin/ls
00017303  00 2e 73 68 73 74 72 74  61 62 00 2e 69 6e 74 65  |..shstrtab..inte|
00017313  72 70 00 2e 6e 6f 74 65  2e 41 42 49 2d 74 61 67  |rp..note.ABI-tag|
00017323  00 2e 6e 6f 74 65 2e 67  6e 75 2e 62 75 69 6c 64  |..note.gnu.build|
00017333  2d 69 64 00 2e 67 6e 75  2e 68 61 73 68 00 2e 64  |-id..gnu.hash..d|
00017343  79 6e 73 79 6d 00 2e 64  79 6e 73 74 72 00 2e 67  |ynsym..dynstr..g|
00017353  6e 75 2e 76 65 72 73 69  6f 6e 00 2e 67 6e 75 2e  |nu.version..gnu.|
00017363  76 65 72 73 69 6f 6e 5f  72 00 2e 72 65 6c 2e 64  |version_r..rel.d|
00017373  79 6e 00 2e 72 65 6c 2e  70 6c 74 00 2e 69 6e 69  |yn..rel.plt..ini|
00017383  74 00 2e 74 65 78 74 00  2e 66 69 6e 69 00 2e 72  |t..text..fini..r|
00017393  6f 64 61 74 61 00 2e 41  52 4d 2e 65 78 69 64 78  |odata..ARM.exidx|
000173a3  00 2e 65 68 5f 66 72 61  6d 65 00 2e 69 6e 69 74  |..eh_frame..init|
000173b3  5f 61 72 72 61 79 00 2e  66 69 6e 69 5f 61 72 72  |_array..fini_arr|
000173c3  61 79 00 2e 6a 63 72 00  2e 64 79 6e 61 6d 69 63  |ay..jcr..dynamic|
000173d3  00 2e 67 6f 74 00 2e 64  61 74 61 00 2e 62 73 73  |..got..data..bss|
000173e3  00 2e 41 52 4d 2e 61 74  74 72 69 62 75 74 65 73  |..ARM.attributes|

readelf can print it as well:
$ readelf -x27 /bin/ls

Hex dump of section '.shstrtab':
  0x00000000 002e7368 73747274 6162002e 696e7465 ..shstrtab..inte
  0x00000010 7270002e 6e6f7465 2e414249 2d746167 rp..note.ABI-tag
  0x00000020 002e6e6f 74652e67 6e752e62 75696c64 ..note.gnu.build
  0x00000030 2d696400 2e676e75 2e686173 68002e64 -id..gnu.hash..d
  0x00000040 796e7379 6d002e64 796e7374 72002e67 ynsym..dynstr..g
  0x00000050 6e752e76 65727369 6f6e002e 676e752e nu.version..gnu.
  0x00000060 76657273 696f6e5f 72002e72 656c2e64 version_r..rel.d
  0x00000070 796e002e 72656c2e 706c7400 2e696e69 yn..rel.plt..ini
  0x00000080 74002e74 65787400 2e66696e 69002e72 t..text..fini..r
  0x00000090 6f646174 61002e41 524d2e65 78696478 odata..ARM.exidx
  0x000000a0 002e6568 5f667261 6d65002e 696e6974 ..eh_frame..init
  0x000000b0 5f617272 6179002e 66696e69 5f617272 _array..fini_arr
  0x000000c0 6179002e 6a637200 2e64796e 616d6963 ay..jcr..dynamic
  0x000000d0 002e676f 74002e64 61746100 2e627373 ..got..data..bss
  0x000000e0 002e4152 4d2e6174 74726962 75746573 ..ARM.attributes
  0x000000f0 00                                  .

A section header's sh_name is a byte offset into this string table. For example, sh_name=1 gives the name ".shstrtab" (each string ends with a terminating '\0'), and sh_name=0x0b gives ".interp".
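
The lookup can be sketched in a few lines: start at the given offset and read up to the next '\0'. The bytes below are the first 33 bytes of the .shstrtab dump above; get_name is an illustrative helper name.

```python
# Start of .shstrtab, copied from the hexdump above.
SHSTRTAB_START = bytes.fromhex(
    "00 2e 73 68 73 74 72 74  61 62 00 2e 69 6e 74 65"
    "72 70 00 2e 6e 6f 74 65  2e 41 42 49 2d 74 61 67"
    "00"
)


def get_name(strtab, offset):
    """Return the NUL-terminated string that starts at `offset`."""
    end = strtab.index(b"\x00", offset)
    return strtab[offset:end].decode("ascii")


print(get_name(SHSTRTAB_START, 1))
print(get_name(SHSTRTAB_START, 0x0b))
print(get_name(SHSTRTAB_START, 0x13))
```

Offset 0 always yields the empty string, which is why section 0 (type NULL) has no name.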

5. Symbol table


If the file has been stripped, only the dynamic symbol table remains visible.
Section header index 06 of "/bin/ls" has type DYNSYM, so that is where the dynamic symbol table is stored.
Again, start by decoding it with readelf:
$ readelf --syms /bin/ls

Symbol table '.dynsym' contains 125 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND __aeabi_unwind_cpp_pr0@GCC_3.5 (2)
     2: 00000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     3: 00000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
     4: 00009c08     0 FUNC    GLOBAL DEFAULT  UND getpwnam@GLIBC_2.4 (3)
......

To use hexdump instead: section header index 06 has offset 0x92c and size 0x7d0. The first two entries (32 bytes) are:
$ hexdump -n 32 -s 0x92c -C /bin/ls
0000092c  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
0000093c  cd 00 00 00 00 00 00 00  00 00 00 00 12 00 00 00  |................|

The struct for one symbol entry is:
typedef struct
{
  Elf32_Word    st_name;                /* Symbol name (string tbl index) */
  Elf32_Addr    st_value;               /* Symbol value */
  Elf32_Word    st_size;                /* Symbol size */
  unsigned char st_info;                /* Symbol type and binding */
  unsigned char st_other;               /* Symbol visibility */
  Elf32_Section st_shndx;               /* Section index */
} Elf32_Sym;

The struct is 16 bytes. Taking the 2nd entry (index 1), the values are:
st_name:  cd 00 00 00 = 0x000000cd
st_value: 00 00 00 00
st_size:  00 00 00 00
st_info:  12          = (STB_GLOBAL << 4) | STT_FUNC
st_other: 00
st_shndx: 00 00
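
A sketch of the same decoding in Python, using the 16 bytes of the 2nd entry from the hexdump. SYM_FORMAT and Elf32Sym are illustrative names. Dividing the section size by the entry size also reproduces the entry count that readelf reports.

```python
import struct
from collections import namedtuple

# 2nd entry of .dynsym (16 bytes at file offset 0x93c).
SYM_BYTES = bytes.fromhex("cd 00 00 00 00 00 00 00  00 00 00 00 12 00 00 00")

# Three 32-bit words, two single bytes, one 16-bit half; little endian.
SYM_FORMAT = "<IIIBBH"

Elf32Sym = namedtuple(
    "Elf32Sym", "st_name st_value st_size st_info st_other st_shndx"
)

sym = Elf32Sym._make(struct.unpack(SYM_FORMAT, SYM_BYTES))
entry_count = 0x7d0 // struct.calcsize(SYM_FORMAT)

print(hex(sym.st_name), hex(sym.st_info), sym.st_shndx)
print(entry_count)
```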

st_name is also an offset into a string table, but it refers to the dynamic string table (.dynstr), that is, the table at section header index 07.

st_info is split into a high 4 bits and a low 4 bits.
The high 4 bits (binding) are:

  • STB_LOCAL (0) : local symbol
  • STB_GLOBAL (1) : global symbol
  • STB_WEAK (2) : weak symbol
The low 4 bits (type) are:

  • STT_NOTYPE (0) : type not specified
  • STT_OBJECT (1) : data object
  • STT_FUNC (2) : function
  • STT_SECTION (3) : section
  • STT_FILE (4) : name of the source file associated with the object file
  • STT_COMMON (5) : uninitialized common block
  • STT_TLS (6) : thread local data
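
Splitting st_info is a shift and a mask, which is what the ELF32_ST_BIND and ELF32_ST_TYPE macros in elf.h do. A minimal sketch, with split_st_info and the lookup tables as illustrative names:

```python
ST_BIND = {0: "STB_LOCAL", 1: "STB_GLOBAL", 2: "STB_WEAK"}
ST_TYPE = {
    0: "STT_NOTYPE",
    1: "STT_OBJECT",
    2: "STT_FUNC",
    3: "STT_SECTION",
    4: "STT_FILE",
    5: "STT_COMMON",
    6: "STT_TLS",
}


def split_st_info(st_info):
    """Return (binding, type) names; unknown values are returned as numbers."""
    bind = st_info >> 4      # high 4 bits
    sym_type = st_info & 0xf  # low 4 bits
    return ST_BIND.get(bind, bind), ST_TYPE.get(sym_type, sym_type)


print(split_st_info(0x12))  # st_info of the 2nd .dynsym entry
print(split_st_info(0x20))  # a weak symbol with no type, like __gmon_start__
```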





