[LINUX] Segfault with 0 characters with gcc

Introduction

This post piggybacks on a [popular topic](https://qiita.com/search?q=%E3%82%BB%E3%82%B0%E3%83%95%E3%82%A9%E3%82%89%E3%81%9B%E3%82%8B).

Quick demonstration

The environment I used to check the behavior is x86-64 Ubuntu 19.04 with gcc 8.3.0.

Build and run


$ gcc -xc /dev/null -nostdlib -Wl,-e0xfa -Wl,-Ttext=0x2504890000 -zexecstack && ./a.out
Segmentation fault (core dumped)
$ 

Commentary

First, compile the smallest C program that builds without a warning.

int main(){}

Preparing a source file is a hassle, so I use the echo command, connect it to the compiler with |, and feed the source directly through standard input. In that case the compiler cannot determine the type of the input from a file extension, so use the command line option `-xc` to tell it that the input is C source.

$ echo 'int main(){}' | gcc -xc - && ls -l a.out
-rwxr-xr-x 1 fujita fujita 16344 Jul  8 22:31 a.out
$ 

An executable file of about 16 kB has been created. Feeding the generated a.out to the readelf command gives:

$ readelf -S a.out
There are 28 section headers, starting at offset 0x38d8:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         00000000000002a8  000002a8
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.build-i NOTE             00000000000002c4  000002c4
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .note.ABI-tag     NOTE             00000000000002e8  000002e8
       0000000000000020  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000000308  00000308
       0000000000000024  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000000330  00000330
       0000000000000090  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           00000000000003c0  000003c0
       000000000000007d  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           000000000000043e  0000043e
       000000000000000c  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000000450  00000450
       0000000000000020  0000000000000000   A       6     1     8
  [ 9] .rela.dyn         RELA             0000000000000470  00000470
       00000000000000c0  0000000000000018   A       5     0     8
  [10] .init             PROGBITS         0000000000001000  00001000
       0000000000000017  0000000000000000  AX       0     0     4
  [11] .plt              PROGBITS         0000000000001020  00001020
       0000000000000010  0000000000000010  AX       0     0     16
  [12] .plt.got          PROGBITS         0000000000001030  00001030
       0000000000000008  0000000000000008  AX       0     0     8
  [13] .text             PROGBITS         0000000000001040  00001040
       0000000000000151  0000000000000000  AX       0     0     16
  [14] .fini             PROGBITS         0000000000001194  00001194
       0000000000000009  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         0000000000002000  00002000
       0000000000000004  0000000000000004  AM       0     0     4
  [16] .eh_frame_hdr     PROGBITS         0000000000002004  00002004
       000000000000003c  0000000000000000   A       0     0     4
  [17] .eh_frame         PROGBITS         0000000000002040  00002040
       0000000000000108  0000000000000000   A       0     0     8
  [18] .init_array       INIT_ARRAY       0000000000003df0  00002df0
       0000000000000008  0000000000000008  WA       0     0     8
  [19] .fini_array       FINI_ARRAY       0000000000003df8  00002df8
       0000000000000008  0000000000000008  WA       0     0     8
  [20] .dynamic          DYNAMIC          0000000000003e00  00002e00
       00000000000001c0  0000000000000010  WA       6     0     8
  [21] .got              PROGBITS         0000000000003fc0  00002fc0
       0000000000000040  0000000000000008  WA       0     0     8
  [22] .data             PROGBITS         0000000000004000  00003000
       0000000000000010  0000000000000000  WA       0     0     8
  [23] .bss              NOBITS           0000000000004010  00003010
       0000000000000008  0000000000000000  WA       0     0     1
  [24] .comment          PROGBITS         0000000000000000  00003010
       0000000000000023  0000000000000001  MS       0     0     1
  [25] .symtab           SYMTAB           0000000000000000  00003038
       00000000000005b8  0000000000000018          26    43     8
  [26] .strtab           STRTAB           0000000000000000  000035f0
       00000000000001e9  0000000000000000           0     0     1
  [27] .shstrtab         STRTAB           0000000000000000  000037d9
       00000000000000f9  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
$ 

You can see that it contains many sections.

Next, try compiling a 0-character source. Instead of a source file, use /dev/null, which always returns EOF when read. The startup code that calls main is not needed either, so specify the `-nostdlib` option. With that option, the default entry point _start cannot be found and the linker issues a warning, so use the `-Wl,-e0` option to specify 0 as a provisional entry address.
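
As a quick sanity check of the "always returns EOF" part, here is a minimal Python sketch (Python is my choice for the illustrations here, it is not part of the trick itself). It assumes a Unix-like system where `os.devnull` is /dev/null, and `read_all` is just a helper name made up for this example.

```python
import os

def read_all(path: str) -> bytes:
    """Read a file until EOF and return everything that was read."""
    with open(path, "rb") as f:
        return f.read()

# Assumption: on Linux os.devnull is "/dev/null". Reading it reaches EOF
# immediately, so the compiler is handed a source of exactly 0 characters.
source = read_all(os.devnull)
print(len(source))  # prints 0
```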

$ gcc -xc /dev/null -nostdlib -Wl,-e0 && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:35 a.out
$ 

An executable file of about 9.5 kB has been created. Feeding it to the readelf command gives:

$ readelf -S a.out
There are 12 section headers, starting at offset 0x2228:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000000200  00000200
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.build-i NOTE             000000000000021c  0000021c
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .gnu.hash         GNU_HASH         0000000000000240  00000240
       000000000000001c  0000000000000000   A       4     0     8
  [ 4] .dynsym           DYNSYM           0000000000000260  00000260
       0000000000000018  0000000000000018   A       5     1     8
  [ 5] .dynstr           STRTAB           0000000000000278  00000278
       0000000000000001  0000000000000000   A       0     0     1
  [ 6] .eh_frame         PROGBITS         0000000000001000  00001000
       0000000000000000  0000000000000000   A       0     0     8
  [ 7] .dynamic          DYNAMIC          0000000000001f20  00001f20
       00000000000000e0  0000000000000010  WA       5     0     8
  [ 8] .comment          PROGBITS         0000000000000000  00002000
       0000000000000023  0000000000000001  MS       0     0     1
  [ 9] .symtab           SYMTAB           0000000000000000  00002028
       0000000000000168  0000000000000018          10    12     8
  [10] .strtab           STRTAB           0000000000000000  00002190
       0000000000000027  0000000000000000           0     0     1
  [11] .shstrtab         STRTAB           0000000000000000  000021b7
       000000000000006c  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
$ 

You can see that the executable still contains multiple sections, though fewer than the smallest main example.

The linker has command line options that specify the start address of a section. Try specifying the value 0x123456789abcdef0 as the start address of the .text section, which is not included in the list of sections above. The command line option for this is `-Wl,-Ttext=0x123456789abcdef0`.

$ gcc -xc /dev/null -nostdlib -Wl,-e0 -Wl,-Ttext=0x123456789abcdef0 && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:41 a.out
$ readelf -S a.out
There are 12 section headers, starting at offset 0x2228:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000000238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.build-i NOTE             0000000000000254  00000254
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .gnu.hash         GNU_HASH         0000000000000278  00000278
       000000000000001c  0000000000000000   A       4     0     8
  [ 4] .dynsym           DYNSYM           0000000000000298  00000298
       0000000000000018  0000000000000018   A       5     1     8
  [ 5] .dynstr           STRTAB           00000000000002b0  000002b0
       0000000000000001  0000000000000000   A       0     0     1
  [ 6] .eh_frame         PROGBITS         123456789abce000  00001000
       0000000000000000  0000000000000000   A       0     0     8
  [ 7] .dynamic          DYNAMIC          123456789abcef20  00001f20
       00000000000000e0  0000000000000010  WA       5     0     8
  [ 8] .comment          PROGBITS         0000000000000000  00002000
       0000000000000023  0000000000000001  MS       0     0     1
  [ 9] .symtab           SYMTAB           0000000000000000  00002028
       0000000000000168  0000000000000018          10    12     8
  [10] .strtab           STRTAB           0000000000000000  00002190
       0000000000000027  0000000000000000           0     0     1
  [11] .shstrtab         STRTAB           0000000000000000  000021b7
       000000000000006c  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
$ 
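
In the listing above, .eh_frame sits at 123456789abce000 rather than at the requested ...def0. A minimal sketch of one way to read that, assuming the linker rounded the address up to a 0x1000 (4 KiB) boundary; the 0x1000 figure is my assumption taken from the numbers in the listing, not something the linker reported, and `align_up` is a made-up helper name.

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

requested = 0x123456789ABCDEF0  # value passed to -Ttext
page = 0x1000                   # assumed alignment (4 KiB)

eh_frame_addr = align_up(requested, page)
print(hex(eh_frame_addr))  # 0x123456789abce000

# .dynamic follows at file offset 0x1f20, i.e. 0xf20 bytes after .eh_frame
# (file offset 0x1000); its address in the listing is shifted by the same amount.
dynamic_addr = eh_frame_addr + (0x1F20 - 0x1000)
print(hex(dynamic_addr))   # 0x123456789abcef20
```

Both values agree with the Address column of the readelf output, which is at least consistent with the rounding assumption.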

I don't know what this section is for, but you can see that the .eh_frame section is affected and now has a similar address. It seems that the lower 16 bits of the address were aligned from def0 to `e000` to fit the value of the Offset item (00001000). This section information should be stored in the executable a.out, so try a hexadecimal dump with the hexdump command.

$ hexdump -C a.out | head -24
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  03 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
00000020  40 00 00 00 00 00 00 00  28 22 00 00 00 00 00 00  |@.......("......|
00000030  00 00 00 00 40 00 38 00  09 00 40 00 0c 00 0b 00  |....@.8...@.....|
00000040  06 00 00 00 04 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000050  40 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |@.......@.......|
00000060  f8 01 00 00 00 00 00 00  f8 01 00 00 00 00 00 00  |................|
00000070  08 00 00 00 00 00 00 00  03 00 00 00 04 00 00 00  |................|
00000080  38 02 00 00 00 00 00 00  38 02 00 00 00 00 00 00  |8.......8.......|
00000090  38 02 00 00 00 00 00 00  1c 00 00 00 00 00 00 00  |8...............|
000000a0  1c 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
000000b0  01 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  b1 02 00 00 00 00 00 00  b1 02 00 00 00 00 00 00  |................|
000000e0  00 10 00 00 00 00 00 00  01 00 00 00 04 00 00 00  |................|
000000f0  00 10 00 00 00 00 00 00  00 e0 bc 9a 78 56 34 12  |............xV4.|
00000100  00 e0 bc 9a 78 56 34 12  00 00 00 00 00 00 00 00  |....xV4.........|
00000110  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000120  01 00 00 00 06 00 00 00  20 1f 00 00 00 00 00 00  |........ .......|
00000130  20 ef bc 9a 78 56 34 12  20 ef bc 9a 78 56 34 12  | ...xV4. ...xV4.|
00000140  e0 00 00 00 00 00 00 00  e0 00 00 00 00 00 00 00  |................|
00000150  00 10 00 00 00 00 00 00  02 00 00 00 06 00 00 00  |................|
00000160  20 1f 00 00 00 00 00 00  20 ef bc 9a 78 56 34 12  | ....... ...xV4.|
00000170  20 ef bc 9a 78 56 34 12  e0 00 00 00 00 00 00 00  | ...xV4.........|
$ 
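
To make the dump above a little easier to read, here is a minimal Python sketch that decodes two pieces of it with the standard ELF64 little-endian layouts: the 0x40-byte file header at offset 0, and the 0x38 bytes starting at offset 0xe8. The byte strings are copied by hand from the dump; the variable names simply follow the usual ELF field names.

```python
import struct

# Offsets 0x00-0x3f of the dump: the ELF64 file header.
EHDR = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "03003e00010000000000000000000000"
    "40000000000000002822000000000000"
    "0000000040003800090040000c000b00"
)
# Offsets 0xe8-0x11f of the dump, treated as one program header entry.
PHDR = bytes.fromhex(
    "0100000004000000"  # 0xe8:  p_type, p_flags
    "0010000000000000"  # 0xf0:  p_offset
    "00e0bc9a78563412"  # 0xf8:  p_vaddr
    "00e0bc9a78563412"  # 0x100: p_paddr
    "0000000000000000"  # 0x108: p_filesz
    "0000000000000000"  # 0x110: p_memsz
    "0010000000000000"  # 0x118: p_align
)

(ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack("<16sHHIQQQIHHHHHH", EHDR)

(p_type, p_flags, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_align) = struct.unpack("<IIQQQQQQ", PHDR)

# Which program header entry starts at file offset 0xe8?
index, remainder = divmod(0xE8 - e_phoff, e_phentsize)
# p_vaddr sits 16 bytes into an entry (after p_type, p_flags, p_offset).
vaddr_file_offset = e_phoff + index * e_phentsize + 16

print(f"entry={e_entry:#x} shoff={e_shoff:#x} shnum={e_shnum}")
print(f"phdr[{index}] type={p_type} offset={p_offset:#x} "
      f"vaddr={p_vaddr:#x} align={p_align:#x}")
print(f"p_vaddr is stored at file offset {vaddr_file_offset:#x}")
```

Read this way, the header reports 12 section headers at 0x2228 (the same numbers readelf printed) and an entry address of 0, and the 8 bytes at offset 0xf8 are the p_vaddr field of the fourth program header, immediately followed by an identical p_paddr.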

You can see that the 8 bytes starting at offset 000000f8 hold what seems to be the start address of .eh_frame, stored in little endian. If the lower 16 bits are left as zero so that alignment does not change the address, I think `-Wl,-Ttext=0xXXXXXXXXXXXX0000` can embed an arbitrary byte string in the executable from offset 000000fa, disguised as the address of .eh_frame. As a test, specify `-Wl,-Ttext=0x2504890000` to place 89 04 25 from offset 000000fa. Specify 000000fa as the entry address with `-Wl,-e0xfa`, and also specify `-zexecstack`, which enables execution in areas other than the code area.
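
Working out the `-Ttext` value by hand means reversing the bytes and appending four zero digits, which is easy to get wrong. A minimal sketch of that conversion follows; `ttext_for` is a hypothetical helper name, and the offset 0xfa is only what this particular toolchain happened to produce.

```python
def ttext_for(code: bytes) -> str:
    """Return a -Ttext value whose little-endian bytes start with 00 00
    followed by `code`, so that `code` lands two bytes into the field."""
    if len(code) > 6:
        raise ValueError("at most 6 bytes fit after the two zero low bytes")
    # Little-endian: the first code byte becomes the lowest-order byte.
    # Shifting by 16 bits keeps the low 16 bits of the address zero.
    return hex(int.from_bytes(code, "little") << 16)

print(ttext_for(bytes.fromhex("890425")))  # 0x2504890000
```

Note that `hex()` drops leading zeros, so a byte string such as 0f 0b comes out as 0xb0f0000, which is the same number as 0x0b0f0000.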

$ gcc -xc /dev/null -nostdlib -Wl,-e0xfa -Wl,-Ttext=0x2504890000 -zexecstack && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:49 a.out
$ ./a.out
Segmentation fault (core dumped)
$ 

Running the generated a.out confirms that a segfault occurs. Let's use the objdump command to look at the instruction embedded in the executable from offset 000000fa.

$ objdump -D -bbinary -mi386 -Mintel,x86-64 --start-address=0xfa a.out | head -8

a.out:     file format binary


Disassembly of section .data:

000000fa <.data+0xfa>:
      fa:	89 04 25 00 00 00 00 	mov    DWORD PTR ds:0x0,eax
$ 

The instruction writes the value of the EAX register to address 0. The attempt is caught by the memory protection of the OS, which is what produces the segfault.
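
Only three bytes (89 04 25) were chosen through `-Ttext`, yet the disassembly shows a complete 7-byte instruction. A minimal sketch of where the remaining bytes come from, under the assumption (taken from the earlier hex dump) that the same 8-byte address is stored twice in a row, at offsets 0xf8 and 0x100; `bytes_from_0xfa` is a made-up helper name.

```python
def bytes_from_0xfa(ttext: int, count: int) -> bytes:
    """Bytes the CPU would fetch starting at offset 0xfa.

    Assumption: offsets 0xf8-0xff and 0x100-0x107 both hold the
    little-endian encoding of the -Ttext address.
    """
    field = ttext.to_bytes(8, "little")
    window = field + field          # offsets 0xf8 .. 0x107
    return window[2:2 + count]      # skip 0xf8 and 0xf9

insn = bytes_from_0xfa(0x2504890000, 7)
print(" ".join(f"{b:02x}" for b in insn))  # 89 04 25 00 00 00 00
```

The 32-bit displacement of the mov is made of the zero upper bytes of the first copy plus the zero low byte of the second copy, which is why the destination address is 0.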

With the above, we have achieved the title, "Segfault with 0 characters with gcc". In addition, errors other than a segfault can be produced by changing the contents of `-Wl,-Ttext=0xXXXXXXXXXXXX0000`.

Call software interrupt INT3 for debugging


$ gcc -xc /dev/null -nostdlib -Wl,-e0xfa -Wl,-Ttext=0xcc0000 -zexecstack && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:55 a.out
$ objdump -D -bbinary -mi386 -Mintel,x86-64 --start-address=0xfa a.out | head -8

a.out:     file format binary


Disassembly of section .data:

000000fa <.data+0xfa>:
      fa:	cc                   	int3   
$ ./a.out
Trace/breakpoint trap (core dumped)
$ 

Execute UD2, which is reserved as an illegal instruction


$ gcc -xc /dev/null -nostdlib -Wl,-e0xfa -Wl,-Ttext=0x0b0f0000 -zexecstack && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:56 a.out
$ objdump -D -bbinary -mi386 -Mintel,x86-64 --start-address=0xfa a.out | head -8

a.out:     file format binary


Disassembly of section .data:

000000fa <.data+0xfa>:
      fa:	0f 0b                	ud2    
$ ./a.out
Illegal instruction (core dumped)
$ 

Divide the value of the AX register by the contents of address 0x100, where 0 is stored, to generate a division by zero error


$ gcc -xc /dev/null -nostdlib -Wl,-e0xfa -Wl,-Ttext=0x35f60000 -zexecstack && ls -l a.out
-rwxr-xr-x 1 fujita fujita 9512 Jul  8 22:58 a.out
$ objdump -D -bbinary -mi386 -Mintel,x86-64 --start-address=0xfa a.out | head -9

a.out:     file format binary


Disassembly of section .data:

000000fa <.data+0xfa>:
      fa:	f6 35 00 00 00 00    	div    BYTE PTR [rip+0x0]        # 0x100
     100:	00 00                	add    BYTE PTR [rax],al
$ ./a.out
Floating point exception (core dumped)
$ 
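
The `# 0x100` annotation in the disassembly above comes from RIP-relative addressing: the operand address is the address of the next instruction plus the 32-bit displacement. A minimal sketch of that arithmetic, again assuming that offset 0x100 holds a second little-endian copy of the `-Ttext` address; the helper name is made up for this example.

```python
def rip_relative_target(insn_addr: int, insn_len: int, disp32: int) -> int:
    """Address referenced by a RIP-relative operand."""
    return insn_addr + insn_len + disp32

# div BYTE PTR [rip+0x0] is 6 bytes long and starts at 0xfa.
target = rip_relative_target(0xFA, 6, 0)

# Assumption: offsets 0x100-0x107 hold the -Ttext value in little endian,
# so the divisor is the lowest byte of 0x35f60000.
divisor = (0x35F60000).to_bytes(8, "little")[target - 0x100]

print(hex(target), divisor)  # 0x100 0
```

A divisor of 0 makes the div instruction fault, and Linux reports that as SIGFPE, hence the "Floating point exception" message even though no floating point is involved.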

It's fun to try various things.

In conclusion

The end.
