It's Advent Calendar season, so I'm going to give a proper send-off to the script I wrote yesterday. This is the day 13 article of the Security Tools Advent Calendar 2018.
Execution path difference viewer is a tool that visualizes how the execution path differs when two inputs are given to the same program. (I just made up the name.) For example, suppose you have a program that accepts the string "AB", like this:
test.c
#include <stdio.h>
#include <stdlib.h>

void one_match() {
    puts("One match");
}

void all_match() {
    puts("Accepted!");
}

int main(int argc, char *argv[]) {
    FILE *fp;
    char buf[32] = {0};

    if (argc < 2) {
        fprintf(stderr, "usage: ./test <input>\n");
        exit(0);
    }
    fp = fopen(argv[1], "r");
    fread(buf, sizeof(char), 31, fp);
    if (buf[0] == 'A') {
        if (buf[1] == 'B') {
            all_match();
            return 0;
        } else {
            one_match();
        }
    }
    puts("Not good");
    return 0;
}
Given the string "AX", the program calls one_match() and then prints "Not good". Given the string "AB", it calls all_match(). The two inputs go through different execution paths, and I want to visualize that difference nicely.
$ cat test1.in test2.in
AX
AB
$ ./test test1.in
One match
Not good
$ ./test test2.in
Accepted!
angr+Bingraphvis
There are two main things I want to do this time.
- Record execution traces for each of the two inputs
- Display the difference between the obtained traces on the CFG in some way
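As a rough illustration of the second step, here is a minimal sketch of one way to diff two traces. It assumes each trace is simply a list of visited basic-block addresses and compares them as sets. The function name `diff_traces` and the addresses are hypothetical, and the actual script may compute the difference differently.

```python
def diff_traces(trace_a, trace_b):
    """Classify basic-block addresses by which trace visited them."""
    seen_a, seen_b = set(trace_a), set(trace_b)
    return {
        "only_a": seen_a - seen_b,   # blocks reached only by the first input
        "only_b": seen_b - seen_a,   # blocks reached only by the second input
        "common": seen_a & seen_b,   # blocks reached by both inputs
    }


# Hypothetical block addresses standing in for the two runs of ./test.
trace_ax = [0x1000, 0x1010, 0x1020, 0x1040, 0x1050]
trace_ab = [0x1000, 0x1010, 0x1020, 0x1030]

result = diff_traces(trace_ax, trace_ab)
```

A set-based comparison ignores how often and in what order blocks are executed, which is enough when the goal is only to color nodes on a CFG.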
This time I used angr and Bingraphvis. Bingraphvis is the core library behind angr-utils, which visualizes CFGs generated by angr. It handles CFG node operations, transformations, and plotting. Using it together with angr's QEMU Runner (tracer), I implemented the two steps above. The reason I didn't use the angr-utils wrapper is that I wanted to define and use my own CFG variant that carries the trace diff. If you run the script against the program shown earlier, it produces the image below.
$ python3 input-tracer.py -b ./test -i test1.in,test2.in -v
[+] Opening binary ./test
[+] CFG Cache found
CFG size = 46
[+] Tracing .... test1.in
Size: 46079
[+] Tracing .... test2.in
Size: 46033
[+] CFG processing ....
Graph len= 30
[+] Complete! Saved to outd/input_trace_test_entire.png
Red is the path taken when test1.in is given, and blue is the path taken when test2.in is given. The parts common to both are black.
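The color scheme above can be sketched without any angr dependency by emitting Graphviz DOT text directly. This is only an illustration under assumptions: the helper names `color_for` and `to_dot`, the edge list, and the addresses are all hypothetical, and the real script builds its graph through Bingraphvis rather than by hand.

```python
def color_for(node, only_a, only_b):
    """Pick a node color: red for input 1 only, blue for input 2 only."""
    if node in only_a:
        return "red"
    if node in only_b:
        return "blue"
    # Everything else is drawn black in this sketch.
    return "black"


def to_dot(edges, only_a, only_b):
    """Render a CFG given as (src, dst) address pairs to DOT text."""
    nodes = sorted({n for edge in edges for n in edge})
    lines = ["digraph cfg {"]
    for n in nodes:
        lines.append(f'  "{n:#x}" [color={color_for(n, only_a, only_b)}];')
    for src, dst in edges:
        lines.append(f'  "{src:#x}" -> "{dst:#x}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-block CFG: one branch per input.
dot = to_dot([(0x1000, 0x1030), (0x1000, 0x1040)],
             only_a={0x1040}, only_b={0x1030})
```

The resulting text can be written to a file and rendered with the `dot` command from Graphviz.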
If you do the above for a program such as a UNIX utility, you end up plotting a huge CFG (5,000 nodes or more) and generating an image tens of thousands of pixels in size. Naturally, the image viewer cannot display it and crashes, which defeats the purpose of visualization. It is possible to display only the nodes that differ instead of the entire CFG (omit -v from the command above), but even the difference alone can be very large. Therefore, I also added a feature that splits the output and plots a separate CFG image for each function of the program.
It is enabled with the -f option.
$ python3 ./input-tracer.py -b mp3_player -i invalid.mp3,1.mp3 -f
[+] Opening binary mp3_player
[+] Searching for all the functions (using CFGFast)
100% |#####################################| Elapsed Time: 0:00:02 Time: 0:00:02
==> 106 functions to process.
[+] Tracing .... invalid.mp3
Size: 1305732
[+] Tracing .... 1.mp3
Size: 6084333
[+] CFG processing ....
[+](0/106) Computing Accurate CFG for function _init (0x8049cd8)
[+] CFG Cache found
Graph len= 0
[+] Complete! Saved to outd/input_trace_mpg321-0.3.0__init.png
[+](1/106) Computing Accurate CFG for function sub_8049d0c (0x8049d0c)
[+] CFG Cache found
Here the player is given clearly invalid mp3 data (for example, "ABCD") and valid mp3 data. The CFG difference for each function is plotted into outd.
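The per-function splitting can be pictured as grouping the differing blocks by the function that contains them. The sketch below assumes each function occupies a contiguous address range starting at its entry point, which does not hold for every binary. The two function addresses come from the log above, while `group_by_function` and the block addresses are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_function(block_addrs, functions):
    """Group block addresses by the nearest function starting at or below them.

    functions is a list of (start_address, name) pairs.
    """
    funcs = sorted(functions)
    starts = [start for start, _ in funcs]
    groups = defaultdict(set)
    for addr in block_addrs:
        idx = bisect_right(starts, addr) - 1
        if idx >= 0:  # skip blocks that lie before the first function
            groups[funcs[idx][1]].add(addr)
    return dict(groups)


functions = [(0x8049cd8, "_init"), (0x8049d0c, "sub_8049d0c")]
diff_blocks = {0x8049ce0, 0x8049d10, 0x8049d24}  # hypothetical diff nodes

groups = group_by_function(diff_blocks, functions)
```

Each group could then be plotted as its own image, which keeps every picture small enough to open in a viewer.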
For example, the difference for the mp3 player's calc_length function looks like this.
The code is available here: https://gist.github.com/RKX1209/3cb60b0fa0ba92da6575716680f32aa0