It's Advent Calendar season, so I'm giving the script I wrote yesterday a proper send-off by writing it up. This article is the day 13 entry of the Security Tools Advent Calendar 2018.

The execution path difference viewer is a tool that visualizes how the execution path differs when two inputs are given to the same program. (I made the name up just now.) For example, suppose you have a program that accepts the string "AB", like this:

`test.c`


#include <stdio.h>
#include <stdlib.h>

void one_match() {
        puts("One match");
}

void all_match() {
        puts("Accepted!");
}

int main(int argc, char *argv[]) {
        FILE *fp;
        char buf[32] = {0};
        if (argc < 2) { 
                fprintf(stderr, "usage: ./test <input>\n");
                exit(0);
        }
        fp = fopen(argv[1], "r");
        fread(buf, sizeof(char), 31, fp);
        if (buf[0] == 'A') {
                if (buf[1] == 'B') {
                        all_match();
                        return 0;
                } else {
                        one_match();
                }
        }
        puts("Not good");
        return 0;        
}

Given the string "AX", the program calls one_match() and then prints "Not good". Given the string "AB", it calls all_match(). The two inputs go through different execution paths, and I want to visualize that difference in a readable way.

`console`


$ cat test1.in test2.in
AX
AB
$ ./test test1.in 
One match
Not good
$ ./test test2.in 
Accepted!

angr+Bingraphvis

There are two main things I want to do this time.

- Record execution traces for each of the two inputs
- Display the difference between the obtained traces on the CFG in some way

This time I used angr and Bingraphvis. Bingraphvis is the core library behind angr-utils, which visualizes CFGs generated by angr. It handles CFG node operations, transformations, and plotting. Using it together with angr's QEMURunner (tracer), the tool does the following:

1. Record a trace for each of the two inputs with QEMURunner
2. Generate the CFG from the main function onward with CFGEmulated
3. Color the CFG nodes that correspond to the trace difference with Bingraphvis
4. Save the CFG as a PNG

I didn't use the angr-utils wrapper because I wanted to define and use my own CFG variant that carries the trace diff. If you run the tool against the program mentioned earlier, it produces the image below.
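The core of the diff step can be sketched without angr at all. The snippet below is a minimal illustration, assuming each trace is simply a list of basic-block addresses. The function name `classify_trace_diff` and the sample addresses are hypothetical and are not taken from the actual script.

```python
def classify_trace_diff(trace_a, trace_b):
    """Split the addresses of two traces into only-A, only-B and common sets.

    Order and repetition inside a trace are ignored; only the set of
    visited addresses matters for coloring CFG nodes.
    """
    seen_a, seen_b = set(trace_a), set(trace_b)
    return {
        "only_a": seen_a - seen_b,   # visited only with the first input
        "only_b": seen_b - seen_a,   # visited only with the second input
        "common": seen_a & seen_b,   # visited with both inputs
    }


# Hypothetical basic-block addresses, not taken from a real run.
trace_ax = [0x1000, 0x1010, 0x1030, 0x1040]
trace_ab = [0x1000, 0x1010, 0x1020]
diff = classify_trace_diff(trace_ax, trace_ab)
```

Working with sets keeps the comparison cheap even when a trace has millions of entries, at the cost of discarding how often and in which order each block was executed.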

`console`


$ python3 input-tracer.py -b ./test -i test1.in,test2.in -v
[+] Opening binary ./test
[+] CFG Cache found 
CFG size = 46
[+] Tracing .... test1.in
Size: 46079
[+] Tracing .... test2.in
Size: 46033
[+] CFG processing ....
Graph len= 30
[+] Complete! Saved to outd/input_trace_test_entire.png

Red marks the path taken with test1.in, and blue marks the path taken with test2.in. Nodes common to both are black.
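To make the color scheme concrete, here is a small sketch that emits Graphviz DOT text with the same red/blue/black convention. It does not use Bingraphvis. The helpers `node_color` and `to_dot` and the addresses are hypothetical, and lumping unvisited nodes in with the black ones is a simplifying assumption.

```python
def node_color(addr, only_a, only_b):
    """Pick a color for one CFG node from the two exclusive address sets."""
    if addr in only_a:
        return "red"    # reached only with the first input
    if addr in only_b:
        return "blue"   # reached only with the second input
    return "black"      # common (or, in this sketch, unvisited)


def to_dot(edges, only_a, only_b):
    """Render a list of (src, dst) address pairs as colored DOT text."""
    nodes = sorted({n for edge in edges for n in edge})
    lines = ["digraph cfg {"]
    for n in nodes:
        lines.append(f'  "{n:#x}" [color={node_color(n, only_a, only_b)}];')
    for src, dst in edges:
        lines.append(f'  "{src:#x}" -> "{dst:#x}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-node CFG: one branch per input.
dot_text = to_dot([(0x1000, 0x1020), (0x1000, 0x1030)],
                  only_a={0x1030}, only_b={0x1020})
```

The resulting text can be passed to the `dot` command to render an image.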

Plot by function using CFGFast

If you do the above for a program like a UNIX utility, you end up plotting a huge CFG (5,000 nodes or more) and generating an image tens of thousands of pixels across. Naturally, the image viewer cannot display it and crashes, which defeats the purpose of visualization. It is possible to display only the nodes that differ instead of the entire CFG (by dropping -v from the command above), but even the difference alone can be very large. Therefore, I also added a feature that splits the plot into one image per function of the program.
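Restricting the plot to the differing nodes amounts to taking a subgraph. The following is a rough sketch of that idea, assuming the CFG is available as a list of (source, destination) address pairs. `diff_only_subgraph` is a hypothetical helper, and the actual script may decide which edges to keep differently.

```python
def diff_only_subgraph(edges, only_a, only_b):
    """Keep only nodes that differ between the traces, plus edges among them."""
    graph_nodes = {n for edge in edges for n in edge}
    # Ignore trace addresses that do not appear in the CFG at all.
    kept_nodes = graph_nodes & (set(only_a) | set(only_b))
    kept_edges = [(s, d) for s, d in edges if s in kept_nodes and d in kept_nodes]
    return kept_nodes, kept_edges


# Hypothetical CFG: 0x1000 is common, the other nodes differ.
edges = [(0x1000, 0x1020), (0x1000, 0x1030), (0x1030, 0x1040)]
nodes, sub_edges = diff_only_subgraph(edges,
                                      only_a={0x1030, 0x1040},
                                      only_b={0x1020, 0x9999})
```

Dropping the common nodes shrinks the image, but it also removes the branch points that connect the two paths, which is one reason a per-function plot can be easier to read.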

1. Record a trace for each of the two inputs with QEMURunner
2. Get the list of functions defined in the binary with angr's CFGFast
3. Perform the same processing as above for each function

This mode is enabled with the -f option.
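One way to think about the per-function split is to bucket the traced addresses by the function that contains them. The sketch below assumes you already have a mapping from function name to start address and that each address belongs to the nearest function starting at or below it. That is a simplification, and `group_by_function` is a hypothetical helper rather than part of angr or of the script.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_function(trace, functions):
    """Bucket traced addresses by function.

    `functions` maps a function name to its start address. An address is
    assigned to the function with the greatest start address <= address;
    addresses below every function start are dropped.
    """
    starts = sorted((addr, name) for name, addr in functions.items())
    start_addrs = [addr for addr, _ in starts]
    grouped = defaultdict(set)
    for addr in trace:
        i = bisect_right(start_addrs, addr) - 1
        if i >= 0:
            grouped[starts[i][1]].add(addr)
    return dict(grouped)


# Function start addresses as they appear in the log output;
# the traced addresses themselves are made up for illustration.
functions = {"_init": 0x8049CD8, "sub_8049d0c": 0x8049D0C}
trace = [0x8049CD8, 0x8049CE0, 0x8049D10, 0x100]
per_function = group_by_function(trace, functions)
```

Each bucket can then be diffed and plotted on its own, which keeps every image small enough to open.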

$ python3 ./input-tracer.py -b mp3_player -i invalid.mp3,1.mp3 -f                              
[+] Opening binary mp3_player                                 
[+] Searching for all the functions (using CFGFast)                             
100% |#####################################| Elapsed Time: 0:00:02 Time: 0:00:02
   ==> 106 functions to process.                                                
[+] Tracing .... invalid.mp3                                                    
Size: 1305732                                                                   
[+] Tracing .... 1.mp3                                                          
Size: 6084333                                                                   
[+] CFG processing ....                                                         
[+](0/106) Computing Accurate CFG for function _init (0x8049cd8)               
[+] CFG Cache found                                                             
Graph len= 0                                                                    
[+] Complete! Saved to outd/input_trace_mpg321-0.3.0__init.png                  
[+](1/106) Computing Accurate CFG for function sub_8049d0c (0x8049d0c)         
[+] CFG Cache found

Here I gave the player clearly invalid MP3 data (for example, "ABCD") and valid MP3 data. The CFG difference for each function is written to the outd directory.

For example, the difference for the MP3 player's calc_length function looks like this:

input_trace_mpg321-0.3.0_calc_length.png

Source code

The code is available here: https://gist.github.com/RKX1209/3cb60b0fa0ba92da6575716680f32aa0

[PYTHON] Try to create an execution path diff viewer with angr + bingraphvis
