[PYTHON] Visualization of the connection between malware and the callback server

About this article

This article walks through visualizing the connections between malware samples and their callback servers, as practice with networkx.

Environment

Kali Linux, Python 2.7.17

Rough implementation flow

  1. Set the command-line arguments
  2. Load the suffix list
  3. Define the hostname extraction function
  4. Scan the target directory
  5. Build the network

Load the required libraries

import pefile
import sys
import argparse
import os
import pprint
import networkx
import re
from networkx.drawing.nx_agraph import write_dot
import collections
from networkx.algorithms import bipartite

Setting command line arguments

args = argparse.ArgumentParser()
args.add_argument("target")
args.add_argument("filename")
args.add_argument("malware_pro")
args.add_argument("hostname_pro")
args = args.parse_args()
network = networkx.Graph()

Suffix organization

suffixes = map(lambda string: string.strip(), open("suffixes.txt"))
suffixes = set(suffixes)

The two lines above load the suffixes from suffixes.txt using map and a lambda expression, then convert them to a set. You could create suffixes.txt yourself, but that is tedious, so this time I borrow the one from Malware Data Science. The malware samples passed to the target argument are also borrowed. Each line goes through strip() because the trailing newline in suffixes.txt would get in the way. Note that in Python 3, map returns a lazy map object (an iterator) instead of a list.
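As a rough illustration of the same step written so that it behaves the same on Python 2.7 and Python 3, the sketch below builds the set directly from the lines. The helper name load_suffixes and the in-memory stand-in for suffixes.txt are assumptions made for this example, and skipping blank lines is an addition the original code does not make.

```python
import io


def load_suffixes(lines):
    """Strip surrounding whitespace (including the newline) and collect into a set."""
    # A generator expression is consumed immediately by set(), so it does not
    # matter whether the Python version is lazy or eager here.
    return set(line.strip() for line in lines if line.strip())


# Hypothetical stand-in for open("suffixes.txt"); the real file is much longer.
fake_file = io.StringIO(u"com\nnet\n\norg\n")
demo_suffixes = load_suffixes(fake_file)
print(sorted(demo_suffixes))
```

With the real file, the call would be load_suffixes(open("suffixes.txt")).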

def get_hostnames(string):
    tmp = re.findall(r'(?:[a-zA-Z0-9](?:[a-zA-Z0-9\-]{,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}',string)
    hostnames = filter(lambda hostname: hostname.split(".")[-1].lower() in suffixes, tmp)
    return hostnames

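The regular expression picks out strings that look like domain names. Each candidate is split on ".", the last element is lowercased, and filter with a lambda expression keeps only the candidates whose last element appears in suffixes. In Python 3, filter also returns an iterator rather than a list, so a later len(hostnames) call would fail unless the result is wrapped in list().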
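Here is a small sketch of the same function that returns a real list on both Python 2.7 and Python 3. The name get_hostnames_list, the DEMO_SUFFIXES set, and the sample text are made up for this example; the regular expression is the one used above.

```python
import re

# Hypothetical, tiny suffix set; the article loads the real one from suffixes.txt.
DEMO_SUFFIXES = {"com", "net"}

HOSTNAME_RE = re.compile(
    r'(?:[a-zA-Z0-9](?:[a-zA-Z0-9\-]{,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}'
)


def get_hostnames_list(string, suffixes=DEMO_SUFFIXES):
    """Return hostname-like strings whose last label is a known suffix."""
    candidates = HOSTNAME_RE.findall(string)
    # A list comprehension always yields a list, so len() works afterwards.
    return [h for h in candidates if h.split(".")[-1].lower() in suffixes]


found = get_hostnames_list("visit evil.example.com and kernel32.dll")
print(found)
```

Here kernel32.dll matches the regular expression but is dropped because "dll" is not in the suffix set, which is exactly why the suffix filter is needed.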

for root, dirs, files in os.walk(args.target):
    for file in files:
        f_path = os.path.join(root, file)
        # Skip anything pefile cannot parse as a PE file
        try:
            pe = pefile.PE(f_path)
        except pefile.PEFormatError:
            continue
        # Extract printable strings and pick out the hostnames
        contents = os.popen("strings '{0}'".format(f_path)).read()
        hostnames = get_hostnames(contents)
        # Malware side of the bipartite network
        if len(hostnames):
            network.add_node(file, label=file, color='blue', penwidth=3, bipartite=0)
        # Hostname side of the bipartite network, linked to the file
        for hostname in hostnames:
            network.add_node(hostname, label=hostname, color='purple', penwidth=10, bipartite=1)
            network.add_edge(hostname, file, penwidth=2)
        if hostnames:
            print "Extracted hostname from:", file
            pprint.pprint(hostnames)

Walk the target directory with os.walk, which yields the current directory, its subdirectories, and its file names on each iteration of the for statement. The try block checks whether pefile can parse the file as a PE; if not, the loop moves on to the next file. The full path of the PE file is stored in f_path, and the printable strings extracted from it are passed to the get_hostnames function. After getting the hostnames, the malware and hostname nodes of the network are created: if any hostname is found, the file name is registered on one side of the bipartite network (bipartite=0), and every hostname is registered on the other side (bipartite=1), with an edge between each hostname and the file. The final if statement prints the name of the file from which hostnames were extracted to the command line at run time.
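The code above shells out to the strings command. As an alternative sketch, not what the article's code does, printable runs can be pulled out in pure Python with a regular expression, which avoids quoting problems when a file name contains a single quote. The minimum run length of 4 is an assumption chosen to resemble the usual default of strings; extract_strings and the sample bytes are made up for this example.

```python
import re

# Runs of at least 4 printable ASCII bytes (space through tilde).
PRINTABLE_RUN = re.compile(br"[\x20-\x7e]{4,}")


def extract_strings(data):
    """Return the printable ASCII runs found in a bytes object."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]


# Hypothetical bytes standing in for the contents of a PE file.
sample_bytes = b"\x00\x01MZ\x90\x00connect c2.example.com\x00ab\x00\xffGetProcAddress\x00"
runs = extract_strings(sample_bytes)
print(runs)
```

For a real file, the bytes would come from open(f_path, "rb").read(), and "\n".join(runs) could then be passed to get_hostnames.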

write_dot(network, args.filename)
codes= set(n for n,d in network.nodes(data=True) if d['bipartite']==0)
hostname = set(network)-codes

The first line writes the network to the file given by filename. The malware-side nodes, those with bipartite = 0, are stored in codes. Calling network.nodes(data=True) returns tuples consisting of a node name and a dictionary of that node's attributes. The remaining nodes, the hostnames, are stored in hostname.

codes = bipartite.projected_graph(network, codes)
hostname = bipartite.projected_graph(network, hostname)

Create a projection for the malware side and for the hostname side. A projection here is a simplification of the bipartite network onto one of its two node sets. For example, in the malware projection (codes), malware samples that share a common hostname are connected to each other.
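To make the idea of a projection concrete, here is a pure-Python sketch of the malware-side projection using plain dictionaries. It is only an illustration of the concept, not how networkx implements bipartite.projected_graph, and it returns edges only (a sample with no shared hostname simply has no edge). The function name and the sample data are hypothetical.

```python
from itertools import combinations


def project_onto_files(file_to_hosts):
    """Return the set of (file, file) pairs that share at least one hostname."""
    # Invert the mapping: hostname -> set of files that mention it.
    host_to_files = {}
    for name, hosts in file_to_hosts.items():
        for host in hosts:
            host_to_files.setdefault(host, set()).add(name)
    # Every pair of files under the same hostname becomes one edge.
    edges = set()
    for files in host_to_files.values():
        for a, b in combinations(sorted(files), 2):
            edges.add((a, b))
    return edges


# Hypothetical sample data: which hostnames were found in which file.
sample = {
    "mal_a.exe": ["c2.example.com", "update.example.net"],
    "mal_b.exe": ["c2.example.com"],
    "mal_c.exe": ["other.example.org"],
}
malware_edges = project_onto_files(sample)
print(sorted(malware_edges))
```

In this sample only mal_a.exe and mal_b.exe share a hostname, so they are the only pair that ends up connected.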

write_dot(codes ,args.malware_pro)
write_dot(hostname ,args.hostname_pro)

Write each projection to a file.

Visualization with fdp

fdp filename.dot -T png -o filename.png -Goverlap=false
fdp malware_pro.dot -T png -o malware_pro.png -Goverlap=false
fdp hostname_pro.dot -T png -o hostname_pro.png -Goverlap=false

fdp is a tool that lays out a network using a force-directed algorithm. There are other tools such as sfdp, but I will omit them this time. Running the commands above produces the following image files:

filename.png
malware_pro.png
hostname_pro.png

eog filename.png

When I open it, the network graph is displayed (image: a.png).

Done ^^

By the way, what is force-directed layout?

Edge length is a problem when laying out a network. If all nodes are weighted equally, the edges should ideally all have the same length. However, once there are four or more mutually connected nodes, it is impossible to draw every edge at the same length on a flat plane. So instead we try to minimize this distortion. This is where force-directed algorithms come in. When each edge is simulated as a spring, the springs pull and push the nodes until the distances between connected nodes are as uniform as possible. Springs are great.
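The spring idea can be tried out with a few lines of pure Python. This is a deliberately simplified sketch that only treats edges as springs and relaxes them by small steps; it is not the algorithm fdp implements, and force-directed layouts typically also add a repulsive force between nodes. The function names, starting coordinates, rest length, and step size are all assumptions chosen for this example.

```python
import math
from itertools import combinations


def spring_layout_sketch(positions, edges, rest_length=1.0, step=0.1, iterations=1000):
    """Move nodes a little along the spring forces on each iteration."""
    pos = {n: list(p) for n, p in positions.items()}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in pos}
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9  # guard against division by zero
            # Hooke's law: stretched springs pull together, compressed ones push apart.
            pull = (dist - rest_length) / dist
            force[a][0] += pull * dx
            force[a][1] += pull * dy
            force[b][0] -= pull * dx
            force[b][1] -= pull * dy
        for n in pos:
            pos[n][0] += step * force[n][0]
            pos[n][1] += step * force[n][1]
    return pos


def edge_lengths(pos, edges):
    return [math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]) for a, b in edges]


# Three mutually connected nodes: every spring can reach its rest length.
tri_nodes = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (0.5, 1.5)}
tri_edges = list(combinations(sorted(tri_nodes), 2))
tri_lengths = edge_lengths(spring_layout_sketch(tri_nodes, tri_edges), tri_edges)

# Four mutually connected nodes: some distortion always remains in the plane.
quad_nodes = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (0.5, 1.5), "d": (1.7, 1.9)}
quad_edges = list(combinations(sorted(quad_nodes), 2))
quad_lengths = edge_lengths(spring_layout_sketch(quad_nodes, quad_edges), quad_edges)

print("triangle: %s" % [round(x, 3) for x in tri_lengths])
print("four nodes: %s" % [round(x, 3) for x in quad_lengths])
```

With three nodes the springs settle at essentially equal lengths, while with four mutually connected nodes the lengths stay uneven, which is the distortion the layout tries to keep small.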
