[PYTHON] Store node structure in C++ from NetworkX file format

A graph is a general-purpose data structure made up of vertices (nodes) and the edges that connect them. Familiar examples such as friend networks and linked web pages can be represented as graphs.

Today, Python's NetworkX is a very useful tool for working with graphs. It is easy to write and easy to read, so if you want to work with graphs, it is a good first thing to try.

- [Python] Summary of basic usage of NetworkX 2.0
- Introduction to NetworkX in Python

However, NetworkX has one drawback: it becomes very heavy (slow) when handling large graphs.

Therefore, this article shows how to build the graph structure in C++, which runs much faster, while keeping the same file format you would use to construct a graph in NetworkX.

This article assumes that you can write C++. NetworkX is mentioned, but you can follow along without knowing it.


NetworkX node format

In the file format used with NetworkX, each line describes one edge: the ids of its two vertices, separated by a space.

The file looks like this.

facebook_combined.txt


0 1
0 2
0 3
0 4
0 5
0 6
︙

The format is (start node id) (end node id).
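As a minimal sketch of what this format means in C++, one line can be turned into a pair of integers with stream extraction. The names `Edge` and `parse_edge` are hypothetical helpers introduced only for illustration; they are not part of NetworkX or of the code later in this article.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type: one edge read from a line such as "0 1".
struct Edge {
    int from;  // start node id
    int to;    // end node id
};

// Parses "<start id> <end id>" into `edge`.
// Returns false when the line does not start with two integers.
bool parse_edge(const std::string& line, Edge& edge) {
    std::istringstream stream(line);
    return static_cast<bool>(stream >> edge.from >> edge.to);
}
```

Calling `parse_edge("0 1", e)` fills `e.from` with 0 and `e.to` with 1, while an empty line or a line with a single id is reported as a failure instead of being stored.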

NetworkX can load this kind of node information in a single step, but C++ requires more work. The actual procedure is:

  1. Read the file
  2. Split each line into strings
  3. Store the values in a vector

I will explain each step in order.

Read file

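	// Headers needed by the snippets: <fstream>, <iostream>, <sstream>, <string>, <vector>
	// They assume `using namespace std;` and run inside main(int argc, char* argv[])
	string path = argv[1];          // path to the edge list file
	int num_nodes = stoi(argv[2]);  // number of nodes (ids are 0 .. num_nodes-1)
	ifstream ifs(path);

	// Adjacency list: nodes[i] holds the ids of the nodes that node i points to
	vector<vector<int>> nodes(num_nodes);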

String split function

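// Splits `input` at every `delimiter` and returns the pieces in order
vector<string> split(const string& input, char delimiter){
    istringstream stream(input);
    string field;
    vector<string> result;
    // getline with a delimiter reads one field at a time
    while (getline(stream, field, delimiter)) {
        result.push_back(field);
    }
    return result;
}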

This is a string split function of the kind commonly written in C++. It takes the target string and the delimiter character as arguments, splits the string at each delimiter, and returns the pieces as a vector.
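One behavior worth knowing: because the function splits at every delimiter, two spaces in a row produce an empty field, and passing an empty string to `stoi` throws an exception. The sketch below repeats `split` so that it compiles on its own and adds `split_nonempty`, a hypothetical variant (not part of the original code) that drops empty fields.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Same logic as the split function above, repeated so this sketch compiles alone.
std::vector<std::string> split(const std::string& input, char delimiter) {
    std::istringstream stream(input);
    std::string field;
    std::vector<std::string> result;
    while (std::getline(stream, field, delimiter)) {
        result.push_back(field);
    }
    return result;
}

// Hypothetical variant: skips empty fields so repeated delimiters are tolerated.
std::vector<std::string> split_nonempty(const std::string& input, char delimiter) {
    std::vector<std::string> result;
    for (const std::string& field : split(input, delimiter)) {
        if (!field.empty()) {
            result.push_back(field);
        }
    }
    return result;
}
```

With the input `"0  1"` (two spaces), `split` returns three fields with an empty one in the middle, whereas `split_nonempty` returns just `"0"` and `"1"`. Whether this matters depends on how cleanly the input file is formatted.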

Read the file line by line, split and then assign to vector

assignment


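	string str;
	int from, to;

	while (getline(ifs, str)) {
		// Split the line on spaces: "0 1" -> {"0", "1"}
		vector<string> strvec = split(str, ' ');
		if (strvec.size() < 2) continue;  // skip blank or malformed lines
		from = stoi(strvec.at(0));  // start node id
		to = stoi(strvec.at(1));    // end node id
		// Record the edge from -> to; at() throws if from is not below num_nodes
		nodes.at(from).push_back(to);
	}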

The getline function reads the file line by line, and the split function breaks each line at the space. The resulting start node id and end node id are then stored in the vector: the end node id is appended to the list that belongs to the start node. In this way, the structure of the graph is held in the vector.
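The steps above can also be packaged into one function. The following is a sketch under a few assumptions rather than the article's own method: `load_adjacency` is a hypothetical name, it parses each line with stream extraction (`>>`) instead of the split function, and it grows the vector on demand so that the number of nodes does not have to be passed in advance.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds the adjacency list from any input stream.
std::vector<std::vector<int>> load_adjacency(std::istream& in) {
    std::vector<std::vector<int>> nodes;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int from = 0;
        int to = 0;
        if (!(fields >> from >> to) || from < 0 || to < 0) {
            continue;  // skip blank, malformed, or negative-id lines
        }
        // Make sure both endpoints have a slot, even if `to` has no out-edges.
        const std::size_t needed =
            static_cast<std::size_t>(std::max(from, to)) + 1;
        if (nodes.size() < needed) {
            nodes.resize(needed);
        }
        nodes[from].push_back(to);
    }
    return nodes;
}
```

Because the function takes a `std::istream&`, it can be given an `ifstream` opened on the edge list file or a `std::istringstream` holding a few lines for a quick check. Growing the vector trades a little resizing work for not needing the node count up front.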

Verification

	for (int i = 0; i < num_nodes; i++) {
		// Print "<node id>-><comma-separated neighbor ids>"
		cout << i << "->";
		for (size_t j = 0; j < nodes[i].size(); j++) {
			cout << nodes[i][j];
			if (j + 1 != nodes[i].size()) cout << ",";  // no trailing comma
		}
		cout << endl;
	}

result

1->48,53,54,73,88,92,119,126,133,194,236,280,299,315,322,346
2->20,115,116,149,226,312,326,333,343
3->9,25,26,67,72,85,122,142,170,188,200,228,274,280,283,323
4->78,152,181,195,218,273,275,306,328
︙
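Note that each row lists only the edges that start at that node: the row for node 1 does not contain 0, even though the file has the line `0 1`. NetworkX's `read_edgelist` builds an undirected graph unless told otherwise, so if the same interpretation is wanted in C++, each edge can be stored in both directions. A minimal sketch, where `add_undirected_edge` is a hypothetical helper and both ids are assumed to be valid indices:

```cpp
#include <vector>

// Hypothetical helper: records an edge in both directions.
// Assumes 0 <= from, to < nodes.size().
void add_undirected_edge(std::vector<std::vector<int>>& nodes, int from, int to) {
    nodes[from].push_back(to);
    if (from != to) {
        nodes[to].push_back(from);  // a self-loop is stored only once
    }
}
```

Replacing the single `push_back` in the reading loop with a call to this helper would make node 1's row include 0 as well. This doubles the memory used for edges, so it is only worth doing when the graph is meant to be undirected.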

In this way, you can check the edges stored for each node. That is all for this article.
