The program I wrote doesn't perform as well as I expected! I don't know where the bottleneck is, so I want to find out!
In such a case, use a profiling tool.
Download the source code from the link to the download page at the bottom of http://goog-perftools.sourceforge.net/. The latest version as of October 11, 2014 is gperftools-2.2.1.tar.gz. After extracting the archive, install with ./configure, make, make install as usual.
Although it is not required if you only use google-perftools from the command line, google-perftools can also create an easy-to-read diagram called a call graph. Since this diagram is generated in dot format, it is a good idea to also install Graphviz, a tool that converts dot files to formats such as eps.
Download the source code from the Download page at http://www.graphviz.org/. Extract it and run ./configure, make, make install in the same way. When installing from source, ./configure is likely to complain because there are many dependent libraries. If you just want to convert the dot file generated by google-perftools to eps, most of them are not needed, so ignore the complaints and go ahead with make and make install.
Add the lib directory where google-perftools is installed to LD_LIBRARY_PATH. Put this in ~/.bash_profile or a similar file.
export LD_LIBRARY_PATH=/home/tanaka/lib:$LD_LIBRARY_PATH
Link libprofiler.so when compiling the program you want to profile.
$ g++ -o hoge.exe hoge.cpp -g -lprofiler
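For reference, here is a minimal sketch of the kind of code that could stand in for hoge.cpp. It is not the program I actually profiled; the function name write_small_chunks and the one-byte-per-call pattern are assumptions made up for illustration, chosen so that most of the time goes into write(2).

```cpp
// Hypothetical stand-in for hoge.cpp: a workload dominated by small write(2) calls.
#include <fcntl.h>
#include <unistd.h>

// Writes `count` one-byte chunks to `path`, issuing one write(2) call per byte.
// Returns the number of bytes written, or -1 if the file cannot be opened
// or a write fails.
long write_small_chunks(const char* path, long count) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    long written = 0;
    const char byte = 'x';
    for (long i = 0; i < count; ++i) {
        ssize_t n = write(fd, &byte, 1);  // deliberately unbuffered
        if (n != 1) {
            close(fd);
            return -1;
        }
        written += n;
    }
    close(fd);
    return written;
}
```

Calling something like write_small_chunks("hoge.out", 1000000) from main and building with the command above should give a program whose profile is likely to be dominated by write, though the exact numbers will depend on the machine.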
Run the program with the name of the output profile file specified in CPUPROFILE.
$ export CPUPROFILE=prof.out; ./hoge.exe
PROFILE: interrupts/evictions/bytes = xxx/x/xxxx
This generates prof.out. Pass the original program and the profile file to pprof to display the results, then look at the functions at the top, which are the ones taking the most execution time.
$ pprof hoge.exe prof.out
Using local file prof.out.
Welcome to pprof! For help, type 'help'.
(pprof) top
Total: 355 samples
286 80.6% 80.6% 286 80.6% __write_nocancel
16 4.5% 85.1% 16 4.5% __read_nocancel
14 3.9% 89.0% 17 4.8% __lseek_nocancel
...
The second column from the left is the percentage of execution time spent in that function. __write_nocancel is the function that write(2) ultimately calls, so we can tell that write(2) is the bottleneck in this program.
You can also create a call graph, which shows each function's share of execution time by the size of its node and lays the functions out like a flow chart in calling order, as follows.
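As far as I understand the text output of pprof, the columns are, from the left: the number of samples in the function itself, that number as a percentage of the total, the running sum of those percentages down to the current row, the number of samples in the function plus its callees, that number as a percentage, and the function name. The percentages can be checked by hand. The helper below is only a sketch for that arithmetic; percent_of_total is a name I made up and is not part of pprof.

```cpp
#include <cmath>

// Share of `samples` in `total`, as a percentage rounded to one decimal place.
// Returns 0.0 when `total` is not positive.
double percent_of_total(long samples, long total) {
    if (total <= 0) return 0.0;
    return std::round(1000.0 * samples / total) / 10.0;
}
```

With the listing above, percent_of_total(286, 355) gives 80.6, which matches the second column of the first row, and percent_of_total(286 + 16, 355) gives 85.1, which matches the third column of the second row.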
$ pprof --dot hoge.exe prof.out > prof.dot
$ dot -T eps prof.dot > prof.eps
There are other commonly used profiling tools as well. If you can use yum or apt-get (and have permission to do so), perf or oprofile may be easy to install and use. (When I tried to install perf from source into my home directory, I got stuck at configure or make and gave up.)
On Linux kernel 2.6.31 or later, I feel that google-perftools is easy to use in that you are less likely to stumble when installing it from source.