In embedded development, resource constraints often mean that the executables installed on the actual device have had their debug symbols stripped.
As a result, even if you run `perf top`, functions generally cannot be resolved by name and are displayed as raw addresses.
For such environments, `perf` provides an offline analysis mechanism: profile data saved on the device is analyzed on a PC together with the debug symbols.
This article is a memo of the offline analysis workflow with `perf` in a cross environment, using QEMU as an example.
The environment used this time is as follows.
Use `buildroot` to prepare a QEMU environment for `aarch64`.
(Buildroot has provided QEMU defconfigs for some time, which makes the build easy. Thank you.)
wget https://buildroot.org/downloads/buildroot-2020.02.3.tar.bz2
tar xf buildroot-2020.02.3.tar.bz2
cd buildroot-2020.02.3/
cat <<EOS >> configs/qemu_aarch64_virt_defconfig
BR2_ENABLE_DEBUG=y
BR2_PACKAGE_LINUX_TOOLS_PERF=y
EOS
cat <<EOS >> board/qemu/aarch64-virt/linux.config
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KERNEL=y
EOS
make qemu_aarch64_virt_defconfig
make BR2_JLEVEL=$(nproc)
export PATH="$PWD/output/host/bin/:$PATH"
I have enabled some configs for this experiment.
- Buildroot config
  - `BR2_ENABLE_DEBUG`: Enables debug symbols when building each package. The binaries with debug symbols are saved under `staging`.
  - `BR2_PACKAGE_LINUX_TOOLS_PERF`: Adds the `perf` command.
- Linux config
  - `CONFIG_DEBUG_INFO`: Builds vmlinux with debug symbols.
  - `CONFIG_DEBUG_KERNEL`: Enables kernel debugging features. Required in order to enable `CONFIG_DEBUG_INFO`.
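Before booting, it can be worth confirming that the options listed above actually ended up in the generated configuration files. The following is a minimal sketch, not part of the original procedure: `check_opts` is a hypothetical helper name, and the `.config` locations in the usage comment assume an in-tree Buildroot build with the kernel version used in this article.

```shell
# check_opts FILE OPT...: report whether each option is set to "y" in FILE.
# Returns non-zero if at least one option is not enabled.
# (Hypothetical helper; plain POSIX sh, so its variables are not local.)
check_opts() {
    file=$1
    shift
    rc=0
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$file"; then
            echo "ok: $opt"
        else
            echo "missing: $opt"
            rc=1
        fi
    done
    return "$rc"
}

# Usage from the Buildroot top directory after the build (paths are assumptions):
#   check_opts .config BR2_ENABLE_DEBUG BR2_PACKAGE_LINUX_TOOLS_PERF
#   check_opts output/build/linux-4.19.91/.config CONFIG_DEBUG_INFO CONFIG_DEBUG_KERNEL
```

If an option is reported as missing, the defconfig fragment was probably appended after the configuration had already been generated, and re-running `make qemu_aarch64_virt_defconfig` may be needed.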
As described in readme.txt, the built virtual environment can be launched with the following command.
qemu-system-aarch64 -M virt -cpu cortex-a53 -nographic -smp 1 -kernel output/images/Image -append "rootwait root=/dev/vda console=ttyAMA0" -netdev user,id=eth0 -device virtio-net-device,netdev=eth0 -drive file=output/images/rootfs.ext4,if=none,format=raw,id=hd0 -device virtio-blk-device,drive=hd0
perf record: Get profile information
You can get profile information with the `perf record` command.
What is recorded can be changed in many ways through options. Here, the `-a` option is used to profile the entire system.
In addition, backtraces are recorded with the `--call-graph fp` option.
`perf` supports several methods for collecting backtraces; here we specify the one that uses the frame pointer.
[On Target (QEMU)]
perf record -a --call-graph fp
When the measurement is finished, press Ctrl-C to stop recording.
`perf.data` is then saved in the current directory, so transfer it to the host PC for analysis.
There are various ways to do this, such as a network transfer, but since this is QEMU, I will read it directly from the disk image.
[On Host PC]
cd output/images
mkdir -p mnt
sudo mount -t ext4 rootfs.ext2 mnt
sudo cp mnt/root/perf.data perf.data
sudo umount mnt
cd ../../
Since a profile with no load is not very interesting, I ran `seq` during the measurement for the time being.
[On Target (QEMU)]
seq 100000 > /dev/null
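As an alternative to stopping the recording by hand with Ctrl-C, `perf record` can be given a command and will stop when that command exits. The sketch below wraps this in a small function; `record_for` and the `PERF` override variable are hypothetical names introduced only for illustration, and the recording options are the same ones used above.

```shell
# record_for SECS [OUTPUT]: profile the whole system for SECS seconds.
# PERF can be overridden, for example PERF=echo to print the command instead
# of running it. (Hypothetical helper; not part of perf itself.)
record_for() {
    secs=$1
    out=${2:-perf.data}
    # With -a and a trailing command, recording covers the whole system
    # and ends when the command (here: sleep) exits.
    "${PERF:-perf}" record -a --call-graph fp -o "$out" -- sleep "$secs"
}

# Usage on the target: start the load in the background, then record for 10 s.
#   seq 100000 > /dev/null &
#   record_for 10 /root/perf.data
```

Bounding the measurement this way makes runs easier to repeat, since the load and the recording window are started from the same script.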
perf report: Analysis of acquired profile information
Kernel symbols can be loaded with the `-k` option and user-space symbols with `--symfs`.
As far as I can tell, kernel symbols and user-space symbols cannot be loaded at the same time.
It is a bit of a hassle, but you seem to need to switch between them each time.
[On Host PC]
perf report -i output/images/perf.data -k output/build/linux-4.19.91/vmlinux
You can see that `arch_cpu_idle` is where time is being spent in the kernel. `[k]` indicates kernel space and `[.]` indicates user space.
User-space names cannot be resolved in this view and are displayed as addresses.
[On Host PC]
perf report -i output/images/perf.data --symfs output/staging/
You can see that the `printf` called from `seq_main` of the `busybox` binary, which was started as `seq`, is taking a long time.
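Because the kernel view and the user-space view need separate invocations, a small wrapper can make switching between them less tedious. This is only a sketch: `report_with` and the `PERF`, `PERF_DATA`, `VMLINUX`, and `SYMFS` variables are hypothetical names, and their defaults simply repeat the paths used in this article.

```shell
# Default paths follow the layout used in this article; override as needed.
PERF_DATA=${PERF_DATA:-output/images/perf.data}
VMLINUX=${VMLINUX:-output/build/linux-4.19.91/vmlinux}
SYMFS=${SYMFS:-output/staging/}

# report_with kernel|user [extra perf report options...]
# Runs perf report with either kernel symbols (-k) or user-space symbols (--symfs).
report_with() {
    mode=$1
    shift
    case $mode in
        kernel) "${PERF:-perf}" report -i "$PERF_DATA" -k "$VMLINUX" "$@" ;;
        user)   "${PERF:-perf}" report -i "$PERF_DATA" --symfs "$SYMFS" "$@" ;;
        *)      echo "usage: report_with kernel|user [options]" >&2
                return 2 ;;
    esac
}

# Usage on the host, from the Buildroot top directory:
#   report_with kernel
#   report_with user --stdio
```

Extra arguments are passed through unchanged, so options such as `--stdio` can be added to either view.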
Pressing `a` here shows where time was spent at the instruction/line level (equivalent to `perf annotate`).
Outside a cross environment, you can also display the corresponding line number and file name with `-F srcline` or `-F srcfile`, but I could not display them in this environment, probably because `addr2line` is not called internally. However, running `addr2line` over the entire profile takes a long time and is not practical anyway.
I think the realistic approach is to call `addr2line` separately only when you actually need a line number or file name (that is my excuse).
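Resolving a single address by hand could look roughly like the following. This is a sketch under assumptions: `resolve_addr` is a hypothetical helper, the default tool name `aarch64-linux-addr2line` assumes that the Buildroot host toolchain in `output/host/bin` provides a short-prefixed binutils alias (the exact prefix depends on the toolchain), and the binary path and address in the usage comment are placeholders.

```shell
# resolve_addr BINARY ADDR...: print the function name and file:line for each
# address in an unstripped binary.
# ADDR2LINE can be overridden if the cross toolchain uses a different prefix.
resolve_addr() {
    bin=$1
    shift
    # -f: also print function names, -C: demangle, -e: binary to inspect
    "${ADDR2LINE:-aarch64-linux-addr2line}" -f -C -e "$bin" "$@"
}

# Usage on the host (binary path and address are placeholders):
#   resolve_addr path/to/unstripped/busybox 0x12345
```

The address passed in has to be the one relative to the binary, as shown in the `perf report` output for that object.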