perfapm-server is a bare-metal application that executes on RPU-1. The firmware binary is loaded by the APU master at the end of the Linux boot process. RPU-1 and the APU establish a communication channel using the OpenAMP framework. RPU-1 gathers performance data, such as memory throughput, from the PS AXI performance monitor (APM) units and sends it to the APU, where it is received by the perfapm-client library and then visualized on a plotted graph.
Create a new Vitis workspace.
% cd $TRD_HOME/workspaces/ws_perfapm-server
% vitis -workspace . &
Make sure that perfapm, perfapm-server, and perfapm-server_system are selected. Click 'Finish'.
Right-click the perfapm-server system and select 'Build Project'.
Copy the generated perfapm-server executable to the dm4 SD card directory.
% mkdir -p $TRD_HOME/sd_card/dm4
% cp perfapm-server/Debug/perfapm-server.elf $TRD_HOME/sd_card/dm4/
The perfapm-client-test application receives performance counter values from RPU-1 and prints them to UART-0. By default, it is built as part of the meta-user layer of the PetaLinux BSP. The corresponding Yocto recipe and source files are located at $TRD_HOME/petalinux/bsp/project-spec/meta-user/recipes-apps/perfapm-client and the generated binary is located at /usr/bin/perfapm-client-test on the target rootfs.
Source the settings64.sh script before executing the steps below. This adds the ARM cross-compile toolchain to your PATH and sets the XILINX_SDX environment variable.
Copy and extract the source files into a new workspace.
% mkdir -p $TRD_HOME/workspaces/ws_perfapm-client
% cd $TRD_HOME/workspaces/ws_perfapm-client
% cp $TRD_HOME/petalinux/bsp/project-spec/meta-user/recipes-apps/perfapm-client/files/perfapm-client.zip .
% unzip perfapm-client.zip
% mkdir build work
Configure the project using cmake and generate Eclipse project files. Build the project using make from the command line.
The build relies on the SDKTARGETSYSROOT environment variable, which contains the target and host sysroot for building this application. This requires completing the PetaLinux SDK installation step described in Design Module 5.
% cd build
% CC=aarch64-linux-gnu-gcc CXX=aarch64-linux-gnu-g++ \
  cmake -G"Eclipse CDT4 - Unix Makefiles" -DCMAKE_ECLIPSE_EXECUTABLE=${XILINX_SDX}/eclipse/lnx64.o/eclipse \
  ../src
% make -j
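The command-line build depends on state set up in earlier steps: the cross-compilers on PATH, XILINX_SDX, and SDKTARGETSYSROOT. A minimal preflight sketch, assuming a POSIX shell, could report what is missing before cmake is invoked. The function name check_build_env is hypothetical and not part of the TRD.

```shell
#!/bin/sh
# Hypothetical helper (not part of the TRD): report which prerequisites
# for the cmake build are missing. Returns 0 only if all are present.
check_build_env() {
    missing=0
    # Environment variables named in this section.
    for var in XILINX_SDX SDKTARGETSYSROOT; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $var" >&2
            missing=1
        fi
    done
    # Cross-compilers passed to cmake through CC and CXX.
    for tool in aarch64-linux-gnu-gcc aarch64-linux-gnu-g++; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_build_env before the cmake step prints one line per missing prerequisite to stderr and returns a non-zero status if any check fails.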
Alternatively you can build the project through the XSDK GUI.
% cd ../work
% xsdk -workspace . &
Browse to the $TRD_HOME/workspaces/ws_perfapm-client/build directory and make sure the listed project is selected. Click 'Finish'.
Copy the generated perfapm-client-test executable to the dm4 SD card directory.
% cp $TRD_HOME/workspaces/ws_perfapm-client/build/perfapm-client-test/perfapm-client-test $TRD_HOME/sd_card/dm4
Select the device tree matching design module 4 and build all Linux image components. If you have run petalinux-build in a previous module, the build step will be incremental.
% cd $TRD_HOME/petalinux/bsp/project-spec/meta-user/recipes-bsp/device-tree/files/
% cp zcu102-base-dm4.dtsi system-user.dtsi
% petalinux-build
Create a boot image.
% cd $TRD_HOME/petalinux/bsp/images/linux
% petalinux-package --boot --bif=../../project-spec/boot/dm4.bif --force
Copy the generated images to the dm4 SD card directory.
% cp BOOT.BIN image.ub $TRD_HOME/sd_card/dm4
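At this point the dm4 staging directory should hold the four files this section copies into it: perfapm-server.elf, perfapm-client-test, BOOT.BIN, and image.ub. The sketch below checks for them before the directory is transferred to an SD card. The function name check_sd_card_dir is hypothetical, and the check deliberately ignores any other files that may already be in the directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of the TRD): check that the staging
# directory contains every file this section copies into it.
check_sd_card_dir() {
    dir=$1
    status=0
    for f in perfapm-server.elf perfapm-client-test BOOT.BIN image.ub; do
        # -s is true only for files that exist and are non-empty.
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}
```

A typical call would be check_sd_card_dir "$TRD_HOME/sd_card/dm4", which returns a non-zero status and names each absent file on stderr.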
Copy the contents of the $TRD_HOME/sd_card/dm4 SD card directory to a FAT-formatted SD card.
Run the perfapm-client-test application:
% perfapm-client-test
Below is a sample output of the application on the serial console:
|----------------------------------------------------------------------|
|                       Performance Monitor APP                        |
|----------------------------------------------------------------------|
|Slot        |Write Byte Cnt   |Read Byte Cnt    |Total RW Byte Cnt    |
|----------------------------------------------------------------------|
|DDR Slot1   | 62614           | 231056          | 293670              |
|DDR Slot2   | 70966           | 327328          | 398294              |
|DDR Slot3   | 0               | 994784128       | 994784128           |
|DDR Slot4   | 0               | 0               | 0                   |
|DDR Slot5   | 0               | 0               | 0                   |
|OCM APM     | 64              | 0               | 64                  |
|LPD_FPD     | 1472            | 69728           | 71200               |
|----------------------------------------------------------------------|
DDRAPM_SLOT_DP+HP0 throughput = 7.958273 gigabits/sec
DDRAPM_SLOT_HP1+HP2 throughput = 0.000000 gigabits/sec
DDRAPM_SLOT_HP3+FPDDMA throughput = 0.000000 gigabits/sec
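The first throughput figure in the sample output is consistent with converting the DDR Slot3 byte count to gigabits per second: 994784128 bytes x 8 / 10^9 is approximately 7.958273. The sketch below reproduces that arithmetic. It assumes a one-second sampling interval and decimal gigabits, both inferred from the sample numbers rather than stated by the application, and bytes_to_gbps is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count sampled over `interval`
# seconds (default 1, an assumption) to gigabits per second.
bytes_to_gbps() {
    awk -v bytes="$1" -v interval="${2:-1}" \
        'BEGIN { printf "%.6f\n", bytes * 8 / 1e9 / interval }'
}

# DDR Slot3 total from the sample output; prints 7.958273
bytes_to_gbps 994784128
```

The same conversion applied to the zero byte counts of DDR Slot4 and DDR Slot5 yields 0.000000, matching the remaining two throughput lines.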