Zynq-7000 AP SoC Spectrum Analyzer part 3 - Accelerating Software - Running ARM Library Tests Tech Tip 2014.3
Document History
Date | Version | Author | Description of Revisions
---|---|---|---
23 October 2014 | 1.0 | Faster Technology | Initial Posting - updated to 2014.3
Description/Summary
In the Tech Tip "Zynq-7000 AP SoC Spectrum Analyzer part 2 - Accelerating Software - Building ARM Neon Library Tech Tip 2014.3", a library of complex signal processing functions was obtained and built. This Tech Tip describes how to build the tests supplied with the Ne10 library, execute them on the ZC702, and review the results. It demonstrates the ease with which externally sourced software can be built using Xilinx SDK and run on the ZC702. In addition, multiple methods for accessing and controlling the ZC702 are shown, demonstrating the flexibility of the ZC702, the simple process of system debug, and the speed of system development. A subset of the supplied tests is used to indicate the level of performance improvement that might be possible using the NEON SIMD extension to the ARM processor in the Zynq-7000 AP SoC Processing System (PS).
Implementation
Implementation Details |
---|---
Design Type | PS Only
SW Type | PetaLinux
CPUs | 1 CPU - standard ZC702 frequency
PS Features | ARM Processor and NEON SIMD engine
PL Cores | None
Boards/Tools | ZC702
Xilinx Tools Version | Vivado / SDK 2014.3
Other Details | Standard ZC702 setup for console terminal and Ethernet required; HDMI display with Base TRD

Files Provided |
---|---
Ne10LibraryBuild2014dt3.zip | Saved workspace file set
Step by Step Instructions
This Tech Tip uses the workspace that resulted from building the Ne10 library in the Tech Tip "Zynq-7000 AP SoC Spectrum Analyzer part 2 - Accelerating Software - Building ARM Neon Library Tech Tip 2014.3". If that workspace is available, skip the restore instructions and go directly to the Workspace in Place heading below to start SDK with the workspace already in place.
If the workspace is not available, or if there is any doubt whether it was completed properly, the provided file "Ne10LibraryBuild2014dt3.zip" can be used to create a known good starting point for this Tech Tip.
Restoring the Workspace
Download the zip file from the Ne10LibraryBuild2014dt3.zip link.
Create an empty directory where you will be implementing this Tech Tip. To be consistent with the balance of these step by step instructions, the directory could be:
G:\Projects\ZC702_Ne10
However, these steps to import a known workspace will work with any new folder of the user's choosing.
Start -> All Programs -> Xilinx Design Tools -> Vivado 2014.3 -> Xilinx SDK 2014.3
In the Workspace Launcher, browse to and select \Projects\ZC702_Ne10 or the empty directory that you have created.
Click OK to continue.
When presented with the Welcome screen, click the x in the Welcome tab to close that screen
SDK will start with a blank Project Explorer pane
We can now import the saved workspace into SDK.
Select File -> Import or Right Click on the white space in the Project Explorer pane and select Import
The Import dialogue box will appear. Expand the General line and select Existing Projects into Workspace
Click Next
Click the Select archive file button. Then click Browse to navigate to the saved workspace that you want to import and click Open. In our case, this is Ne10LibraryBuild2014dt3.zip.
Be sure that both Ne10 and Ne10-master projects are selected.
Click Finish
SDK will build the workspace automatically. Because SDK is already started and the workspace is in place, you can skip the following instructions for starting SDK and go directly to the SDK Running heading.
Workspace in Place
If you have not yet started SDK, do so by:
Start -> All Programs -> Xilinx Design Tools -> Vivado 2014.3 -> Xilinx SDK 2014.3
In the Workspace Launcher, browse to and select Projects\ZC702_Ne10.
Click OK to continue.
When presented with the Welcome screen, click the x in the Welcome tab to close that screen
SDK Running
With SDK now running with the existing or re-loaded workspace, continue from here.
Because this is an existing built project, you should see something similar to the following. The key is to have the Ne10 project with the directories shown. If these are not present, return to the Zynq-7000 AP SoC Spectrum Analyzer part 2 - Accelerating Software - Building ARM Neon Library Tech Tip 2014.3 and build the library again, or use the instructions above to load the known workspace file set.
With the library in place, we build the standard test program that was supplied as part of the repository. A new project for the test program is created, the appropriate files from the repository are imported and then it is compiled into an executable program.
Select File -> New -> Project; expand the C/C++ group and select C Project.
Shortcut:
Right click on the white space in the Project Explorer pane. Then select New -> C Project.
In the Project name: box, give the project the name Ne10-test. Select the Project type: as Xilinx ARM Linux Executable by clicking on that option in the left column.
Click Next
Verify that both the Debug and Release Configurations: options are checked.
Click Advanced Settings.
We are now going to add the proper paths to library include files.
Expand the C/C++ General line in the left column and select the Paths and Symbols item. Be sure the Configuration: option is set to Release because we want to have the best performance. If desired, these settings can be applied to both debug and release configurations by selecting the All Configurations options as shown.
These steps are referred to as the Adding Paths steps and will be used several times
With the Includes tab selected, click Add in the right column to get the following dialogue box
Check both of the options and then Click Workspace to select the directory to be added
Select Ne10-master -> common and then click OK
Click OK again to add this path to the Includes with the result as shown below
Using these same Adding Paths steps, add the following paths:
- Ne10-master / inc
- Ne10-master / modules / math
- Ne10-master / test / include
When completed you should have the following include directories set.
We now need to add some library paths for tools to find all of the required components.
In the left column of the Properties for Ne10-test, expand C/C++ Build and select Settings. In the left portion of the right side of the window, under ARM Linux gcc linker, select Libraries.
At the top of the Libraries (-l) area, click on the add icon (looks like a sheet of paper with a plus sign on it).
In the text entry box, type Ne10 and click OK. Then do the same steps for Linux libraries "m" and "rt" resulting in the following
In the Library search path pane, click the add icon (the paper with the plus sign as before) to get a pop up window:
Click Workspace and then select Ne10/Release.
Click OK to return to the Settings window.
NOTE:
These same library paths should be added to the Release configuration also. If the Configuration: option box is not displayed at the top of the Settings window, expand the C/C++ General line and select the Paths and Symbols item. At the top of the window, the Configuration: option box should appear. From the drop down select Release. Then in the left column select the C/C++ Build / Settings item. In the list of Tool Settings, select the Libraries item under the ARM Linux gcc linker.
Using the Add steps used immediately prior for adding to the Debug configuration, add the following Libraries:
- Ne10
- m
- rt
Then add the Library search path (-L) Ne10/Release. You should then have the following:
NOTE:
If the original Configuration: box is set to All Configurations, this will force the Debug and Release build options to be identical. This is a simpler approach that can be changed later if different options are desired for the release build versus the debug build. In general, a release build invokes a higher degree of compiler optimization for more efficient code, while a debug build includes additional hooks to simplify analysis and debug of the code. As a result, Release build code usually runs faster than Debug build code.
Click Apply
Click OK to return to the Select Configurations screen.
Click Finish to complete this portion of the setup.
We now need to import the actual code for the tests. This follows a set of steps similar to the import of the code into the Ne10 library in the Zynq-7000 AP SoC Spectrum Analyzer part 2 - Accelerating Software - Building ARM Neon Library Tech Tip 2014.3
Right click on Ne10-test in the Project Explorer window, and select Import.
In the Select window that pops up, expand General and select File System then click Next.
Browse to the Ne10-master directory that is under "Projects\ZC702_Ne10".
WARNING:
Do NOT use the original file that was unzipped or copied into a higher level directory.
Expand Ne10-master in the left pane and select the modules/dsp/test and test/src directories to import.
CAUTION:
If the original source Ne10-master is selected, subsequent build steps will fail. If that occurs, delete the Ne10-master in the workspace and perform the import again, being sure to point to the Ne10-master in the workspace directory.
Click Finish to perform the import.
The test software is now ready to be built.
NOTE:
To obtain the best execution performance, the active build configuration must be set to Release.
Right click on Ne10-test in the Project Explorer pane.
Select Build Configurations -> Set Active -> Release
Right click Ne10-test and select Build Project.
As before, there are a number of warnings. These can be ignored for now.
The test software can now be run on the ZC702.
There are several methods that can be used to run the test software on the ZC702. Among them are:
- Use the Remote System Explorer capability in SDK
- Copy the binary file to the SD card containing the TRD Linux system and execute from a remote terminal or from the local console
Since there are two tests that will be run, we will demonstrate each of the methods.
Running from Remote System Explorer
We will access the ZC702 over an Ethernet connection from the PC where SDK is running.
CAUTION!!
The default IP address of the ZC702 is different for PetaLinux than for the OSL Linux used in prior versions of this series of Tech Tips. The default IP address is now 192.168.0.10 so be sure your computer can reach that sub-net.
Connect your ZC702 to your computer or network with an Ethernet cable.
If you are unable to directly reach the 192.168.0 subnet from your computer, it is possible to change the IP address of the ZC702 after PetaLinux has booted. To make this change, do the following:
- Connect the console serial over USB port to your PC with the supplied mini-usb adapter and appropriate cable
- Start TeraTerm or similar terminal emulator
- Boot the ZC702 as described below
- Once PetaLinux has booted, log in using the username root and password root
- Use the ifconfig command to change the IP address, for example "ifconfig eth0 192.168.1.65", where the IP address is one that you can reach from your PC
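Whether the PC can reach the board directly comes down to whether both addresses fall in the same subnet. The following is a minimal sketch of that check in C. It assumes a netmask of 255.255.255.0, which this Tech Tip does not state, and the helper name same_subnet is illustrative only and is not part of any Xilinx tool.

```c
#include <sys/socket.h>  /* AF_INET */
#include <netinet/in.h>  /* struct in_addr */
#include <arpa/inet.h>   /* inet_pton */

/*
 * Returns 1 if the two dotted-quad IPv4 addresses fall in the same subnet
 * under the given netmask, 0 if they do not, and -1 if any of the strings
 * is not a valid IPv4 address.
 * The netmask is supplied by the caller; 255.255.255.0 is an assumption.
 */
int same_subnet(const char *addr_a, const char *addr_b, const char *netmask)
{
    struct in_addr a, b, m;

    if (inet_pton(AF_INET, addr_a, &a) != 1 ||
        inet_pton(AF_INET, addr_b, &b) != 1 ||
        inet_pton(AF_INET, netmask, &m) != 1) {
        return -1;
    }
    /* All three values are in network byte order, so the masks line up. */
    return (a.s_addr & m.s_addr) == (b.s_addr & m.s_addr) ? 1 : 0;
}
```

With a hypothetical PC address of 192.168.1.20, same_subnet("192.168.1.20", "192.168.0.10", "255.255.255.0") returns 0 for the default board address, while same_subnet("192.168.1.20", "192.168.1.65", "255.255.255.0") returns 1 once the board address has been changed with ifconfig.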
The ZC702 must be set to boot from the SD-MMC card that has been updated with the 2014.2 TRD as described in the TRD Technical Article Zynq Base TRD 2014.2. As a caution before proceeding, be sure that the ZC702 will properly run the 2014.2 TRD.
Set the boot select switches as shown below, then power on your ZC702.
With the ZC702 running, we will set up a Remote System Explorer connection to the ZC702 board to allow us to download and run our program. Right click in the Project Explorer window, select New -> Other. In the pop up window, expand Remote System Explorer, highlight Connection.
NOTE:
The balance of these instructions assumes that the IP address of the ZC702 has been changed to 192.168.1.65. If you have set the IP address differently, use that IP address.
Click Next
In the next window, select SSH Only, and click Next
Use the IP address of the ZC702 (192.168.1.65) as the Host name and Connection name. Fill in the description field if you wish - it is not required.
Click Finish.
NOTE:
Although the connection has been established, there is nothing showing. To view the connection, a new perspective must be started, replacing the current perspective. To do this,
Click Window -> Open Perspective -> Other
Then select Remote System Explorer and click OK.
The Remote System Explorer connection will be displayed.
Close this perspective by clicking
Window -> Close Perspective (As an alternative, click on the C/C++ tab in the upper right corner of the overall window. This returns to the build perspective with the RSE perspective still available.)
With the connection established and the standard Project Explorer perspective displayed, we can now download the test program to the ZC702 and run it directly from SDK.
In the Project Explorer, right click on Ne10-test, then select Run As -> Run Configurations.
A new window will open to create, manage and run configurations.
Create a new Remote ARM Linux Application by double clicking on "Remote ARM Linux Application" in the left column. (This can also be done by following the instructions in the main window for using the "New" button.)
Select the IP address of the ZC702 in the pull down Connection menu.
NOTE:
If SDK has been used for other Tech Tips or development work, there may be items listed under "Remote ARM Linux Application" in the left panel of the RSE screen above. These can be ignored. For example, in the screen above the item "fft-zynq Release" is from work on another Tech Tip.
Then click on the Browse button next to the Remote Absolute File Path for C/C++ Application field.
Expand the Root directory. Under the root directory select the tmp directory. If prompted for the User ID and Password, use root for both.
Click OK
Append /Ne10-test.elf to /tmp in the Remote Absolute File Path for C/C++ Application field.
Click Apply, then click Run.
The test software will be downloaded to the ZC702 and then run. The results will show in the SDK Console window.
NOTE:
Once the Run Configuration has been set up, SDK will remember this configuration, simplifying subsequent tests in the same environment.
This first test is a regression test that checks the library code for correct results, so that any mistakes introduced by optimizations or other changes will be caught. It uses the open source SEATEST project, which implements an xUnit style of Test Driven Development.
The SEATEST project is hosted at https://code.google.com/p/seatest/
Additional information about xUnit testing can be found at http://en.wikipedia.org/wiki/XUnit
When this test concludes with all of the runs passing, we can conclude that the desired FFT portion of the library has been properly built and is operating as designed.
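To illustrate what an xUnit style regression test looks like in C, here is a minimal sketch. The macro and function names are hypothetical and are deliberately not the SEATEST API; add_ints is only a stand-in for a library routine under test. The idea is that each test case pairs a known input with a known good result, and the runner reports how many checks failed.

```c
#include <stdio.h>

/* Hypothetical mini-harness for illustration; NOT the SEATEST API. */
static int tests_run = 0;
static int tests_failed = 0;

/* Record one check and report a failure with its source location. */
#define CHECK_INT_EQUAL(expected, actual)                               \
    do {                                                                \
        tests_run++;                                                    \
        if ((expected) != (actual)) {                                   \
            tests_failed++;                                             \
            printf("FAIL %s:%d: expected %d, got %d\n",                 \
                   __FILE__, __LINE__, (int)(expected), (int)(actual)); \
        }                                                               \
    } while (0)

/* Function under test: a stand-in for a library routine. */
int add_ints(int a, int b)
{
    return a + b;
}

/* One test case: known inputs paired with known good results. */
void test_add_ints(void)
{
    CHECK_INT_EQUAL(5, add_ints(2, 3));
    CHECK_INT_EQUAL(0, add_ints(-4, 4));
}

/* Run every test case and return the number of failed checks. */
int run_all_tests(void)
{
    tests_run = 0;
    tests_failed = 0;
    test_add_ints();
    printf("%d checks run, %d failed\n", tests_run, tests_failed);
    return tests_failed;
}
```

A program would call run_all_tests() from main and use a non-zero return value to signal that a regression was detected.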
Performance Test
The second test that we will run is a performance comparison between running these same complex functions on the ARM processor itself and then running with optimizations to use the NEON SIMD engine. This test is included in the Ne10 library and is selectively compiled based on the presence or absence of a special symbol. To run the performance test we need to define the PERFORMANCE_TEST symbol.
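The selective compilation described above can be sketched in C as follows. This is an illustration of the general technique only: the enum and function names are hypothetical and do not come from the Ne10 sources, and the way the Ne10 tests actually use the symbol may differ in detail.

```c
/* Hypothetical identifiers for illustration; not names from the Ne10 sources. */
enum test_kind {
    CONFORMANCE_TEST_KIND = 0,
    PERFORMANCE_TEST_KIND = 1
};

/*
 * The preprocessor keeps exactly one of the two branches. The symbol only
 * has to be defined; it does not need a value.
 */
enum test_kind selected_test_kind(void)
{
#ifdef PERFORMANCE_TEST
    return PERFORMANCE_TEST_KIND;   /* timing comparison is compiled in */
#else
    return CONFORMANCE_TEST_KIND;   /* result-checking test is compiled in */
#endif
}
```

When building by hand with gcc, the same effect is obtained by adding -DPERFORMANCE_TEST to the compile command. In SDK, an entry on the Symbols tab is typically passed to the compiler in the same way.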
Right click on Ne10-test and select Properties.
Expand the C/C++ General line in the left column and then select the Paths and Symbols entry.
Select the #Symbols tab by clicking on it.
Click the Add button on the right side and enter PERFORMANCE_TEST as the symbol name. There is no value required so leave that entry blank.
Click the Add to all configurations check box then Click OK.
The Add Symbol window closes revealing the Paths and Symbols window. If you scroll to the end of the list, you will see the PERFORMANCE_TEST symbol.
Click Apply
If an information box relating to changes in the paths or symbols appears, click Yes to accept it.
Click OK.
With the symbol in place, the test software must be rebuilt.
Right click on Ne10-test in the Project Explorer and select Build Project. As before, there will be several warnings that we can ignore for now.
The revised test program can now be run. We can use the same process just described to run it with the SDK Remote System Explorer. As noted earlier, the setup steps do not need to be performed again. Simply right click Ne10-test in the Project Explorer, select Run As -> Run Configurations. Then select the previously created Ne10-test configuration, and click Run.
The results will be displayed in the Console window of SDK as shown below.
We can see a generally consistent improvement in performance with NEON versus the general purpose ARM processor alone. Performance improvements vary for each execution run based on the other operations within Linux at the time the tests are run.
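The Time Savings and Performance Ratio figures that the performance test reports can be reproduced from the two measured times. The sketch below uses formulas that match the figures printed in the Performance Results section of this Tech Tip; the function names are illustrative, and the Ne10 test source may compute the values differently in detail.

```c
/* Percentage of the C time that is saved by the NEON version. */
double time_savings_percent(double c_time, double neon_time)
{
    return (c_time - neon_time) / c_time * 100.0;
}

/* How many times faster the NEON version is than the C version. */
double performance_ratio(double c_time, double neon_time)
{
    return c_time / neon_time;
}
```

For the first FFT length 4 row in the Performance Results section (C time 374815, NEON time 218158), these functions give approximately 41.80 and 1.72, which agree with the reported 41.80% and 1.72:1.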
NOTE:
With the test performed in this manner, the base TRD is still operating.
An alternative method is to add the test program to the SD-MMC card that contains the base TRD, and then execute it from the console on the ZC702.
CAUTION!
In the OSL based version of these Tech Tips, files that are added to the root of the SD-MMC card are copied to the /mnt directory when the TRD boots up. For the PetaLinux based 2014.2 TRD, they are not automatically copied from the SD card. The copy process must be done manually as described in the following instructions.
In SDK, expand the line Ne10-test in Project Explorer, then expand the Binaries sub-directory. You should have a file Ne10-test.elf as the only item in the Binaries sub-directory.
Power off the ZC702 and remove the SD-MMC card from the ZC702 and insert it into an appropriate media reader on your computer.
Using standard Windows Explorer, navigate to the SD-MMC card to show its contents. The default should be the following showing the root of the SD-MMC card image.
Use standard Windows copy / paste methods to copy the Ne10-test.elf file to the root of the SD-MMC card. In the SDK window, right click on the Ne10-test.elf file and select Copy. Then in the Windows Explorer window for the SD-MMC card, right click and select Paste.
Eject the SD-MMC card from your computer, re-insert it in the ZC702 and power on the ZC702.
After the ZC702 boots up, use the instructions in UG926_Z7_ZC702_Eval_Kit to establish a terminal connection to the ZC702. Log in using the username root and the password root.
Exit out of the QT window system if the terminal window does not respond to commands.
Change to the /media directory and verify that Ne10-test.elf is there (cd /media then use ls to see the files on the SD card).
Attempting to execute Ne10-test.elf by name from here will fail because the standard execution path does not include this directory. There are at least three ways to resolve this:
- change the PATH variable to include this directory
- from this directory, type ./Ne10-test.elf and the program will execute in place
- copy Ne10-test.elf to the /bin directory, which is in the execution path by default.
We used the last of these three; at the prompt type
cp Ne10-test.elf /bin
Then change to the /bin directory to verify that it is there using ls to list the files (if you are unfamiliar with Linux, the commands to enter can be seen on the next screen).
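The reason a program is found in /bin but not in the SD card directory is that the shell searches only the directories listed in the PATH variable. The following minimal C sketch shows that lookup rule on a PATH-style string. The helper name path_contains_dir and the example PATH value are hypothetical; the actual PATH on the PetaLinux image may differ.

```c
#include <string.h>

/*
 * Returns 1 if dir appears as a complete entry in the colon-separated
 * path_value string (the format used by the PATH variable), otherwise 0.
 */
int path_contains_dir(const char *path_value, const char *dir)
{
    size_t dir_len = strlen(dir);
    const char *entry = path_value;

    while (*entry != '\0') {
        /* Length of the current entry, up to the next ':' or the end. */
        size_t entry_len = strcspn(entry, ":");

        if (entry_len == dir_len && strncmp(entry, dir, dir_len) == 0) {
            return 1;
        }
        entry += entry_len;
        if (*entry == ':') {
            entry++;   /* step over the separator */
        }
    }
    return 0;
}
```

With an assumed PATH of "/bin:/usr/bin:/sbin", path_contains_dir returns 1 for "/bin" and 0 for "/media", which is why the copy to /bin makes the program runnable by name.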
We can now execute the test from the terminal window.
Performance Results
Type Ne10-test.elf and then press Return. The tests should run and provide the following results in the terminal window. There will be some variability in the reported numbers.
--------------- ../modules/dsp/test_suite_fft_float32.c --------------
--------------- test_fft_c2c_1d_float32_performance start

FFT Length | C Time in ms | NEON Time in ms | Time Savings | Performance Ratio
---|---|---|---|---
FFT size 4 | | | |
4 | 374815 | 218158 | 41.80% | 1.72:1
4 | 507480 | 316038 | 37.72% | 1.61:1
FFT size 8 | | | |
8 | 309311 | 218107 | 29.49% | 1.42:1
8 | 451916 | 287729 | 36.33% | 1.57:1
FFT size 16 | | | |
16 | 424538 | 288772 | 31.98% | 1.47:1
16 | 469142 | 332249 | 29.18% | 1.41:1
FFT size 32 | | | |
32 | 433718 | 290071 | 33.12% | 1.50:1
32 | 465857 | 342670 | 26.44% | 1.36:1
FFT size 64 | | | |
64 | 593465 | 370792 | 37.52% | 1.60:1
64 | 626945 | 407847 | 34.95% | 1.54:1
FFT size 128 | | | |
128 | 623664 | 382343 | 38.69% | 1.63:1
128 | 651775 | 411053 | 36.93% | 1.59:1
FFT size 256 | | | |
256 | 793286 | 474970 | 40.13% | 1.67:1
256 | 818263 | 499879 | 38.91% | 1.64:1
FFT |
size |
512 |
||||||
512 |
824819 |
496454< |