Zynq-7000 AP SoC Benchmark - SPEC CPU 2000 Tech Tip
Assumptions
- It is assumed that the user of this tech tip knows and understands the terminology associated with SPEC benchmarking.
- No explicit information about the SPEC compilation process is provided. It is assumed that the user is familiar with cross compilation and with building SPEC binaries.
- SPEC sources are not provided with this tech tip. It is assumed that the user already has the SPEC sources or will purchase them from the SPEC consortium.
- The user is requested to obtain any standard SPEC CPU 2000 details from the official SPEC CPU 2000 website.
Document History

Date | Version | Author | Description of Revisions
March 6, 2014 | 1.0 | Chandramohan Pujari | Initial revision
April 10, 2014 | 1.1 | Chandramohan Pujari | Added build procedure updates
Summary
The main purpose of this tech tip is to provide a brief introduction to SPEC CPU 2000 and to walk the user through the process followed to build SPEC CPU 2000, benchmark it, and report the results generated for the Zynq-7000 AP SoC. Some of the extracts are taken directly from the SPEC website, as the process to be followed is standardized across platforms.
Implementation
Implementation Details

Design Type | PS Only
SW Type | Linux
CPUs | ARM Cortex-A9 dual-core CPU running at 866 MHz
PS Features | ARM Cortex-A9 CPUs running at 866 MHz; DDR memory controller running at 400 MHz
PL Cores | NA
Boards/Tools | ZC702 Evaluation Board; TI Panda Evaluation Board for SPEC CPU 2000 compilation
Xilinx Tools Version | Vivado 2013.4 and SDK
Other Details | Power supplies, 8 GB SD card, a Digilent cable, and a UART cable

Files Provided

Config file | SPEC CPU 2000 config file used for Zynq
Makefile patch | Patch for the Makefile used to compile the SPEC binaries
Ramdisk image | Ramdisk images to resolve library dependencies
1. Introduction to SPEC CPU 2000
SPEC CPU 2000 is an industry-standard benchmark that provides a comparative measure of integer and/or floating-point compute-intensive performance. SPEC CPU 2000 focuses on compute-intensive performance, which means these benchmarks measure the performance of:
1. The processor (CPU)
2. The memory architecture (DDR)
3. The compilers
The system performance of a device is not limited to CPU performance; the memory architecture of the device and the compilers are equally significant when evaluating system performance.
1.1 Objectives and Workloads
Objectives:
Best results require steady performance across all programs. In order to arrive at predictable and useful conclusions on system performance, SPEC CPU2000 provides benchmarks in the form of source code, which are compiled according to pre-defined rules. It is expected that a user obtains a copy of these suites from SPEC’s website, installs the hardware, compilers, and other software described, and reproduces the claimed performance within a small variation of 1 to 2% on each run. The suites thus serve as a common reference and, most importantly, can be considered part of an evaluation process for the system under consideration.
Workloads:
Benchmarks are provided in two suites: an integer suite, known as CINT2000, and a floating-point suite, known as CFP2000. There are several different ways to measure system performance. One way is to measure how fast the device completes a single task; this is called a speed measure. Another way is to measure how many tasks a device can accomplish in a certain amount of time; this is called a throughput, capacity, or rate measure.
The SPEC speed metrics (e.g., SPECint2000) are used for comparing the ability of a device to complete a single task. The following are the well-known speed (1-copy) metrics:
1) SPECint2000: The geometric mean of twelve normalized ratios (one for each integer benchmark) when compiled with aggressive optimization for each benchmark.
2) SPECint_base2000: The geometric mean of twelve normalized ratios when compiled with conservative optimization for each benchmark.
The SPEC rate metrics (e.g., SPECint_rate2000) measure the throughput of a machine carrying out a number of tasks. The following are the well-known rate (2-copy) metrics:
1) SPECint_rate2000: The geometric mean of twelve normalized throughput ratios when compiled with aggressive optimization for each benchmark.
2) SPECint_rate_base2000: The geometric mean of twelve normalized throughput ratios when compiled with conservative optimization for each benchmark.
The SPEC CPU2000 workload run times are normalized against the run times of a reference machine.
*Please note that only “conservative optimizations” have been evaluated for the current system under consideration.
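As an illustrative sketch of how these suite metrics are composed (the benchmark names are real, but the run times below are invented for illustration), each benchmark's ratio is the reference-machine run time divided by the measured run time, scaled by 100, and the suite score is the geometric mean of those ratios:

```python
import math

# Hypothetical run times in seconds (invented for this example):
# ref_time is the reference machine, run_time is the system under test.
results = {
    "164.gzip": {"ref_time": 1400.0, "run_time": 2800.0},
    "175.vpr":  {"ref_time": 1400.0, "run_time": 3500.0},
    "181.mcf":  {"ref_time": 1800.0, "run_time": 3600.0},
}

# Per-benchmark normalized ratio: 100 * Ref Time / Run Time
ratios = {name: 100.0 * t["ref_time"] / t["run_time"]
          for name, t in results.items()}

# Suite metric: geometric mean of the individual ratios
geo_mean = math.exp(sum(math.log(r) for r in ratios.values()) / len(ratios))

print(ratios["164.gzip"])    # 50.0
print(round(geo_mean, 2))    # geometric mean of 50.0, 40.0, 50.0
```

A real SPECint_base2000 uses all twelve integer benchmarks and the median of three runs per benchmark; the three entries above only illustrate the arithmetic.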
Workloads Description:
CINT2000 contains 12 benchmarks: 11 applications written in C and 1 in C++ (252.eon). CFP2000 contains 14 applications (6 in Fortran 77, 4 in Fortran 90, and 4 in C) that are used as benchmarks.
Following is a brief description of the workloads.
CINT2000 Workloads | Description
164.gzip | Data compression utility
175.vpr | FPGA circuit placement and routing
176.gcc | C compiler
181.mcf | Minimum cost network flow
186.crafty | Chess program
197.parser | Natural language processing
252.eon | Ray tracing
253.perlbmk | Perl
254.gap | Computational group theory
255.vortex | Object-oriented database
256.bzip2 | Data compression utility
300.twolf | Place and route simulator
CFP2000 Workloads | Description
168.wupwise | Physics: quantum chromodynamics
171.swim | Shallow water modeling
172.mgrid | Multi-grid solver in 3D potential field
173.applu | Parabolic/elliptic partial differential equations
177.mesa | 3D graphics library
178.galgel | Fluid dynamics: analysis of oscillatory instability
179.art | Neural network simulation; adaptive resonance theory
183.equake | Finite element simulation; earthquake modeling
187.facerec | Computer vision: recognizes faces
188.ammp | Computational chemistry
189.lucas | Number theory: primality testing
191.fma3d | Finite element crash simulation
200.sixtrack | Particle accelerator model
301.apsi | Solves problems regarding temperature, wind, velocity, and distribution of pollutants
2. Building and Compilation of Benchmark with SPEC tools
In order to build, compile, and run SPEC CPU 2000, SPEC proposes rules and regulations that need to be followed. They can be classified into the following categories.
a) System Requirements: The basic system requirements that SPEC imposes are:
- System running UNIX
- 256MB of RAM
- 800MB to 1GB of disk space
- A set of compilers such as C and C++ compiler
b) Procedure to build the tools:
Cross compilation and individual builds of CPU2000 are allowed by SPEC; however, SPEC encourages use of its tools, such as runspec and specperl, to build SPEC CPU 2000. Further details on the tools build process can be found at Building SPEC tools.
c) Generating the Config File:
A config file is a file in which the user specifies compiler flags and other system-dependent information. It is recommended to go through the following link to understand more about the config file: Understanding SPEC CPU 2000 Config File.
d) Additional process of using utilities:
A few generic utilities are provided by SPEC to cross-verify the config file, understand the time spent on each invocation, etc. Details of these utilities can be found at Details about SPEC CPU 2000 Utilities.
2.1 Runspec and its Build Environment
Everyone who uses SPEC CPU 2000 needs runspec. It is the primary tool in the suite and is used to build the benchmarks, run them, and report their results. Runspec ensures that the set of tools based on GNU Make and Perl 5 is used to build and run the benchmarks. The use of runspec helps produce publication-quality results and also ensures that results are reproducible and that the optimizations used are available in the configuration file. Before using runspec, one needs to install CPU2000. Further details on runspec can be found at Using runspec for SPEC CPU 2000.
3. Setting up the Zynq-7000 Board and Configuration Disclosure
This section describes the setup, building, and installation of SPEC CPU 2000 for the Zynq-7000, including setting up the Zynq-7000 board as an evaluation target for SPEC CPU 2000.
3.1 Build Procedure:
The SPEC CPU 2000 workloads were initially built on a TI Panda board, which has an ARM Cortex-A9 dual-core CPU subsystem similar to the Zynq-7000. Later, the same process was followed to build the binaries on the Zynq-7000 based ZC702 board. The binaries generated are portable across platforms, and no optimization specific to any architectural feature has been applied; it is therefore safe to use the portable binaries compiled on the Panda board to evaluate the device performance.
If the user prefers using a Panda board, the detailed procedure for booting Ubuntu on the Panda board is available from Ubuntu on Panda-board. Upon bringing up Ubuntu on the Panda board, the user is requested to install the necessary tools, such as gfortran (v4.7.2) and GCC, using the apt-get utility if they are not already available, and to follow the build process stated on the SPEC website for building/cross compiling the SPEC binaries.
Below we provide an overview of the process followed to build SPEC binaries on ZC702 Evaluation board.
- Bring up Ubuntu Linux on the Zynq-7000 ZC702 evaluation board. The detailed procedure to run Ubuntu on Zynq-7000 is available at Ubuntu On Zynq. Please note that the system details of Zynq-7000 can be found at Zynq 7000 System details.
- If the GCC tool chains are not available by default, they can be easily installed from Ubuntu software repositories using apt-get utility. GCC v4.7.2 (the latest compiler available on ARM Ubuntu repositories) was used to build the workloads.
- Apart from the GCC tool chain, a FORTRAN compiler tool chain is needed, since nearly half of the workloads in the SPEC CPU2000 suite are written in FORTRAN. The gfortran (v4.7.2) tool chain was downloaded using the same methodology as for GCC.
- Download the SPEC CPU2000 sources and tools from the official website of SPEC.
- Since the SPEC CPU2000 distribution does not support the ARM architecture, one needs to modify the Makefile and create a config file. The simple.cfg file from the SPEC website acts as a template for creating an initial config file. For your reference, the config file and the Makefile patch that were used for building SPEC are provided as attachments with this tech tip.
- Run the specmake SPEC CPU2000 tool to build (compile) the benchmark sources. Some warnings will be reported, which can be safely ignored, i.e., specmake build 2> make.err | tee make.out
- For ease of use on other boards, create a tarball of the generated binaries and SPEC CPU2000 tools.
A few of the important compile flags and settings used are stated below for ready reference:
hw_model = Zynq
hw_cpu = Cortex-A9
hw_cpu_mhz =
hw_disk = Flash Drive, 8 GB
hw_fpu = Integrated
hw_memory = 1 GB
hw_ncpu = 2
hw_ncpuorder = 2
tune = base
CROSS_COMPILE=arm-xilinx-linux-gnueabi-
COPTIMIZE = -O3 -fno-common
CXXOPTIMIZE = -O3 -fno-common
4. Installation Procedure:
The tests were carried out on a Zynq-7000 ZC702 Evaluation board with Xilinx’s standard images released as a part of 14.7. The procedure to install the SPEC tools and the binaries is described below.
- Copy the standard Xilinx Linux binaries downloaded from the Zynq 14.7-2013.3 Release to the SD card.
- Copy the generated build data for SPEC CPU2000 on the SD card.
- Power on the board and boot the Zynq-7000 device using SD boot mode.
- After booting, check whether the SD card has been mounted automatically; in some cases this fails, so mount the SD card manually if needed.
- Make sure correct date is set on the system.
- Go to the SPEC CPU 2000 installation folder (i.e., the path where the SPEC binaries are placed) and install the SPEC tools by executing the install.sh script, i.e., by running ./install.sh from the command line.
- The install.sh script will prompt for a few inputs from the user. Fill in the appropriate inputs to finish the installation.
- Import the environment variables by sourcing the shrc script: . ./shrc
- Make sure that no further changes are made to the setup.
- Details of the installation can be found in the installation log created in the SPEC installation folder.
- The Zynq-7000 device is now ready for execution of the workloads using the runspec command.
* Please note that if there are errors while installing the SPEC binaries, they could be caused by missing libraries. Since this issue was frequently observed, a ramdisk image is provided as an attachment with this tech tip.
4.1 Verification of Successful Build/Installation:
To verify whether the SPEC CPU2000 installation is successful, change to the SPEC installation directory (cd $SPEC) and check whether the major tools can identify themselves by executing the following commands:
runspec -V
specperl -V
specdiff -h
runspec -h
In order to test the specperl build, run:
runspec --test
5. Running SPEC CPU 2000 on Zynq-7000
Assuming that the installation of SPEC CPU2000 was successful, we go ahead and run the SPEC benchmarks. In compliance with SPEC, the runspec command should be used to run the SPEC CPU2000 benchmarks in order to be able to report the results. The following commands were used to execute the speed and rate (1-copy and 2-copy) runs of SPEC CPU2000 on the Zynq-7000 boards.
For Speed metrics (1 Copy)
runspec --config linux-armv7l-gcc47-linaro.cfg --action validate --extension gcc47-default --reportable --iterations 3
For Rate metrics (2 Copy)
runspec --config linux-armv7l-gcc47-linaro.cfg --action validate --extension gcc47-default --reportable --iterations 3 --rate --users 2
These runs generate results for the following base metrics:
- SPECint_base2000
- SPECfp_base2000
- SPECint_rate_base2000
- SPECfp_rate_base2000
6. Obtaining the SPEC CPU2000 Results
Upon successful completion of the SPEC CPU2000 tests, one or more result files are generated. The result file contains the elapsed time in seconds for each of the benchmarks in the CINT2000 or CFP2000 suite and the ratio to the reference machine (Sun Ultra 10). The SPECint_base2000 and SPECfp_base2000 metrics are calculated as the geometric mean of the individual ratios, where each ratio is based on the median execution time from an odd number of runs, equal to 3. The current execution produces results for "Base" metrics only, since the benchmarks in the suite are built with a common set of optimizations; hence, for the Zynq-7000 boards, only Base metrics have been evaluated. However, the user can modify the optimization flags to evaluate Peak metrics.
Below is the list of result files that are generated under the run/0000000n folder upon successful completion of the SPEC CPU2000 tests. The same files are used for publication as well.
- CINT2000.00n.asc: Result file in ASCII format
- CINT2000.00n.cfg: File containing config information on how the test was run
- CINT2000.00n.raw: Result file containing the raw output for the entire run
- log.00n: The run log, showing the details of the entire run
- CFP2000.00n.asc: Result file in ASCII format
- CFP2000.00n.cfg: File containing config information on how the test was run
- CFP2000.00n.raw: Result file containing the raw output for the entire run
- log.00n: The run log, showing the details of the entire run
Note: n is the run number
7. Interpreting the SPEC CPU2000 Results
The result of each SPEC CPU2000 benchmark test is a number produced by performing carefully regulated tests, obtained by running scientifically designed programs from the SPEC suite. The results are normalized by comparing them to the reference machine (the Sun Ultra 5/10). Each program is compiled with standard flags for a "base" measurement. The program is run three times; each runtime is measured and the median time is used. The "Base Ratio" for a program run (used for both the CINT2000 and CFP2000 speed metrics) is computed as follows:
Ratio = 100 × Ref Time / Run Time
For the rate metrics:
CPU2000_Rate = geometric mean of CPU2000_Rate(program) over all programs
CPU2000_Rate(program) = N × [Tref(program) / Tref(171.swim)] × [3600 / TSUT(program)]
Where
N = number of copies run concurrently (this may be different for each program if a peak SPECrate is being measured, but not for a base SPECrate)
Tref (program) = time to run program on the Sun Ultra 5/10
Tref (171.swim) = time to run 171.swim on the Sun Ultra 5/10 = 3100
3600 = number of seconds in an hour
TSUT (program) = time to finish last concurrent copy on system being tested
SUT = system under test
This formula is (mostly) given in the CPU2000 run rules.
The "rate" calculated for each benchmark is a function of:
(number of copies run) × (reference factor for the benchmark) × (number of seconds in an hour) / (elapsed time in seconds)
which yields a rate in jobs/hour.
Here, the phrase "reference factor for the benchmark" corresponds to the ratio Tref (program) / Tref (171.swim).
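The rate formula above can be sketched as follows; the constants come from the definitions in the text (Tref for 171.swim = 3100 s, 3600 s per hour), while the example program's times are invented for illustration:

```python
# Sketch of the SPECrate formula described above; the example
# Tref and TSUT values below are invented for illustration.
T_REF_SWIM = 3100.0        # time to run 171.swim on the Sun Ultra 5/10
SECONDS_PER_HOUR = 3600.0

def cpu2000_rate(n_copies, t_ref, t_sut):
    """CPU2000_Rate(program) = N * [Tref / Tref(171.swim)] * [3600 / TSUT]."""
    return n_copies * (t_ref / T_REF_SWIM) * (SECONDS_PER_HOUR / t_sut)

# Example: 2 copies of a program with Tref = 1550 s, where the last
# concurrent copy finishes after 3600 s on the system under test.
print(cpu2000_rate(2, 1550.0, 3600.0))  # 1.0 (jobs/hour)
```

The overall rate metric is then the geometric mean of these per-program rates, as stated in the formula above.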
The computed number tells us that, when running a program compiled conservatively, the system under consideration completed the run about x times faster (or slower) than a Sun Ultra 5/10 300 MHz workstation running the same program compiled the same way.
The formula for the overall SPECint2000 or SPECfp2000 score is the geometric mean of the ratios for all the programs in the benchmark suite.
8. Conclusion
To conclude, the SPEC CPU2000 benchmark was ported to Zynq, built, and run in accordance with the rules stated by SPEC. The results for conservative optimization were obtained and published.
9. FAQ
For FAQs related to SPEC, please refer to the SPEC CPU 2000 FAQ.
© Copyright 2019 - 2022 Xilinx Inc.