OpenAMP Base Hardware Configurations
This document describes the base set of hardware required for OpenAMP to operate successfully as presented in Xilinx Vitis OpenAMP and libmetal template examples.
Introduction
The following sections cover the configuration of the RPU memory, the memory shared between the APU and the RPU, the generic interrupt controllers (GIC) and the inter-processor interrupts (IPI). Several examples use Vitis, PetaLinux and OpenAMP, but this is not a tutorial for those tools. The reader is encouraged to refer to the Xilinx user guides listed in the Related Links.
OpenAMP on Xilinx SoCs
Xilinx SoC products combine a set of heterogeneous hardware designs into one powerful and flexible platform that includes Arm Cortex-Ax, Cortex-R5 and Xilinx MicroBlaze processors. The OpenAMP project enables a distributed software architecture across this asymmetric multiprocessing platform (AMP).
There are two components of the OpenAMP framework:
Life Cycle Management (LCM): enables a master processor to load, start and stop firmware on a remote processor.
Inter-Processor Communication (IPC): interrupts, shared memory and message passing between the master and the remote processor.
In the demos, both the LCM and the IPC use the remoteproc and rpmsg drivers in the Linux kernel. Xilinx SoC-based systems must provide access to a specific set of hardware resources in order to use the OpenAMP framework. Depending on the OpenAMP or libmetal demo, this includes:
Processing cores; need at least two:
application processing unit (APU)
real-time processing unit (RPU) for Versal and ZynqMP
Memory for the processing system (PS): APU (master), RPU (remote) and shared memory:
DDR
Tightly-coupled memory (TCM)
On-chip memory (OCM)
Interrupts:
APU GIC interrupts
RPU GIC interrupts
IPI interrupts
The next section describes in general terms how OpenAMP uses each hardware component. It is followed by specific examples of configuring some demos with the default values, and by a section showing how to change the defaults.
Zynq UltraScale+ MPSoC and Versal
LCM: Memory for RPU Firmware
Most demos place the RPU firmware text and data in either the DDR or the tightly-coupled memory (TCM), depending on the linker script (e.g. linker_remote.ld), while the stack, heap and interrupt vector table are in the TCM. The RPU is configured to enable the TCM interfaces from reset, with base address 0 for TCMA. The Linux remoteproc driver on the APU preloads the TCMA with the boot code while the RPU is halted. When the halt is released, the RPU starts fetching instructions from the reset vector address in the normal way.
The firmware's .vectors section is loaded into TCMA at address 0 (see the linker script). It starts with a set of branch instructions called _vector_table, followed by code blocks labeled _boot and init and the function Init_MPU(). When the remoteproc driver requests an RPU start, execution begins at address 0 in TCMA with a jump to _boot, which calls Init_MPU(). All of this is fetched from and executed in the TCMA. At the end, _boot jumps to _startup, which may be in the DDR. _startup prepares the C runtime and calls main().
Firmware load memory | ZynqMP / Versal |
---|---|
RPU | 0x3ED0_0000, see linker script |
APU kernel DT | 0x3ED0_0000, see rproc_0_reserved in the Linux Device Tree |
APU app | not used |
IPC: Shared Memory
The OpenAMP demos have a .resource_table ELF section, as in the example linker_remote.ld. It lists the system resources required by the Linux remoteproc driver, e.g. the virtIO device data used by rpmsg; see rsc_table.h and rsc_table.c.
The default location for shared memory is the DDR.
SHM_BASE_ADDR | ZynqMP / Versal |
---|---|
RPU | E.g. 0x3ED8_0000, see metal_dev_table, SHARED_MEM_PA, SHARED_MEM_SIZE, SHARED_BUF_OFFSET |
APU kernel DT | E.g. 0x3ED8_0000, see reserved-memory nodes in the Linux Device Tree |
APU app: libmetal_amp_demo | UIO node names for ZynqMP: SHM_DEV_NAME="3ed80000.shm"; IPI_DEV_NAME="ff340000.ipi" (Versal: "ff360000.ipi"); TTC_DEV_NAME="ff110000.timer" (Versal: "ff0e0000.ttc0") |
APU app: openamp demos | /sys/bus/rpmsg/devices/virtio0.rpmsg-openamp-demo-channel.-1.0 |
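The UIO device names in the table appear to follow one pattern: the node's base address in lowercase hex, a dot, then the node name. The sketch below rebuilds the names listed above from their base addresses. The helper name `uio_dev_name` is hypothetical, and the pattern is inferred from the table values rather than taken from the demo sources.

```python
def uio_dev_name(base_addr: int, node_name: str) -> str:
    """Return '<lowercase hex base address>.<node name>', e.g. '3ed80000.shm'."""
    return f"{base_addr:x}.{node_name}"


# ZynqMP values from the table above
print(uio_dev_name(0x3ED80000, "shm"))
print(uio_dev_name(0xFF340000, "ipi"))
print(uio_dev_name(0xFF110000, "timer"))

# Versal values from the table above
print(uio_dev_name(0xFF360000, "ipi"))
print(uio_dev_name(0xFF0E0000, "ttc0"))
```

If the base address of a node changes in the device tree, the matching *_DEV_NAME string in the user application has to change with it.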
IPC: Interrupts
Recommended Reading
UG1085 Chapter 13: Interrupts
Versal-ACAP-TRM Chapter 52: Inter-Processor Interrupts
The OpenAMP and Libmetal demos use the IPI interrupts to enable one processor (source agent) to interrupt another processor (destination agent). The communications process uses the IPI interrupt register structure, the system interrupt structure, and the IPI message buffers. On the RPU’s firmware side these are represented by the struct metal_device ipi_device and IPI_CHN_BITMASK in platform_info.c.
GIC | IRQ System Interrupts: see IPI_IRQ_VECT_ID |
---|---|
Versal | 63 See "IRQ System Interrupts" in Versal-ACAP-TRM for IPI1 "IPI 1 interrupt"; |
ZynqMP | 65 or XPAR_XIPIPSU_0_INT_ID See "system interrupts" in UG1085 for RPU0 IPI_Ch1 |
The demos need two IPI channels (and IPI masks), one for the APU and one for the RPU. These are selected from the set of seven reprogrammable IPIs. Other system users might also request IPI channels, so check both the firmware and the Linux device tree to detect and avoid conflicts.
| IPI Channel | Base Addr | IPI_MASK / IPI_CHN_BITMASK: send to Ch_7 | IPI_MASK / IPI_CHN_BITMASK: send to Ch_1 | References |
---|---|---|---|---|---|
RPU_0 | Ch_1 | 0xFF31_0000 | 0x0100_0000 (set bit 24) | | ipi_device, IPI_BASE_ADDR, IPI_CHN_BITMASK in platform_info.h |
APU | Ch_7 | 0xFF34_0000 | | 0x100 (set bit 8) | "xlnx,ipi-id" property, zynqmp_ipi1 node in ZynqMP .dtsi |
The zynqmp_ipi1 node in the ZynqMP .dtsi:
    zynqmp_ipi1 {
        compatible = "xlnx,zynqmp-ipi-mailbox";
        interrupt-parent = <&gic>;
        interrupts = <0 29 4>;
        xlnx,ipi-id = <7>;
        #address-cells = <1>;
        #size-cells = <1>;
        ranges;
        /* APU<->RPU0 IPI mailbox controller */
        ipi_mailbox_rpu0: mailbox@ff90000 {
            reg = <0xff990600 0x20>,
                  <0xff990620 0x20>,
                  <0xff9900c0 0x20>,
                  <0xff9900e0 0x20>;
            reg-names = "local_request_region",
                        "local_response_region",
                        "remote_request_region",
                        "remote_response_region";
            #mbox-cells = <1>;
            xlnx,ipi-id = <1>;
        };
    };
In Versal the channel names and numbers are different, see Versal-ACAP-TRM
| IPI Channel | Base Addr | IPI_MASK / IPI_CHN_BITMASK: send to ipi3 | IPI_MASK / IPI_CHN_BITMASK: send to ipi1 | References |
---|---|---|---|---|---|
RPU_0 | ipi1 | 0xFF34_0000 | 0x20 (set bit 5) | | ipi_device, IPI_BASE_ADDR, IPI_CHN_BITMASK in platform_info.h |
APU | ipi3 | 0xFF36_0000 | | 0x8 (set bit 3) | "xlnx,ipi-id" property, zynqmp_ipi1 node in Versal .dtsi. The mapping of ipi-id to channel is not obvious and it’s not the same as in ZynqMP. |
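Each mask in the two tables above has a single bit set, and that bit identifies the destination channel of the interrupt. The sketch below derives the masks from the bit positions. The bit numbers are copied from the tables; the dictionary and function names are hypothetical and do not appear in the demo sources.

```python
# Destination channel -> bit position, as listed in the tables above.
IPI_DEST_BIT = {
    "zynqmp": {"Ch_1": 8, "Ch_7": 24},
    "versal": {"ipi1": 3, "ipi3": 5},
}


def ipi_mask(platform: str, dest_channel: str) -> int:
    """Return a mask with only the destination channel's bit set."""
    return 1 << IPI_DEST_BIT[platform][dest_channel]


# ZynqMP: RPU_0 (Ch_1) interrupts the APU (Ch_7), and the reverse direction.
print(hex(ipi_mask("zynqmp", "Ch_7")))  # 0x1000000
print(hex(ipi_mask("zynqmp", "Ch_1")))  # 0x100

# Versal: RPU_0 (ipi1) interrupts the APU (ipi3), and the reverse direction.
print(hex(ipi_mask("versal", "ipi3")))  # 0x20
print(hex(ipi_mask("versal", "ipi1")))  # 0x8
```

For any other channel, take the bit position from the TRM rather than extrapolating from these four entries.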
Build and Run Demos
Build RPU firmware
Step 1. Launch the Vitis IDE and select a workspace directory with enough free space and fast access. A local disk is preferred because some NFS mounts can be slow.
Step 2. Create a new application project using "File > New Application Project". This starts a wizard that guides you through the process.
Create a new platform from a hardware description (XSA) file. Select one of the included XSA files, or browse to and import your own.
Specify the application project name, e.g. lm_amp, and select psu_cortexr5_0 as the target processor from the list.
Select a template for your project, e.g. "Libmetal AMP Application", and click "Finish":
[Screenshots: processor for the RPU firmware project; domain (FreeRTOS or generic); template (Libmetal AMP Demo, OpenAMP echo-test, ...)]
The configuration of some hardware blocks used in this demo must match between the RPU firmware and the APU Linux kernel, device tree and user application. See the "Hardware specification" tab to examine the address maps for the APU and RPU processors:
Memory: DDR and Tightly-coupled memory (TCM)
Interrupts: PS-to-PS interrupts and Inter-processor interrupts (IPI)
Timers: Triple Timer Counters (TTC)
[Screenshots: hardware specification memory maps; IPI psu_ipi_1 example; linker script lscript.ld]
Firmware memory map: Select lscript.ld file in the source Explorer. The summary view shows the memory regions for this application:
The processor is configured to enable the TCM interfaces from reset with base address 0x0 for TCMA and 0x20000 for TCMB. This demo does not use TCMB.
The stack, heap and the interrupt vector table are in psu_r5_atcm_MEM_0, TCMA.
The text, read-only, initialized, uninitialized data sections, etc. are in psu_r5_ddr_0_MEM_0 starting at 0x3ED00000 with LENGTH=0x80000.
OpenAMP demos have a .resource_table ELF section. It lists the system resources required by the Linux remoteproc driver, e.g. the virtIO device data.
The Linux remoteproc driver on the APU will preload the TCMA with the boot code while halting the RPU via nCPUHALTm pin. When the nCPUHALTm pin is deasserted, the processor starts fetching instructions from the reset vector address in the normal way. The FreeRTOS _vector_table is loaded into TCMA at address 0x0 followed by _boot and Init_MPU().
The execution starts at address 0x0 in TCMA via a jump to _boot which calls Init_MPU(). All this is fetched from and executed in the TCMA. At the end the _boot jumps to _startup in the DDR. The _startup prepares the "C runtime" and calls main(). Starting from _startup the RPU fetches its instructions from the DDR while its stack and heap are in the TCMA.
Step 3. Build the application project: click lm_amp or lm_amp_system in the source Explorer and select "Project > Build Project". The build produces an Arm Cortex-R5 binary called lm_amp.elf in the Debug (or Release) directory. In this example, the Xilinx remoteproc Linux driver loads this binary to execute on the RPU (r5_0).
Build PetaLinux
Create a PetaLinux project using the BSP where we took the XSA file for the RPU firmware:
Depending on the selected demo you have to enable the following kernel config options (see: project-spec/meta-user/recipes-kernel/linux/linux-xlnx_%.bbappend and the kernel config file (e.g.: defconfig) it includes):
Copy one of the following device tree overlay files into <plnx-proj>/project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi, depending on the demo selected for the RPU firmware:
The following descriptions refer to the "ZynqMP Libmetal AMP Demo" device tree overlay (.dtsi).
The <reg> property of the rproc_0_reserved node refers to the same memory region as the one selected in lscript.ld in the Vitis application. This allows the Xilinx Linux remoteproc driver to take the lm_amp Cortex-R5 binary from /lib/firmware, parse its ELF headers and set up the segments in the DDR, where the RPU can read and execute the text and access the shared data.
The <reg> property of the shm0 node is included so that Linux UIO lets the user-space demo talk to the lm_amp firmware application on the RPU.
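Once the firmware is in /lib/firmware, the Linux remoteproc framework is normally driven through sysfs: the firmware file name is written to the `firmware` attribute and `start` or `stop` to the `state` attribute. The following is a minimal sketch of that sequence. It assumes that the RPU is registered as `remoteproc0` and that the binary is named `lm_amp.elf`; both are assumptions that must be checked on the target, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: the RPU is the first remoteproc instance on the target.
DEFAULT_RPROC = Path("/sys/class/remoteproc/remoteproc0")


def rproc_start(firmware: str, rproc: Path = DEFAULT_RPROC) -> None:
    """Select a firmware image (a file name relative to /lib/firmware) and boot it."""
    (rproc / "firmware").write_text(firmware)
    (rproc / "state").write_text("start")


def rproc_stop(rproc: Path = DEFAULT_RPROC) -> None:
    """Halt the remote processor."""
    (rproc / "state").write_text("stop")
```

On the target, running as root, `rproc_start("lm_amp.elf")` would load and start the firmware, and `rproc_stop()` would halt it again.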
Cross-reference for shared memory: the Linux device tree (system-user.dtsi), the RPU linker script (lscript.ld), and metal_dev_table from freertos/zynqmp_r5/zynqmp_amp_demo/sys_init.c
| firmware ELF: text, data, etc. sections | shared memory |
---|---|---|
RPU | lscript.ld: ORIGIN = 0x3ED00000, LENGTH = 0x00080000 /* 0x80000 is excessive. Can be 0x40000 as in the DT */ | SHM_BASE_ADDR 0x3ED80000 .size = 0x1000000 /* sys_init.c metal_dev_table */ |
APU kernel DT | rproc_0_reserved: reg = <0x0 0x3ed00000 0x0 0x40000>; | shm0: reg = <0x0 0x3ed80000 0x0 0x1000000>; |
APU app | /* not referenced via UIO */ | SHM_DEV_NAME "3ed80000.shm" /* common.h */ |
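The values in this table can be checked mechanically before building: the firmware region must not overlap the shared memory, and the linker region and the device tree reservation should agree. The sketch below runs those checks on the numbers from the table. The `Region` type and the helper names are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    base: int
    size: int

    @property
    def end(self) -> int:
        """First address past the region."""
        return self.base + self.size


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one address."""
    return a.base < b.end and b.base < a.end


def contains(outer: Region, inner: Region) -> bool:
    """True if inner lies entirely within outer."""
    return outer.base <= inner.base and inner.end <= outer.end


# Values from the cross-reference table above.
fw_linker = Region("lscript.ld DDR region", 0x3ED00000, 0x00080000)
fw_dt = Region("rproc_0_reserved", 0x3ED00000, 0x00040000)
shm = Region("shm0", 0x3ED80000, 0x01000000)

print(overlaps(fw_linker, shm))    # False: the linker region ends where shm0 begins
print(contains(fw_dt, fw_linker))  # False: the DT reserves less than the linker region
```

The second result reflects the comment in the table: the linker LENGTH of 0x80000 is larger than the 0x40000 reserved in the device tree, so the firmware has to fit within the first 0x40000 bytes.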
Cross-reference for the IPI. The ipi0 node for Linux UIO uses IPI Channel 7 with base address 0xFF34_0000, while the RPU uses its default IPI Channel-1 with BASE_ADDRESS 0xFF31_0000.
| IPI Channel | Base Addr | IPI_MASK: send IPI to Ch_7 | IPI_MASK: send IPI to Ch_1 | References |
---|---|---|---|---|---|
RPU_0 | Ch_1 | 0xFF31_0000 | 0x100_0000 (set bit 24) | | IPI_MASK 0x1000000 in common.h |
APU | Ch_7 | 0xFF34_0000 | | 0x100 (set bit 8) | -DCONFIG_IPI_MASK=0x100 in CMakeLists.txt |
The RPU code uses metal_dev_table from sys_init.c, common.h, CMakeLists.txt
RPU | IPI_BASE_ADDR=0xFF310000; /* metal_dev_table, CMakeLists.txt */ |
---|---|
APU kernel DT | reg = <0x0 0xff340000 0x0 0x1000>; /* ipi0 node */ |
APU app | IPI_DEV_NAME="ff340000.ipi" /* common.h, CMakeLists.txt */ |
Cross-reference for the TTCs, mapped via UIO for the Linux demo app
RPU | TTC0_BASE_ADDR=0xFF0E0000; /* metal_dev_table, CMakeLists.txt */ |
---|---|
APU kernel DT | &ttc0 /* enable TTC0 via the reference to ttc0 node */ |
APU app | TTC_DEV_NAME="ff110000.timer" /* common.h, CMakeLists.txt */ |
Build the PetaLinux project and package the BOOT.bin for Versal
For other architectures please see UG1144 PetaLinux Tools.
Changing the Defaults
LCM: TCM for RPU Firmware
Some OpenAMP demos are configured to have their text segment in DDR memory. This allows larger firmware to run on the RPU, but it might be slower than running from the TCM. According to the Cortex-R5 Technical Reference Manual: “An ATCM typically holds interrupt or exception code that must be accessed at high speed, without any potential delay resulting from a cache miss. A BTCM typically holds a block of data for intensive processing, such as audio or video processing.” Another reason to run from the TCM is power usage: if the APU suspends, the DDR can be powered down or put into retention mode to save power.
The linker script controls the location of the ELF sections of the RPU firmware. For example, to run with .vectors, .text and .note.gnu.build-id in psu_r5_atcm_MEM_0, and with everything else in psu_r5_btcm_MEM_0, use the following linker script:
IPC: Shared Memory
Shared memory parameter changes need to be coordinated between the RPU firmware code and the APU's Linux device tree as shown in this table:
| Default | Change to |
---|---|---|
RPU SHARED_MEM_PA | 0x3ED40000 | 0x3EF40000 |
RPU SHARED_MEM_SIZE | 0x100000 | no change |
RPU SHARED_BUF_OFFSET | 0x8000 | no change |
Linux DT: rpu0vdev0vring0 | reg = <0x0 0x3ed40000 0x0 0x4000>; | reg = <0x0 0x3ef40000 0x0 0x4000>; |
Linux DT: rpu0vdev0vring1 | reg = <0x0 0x3ed44000 0x0 0x4000>; | reg = <0x0 0x3ef44000 0x0 0x4000>; |
Linux DT: rpu0vdev0buffer | reg = <0x0 0x3ed48000 0x0 0x100000>; | reg = <0x0 0x3ef48000 0x0 0x100000>; |
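In the default column, the three device tree regions sit at fixed offsets from SHARED_MEM_PA: vring0 at the base, vring1 0x4000 above it, and the buffer at SHARED_BUF_OFFSET. The sketch below derives the `reg` lines for any base address from that layout. The layout is inferred from the default values in the table, and the function names are hypothetical.

```python
VRING_SIZE = 0x4000          # size of each vring region in the table
SHARED_BUF_OFFSET = 0x8000   # RPU SHARED_BUF_OFFSET
SHARED_MEM_SIZE = 0x100000   # RPU SHARED_MEM_SIZE


def shared_mem_regions(shared_mem_pa: int) -> dict:
    """Map each device tree node name to its (base, size) pair."""
    return {
        "rpu0vdev0vring0": (shared_mem_pa, VRING_SIZE),
        "rpu0vdev0vring1": (shared_mem_pa + VRING_SIZE, VRING_SIZE),
        "rpu0vdev0buffer": (shared_mem_pa + SHARED_BUF_OFFSET, SHARED_MEM_SIZE),
    }


def dt_reg(base: int, size: int) -> str:
    """Format a two-cell address / two-cell size reg property."""
    return f"reg = <0x0 {base:#x} 0x0 {size:#x}>;"


for node, (base, size) in shared_mem_regions(0x3EF40000).items():
    print(f"{node}: {dt_reg(base, size)}")
```

Deriving all three regions from one base keeps the vrings and the buffer from overlapping when SHARED_MEM_PA moves.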
IPC: IPI Channels
Selecting a different IPI channel also requires coordinating changes in the RPU firmware and the APU's Linux device tree:
Versal | Default IPI1 | Change to IPI2 |
---|---|---|
RPU IPI_BASE_ADDR | 0xFF340000 | 0xFF350000 |
RPU IPI_CHN_BITMASK | 0x00000020 | 0x00000040 |
RPU IPI_IRQ_VECT_ID | 63 | 64 |
Linux DT: xlnx,ipi-id | 3 | 4 |
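Because all four settings change together, it can help to keep each channel's values as one record and compute which entries differ. The sketch below does this with the two columns of the table above. The dictionary and function names are hypothetical, and only the values listed in the table are included.

```python
# Values copied from the Versal table above, keyed by the table's column names.
VERSAL_IPI_SETTINGS = {
    "IPI1": {
        "IPI_BASE_ADDR": 0xFF340000,
        "IPI_CHN_BITMASK": 0x00000020,
        "IPI_IRQ_VECT_ID": 63,
        "xlnx,ipi-id": 3,
    },
    "IPI2": {
        "IPI_BASE_ADDR": 0xFF350000,
        "IPI_CHN_BITMASK": 0x00000040,
        "IPI_IRQ_VECT_ID": 64,
        "xlnx,ipi-id": 4,
    },
}


def settings_to_update(old: str, new: str) -> dict:
    """Return {setting: (old value, new value)} for every setting that differs."""
    before, after = VERSAL_IPI_SETTINGS[old], VERSAL_IPI_SETTINGS[new]
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}


# Lists the settings that must be edited when moving from IPI1 to IPI2.
print(sorted(settings_to_update("IPI1", "IPI2")))
```

The first three settings belong to the RPU firmware and the last one to the Linux device tree, so both sides need to be rebuilt after the change.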
See the IPI Interrupt Channel Architecture:
Versal IPIs: am011-versal-acap-trm.pdf
ZynqMP IPIs: ug1085-zynq-ultrascale-trm.pdf
Additional Information
Timers for the Libmetal AMP Demo
The timers are used in the "libmetal AMP demo" and are not needed for any of the OpenAMP demos. ZynqMP and Versal have four Triple Timer Counters (TTC[0-3]). The demo code uses TTC0 by default, but any of the four TTCs can be used.