This Tech-Tip shows how to build a DDR-less system using the Zynq-7000 SoC. It describes how to execute boot loader code and application code without DDR. Users can take this framework and adapt it to their specific applications.
|Date||Version||Author||Description of Revisions|
|08/08/13||0.1||Yashu Gosain||Initial revision|
|14/03/14||0.2||Yashu Gosain||Updated for 2013.4|
|16/7/2014||0.3||Yashu Gosain||Updated for 2014.2|
|8/8/2014||0.4||Yashu Gosain||Modified Information related to FSBL Linker script|
|04/12/2014||0.5||Ravinder Are||Updated the tech-tip to cover the requirements below|
1. Integrate PL logic (for the AXI Timer) using Vivado 2014.3 and create the bitstream file for flashing.
2. Execute the FSBL in XIP mode and the application from the OCM memory region.
3. Simple interrupt-driven design application that demonstrates AXI Timer use in a PS application.
4. Added second application: C++ application execution in L2 cache lock mode.
|7/24/2015||0.6||Rajesh Gugulothu||Updated for Vivado/SDK 2015.2 version|
|7/29/2016||0.7||Rutuja Chavan||Updated for Vivado/SDK 2016.2 version|
Updated for Vivado/SDK 2017.3 version
|30/04/2020||0.9||Surender Polsani||Updated for Vivado/SDK 2019.1 version|
Some embedded systems need only a small memory footprint to store and execute their application. For such systems, on-chip memory (OCM) can satisfy the memory requirements, so integrating a DDR module only adds cost and board area. To reduce the BOM, the user only needs to execute the boot loader and the application from the OCM.
Sometimes the OCM is not large enough to store and execute the application. In that case, the application code can be executed from flash, but fetching instructions from flash is slower than executing from the OCM. For this scenario, this document describes a framework in which a system can be developed without external DDR memory. It also describes a way to improve performance when the application executes from flash because the OCM is not sufficient.
The following points are covered in this tip.
- Shows FSBL XIP execution.
- Provides a sample routine in the FSBL that preloads data/code into the L2 cache.
- Shows the L2 cache lockdown feature of the Zynq-7000 SoC.
- Provides a reference design for the AXI Timer in the PL.
- Shows how interrupts are set up and handled.
- Provides C and C++ applications that run in L2 cache lockdown mode.
Users of this Tech-Tip are expected to be familiar with the Zynq-7000 SoC boot sequence, the SDK tool, and the concept of execute in place (XIP).
|SW Type||Standalone (SMP)|
|CPUs||1 ARM Cortex-A9 667 MHz|
|Xilinx Tools Version||Vivado 2019.1, SDK 2019.1 (comes with Vivado installation)|
|Zynq7000AP_SoC_BootingWithoutExternalMemory_2019.1.zip||See Appendix A for the descriptions of the files|
1. Description
The project that demonstrates the features discussed above is divided into several sections of the design: the PS section, the PL section, and the application sections.
1.1 PS (Processing System section)
The Zynq-7000 SoC PS has a built-in PL310 cache controller that manages the L2 cache. It improves the performance of ARM-based systems when the processor generates significant memory traffic. The cache controller provides a cache lockdown feature that can lock a critical piece of code or data into the L2 cache. This feature is useful where a deterministic response is needed. The cache controller supports the following two lockdown schemes.
- Line-based locking
- Way-based locking
The unified 512 KB L2 cache is 8-way set-associative and allows users to lock the cache content on a line, way, or master basis. This Tech-Tip provides a reference design that shows how to preload data/code into the L2 cache and lock it there. The lockdown scheme used in the reference design is “way-based” locking. The design uses the PS triple timer counter (TTC) example application, modified to show how the application is preloaded into the L2 cache and locked. Details of the modifications are provided in the subsequent sections.
Note: For more detail about the PL310 cache controller, refer to ug585-Zynq-7000-TRM, “Zynq-7000 SoC Technical Reference Manual”.
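The way-based arithmetic implied by the figures above can be sketched in C as follows. This is a minimal host-side illustration, not code from the reference design: the function names are hypothetical, and the idea that a lock is expressed as a bit mask with one bit per way is an assumption about how such a mask would typically be passed to the cache controller.

```c
#include <assert.h>
#include <stdint.h>

#define L2_CACHE_SIZE_BYTES (512u * 1024u) /* from the text: unified 512 KB L2 cache */
#define L2_NUM_WAYS         8u             /* from the text: 8-way set-associative */

/* Size of one way: 512 KB / 8 ways = 64 KB. */
uint32_t l2_way_size_bytes(void)
{
    return L2_CACHE_SIZE_BYTES / L2_NUM_WAYS;
}

/* Number of ways needed to hold image_size bytes, rounded up (hypothetical helper). */
uint32_t l2_ways_needed(uint32_t image_size)
{
    uint32_t way = l2_way_size_bytes();
    assert(image_size <= L2_CACHE_SIZE_BYTES);
    return (image_size + way - 1u) / way;
}

/* Bit mask with bits 0..num_ways-1 set, one bit per way (assumed mask layout). */
uint32_t l2_way_lock_mask(uint32_t num_ways)
{
    assert(num_ways <= L2_NUM_WAYS);
    return (1u << num_ways) - 1u;
}
```

Under these assumptions, an application image of 100000 bytes would occupy two 64 KB ways, and locking all eight ways corresponds to the mask 0xFF.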
The Zynq-7000 SoC PS has a built-in PL390 interrupt controller, which is used to configure and service all internal and external interrupt sources.
Note: For more details about the Generic Interrupt Controller, refer to http://infocenter.arm.com/help/topic/com.arm.doc.ddi0416b/
Execute-in-Place Option:
The Zynq-7000 SoC PS has an execute-in-place option for QSPI flash memory. The Boot-ROM uses the linear addressing feature of the Quad-SPI controller for non-secure boot modes. In this case, the initial FSBL code must fit inside the first 16 MB of memory.
The reference design is based on the Zynq-7000 series. All reference platform selections are for the Zynq-7 ZC702 Evaluation Board.
1.2 PL (Programmable Logic Section)
This Tech-Tip reference design has the AXI Timer IP integrated in the PL and provides steps to generate the output bitstream file that is flashed to the Zynq board along with the other boot files. The AXI Timer/Counter is a 32/64-bit timer module that attaches to the AXI4-Lite interface. It is a general-purpose timer/counter with user-configurable options.
For more detail about the AXI Timer IP, refer to http://www.xilinx.com/support/documentation/ip_documentation/axi_timer/v2_0/pg079-axi-timer.pdf
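As a rough illustration of what configuring a 32/64-bit timer involves, the sketch below converts a desired period into a tick count and checks whether it fits a single 32-bit counter. The function names are hypothetical, and the 100 MHz clock used in the usage note is an assumed value, not one taken from the reference design.

```c
#include <stdint.h>

/* Number of timer ticks in period_ms milliseconds at a timer clock of clock_hz. */
uint64_t timer_ticks_for_period(uint32_t clock_hz, uint32_t period_ms)
{
    return ((uint64_t)clock_hz * period_ms) / 1000u;
}

/* Returns 1 if the tick count fits a single 32-bit counter, 0 if a 64-bit count is needed. */
int fits_32bit_counter(uint64_t ticks)
{
    return ticks <= UINT32_MAX;
}
```

With an assumed 100 MHz timer clock, a 1 second period is 100,000,000 ticks and fits in 32 bits, while a 60 second period is 6,000,000,000 ticks and does not.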
Use the Xilinx Vivado® Design Suite to integrate the AXI Timer IP with the ZC702 PS platform.
Connect the AXI Timer to the Generic Interrupt Controller (GIC) in the PS. Then follow the regular Vivado HW project steps through bitstream generation.
The following sections provide detailed information on creating a new Vivado project and integrating the AXI Timer, as well as on using the reference HW project.
Note: For more details on using the Vivado tools, refer to Vivado Design Suite User Guide: Getting Started (UG910) http://www.xilinx.com/support/documentation/sw_manuals/xilinx2014_3/ug910-vivado-getting-started.pdf
1.3 Memory Map, Boot flow and Reference Applications
This Tech-Tip describes the boot flow of Zynq-7000 series platforms and provides details on preloading an application into the L2 cache, locking it, and executing the application out of the L2 cache. It covers both C and C++ applications for demonstration.
QSPI Flash Memory Map:
The Quad-SPI linear address range is FC00_0000 to FDFF_FFFF; a single flash device occupies the first 16 MB of it. This range can be used for the execute-in-place (XIP) option, which the reference design uses to execute the FSBL.
OCM Memory Map:
The Zynq-7000 series platform has 256 KB of on-chip memory. By default, 192 KB is mapped at 0000_0000 to 0002_FFFF and 64 KB at FFFF_0000 to FFFF_FFFF.
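The memory map can be expressed as a small address classifier. The sketch below is a host-side illustration with hypothetical names; it assumes the default map in which the 192 KB OCM block starts at address 0, the 64 KB OCM block occupies the top 64 KB of the address space, and the Quad-SPI linear range runs from 0xFC000000 to 0xFDFFFFFF.

```c
#include <stdint.h>

typedef enum {
    REGION_OCM_LOW,      /* 192 KB starting at 0x00000000 (assumed default map) */
    REGION_OCM_HIGH,     /* 64 KB ending at 0xFFFFFFFF (assumed default map) */
    REGION_QSPI_LINEAR,  /* 0xFC000000 to 0xFDFFFFFF */
    REGION_OTHER
} region_t;

/* Classifies a 32-bit physical address against the assumed default memory map. */
region_t classify_address(uint32_t addr)
{
    if (addr < 0x00030000u) {
        return REGION_OCM_LOW;
    }
    if (addr >= 0xFFFF0000u) {
        return REGION_OCM_HIGH;
    }
    if (addr >= 0xFC000000u && addr <= 0xFDFFFFFFu) {
        return REGION_QSPI_LINEAR;
    }
    return REGION_OTHER;
}
```

A check like this can be handy when reviewing linker scripts, for example to confirm that a load address falls in the flash range and an execution address falls in the OCM.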
Boot flow:
The Zynq platform’s default boot loader functionality is split into two small programs: the Boot-ROM code and the first stage boot loader (FSBL). The Boot-ROM executes at start-up, loads the FSBL from non-volatile storage into on-chip memory (OCM), and executes it. The FSBL can also execute directly from flash; for this, the xip_mode attribute must be set when the boot image is created with the Bootgen tool. In the regular flow, the FSBL copies the application from non-volatile memory to external DDR memory and transfers control to the application start address. This Tech-Tip modifies the boot loader and its sequence to accomplish “application preloading to L2 cache, locking L2 cache and executing it”.
Note: In this Tech-Tip, the FSBL executes in XIP mode. The xip_mode attribute must be set while creating the BOOT.bin file; refer to the bootgen.bif file shared under the reference files.
|Figure: System Memory Map during and after Boot-ROM Execution|
Boot flow changes in this tech-tip:
- The FSBL checks for application segments that need to be preloaded into the L2 cache (preload_funct() function) and loads them.
- The PartitionMove() function has changes.
- After the application is loaded, the boot flow is the same as the regular one discussed above.
The Zynq boot code is spread across the FSBL and FSBL_bsp projects. Refer to software_projects.zip in the reference design files.
FSBL software flow for preloading the application into the L2 cache and locking it:
- Set the page table entry attribute to “Inner cacheable and not outer cacheable” for the flash linear memory region used by the FSBL. In the reference example, the FSBL runs from linear flash memory in the range 0xfc000000 - 0xfc6fffff.
- Set the page table entry attribute to “Inner and outer cacheable” for the flash linear memory region used as the load address of the application code segments. In the reference example, the applications use linear flash memory in the range 0xfc700000 - 0xfdffffff, which is set to outer cacheable but not inner cacheable.
- Invalidate the instruction cache.
- Invalidate both the L1 and L2 data caches.
- Disable both the L1 instruction and data caches.
- Preload the application into the L2 cache.
- Lock all 8 ways of the L2 cache.
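The preload step in the list above can be pictured as a loop that touches one location in every cache line of the application image. The sketch below is a host-side illustration only and is not the reference design's preload routine: the name preload_region is hypothetical, and the 32-byte line size is an assumption about the L2 cache. On the target, each read of an outer-cacheable address would allocate the line in the L2 cache; on a host the loop simply walks the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u /* assumption: 32-byte L2 cache line */

/* Reads one byte from every cache line in [start, start + length).
 * The volatile qualifier keeps the compiler from removing the reads.
 * Returns the number of lines touched. */
size_t preload_region(const volatile uint8_t *start, size_t length)
{
    size_t lines = 0;

    for (size_t offset = 0; offset < length; offset += CACHE_LINE_BYTES) {
        (void)start[offset]; /* the read itself is the point of the loop */
        lines++;
    }
    return lines;
}
```

Under the 32-byte assumption, preloading one 64 KB way touches 2048 lines. Locking the ways afterwards, as in the last step of the list, keeps those lines from being evicted.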
Changes to the default FSBL and FSBL_bsp files:
- xil_exception.c in the FSBL bsp: added the XVtable attribute to the vector table entry.
- lscript.ld: changed to accommodate the linker changes in the FSBL.
- main.c in the FSBL: added the “preload_func” function, which loads the application code into the cache and locks it.
- fsblhandoff.s: the FsblHandOffExit function is modified by commenting out the assembly code that disables the MMU and invalidates the cache when handing off to the application.
- image_mover.c: the PartitionMove() function is modified by commenting out some code to support the XIP boot flow.
- qspi.c: the initQspi() and QSPI access functions are modified; the changes are included in the reference design projects provided with this Tech-Tip. These modifications eliminate the re-initialization of the QSPI.
- pcap.c: code changes are guarded by the “DDRLESS_SYSTEM” macro in multiple places to change the boot flow sequence.
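The macro protection mentioned in the last item follows the usual conditional-compilation pattern. The sketch below only illustrates that pattern; it is not the actual pcap.c change, and the function and enum names are hypothetical.

```c
/* Defined here for illustration; in a real build the macro would normally
 * be set in the project's compiler settings. */
#define DDRLESS_SYSTEM 1

typedef enum { LOAD_TO_DDR, RUN_FROM_FLASH } app_target_t;

/* Hypothetical helper: selects the boot flow variant at compile time. */
app_target_t application_target(void)
{
#ifdef DDRLESS_SYSTEM
    return RUN_FROM_FLASH; /* DDR-less flow: no copy to external memory */
#else
    return LOAD_TO_DDR;    /* regular flow: copy the application to DDR */
#endif
}
```

Guarding the changes this way keeps the default FSBL behavior available when the macro is not defined.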
The FSBL linker script uses the memory ranges below.
ps7_ram_0_S_AXI_BASEADDR : ORIGIN = 0x00000000, LENGTH = 0x0002FF00
ps7_ram_0_S_AXI_BASEADDR_1 : ORIGIN = 0x0002FF00, LENGTH = 0x0000100
FLASH : ORIGIN = 0xFC000000 + 0x1700, LENGTH = 0x2FE000
ps7_ram_0_S_AXI_BASEADDR: OCM memory used as the execution address region for the data, heap, stack, and bss sections.
ps7_ram_0_S_AXI_BASEADDR_1: OCM memory used for the “Interrupt vector table”. It is initialized in the FSBL and continues to be used by the application without modification; the applications reserve the same region for the vector table.
FLASH: Flash memory used as the load and execution address region for the FSBL code and static data.
Importing and building the FSBL application is described in the next sections.
C Application:
This is a simple C application that uses the AXI Timer integrated in the PL. It initializes the interrupt controller, configures the AXI Timer with a counter value and a callback handler, and registers the timer interrupt service routine. On every timer timeout, an interrupt is triggered and the timer interrupt handler (ISR) executes. All the BSP files needed to build the application (timer, interrupt controller, and UART for serial prints) are linked with the application by copying them into the same project source path.
C-Application software flow:
- The entry point of the application code is set to _startcode, which is defined in the startup.s assembly file. At this entry point, the stack memory location is assigned to the stack pointer register, and execution then branches to the main() function.
- The main() function is defined in the Application.c file. It performs the following operations.
- Copies the data section from flash (load region) to the OCM (execution region).
- Copies the interrupt vector table from flash (load region) to the OCM (execution region).
- Configures the PL timer.
- Associates the timer ISR with the timer interrupt.
- Initializes the interrupt controller.
- Starts the timer and waits for the timer to trigger through an interrupt.
- Prints log messages inside the ISR.
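The two copy steps in the flow above amount to a word-by-word copy from a load region to an execution region. The sketch below is a minimal illustration with a hypothetical function name; on the target, the region bounds would typically come from symbols defined in the linker script, whereas here they are plain pointers so the code can run on a host.

```c
#include <stdint.h>

/* Copies the words in [load_start, load_end) to exec_start.
 * load_* stands in for the flash load region and exec_start for the OCM
 * execution region; the two regions must not overlap. */
void copy_section(const uint32_t *load_start, const uint32_t *load_end,
                  uint32_t *exec_start)
{
    while (load_start < load_end) {
        *exec_start++ = *load_start++;
    }
}
```

The same routine can serve for both the data section and the interrupt vector table, since each is just a contiguous block with its own load and execution addresses.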
C-Application Linker Script:
The C-Application linker script defines the following memory sections.
FLASH : ORIGIN = 0xFC700000, LENGTH = 0x100000
ps7_ram_0_S_AXI_BASEADDR : ORIGIN = 0x00010000, LENGTH = 0x0001FF00
ps7_ram_1_S_AXI_BASEADDR : ORIGIN = 0x0002ff00, LENGTH = 0x00000100
FLASH: QSPI flash memory used as the load address region for the application’s code and data segments.
ps7_ram_0_S_AXI_BASEADDR: OCM memory used as the execution address region for the application’s “data”, “stack”, and “heap” sections.
ps7_ram_1_S_AXI_BASEADDR: OCM memory used for the “Interrupt vector table”. It is initialized in the FSBL and continues to be used by the application without modification; the applications reserve the same region for the vector table.
CPP Application:
This application sets up the infrastructure required by the C++ runtime, such as the construction and destruction of global objects. It also demonstrates how to define and use basic C++ classes, with simple classes for mathematical operations such as summation and multiplication.
CPP-Application software flow:
- The entry point of the application code is set to _startcode, which is defined in the startup.s assembly file. At this entry point, the stack memory location is assigned to the stack pointer register.
- It then branches to the copyRO() function, defined in the copyRO.cc file, which copies the application’s own data segments to the OCM.
- It then branches to the __libc_init_array() function, which executes the global constructors for the static objects used across the application.
- It then branches to the main() function in the app.cc file, which defines classes for multiplication (class name ‘Mult’) and summation (class name ‘Sum’) and carries out the math operations using objects of these classes.
- On exit, it calls __libc_fini_array() to clean up all global static objects by calling their destructors.
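The constructor and destructor steps in the flow above boil down to walking a table of function pointers. The sketch below illustrates that idea in plain C; it is not the C library's implementation, and the names run_init_array, run_fini_array, demo_first, and demo_second are hypothetical. On the target, the table bounds would come from the linker script rather than from a local array.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Trace buffer so the call order can be observed (illustration only). */
int demo_trace[4];
int demo_count;

void demo_first(void)  { if (demo_count < 4) demo_trace[demo_count++] = 1; }
void demo_second(void) { if (demo_count < 4) demo_trace[demo_count++] = 2; }

/* Calls every entry in [begin, end) in order, as is done for global
 * constructors before main(). */
void run_init_array(init_fn *begin, init_fn *end)
{
    for (init_fn *fn = begin; fn < end; ++fn) {
        (*fn)();
    }
}

/* Calls every entry in [begin, end) in reverse order, as is conventionally
 * done for destructors on exit. */
void run_fini_array(init_fn *begin, init_fn *end)
{
    while (end > begin) {
        (*--end)();
    }
}
```

Calling run_init_array on a table holding demo_first and demo_second records 1 then 2; calling run_fini_array on the same table records 2 then 1.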
CPP-Application Linker Script:
The Cpp-Application linker script defines the following memory sections.
ps7_ram_0_S_AXI_BASEADDR : ORIGIN = 0x00010000, LENGTH = 0x00020000
FLASH : ORIGIN = 0xFC700000, LENGTH = 0x2FE000
ps7_ram_0_S_AXI_BASEADDR: OCM memory used as the execution address space for the application’s “data”, “stack”, and “heap” sections.
FLASH: QSPI flash memory used as the load address region for the application’s code and data segments.
2. Hardware and Application Software Development using Xilinx Vivado & SDK tools
Before starting hardware development, download and extract the shared Zynq7000AP_SoC_BootingWithoutExternalMemory_update.zip file to a local directory. From the extracted folders, copy the Reference_Design_Files folder to the C drive.
2.1 Hardware Development
This section describes how to generate the hardware and export it to SDK.
Using reference design files:
- The shared reference design files include the system_top_wrapper design Vivado project.
- Open the Vivado tool and source the project.tcl file shared under C:\Reference_Design_files
- Sourcing project.tcl launches the project with the reference design. project.tcl is in the “HW_design” folder of the reference design files.
- To rebuild and generate the bitstream file, run Generate Bitstream in the Vivado tool.
A detailed description follows.
1. Start the Vivado tool.
2. In the Tcl Console, cd to the HW project files path C:\Reference_Design_Files\HW_design\ and source project.tcl. This creates a new HW project with the reference design of the AXI Timer in the PL.
3. In the Flow Navigator, click Generate Bitstream.
4. Go to File > Export and click Export Hardware. When the Export Hardware dialog opens, enable Include bitstream and click OK.
SDK launching and importing the software project files
5. Go to File and click Launch SDK.
6. The SDK window is launched, and the workspace location is C:\Reference_Design_Files\HW_design\project\zynq_axitimer_design2019.1.sdk
2.2 Application Software Development
This section describes the application development and the creation of the boot image. The boot image is used to boot the Zynq device and run the application from the L2 cache. To skip this section and go straight to Application Execution, use the prebuilt image (BOOT.bin) located in the directory C:\Reference_Design_Files\generated_files\.
After SDK launches, the projects can be imported from the provided reference design files.
1. Select File > Import to open the import wizard.
2. Select General > Existing Projects into Workspace in the import wizard.
3. Select the archive file option and specify the location where the software applications are stored.
4. The import wizard displays a list of projects that are available to import.
- Select all these projects and click Finish.
5. Build each of the imported projects in this order: FSBL_ZIP_bsp, FSBL_XIP, then any application.
2.3 Generating Boot File
The boot file for this Tech-Tip contains the FSBL (FSBL_XIP.elf), the PL bitstream file (system_top_wrapper.bit), and the application (Application.elf/Application_Cpp.elf) binaries.
The following steps create the boot file. The xip_mode attribute must be set for the FSBL executable; refer to the reference design bootgen files.
1. In SDK, select Xilinx Tools > Create Boot Image. The Create Zynq Boot Image wizard opens.
2. Provide the fsbl_XIP.elf, bitstream, and application ELF files. Provide the proper offset for each executable in the boot image. Refer to the bootimage.bif file shared under the reference files.
Note:
- Find fsbl_XIP.elf in <SDK Workspace>\fsbl_XIP\Debug\fsbl_XIP.elf
- Find the .bit file in <SDK Workspace>\system_top_wrapper_hw_platform_0\system_top_wrapper.bit
- Find Application.elf in <SDK Workspace>\Application\Debug\Application.elf
3. Provide the output folder name in the Output Folder tab.
4. Click Create Image. The boot image file is created in the specified output folder.
Using the reference design projects: Modify the paths of the binaries and import the bootgen file “bootimage.bif” or “bootimage_Cpp.bif” to create the boot file for the C or Cpp application.
3 Application Execution
Follow the procedure below to execute the application on the board.
1. Make sure that the hardware board is set up properly.
Note: Refer to UG873, Zynq-7000 SoC: Concepts, Tools, and Techniques, section 2.1.3, Take Test Drive! Running the “Hello World” Application, for information about setting up the board.
2. Change the SW16 switch setting to set the board to JTAG flash mode.
3. Open a serial communication utility for the COM port.
Note: The configuration for the Zynq Processing System is: Baud rate: 115200; Data: 8 bit; Parity: none; Stop: 1 bit; Flow control: none.
4. Flash BOOT.bin to the Zynq QSPI flash memory using SDK (use the default fsbl.elf while flashing).
Note: Either use the prebuilt BOOT.bin/Boot_Cpp.bin files provided with the reference design files under the directory C:\Reference_Design_Files\generated_files, or use the BOOT.bin generated in the section above.
5. Change the SW16 switch setting to change the boot type to QSPI boot.
6. Power on the board.
7. Check the serial communication utility for messages.
8. After successful execution, messages show the application executing.
4 Expected Result
C Application:
Trace messages dumped on the serial terminal.
##### Timer demo Starts #####
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Inside Timer ISR
Timer ISR Exit
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Cpp Application:
Trace messages dumped on the serial terminal.
Addition Starts ...
Sum: 1+1 = 2
Addition Done
#####################################
Multiplication Starts...
#####################################
Multiplication: 1*1 = 1
Multiplication Done
#####################################
5 Appendix A: File Descriptions in the design_file directory
Extracted directory structure of the shared design files.
- Bootgen: It contains boot image format configuration data, C and Cpp applications have separate bootgen files.
- bootimage.bif : Bootgen file for C application.
- bootimage_Cpp.bif : Bootgen file for Cpp application.
- Application.elf: C Application Executable file.
- BOOT.bin : flashable executable.
- BOOT_Cpp.bin : flashable executable.
- FSBL_XIP.elf: FSBL executable.
- system_top_wrapper.bit : bitstream file generated by the Vivado tool.
- software_projects.zip: contains the source code files.
- C_Application_Result.txt: It contains the result data after executing C application.
- Cpp_Application_Result.txt: It contains the result data after executing Cpp application.