CONFIG_EDAC_SYNOPSYS:

Support for error detection and correction on the Synopsys DDR
memory controller.

Symbol: EDAC_SYNOPSYS [=m]
Type  : tristate
Prompt: Synopsys DDR Memory Controller
  Location:
    -> Device Drivers
      -> EDAC (Error Detection And Correction) reporting (EDAC [=y])
        -> Main Memory EDAC (Error Detection And Correction) reporting (EDAC_MM_EDAC [=y])
  Defined at drivers/edac/Kconfig:386
  Depends on: EDAC [=y] && EDAC_MM_EDAC [=y] && (ARM [=y] || ARM64)
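Whether a given kernel configuration enables this option can be checked by looking up the symbol in the config file. The following is a minimal sketch, assuming a plain-text config file such as a build tree's .config (or /proc/config.gz decompressed first, which is available only when the kernel was built with CONFIG_IKCONFIG_PROC). The function name edac_synopsys_state is hypothetical and not part of any existing tool.

```shell
#!/bin/sh
# Print how CONFIG_EDAC_SYNOPSYS is set in a kernel config file:
#   y = built in, m = module, n = not set or not present.
# edac_synopsys_state is a hypothetical helper name used for illustration.
edac_synopsys_state() {
    config_file=$1
    # Only an uncommented assignment counts; "# ... is not set" does not match.
    line=$(grep '^CONFIG_EDAC_SYNOPSYS=' "$config_file" 2>/dev/null)
    case $line in
        CONFIG_EDAC_SYNOPSYS=y) echo y ;;
        CONFIG_EDAC_SYNOPSYS=m) echo m ;;
        *) echo n ;;
    esac
}
```

For the configuration shown above (EDAC_SYNOPSYS [=m]), calling edac_synopsys_state on the matching config file would print m.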
memory-controller@f8006000 {
        compatible = "xlnx,zynq-ddrc-a05";
        reg = <0xf8006000 0x1000>;
};
memory-controller@fd070000 {
        compatible = "xlnx,zynqmp-ddrc-2.40a";
        reg = <0x0 0xfd070000 0x0 0x30000>;
        interrupt-parent = <&gic>;
        interrupts = <0 112 4>;
};
Steps to reserve the test memory location for error injection:
reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        reserved: buffer@0 {
                reusable;
                reg = <0x0 0x7EE0EEE0 0x0 0x00100000>;
        };
};

reserved-driver@0 {
        compatible = "xlnx,reserved-memory";
        memory-region = <&reserved>;
};
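The reg property of the reserved buffer holds a base address and a size, two cells each because #address-cells and #size-cells are both 2. The sketch below works out the address window this reserves, which helps when checking that an injection address falls inside it. The helper names region_end and in_region are hypothetical, and only the low 32-bit cells are passed in because the high cells are 0x0 here.

```shell
#!/bin/sh
# Hypothetical helpers for checking an address against a reserved region.
# Arguments are hex or decimal numbers as accepted by shell arithmetic.

# region_end BASE SIZE
# Print the first address past the region (BASE + SIZE) in hex.
region_end() {
    printf '0x%X\n' $(( $1 + $2 ))
}

# in_region BASE SIZE ADDR
# Succeed if ADDR lies in the half-open range [BASE, BASE + SIZE).
in_region() {
    [ $(( $3 )) -ge $(( $1 )) ] && [ $(( $3 )) -lt $(( $1 + $2 )) ]
}
```

With the values from the node above, region_end 0x7EE0EEE0 0x00100000 prints 0x7EF0EEE0, so the reserved window covers 0x7EE0EEE0 up to but not including 0x7EF0EEE0.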
Check whether the driver has been probed:

zynq> dmesg | grep edac
EDAC MC-1: Giving out device to 'xilinxps_edac' 'zynq_ddr_controller': DEV f8006000.ps7-ddrc
zynq>

Perform a read on a memory address and then a write; the EDAC driver then reports the ECC error information:

zynq> devmem 0x1F400000 0xEA000049
zynq> devmem 0x1F400000 0x5D600000
Unhandled fault: external abort on non-linefetch (0x1018) at 0xb6f83000
Bus error
zynq> EDAC MC-1: 2 CE DDR ECC error type :CE Row 0 Bank 0 Col 512 on mc#4294967295csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
EDAC MC-1: 3 UE DDR ECC error type :UE Row 0 Bank 0 Col 512 on mc#4294967295csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
zynq>

To see all the information exposed by the EDAC device:

zynq> cat /sys/devices/system/edac/mc/mc-1/
ce_count             power/               subsystem/
ce_noinfo_count      rank0/               ue_count
csrow0/              reset_counters       ue_noinfo_count
max_location         seconds_since_reset  uevent
mc_name              size_mb

For each CE or UE error, ce_count or ue_count, respectively, is incremented.
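The two counters can be read together with a small helper. This is a minimal sketch: the function name edac_counts is hypothetical, and the default directory is the mc0 path used for ZynqMP on this page (the Zynq logs on this page show mc-1 instead, so pass that directory explicitly there).

```shell
#!/bin/sh
# Print the CE and UE counters of one EDAC memory controller.
# $1 is the controller's sysfs directory; edac_counts is a hypothetical name.
edac_counts() {
    mc_dir=${1:-/sys/devices/system/edac/mc/mc0}
    # Fail early if either counter file cannot be read.
    ce=$(cat "$mc_dir/ce_count") || return 1
    ue=$(cat "$mc_dir/ue_count") || return 1
    echo "ce=$ce ue=$ue"
}
```

Calling it once before and once after a test access, for example edac_counts /sys/devices/system/edac/mc/mc-1, shows by how much each counter moved.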
Injecting ECC Errors for ZynqMP DDRC Controller

The following sysfs entries support injecting ECC errors:
-> /sys/devices/system/edac/mc/mc0/inject_data_poison (to enable CE/UE)
-> /sys/devices/system/edac/mc/mc0/inject_data_error (to specify address)

Enable the CE/UE errors:
-> echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
   This command enables correctable error injection.
-> echo "UE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
   This command enables uncorrectable error injection.
Select the address at which to inject ECC errors:
-> echo 0x7EE0EEE0 > /sys/devices/system/edac/mc/mc0/inject_data_error
   This command configures the data poison registers to inject errors at the specified address.

As per the ZynqMP DDRC controller specification, whenever a write operation is detected on the specified address, the controller injects errors at that location. It reports the errors when a read operation is performed on that address.

So write some data to the specified address:
-> devmem 0x7EE0EEE0 32 0x1234
   With this command, the controller corrupts the data at that address.

Then try reading the data back from that address:
-> devmem 0x7EE0EEE0
EDAC MC0: 1 UE DDR ECC error type :UE Row 12544 Bank 0 Col 0 BankGroup Number 2 Block Number 64 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
Unhandled fault: synchronous external abort (0x92000210) at 0x0000007f8d666200
Bus error
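The two sysfs writes that arm the injection can be wrapped in one function. This is a sketch under stated assumptions: inject_ecc_error and the MC_DIR override variable are hypothetical names, and the function touches only the two sysfs entries named on this page. The devmem write and read that trigger and report the error are left to the caller.

```shell
#!/bin/sh
# Arm ECC error injection through the sysfs entries described on this page.
# inject_ecc_error and MC_DIR are hypothetical names used for illustration.
inject_ecc_error() {
    kind=$1    # "CE" (correctable) or "UE" (uncorrectable)
    addr=$2    # physical address inside the reserved test region
    mc_dir=${MC_DIR:-/sys/devices/system/edac/mc/mc0}

    # Reject anything other than the two error types the page documents.
    case $kind in
        CE|UE) ;;
        *) echo "usage: inject_ecc_error CE|UE ADDRESS" >&2; return 2 ;;
    esac

    # Select the error type first, then the address to poison.
    echo "$kind" > "$mc_dir/inject_data_poison" || return 1
    echo "$addr" > "$mc_dir/inject_data_error" || return 1
}
```

After inject_ecc_error UE 0x7EE0EEE0 succeeds, the devmem write followed by the devmem read on the same address completes the sequence.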
zynq> devmem 0x1F400000
Unhandled fault: imprecise external abort (0x1406) at 0x000cb884
Bus error
zynq> EDAC MC-1: 2 CE DDR ECC error type :CE Row 0 Bank 0 Col 512 on mc#4294967295csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
EDAC MC-1: 3 UE DDR ECC error type :UE Row 0 Bank 0 Col 512 on mc#4294967295csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
zynq> cat /sys/devices/system/edac/mc/mc-1/ce_count
11
zynq> cat /sys/devices/system/edac/mc/mc-1/ue_count
8
zynq>
root@Xilinx-ZCU102-2016_1:~# dmesg | grep EDAC
[    1.688239] EDAC MC: Ver: 3.0.0
[    1.691419] EDAC DEBUG: edac_mc_sysfs_init: device mc created
[    3.594032] EDAC DEBUG: edac_mc_alloc: allocating 2168 bytes for mci data (1 ranks, 1 csrows/channels)
[    3.594073] EDAC MC0: 5 CE DDR ECC error type :CE Row 0 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
[    3.594078] EDAC DEBUG: edac_mc_add_mc_with_groups:
[    3.594082] EDAC DEBUG: edac_create_sysfs_mci_device: creating bus mc0
[    3.594117] EDAC DEBUG: edac_create_sysfs_mci_device: creating device mc0
[    3.594180] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm0, located at csrow 0 channel 0
[    3.594230] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device rank0
[    3.594234] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow0
[    3.594324] EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (INTERRUPT)
[    3.646086] EDAC MC0: 10 UE DDR ECC error type :UE Row 0 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
root@Xilinx-ZCU102-2016_1:~# devmem 0x61000000
[   28.911942] Unhandled fault: synchronous external abort (0x92000210) at 0x0000007f9038e000
[   28.911955] EDAC MC0: 1 CE DDR ECC error type :CE Row 12416 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
[   28.911964] EDAC MC0: 14 UE DDR ECC error type :UE Row 12416 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
Bus error
root@Xilinx-ZCU102-2016_1:~# cat /sys/devices/system/edac/mc/mc0/ce_count
6
root@Xilinx-ZCU102-2016_1:~# cat /sys/devices/system/edac/mc/mc0/ue_count
24
root@Xilinx-ZCU102-2016_1:~# devmem 0x61000000
root@Xilinx-ZCU102-2016_1:~# devmem 0x61000000 72000000
[   56.159845] Unhandled fault: synchronous external abort (0x92000210) at 0x0000007f8205c000
[   56.159858] EDAC MC0: 2 CE DDR ECC error type :CE Row 14592 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
[   56.159867] EDAC MC0: 13 UE DDR ECC error type :UE Row 14592 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
Bus error
root@Xilinx-ZCU102-2016_1:~# cat /sys/devices/system/edac/mc/mc0/ce_count
8
root@Xilinx-ZCU102-2016_1:~# cat /sys/devices/system/edac/mc/mc0/ue_count
37
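When a log like the one above grows long, the EDAC report lines can be tallied with a short filter. This is a minimal sketch: the function name count_edac_reports is hypothetical, and it matches the "error type :CE" and "error type :UE" text exactly as it appears in the logs on this page, which may differ in other driver versions.

```shell
#!/bin/sh
# Count EDAC CE and UE report lines in kernel log text read from stdin.
# count_edac_reports is a hypothetical helper name used for illustration.
count_edac_reports() {
    awk '
        /EDAC MC.*error type :CE/ { ce++ }
        /EDAC MC.*error type :UE/ { ue++ }
        END { printf "ce_reports=%d ue_reports=%d\n", ce, ue }
    '
}
```

A typical call is dmesg | count_edac_reports. The result counts report lines, not errors: each report line carries its own error count (for example "14 UE"), so the sysfs ce_count and ue_count values can be larger than these tallies.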