...
Check whether the driver has been probed:

    zynq> dmesg | grep edac
    EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (POLLED)
    zynq>

Read from any memory address above 500 MB (0x1F400000):

    zynq> devmem 0x1F500000
    Unhandled fault: imprecise external abort (0x1406) at 0xb6eb4700
    pgd = (ptrval)
    [b6eb4700] *pgd=3ec98831
    Bus error
    zynq> EDAC MC0: 1 CE Bit Position: 77 Data: 0x00000000 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
    EDAC MC0: 4 UE DDR ECC error type :UE Row 32064 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
    zynq>
    zynq>

List the sysfs attributes of the EDAC device:

    zynq> cat /sys/devices/system/edac/mc/mc0/
    ce_count csrow0/ mc_name rank0/ seconds_since_reset subsystem/ ue_noinfo_count
    ce_noinfo_count max_location power/ reset_counters size_mb ue_count uevent

Each CE increments ce_count and each UE increments ue_count.
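The 500 MB boundary used in the read step is plain byte arithmetic (500 * 1024 * 1024). As a minimal sketch, the helpers below compute that value and check whether a candidate address lies above it. The names `is_above_threshold` and `threshold_hex` are hypothetical and are not part of the driver or of `devmem`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# 500 MB in bytes: 500 * 1024 * 1024 = 524288000 = 0x1F400000.
THRESHOLD=$((500 * 1024 * 1024))

# Print the threshold in the hexadecimal form used in the procedure.
threshold_hex() {
    printf '0x%X\n' "$THRESHOLD"
}

# Succeed (exit status 0) when the given address is strictly above
# the threshold. The address may be decimal or 0x-prefixed hex.
is_above_threshold() {
    [ "$(($1))" -gt "$THRESHOLD" ]
}
```

For example, `is_above_threshold 0x1F500000 && devmem 0x1F500000` would only issue the read for an address that qualifies.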
ZynqMP
Injecting ECC errors for the ZynqMP DDRC controller

The following sysfs entries support injecting ECC errors:

- /sys/devices/system/edac/mc/mc0/inject_data_poison (selects CE or UE)
- /sys/devices/system/edac/mc/mc0/inject_data_error (specifies the address)

Enable CE or UE injection. The first command enables correctable error injection; the second enables uncorrectable error injection:

    echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
    echo "UE" > /sys/devices/system/edac/mc/mc0/inject_data_poison

Select the address at which to inject ECC errors. This command configures the data poison registers to inject errors at the specified address:

    echo 0x7EE0EEE0 > /sys/devices/system/edac/mc/mc0/inject_data_error

As per the ZynqMP DDRC controller specification, whenever a write to the specified address is detected, the controller injects errors at that location, and it reports them when a read is performed. So write some data to the specified address; with this command the controller corrupts the data at that address:

    devmem 0x7EE0EEE0 32 0x1234

Then read the data back from that address:

    devmem 0x7EE0EEE0
    EDAC MC0: 1 UE DDR ECC error type :UE Row 12544 Bank 0 Col 0 BankGroup Number 2 Block Number 64 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
    Unhandled fault: synchronous external abort (0x92000210) at 0x0000007f8d666200
    Bus error
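The injection steps above can be collected into one function. This is a sketch under stated assumptions: `inject_ecc_error` is a hypothetical name, and the `EDAC_DIR` and `DEVMEM` variables are only introduced here so the sequence can be pointed at a different directory or tool for a dry run. The sysfs entries and `devmem` invocations are the ones shown in the procedure.

```shell
#!/bin/sh
# Assumed defaults; both can be overridden before calling the function.
EDAC_DIR="${EDAC_DIR:-/sys/devices/system/edac/mc/mc0}"
DEVMEM="${DEVMEM:-devmem}"

# inject_ecc_error CE|UE <address>   (hypothetical helper)
inject_ecc_error() {
    kind="$1"   # "CE" or "UE"
    addr="$2"   # physical address, for example 0x7EE0EEE0

    case "$kind" in
        CE|UE) ;;
        *) echo "usage: inject_ecc_error CE|UE <address>" >&2; return 1 ;;
    esac

    # 1. Select correctable or uncorrectable error injection.
    echo "$kind" > "$EDAC_DIR/inject_data_poison" || return 1
    # 2. Configure the data poison registers with the target address.
    echo "$addr" > "$EDAC_DIR/inject_data_error" || return 1
    # 3. Write to the address so the controller corrupts the data there.
    "$DEVMEM" "$addr" 32 0x1234 || return 1
    # 4. Read the address back so the controller reports the error.
    "$DEVMEM" "$addr"
}
```

A call such as `inject_ecc_error CE 0x7EE0EEE0` mirrors the manual sequence. For a UE the final read is expected to end in a bus error, as in the log above, so a non-zero exit status from that step does not by itself mean the injection failed.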
Expected Output
Zynq
    zynq> devmem 0x1F500000
    Unhandled fault: imprecise external abort (0x1406) at 0xb6eb4700
    pgd = (ptrval)
    [b6eb4700] *pgd=3ec98831
    Bus error
    zynq> EDAC MC0: 1 CE Bit Position: 77 Data: 0x00000000 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
    EDAC MC0: 4 UE DDR ECC error type :UE Row 32064 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
    zynq> cat /sys/devices/system/edac/mc/mc0/ce_count
    1
    zynq># cat /sys/devices/system/edac/mc/mc0/ue_count
    6
    zynq>
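Reading both counters in one step makes it easier to compare values before and after a test. The following is a minimal sketch; `print_edac_counts` is a hypothetical helper, and the optional directory argument exists only so it can be tried against a copy of the files. It reads the same `ce_count` and `ue_count` entries shown above.

```shell
#!/bin/sh
# print_edac_counts [dir]   (hypothetical helper)
# Prints the correctable and uncorrectable error counters on one line.
print_edac_counts() {
    dir="${1:-/sys/devices/system/edac/mc/mc0}"
    # Fail if either counter file cannot be read.
    ce=$(cat "$dir/ce_count") || return 1
    ue=$(cat "$dir/ue_count") || return 1
    echo "ce_count=$ce ue_count=$ue"
}
```

Calling `print_edac_counts` once before and once after the `devmem` read shows which counter changed.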
ZynqMP
    root@Xilinx-ZCU102-2017_3:~# dmesg | grep EDAC
    [ 1.688239] EDAC MC: Ver: 3.0.0
    [ 1.691419] EDAC DEBUG: edac_mc_sysfs_init: device mc created
    [ 3.594032] EDAC DEBUG: edac_mc_alloc: allocating 2168 bytes for mci data (1 ranks, 1 csrows/channels)
    [ 3.594073] EDAC MC0: 5 CE DDR ECC error type :CE Row 0 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
    [ 3.594078] EDAC DEBUG: edac_mc_add_mc_with_groups:
    [ 3.594082] EDAC DEBUG: edac_create_sysfs_mci_device: creating bus mc0
    [ 3.594117] EDAC DEBUG: edac_create_sysfs_mci_device: creating device mc0
    [ 3.594180] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm0, located at csrow 0 channel 0
    [ 3.594230] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device rank0
    [ 3.594234] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow0
    [ 3.594324] EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (INTERRUPT)
    [ 3.646086] EDAC MC0: 10 UE DDR ECC error type :UE Row 0 Bank 0 Col 0 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
    root@Xilinx-ZCU102-2017_3:~#
    root@Xilinx-ZCU102-2017_3:~#
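The probe line in this log ends with the reporting mode in parentheses: `(INTERRUPT)` here, `(POLLED)` in the Zynq log. As a rough sketch, the hypothetical filter below pulls that word out of kernel log text on standard input. It assumes the message keeps the exact wording shown in the logs on this page.

```shell
#!/bin/sh
# edac_report_mode   (hypothetical helper)
# Reads kernel log text on stdin and prints the mode from the first
# "Giving out device" line for synps_edac. Returns non-zero when no
# such line is found, that is, when the driver does not appear probed.
edac_report_mode() {
    mode=$(sed -n 's/.*EDAC MC[0-9]*: Giving out device to module .*DEV synps_edac (\([A-Z]*\)).*/\1/p' | head -n 1)
    [ -n "$mode" ] && echo "$mode"
}
```

On a target it could be used as `dmesg | edac_report_mode`.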
    root@xilinx-zcu102-2017_3:~# echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
    root@xilinx-zcu102-2017_3:~# echo 0x7EE0EEE0 > /sys/devices/system/edac/mc/mc0/inject_data_error
    root@xilinx-zcu102-2017_3:~# devmem 0x7EE0EEE0 32 0x1234
    root@xilinx-zcu102-2017_3:~# devmem 0x7EE0EEE0
    [ 38.109379] EDAC MC0: 1 CE DDR ECC error type :CE Row 16240 Bank 1 Col 0 BankGroup Number 3 Block Number 748 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
    0x00001234
    root@xilinx-zcu102-2017_3:~# echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
    root@xilinx-zcu102-2017_3:~# echo 0x7EE0DDD0 > /sys/devices/system/edac/mc/mc0/inject_data_error
    [ 909.353767] random: crng init done
    root@xilinx-zcu102-2017_3:~# devmem 0x7EE0DDD0 32 0x1234
    root@xilinx-zcu102-2017_3:~# devmem 0x7EE0DDD0
    [ 917.861298] EDAC MC0: 1 CE DDR ECC error type :CE Row 16240 Bank 1 Col 0 BankGroup Number 3 Block Number 472 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
    0x00001234
    root@xilinx-zcu102-2017_3:~# echo "UE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
    root@xilinx-zcu102-2017_3:~# echo 0x6EE0DDD0 > /sys/devices/system/edac/mc/mc0/inject_data_error
    root@xilinx-zcu102-2017_3:~# devmem 0x6EE0DDD0 32 0x1234
    root@xilinx-zcu102-2017_3:~# devmem 0x6EE0DDD0
    [ 1492.761355] Unhandled fault: synchronous external abort (0x92000210) at 0x0000007fb52e9dd0
    [ 1492.761367] EDAC MC0: 1 UE DDR ECC error type :UE Row 14192 Bank 1 Col 0 BankGroup Number 3 Block Number 472 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
    Bus error
    root@xilinx-zcu102-2017_3:~#
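When a session produces many reports, it can help to total the counts printed in the `EDAC MC0: <n> CE ...` and `EDAC MC0: <n> UE ...` lines. The sketch below does that with `awk`. `tally_edac_events` is a hypothetical name, and the function simply adds up the number printed after the controller name in each line; it makes no attempt to reconcile the result with the sysfs counters.

```shell
#!/bin/sh
# tally_edac_events   (hypothetical helper)
# Reads kernel log text on stdin and prints the summed CE and UE counts.
tally_edac_events() {
    awk '
        # Match lines such as "EDAC MC0: 1 CE ..." or "EDAC MC0: 4 UE ...".
        match($0, /EDAC MC[0-9]+: [0-9]+ (CE|UE) /) {
            split(substr($0, RSTART, RLENGTH), f, " ")
            # f[1]="EDAC", f[2]="MC0:", f[3]=count, f[4]=type
            total[f[4]] += f[3]
        }
        END { printf "CE=%d UE=%d\n", total["CE"], total["UE"] }
    '
}
```

On a target it could be fed with `dmesg | tally_edac_events`. Lines that are not error reports, such as the probe message, are ignored.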
Change log
2023.2
- Summary
  - Fix the cumulative reporting of the errors
- Commits
  - None
2023.1
- Summary
  - None
- Commits
  - None
...