

Check whether the driver has been probed
zynq> dmesg | grep edac
EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (POLLED)
zynq>
 
Read from any memory address above 500 MB (0x1F400000)
zynq> devmem 0x1F500000
Unhandled fault: imprecise external abort (0x1406) at 0xb6eb4700
pgd = (ptrval)
[b6eb4700] *pgd=3ec98831
Bus error
zynq> EDAC MC0: 1 CE Bit Position: 77 Data: 0x00000000
 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
EDAC MC0: 4 UE DDR ECC error type :UE Row 32064 Bank 0 Col 0  on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
zynq>
zynq>
To see the complete set of sysfs entries for the EDAC device:
zynq> cat /sys/devices/system/edac/mc/mc0/
ce_count             csrow0/              mc_name              rank0/               seconds_since_reset  subsystem/           ue_noinfo_count
ce_noinfo_count      max_location         power/               reset_counters       size_mb              ue_count             uevent
 
Each CE error increments ce_count, and each UE error increments ue_count.
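
For reference, the 500 MB figure in the read test above is 500 x 1024 x 1024 bytes, which is where the 0x1F400000 value comes from; the address used in the example, 0x1F500000, lies 1 MB above it. A minimal sketch of that arithmetic in POSIX shell follows (mib_to_hex is a hypothetical helper name chosen for illustration, not part of any tool):

```shell
# Convert a size in MiB to a hexadecimal byte offset.
mib_to_hex() {
    printf '0x%X\n' "$(( $1 * 1024 * 1024 ))"
}

mib_to_hex 500   # prints 0x1F400000
mib_to_hex 501   # prints 0x1F500000
```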

ZynqMP

Injecting ECC Errors for ZynqMP DDRC Controller
The following sysfs entries support injecting ECC errors:
-> /sys/devices/system/edac/mc/mc0/inject_data_poison (to enable CE/UE)
-> /sys/devices/system/edac/mc/mc0/inject_data_error (to specify address)
 
Enable CE or UE error injection
-> echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
The above command enables correctable error (CE) injection.
-> echo "UE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
The above command enables uncorrectable error (UE) injection.
 
Select the address to inject ECC Errors
-> echo 0x7EE0EEE0 > /sys/devices/system/edac/mc/mc0/inject_data_error
 
The above command configures the data poison registers to inject errors at the specified address.
As per the ZynqMP DDRC controller spec, whenever a write operation is detected on the specified address, the controller injects errors at that location
and reports them when a read operation is performed.
So write some data to the specified address:
-> devmem 0x7EE0EEE0 32 0x1234
 
With the above command, the controller corrupts the data at that address.
Now read the data back from that address:
-> devmem 0x7EE0EEE0
 
EDAC MC0: 1 UE DDR ECC error type :UE Row 12544 Bank 0 Col 0 BankGroup Number 2 Block Number 64 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
Unhandled fault: synchronous external abort (0x92000210) at 0x0000007f8d666200
Bus error
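
The injection steps above can be collected into one shell function. This is only a sketch under stated assumptions: inject_ecc_error, EDAC_MC_DIR and DEVMEM are hypothetical names introduced here, not part of the driver; the sysfs paths and the devmem arguments are the ones shown above. Making the directory and the devmem command overridable allows a dry run on a machine without the hardware.

```shell
# Hypothetical wrapper around the injection steps described above.
# $1: error type, "CE" or "UE"
# $2: physical address to poison, e.g. 0x7EE0EEE0
# EDAC_MC_DIR and DEVMEM may be overridden, e.g. for a dry run.
inject_ecc_error() {
    err_type="$1"
    addr="$2"
    mc_dir="${EDAC_MC_DIR:-/sys/devices/system/edac/mc/mc0}"
    devmem_cmd="${DEVMEM:-devmem}"

    case "$err_type" in
        CE|UE) ;;
        *) echo "usage: inject_ecc_error CE|UE <address>" >&2; return 1 ;;
    esac

    # Select the error type, then the address to poison.
    echo "$err_type" > "$mc_dir/inject_data_poison" || return 1
    echo "$addr" > "$mc_dir/inject_data_error" || return 1

    # The write lets the controller corrupt the location;
    # the read makes it report the error.
    $devmem_cmd "$addr" 32 0x1234
    $devmem_cmd "$addr"
}
```

On the target this would be called as, for example, inject_ecc_error CE 0x7EE0EEE0. With "UE", the final read is expected to end in a bus error as in the log above, so the function then returns a non-zero status.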



Expected Output

Zynq

zynq> devmem 0x1F500000
Unhandled fault: imprecise external abort (0x1406) at 0xb6eb4700
pgd = (ptrval)
[b6eb4700] *pgd=3ec98831
Bus error
zynq> EDAC MC0: 1 CE Bit Position: 77 Data: 0x00000000
 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
EDAC MC0: 4 UE DDR ECC error type :UE Row 32064 Bank 0 Col 0  on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
zynq> cat /sys/devices/system/edac/mc/mc0/ce_count
1
zynq> cat /sys/devices/system/edac/mc/mc0/ue_count
6
zynq>
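
Reading both counters in one step can be convenient when comparing values before and after a test. A minimal sketch, assuming the mc0 sysfs path shown above; edac_counts is a hypothetical helper name, and the optional argument exists only so a different controller directory can be passed in:

```shell
# Print the CE and UE counters of one EDAC memory controller.
# $1: sysfs directory of the controller (defaults to mc0).
edac_counts() {
    mc_dir="${1:-/sys/devices/system/edac/mc/mc0}"
    printf 'ce_count=%s ue_count=%s\n' \
        "$(cat "$mc_dir/ce_count")" \
        "$(cat "$mc_dir/ue_count")"
}

# Usage on the target:
#   edac_counts
```

With the counter values shown in the capture above, this would print ce_count=1 ue_count=6.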

ZynqMP

root@Xilinx-ZCU102-2017_3:~# dmesg | grep EDAC
[    1.688239] EDAC MC: Ver: 3.0.0
[    1.691419] EDAC DEBUG: edac_mc_sysfs_init: device mc created
[    3.594032] EDAC DEBUG: edac_mc_alloc: allocating 2168 bytes for mci data (1 ranks, 1 csrows/channels)
[    3.594073] EDAC MC0: 5 CE DDR ECC error type :CE Row 0 Bank 0 Col 0  on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
[    3.594078] EDAC DEBUG: edac_mc_add_mc_with_groups:
[    3.594082] EDAC DEBUG: edac_create_sysfs_mci_device: creating bus mc0
[    3.594117] EDAC DEBUG: edac_create_sysfs_mci_device: creating device mc0
[    3.594180] EDAC DEBUG: edac_create_sysfs_mci_device: creating dimm0, located at csrow 0 channel 0
[    3.594230] EDAC DEBUG: edac_create_dimm_object: creating rank/dimm device rank0
[    3.594234] EDAC DEBUG: edac_create_csrow_object: creating (virtual) csrow node csrow0
[    3.594324] EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (INTERRUPT)
[    3.646086] EDAC MC0: 10 UE DDR ECC error type :UE Row 0 Bank 0 Col 0  on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
root@Xilinx-ZCU102-2017_3:~#
root@Xilinx-ZCU102-2017_3:~#
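
The "Giving out device" line in the log above shows whether the driver was probed and in which mode (POLLED in the Zynq capture, INTERRUPT in the ZynqMP one). A small sketch that extracts the device name and mode from such a line; edac_probe_mode is a hypothetical helper name, and the pattern is based only on the two log lines shown on this page:

```shell
# Read kernel log text on stdin and print "<device> <mode>" for each
# EDAC "Giving out device" line, e.g. "synps_edac INTERRUPT".
edac_probe_mode() {
    sed -n 's/.*EDAC MC[0-9]*: Giving out device .* DEV \([^ ]*\) (\([A-Z]*\)).*/\1 \2/p'
}

# Usage on the target:
#   dmesg | edac_probe_mode
```

No output means no matching line was found, which suggests the driver was not probed.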
 
 


root@xilinx-zcu102-2017_3:~# echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
root@xilinx-zcu102-2017_3:~# echo 0x7EE0EEE0 > /sys/devices/system/edac/mc/mc0/inject_data_error
root@xilinx-zcu102-2017_3:~# devmem 0x7EE0EEE0 32 0x1234
root@xilinx-zcu102-2017_3:~# devmem 0x7EE0EEE0
[   38.109379] EDAC MC0: 1 CE DDR ECC error type :CE Row 16240 Bank 1 Col 0 BankGroup Number 3 Block Number 748 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
0x00001234
root@xilinx-zcu102-2017_3:~# echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
root@xilinx-zcu102-2017_3:~# echo 0x7EE0DDD0 > /sys/devices/system/edac/mc/mc0/inject_data_error
[  909.353767] random: crng init done
root@xilinx-zcu102-2017_3:~# devmem 0x7EE0DDD0 32 0x1234
root@xilinx-zcu102-2017_3:~# devmem 0x7EE0DDD0
[  917.861298] EDAC MC0: 1 CE DDR ECC error type :CE Row 16240 Bank 1 Col 0 BankGroup Number 3 Block Number 472 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
0x00001234
root@xilinx-zcu102-2017_3:~# echo "UE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
root@xilinx-zcu102-2017_3:~# echo 0x6EE0DDD0 > /sys/devices/system/edac/mc/mc0/inject_data_error
root@xilinx-zcu102-2017_3:~# devmem 0x6EE0DDD0 32 0x1234
root@xilinx-zcu102-2017_3:~# devmem 0x6EE0DDD0
[ 1492.761355] Unhandled fault: synchronous external abort (0x92000210) at 0x0000007fb52e9dd0
[ 1492.761367] EDAC MC0: 1 UE DDR ECC error type :UE Row 14192 Bank 1 Col 0 BankGroup Number 3 Block Number 472 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1)
Bus error
root@xilinx-zcu102-2017_3:~#
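
When several injections are run in a row, the EDAC reports can be condensed to one short line each. The following is a sketch only: edac_summary is a hypothetical helper name, and the pattern is written against the "DDR ECC error type" lines shown above, so reports without Row and Bank fields (such as the Zynq "Bit Position" line) are skipped.

```shell
# Read kernel log text on stdin and print one line per EDAC report:
#   <type> <count> <row> <bank>
edac_summary() {
    sed -n 's/.*EDAC MC[0-9]*: \([0-9][0-9]*\) \([CU]E\) .* Row \([0-9][0-9]*\) Bank \([0-9][0-9]*\) .*/\2 \1 \3 \4/p'
}

# Usage on the target:
#   dmesg | edac_summary
```

For the CE report at timestamp 38.109379 above, this prints "CE 1 16240 1".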

Change log

2016.3

  • Summary
    • None
  • Commits
    • None

2016.4

  • Summary
    • None
  • Commits
    • None

2017.1

  • Summary
    • None
  • Commits
    • None

2017.2

  • Summary
    • None
  • Commits
    • None

2017.3

  • Summary
    • Do not use symbolic permissions
  • Commits

2017.4

  • Summary
    • Fix for incorrect Macro defines
  • Commits

2018.1

  • Summary
    • Add Memory mapping, 16bit row mode and video buffer mode support
  • Commits

2018.2

  • Summary
    • None
  • Commits
    • None

2018.3

2019.1

  • Summary
    • None
  • Commits
    • None

2019.2

  • Summary
    • None
  • Commits
    • None

2020.1

  • Summary
    • None
  • Commits
    • None

2020.2

  • Summary
    • Fix the wrong value assignment for edac_mode.
  • Commits

2021.1

  • Summary
    • Fix the issue in reporting of the error count.
  • Commits

Related Links