Description
Running OpTestEEH with the command "./op-test -c mihawk.conf --run testcases.OpTestEEH" fails.
《OP-Test Log》
test-run-20211211212625.zip
《FAIL Message》
ERROR [123.617s]: runTest (testcases.OpTestEEH.OpTestEEHmax_frozen_pe)
Traceback (most recent call last):
  File "/home/ooo/0814_op-test/testcases/OpTestEEH.py", line 636, in runTest
    self.verify_location_code_logging(pe)
  File "/home/ooo/0814_op-test/testcases/OpTestEEH.py", line 381, in verify_location_code_logging
    "PE ", pe, "Kernel failed to log the location codes for a PCI EEH error")
testcases.OpTestEEH.EEHLocCodeFailed: PE 0031:03:00.0 location code failure: Log=Kernel failed to log the location codes for a PCI EEH error
Ran 4 tests in 378.641s
FAILED (errors=4)
《Manual-Test》
Manual-Test-Log.txt
However, when we run "echo 40:0:4:0:0 > /sys/kernel/debug/powerpc/PCI0000/err_injct && lspci -ns 0000:01:00.0" five times, the PCI card (PCI address "0000:01:00.0") recovers automatically each time.
When we run the same command a sixth time, the PCI card does not recover until we reboot the OS.
Based on this manual test result, the OpTestEEH commands pass when they are run by hand.
Please check whether the issue comes from the OpTestEEH script.
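For anyone who wants to repeat the manual procedure above, here is a minimal Python sketch of the same loop. The debugfs path, the injection string, and the device address are copied from the commands in this report. The helper names (run_shell, device_visible, inject_and_check), the retry count, and the delay are hypothetical choices, and treating "lspci lists the device" as "recovered" is an assumption, not how OpTestEEH itself decides.

```python
import subprocess
import time

# Values copied from the manual test commands in this report.
ERR_INJECT_PATH = "/sys/kernel/debug/powerpc/PCI0000/err_injct"
ERR_INJECT_ARGS = "40:0:4:0:0"
DEVICE = "0000:01:00.0"


def run_shell(cmd):
    """Run cmd through the shell and return (exit_code, stdout)."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def device_visible(run=run_shell, device=DEVICE):
    """Assumption: the card counts as recovered when lspci still lists it."""
    code, out = run(f"lspci -ns {device}")
    return code == 0 and out.strip() != ""


def inject_and_check(attempts, run=run_shell, sleep=time.sleep,
                     retries=30, delay=2.0):
    """Inject the error `attempts` times and record whether the card returns.

    retries and delay are guesses; adjust them to the recovery time
    actually seen on the system under test.
    """
    results = []
    for attempt in range(1, attempts + 1):
        # Same command as in the manual test: inject, then touch the device.
        run(f"echo {ERR_INJECT_ARGS} > {ERR_INJECT_PATH} && lspci -ns {DEVICE}")
        recovered = False
        for _ in range(retries):
            sleep(delay)
            if device_visible(run):
                recovered = True
                break
        results.append((attempt, recovered))
    return results


if __name__ == "__main__":
    # Needs root and a mounted debugfs on the system under test.
    for attempt, recovered in inject_and_check(6):
        print(f"injection {attempt}: {'recovered' if recovered else 'NOT recovered'}")
```

On the system described here, the manual result would correspond to the first five injections reporting "recovered" and the sixth reporting "NOT recovered". The run and sleep parameters exist only so the loop can be exercised without real hardware.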
《SUT's Config》
[Kernel]
4.18.0-305.25.1.el8_4.ppc64le
[FW Config]
BMC: op940.22.mih-1-0-g41157d8d2e
Pnor: OP9_v2.4.1-4.31-prod
[HW Config]
CPU DD2.3 20 core *2
Micron Technology(MTA18ASF2G72PZ-2G9E1)16GiB x32
SAMSUNG PM985 (MZ1LB960HAJQ-00007) 960GB M.2 x1
PSU ACBEL 2000w *2
Slot1: 2-PORT 100Gb ROCE EN CONNECTX-5 GEN4 PCIe x16 LP CAPABLE ADAPTER