tag name | mlx5-updates-2019-05-04 (b17ae1472fa3e083ecd69cd7975de8623ca74c43) |
tag date | 2019-05-04 17:22:54 -0700 |
tagged by | Saeed Mahameed <saeedm@mellanox.com> |
tagged object | commit 30d8b932dc... |
download | linux-mlx5-updates-2019-05-04.tar.gz |
---|
mlx5-updates-2019-05-04
Mlx5 devlink health fw reporters and sw reset support
This series provides mlx5 firmware reset support and firmware devlink health
reporters.
1) Add CR-Space access and FW Crdump snapshot support via devlink region_snapshot
2) Issue software reset upon FW asserts
3) Add fw and fw_fatal devlink health reporters to follow fw error indications with
dump and recover procedures, and allow the user to trigger this functionality.
3.1) fw reporter:
The fw reporter implements diagnose and dump callbacks.
It follows symptoms of fw errors, such as the fw syndrome, by triggering a
fw core dump and storing it, along with any other fw trace, in the dump buffer.
The fw reporter diagnose command can be triggered any time by the user to check
current fw status.
3.2) fw_fatal reporter:
The fw_fatal reporter implements dump and recover callbacks.
It follows fatal error indications with a CR-space dump and a recover flow.
The CR-space dump uses the vsc interface, which remains valid even when the FW
command interface is not functional, as is the case in most FW fatal errors. The
CR-space dump is stored as a memory region snapshot to ease reading by address.
The recover function runs the recover flow, which reloads the driver and triggers
a fw reset if needed.
Command examples and output:
diagnose data:
assert_var[0] 0xfc3fc043
assert_var[1] 0x0001b41c
assert_var[2] 0x00000000
assert_var[3] 0x00000000
assert_var[4] 0x00000000
assert_exit_ptr 0x008033b4
assert_callra 0x0080365c
fw_ver 16.24.1000
hw_id 0x0000020d
irisc_index 0
synd 0x8: unrecoverable hardware error
ext_synd 0x003d
raw fw_ver 0x101803e8
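The raw fw_ver value above appears to pack the dotted version shown in the fw_ver field. A minimal Python sketch of that decoding follows. It assumes an 8-bit major, 8-bit minor and 16-bit sub-minor layout, which is inferred only from this sample output and is not stated by the series; the helper name decode_fw_ver is hypothetical.

```python
def decode_fw_ver(raw: int) -> str:
    """Split a 32-bit raw fw_ver into major.minor.subminor.

    Assumed layout (inferred from the sample output, not confirmed):
    bits 31..24 = major, bits 23..16 = minor, bits 15..0 = sub-minor.
    """
    major = (raw >> 24) & 0xFF
    minor = (raw >> 16) & 0xFF
    subminor = raw & 0xFFFF
    return f"{major}.{minor}.{subminor}"


if __name__ == "__main__":
    # Raw value taken from the diagnose output above.
    print(decode_fw_ver(0x101803E8))
```

With the sample value 0x101803e8 this yields 16.24.1000, which matches the fw_ver line in the same diagnose output.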
dump traces:
trace: 0000:82:00.1 [0x69cd6c5283e] 0 [0xb8] dump general info GVMI=0x0001
trace: 0000:82:00.1 [0x69cd6c53bec] 0 [0xb8] GVMI management info, gvmi_management context:
trace: 0000:82:00.1 [0x69cd6c55eff] 0 [0xb8] [000]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c5657f] 0 [0xb8] [010]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c56608] 0 [0xb8] [020]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c566ff] 0 [0xb8] [030]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c5677f] 0 [0xb8] [040]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c5687f] 0 [0xb8] [050]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c568ff] 0 [0xb8] [060]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c569a5] 0 [0xb8] [070]: 00000000 00000000 00000000 00000000
trace: 0000:82:00.1 [0x69cd6c57021] 0 [0xb8] CMDIF dbase from IRON: active_dbase_slots = 0x00000000
trace: 0000:82:00.1 [0x69cd6c58dae] 0 [0xb8] GVMI=0x0001 hw_toc context:
trace: 0000:82:00.1 [0x69cd6c58e7f] 0 [0xb8] [000]: 00400100 00000000 00000000 fffff000
trace: 0000:82:00.1 [0x69cd6c58f7f] 0 [0xb8] [010]: 00000000 00000000 00000000 00000000
...
...
devlink_region_name: cr-space snapshot_id: 1
00000000000f0018 e1 03 00 00 fb ae a9 3f
0000000000000000 00 20 00 01 00 00 00 00 03 00 00 00 00 00 00 00
0000000000000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
0000000000000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000060 00 00 00 00 00 00 00 00 00 00 00 00 de 0a 00 00
0000000000000070 0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa 00
0000000000000090 b6 0b 00 00 00 00 00 00 80 c7 fe ff 50 0a 00 00
...
...
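Because the CR-space snapshot is printed as an address followed by hex bytes, saved output can be indexed by address offline. The sketch below is one possible way to do that in Python, assuming the text layout shown above (a hex address, then space-separated byte values). The helper names parse_region_dump and read_bytes are hypothetical, and no byte order is assumed: bytes are returned in the order printed.

```python
def parse_region_dump(text: str) -> dict:
    """Map each absolute address to its byte value.

    Lines that do not look like '<hex address> <hex byte> ...'
    (the header line, '...' elisions) are skipped.
    """
    mem = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            base = int(parts[0], 16)
            data = [int(tok, 16) for tok in parts[1:]]
        except ValueError:
            continue  # not an address/data line
        for offset, value in enumerate(data):
            mem[base + offset] = value
    return mem


def read_bytes(mem: dict, addr: int, length: int) -> bytes:
    """Return `length` bytes starting at `addr`; KeyError if not captured."""
    return bytes(mem[addr + i] for i in range(length))


if __name__ == "__main__":
    sample = """devlink_region_name: cr-space snapshot_id: 1
0000000000000060 00 00 00 00 00 00 00 00 00 00 00 00 de 0a 00 00
0000000000000070 0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
..."""
    mem = parse_region_dump(sample)
    print(read_bytes(mem, 0x6C, 4).hex())
```

Raising KeyError on an address outside the captured lines keeps elided ranges from being silently read as zeros.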
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJczizeAAoJEEg/ir3gV/o+NaoIAJCRJoAoGE9fPtLFLiJzo0Iq
JhVoluroyEYpRcp28YfSTcPxhaVvXFgMhHzJM6Vtxzq5HaEgOxrv2ZmH6aNdL0HD
jH4zAfkaSn5OPaZE7lqMgM/W+ECHI8GAob5Yd0oQHcwAaSq6YoTVJuKwMAYFpDu6
JK/E0eTVN6Mvr+DmDZgcJ6VJPFAtPRcnSth0BJg0JF8Ujf/SUI9YURUSkplyvvs4
uboHBX8c0BOCflR7dMz2JVfKlPAUVDeEPadgQf87NCxMKma/o8U1/rwOGioZisex
NZNwpYG/LOOM0UQXBsZh+5oTJOMtNVcCAQDYo6iDf+o/dPydXEijgRcwJIejEvg=
=NbEf
-----END PGP SIGNATURE-----