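Introduction
On arm64 devices we have seen system crashes caused by faulty memory hardware rather than by kernel code.
According to colleagues on the boot team, the problem was related to a memory timing parameter that is set when memory is configured during boot.
The Arm CPU manual classifies these exceptions by the Exception Class (EC) field of the reported error code, which is the ESR value; the relevant field descriptions are quoted here.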
EC, bits [31:26]
Exception Class. Indicates the reason for the exception that this register holds information about.
For each EC value, the table references a subsection that gives information about:
• The cause of the exception, for example the configuration required to enable the trap.
• The encoding of the associated ISS.
IFSC (Instruction Abort) or DFSC (Data Abort), bits [5:0]
Instruction or Data Fault Status Code.
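The field layout quoted above can be turned into a few shifts and masks. The following is a minimal sketch, assuming the 32-bit ESR layout from the manual (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]); the names EsrFields and split_esr are made up for this note and are not kernel or library APIs.

```python
from typing import NamedTuple


class EsrFields(NamedTuple):
    """Hypothetical container for the three top-level ESR fields."""
    ec: int   # Exception Class, bits [31:26]
    il: int   # Instruction Length bit, bit 25
    iss: int  # Instruction Specific Syndrome, bits [24:0]


def split_esr(esr: int) -> EsrFields:
    """Split a 32-bit ESR value into EC, IL and ISS."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return EsrFields(ec, il, iss)


fields = split_esr(0x96000210)
# The fault status code is the low six bits of the ISS.
print(f"EC={fields.ec:06b} IL={fields.il} ISS={fields.iss:#x} FSC={fields.iss & 0x3F:06b}")
```

Printing EC and the fault status code in binary makes it easy to compare them against the bit patterns listed in the manual.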
0x96000210
0x96000210: an abort on a data access. It is a synchronous external abort and is not a page-table problem.
[ 6621.614063] Unhandled fault: synchronous external abort (0x96000210) at 0xffffffdfd05b2bf0
[ 6621.713051] Internal error: : 96000210 [#1] SMP
[ 6621.767235] Modules linked in: system(PO) comeon(O)
[ 6621.824562] CPU: 0 PID: 787 Comm: dport_omcd Tainted: P S O 4.4.65-bex01a #1
[ 6621.920419] Hardware name: E2000Q DEMO DDR4 (DT)
[ 6621.975642] task: ffffffe0e1ae4880 ti: ffffffe0076e0000 task.ti: ffffffe0076e0000
[ 6622.065251] PC is at rb_next+0x0/0x60
[ 6622.109017] LR is at set_next_entity+0x640/0x7b0
[ 6622.164243] pc : [<ffffff804035ef60>] lr : [<ffffff80400e6ea0>] pstate: 600001c5
[ 6622.252805] sp : ffffffe0076ffa10
EC, bits [31:26]
EC == 100101
Data Abort taken without a change in Exception level.
Used for MMU faults generated by data accesses, alignment faults other than those
caused by the Stack Pointer misalignment, and Synchronous external aborts, including synchronous parity or ECC errors.
Not used for debug related exceptions. This value is valid for all described registers.
DFSC, bits [5:0]
010000
Synchronous external abort, not on translation table walk
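As a worked example of the decoding above, the ISS of a Data Abort can be broken down further. This is only a sketch under stated assumptions: it covers just the external-abort status codes that appear in this note, and it reads the WnR (bit 6) and EA (bit 9) bits as defined in the manual's Data Abort ISS encoding. The table SEA_CODES and the function decode_data_abort_iss are hypothetical names for illustration.

```python
# Only the synchronous external abort codes relevant to this note;
# the manual defines many more fault status codes.
SEA_CODES = {
    0b010000: "synchronous external abort, not on translation table walk",
    0b010100: "synchronous external abort on translation table walk, level 0",
    0b010101: "synchronous external abort on translation table walk, level 1",
    0b010110: "synchronous external abort on translation table walk, level 2",
    0b010111: "synchronous external abort on translation table walk, level 3",
}


def decode_data_abort_iss(iss: int) -> dict:
    """Decode a few Data Abort ISS fields (hypothetical helper)."""
    dfsc = iss & 0x3F
    return {
        "dfsc": dfsc,
        "meaning": SEA_CODES.get(dfsc, "other (see the manual)"),
        "write": bool((iss >> 6) & 1),  # WnR: True for a write, False for a read
        "ea": (iss >> 9) & 1,           # EA: implementation defined external abort type
    }


info = decode_data_abort_iss(0x96000210 & 0x1FFFFFF)
print(f"DFSC={info['dfsc']:06b} write={info['write']} EA={info['ea']}: {info['meaning']}")
```

For 0x96000210 the WnR bit is clear, which is consistent with the fault being raised on a read.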
0x86000210
0x86000210: an abort on an instruction fetch. It is a synchronous external abort and is not a page-table problem.
[17:04:40][ 755.236399] Bad mode in Synchronous Abort handler detected, code 0x86000210 -- IABT (current EL)
[17:04:40][ 755.236418] Bad mode in Synchronous Abort handler detected, code 0x86000210 -- IABT (current EL)
[17:04:40][ 755.236436] Bad mode in Synchronous Abort handler detected, code 0x86000210 -- IABT (current EL)
[17:04:40][ 755.236454] par_el1 = 0
[17:04:40][ 755.236464] Bad mode in Synchronous Abort handler detected, code 0x86000210 -- IABT (current EL)
[17:04:40][ 755.236481] Internal error: Oops - bad mode: 0 [#1] SMP
[17:04:40][ 755.236490] par_el1 = 0
[17:04:40][ 755.236497] Modules linked in:
[17:04:40][ 755.236497] par_el1 = 0
[17:04:40][ 755.236509] system(PO) comeon(O)
[17:04:40][ 755.236519] CPU: 0 PID: 2277 Comm: routed Tainted: P S O 4.4.65-bex01a #1
[17:04:40][ 755.236533] Hardware name: E2000Q DEMO DDR4 (DT)
[17:04:40][ 755.236542] task: ffffffdfdc58d700 ti: ffffffdfdc720000 task.ti: ffffffdfdc720000
[17:04:40][ 755.236561] PC is at vectors+0x200/0x790
[17:04:40][ 755.236570] LR is at el0_da+0x18/0x1c
EC, bits [31:26]
EC == 100001
Instruction Abort taken without a change in Exception level.
Used for MMU faults generated by instruction accesses and Synchronous external
aborts, including synchronous parity or ECC errors. Not used for debug related exceptions.
IFSC, bits [5:0]
010000
Synchronous external abort, not on translation table walk
0x96000217
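0x96000217: an abort on a data access. It is a synchronous external abort raised during a translation table walk, so it is page-table related.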
Unhandled fault: synchronous abort (translation table walk) (0x96000217) at 0xffffff807ee701e8
EC, bits [31:26]
EC == 100101
DFSC, bits [5:0]
010111
Synchronous External abort on translation table walk or hardware update of translation table, level 3.
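When triaging many logs it can help to pull the codes out automatically. The sketch below assumes the codes appear as 0x followed by exactly eight hex digits, as in the three log excerpts in this note, and keeps only values whose EC is one of the four instruction or data abort classes. The names ESR_RE, ABORT_ECS and find_abort_codes are invented for this example.

```python
import re

# Exactly eight hex digits, so 64-bit fault addresses are not matched.
ESR_RE = re.compile(r"\b0x([0-9a-fA-F]{8})\b")

# Abort-related Exception Class values from the manual.
ABORT_ECS = {
    0x20: "instruction abort from a lower EL",
    0x21: "instruction abort, same EL",
    0x24: "data abort from a lower EL",
    0x25: "data abort, same EL",
}


def find_abort_codes(log_text: str) -> list:
    """Return (esr, abort kind, fault status code) for each abort code found."""
    results = []
    for match in ESR_RE.finditer(log_text):
        esr = int(match.group(1), 16)
        ec = (esr >> 26) & 0x3F
        if ec in ABORT_ECS:
            results.append((esr, ABORT_ECS[ec], esr & 0x3F))
    return results


sample = """\
[ 6621.614063] Unhandled fault: synchronous external abort (0x96000210) at 0xffffffdfd05b2bf0
[ 755.236399] Bad mode in Synchronous Abort handler detected, code 0x86000210 -- IABT (current EL)
Unhandled fault: synchronous abort (translation table walk) (0x96000217) at 0xffffff807ee701e8
"""

for esr, kind, fsc in find_abort_codes(sample):
    print(f"{esr:#010x}: {kind}, fault status code {fsc:06b}")
```

A fault status code of 010000 or 0101xx on a board with suspect memory points toward the bus or DRAM rather than toward kernel page-table handling, which matches the cases described in this note.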
References
Arm Architecture Reference Manual for A-profile architecture