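Problem Description
At one site, the optical interface board in slot 1 of an S9706 core switch kept flapping, causing its interfaces to go up and down continuously.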
Alarm Information
Oct 22 2016 11:37:40+08:00 CC-DC1-CO-S1 %%01IFNET/4/IF_ENABLE(l)[28]:Interface GigabitEthernet1/0/1 has been available.
Oct 22 2016 11:37:40+08:00 CC-DC1-CO-S1 %%01IFNET/4/IF_ENABLE(l)[29]:Interface GigabitEthernet1/0/0 has been available.
Oct 22 2016 11:37:40+08:00 CC-DC1-CO-S1 %%01IFNET/4/BOARD_ENABLE(l)[30]:Board 1 has been available.
Oct 22 2016 11:37:38+08:00 CC-DC1-CO-S1 %%01ALML/4/PUBLISH_EVENT(l)[31]:Publish event. (Slot=1, Event ID=BOARD_SERVICE_REGISTER).
Oct 22 2016 11:37:37+08:00 CC-DC1-CO-S1 %%01ALML/4/ENTUP(l)[32]:LPU chassis[1] board[1] registers successfully.
Oct 22 2016 11:37:37+08:00 CC-DC1-CO-S1 %%01ALML/4/PUBLISH_EVENT(l)[33]:Publish event. (Slot=1, Event ID=BOARD_REGISTER).
Oct 22 2016 11:36:02+08:00 CC-DC1-CO-S1 %%01IFNET/4/BOARD_DISABLE(l)[61]:Board 1 has been unavailable.
Oct 22 2016 11:36:02+08:00 CC-DC1-CO-S1 %%01IFPDT/4/IF_STATE(l)[62]:Interface GigabitEthernet1/0/23 has turned into DOWN state.
Oct 22 2016 11:36:02+08:00 CC-DC1-CO-S1 %%01INFO/3/SUPPRESS_LOG(l)[63]:Last message repeated 1 times.(InfoID=4278259733, ModuleName=ALML, InfoAlias=CPU_RESET)
Oct 22 2016 11:36:02+08:00 CC-DC1-CO-S1 %%01ALML/3/CPU_RESET(l)[64]:The canbus node of LPU board[1] detects that CPU was reset.
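The log lines above can be checked programmatically to measure how long the board stays unavailable in each flap. The following Python is a minimal sketch only. The regular expression is inferred from the sample lines in this case rather than from any formal log specification, and the names parse_line and board_outages are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Header layout inferred from the sample log lines in this case (an assumption,
# not a formal specification): "<timestamp><tz> <host> %%<ver><module>/<sev>/<alias>(l)[n]:<msg>"
LOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{4} \d{2}:\d{2}:\d{2})[+-]\d{2}:\d{2} "
    r"(?P<host>\S+) %%\d{2}(?P<module>\w+)/(?P<sev>\d)/(?P<alias>\w+)"
    r"\(\w\)\[\d+\]:(?P<msg>.*)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    # The timezone offset is ignored because all lines share the same offset.
    ts = datetime.strptime(m.group("ts"), "%b %d %Y %H:%M:%S")
    return {
        "time": ts,
        "module": m.group("module"),
        "alias": m.group("alias"),
        "msg": m.group("msg"),
    }


def board_outages(lines):
    """Pair each BOARD_DISABLE with the next BOARD_ENABLE.

    Returns a list of (down_time, up_time, seconds_unavailable) tuples.
    """
    events = [e for e in map(parse_line, lines) if e is not None]
    events.sort(key=lambda e: e["time"])  # the device prints newest first
    outages = []
    down_at = None
    for e in events:
        if e["alias"] == "BOARD_DISABLE":
            down_at = e["time"]
        elif e["alias"] == "BOARD_ENABLE" and down_at is not None:
            seconds = (e["time"] - down_at).total_seconds()
            outages.append((down_at, e["time"], seconds))
            down_at = None
    return outages
```

Feeding the function the full log buffer of a flapping board gives one tuple per reset cycle, which makes it easy to see whether the resets are periodic.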
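Handling Process
1. Check the reset reason of the device: display reset-reason
2. Check the device health information: display health
3. Check the device alarm information: display alarm
4. Check the boot record of the board in slot 1: set output-mode slot 1 (entered as set output-mode board 1 in the diagnose view in the captured session later in this case)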
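Root Cause
The boot record of the board in slot 1 (redirected to the main control board with the set output-mode command captured below) shows that the memory step, uncompressing data from ROM to RAM, never reports Done. The board then restarts and the boot sequence begins again. The board memory is therefore judged to be faulty.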
[CC-DC1-CO-SW1-diagnose]set output-mode board 1
******************************************************
* Slot 1 output to mainboard *
******************************************************
Press Ctrl+D to quit
est Start..........................................................OK
BIOS Creation Date ...................................... Feb 1 2013, 13:58:20
BoardType is...........................................................0000004A
Bootbus init.................................................................OK
DDR DRAM init................................................................OK
Start Memory Test ? ('t' or 'T' is test):skip
Copying Uncompressed Data from Rom to Ram .................................Done
Uncompressing Data from Rom to RAM ............................. <-- No "Done" appears here: the memory step did not complete and the board rebooted, so the board memory is judged to be faulty.
Input Ctrl + y to Select Debug Console:
Bootrom Version ......................................................... Ver B
L2 Cache Test Start..........................................................OK
BIOS Creation Date ...................................... Feb 1 2013, 13:58:20
BoardType is...........................................................0000004A
Bootbus init.................................................................OK
DDR DRAM init................................................................OK
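The check performed by eye on the capture above, finding a boot step whose dotted progress line never receives a status, can be sketched in Python. This is an illustration under stated assumptions: the line format "<step> ....<status>" is inferred from this capture only, the input is assumed to be the raw console text without any added annotations, and unfinished_steps is a hypothetical helper name.

```python
import re

# A boot step line looks like "<step name> ....<status>". The status is empty
# when the board resets before the step finishes. Format inferred from the
# capture in this case only (an assumption).
STEP_RE = re.compile(r"^(?P<step>.*?[^.\s])\s*\.{3,}\s*(?P<status>.*)$")


def unfinished_steps(console_text):
    """Return the names of boot steps whose progress dots are followed by no status."""
    result = []
    for line in console_text.splitlines():
        m = STEP_RE.match(line.strip())
        if m and m.group("status") == "":
            result.append(m.group("step"))
    return result
```

Lines without a run of dots, such as the memory test prompt, are ignored. A step that appears in the result on every boot cycle points at the stage where the board consistently fails.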
Solution
This is a hardware fault and can only be resolved by replacing the board.