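Agile Switch S9706: Abnormal CSS Switchover
Problem Description
Troubleshooting Procedure
1. Check the configuration and status of the logical CSS ports and the physical CSS member ports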
<S9706>display css port all
*down: administratively down
(e): ERROR down
VS08 Port status InUit OutUit inErrors outErrors
1/7/0/1 up 0% 0% 0 0
1/7/0/2 up 0% 0% 0 0
1/7/0/3 up 0% 0% 0 0
1/7/0/4 up 0% 0% 0 0
1/7/0/5 up 0% 0% 7 0
1/7/0/6 up 0% 0% 0 0
1/7/0/7 up 0% 0% 0 0
1/7/0/8 up 0% 0% 1 0
1/8/0/1 down 0% 0% 0 0 // 4 ports are down
1/8/0/2 down 0% 0% 0 0
1/8/0/3 down 0% 0% 0 0
1/8/0/4 down 0% 0% 0 0
1/8/0/5 up 0% 0% 2 0
1/8/0/6 up 0% 0% 0 0
1/8/0/7 up 0% 0% 0 0
1/8/0/8 up 0% 0% 0 0
2/7/0/1 up 0% 0% 0 0
2/7/0/2 up 0% 0% 0 0
2/7/0/3 up 0% 0% 0 0
2/7/0/4 up 0% 0% 0 0
2/7/0/5 up 0% 0% 0 0
2/7/0/6 up 0% 0% 0 0
2/7/0/7 up 0% 0% 0 0
2/7/0/8 up 0% 0% 0 0
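When the port table is long, the down ports and non-zero error counters are easy to miss by eye. The following is a minimal Python sketch that scans saved `display css port all` output for both. It assumes the column order shown above (port, status, two utilization percentages, input errors, output errors); the function name `summarize_css_ports` and the sample text are hypothetical and not part of any Huawei tool.

```python
import re

# Matches data rows such as "1/8/0/1 down 0% 0% 0 0".
# Legend and header lines do not match and are skipped.
ROW = re.compile(r"^(\d+/\d+/\d+/\d+)\s+(\S+)\s+(\d+)%\s+(\d+)%\s+(\d+)\s+(\d+)")


def summarize_css_ports(text):
    """Return (ports whose status is not 'up', ports with non-zero error counters)."""
    down, errored = [], []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        port, status = m.group(1), m.group(2).lower()
        in_err, out_err = int(m.group(5)), int(m.group(6))
        if status != "up":
            down.append(port)
        if in_err or out_err:
            errored.append((port, in_err, out_err))
    return down, errored


# A few rows copied from the output above (assumed column layout).
sample = """\
1/7/0/5 up 0% 0% 7 0
1/8/0/1 down 0% 0% 0 0 // 4 ports are down
1/8/0/2 down 0% 0% 0 0
1/8/0/5 up 0% 0% 2 0
"""

down_ports, errored_ports = summarize_css_ports(sample)
print("down:", down_ports)
print("errors:", errored_ports)
```

Trailing annotations on a row, such as the `//` comment, are ignored because the pattern only anchors at the start of the line.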
2. Display the CSS port cabling information for all chassis
<S9706> display css channel
Chassis 1 || Chassis 2
================================================================================
Num [SRUC HG] [VS08 Port(Status)] || [VS08 Port(Status)] [SRUC HG]
1 1/7 HG12 -- 1/7/0/1(UP 10G) ---||--- 2/7/0/1(UP 10G) -- 2/7 HG12
2 1/7 HG16 -- 1/7/0/2(UP 10G) ---||--- 2/7/0/2(UP 10G) -- 2/7 HG16
3 1/7 HG13 -- 1/7/0/3(UP 10G) ---||--- 2/7/0/3(UP 10G) -- 2/7 HG13
4 1/7 HG17 -- 1/7/0/4(UP 10G) ---||--- 2/7/0/4(UP 10G) -- 2/7 HG17
13 1/8 HG14 -- 1/8/0/5(UP 10G!) ---||--- 2/7/0/5(UP 10G) -- 2/7 HG14
14 1/8 HG18 -- 1/8/0/6(UP 10G) ---||--- 2/7/0/6(UP 10G) -- 2/7 HG18
15 1/8 HG15 -- 1/8/0/7(UP 10G) ---||--- 2/7/0/7(UP 10G) -- 2/7 HG15
16 1/8 HG19 -- 1/8/0/8(UP 10G) ---||--- 2/7/0/8(UP 10G) -- 2/7 HG19
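The channel table above can be checked mechanically for two things: which MPU slots actually have a cluster link, and which ports carry the "!" error flag. The sketch below is one possible way to do that in Python, assuming each link row contains exactly two `chassis/slot/subcard/port(status)` entries as in the output above. The names `parse_css_channels`, `cards_with_links` and `flagged_ports` are hypothetical helpers for illustration.

```python
import re

# Captures the full port name, its chassis/slot prefix, and the text in parentheses,
# e.g. "1/8/0/5(UP 10G!)" -> ("1/8/0/5", "1/8", "UP 10G!").
PORT = re.compile(r"((\d+/\d+)/\d+/\d+)\(([^)]*)\)")


def parse_css_channels(text):
    """Parse link rows into dicts; rows without exactly two port entries are skipped."""
    links = []
    for line in text.splitlines():
        ports = PORT.findall(line)
        if len(ports) != 2:
            continue
        ends = [
            {"port": full, "card": card, "errors": "!" in status}
            for full, card, status in ports
        ]
        links.append({"num": int(line.split()[0]), "ends": ends})
    return links


def cards_with_links(links):
    """Chassis/slot pairs that appear at either end of at least one link."""
    return sorted({end["card"] for link in links for end in link["ends"]})


def flagged_ports(links):
    """Ports whose status carries the '!' error flag."""
    return [end["port"] for link in links for end in link["ends"] if end["errors"]]


# Two rows copied from the output above.
sample = """\
4 1/7 HG17 -- 1/7/0/4(UP 10G) ---||--- 2/7/0/4(UP 10G) -- 2/7 HG17
13 1/8 HG14 -- 1/8/0/5(UP 10G!) ---||--- 2/7/0/5(UP 10G) -- 2/7 HG14
"""

links = parse_css_channels(sample)
print("cards with links:", cards_with_links(links))
print("flagged ports:", flagged_ports(links))
```

In this case no row contains a port on 2/8, so that slot never appears among the cards with links.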
===============display device===============
==================================================
Chassis 1 (Master Switch)
S9706's Device status:
Slot Sub Type Online Power Register Alarm Primary
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 - EH1D2G48SFA0 Present PowerOn Registered Normal NA
2 - EH1D2G48SFA0 Present PowerOn Registered Normal NA
3 - EH1D2G48TFA0 Present PowerOn Registered Normal NA
7 - EH1D2SRUC000 Present PowerOn Registered Normal Slave
1 EH1D2VS08000 Present PowerOn Registered Normal NA
8 - EH1D2SRUC000 Present PowerOn Registered Normal Master
1 EH1D2VS08000 Present PowerOn Registered Normal NA
PWR1 - - Present PowerOn Registered Normal NA
PWR2 - - Present PowerOn Registered Normal NA
CMU1 - EH1D200CMU00 Present PowerOn Registered Normal Master
FAN1 - - Present PowerOn Registered Normal NA
FAN2 - - Present PowerOn Registered Normal NA
Chassis 2 (Standby Switch)
S9706's Device status:
Slot Sub Type Online Power Register Alarm Primary
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 - EH1D2G48SFA0 Present PowerOn Registered Normal NA
2 - EH1D2G48SFA0 Present PowerOn Registered Normal NA
3 - EH1D2G48TFA0 Present PowerOn Registered Normal NA
7 - EH1D2SRUC000 Present PowerOn Registered Normal Master
1 EH1D2VS08000 Present PowerOn Registered Normal NA
8 - - Present PowerOn Unregistered - Slave
PWR1 - - Present PowerOn Registered Normal NA
PWR2 - - Present PowerOn Registered Normal NA
CMU1 - EH1D200CMU00 Present PowerOn Registered Normal Master
FAN1 - - Present PowerOn Registered Normal NA
FAN2 - - Present PowerOn Registered Normal NA
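The one abnormal row in the device listing above (slot 8 of chassis 2, `Unregistered`) is easy to overlook among the normal ones. As a rough sketch, saved `display device` output could be filtered like this in Python, assuming each chassis section starts with a `Chassis N` line and the slot number is the first field of a row. The function name `find_unregistered` is hypothetical.

```python
import re

# Section headers look like "Chassis 2 (Standby Switch)".
CHASSIS = re.compile(r"^Chassis\s+(\d+)")


def find_unregistered(text):
    """Return (chassis, slot) pairs for rows whose Register column is 'Unregistered'."""
    chassis = None
    found = []
    for line in text.splitlines():
        m = CHASSIS.match(line.strip())
        if m:
            chassis = int(m.group(1))
            continue
        fields = line.split()
        if "Unregistered" in fields:
            found.append((chassis, fields[0]))
    return found


# Rows copied from the output above.
sample = """\
Chassis 1 (Master Switch)
8 - EH1D2SRUC000 Present PowerOn Registered Normal Master
Chassis 2 (Standby Switch)
7 - EH1D2SRUC000 Present PowerOn Registered Normal Master
8 - - Present PowerOn Unregistered - Slave
"""

unregistered = find_unregistered(sample)
print(unregistered)
```

Matching on the whole field `Unregistered` rather than a substring keeps `Registered` rows from being reported.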
3. Check the reboot information of the MPU in 2/8
[diag] local-telne slave
In the system view:
[huawei]enable-command
[huawei-diag]set output-mode slot 8
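View the startup information.
4. Preliminary conclusion: either the cables are faulty, or the MPU or CSS card in slot 2/8 of the standby switch is faulty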
Root Cause
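The CSS cards on the S9706 MPUs use 16 cluster cables, divided into 4 groups of 4 cables each. The on-site engineer connected only 3 of the groups and left the remaining group unplugged. As a result, the cluster was established successfully, but switchover behaved abnormally. The MPU in 2/8 could not register because it had no available link to the CSS system master MPU in 1/8, so its registration request packets could never be delivered.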
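Solution
1. Connect the remaining 4 cluster cables. The standby MPU of the CSS system then registers successfully and service switchover works normally.
2. If there are not enough cables, still divide the cluster cables into 4 groups (3 cables per group) and build the cluster system. The standby MPU of the CSS system then registers successfully and service switchover works normally.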
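Suggestions and Summary
1. After the CSS is set up, check the cabling and confirm that all cables are connected correctly.
2. After the CSS is set up, verify that the cluster was established successfully: run display device to check the board status (MPUs, CSS cards, and interface cards), and run display css status to check the state of the cluster system.
3. Check that the cluster links are in a normal state (display css channel).
4. After the CSS recovered in this case, the cluster link display still showed "!" ("!" indicates error packets on that port). Running reset counters css port to clear the statistics on the affected ports and then running display css channel again showed that the "!" had disappeared and did not reappear, which completed the troubleshooting.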