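Problem Description
The customer's topology is as follows:
As shown in the figure, everything above WN_AR connects to the headquarters backbone and everything below it is the branch intranet. EBGP runs between WN_DS and WN_AR. VRRP runs between the two WN_DS devices and between the two CO_CS devices, with the left device as master and the right as backup. OSPF runs between WN_DS and CO_CS and between CO_CS and AS, and the route to ServerB is advertised to headquarters via EBGP. An SPU board installed in WN_DS provides NAT; a traffic policy redirects flows that need NAT to the SPU board. The SPU board translates branch IP addresses for which headquarters has no route into routable addresses, which are then advertised to headquarters via EBGP.
The branch needs to bring up a new service that requires ServerB at the branch and ServerA at headquarters to communicate in both directions. The symptom is that ServerB can reach ServerA, but connections initiated from ServerA to ServerB fail, so the service does not work.
Key configuration: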
WN_DS:
#
interface Vlanif891
 description link-to-FW_outside
 ip address 10.246.130.50 255.255.255.248
 vrrp vrid 4 virtual-ip 10.246.130.49
 vrrp vrid 4 priority 120
#
interface Eth-Trunk100
 description TO_HLJ1_SPU_01_ETK0
 port link-type trunk
 undo port trunk allow-pass vlan 1
 port trunk allow-pass vlan 890 to 891
#
acl name nat_class 3999
 rule 105 permit ip destination 10.229.197.232 0
#
traffic classifier nat_class
 if-match acl nat_class
#
traffic behavior nat_redirect
 redirect ip-nexthop 10.246.130.46
#
traffic policy nat_class
 classifier nat_class
#
interface GigabitEthernet2/0/21
 port default vlan 10
 traffic-policy nat_class in
#
interface Eth-Trunk7
 description TO_HLJ1_WN_AR_01
 port link-type access
 port default vlan 881
#
interface Vlanif881
 description TO_HLJ_WN_AR_01
 ip address 30.15.135.130 255.255.255.252
#
SPU board configuration:
interface Eth-Trunk0.891
 description TO_HLJ1_WN_DS_01_Outside
 control-vid 891 dot1q-termination
 dot1q termination vid 891
 dot1q vrrp vid 891
 ip address 10.246.130.52 255.255.255.248
 vrrp vrid 3 virtual-ip 10.246.130.54
 vrrp vrid 3 priority 110
 nat static global 172.26.126.149 inside 10.75.24.12 netmask 255.255.255.255
 nat static global 10.75.2.2 inside 10.75.21.2 netmask 255.255.255.255
#
ip route-static 0.0.0.0 0.0.0.0 10.246.130.49
#
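Troubleshooting Procedure
1. Because the customer's firewall places no restrictions on specific services, ICMP ping and tracert were used for testing. ServerB at the branch can ping ServerA at headquarters, so a routing problem was ruled out. A tracert from ServerA to ServerB reaches the branch WN_DS and goes no further, so the packets were initially judged to be lost on WN_DS. The tracert output is as follows:
2. A check of the SPU board configuration shows that no NAT translation is configured for ServerB's IP address. Traffic statistics on WN_DS show that the ICMP Echo Reply packets from ServerB to ServerA were sent to the SPU board, but WN_DS never received those replies back from the SPU board. The packets are therefore being lost on the SPU board.
Traffic statistics on WN_DS: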
disp traffic policy statistics interface Eth-trunk7 inbound verbose rule-base
Interface: Eth-trunk7
Traffic policy inbound: icmp_cupture
Rule number: 2
Current status: OK!
---------------------------------------------------------------------
Classifier: nat_class operator and
Behavior: nat_class
Board : 0
rule 10 permit ip source 10.229.197.232 0 destination 10.77.2.38 0
Passed Packet 10,Passed Bytes 560
Dropped Packet 0,Dropped Bytes 0
disp traffic policy statistics interface GigabitEthernet 2/0/21 outbound verbose rule-base
Interface: GigabitEthernet2/0/21
Traffic policy outbound: icmp_cupture
Rule number: 2
Current status: OK!
---------------------------------------------------------------------
Classifier: nat_class operator and
Behavior: nat_class
Board : 0
rule 10 permit ip source 10.229.197.232 0 destination 10.77.2.38 0
Passed Packet 10,Passed Bytes 560
Dropped Packet 0,Dropped Bytes 0
disp traffic policy statistics interface Eth-trunk100 outbound verbose rule-base
Interface: Eth-trunk100
Traffic policy outbound: icmp_cupture
Rule number: 2
Current status: OK!
---------------------------------------------------------------------
Classifier: prd_nat_class operator and
Behavior: prd_nat_class
Board : 0
rule 5 permit ip source 10.77.2.38 0 destination 10.229.197.232 0
Passed Packet 10,Passed Bytes 560
Dropped Packet 0,Dropped Bytes 0
disp traffic policy statistics interface Eth-trunk7 outbound verbose rule-base
Interface: Eth-trunk7
Traffic policy outbound: icmp_cupture
Rule number: 2
Current status: OK!
---------------------------------------------------------------------
Classifier: nat_class operator and
Behavior: nat_class
Board : 0
rule 10 permit ip source 10.77.2.38 0 destination 10.229.197.232 0
Passed Packet 0,Passed Bytes 0
Dropped Packet 0,Dropped Bytes 0
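The counters above come from a policy named icmp_cupture whose configuration is not shown in this case. A minimal sketch of how a counting policy of this kind could be built on WN_DS, assuming the S-series traffic-policy syntax already used in the key configuration: the ACL number 3900 and the name icmp_count are hypothetical, the rule numbers and addresses are taken from the statistics output, and `statistic enable` is what makes the per-rule counters available.

```
# Annotation for the reader, not for pasting: 3900 and icmp_count are hypothetical names
acl number 3900
 rule 5 permit ip source 10.77.2.38 0 destination 10.229.197.232 0
 rule 10 permit ip source 10.229.197.232 0 destination 10.77.2.38 0
#
traffic classifier icmp_count operator and
 if-match acl 3900
#
traffic behavior icmp_count
 statistic enable
#
traffic policy icmp_count
 classifier icmp_count behavior icmp_count
#
interface Eth-Trunk7
 traffic-policy icmp_count inbound
 traffic-policy icmp_count outbound
#
```

Applying the same policy on each interface along the expected path and comparing the Passed Packet counters hop by hop shows where a flow disappears. An interface normally accepts only one traffic policy per direction, so on a port that already carries a policy, such as nat_class inbound on GigabitEthernet2/0/21, the counting rules would probably have to be merged into the existing policy instead.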
3. The routing table on WN_DS shows that the route to ServerB's IP address, 10.77.2.38, does not pass through the SPU board:
[WN_AR01]disp ip routing-table 10.77.2.38
Route Flags: R - relay, D - download to fib
------------------------------------------------------------------------------
Routing Table : Public
Summary Count : 1
Destination/Mask    Proto   Pre  Cost      Flags NextHop         Interface
    10.77.2.0/24    ospf    10   2           D   192.168.2.3     Vlanif20
4. Check the working mode of the SPU board; it is Firewall:
display service-type
Current board service type is :Firewall.
Next startup board service type is :Firewall
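5. Conclusion: the forward and return paths are asymmetric. The Echo Request from ServerA reaches ServerB without passing through the SPU board, so the SPU board holds no session for the flow. When the ICMP Echo Reply from ServerB is redirected to the SPU board, the board treats it as a non-first packet with no matching session, does not create a session, and drops it. The service therefore fails.
Root Cause
Traffic between ServerA and ServerB takes different paths in the two directions. When the ICMP Echo Reply from ServerB to ServerA is redirected to the SPU board, the board, running in Firewall mode, finds that it is not the first packet of a flow, does not create a session, and drops the packet, so the service fails.
Solution
1. Disable link-state checking (stateful session inspection) on the SPU board.
2. On Eth-trunk7, the interface that connects WN_DS to the AR, redirect packets initiated by ServerA toward ServerB to the SPU board so that the forward and return paths are the same.
After discussion with the customer, option 2 was adopted and the fault was resolved.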
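The case does not record the configuration that was actually deployed for option 2. The following is only a sketch of what it might look like on WN_DS, modeled on the existing nat_class policy. The ACL number 3998 and the names to_serverb and to_spu_outside are hypothetical. The next hop 10.246.130.54 is the VRRP virtual IP of the SPU board's outside sub-interface (Eth-Trunk0.891) from the configuration above; that it is the correct redirect target is an assumption.

```
# Annotation for the reader, not for pasting: 3998, to_serverb and to_spu_outside are hypothetical names
acl name to_serverb 3998
 rule 5 permit ip source 10.229.197.232 0 destination 10.77.2.38 0
#
traffic classifier to_serverb
 if-match acl to_serverb
#
traffic behavior to_spu_outside
 redirect ip-nexthop 10.246.130.54
#
traffic policy to_serverb
 classifier to_serverb behavior to_spu_outside
#
interface Eth-Trunk7
 traffic-policy to_serverb inbound
#
```

With this in place, the request from ServerA (10.229.197.232) enters the SPU board first and creates a session, so the reply from ServerB (10.77.2.38), which the existing nat_class policy already redirects to the board, matches that session instead of being dropped. Any diagnostic policy still bound to Eth-Trunk7 inbound, such as icmp_cupture, would likely have to be removed first, because an interface normally accepts one traffic policy per direction. For comparison, option 1 on Huawei firewall software is typically a single system-view command, `undo firewall session link-state check`; the source does not confirm whether the SPU software version in this case accepts it, and it removes stateful inspection for all traffic through the board.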