Comment by OVH - Monday, 16 November 2015, 16:09PM
At 3:45 GMT+1, linecards crashed simultaneously on the 3 VACs: RBX, SBG and BHS.
2015 Nov 15 05:04:11 admin %DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: Module:4 Test:RewriteEngine Loopback failed 10 consecutive times. Faulty module:Module 1 Error:Loopback test failed. Packets lost on the SUP in the transmit direction
2015 Nov 15 05:04:11 admin %VSHD-5-VSHD_SYSLOG_CONFIG_I: Configured from vty by admin on vsh.31048
2015 Nov 15 05:06:31 admin %DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: Module:3 Test:RewriteEngine Loopback failed 10 consecutive times. Faulty module:Module 1 Error:Loopback test failed. Packets lost on the SUP in the transmit direction
2015 Nov 15 05:06:31 admin %VSHD-5-VSHD_SYSLOG_CONFIG_I: Configured from vty by admin on vsh.32607
2015 Nov 15 05:07:01 admin %DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: Module:4 Test:RewriteEngine Loopback failed 10 consecutive times. Faulty module:Module 1 Error:Loopback test failed. Packets lost on the SUP in the transmit direction
2015 Nov 15 05:07:02 admin %VSHD-5-VSHD_SYSLOG_CONFIG_I: Configured from vty by admin on vsh.468
2015 Nov 15 05:20:43 admin %AUTHPRIV-3-SYSTEM_MSG: pam_aaa:Authentication failed from console - login
2015 Nov 15 05:06:09 admin-vac2 %$ VDC-1 %$ %DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: Module:4 Test:RewriteEngine Loopback failed 10 consecutive times. Faulty module:Module 1 Error:Loopback test failed. Packets lost on the SUP in the transmit direction
2015 Nov 15 05:06:12 admin-vac2 %$ VDC-1 %$ %DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: Module:3 Test:RewriteEngine Loopback failed 10 consecutive times. Faulty module:Module 1 Error:Loopback test failed. Packets lost on the SUP in the transmit direction
We have reloaded the linecards.
At 4:30 GMT+1, VAC1 was operational.
At 5:00 GMT+1, the service was fully restored.
We will work with the equipment supplier to find the cause of this crash and how to prevent it from happening again.
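For readers who want to tally the loopback failures quoted above by chassis and reporting module, here is a minimal Python sketch. The regular expression is an assumption derived only from the sample lines in this comment, not from a Cisco specification, and `count_loopback_failures` is a hypothetical helper name. The sample messages are abridged after "10 consecutive times."

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this comment (an assumption,
# not an official NX-OS syslog grammar). It captures the timestamp, the
# reporting host, and the module number that reported the failed test.
LOOPBACK_FAIL = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r".*%DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: "
    r"Module:(?P<module>\d+) "
)

MSG = "%DIAG_PORT_LB-2-REWRITE_ENGINE_LOOPBACK_TEST_FAIL: "
TAIL = " Test:RewriteEngine Loopback failed 10 consecutive times."

# Lines taken from the comment above, abridged after "10 consecutive times."
SAMPLE = [
    "2015 Nov 15 05:04:11 admin " + MSG + "Module:4" + TAIL,
    "2015 Nov 15 05:04:11 admin %VSHD-5-VSHD_SYSLOG_CONFIG_I: "
    "Configured from vty by admin on vsh.31048",
    "2015 Nov 15 05:06:31 admin " + MSG + "Module:3" + TAIL,
    "2015 Nov 15 05:07:01 admin " + MSG + "Module:4" + TAIL,
    "2015 Nov 15 05:06:09 admin-vac2 %$ VDC-1 %$ " + MSG + "Module:4" + TAIL,
    "2015 Nov 15 05:06:12 admin-vac2 %$ VDC-1 %$ " + MSG + "Module:3" + TAIL,
]


def count_loopback_failures(lines):
    """Count loopback-failure messages per (host, reporting module)."""
    counts = Counter()
    for line in lines:
        match = LOOPBACK_FAIL.match(line)
        if match:  # non-matching lines (e.g. VSHD config messages) are skipped
            counts[(match.group("host"), int(match.group("module")))] += 1
    return counts


if __name__ == "__main__":
    for (host, module), n in sorted(count_loopback_failures(SAMPLE).items()):
        print(f"{host} module {module}: {n} failure message(s)")
```

Such a tally only summarizes what the devices logged; it says nothing about the underlying cause of the crash.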
Comment by OVH - Monday, 16 November 2015, 16:11PM
We are in the process of troubleshooting with Cisco.
The loopback test failures in the logs are not the root cause of the linecard fault, but a consequence of it.
We are continuing the investigation to determine whether the cause is hardware or software.
Comment by OVH - Tuesday, 17 November 2015, 04:46AM
The malfunction has recurred; our teams are on site investigating the situation.
Comment by OVH - Thursday, 19 November 2015, 14:01PM
We've reloaded the cards on vac1 and vac3; they worked for 20 minutes and then failed again. We're currently working with Cisco TAC to troubleshoot.
Comment by OVH - Thursday, 19 November 2015, 14:03PM
Vac1 and vac3 are online; we are keeping vac2 offline for troubleshooting with Cisco.
At the moment, it looks like a hardware bug on the M2 cards.
We doubt that a hardware issue would make all 3 chassis fail at the same time.
Nevertheless, we are launching the RMA for vac1 while we continue troubleshooting.
Comment by OVH - Thursday, 19 November 2015, 14:03PM
Vac2 is UP.
Comment by OVH - Thursday, 19 November 2015, 14:04PM
We're isolating vac2 to update it.
Comment by OVH - Thursday, 19 November 2015, 14:04PM
Vac2 is now up to date.
Comment by OVH - Thursday, 19 November 2015, 14:04PM
We're isolating VAC1 to replace the M2 linecards.
Protection will be handled by vac2 and vac3 during the maintenance.
Comment by OVH - Thursday, 19 November 2015, 14:05PM
Vac1 is up again; the 2 linecards have been replaced.
Comment by OVH - Thursday, 19 November 2015, 14:05PM
The 3 Vacs have crashed again.
Comment by OVH - Thursday, 19 November 2015, 14:05PM
Vac1 and vac3 are up.
Comment by OVH - Thursday, 19 November 2015, 14:06PM
We've recovered the traces on Vac2 for Cisco.
Vac2 is up again.