
Spirent TestCenter: Test module ports go down and stay in an unavailable state - RSVP Crash

Symptoms

A customer is seeing this issue at random on both 400G line cards (Slot 1 and Slot 6 of the same chassis). When running automation scripts through the lab server, the ports went down and stayed in an unavailable state.

They usually had to reboot the line card (remotely) to bring the ports back up.
 
Environment
 
  • Spirent TestCenter
  • Port down
  • Port unavailable
  • PX3-400GQ-T2
  • 5.30

The generated logs include crash logs within the firmware logs.
 



Steps to reproduce:

 
  1. Create a new session on the lab server.
  2. Load the tcc file.
  3. Reserve ports and start protocols.
  4. Start traffic, wait for 30 seconds, stop traffic, and get results. Make sure there is no drop (this is the baseline before performing any triggers).
  5. Start traffic.
  6. Perform a trigger on the DUT, such as reloading the DUT or reloading its configuration.
  7. Stop traffic and check the results for drops. Repeat step 4 until no drop is seen, to confirm that the DUT has recovered after the trigger. If the traffic recovers before 5 tries, the test is marked as passed; otherwise it is marked as failed.
  8. Once all the test cases have been performed, the customer deletes the session.

The customer sees the issue intermittently, mostly during step 7. Once they hit the issue, the script automatically deletes the session and restarts.
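
The recovery check in step 7 can be sketched as a bounded retry loop. This is only an illustration in Python and not the customer's script: `run_traffic_check` is a hypothetical callable, assumed to start traffic, wait 30 seconds, stop traffic, and return the number of dropped frames (step 4). It is not a Spirent TestCenter API call. The sketch also assumes that "before 5 tries" means at most 5 attempts.

```python
from typing import Callable


def wait_for_recovery(run_traffic_check: Callable[[], int], max_tries: int = 5) -> bool:
    """Return True if a traffic check reports zero drops within max_tries attempts.

    run_traffic_check is a hypothetical stand-in for step 4: it should run
    traffic for a fixed interval and return the dropped-frame count.
    """
    for _attempt in range(max_tries):
        dropped = run_traffic_check()
        if dropped == 0:
            # No drop seen: the DUT is considered recovered, test passes.
            return True
    # Still dropping after max_tries attempts: test fails.
    return False
```

A script built this way would call `wait_for_recovery` once after each trigger and mark the test case as passed or failed from the returned value.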

 
Explanation/Resolution

The issue is fixed in the 5.31 release (mid-March 2022 release).
  • CR-01536036
  • CIPCD-17697
 
Root Cause


Defect
The issue occurred because, when raising events, the code did not make sure that all of the data structures were valid. In some rare cases, because there were no safety checks in the code, this led to a crash. The reason daemon_ngcXX died is an access violation and segmentation fault.
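
The kind of safety check described above can be illustrated with a small sketch. The firmware code is not shown in this article, so everything here is a hypothetical analogy in Python: the `Session` and `Event` types, their fields, and `raise_event` are invented for illustration. In native code, touching an invalid structure can end in a segmentation fault; in Python the equivalent mistake would raise an exception, but the guard pattern is the same: validate every structure before raising the event.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:  # hypothetical structure, not taken from the firmware
    peer: Optional[str]


@dataclass
class Event:  # hypothetical structure, not taken from the firmware
    name: str
    session: Optional[Session]


def raise_event(event: Optional[Event], sink: List[str]) -> bool:
    """Record the event in sink only if every structure it refers to is valid."""
    if event is None or event.session is None or event.session.peer is None:
        # Skip the event instead of touching an invalid structure.
        return False
    sink.append(f"{event.name}:{event.session.peer}")
    return True
```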
 

Product: PX3, Spirent TestCenter