
NTP daemon crashes when viewing session information for all pages at the same time

Symptoms
The configuration has 16K NTP devices. When NTP session info is retrieved for all pages at the same time, the NTP daemon crashes.
Other protocols, such as PPP and DHCP, have a similar problem.
 
From the log:
07-01-19 21:13:06,183 [4134771520] INFO content.il.0.bsdnetd <> - panic: Heap is out of memory!
 
BLL log message:
 
19/07/01 14:03:18.171 INFO  11152    - fmwk.bll.core.equip  - Updating smart info section info for chassis 10.216.37.195... 
19/07/01 14:03:19.392 WARN  13436    - fmwk.bll.core.rpc    - Still waiting for msg(Ntp_1.DoSQLFaster) seq(2310) msgr(port //10.216.37.195/4/17) response after 60 sec 
19/07/01 14:03:24.392 WARN  13436    - fmwk.bll.core.rpc    - Still waiting for msg(Ntp_1.DoSQLFaster) seq(2310) msgr(port //10.216.37.195/4/17) response after 65 sec 
19/07/01 14:03:29.393 WARN  13436    - fmwk.bll.core.rpc    - Still waiting for msg(Ntp_1.DoSQLFaster) seq(2310) msgr(port //10.216.37.195/4/17) response after 70 sec 
19/07/01 14:03:33.068 INFO  11152    - fmwk.bll.core.equip  - Updating smart diagnostic info for chassis 10.216.37.195... 
19/07/01 14:03:33.069 INFO  20068    - fmwk.bll.core.equip  - Updating smart attribute section info for chassis 10.216.37.195... 
19/07/01 14:03:33.069 INFO  16548    - fmwk.bll.core.equip  - Updating smart info section info for chassis 10.216.37.195... 
19/07/01 14:03:33.391 INFO  4796     - fmwk.bll.core.equip  - Calling event log notification handler for 10.216.37.195 test module 4 port group 17... 
19/07/01 14:03:33.391 INFO  4796     - fmwk.bll.core.equip  - Received event eventLog_1.EvtLogNotify for 10.216.37.195 test module 4 port group 17 
19/07/01 14:03:33.395 INFO  4796     - fmwk.bll.core.equip  - event log 0: message = Daemon ntpd_39 died, level = ERROR, daemon = sysmgr, date = Mon Jul  1 21:22:04 2019, port = 0 
19/07/01 14:03:33.396 ERROR 4796     - user.equipment       - port //10.216.37.195/4/17 event: Daemon ntpd_39 died 
19/07/01 14:03:33.425 INFO  4796     - fmwk.bll.base.cmd    - Command.101109(EventLogHandler) state: Started in background 
19/07/01 14:03:33.425 INFO  15420    - fmwk.bll.base.cmd    - Command.101109(EventLogHandler) state: Running 
19/07/01 14:03:33.427 WARN  15420    - fmwk.bll.core.equip  - Performing diagnostics on 10.216.37.195 test module 4 port group 17... 
19/07/01 14:03:33.428 INFO  15420    - fmwk.bll.core.equip  - [ManagePortGroupCentralImpl::GetILLogs] Retrieving IL logs from 10.216.37.195 test module 4 port group 17 
19/07/01 14:03:34.375 INFO  14076    - perf.fmwk.bll.scriptable - scriptable_total: 99986 
19/07/01 14:03:34.394 WARN  13436    - fmwk.bll.core.rpc    - Still waiting for msg(Ntp_1.DoSQLFaster) seq(2310) msgr(port //10.216.37.195/4/17) response after 75 sec 
19/07/01 14:03:35.151 INFO  15420    - fmwk.bll.base.cmd    - Command.101110(LostConnectionHandler) state: Started in background 
19/07/01 14:03:35.151 INFO  15420    - fmwk.bll.base.cmd    - Command.101109(EventLogHandler) state: Completed took: 1.727 sec 
19/07/01 14:03:35.152 INFO  18468    - fmwk.bll.base.cmd    - Command.101110(LostConnectionHandler) state: Running 
19/07/01 14:03:35.153 INFO  18468    - fmwk.bll.core.equip  - Lost connection to msgr(10.216.37.195 test module 4 port group 17) generating lost connection fault response for msg(Ntp_1.DoSQLFaster) seq(2310) 
Environment
Spirent TestCenter
Explanation/Resolution
In a large-scale test, getting all the session info at the same time may make STC unstable.
In summary, this is a general performance issue for STC, not a specific functional bug. We suggest that customers use "Get session info per page" when doing large-scale testing.
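
The per-page recommendation can be illustrated with a minimal Python sketch. It is not Spirent TestCenter API code: `fetch_page`, `fake_fetch_page`, the page size of 100, and the record fields are all hypothetical stand-ins chosen for illustration. The only point shown is that iterating page by page keeps one page of session records in memory at a time instead of all 16K.

```python
from typing import Callable, Dict, Iterator, List

Session = Dict[str, object]  # hypothetical shape of one session record


def iter_sessions_by_page(
    fetch_page: Callable[[int], List[Session]],
    page_count: int,
) -> Iterator[Session]:
    """Yield session records one page at a time.

    fetch_page is a hypothetical callable that returns the records for a
    single page (1-based). Only the current page is held in memory.
    """
    for page in range(1, page_count + 1):
        for session in fetch_page(page):
            yield session


def fake_fetch_page(page: int, page_size: int = 100, total: int = 16_000) -> List[Session]:
    """Stand-in for a real per-page query; builds dummy records locally."""
    start = (page - 1) * page_size
    end = min(start + page_size, total)
    return [{"device": i, "state": "UP"} for i in range(start, end)]


def summarize(page_count: int) -> Dict[str, int]:
    """Walk every page and track the total record count and largest page seen."""
    total = 0
    largest_page = 0
    for page in range(1, page_count + 1):
        records = fake_fetch_page(page)
        total += len(records)
        largest_page = max(largest_page, len(records))
    return {"total": total, "largest_page": largest_page}
```

With these assumed values, `summarize(160)` visits all 16,000 dummy records while never holding more than 100 at once, which is the behavior the per-page option is meant to provide.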
Root Cause
Memory is exhausted when all the session information is retrieved at the same time.

Product : Windows GUI