This article applies to BoKS Manager 7.1.

 

Description

This hotfix for BoKS 7.1 includes the fixes delivered for BoKS 7.0 in hotfixes HFBM-0220-1 and HFBM-0233-1. It addresses several issues with the operation of the BoKS servc receive bridge and one issue with the clntd send bridge.

1) The receive bridges locked a semaphore each time they needed a nodekey for a remote host. On the Master and Replicas, this caused a large number of semaphore lock/unlock operations in the servc receive bridges. With very many Server Agents connected, Server Agents could in extreme cases time out and reconnect, which increased the number of receive bridges even further.

2) There was no way to limit the number of simultaneous connections from Server Agents.

3) Access could fail intermittently when a Replica was overloaded.

4) The clntd send bridge on the BoKS Master slowed down when the batch queue held many messages. The slowdown showed up as slower updates of Server Agents and, in extreme cases, as a buildup of the $BOKS_data/master_spool/dd_fcom_* queue files, because the bridge was slow to accept messages from the drainmast pswupdate process.


Resolution / Workaround

 

Apply hotfix HFBM-0238, available for download from the HelpSystems Community Portal, to resolve these issues.

 

1) The receive bridge copies the nodekey to local memory when it first loads it, so it only needs to lock the semaphore once.

2) The number of simultaneous connections from Server Agents to servc can now be limited using the ENV variable BRIDGE_SERVC_R_MAX_CONN. The default value is 9950.

3) When a Replica was overloaded, the bridge sent back an error that the bridge on the Server Agent did not handle properly; the error was instead propagated back to the application making the call. With the hotfix applied, the bridge sends back a different error, which makes the bridge on the Server Agent close the connection and try to find another server to send the request to.

4) The time needed to remove a message from an internal queue, once the Server Agent had replied to that message, depended on the length of the queue. With the hotfix applied, the queue uses a data structure that allows fast removal regardless of queue length.
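
As an illustration of the idea in item 4 only: the sketch below keeps unacknowledged messages in a keyed, insertion-ordered structure, so that removing a message when its reply arrives does not require scanning the queue. The data structure BoKS actually uses is not described in this article; the PendingQueue class, its method names, and the message ids are hypothetical and written in Python purely for demonstration.

```python
from collections import OrderedDict


class PendingQueue:
    """Hypothetical queue of messages awaiting a reply, keyed by message id."""

    def __init__(self):
        # msg_id -> payload, kept in the order the messages were sent
        self._pending = OrderedDict()

    def send(self, msg_id, payload):
        self._pending[msg_id] = payload

    def acknowledge(self, msg_id):
        # Keyed lookup and unlink: the cost does not grow with queue length.
        # Returns the payload, or None if the id is not pending.
        return self._pending.pop(msg_id, None)

    def oldest(self):
        # Id of the oldest unacknowledged message, or None if the queue is empty.
        return next(iter(self._pending), None)

    def __len__(self):
        return len(self._pending)


queue = PendingQueue()
for i in range(100_000):
    queue.send(i, f"update-{i}")

# Replies may arrive in any order; each removal is a single keyed operation.
assert queue.acknowledge(99_999) == "update-99999"
assert queue.acknowledge(0) == "update-0"
assert queue.oldest() == 1
```

With a plain list, the same acknowledge step would have to search for the message first, so its cost would grow with the number of queued messages, which matches the slowdown described above.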

 

This hotfix also speeds up the boks_udsqd process, which acts as a queue in front of servc, on some platforms. The process normally uses poll(), but with the hotfix it uses epoll() on Linux and event ports on Solaris.
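
For readers unfamiliar with the difference between these mechanisms, the following sketch shows the general pattern of choosing the most efficient readiness-notification API available on a platform. It is not boks_udsqd code. It uses Python's standard selectors module, whose DefaultSelector resolves to epoll on Linux and falls back to poll() or select() elsewhere; Python does not expose Solaris event ports and uses /dev/poll there instead. The socket pair and the "client-1" label are stand-ins chosen for the example.

```python
import selectors
import socket

# DefaultSelector picks the most efficient mechanism Python supports on the
# current platform (for example epoll on Linux), falling back to poll() or
# select() where nothing better is available.
selector = selectors.DefaultSelector()
mechanism = type(selector).__name__

# A connected socket pair stands in for a client talking to a queue process.
client, server = socket.socketpair()
server.setblocking(False)
selector.register(server, selectors.EVENT_READ, data="client-1")

client.sendall(b"request")

# Wait until the registered socket is readable, then read from it.
received = []
for key, _events in selector.select(timeout=1.0):
    received.append((key.data, key.fileobj.recv(1024)))

selector.unregister(server)
selector.close()
client.close()
server.close()

print(mechanism, received)
```

The benefit of epoll-style interfaces is that the kernel keeps the set of registered descriptors between calls, so the cost of each wait does not grow with the total number of open connections the way it does with poll().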

NOTE: Revision 1 of this hotfix could cause problems with Replica discovery if it was applied on a BoKS Server Agent. This has been resolved in revision 2 of the hotfix.

 



Last Modified On: September 17, 2018