Tuning network timeouts

Network issues that slow the connection to specific agents can degrade overall scheduling performance. IBM Workload Scheduler provides network timeout settings that you can tune so that problems with specific agents do not affect the entire system.

When you run the EQQPCS05 sample, global and local options files are created in the end-to-end work directory on USS. For more information, see Step 4. Creating and customizing the work directory. You can add the following optional parameters to the localopts file to modify the default timeouts used by IBM Workload Scheduler, tuning the values to find the best settings for your environment.

Lower timeout values might cause connections to fail too often on slow systems or networks, which prevents scheduling on slower nodes. Higher values might allow the slowness of a single system to compromise the performance of the entire scheduling environment.

Setting the following local options on every distributed agent with the suggested values can help improve scheduling performance.
Table 1. Network timeout settings in localopts

Local option: tcp connect timeout
Default value: 15
Description: The timeout, in seconds, used while opening a connection to an agent or domain manager (TCP/IP connect call). The actual time spent connecting to each agent can be a multiple of this value, because IBM Workload Scheduler can make multiple connection attempts for the same agent, depending on the TCP/IP configuration.

Local option: tcp timeout
Default value: 60
Description: The time, in seconds, that the caller (mailman or translator) waits for the invoked service on the agent to return a result.

If the option is not specified, the default value is 60. The value set by default in the LOCALOPTS file is 300, which is the suggested value unless specific problems are found.

This timeout is not used during some long operations, for example, when mailman is downloading the Symphony file on an agent or domain manager.
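
Because the time spent connecting to one agent can be a multiple of tcp connect timeout, a rough upper bound can help when choosing a value. The following Python sketch only illustrates that arithmetic. The function name and the number of connection attempts are hypothetical, because the actual number of attempts depends on the TCP/IP configuration.

```python
def worst_case_connect_seconds(tcp_connect_timeout: int, attempts: int) -> int:
    """Upper bound, in seconds, on the time spent trying to reach one agent.

    attempts is an assumed value: the real number of connection attempts
    depends on the TCP/IP configuration and is not fixed by the product.
    """
    if tcp_connect_timeout < 0 or attempts < 1:
        raise ValueError("timeout must be >= 0 and attempts must be >= 1")
    return tcp_connect_timeout * attempts


if __name__ == "__main__":
    # Compare a few candidate timeouts, assuming 3 attempts per agent.
    for timeout in (5, 15, 30):
        bound = worst_case_connect_seconds(timeout, attempts=3)
        print(f"tcp connect timeout = {timeout}: up to {bound} seconds per agent")
```

With the default of 15 seconds and an assumed three attempts, a single unreachable agent could hold the caller for up to 45 seconds.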

If these parameters are not present in the LOCALOPTS, they can be added manually as in this example:
TCP TIMEOUT         = 300
TCP CONNECT TIMEOUT = 15
If these parameters are not present in the localopts, they can be added manually as in this example:
tcp timeout         = 60
tcp connect timeout = 15
Restart the end-to-end server to ensure that all processes use the new settings.
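
Before editing the file, it can help to check which of the two options it already defines. The following Python sketch is not part of the product. It assumes a simple "name = value" layout with "#" comment lines and treats option names as case-insensitive, and the helper names and the suggested values in WANTED are illustrative choices taken from the examples above.

```python
# Suggested values from the examples above; adjust them for your environment.
WANTED = {"tcp timeout": "300", "tcp connect timeout": "15"}


def parse_localopts(text: str) -> dict:
    """Return the options found in localopts-style text as a dict.

    Assumes one "name = value" pair per line and "#" for comments.
    Names are lowercased and inner whitespace is collapsed.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[" ".join(name.lower().split())] = value.strip()
    return options


def missing_timeout_lines(text: str, wanted: dict = WANTED) -> list:
    """Return the lines to add for every wanted option that is absent."""
    present = parse_localopts(text)
    return [f"{name} = {value}" for name, value in wanted.items()
            if name not in present]


if __name__ == "__main__":
    sample = "# sample localopts content\nTCP TIMEOUT         = 300\n"
    for line in missing_timeout_lines(sample):
        print("add:", line)
```

The sketch only reports the lines that are missing and leaves the file itself untouched, so the edit and the restart of the end-to-end server remain manual steps.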