In-Flight Trace facility for engine

Describes the tracing facility for troubleshooting the IBM Workload Scheduler engine. This facility is called In-Flight Trace.

This document describes the IBM Workload Scheduler server tracing facility that replaced Autotrace starting with version 8.6. The facility is designed to be used by IBM® Software Support, but it is fully described here so that you understand how to use it if IBM® Software Support asks you to do so.

The IBM Workload Scheduler server tracing facility (hereafter called In-Flight Trace) is a facility used by IBM® Software Support to help solve problems in IBM Workload Scheduler. At maximum capacity it can trace the entry into and exit from every IBM Workload Scheduler function, plus many other events, and includes all log and trace messages currently issued by the CCLog facility.

In-Flight Trace has been conceived as a multi-product tool, although this description concentrates on its use for IBM Workload Scheduler.

It works as follows:

Existing trace calls

In-Flight Trace uses the existing logging and tracing calls in the code. These are the same calls that the CCLog logging and tracing mechanism still uses, and that the Autotrace facility used in releases before 8.6.

Function entry and exit

In addition, the IBM Workload Scheduler engine product build now inserts trace calls in the code to record the entry to and exit from every function and assigns a sequential numeric function ID to each function. The trace calls use these IDs to identify the functions.

Building the xdb.dat symbols database

During the same process, the build creates the xdb.dat symbols database, which associates the name of each function with its function ID. In this way, the trace writes the minimum possible information to the trace record (the function ID), which can be expanded later to give the function name for viewing.

The build also stores in the database the source file and line number of each function.

Further, it stores the name of the component which "owns" the function. One program contains many components, each of which contains many functions.

The symbols database is the key to managing the activation/deactivation and filtering of the traces. The information it contains is encrypted.
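
The role of the symbols database can be illustrated with a minimal Python sketch. It is not the product's implementation: the `Symbol` class, the `SYMBOLS` table, the `expand` function, and every function, component, and file name in it are hypothetical, and the encryption of the real database is not modeled. It only shows how a compact function ID in a trace record can be expanded later into a name, owning component, and source location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """One hypothetical symbols-database entry for a traced function."""
    name: str
    component: str
    source_file: str
    line: int


# Hypothetical entries keyed by sequential function ID.
# In the product, the build generates these; the values here are invented.
SYMBOLS = {
    1: Symbol("open_mailbox", "comp_a", "mailbox.c", 120),
    2: Symbol("read_event", "comp_a", "events.c", 48),
    3: Symbol("send_message", "comp_b", "net.c", 210),
}


def expand(function_id: int) -> str:
    """Expand a function ID from a trace record into readable text."""
    sym = SYMBOLS.get(function_id)
    if sym is None:
        return f"<unknown function {function_id}>"
    return f"{sym.component}:{sym.name} ({sym.source_file}:{sym.line})"


print(expand(1))
print(expand(99))
```

Because the lookup happens only at viewing time, the trace record itself needs to carry nothing more than the integer ID.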

Tracing in shared memory

The traces are written to shared memory. This is divided into segments, and the traces chosen to be written to each segment are written in an endless loop. At maximum capacity (tracing all events on all functions) the traces might loop every few seconds, while at minimum capacity (tracing just one little-used function), the trace might not loop for months.
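
The "endless loop" behavior of a segment can be sketched as a fixed-size circular buffer. The following Python example is an illustration under stated assumptions, not the product's code: the `TraceSegment` class and its methods are hypothetical, and an ordinary list stands in for the shared memory. Once the segment is full, each new record overwrites the oldest one.

```python
class TraceSegment:
    """Hypothetical fixed-size trace segment that wraps around when full."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity  # stand-in for shared memory
        self._next = 0                   # index of the next slot to write
        self._count = 0                  # number of slots holding records

    def write(self, record):
        """Store a record, overwriting the oldest one once the segment is full."""
        self._slots[self._next] = record
        self._next = (self._next + 1) % len(self._slots)
        self._count = min(self._count + 1, len(self._slots))

    def contents(self):
        """Return the records currently held, oldest first."""
        if self._count < len(self._slots):
            return self._slots[:self._count]
        return self._slots[self._next:] + self._slots[:self._next]


segment = TraceSegment(capacity=3)
for record in ["r1", "r2", "r3", "r4", "r5"]:
    segment.write(record)
print(segment.contents())  # ['r3', 'r4', 'r5']
```

The sketch also shows the trade-off described above: the more records are written per second, the sooner older records are overwritten, so a larger segment or a narrower selection of traces keeps history for longer.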

Segments

You can use any number of segments (each is identified by a unique number), and for each segment you can determine how much shared memory it uses. More and bigger segments consume more memory, with all the normal consequences that entails.

Programs

Any number of IBM Workload Scheduler programs can be configured to write their traces to the same segment. You decide which programs are to be traced to which segments, and whether those segments are to be enabled for tracing, by modifying the basic configuration. Any of the IBM Workload Scheduler programs and utilities can be configured for tracing.

Basic configuration

The basic configuration determines which segments are enabled for tracing, and makes an initial determination of whether the tracing for a specific program is activated. It is achieved by editing a configuration file with a text editor. The IBM Workload Scheduler engine (the product) must be restarted to make the changes take effect. The configuration is divided into the following sections:

Global: This section includes general information such as the product code and the segment size. It also acts as a "catch-all": programs that do not have a section of their own are traced according to the settings in this section.
<program>: If a program is not to be traced according to the global section, you must configure a specific section for that program, defining which segment the program is traced in and other basic information. The settings in a program section override those in the global section, but only for that program.
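
The way a program section overrides the global section can be sketched as a simple dictionary merge. This Python example is only an illustration: the key names (`segment`, `enabled`, `level`) and their values are hypothetical and do not reflect the actual syntax or keywords of the configuration file.

```python
# Hypothetical settings; the real configuration file uses its own keywords.
GLOBAL_SECTION = {"segment": 1, "enabled": True, "level": 4}

# Only programs that need different settings get a section of their own.
PROGRAM_SECTIONS = {
    "batchman": {"segment": 2},
    "mailman": {"segment": 3, "level": 6},
}


def effective_config(program: str) -> dict:
    """Start from the global settings, then apply the program's overrides."""
    return {**GLOBAL_SECTION, **PROGRAM_SECTIONS.get(program, {})}


print(effective_config("batchman"))  # own segment, other settings inherited
print(effective_config("jobman"))    # no section of its own: global catch-all
```

A program without its own section receives exactly the global settings, which is the "catch-all" behavior described above.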

Activating and deactivating traces

For segments which are enabled, traces for specific programs can be activated and deactivated on-the-fly, from the command line, as these flags are held in memory.

Trace levels

Events in the code have been assigned trace levels. The lower the level, the more severe the event. The levels range from reporting only unrecoverable errors, through recoverable errors, warnings, informational messages, and three debug levels, to the maximum reporting level, at which even function entry and exit events are recorded.

Trace levels can also be changed on-the-fly, from the command line, without restarting the engine.
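
The level scheme can be sketched as an ordered enumeration in which an event is recorded only if its level does not exceed the level currently active. The Python example below is an assumption-laden illustration: the member names and the numeric values 1 to 8 are hypothetical and are not the values the product uses. Only the ordering follows the description above.

```python
from enum import IntEnum


class TraceLevel(IntEnum):
    """Hypothetical numbering; lower values mean more severe events."""
    UNRECOVERABLE = 1
    ERROR = 2
    WARNING = 3
    INFO = 4
    DEBUG_MIN = 5
    DEBUG_MID = 6
    DEBUG_MAX = 7
    FUNCTION_ENTRY_EXIT = 8


def should_record(event_level: TraceLevel, active_level: TraceLevel) -> bool:
    """Record an event only if it is at least as severe as the active level."""
    return event_level <= active_level


active = TraceLevel.WARNING
print(should_record(TraceLevel.ERROR, active))                # True
print(should_record(TraceLevel.FUNCTION_ENTRY_EXIT, active))  # False
```

Raising the active level to the maximum makes every comparison succeed, which corresponds to tracing the entry into and exit from every function.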

Snapshots

In-Flight Trace lets you take a snapshot of the current contents of the traces for a program or segment and save it to a file. You can optionally clear the memory in the segment after taking the snapshot. The snapshot file is in the internal format, containing function IDs, etc., and is not easily readable. It must be formatted to make it readable.

Formatting the snapshot

A command-line option lets you format a snapshot file and write the result to the standard output. The output can be in CSV or XML format, in which case information about the source (file name and line number) is automatically included. Alternatively, you can select the standard trace format (one line per trace record) and choose whether to include the source information. Finally, you can choose whether to include the header information (ideal for printed output) or omit it (ideal for a file that you are going to analyze programmatically).
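
As an illustration of what formatting involves, the following Python sketch turns raw records that contain only function IDs into CSV rows with source information, with an optional header row. Everything in it is hypothetical: the record layout, the column names, the `SYMBOLS` table, and the `format_csv` function are invented for the example and do not describe the real snapshot format or the real formatting command.

```python
import csv
import io

# Hypothetical symbol table: function ID -> (name, source file, line number).
SYMBOLS = {
    7: ("send_message", "net.c", 210),
    8: ("close_link", "net.c", 305),
}


def format_csv(records, header=True):
    """Format (function_id, event) records as CSV text with source details."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    if header:
        writer.writerow(["function", "event", "source_file", "line"])
    for function_id, event in records:
        # Fall back to a placeholder when an ID is not in the symbol table.
        name, source_file, line = SYMBOLS.get(
            function_id, (f"id={function_id}", "?", 0)
        )
        writer.writerow([name, event, source_file, line])
    return out.getvalue()


snapshot = [(7, "entry"), (7, "exit"), (8, "entry")]
print(format_csv(snapshot), end="")
print(format_csv(snapshot, header=False), end="")
```

Omitting the header row gives output in which every line is a data record, which is simpler to process programmatically.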

Filtering

The instrumentation of the code is a fully automatic process, so you might find that your traces include frequently used components or functions that are not causing any problems. To exclude them from the trace, use the command line to create a filter file. In the filter file you can include everything and then exclude any combination of specific components, functions, and source files. Alternatively, you can exclude everything and then include any combination of specific components, functions, and source files. Functions can also be included or excluded by specifying a range of function IDs.

Once created, a filter file is declared either in the global section of the configuration file or in one of the program sections. You can have more than one filter file and use them with different programs. Note, however, that the filter is applied at segment level: if two programs write to the same segment, the filter is applied to both, even if it is specified for only one of them.

Existing filter files can be modified from the command line.
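
The two filtering modes described in this section (include all and then exclude, or exclude all and then include) can be sketched as follows. This Python example is a hypothetical model, not the filter file format or the product's logic: the `TraceFilter` class, its fields, and the component and file names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TraceFilter:
    """Hypothetical filter: a base mode plus a list of exceptions to it."""
    include_all: bool = True                         # False means exclude all
    components: set = field(default_factory=set)
    functions: set = field(default_factory=set)
    source_files: set = field(default_factory=set)
    id_ranges: list = field(default_factory=list)    # inclusive (low, high)

    def _listed(self, function_id, name, component, source_file):
        """Return True if the function matches any listed exception."""
        return (
            component in self.components
            or name in self.functions
            or source_file in self.source_files
            or any(low <= function_id <= high for low, high in self.id_ranges)
        )

    def allows(self, function_id, name, component, source_file):
        """Listed items are excluded in include-all mode, included otherwise."""
        listed = self._listed(function_id, name, component, source_file)
        return not listed if self.include_all else listed


# Include everything except one noisy component.
drop_noisy = TraceFilter(include_all=True, components={"noisy_comp"})
print(drop_noisy.allows(5, "poll", "noisy_comp", "poll.c"))   # False
print(drop_noisy.allows(6, "plan", "planner", "plan.c"))      # True

# Exclude everything except a range of function IDs.
only_range = TraceFilter(include_all=False, id_ranges=[(10, 20)])
print(only_range.allows(15, "f15", "planner", "plan.c"))      # True
print(only_range.allows(21, "f21", "planner", "plan.c"))      # False
```

In this model one filter object would be attached to a segment rather than to a program, which mirrors the point that every program writing to the same segment is subject to the same filter.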

Products

In-Flight Trace is conceived as a multi-product facility. Each product has its own separate configuration file. Multiple instances of the facility can be run on the same system, completely independently of each other. However, you can also control one product from the tracing facility of another, by identifying the product to which to apply the commands. For example, if you had two versions of IBM Workload Scheduler running on the same system, you could control the In-Flight Trace facility for both of them from one place, inserting the appropriate product code when required by the command syntax.