KPIs for IBM Workload Scheduler

Find out the IBM Workload Scheduler KPIs managed by AIDA.

IBM Workload Scheduler and IBM® Z Workload Scheduler expose metrics and KPI definitions according to the OpenMetrics standard.
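As an illustration of what an OpenMetrics-style sample looks like, the following minimal Python sketch parses a single text exposition line into a metric name, labels, and a value. The sample lines, the `jobstatus` label, and the numeric values are assumptions made for the example; the exact output of the product endpoint is not reproduced here. The parser is deliberately simplified (for instance, it does not handle a `}` character inside a label value).

```python
import re

# Matches "<metric_name>{<labels>} <value>"; the label block is optional.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
# Matches one label pair such as jobstatus="READY".
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        # Comment lines (# HELP, # TYPE) and blank lines end up here.
        raise ValueError(f"not a metric sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    # Hypothetical sample lines, not captured from a real server.
    print(parse_sample('application_wa_JobsInPlanCount_job{jobstatus="READY"} 12'))
    print(parse_sample("application_wa_JobsInPlanCount_job_total 340"))
```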

KPI definitions and the data retrieval frequency are defined in a JSON file inside IBM Workload Scheduler. The AIDA Exporter component retrieves this file once a day.

According to the data retrieval frequency defined in the JSON file, the AIDA Exporter component retrieves the metrics through ad hoc APIs and stores them in the AIDA OpenSearch database.
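The frequency-driven retrieval can be pictured with the following minimal Python sketch. It is not the Exporter's actual logic, which is internal to AIDA; the function name `due_kpis` and the `last_fetch` bookkeeping are hypothetical. It only shows how a per-KPI frequency, expressed in seconds, determines which KPIs are due for retrieval at a given time.

```python
def due_kpis(definitions, last_fetch, now):
    """Return the names of the KPIs whose retrieval interval has elapsed.

    definitions: list of dicts with "name" and "frequency" (seconds).
    last_fetch:  dict mapping a KPI name to the time of its last retrieval.
    now:         current time, in the same unit as last_fetch (seconds).
    """
    due = []
    for kpi in definitions:
        # A KPI that was never fetched is always due.
        previous = last_fetch.get(kpi["name"], float("-inf"))
        if now - previous >= kpi["frequency"]:
            due.append(kpi["name"])
    return due


if __name__ == "__main__":
    definitions = [
        {"name": "Total jobs in plan", "frequency": 240},
        {"name": "Job history KPI", "frequency": 86400},
    ]
    last_fetch = {"Total jobs in plan": 0, "Job history KPI": 0}
    # Four minutes after the last retrieval, only the 240-second KPI is due.
    print(due_kpis(definitions, last_fetch, now=240))
```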

KPI definitions and KPI metrics cannot be modified by AIDA users.

For details about IBM Workload Scheduler exposed metrics, see Exposing metrics to monitor your workload in the IBM Workload Scheduler User's Guide and Reference.

For details about IBM® Z Workload Scheduler exposed metrics, see Exposing metrics to monitor your workload in the IBM® Z Workload Scheduler Managing the Workload.

AIDA also collects a special KPI named Job history, which contains the duration of each job that has been defined in IBM Workload Scheduler with the advanced analytics option enabled, and of all its predecessor jobs. Every day, this KPI generates one data point for each job execution (KPI frequency = 86400 seconds).

On a daily basis, starting from the KPI time series, AIDA uses machine learning algorithms to predict KPI trends.

According to the Timerange parameter in the common.env configuration file (or the values.yaml file for Kubernetes deployments), current KPI values are compared with their predicted values. Alerts can be generated based on alert definition rules. For details, see Alert Definitions.

IBM Workload Scheduler KPIs are grouped in the following categories:

| Category | KPI name | Metric name | Description | Data frequency |
|---|---|---|---|---|
| Jobs | Job history | job_history | Duration of each job with the advanced analytics option enabled and of all its predecessor jobs. | 1 data point for each daily job execution (86400 seconds) |
| Jobs | Total jobs in plan | application_wa_JobsInPlanCount_job_total | The total number of jobs in the current plan. | 1 data point every 4 minutes (240 seconds) |
| Jobs | Jobs in plan by status | application_wa_JobsInPlanCount_job | Jobs in the current plan with a specific status. The status can be: WAITING, READY, RUNNING, SUCCESSFUL, ERROR, CANCELED, HELD, UNDECIDED, BLOCKED, and SUPPRESS. | 1 data point every 4 minutes (240 seconds) |
| Jobs | Jobs in plan by workstation | application_wa_JobsByWorkstation_jobs | Jobs in the current plan, with a specific status, running on a specific workstation. | 1 data point every 4 minutes (240 seconds) |
| Jobs | Jobs in plan by folder | application_wa_JobsByFolder_jobs | Jobs in the current plan, with a specific status, in a specific folder. | 1 data point every 4 minutes (240 seconds) |
| Queue | WA message files fill percentage | application_wa_msgFileFill_percent | Internal message queue usage for Appserverbox.msg, Courier.msg, mirrorbox.msg, Mailbox.msg, Monbox.msg, Moncmd.msg, auditbox.msg, clbox.msg, planbox.msg, Intercom.msg, pobox messages, and server.msg. | 1 data point every 4 minutes (240 seconds) |

IBM Workload Scheduler KPIs JSON file

In the KPIs JSON file inside IBM Workload Scheduler, each entry defines a KPI. The frequency parameter specifies how often the KPI data is retrieved, expressed in seconds. This file cannot be modified by users.

[
  {
    "name": "Job history KPI",
    "metric_name": "job_history",
    "frequency": 86400,
    "category": "Jobs",
    "subcategory": "history",
    "labels": [
      "job"
    ],
    "keyprop": "attributes",
    "keyPropValues": ["duration"],
    "type": "records"
  },
  {
    "name": "Total jobs in plan",
    "metric_name": "application_wa_JobsInPlanCount_job",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend",
    "type": "total"
  },
  {
    "name": "Jobs in plan by status",
    "metric_name": "application_wa_JobsInPlanCount_job",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend",
    "keyprop": "jobstatus"
  },
  {
    "name": "Jobs in plan by workstation",
    "metric_name": "application_wa_JobsByWorkstation_jobs",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend_by_wks",
    "keyprop": "jobstatus",
    "labels": [
      "workstation"
    ]
  },
  {
    "name": "Jobs in plan by folder",
    "metric_name": "application_wa_JobsByFolder_jobs",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend_by_folder",
    "keyprop": "jobstatus",
    "labels": [
      "folder"
    ]
  },
  {
    "name": "WA Message files fill percentile",
    "metric_name": "application_wa_msgFileFill_percent",
    "frequency": 240,
    "category": "Queue",
    "subcategory": "Msg file fill",
    "keyprop": "msgfile"
  }
]
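
To see how the frequency values translate into daily data volumes, the following minimal Python sketch loads a shortened copy of the definitions and computes, for each KPI, how many retrievals fit into one day. The embedded JSON is an abbreviated excerpt of the listing above, and the `summarize` helper is hypothetical; how AIDA itself processes the file is not shown here.

```python
import json
from collections import defaultdict

SECONDS_PER_DAY = 86400

# Abbreviated excerpt of the KPI definitions (only the fields used below).
KPI_JSON = """
[
  {"name": "Job history KPI", "metric_name": "job_history",
   "frequency": 86400, "category": "Jobs"},
  {"name": "Total jobs in plan", "metric_name": "application_wa_JobsInPlanCount_job",
   "frequency": 240, "category": "Jobs"},
  {"name": "WA Message files fill percentile", "metric_name": "application_wa_msgFileFill_percent",
   "frequency": 240, "category": "Queue"}
]
"""


def summarize(definitions):
    """Group KPIs by category and map each KPI name to its retrievals per day."""
    summary = defaultdict(dict)
    for entry in definitions:
        per_day = SECONDS_PER_DAY // entry["frequency"]
        summary[entry["category"]][entry["name"]] = per_day
    return dict(summary)


if __name__ == "__main__":
    for category, kpis in summarize(json.loads(KPI_JSON)).items():
        for name, per_day in kpis.items():
            print(f"{category}: {name}: {per_day} retrievals per day")
```

With a frequency of 240 seconds a KPI is retrieved 360 times a day, while the Job history KPI (86400 seconds) is retrieved once.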