Batchman and other processes fail on a fault-tolerant agent with the message AWSDEC002E

The batchman process fails together with all other processes that are running on the fault-tolerant agent, typically mailman and jobman (and JOBMON on Windows 2000). The following errors are recorded in the stdlist log of the fault-tolerant agent:
+ ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ AWSBCV012E Mailman cannot read a message in a message file. 
+ The following gives more details of the error: 
+ AWSDEC002E  An internal error has occurred. The following UNIX 
+ system error occurred on an events file: "9" at line = 2212
+ ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Cause and solution:

The cause is corruption of the Mailbox.msg file, probably because the file was not large enough to hold the number of messages that had to be written to it.

Consider whether the problem is likely to have been caused by the file overflowing:
  • If you are sure that this is the cause, you can delete the corrupted message file.

    All events lost: Following this procedure means that all events in the corrupted message file are lost.

    Perform the following steps:
    1. Use the evtsize command to increase the maximum size of the Mailbox.msg file. Ensure that the file system has sufficient space to accommodate the larger file.
    2. Delete the corrupt message file.
    3. Restart IBM Workload Scheduler by issuing the conman start command on the fault-tolerant agent.
  • If you do not think that this is the cause, or are not sure, contact IBM® Software Support for assistance.
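
The three steps in the first option can be sketched as a short shell script for a UNIX agent. This is only an illustration under stated assumptions: the installation path in TWS_HOME and the new size of 20000000 bytes are hypothetical values, Mailbox.msg is assumed to be in that directory, and evtsize is assumed to accept the form "evtsize file_name size" with the size in bytes. The script defaults to a dry run that only prints the commands, so they can be reviewed before any events are deleted.

```shell
#!/bin/sh
# Sketch of the recovery steps for a corrupted Mailbox.msg file.
# All values below are assumptions; adjust them for your installation.
TWS_HOME="${TWS_HOME:-/opt/IBM/TWA/TWS}"   # hypothetical directory containing Mailbox.msg
NEW_SIZE="${NEW_SIZE:-20000000}"           # illustrative new maximum size, in bytes
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print the commands only, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_mailbox() {
    run cd "$TWS_HOME" || return 1
    # Step 1: increase the maximum size of the message file.
    run evtsize Mailbox.msg "$NEW_SIZE" || return 1
    # Step 2: delete the corrupt message file (all events in it are lost).
    run rm Mailbox.msg || return 1
    # Step 3: restart the scheduler processes on the fault-tolerant agent.
    run conman start
}

recover_mailbox
```

Run the script as the user that owns the installation, and set DRY_RUN=0 only after confirming that the printed commands match your environment.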