Generating data in external format - XML representation

A traditionally formatted data set or file, with format described by a template (copybook), can be represented in an external format. This is possible by applying external format rules to the data in a character format determined by the input template. The output of a copy operation has an external format which can be recognized by applications on various platforms.

You can request that the output of the copy operation be "well-formed" XML data. (XML concepts and terminology are described in the XML Specification 1.0, available at www.w3.org.) The output format derived from the input template also conforms to XML rules. A typical record in the output file is an XML line that corresponds to an elementary data item in the input record, and has the format:

start-tag content end-tag

where start-tag and end-tag include the data item name and content is a character representation of the data item.

Using the copybook shown in About copybook templates:

01 ORDERS.
   05 ORDER-ID        PIC X(5).
   05 CUSTOMER-ID     PIC X(5).
   05 ORDER-DATE.
      10 ORDER-YEAR   PIC 9(4).
      10 ORDER-MONTH  PIC 9(2).
      10 ORDER-DAY    PIC 9(2).

and the input record with an Order ID of O1002 and a Customer ID of C0015 made on 20050110, the corresponding sequence of XML lines is:

<ORDER>
  <ORDER-ID>O1002</ORDER-ID>
  <CUSTOMER-ID>C0015</CUSTOMER-ID>
  <ORDER-DATE>
    <ORDER-YEAR>2005</ORDER-YEAR>
    <ORDER-MONTH>01</ORDER-MONTH>
    <ORDER-DAY>10</ORDER-DAY>
  </ORDER-DATE>
</ORDER>
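
The mapping from a fixed-format record to XML lines can be sketched in a few lines of Python. This is only an illustration, not how File Manager is implemented: the `LAYOUT` table and the `record_to_xml_lines` function are hypothetical names, the layout is typed in by hand rather than parsed from the copybook, the root name follows the XML sample above, and field contents are copied as-is with no numeric formatting.

```python
# Hypothetical layout: (name, logical level, length); length None marks a group item.
LAYOUT = [
    ("ORDER", 1, None),
    ("ORDER-ID", 2, 5),
    ("CUSTOMER-ID", 2, 5),
    ("ORDER-DATE", 2, None),
    ("ORDER-YEAR", 3, 4),
    ("ORDER-MONTH", 3, 2),
    ("ORDER-DAY", 3, 2),
]


def record_to_xml_lines(record, layout, indent=2):
    """Slice a fixed-width record by the layout and emit one XML line per item."""
    lines = []
    open_groups = []  # stack of (name, level) for groups still awaiting an end-tag
    pos = 0
    for name, level, length in layout:
        # Close any group that is at the same or a deeper level than this item.
        while open_groups and open_groups[-1][1] >= level:
            gname, glevel = open_groups.pop()
            lines.append(" " * (indent * (glevel - 1)) + f"</{gname}>")
        pad = " " * (indent * (level - 1))
        if length is None:
            lines.append(f"{pad}<{name}>")
            open_groups.append((name, level))
        else:
            content = record[pos:pos + length]
            pos += length
            lines.append(f"{pad}<{name}>{content}</{name}>")
    # Close whatever groups remain open at the end of the record.
    while open_groups:
        gname, glevel = open_groups.pop()
        lines.append(" " * (indent * (glevel - 1)) + f"</{gname}>")
    return lines


xml_lines = record_to_xml_lines("O1002C001520050110", LAYOUT)
print("\n".join(xml_lines))
```

Run against the sample record, the sketch prints the same nine lines as the XML shown above.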

File Manager assumes that the names used in a copybook or template are legitimate XML names according to the XML specification. It does not check that each data name, other than FILLER, is unique within the containing group. However, names which do not start with a letter or an underscore are prefixed with an underscore.

The format of elementary data depends on the input template or copybook, and the result is similar to the output from the Print Utility. Numeric data is converted to its character representation: File Manager removes leading zeros and blanks, removes trailing blanks, omits a positive sign, and so on. Character data is included in the XML content with trailing blanks removed; leading blanks are significant in data and are not suppressed.
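
As a rough illustration of the rules named in this paragraph, the following Python sketch formats numeric and character values. The function names are hypothetical, the exact rules File Manager applies (the "and so on") are not reproduced here, and keeping a single zero for an all-zero field is an assumption.

```python
def format_numeric(raw: str) -> str:
    """Drop blanks, a positive sign, and leading zeros from a numeric string."""
    text = raw.strip()              # leading and trailing blanks
    sign = ""
    if text.startswith("+"):
        text = text[1:]             # a positive sign is not shown
    elif text.startswith("-"):
        sign, text = "-", text[1:]
    text = text.lstrip("0")         # leading zeros
    if text == "" or text.startswith("."):
        text = "0" + text           # assumption: keep one zero, for example "0" or "0.50"
    return sign + text


def format_character(raw: str) -> str:
    """Remove trailing blanks only; leading blanks are significant."""
    return raw.rstrip(" ")


print(format_numeric("  +00123"))       # 123
print(repr(format_character("  AB  "))) # '  AB'
```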

There are situations when data cannot be represented in the XML output by its value. Data may not match the data type requirements, for example, when a numeric field contains non-numeric characters. Character data can contain non-printable and special characters. Non-printable characters are defined by the default (or customized) FMNTRTBS translation table. The following special characters have special meaning in the XML output and require different representation:

">" (greater than)
"<" (less than)
"'" (apostrophe)
""" (quote)
"&" (ampersand)

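One common way to represent these five characters is with the predefined entity references of XML 1.0. The sketch below uses Python's standard `xml.sax.saxutils.escape`; the `escape_special` wrapper is a hypothetical name, and the sketch shows standard XML escaping in general rather than the exact output of any particular File Manager option.

```python
from xml.sax.saxutils import escape


def escape_special(text: str) -> str:
    """Replace the five XML special characters with predefined entity references."""
    # escape() handles &, < and > itself; the two quote characters are added here.
    return escape(text, {"'": "&apos;", '"': "&quot;"})


print(escape_special("A15&B32"))       # A15&amp;B32
print(escape_special("<a href='x'>"))  # &lt;a href=&apos;x&apos;&gt;
```
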
The XML output is generated in the character set of the input data (EBCDIC, DBCS), but can optionally be converted to Unicode. File Manager uses standard z/OS® support for Unicode, assuming that the appropriate infrastructure and services are present. Conversion to Unicode is possible if the conversion environment has been created and activated (for details, see the z/OS Support for Unicode Using Conversion Services). File Manager assumes CCSID 0037 for EBCDIC, 0939 for DBCS (MBCS), and 1200 for Unicode.
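
To see what such a conversion does to a single XML line, the following Python sketch converts CCSID 0037 bytes to UTF-16. It is an off-platform illustration only: it uses Python's built-in `cp037` codec rather than z/OS conversion services, it assumes big-endian UTF-16 without a byte order mark for CCSID 1200, and it does not cover DBCS data. The `ebcdic_to_utf16` name is hypothetical.

```python
def ebcdic_to_utf16(data: bytes) -> bytes:
    """Convert single-byte EBCDIC (code page 037) bytes to big-endian UTF-16."""
    text = data.decode("cp037")       # EBCDIC bytes to Python text
    return text.encode("utf-16-be")   # assumption: CCSID 1200 as UTF-16 big-endian


ebcdic_line = "<ORDER-ID>O1002</ORDER-ID>".encode("cp037")
unicode_line = ebcdic_to_utf16(ebcdic_line)

print(hex(ebcdic_line[0]))                  # 0x4c, the EBCDIC code point for "<"
print(len(ebcdic_line), len(unicode_line))  # each character now takes two bytes
```

The doubling in length matters when you size the output data set for Unicode output.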

You can tailor the format and content of the XML output by specifying:

How to represent special characters
How to represent non-printable characters
How to represent invalid data
Whether FILLER should be included or ignored
Whether redefinitions are to be included or ignored
Whether to get the XML output in the character set of input data (EBCDIC, DBCS) or in Unicode
The number of blanks used to indent each logical level of XML tag nesting for better readability (logical levels 1, 2, 3, and so on, are assigned by File Manager by renumbering the user levels specified in the copybook, for example 01, 05, 10, and so on)
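
The renumbering of user levels into logical levels can be sketched as follows. This is a minimal illustration under stated assumptions: `logical_levels` and `indent_for` are hypothetical names, the algorithm is a plain nesting-depth count rather than File Manager's own logic, and leaving logical level 1 unindented is an assumption based on the sample output shown earlier.

```python
def logical_levels(user_levels):
    """Map copybook level numbers (01, 05, 10, ...) to nesting depths 1, 2, 3, ..."""
    stack = []    # user levels of the items that currently enclose the next item
    result = []
    for level in user_levels:
        # An item ends every open item at the same or a higher level number.
        while stack and stack[-1] >= level:
            stack.pop()
        stack.append(level)
        result.append(len(stack))
    return result


def indent_for(logical_level, blanks_per_level):
    """Blanks placed before a tag; assumes logical level 1 starts at column 1."""
    return " " * (blanks_per_level * (logical_level - 1))


# ORDERS copybook: 01, 05, 05, 05 (ORDER-DATE), 10, 10, 10
print(logical_levels([1, 5, 5, 5, 10, 10, 10]))  # [1, 2, 2, 2, 3, 3, 3]
```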

The XML output contains additional information included as separate lines, or attributes following data-element names. They document:

The input data set name and the template (copybook). For example:
```
<INPUT FILE="FMN.SEQ1" FORMAT="FMN.TEMPLATE(SEQ1)">
```
The record sequence number and, in the case of keyed data, the key value. For example:
```
<ORDER SEQ_NUMBER="998" KEY="O1002">
```
The occurrence number, if the data element is an array. For example:
```
<MONTH-END ITEM_NUMBER="(2)">24</MONTH-END>
```
Non-printable or special characters found in the content of the element and converted according to processing parameters. For example:
```
<EMP-ID NONPRINT_CHAR="REPLACE">AA..17</EMP-ID>
<JOB-ID SPECIAL_CHAR="ESCAPE">A15&amp;B32</JOB-ID>
```
Any invalid data conditions met when converting numeric data to a character string; the element is presented according to processing parameters. For example, skipped:
```
<EMP-SALARY INVALID_DATA="SKIP"></EMP-SALARY>
```
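
An application that consumes the output can read these attributes with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. The `SAMPLE` record is made up for illustration by combining the attribute examples above into one record (the first `MONTH-END` value is invented), and the sketch assumes the output has already been transferred as ordinary text rather than EBCDIC.

```python
import xml.etree.ElementTree as ET

# Made-up record that combines the attribute examples above.
SAMPLE = """<ORDER SEQ_NUMBER="998" KEY="O1002">
  <ORDER-ID>O1002</ORDER-ID>
  <MONTH-END ITEM_NUMBER="(1)">31</MONTH-END>
  <MONTH-END ITEM_NUMBER="(2)">24</MONTH-END>
  <EMP-SALARY INVALID_DATA="SKIP"></EMP-SALARY>
</ORDER>"""

record = ET.fromstring(SAMPLE)

seq_number = int(record.get("SEQ_NUMBER"))   # record sequence number
key = record.get("KEY")                      # key value for keyed data

# Array occurrences, keyed by the occurrence number attribute.
month_ends = {el.get("ITEM_NUMBER"): el.text for el in record.iter("MONTH-END")}

# Elements whose content was skipped because the data was invalid.
skipped = [el.tag for el in record.iter() if el.get("INVALID_DATA") == "SKIP"]

print(seq_number, key, month_ends, skipped)
```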

In the cases of invalid data, non-printable and special characters, the attributes document the actual processing of particular data rather than the parameters specified. For example, if a data field contains both non-printable and special characters, with SKIP and REPLACE as the corresponding parameters, the content is skipped. REPLACE is inconsistent with SKIP in this case and is ignored. In general, if a data field contains both special and non-printable characters, SKIP and HEX have precedence over all other options. CDATA has less priority than SKIP and HEX, but higher priority than the rest of the options.
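
The precedence described here can be modeled as a simple ranking. The sketch below is one reading of the paragraph, not File Manager's logic: `winning_options` is a hypothetical name, and because the text does not say what happens when two options of equal rank meet (for example SKIP and HEX), the sketch simply returns both in that case.

```python
# Ranks taken from the paragraph above; a lower number wins.
PRIORITY = {"SKIP": 0, "HEX": 0, "CDATA": 1}
DEFAULT_RANK = 2  # "the rest of the options", for example REPLACE or ESCAPE


def winning_options(requested):
    """Return the requested options that survive when they apply to the same field."""
    ranks = [PRIORITY.get(option, DEFAULT_RANK) for option in requested]
    best = min(ranks)
    # Assumption: options of equal rank are all kept; the source does not cover ties.
    return [option for option, rank in zip(requested, ranks) if rank == best]


print(winning_options(["SKIP", "REPLACE"]))   # ['SKIP']
print(winning_options(["CDATA", "REPLACE"]))  # ['CDATA']
```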

The XML output can be any data set allowed by the Copy Utility, except for VSAM KSDS. The output data set may be fixed or variable in length. If you want an output record created for each XML line, make sure that the maximum logical record length (LRECL) can contain the largest generated XML line. The size of the line depends on the names used in the template, the maximum length of the character representation of values, and the way the special and non-printable characters and invalid data are processed. Some special characters are substituted with strings, some data is presented in hexadecimal form, and so on, which can make the line considerably longer than expected. If you want each XML line to be placed in a single output record, and the record is not long enough for a particular XML line, File Manager truncates the XML line, ends processing, and reports an error (in batch, it also sets a tailorable condition code of 8 to indicate a truncation error). If you decide that an XML line can span multiple output records, the output data set can have any logical record length.
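
When you have a sample of the expected output, a quick length check can help in choosing a record length before the copy is run. The sketch below is a planning aid under stated assumptions: `lines_exceeding_lrecl` is a hypothetical helper, one character is assumed to occupy one byte (single-byte EBCDIC output), and any record-format overhead is assumed to be already subtracted from the `lrecl` value passed in.

```python
def lines_exceeding_lrecl(xml_lines, lrecl):
    """Return (line number, length) for every XML line longer than lrecl."""
    return [
        (number, len(line))
        for number, line in enumerate(xml_lines, start=1)
        if len(line) > lrecl
    ]


sample_lines = [
    "<JOB-ID>A15&amp;B32</JOB-ID>",  # 7 data characters grow to 11 once "&" is escaped
    "<X>1</X>",
]
print(lines_exceeding_lrecl(sample_lines, 20))  # [(1, 28)]
print(lines_exceeding_lrecl(sample_lines, 80))  # []
```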

To generate XML-formatted output, use the Copy From panel as follows:

From the Primary Option Menu panel, select option 3 (Utilities), and then option 3 (Copy). File Manager displays the Copy Utility panel.
Enter the "From" data set details.
Enter the "From" copybook or template details. Use a combination of the Data set name and Member entry fields to specify the copybook or template that describes the data in the "From" data set. You can select data for copying at either record level or field level:
- For record-level selection, set the record identification and record selection criteria in your "From" template.
- For field-level selection, use a "From" template with a "To" template, and specify the selected fields, field attributes and field mapping in your "To" template.
If you specify both record-level and field-level selection, File Manager first selects data at record level, then at field level.
For Export/Import, select 1 to indicate that you want output in XML format.
Select Copybook/template Processing Option 1 (Above) or 3 (Create dynamic), and press Enter.
If you selected option 1, File Manager displays an extended version of the Copy To panel. This form of the panel allows you to specify the additional options.

If you selected option 3, then you must create the dynamic template. Once you have done so, File Manager displays the extended Copy To panel.
Type the "To" data set details.
Optionally, customize the generation of output. For XML, you can affect the generation and readability by specifying how to represent non-printable characters and invalid data, whether to include fillers and redefines, how many blanks to use when indenting successive levels of XML tag nesting, and so on.
Press Enter. File Manager generates the selected data from the "From" data set and writes it, in XML format, to the "To" data set.