Generating Large Variable Length Records

This problem has arisen many times over the years and, for me, came to head with my having to respond to a set of specs requiring just such a problem. While working on the problem, a request hit SAG-L and loads of suggestions followed from which I gleaned some help but ended up with my own home-grown version that worked for the specific application problem.

To start, five different input records were written to a work file from which the generated files were to be created for transmission to a NY bank. The COBOL record definitions were internally emailed to another programmer who immediately translated them into a DEFINE DATA series of definitions similar to the following:

1 #RECORD-TYPE-1	
  2 #R1-FIELD1(A10)
  2 #COMMA1(A1)
  2 #R1-FIELD2(A9)
  2 #COMMA2(A1)
  2 #R1-FIELD3(A2)
  2 #COMMA3(A1)
  2 #R1-FIELD4(A3)
  2 #COMMA4(A1)
  2 #R1-FIELD5(A16)
  2 #COMMA5(A1)
  2 #R1-FIELD6(A60)
  2 #COMMA6(A1)
  2 #R1-FIELD7(A10)
  	:
  	:

#R1-FIELD5 data was actually S9(14)V99 when passed; there was one or two types of these field on each record type passed. The specs said they had to be passed with leading zeros suppressed.

As you can see, the data came in as "comma-delimited" (hence the #COMMA... fields). If a field was empty, the record contained the same number of spaces as defined for the field length.

The data had to be formatted into a single, comma-delimited record with all numeric data sent without leading zeros and any empty field represented with a simple comma, no spaces were allowed to represent an empty field.

The first thing we did was to set up an array similar to #IN-RECORD (A1/350) since the largest record was 324 bytes so we left some room for "expansion". So eventually the DEFINE DATA (minus the "comma" fields (now FILLER 1X)) looked much like:

DEFINE DATA LOCAL	
1 #OUT-RECORD(A1/350)
1 #IN-RECORD(A1/350)
1 REDEFINE #IN-RECORD
  2 #RECORD-TYPE-1	
    3 #R1-FIELD1(A10) 
    3 FILLER 1X
    3 #R1-FIELD2(A09)
    3 FILLER 1X
    3 #R1-FIELD3(A02)
    3 FILLER 1X
    3 #R1-FIELD4(A03)
    3 FILLER 1X
    3 #R1-FIELD5(A16)
    3 FILLER 1X
    3 #R1-FIELD6(A60)
    3 FILLER 1X
    3 #R1-FIELD7(A10)
    3 FILLER 1X
    3 #R1-FIELD8(A40)
    3 FILLER 1X
    3 #R1-FIELD9(A16)
    3 FILLER 1X
    3 #R1-FIELD10(A29)
    3 FILLER 1X
    3 #R1-FIELD11(A05)   
    3 FILLER 1X
    3 #R1-FIELD12(A05)    
    3 FILLER 1X
    3 #R1-FIELD13(A20)    
    3 FILLER 1X      /* end of block 1
    3 #R1-FIELD14(A20)    
    3 FILLER 1X
    3 #R1-FIELD15(A20)    
    3 FILLER 1X
    3 #R1-FIELD16(A01)    
    3 FILLER 1X
    3 #R1-FIELD17(A01)    
    3 FILLER 1X
    3 #R1-FIELD18(A01)    
    3 FILLER 1X
    3 #R1-FIELD19(A02)    
    3 FILLER 1X
    3 #R1-FIELD20(A03)    
    3 FILLER 1X
    3 #R1-FIELD21(A05)
    3 FILLER 1X
    3 #R1-FIELD22(A05)    
    3 FILLER 1X
    3 #R1-FIELD23(A08)    
    3 FILLER 1X
    3 #R1-FIELD24(A07)    
    3 FILLER 1X
    3 #R1-FIELD25(A05)    
    3 FILLER 1X
    3 #R1-FIELD26(A12)
    3 FILLER 1X
    3 #R1-FIELD27(A05) /* end of block 2
    3 FILLER 1X
    3 #R1-FIELD28(A16)
    3 FILLER 1X
    3 #R1-FIELD29(A20) /* end of block 3
1 REDEFINE #IN-RECORD 
  2 #R1-BLOCK1(A193)
  2 #R1-BLOCK2(A113)
  2 #R1=BLOCK3(A044)
1 REDEFINE #IN-RECORD
  2 RECORD-TYPE-2
:	
:
1 REDEFINE #IN-RECORD
  2 #R2-BLOCK1(A168)
  2 #R2-BLOCK2(A128)
  2 #R2-BLOCK3(A054)
:
 
1 #LENGTH(I4)
1 #OUT-LENGTH(I4)
1 #I(I4)

Each different record type was redefined and redefined again in similar fashion. Each block represented a complete range of fields with no field extending beyond the end of a specific block. I eliminated the commas passed with the FILLER 1X notation &endash; they didn't suit the purpose I put the COMPRESS to later.

#OUT-RECORD will contain the compressed, comma-delimited data set.

END-DEFINE
DEFINE WORK FILE 1 'TESTFILE.COBOLREC.INPUT'	
READ WORK FILE 1 #IN-RECORD(*)
DECIDE ON FIRST VALUE OF #IN-RECORD
  VALUE '1'  
    MOVE LEFT #R1-FIELD5 TO #R1-FIELD13
    MOVE LEFT #R1-FIELD9 TO #R1-FIELD9
    COMPRESS #R1-FIELD1 ... #R1-FIELD10
      NTO #R1-BLOCK1 WITH ALL DELIMITERS
    EXAMINE #R1-BLOCK1 FOR ', ,' REPLACE WITH ',,'
    EXAMINE #R1-BLOCK1 FOR '  ' DELETE
    EXAMINE #R1-BLOCK1 FOR ' ' GIVING LENGTH #LENGTH
    FOR #I 1 #LENGTH
      MOVE #IN-RECORD(#I) TO #OUT-RECORD(#I)
    END-FOR
    MOVE #LENGTH TO #OUT-LENGTH
    ADD 1 TO #OUT-LENGTH
    MOVE ',' TO #OUT-RECORD(#OUT-LENGTH)
*
    MOVE LEFT #R2-FIELD29 TO #R2-FIELD29
    COMPRESS #R1-FIELD14 ... #R1-FIELD27 
      INTO #R1-BLOCK2 WITH ALL DELIMITERS
    EXAMINE #R1-BLOCK2 FOR ', ,' REPLACE WITH ',,'
    EXAMINE #R1-BLOCK2 FOR '  ' DELETE
    EXAMINE #R1-BLOCK2 FOR ' ' GIVING LENGTH #LENGTH
    FOR #I 1 #LENGTH
      MOVE #IN-RECORD(#I) TO #OUT-RECORD(#OUT-LENGTH + #I)
    END-FOR
*
    IF #R1-BLOCK3 GT ' '
      ADD #LENGTH TO #OUT-RECORD
      MOVE ',' TO #OUT-RECORD(#OUT-LENGTH)
      MOVE LEFT #R2-FIELD28 TO #R2-FIELD29
      COMPRESS #R1-FIELD11 ... #R1-FIELD27 
        INTO #R1-BLOCK2 WITH ALL DELIMITERS
      EXAMINE #R1-BLOCK2 FOR ', ,' REPLACE WITH ',,'
      EXAMINE #R1-BLOCK2 FOR '  ' DELETE
      EXAMINE #R1-BLOCK2 FOR ' ' GIVING LENGTH #LENGTH
      EXAMINE #R1-BLOCK2 FOR ' ' GIVING LENGTH #LENGTH
      FOR #I 1 #LENGTH
        MOVE #IN-RECORD(#I) TO #OUT-RECORD(#OUT-LENGTH + #I)
      END-FOR
      ADD #LENGTH TO #OUT-RECORD
      MOVE ',' TO #OUT-RECORD(#OUT-LENGTH)
    END-IF
*
WRITE WORK FILE 2 VARIABLE #OUT-RECORD(*)
*
  VALUE '2'
* Process next record type 
  VALUE NONE IGNORE
END-DECIDE
END

 

Narrative, processing Record 1 Block 1:

  VALUE '1'  
    MOVE LEFT #R1-FIELD5 TO #R1-FIELD13
    MOVE LEFT #R1-FIELD9 TO #R1-FIELD9

These two statements shift the normally right justified numeric to a left justified perspective.

 

    COMPRESS #R1-FIELD1 ... #R1-FIELD10
      INTO #R1-BLOCK1 WITH ALL DELIMITERS

This COMPRESS now builds a comma-delimited record with trailing spaces removed from the previously left justified "numeric" fields and commas are automatically inserted between each compressed value. The target field, #R1-BLOCK1 is then padded with spaces, the normal process of the COMPRESS execution.

 

    EXAMINE #R1-BLOCK1 FOR ', ,' REPLACE WITH ',,'

The ALL option forces the COMPRESS to account for empty fields with a space-comma thus leaving a comma-space-comma for the empty field. The specs called for a ',,' to indicate an empty field which the above EXAMINE resolves.

 

    EXAMINE #R1-BLOCK1 FOR '  ' DELETE

You may or may not need this EXAMINE. We found, during testing, that a couple of alpha fields contained data that was both left justified and right justified in the same field, that is, there were a lot of intervening spaces which would violate the specs. The EXAMINE eliminated those spaces. For example, if the alpha field contained:

'MERISTEM SYSTEMS CORPORATION                   WH&O INTERNATIONAL' 

then the EXAMINE returned:

'MERISTEM SYSTEMS CORPORATION WH&O INTERNATIONAL'
 
 
    EXAMINE #R1-BLOCK1 FOR ' ' GIVING LENGTH #LENGTH

After all this data massaging, we then needed to know the length of the compressed value.

 

    FOR #I 1 #LENGTH
      MOVE #IN-RECORD(#I) TO #OUT-RECORD(#I)
    END-FOR

Now we move the input record data to the output record.

    MOVE #LENGTH TO #OUT-LENGTH
    ADD 1 TO #OUT-LENGTH
    MOVE ',' TO #OUT-RECORD(#OUT-LENGTH)

This little block of code loads up our index into the output record, adds 1 to the index so we can insert a comma to then load the second block of data after we clean it up.

 

Processing Record 1 Block 2:

    MOVE LEFT #R2-FIELD29 TO #R2-FIELD29

Once again left justify numeric fields in the block for future compression.

 

    COMPRESS #R1-FIELD14 ... #R1-FIELD27 
      INTO #R1-BLOCK2 WITH ALL DELIMITERS

Compress the block into itself as we did for the first block.

 

    EXAMINE #R1-BLOCK2 FOR ', ,' REPLACE WITH ',,'
    EXAMINE #R1-BLOCK2 FOR '  ' DELETE
    EXAMINE #R1-BLOCK2 FOR ' ' GIVING LENGTH #LENGTH
    FOR #I 1 #LENGTH
      MOVE #IN-RECORD(#I) TO #OUT-RECORD(#OUT-LENGTH + #I)
    END-FOR

Generally, this block of code is similar to the processing in the first block with the exception of the #OUT-RECORD index. We need to load immediately the end of block 1 and that's why the #I. #OUT-LENGTH currently points to the last comma we inserted but by starting at that point plus the FOR loop index, we begin loading at precisely the next "element".

Now, the following block is used to process the third block if it not empty. If it is empty, then processing falls through to the WRITE WORK FILE n VARIABLE and we're done processing the current record. If not, then put a comma after the last character loaded from block 2 an proceed as we did for the first two blocks.

Since we coded a DECIDE ON FIRST, you're WRITE WORK FILE n VARIABLE may be coded in a ANY clause instead. Please TEST, TEST, TEST and when you're done, TEST ONE MORE TIME!

Comments should be directed to whobooks@hotmail.com

Oh, by the way, we later introduced two PF key settings '%W<' and '%W>'; they were for shifting left and right during testing so you could see ALL the data on the screen. It was important to set the LS parameter also: LS=200 (or the largest segment coded).

FORMAT LS=200
SET KEY PF10='%W-' PF11-'%W+'

and for testing:

 1 REDEFINE #OUT-RECORD
   2 #OUT-RECORD-1(A175)
   2 #OUT-RECORD-2(A175)
 
* WRITE WORK FILE 2 VARIABLE #OUT-RECORD(*)
WRITE #OUT-RECORD-1 / #OUT-RECORD-2