Deepesh Shetty

Reputation: 1166

Convert a file with variable-length records to fixed-length records in Java

I have a requirement where I need to convert a file containing variable-length records into fixed-length records. This is a file from a Mainframe.

Since I don't have access to files on the Mainframe, I need a sample variable-length record file and a way to convert it to fixed-length records.

I am totally new to this kind of file, but if I get an idea of how to map these variable-length records to fixed-length records, I can code it in Java.

This is my variable length file:

1 piyush    pankaj    04mathematic10physics   20biology   45   
2 vanitha   reddy     03physics   30chemistry 60   
3 deepesh   shetty    05chemistry 5 biology   45
4 jane      dsouja    01geography 30chemistry 60biology   45
5 ramadasa  hegde     02chemistry 80biology   70   

This is how my fields are positioned:

05  ID                         PIC 99.  
05  FNAME                      PIC X(10).   
05  LNAME                      PIC X(10).
05  NO_SUB                     PIC X(2).
05  SUBJECTS OCCURS 0 TO 10 TIMES
          DEPENDING ON NO_SUB
       10  SUB_NAME            PIC X(10).
       10  MARKS               PIC 99.

So I am expecting output like this:

1 piyush    pankaj    04mathematic10
1 piyush    pankaj    04physics   20
1 piyush    pankaj    04biology   70
2 vanitha   reddy     03physics   30
2 vanitha   reddy     03chemistry 60
3 deepesh   shetty    05chemistry  5 
3 deepesh   shetty    05biology   45
4 jane      dsouja    01geography 30
4 jane      dsouja    01chemistry 60
4 jane      dsouja    01biology   70
5 ramadasa  hegde     02chemistry 80
5 ramadasa  hegde     02biology   70

Upvotes: 1

Views: 4310

Answers (2)

Bill Woodger

Reputation: 13076

This is your record-layout in COBOL:

05  ID                         PIC 99.  
05  FNAME                      PIC X(10).   
05  LNAME                      PIC X(10).
05  NO_SUB                     PIC X(2).
05  SUBJECTS OCCURS 0 TO 10 TIMES
          DEPENDING ON NO_SUB
       10  SUB_NAME            PIC X(10).
       10  MARKS               PIC 99.

This is not valid.

Firstly, ID is a reserved word; it is an abbreviation of IDENTIFICATION.

Secondly, NO_SUB is mentioned in OCCURS ... DEPENDING ON ... so must be numeric (in this case a PIC 99).

Thirdly, there is a full-stop/period missing after NO_SUB in the ODO.

Fourthly, from what you have shown of the data, ID and MARKS are both defined incorrectly. ID (when it has a proper name) should be PIC XX and MARKS either PIC XX or perhaps PIC Z9.

Your data is entirely textual, so there is no problem at all in transferring it from the Mainframe by your preferred method, allowing the transfer to do the EBCDIC to ASCII translation and to delimit your records with whatever is appropriate for your file-system.

You will then have some variable-length records on your system.

The key to being able to understand them is the value in NO_SUB for each individual record.

Each record has a fixed part of 24 bytes (the fields from ID to NO_SUB inclusive).

Records with a NO_SUB of 00 have no data beyond that point; you should just see your record delimiter.

Otherwise, NO_SUB should contain a numeric value of 01-10 inclusive, which will represent the length of the variable part of your data in this way: variable-length = NO_SUB * 12. The 12 is the length of SUB_NAME plus the length of MARKS.

To output the data you require, you need the fixed part of the record plus the appropriate variable part, accessed in some type of looping construct. Don't forget that NO_SUB can be zero, in which case there is no variable part, and you'll need to find out what, if anything, to output for that.

Having said all that, there is only one example in your data which is correct, the final record.

If NO_SUB is 03, you should find three blocks of (10-bytes-text, 2-bytes-numeric). If NO_SUB is 05, you should find five similar blocks.
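As a rough illustration of that check, the sketch below computes the length a record should have from its own NO_SUB and reports whether the line is long enough to hold that many blocks. The class and method names are made up for the example, and it assumes NO_SUB has already been confirmed to be numeric (Integer.parseInt throws otherwise).

```java
final class RecordLengthCheck {
    static final int FIXED_LENGTH = 24;  // ID(2) + FNAME(10) + LNAME(10) + NO_SUB(2)
    static final int BLOCK_LENGTH = 12;  // SUB_NAME(10) + MARKS(2)

    /** Length the record should have according to its own NO_SUB field. */
    static int declaredLength(String line) {
        // NO_SUB occupies bytes 23-24, i.e. zero-based indices 22 and 23
        int noSub = Integer.parseInt(line.substring(22, 24));
        return FIXED_LENGTH + noSub * BLOCK_LENGTH;
    }

    /** True when the line is long enough to hold every block NO_SUB promises. */
    static boolean hasDeclaredBlocks(String line) {
        return line.length() >= declaredLength(line);
    }

    public static void main(String[] args) {
        String second = "2 vanitha   reddy     03physics   30chemistry 60   ";
        String last   = "5 ramadasa  hegde     02chemistry 80biology   70   ";
        System.out.println(declaredLength(second) + " " + hasDeclaredBlocks(second));
        System.out.println(declaredLength(last) + " " + hasDeclaredBlocks(last));
    }
}
```

This only catches records that are too short for their NO_SUB. A record that carries more blocks than NO_SUB says (such as the fourth sample record) passes this check, so it is a sanity test rather than full validation.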

With your final record, you should output:

Byte 1 for a length of 24 + Byte 25 for a length of 12
Byte 1 for a length of 24 + Byte 37 for a length of 12

The variable part of your record starts at byte 25 and is 12 long. You output the first variable part from byte 25, the second from 37, the third from 49 etc.

Start-position-to-output is ( 25 + ( 12 * ( occurrence-in-loop - 1 ) ) ), where 25 is the position of the first variable part of your data, and 12 is the length of each individual element of your variable data.

Giving you:

5 ramadasa  hegde     02chemistry 80
5 ramadasa  hegde     02biology   70

You should check that NO_SUB is numeric and that it is not greater than 10, and find out what you should do if either of those is not the case.
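One possible shape for that check is sketched below. The names are hypothetical, and the strict rule that NO_SUB must be exactly two digits (so " 5" is rejected) is an assumption on my part; what to do with a rejected record is still something you need to find out.

```java
import java.util.OptionalInt;

final class NoSubValidator {
    static final int MAX_OCCURS = 10;  // upper bound from OCCURS 0 TO 10 TIMES

    /** Returns the subject count, or empty when NO_SUB is not two digits in the range 00-10. */
    static OptionalInt parseNoSub(String line) {
        if (line.length() < 24) {
            return OptionalInt.empty();  // the 24-byte fixed part is incomplete
        }
        String field = line.substring(22, 24);  // bytes 23-24 of the record
        boolean allDigits = field.chars().allMatch(c -> c >= '0' && c <= '9');
        if (!allDigits) {
            return OptionalInt.empty();  // not numeric
        }
        int noSub = Integer.parseInt(field);
        return noSub <= MAX_OCCURS ? OptionalInt.of(noSub) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseNoSub("5 ramadasa  hegde     02chemistry 80biology   70   "));
    }
}
```

Returning an empty OptionalInt rather than throwing leaves the decision about bad records (skip, log, abort) to the caller.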

In Java you can use the substring method to extract the fields (indices are zero-based and the end index is exclusive), e.g.

// Fixed part of the record: ID, FNAME, LNAME, NO_SUB
String id = inputLine.substring(0, 2);
String firstName = inputLine.substring(2, 12);
String lastName  = inputLine.substring(12, 22);
String numberOfEntriesStr = inputLine.substring(22, 24);

int numberOfEntries = Integer.parseInt(numberOfEntriesStr);

for (int i = 0; i < numberOfEntries; i++) {
   ...
}
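Putting the start-position formula and the loop together, a minimal sketch might look like the following. The class and method names are only for illustration. It assumes NO_SUB has already been validated and that the line really contains the number of blocks NO_SUB declares; a short line will throw StringIndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

final class RecordSplitter {
    static final int FIXED_LENGTH = 24;  // bytes 1-24: ID, FNAME, LNAME, NO_SUB
    static final int BLOCK_LENGTH = 12;  // SUB_NAME(10) + MARKS(2)

    /** Returns one fixed-length (36-character) record per SUBJECTS occurrence. */
    static List<String> split(String line) {
        String fixedPart = line.substring(0, FIXED_LENGTH);
        int noSub = Integer.parseInt(line.substring(22, 24));
        List<String> out = new ArrayList<>();
        for (int occurrence = 1; occurrence <= noSub; occurrence++) {
            // 1-based start position from the formula: 25 + 12 * (occurrence - 1)
            int startPosition = 25 + BLOCK_LENGTH * (occurrence - 1);
            int begin = startPosition - 1;  // Java strings are zero-based
            out.add(fixedPart + line.substring(begin, begin + BLOCK_LENGTH));
        }
        return out;
    }

    public static void main(String[] args) {
        String record = "5 ramadasa  hegde     02chemistry 80biology   70   ";
        split(record).forEach(System.out::println);
    }
}
```

A record with NO_SUB of 00 yields an empty list here, which simply postpones the question of what, if anything, should be written for such records.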

There are fixed-width Java packages, and some that can read files using a COBOL copybook, but they would be complete overkill for this.

Upvotes: 3

Bruce Martin

Reputation: 10543

On the mainframe you can use the sort utility to convert from VB to FB. See using sort to copy from VB to FB. While you are at it, sort can also convert binary fields to text fields.

If the Java is running on zOS you should be able to use IBM's supplied classes to read VB files (see JZOS Java Launcher and Toolkit). What platform are you running the Java on?

As a last resort it is possible to read VB files directly. The format is not difficult, and I can supply code for reading VB files (the difficult bit is transporting the VB file to another platform).
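To give an idea of how simple the format is, here is a minimal sketch of such a reader. It assumes a binary transfer that preserved the record descriptor words (RDWs), that records are not spanned, and that no block descriptor words are present. Each record is then prefixed by a 4-byte RDW: a 2-byte big-endian length that includes the RDW itself, followed by 2 reserved bytes. The class and method names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class VbRecordReader {
    static final int RDW_LENGTH = 4;  // 2-byte length + 2 reserved bytes

    /** Reads every record payload (RDW stripped) from a stream of RDW-prefixed records. */
    static List<byte[]> readRecords(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            int high = data.read();
            if (high < 0) {
                return records;  // clean end of input between records
            }
            int low = data.readUnsignedByte();  // throws EOFException if the RDW is truncated
            data.readUnsignedShort();           // reserved bytes, not used in this sketch
            int recordLength = (high << 8) | low;  // length includes the 4-byte RDW
            if (recordLength < RDW_LENGTH) {
                throw new IOException("Invalid RDW length: " + recordLength);
            }
            byte[] payload = new byte[recordLength - RDW_LENGTH];
            data.readFully(payload);  // throws EOFException if the record is truncated
            records.add(payload);
        }
    }
}
```

The payloads are still in the Mainframe's encoding, so text fields need decoding with the appropriate EBCDIC charset for your site (for example new String(payload, Charset.forName("IBM037")), if that code page applies and your JDK ships it), and binary fields need unpacking separately.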


Finally, can you clarify the question? For example, what platform is the Java running on, how are you reading / transferring the files, and are the files binary files?

I suspect there is actually no need to do a VB to FB conversion because

  • If running java on zOS there are IBM supplied classes that will read VB files.
  • If you are transporting the files to another platform:
    • You can convert any binary field to text fields using the sort utility
    • Do a Text File transfer (with EBCDIC to ASCII translation). For text transfers it should not matter whether it is a FB or VB file.

The only time VB files should be an issue is for Binary VB files.

Upvotes: 4
