Reputation: 1166
I have a requirement where I need to convert a file containing variable-length records into fixed-length records. This is a file from a Mainframe.
Since I don't have access to files on the Mainframe, I need a sample variable-length record file and a way to convert it to fixed-length records.
I am totally new to this kind of file. But if I get an idea of how to map these variable-length records to fixed-length records, I can code it in Java.
This is my variable length file:
1 piyush pankaj 04mathematic10physics 20biology 45
2 vanitha reddy 03physics 30chemistry 60
3 deepesh shetty 05chemistry 5 biology 45
4 jane dsouja 01geography 30chemistry 60biology 45
5 ramadasa hegde 02chemistry 80biology 70
This is how my fields are positioned:
05 ID PIC 99.
05 FNAME PIC X(10).
05 LNAME PIC X(10).
05 NO_SUB PIC X(2).
05 SUBJECTS OCCURS 0 TO 10 TIMES
DEPENDING ON NO_SUB
10 SUB_NAME PIC X(10).
10 MARKS PIC 99.
So I am expecting an output like this:
1 piyush pankaj 04mathematic10
1 piyush pankaj 04physics 20
1 piyush pankaj 04biology 70
2 vanitha reddy 03physics 30
2 vanitha reddy 03chemistry 60
3 deepesh shetty 05chemistry 5
3 deepesh shetty 05biology 45
4 jane dsouja 01geography 30
4 jane dsouja 01chemistry 60
4 jane dsouja 01biology 70
5 ramadasa hegde 02chemistry 80
5 ramadasa hegde 02biology 70
Upvotes: 1
Views: 4310
Reputation: 13076
This is your record-layout in COBOL:
05 ID PIC 99.
05 FNAME PIC X(10).
05 LNAME PIC X(10).
05 NO_SUB PIC X(2).
05 SUBJECTS OCCURS 0 TO 10 TIMES
DEPENDING ON NO_SUB
10 SUB_NAME PIC X(10).
10 MARKS PIC 99.
This is not valid.
Firstly, ID is a Reserved Word, it is a shortening of IDENTIFICATION.
Secondly, NO_SUB is mentioned in OCCURS ... DEPENDING ON ... so it must be numeric (in this case a PIC 99).
Thirdly, there is a full-stop/period missing after NO_SUB in the ODO.
Fourthly, from what you have shown of the data, ID and MARKS are both defined incorrectly. ID (when it has a proper name) should be PIC XX and MARKS either PIC XX or perhaps PIC Z9.
Your data is entirely textual, so there is no problem at all in transferring it from the Mainframe by your preferred method, allowing the transfer to do the EBCDIC to ASCII translation and to delimit your records with whatever is appropriate for your file-system.
You will then have some variable-length records on your system.
The key to being able to understand them is the value in NO_SUB for each individual record.
Each record has a fixed length of 24 bytes (the fields from ID to NO_SUB inclusive).
Records with a NO_SUB of 00 have no data beyond that point; you should just see your record delimiter.
Otherwise, NO_SUB should contain a numeric value of 01-10 inclusive, which will represent the length of the variable part of your data in this way: variable-length = NO_SUB * 12. The 12 is the length of SUB_NAME plus the length of MARKS.
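As a rough illustration of that arithmetic, here is a minimal Java sketch. The class and method names (RecordLength, expectedLength) are made up for this example, and it assumes the record has already been translated to ASCII and that NO_SUB holds two digits.

```java
final class RecordLength {
    static final int FIXED_LENGTH = 24; // ID(2) + FNAME(10) + LNAME(10) + NO_SUB(2)
    static final int ENTRY_LENGTH = 12; // SUB_NAME(10) + MARKS(2)

    /** Expected total record length, derived from the record's own NO_SUB field. */
    static int expectedLength(String record) {
        // NO_SUB sits in bytes 23-24 (1-based), which is indices 22..23 in Java.
        int noSub = Integer.parseInt(record.substring(22, 24));
        return FIXED_LENGTH + noSub * ENTRY_LENGTH;
    }

    public static void main(String[] args) {
        // Build a correctly padded record: 24 fixed bytes plus two 12-byte entries.
        String record = String.format("%-2s%-10s%-10s%2s", "5", "ramadasa", "hegde", "02")
                + String.format("%-10s%2s", "chemistry", "80")
                + String.format("%-10s%2s", "biology", "70");
        System.out.println(expectedLength(record) + " expected, " + record.length() + " actual");
    }
}
```

Comparing the expected length with the actual line length is a cheap way to spot records whose NO_SUB does not match the data that follows.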
To output the data you require, you need the fixed part of the record plus the appropriate variable part, accessed in some type of looping construct. Don't forget that NO_SUB can be zero, in which case there is no variable part, and you'll need to find out what, if anything, to output for that.
Having said all that, there is only one example in your data which is correct, the final record.
If NO_SUB is 03, you should find three blocks of (10-bytes-text, 2-bytes-numeric). If NO_SUB is 05, you should find five similar blocks.
With your final record, you should output:
Byte 1 for a length of 24 + Byte 25 for a length of 12
Byte 1 for a length of 24 + Byte 37 for a length of 12
The variable part of your record starts at byte 25 and is 12 long. You output the first variable part from byte 25, the second from 37, the third from 49 etc.
Start-position-to-output is ( 25 + ( 12 * ( occurrence-in-loop - 1 ) ) ), where 25 is the position of the first variable part of your data, and 12 is the length of each individual element of your variable data.
Giving you:
5 ramadasa hegde 02chemistry 80
5 ramadasa hegde 02biology 70
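The loop described above could be sketched in Java roughly as follows. RecordSplitter and split are hypothetical names, the input is assumed to be ASCII text that is already correctly padded, and a NO_SUB of 00 simply produces no output here because what to emit in that case is left open above.

```java
import java.util.ArrayList;
import java.util.List;

final class RecordSplitter {
    static final int FIXED_LENGTH = 24; // bytes 1-24: ID .. NO_SUB
    static final int ENTRY_LENGTH = 12; // SUB_NAME(10) + MARKS(2)

    /** Returns one 36-byte record (fixed part + one subject entry) per occurrence. */
    static List<String> split(String record) {
        String fixedPart = record.substring(0, FIXED_LENGTH);
        int noSub = Integer.parseInt(fixedPart.substring(22, 24));
        List<String> out = new ArrayList<>();
        for (int occurrence = 1; occurrence <= noSub; occurrence++) {
            // 1-based start position: 25 + 12 * (occurrence - 1)
            int startPosition = 25 + ENTRY_LENGTH * (occurrence - 1);
            int begin = startPosition - 1; // Java string indices are 0-based
            out.add(fixedPart + record.substring(begin, begin + ENTRY_LENGTH));
        }
        return out;
    }

    public static void main(String[] args) {
        String record = String.format("%-2s%-10s%-10s%2s", "5", "ramadasa", "hegde", "02")
                + String.format("%-10s%2s", "chemistry", "80")
                + String.format("%-10s%2s", "biology", "70");
        split(record).forEach(System.out::println);
    }
}
```

A record that is shorter than its NO_SUB implies will make substring throw, so in practice the length should be checked before splitting.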
You should check that NO_SUB is numeric and that it is not greater than 10, and find out what you should do if either of those is not the case.
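One possible shape for that check in Java is shown below. NoSubValidator and parseNoSub are illustrative names only, and returning an empty OptionalInt for bad input is just one choice; what should really happen to a bad record is a business decision.

```java
import java.util.OptionalInt;

final class NoSubValidator {
    static final int MAX_SUBJECTS = 10; // upper bound from OCCURS 0 TO 10 TIMES

    /** Returns NO_SUB if it is two digits in the range 00-10, otherwise empty. */
    static OptionalInt parseNoSub(String record) {
        if (record.length() < 24) {
            return OptionalInt.empty(); // too short to contain the fixed part
        }
        String field = record.substring(22, 24);
        // Accept exactly two ASCII digits; a blank or letter means "not numeric".
        if (!field.matches("[0-9]{2}")) {
            return OptionalInt.empty();
        }
        int noSub = Integer.parseInt(field);
        return noSub <= MAX_SUBJECTS ? OptionalInt.of(noSub) : OptionalInt.empty();
    }
}
```

The caller can then decide whether an empty result means skipping the record, logging it, or stopping the run.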
In Java you can use the substring method to extract the fields (indices are zero-based and the end index is exclusive), e.g.
String id = inputLine.substring(0, 2);
String firstName = inputLine.substring(2, 12);
String lastName = inputLine.substring(12, 22);
String numberOfEntriesStr = inputLine.substring(22, 24);
int numberOfEntries = Integer.parseInt(numberOfEntriesStr);
for (int i = 0; i < numberOfEntries; i++) {
...
}
There are fixed-width Java packages, plus some that can read files using a COBOL copybook, but they would be complete overkill for this.
Upvotes: 3
Reputation: 10543
On the mainframe you can use the sort utility to convert from VB to FB. See using sort to copy from VB to FB. While you are at it, sort can also convert binary fields to text fields.
If the Java is running on z/OS, you should be able to use IBM's supplied classes to read VB files (see JZOS Java Launcher and Toolkit). What platform are you running the Java on?
As a last resort, it is possible to read VB files directly. The format is not difficult, and I can supply code for reading VB files (the difficult bit is transporting the VB file to another platform).
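For reference, a very rough sketch of such a reader follows. It assumes the file was transferred in binary with each record's 4-byte Record Descriptor Word (RDW) preserved and without block descriptor words: the first two RDW bytes hold the big-endian record length including the RDW itself, and the last two are zero for unspanned records. VbReader and readRecords are hypothetical names, not part of any library.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class VbReader {
    /** Reads RDW-prefixed records and returns each payload without its RDW. */
    static List<byte[]> readRecords(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            int first = data.read();
            if (first < 0) {
                break; // clean end of file between records
            }
            int second = data.readUnsignedByte();
            int length = (first << 8) | second; // big-endian, includes the 4-byte RDW
            data.readUnsignedShort();           // remaining two RDW bytes (assumed zero)
            if (length < 4) {
                throw new IOException("Invalid RDW length: " + length);
            }
            byte[] payload = new byte[length - 4];
            data.readFully(payload);            // throws EOFException on a truncated record
            records.add(payload);
        }
        return records;
    }
}
```

The payload bytes are still EBCDIC at this point. Text fields can be decoded with an EBCDIC charset such as IBM037 if your JDK provides it, while binary fields need their own handling.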
Finally, can you clarify the question? For example: what platform is the Java running on, how are you reading/transferring the files, and are the files binary files?
I suspect there is actually no need to do a VB to FB conversion, because the only time VB files should be an issue is for binary VB files.
Upvotes: 4