Reputation: 1
I have a large ASCII dataset (2.7 GB) which I believe is in an IMS hierarchical format. I'm unsure how to access the data and get it into a usable database. I would guess SQL, but I am open to other solutions. This is the "Layout" that came with the database, if it's at all helpful...
Upvotes: 0
Views: 525
Reputation: 418
You are missing some key information here. In addition to the layout you pasted, you would want the IMS Database Descriptor (DBD) file. The DBD describes the structure of the database. An IMS database can have many segments (aka tables) in it, and the DBD describes them along with other information such as the size of those segments.
The actual records will be stored in a flat file (probably the 2.7 GB ASCII file you mentioned) in depth-first order. Say you had two segments, A and B, where B is a child of A. Your flat file might look like this: A1, B1, B2, B3, A2, B4, B5, where B1, B2, and B3 are children of A1, and B4 and B5 are children of A2. This matters because your layout only provides an overlay for one specific segment structure.
So if your database has more than the one segment UIMNH10, you won't know where in the ASCII file to start applying the layout.
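To make the depth-first ordering concrete, here is a minimal Python sketch that regroups the A/B example above into parents and children. The function name `group_depth_first` and the convention that parent labels start with "A" are assumptions made purely for illustration. In a real unload file you would identify the segment type from the DBD and the segment id, not from a letter prefix.

```python
def group_depth_first(segments):
    """Attach each child segment to the most recent parent that preceded it."""
    tree = {}
    current_parent = None
    for seg in segments:
        if seg.startswith("A"):  # assumption: parents are labelled "A..."
            current_parent = seg
            tree[current_parent] = []
        else:
            if current_parent is None:
                raise ValueError(f"child {seg!r} appears before any parent")
            tree[current_parent].append(seg)
    return tree


# The example stream from the paragraph above.
stream = ["A1", "B1", "B2", "B3", "A2", "B4", "B5"]
tree = group_depth_first(stream)
for parent, children in tree.items():
    print(parent, children)
```

The point is only that a child's parent is implied by position in the file, which is why you need to know every segment type before you can overlay a layout.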
Now let's make a HUGE assumption here that your database only has one segment, UIMNH10. In that case your ASCII file would look like: A1, A2, A3, A4. That's pretty straightforward, as you would apply your layout over the data repeatedly.
Luckily your data structures are pretty straightforward, as it's all character data. You would interpret PIC X(n) as a character string of length n. Similarly, PIC 9(n) is a string of n numeric characters.
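As a rough illustration of that rule, the sketch below turns a simple PIC clause into a field type and width. The helper `pic_width` is hypothetical, and it deliberately handles only the plain `PIC X(n)` and `PIC 9(n)` forms mentioned here. Anything else (signs, implied decimals, packed fields) is rejected rather than guessed at.

```python
import re

# Matches only the two simple forms discussed above: PIC X(n) and PIC 9(n).
PIC_PATTERN = re.compile(r"^PIC\s+([X9])\((\d+)\)$")


def pic_width(clause):
    """Return (kind, length) for a simple PIC clause, e.g. ('X', 2)."""
    match = PIC_PATTERN.match(clause.strip().upper())
    if match is None:
        raise ValueError(f"unsupported PIC clause: {clause!r}")
    return match.group(1), int(match.group(2))


print(pic_width("PIC X(2)"))
print(pic_width("PIC 9(2)"))
```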
Assuming your sample data starts with: AA201805...
RRC-H10-SEGMENT-ID is 'AA' because it's PIC X(2)
MN-H10-CENTURY is '20' because it's PIC 9(2)
MN-H10-YEAR is '18' because it's PIC 9(2)
MN-H10-MONTH is '05' because it's PIC 9(2)
You would do this until you reach the end of your layout and then start again at the beginning for your next record. This is also making an ASSUMPTION that the layout definition MATCHES the length of your record.
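Here is a minimal Python sketch of that overlay process, under the same assumptions (a single segment type and fixed-length records with no delimiters). The `LAYOUT` list holds only the four fields walked through above, so the record length of 8 is an artifact of this truncated example. The second record in `data` is made up for illustration.

```python
# Only the first four fields of the layout, taken from the walkthrough above.
LAYOUT = [
    ("RRC-H10-SEGMENT-ID", 2),  # PIC X(2)
    ("MN-H10-CENTURY", 2),      # PIC 9(2)
    ("MN-H10-YEAR", 2),         # PIC 9(2)
    ("MN-H10-MONTH", 2),        # PIC 9(2)
]
RECORD_LENGTH = sum(width for _, width in LAYOUT)


def parse_record(record, layout=LAYOUT):
    """Slice one fixed-width record into a dict of field name -> string."""
    fields = {}
    offset = 0
    for name, width in layout:
        fields[name] = record[offset:offset + width]
        offset += width
    return fields


def split_records(data, record_length=RECORD_LENGTH):
    """Cut a buffer into records; fails if the layout length does not fit."""
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


# First record is the sample from above; the second is hypothetical.
data = "AA201805" + "AA201806"
rows = [parse_record(rec) for rec in split_records(data)]
for row in rows:
    print(row)
```

The length check in `split_records` is a cheap way to test the assumption that the layout matches the record length. For a 2.7 GB file you would read in chunks rather than load everything into memory.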
Your best bet is to work with your IMS database administrator to confirm these assumptions. Once you know your starting points, you should be able to map the data yourself or write a quick program to do it for you. There are other alternatives as well, but they assume some back-end setup, such as SQL support, to read the data and dump it into a CSV file for Excel.
Upvotes: 0
Reputation: 10553
If you do not have a programming background, you are in big trouble! Excel and MS Access will not help you much.
So the answer is:
Hire some programmers with Cobol / Cobol conversion experience!
The Cobol Copybook tells you the format of the file. The format of UIC-MN-H10-SEGMENT is
2-byte segment id (10 ???)
4-byte year
2-byte month
4-byte average injection pressure, etc.
This is a multi-record file.
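To show what "multi-record" implies for a reader program, here is a small Python sketch that walks a buffer and picks the record length from the leading 2-byte segment id. Everything specific in it is hypothetical: the ids "10" and "20", their lengths, and the sample data are invented for illustration. It also assumes records are stored back to back with no line terminators, which you would need to confirm against the real file.

```python
def read_segments(data, lengths):
    """Yield (segment_id, record) pairs, choosing each length by segment id."""
    pos = 0
    while pos < len(data):
        seg_id = data[pos:pos + 2]
        if seg_id not in lengths:
            raise ValueError(f"unknown segment id {seg_id!r} at offset {pos}")
        length = lengths[seg_id]
        record = data[pos:pos + length]
        if len(record) < length:
            raise ValueError(f"truncated record at offset {pos}")
        yield seg_id, record
        pos += length


# Hypothetical segment ids and record lengths, for illustration only.
lengths = {"10": 12, "20": 6}
sample = "10201805ABCD" + "20XYZW" + "10201806EFGH"
segments = list(read_segments(sample, lengths))
for seg_id, record in segments:
    print(seg_id, record)
```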
RecordEditor might be able to display the file (size might be a problem). Also, the RecordEditor will take a bit of getting used to.
Cobol (e.g. GNU Cobol) will need Cobol programmers.
Java / JRecord needs Java programmers.
To give a more meaningful answer, please supply the Cobol copybook in text format and some sample data.
Upvotes: 0