Reputation: 55
After many years away from programming, I've decided to take it up again for fun, and I'm finding that I enjoy it quite a lot. While looking for things to code, I found this data, which is openly available from Network Rail in the United Kingdom.
Among other things, you can obtain schedule data, which is a list of all train, bus and ferry journeys.
A schedule record for a train journey might look like this:
BSNY819581902281902280001000 PEE5A99 122112002 EMU390 125
BX VTY
LOMNCRPIC 2131 00008 FL TB
LIARDWCKJ 2133 00000000
LISLDLJN 2134H00000000 SL H
LIHTNOJN 2138 00000000 FL 1H
LISTKP 2140H000000002 SL
LISTKPE1 2141H00000000
LIADSWDRD 2142H00000000
LICHDH 2143 000000002
LIWLMSL 2146 000000004 1
LIALDEDGE 2148 00000000 1 3
LISBCH 2200 000000001 FL
LICREWSBG 2203 00000000
LICREWUML 2204 00000000
LICREWE 2206 000000001 FL H
LICREWBHJ 2207H00000000 1
LIMADELEY 2212 00000000 FL FL 5H
LINTNB 2222H00000000 FL FL
LISTAFFDJ 2226H00000000 SL
LISTAFFRD 2228H2231H 000000004 SL SL A C
LISTAFTVJ 2233 00000000
LIPNKRDG 2236H00000000 1 1
LIBSBYJN 2244 00000000
LIPBLJWM 2248 00000000
LIDRLSTNJ 2251H00000000
LIBSCTSTA 2252H00000000 1
LIPRYBRNJ 2257 00000000 7
LIASTON 2306H000000002
LISTECHFD 2311H00000000
LIBHAMINT 2315 000000004
LIBKSWELL 2318H00000000 1H
LICOVNTRY 2324 000000001 2 3
LIRUGBTVJ 2336 00000000 UNL 3
LIRUGBY 2340 000000005 UNLUNL 1H
LIHMTNJ 2343 00000000 1H
LIDVNTYNJ 2346H00000000
LILNGBKBY 2351 00000000 1 1
LINMPTN 0001H000000001 6
LIHANSLPJ 0016 00000000 SL
LIMKNSCEN 0021 000000001 SL SL
LIBLTCHLY 0023 000000004 SL SL 5 1H
LILEDBRNJ 0036 00000000 SL SL 2
LITRING 0042 000000002 SL SL 2H
LIBONENDJ 0048H00000000 SL SL 1H
LIWATFDJ 0056 000000009 SL SL 1H
LIHROW 0101H000000006 SL SL 3
LIWMBY 0107H000000006 CL
LTWMBYICD 0117H0000 TF
tl;dr: the first two lines describe what type of train is running, when it runs, how fast, etc. The other lines describe the points the train will pass and the time it is expected to pass each one. The main takeaway is that each record has a different length depending on the journey.
When I saw this, I thought "This would be a great thing to try and mess around with in COBOL." I went to polytech and learned PASCAL and COBOL, but only had to deal with files with a consistent length and consistent data, not something like this.
I spent a couple of hours trying to find some sort of answer to this on Google, but nothing really showed, hence my asking.
Just for reference, I have managed to do this in GW-BASIC, and could do it in elementary Python if needed, but COBOL, being what it is, is a whole different kettle of fish.
Is it possible to read something like this into COBOL without having to resort to witchcraft, or is it just in the "too hard" basket? I'm only doing this for fun, so it's really no big deal.
Any responses or feedback would be most welcome.
Many thanks,
Joseph.
Upvotes: 1
Views: 2028
Reputation: 143
COBOL input files can have VARIABLE or FIXED length records. If the records are variable, the first positions of each record hold that record's length.
IBM documents this on its official site under "Processing Files with Variable Length Records".
Upvotes: 1
Reputation: 331
To expand on @Simon Sobisch's answer.
Looking at the data, and trying to work it out, I can see these things.
The top two lines, as you say, describe the type of train and so on.
Then you have a line starting LO, which must be the start of the journey. The next 7 characters would be the station, with MNCRPIC presumably being "Manchester Piccadilly". Then there's a space, then four digits which would be a time.
Then you have a load of lines starting LI, which are intermediate points. Some of those have an "H" after the time, others don't. This will be a problem if you're going to do UNSTRING DELIMITED BY SPACE. I'm going to presume that H means Halt.
LISTAFFRD 2228H2231H 000000004 SL SL A C is an odd-looking line.
At the end we have LT, which is the end of the journey, arriving at WMBYICD at 0117.
    01  TRAIN-SCHEDULE.
        03  RECORD-TYPE              PIC XX.
            88  JOURNEY-START            VALUE 'LO'.
            88  JOURNEY-INTERMEDIATE     VALUE 'LI'.
            88  JOURNEY-TERMINATE        VALUE 'LT'.
        03  TRAIN-STATION            PIC X(7).
        03  FILLER                   PIC X(11).
        03  TRAIN-TIME.
            05  TRAIN-TIME-HH        PIC 99.
            05  TRAIN-TIME-MM        PIC 99.
        03  TRAIN-HALT-FLAG          PIC X.
            88  TRAIN-STOPS-HERE         VALUE 'H'.
And so on.
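For comparison, since the question mentions Python as a fallback, here is a rough sketch of the same fixed-column idea in that language. The column offsets are copied straight from the layout above (2 characters of record type, 7 of station, 11 of filler, 4 of time, 1 flag), which assumes the raw file keeps its padding spaces even though the sample as displayed here seems to have lost them. The names LocationRecord and parse_location are made up for illustration, and the flag is treated as "halt" only because that is the presumption made above.

```python
from typing import NamedTuple


class LocationRecord(NamedTuple):
    """One LO/LI/LT line, mirroring the TRAIN-SCHEDULE layout above."""
    record_type: str   # 'LO', 'LI' or 'LT'
    station: str       # 7-character station code, trailing spaces removed
    hour: int
    minute: int
    halt_flag: bool    # True when the character after the time is 'H'


def parse_location(line: str) -> LocationRecord:
    """Slice a location line by fixed column positions.

    Assumed offsets (taken from the COBOL layout, not from a spec):
    0-1 record type, 2-8 station, 9-19 filler, 20-23 time, 24 flag.
    """
    record_type = line[0:2]
    if record_type not in ("LO", "LI", "LT"):
        raise ValueError(f"not a location record: {line!r}")
    station = line[2:9].rstrip()
    time_text = line[20:24]
    # int() raises ValueError if the time columns do not hold four digits
    hour = int(time_text[0:2])
    minute = int(time_text[2:4])
    flag = line[24:25]  # slice, so a short line gives '' instead of an error
    return LocationRecord(record_type, station, hour, minute, flag == "H")
```

A line built with the assumed padding, such as "LI" + "SLDLJN " + 11 spaces + "2134H", would come back as station SLDLJN at 21:34 with the flag set. If the real file uses different column positions, only the slice offsets need to change.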
Upvotes: 1
Reputation: 7287
Actually (after reformatting the question) I think COBOL is perfect for this job, as the data is fixed-length (it may have come out of COBOL, too...).
OPEN INPUT file, then READ until end of file. Depending on the number of lines in that data you may either process the records directly, move them to a table (have a look at OCCURS), or WRITE them to another file (likely an INDEXED one with multiple KEY definitions).
Upvotes: 2
Reputation: 10543
Yes, it is possible. For the file, use Line Sequential organization.
The file definition:
    select lineseq assign to "lineseq.dat"
        organization is line sequential.
To split the lines up, use UNSTRING, e.g.
    UNSTRING in-line
        DELIMITED BY SPACES
        into item-1, item-2, item-3
    END-UNSTRING
It is probably easier to do in languages like Python.
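To give an idea of what the Python route might look like, here is a minimal sketch that splits each line on whitespace (roughly what the UNSTRING above does) and groups the lines into journeys of varying length. It assumes, judging only from the sample in the question, that every journey starts with a line beginning BS and ends with a line beginning LT. The function names split_fields and group_journeys are invented for this example.

```python
from typing import Iterable, Iterator, List


def split_fields(line: str) -> List[str]:
    """Split one line on runs of whitespace."""
    return line.split()


def group_journeys(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield one journey at a time as a list of split lines.

    Assumption from the sample data: a journey starts at a 'BS' line
    and ends at an 'LT' line, with any number of lines in between.
    """
    journey: List[List[str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("BS") and journey:
            # previous journey never reached an LT line; emit what we have
            yield journey
            journey = []
        journey.append(split_fields(line))
        if line.startswith("LT"):
            yield journey
            journey = []
    if journey:
        yield journey  # trailing lines without a closing LT
```

Passing an open file object, for example group_journeys(open("lineseq.dat")), would work the same way as passing a list of strings. Note that a field such as 2134H00000000 stays fused into one token, so splitting on spaces alone does not separate the time from what follows it; fixed column positions are still needed for that.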
Upvotes: 2