Reputation: 65
I have a text file that needs to be split at certain character positions along each line. Ideally I would like to simply insert a comma at each designated position so that I can then load it into an MS Access table (or similar). A line from the text looks like so:
"00ZZ101 Bla Bla BlaBlaBlaBla 022000G0132000R6G00BBDJ1000 091030820514 BlaBla Bla 1PP"
I need to extract the text at positions 0:4, 13:29, 30:32, 33:34, and so on.
I need the results to be comma delimited so that I can load them into a table: take the first four characters as one field, then the 13th through the 29th as the next, and so forth. One complication is that the text file has a carriage return at the end of each line (at the 167th character position). So I need to split each line into multiple pieces based on rules that determine which data should be grouped together.
Upvotes: 1
Views: 4123
Reputation: 63709
From the Python console:
>>> s = "00ZZ101 Bla Bla BlaBlaBlaBla 022000G0132000R6G00BBDJ1000 091030820514 BlaBla Bla 1PP"
>>> slices = [(0,4), (13,29), (30,32), (33,34)]
>>> [s[slice(*slc)] for slc in slices]
['00ZZ', 'la BlaBlaBlaBla ', '22', '0']
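To get the comma-delimited line the question asks for, the pieces can then be joined with commas. This is a minimal sketch, assuming that surrounding whitespace in each field is unwanted and that no field itself contains a comma; `to_csv_line` is a hypothetical helper name, not something from the question:

```python
def to_csv_line(line, slices):
    """Cut `line` at the given (start, stop) positions and join the pieces with commas."""
    # Assumption: leading/trailing blanks in a field are padding, so strip them.
    fields = [line[start:stop].strip() for start, stop in slices]
    return ",".join(fields)


s = "00ZZ101 Bla Bla BlaBlaBlaBla 022000G0132000R6G00BBDJ1000 091030820514 BlaBla Bla 1PP"
slices = [(0, 4), (13, 29), (30, 32), (33, 34)]
print(to_csv_line(s, slices))
```

If a field could contain a comma or a quote character, a plain join is not safe and the standard `csv` module is the better tool.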
To apply this to every line of an input text file, read the file and process each line in turn:
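with open('xyzzy.txt') as sourcefile:
    for line in sourcefile:
        line = line.rstrip('\r\n')  # drop the trailing carriage return / newline
        # process each line, e.g. [line[slice(*slc)] for slc in slices]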
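Putting the two pieces together, one possible way to produce a file that MS Access can import is to write each sliced line through the standard `csv` module, which also quotes any field that happens to contain a comma. This is only a sketch under assumptions: `fixed_width_to_csv`, `SLICES`, and the output file name are hypothetical, and the position list covers only the four ranges given in the question.

```python
import csv

# Assumed column positions; extend this list with the remaining ranges.
SLICES = [(0, 4), (13, 29), (30, 32), (33, 34)]


def fixed_width_to_csv(src_path, dst_path, slices=SLICES):
    """Read fixed-width lines from src_path and write comma-delimited rows to dst_path."""
    # newline="" lets the csv module control the line endings of the output file.
    with open(src_path) as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            line = line.rstrip("\r\n")  # drop the carriage return / newline
            if not line:
                continue  # skip blank lines
            # Assumption: blanks around a field are padding, so strip them.
            writer.writerow([line[a:b].strip() for a, b in slices])
```

It could be called as `fixed_width_to_csv('xyzzy.txt', 'xyzzy.csv')`, after which the `.csv` file can be loaded with the Access import wizard.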
Upvotes: 6
Reputation: 197
You have:
s = "00ZZ101 Bla Bla BlaBlaBlaBla 022000G0132000R6G00BBDJ1000 091030820514 BlaBla Bla 1PP"
And in the Python shell:
>>> s = "00ZZ101 Bla Bla BlaBlaBlaBla 022000G0132000R6G00BBDJ1000 091030820514 BlaBla Bla 1PP"
>>> s[0:4]
'00ZZ'
>>> s[13:29]
'la BlaBlaBlaBla '
>>> s[30:32]
'22'
>>>
Upvotes: 0