Reputation: 117
I have a data file with many lines, some of which contain a set of data pairs. I want to extract the pairs of depth and temperature. The pairs begin after character 64, and each pair occupies 17 characters.
What I am doing currently is cutting off the string at character 64 (63 with Python's zero-based indexing), then splitting the remainder every 17 characters. This works fine when the values are separated by whitespace, but some are not, because the depth is large. Below are examples:
' 1.9901 954.01'
' 1.43011675.01'
The temperature occupies the first 10 characters and the depth the next 7. So what I would like to do is split the line in such a way that I can extract all of the values separately and then pair them up.
However, I am having trouble creating a split that alternates between widths of 7 and 10. I am also not sure what happens when Python converts the string into a list, and whether the character positions are preserved.
Here is my working code:
import os, re
import string
with open('TE_feb_2014.pos','r') as file:
    for line in file:
        m = re.search('(TEMP01)', line)
        if m:
            values = string.split(line[63:])
            #print values
            n = 17
            [values[i:i+n] for i in range(0, len(values), n)]
            print values
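For illustration, here is a small sketch of why a whitespace split loses the column positions. The two 17-character strings are made up to match the layout described above (10 characters for the first value, 7 for the second), and the sketch uses the `str.split` method rather than the `string.split` function:

```python
# Two hypothetical 17-character pairs: 10 characters, then 7 characters.
spaced = "    1.9901 954.01"   # whitespace between the two values
packed = "    1.43011675.01"   # second value fills all 7 of its columns

# split() breaks on whitespace and returns a list of tokens,
# so the packed pair comes back as a single fused token.
print(spaced.split())  # ['1.9901', '954.01']
print(packed.split())  # ['1.43011675.01']

# Slicing the string itself keeps the column positions intact.
print(packed[:10].strip(), packed[10:17].strip())  # 1.4301 1675.01
```

Once the line has been split into a list, slicing with a step of 17 groups list items rather than characters, so the original column widths are no longer available.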
Here is an example data line (without the issue described above):
00087501 297017Q990066614201402251006TE 42550TEMP01 18D 2.01 -1.2801 50.01 -1.1601 99.01 -0.5901 148.01 -0.8001 197.01 -1.1001 245.01 -1.7501 295.01 -1.7701 301.01 -1.7801 343.01 -1.7301 392.01 -1.6701 441.01 -1.5901 489.01 -1.4501 538.01 -1.1401 587.01 -0.7201 635.01 -0.3201 684.01 0.3501 731.01 0.6201 733.01 0.6201
Upvotes: 2
Views: 197
Reputation: 174706
It seems like you want something like this:
>>> import re
>>> s = "00087501 297017Q990066614201402251006TE 42550TEMP01 18D 2.01 -1.2801 50.01 -1.1601 99.01 -0.5901 148.01 -0.8001 197.01 -1.1001 245.01 -1.7501 295.01 -1.7701 301.01 -1.7801 343.01 -1.7301 392.01 -1.6701 441.01 -1.5901 489.01 -1.4501 538.01 -1.1401 587.01 -0.7201 635.01 -0.3201 684.01 0.3501 731.01 0.6201 733.01 0.6201"
>>> m = re.sub(r'^.{64}', r'', s) # Remove the first 64 characters from the input string.
>>> re.findall(r'.{1,17}', m) # Split the rest into chunks of up to 17 characters (the last chunk may be shorter).
[' 2.01 -1.2801 ', ' 50.01 -1.1601 ', ' 99.01 -0.5901 ', '148.01 -0.8001 ', '197.01 -1.1001 ', '245.01 -1.7501 ', '295.01 -1.7701 ', '301.01 -1.7801 ', '343.01 -1.7301 ', '392.01 -1.6701 ', '441.01 -1.5901 ', '489.01 -1.4501 ', '538.01 -1.1401 ', '587.01 -0.7201 ', '635.01 -0.3201 ', '684.01 0.3501 ', '731.01 0.6201 ', '733.01 0.6201']
>>> for i in re.findall(r'.{1,17}', m):
...     print(i)
2.01 -1.2801
50.01 -1.1601
99.01 -0.5901
148.01 -0.8001
197.01 -1.1001
245.01 -1.7501
295.01 -1.7701
301.01 -1.7801
343.01 -1.7301
392.01 -1.6701
441.01 -1.5901
489.01 -1.4501
538.01 -1.1401
587.01 -0.7201
635.01 -0.3201
684.01 0.3501
731.01 0.6201
733.01 0.6201
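Each 17-character chunk still has to be separated into its two values, and a whitespace split will not do that when the values touch. Below is a minimal sketch that slices by column width instead, assuming the layout from the question (a first field of 10 characters followed by a second field of 7). The function name `parse_pairs`, its parameters, and the sample line are made up for illustration, and the offset may need to be 63 or 64 depending on the file:

```python
def parse_pairs(line, offset=64, first_width=10, second_width=7):
    """Return (first, second) float tuples from the fixed-width tail of line."""
    data = line[offset:].rstrip("\r\n")
    pair_width = first_width + second_width
    pairs = []
    for start in range(0, len(data), pair_width):
        chunk = data[start:start + pair_width]
        # Slice by position so values with no space between them still separate.
        first = chunk[:first_width].strip()
        second = chunk[first_width:].strip()
        if not first or not second:
            continue  # skip a blank or truncated trailing chunk
        pairs.append((float(first), float(second)))
    return pairs


# Hypothetical line: 64 filler characters, then two 17-character pairs.
sample = "X" * 64 + "    1.9901 954.01" + "    1.43011675.01"
print(parse_pairs(sample))  # [(1.9901, 954.01), (1.4301, 1675.01)]
```

If the real file puts the narrower field first, swapping the two width arguments should be enough.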
Upvotes: 2