Kg123

Reputation: 117

Splitting a string then splitting by character number

I have a data file with many lines, some of which contain a set of data pairs. I want to extract pairs of depth and temperature. The pairs begin after character 64, and each pair occupies 17 characters.

Currently I cut the string off at character 64 (index 63 with Python's zero-based counting), then split it every 17 characters. This works fine if there is whitespace between the values, but some values have none between them because the depth is large. Below are examples:

'    1.9901 954.01'
'    1.43011675.01'
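To make the failure concrete, here is a minimal sketch using only the two example strings above. It shows that whitespace splitting separates the first pair but leaves the second as one fused token; the variable names `spaced` and `packed` are just labels chosen for this illustration.

```python
# The two 17-character examples from above.
spaced = '    1.9901 954.01'
packed = '    1.43011675.01'   # the second value fills its field, so no space

# Both chunks have the same fixed length.
print(len(spaced), len(packed))

# Whitespace splitting only works when a space separates the two values.
print(spaced.split())   # two separate tokens
print(packed.split())   # one fused token
```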

The temperature occupies the first 10 characters and the depth the next 7, so I would like to split the line in such a way that I can extract each value separately and then pair them up.

However, I am having trouble creating a split that advances alternately by 7 and 10 characters. I am also not sure what happens when Python converts the string into a list, and whether the character widths are preserved.

Here is my current code:

import os, re
import string

with open('TE_feb_2014.pos','r') as file:
  for line in file:
    m = re.search('(TEMP01)', line)
    if m:
      values = string.split(line[63:])
      #print values
      n = 17
      [values[i:i+n] for i in range(0, len(values), n)]
      print values

Here is an example data line (without the issue described above):

00087501  297017Q990066614201402251006TE       42550TEMP01  18D   2.01   -1.2801  50.01   -1.1601  99.01   -0.5901 148.01   -0.8001 197.01   -1.1001 245.01   -1.7501 295.01   -1.7701 301.01   -1.7801 343.01   -1.7301 392.01   -1.6701 441.01   -1.5901 489.01   -1.4501 538.01   -1.1401 587.01   -0.7201 635.01   -0.3201 684.01    0.3501 731.01    0.6201 733.01    0.6201

Upvotes: 2

Views: 197

Answers (1)

Avinash Raj

Reputation: 174706

It seems you want something like this:

>>> import re
>>> s = "00087501  297017Q990066614201402251006TE       42550TEMP01  18D   2.01   -1.2801  50.01   -1.1601  99.01   -0.5901 148.01   -0.8001 197.01   -1.1001 245.01   -1.7501 295.01   -1.7701 301.01   -1.7801 343.01   -1.7301 392.01   -1.6701 441.01   -1.5901 489.01   -1.4501 538.01   -1.1401 587.01   -0.7201 635.01   -0.3201 684.01    0.3501 731.01    0.6201 733.01    0.6201"
>>> m = re.sub(r'^.{64}', r'', s)   # Remove the first 64 characters from the input string.
>>> re.findall(r'.{1,17}', m)       # Find every chunk of up to 17 characters (the last chunk may be shorter).
['  2.01   -1.2801 ', ' 50.01   -1.1601 ', ' 99.01   -0.5901 ', '148.01   -0.8001 ', '197.01   -1.1001 ', '245.01   -1.7501 ', '295.01   -1.7701 ', '301.01   -1.7801 ', '343.01   -1.7301 ', '392.01   -1.6701 ', '441.01   -1.5901 ', '489.01   -1.4501 ', '538.01   -1.1401 ', '587.01   -0.7201 ', '635.01   -0.3201 ', '684.01    0.3501 ', '731.01    0.6201 ', '733.01    0.6201']
>>> for i in re.findall(r'.{1,17}', m):
        print(i)


  2.01   -1.2801 
 50.01   -1.1601 
 99.01   -0.5901 
148.01   -0.8001 
197.01   -1.1001 
245.01   -1.7501 
295.01   -1.7701 
301.01   -1.7801 
343.01   -1.7301 
392.01   -1.6701 
441.01   -1.5901 
489.01   -1.4501 
538.01   -1.1401 
587.01   -0.7201 
635.01   -0.3201 
684.01    0.3501 
731.01    0.6201 
733.01    0.6201
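
Since the underlying problem is fixed-width fields rather than whitespace, plain slicing may be a more direct fit for values that run together. The following is only a sketch under stated assumptions: the helper name `parse_pairs` and its parameters are hypothetical. The layout is inferred from the sample line in the question: data starting at index 63, as in the question's `line[63:]`, with a 7-character depth followed by a 10-character temperature. Adjust `start` and `widths` if your file differs.

```python
def parse_pairs(line, start=63, widths=(7, 10)):
    """Cut fixed-width records out of line[start:] and return tuples of floats.

    Hypothetical helper: start and widths are assumptions read off the
    sample line (7-character depth, then 10-character temperature).
    """
    data = line[start:].rstrip('\r\n')
    record_len = sum(widths)
    pairs = []
    # Walk the data one full record at a time; an incomplete tail is ignored.
    for pos in range(0, len(data) - record_len + 1, record_len):
        fields = []
        offset = pos
        for width in widths:
            # float() tolerates the padding spaces inside each field.
            fields.append(float(data[offset:offset + width]))
            offset += width
        pairs.append(tuple(fields))
    return pairs


# Synthetic line: 63 filler characters, then two records. The second depth
# fills its whole field, so no whitespace separates it from the value
# before it.
line = 'x' * 63 + '   2.01   -1.2801' + '1675.01    1.4301' + '\n'
print(parse_pairs(line))   # [(2.01, -1.2801), (1675.01, 1.4301)]
```

If the temperature comes first in your records, as in the two short examples in the question, passing `widths=(10, 7)` should give the same kind of result with the order swapped. A completely blank field would make `float()` raise `ValueError`, so real data may need a check for that.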

Upvotes: 2
