Reputation: 127
I am a bit familiar with Python. I have a file with information that I need to read in a very specific way. Below is an example...
1
6
0.714285714286
0 0 1.00000000000
0 1 0.61356352337
...
-1 -1 0.00000000000
0 0 5.13787636499
0 1 0.97147643932
...
-1 -1 0.00000000000
0 0 5.13787636499
0 1 0.97147643932
...
-1 -1 0.00000000000
0 0 0 0 5.13787636499
0 0 0 1 0.97147643932
....
So every file will have this structure (tab delimited). The line -1 -1 0.0000000000 marks the end of a block. Each block is 'n' lines long. The first two numbers on a line give the position at which the third number on that line is to be inserted in an array. Only the unique positions are listed (so position 0 1 would be the same as 1 0, but that information is not shown). What I need is the full array (the 0 1 value should also appear in 1 0).
Upvotes: 1
Views: 1335
Reputation: 7618
I rewrote the code. It is now almost what you need and only requires fine-tuning.
I decided to keep the old answer as well, since it may still be helpful: the new version does more and may be harder to follow.
def the_function(filename):
    """
    Returns a tuple: a list of the independent values and a list of sparse arrays as dicts,
    e.g. ( [1, 2, 0.5], [{(0,0): 1, (0,1): 2}, ...] )
    On failure prints the reason and returns None,
    e.g. 'failed on text.txt: invalid literal for int() with base 10: '0.0', line: 5'
    """
    # open file and read content
    try:
        with open(filename, "r") as f:
            data_txt = [line.split() for line in f]
    # no such file
    except IOError as e:
        print('fail on open ' + str(e))
        return
    # try to get the first 3 variables
    try:
        vars = [int(data_txt[0][0]), int(data_txt[1][0]), float(data_txt[2][0])]
    except ValueError as e:
        print('failed on ' + filename + ': ' + str(e) + ', somewhere on lines 1-3')
        return
    # now get arrays
    arrays = [dict()]
    for lineidx, item in enumerate(data_txt[3:]):
        try:
            # for 2d array data
            if len(item) == 3:
                i, j = map(int, item[:2])
                val = float(item[-1])
                # check for 'block separator'
                if (i, j, val) == (-1, -1, 0.0):
                    # make new array
                    arrays.append(dict())
                else:
                    # update last, existing
                    arrays[-1][(i, j)] = val
            # almost the same for 4d array data
            elif len(item) == 5:
                i, j, k, m = map(int, item[:4])
                val = float(item[-1])
                arrays[-1][(i, j, k, m)] = val
        # if value is unparsable like '0.00' for int or 'text'
        except ValueError as e:
            # the data starts on line 4 of the file (counting from 1)
            print('failed on ' + filename + ': ' + str(e) + ', line: ' + str(lineidx + 4))
            return
    return vars, arrays
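The dicts returned above only hold the positions listed in the file. If you want the full array from the question (the 0 1 value also in 1 0), something like the following might do. It is only a sketch: `to_symmetric_matrix` is a name I made up, and it assumes two-index keys with non-negative indices, so it would not apply to the four-index entries.

```python
def to_symmetric_matrix(sparse):
    """Expand a {(i, j): value} dict into a dense, symmetric list of lists.

    Hypothetical helper: assumes non-negative indices and that (i, j) and
    (j, i) are meant to hold the same value.
    """
    if not sparse:
        return []
    # the matrix must be large enough for the biggest index seen
    size = max(max(i, j) for i, j in sparse) + 1
    matrix = [[0.0] * size for _ in range(size)]
    for (i, j), value in sparse.items():
        matrix[i][j] = value
        matrix[j][i] = value  # mirror the value across the diagonal
    return matrix


# sample block using values from the question's example
block = {(0, 0): 1.0, (0, 1): 0.61356352337}
dense = to_symmetric_matrix(block)
```

Positions that never appear in the file are left at 0.0 here; whether that is the right default depends on your data.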
Upvotes: 3
Reputation: 7618
As I understand what you asked for:
# read data from file into list
parsed = []
with open(filename, "r") as f:
    for line in f:
        # # you can exclude the separator here with code like this (uncomment) (1)
        # # comparing the split fields avoids trouble with tabs and the trailing newline,
        # # but be careful: one zero more or less in the last field and it won't match
        # if line.split() == ['-1', '-1', '0.00000000000']:
        #     continue
        parsed.append(line.split())
# a simpler version
with open(filename, "r") as f:
    # # you can exclude the separator here with code like this (uncomment, replace) (2)
    # parsed = [line.split() for line in f if line.split() != ['-1', '-1', '0.00000000000']]
    parsed = [line.split() for line in f]
# at this point 'parsed' is a list of lists of strings.
# [['1'],['6'],['0.714285714286'],['0', '0', '1.00000000000'],['0', '1', '0.61356352337'] .. ]
# ALT 1 -------------------------------
# we do know the len of each data block
# get the first 3 lines:
head = parsed[:3]
# get the body:
body = parsed[3:-2]
# get the last 2 lines:
tail = parsed[-2:]
# now you can do anything you want with your data
# but remember to convert str to int or float
# first 3 as unique:
unique0 = int(head[0][0])
unique1 = int(head[1][0])
unique2 = float(head[2][0])
# cast body:
# check each item of body has 3 inner items
is_correct = all(len(item) == 3 for item in body)
# parse str and cast
if is_correct:
    for i, j, v in body:
        # # you can exclude the separator here (uncomment) (3)
        # # * 0. is the same as float(0)
        # if (int(i), int(j), float(v)) == (-1, -1, 0.):
        #     # here we skip the iteration for the line w/ '-1 -1 0.0...'
        #     # but you can place other code here that will be executed
        #     # at the point where block-termination lines appear
        #     continue
        # some_body_cast_function is a placeholder for your own function
        some_body_cast_function(int(i), int(j), float(v))
else:
    raise Exception('incorrect body')
# cast tail
# check each item of tail has 5 inner items
is_correct = all(len(item) == 5 for item in tail)
# parse str and cast
if is_correct:
    for i, j, k, m, v in tail:  # 'l' is a bad index name, because it looks like 1.
        # some_tail_cast_function is a placeholder for your own function
        some_tail_cast_function(int(i), int(j), int(k), int(m), float(v))
else:
    raise Exception('incorrect tail')
# ALT 2 -----------------------------------
# we do NOT know the len of each data block
# maybe we have some array?
array = dict()  # your array may be of another type
v1, v2, v3 = parsed[:3]
unique0 = int(v1[0])
unique1 = int(v2[0])
unique2 = float(v3[0])
for item in parsed[3:]:
    if len(item) == 3:
        i, j, v = item
        i = int(i)
        j = int(j)
        v = float(v)
        # # you can exclude the separator here (uncomment) (4)
        # # * 0. is the same as float(0)
        # # logic is the same as in the 3rd variant
        # if (i, j, v) == (-1, -1, 0.):
        #     continue
        # do your stuff
        # for example,
        array[(i, j)] = v
        array[(j, i)] = v
    elif len(item) == 5:
        i, j, k, m, v = item
        i = int(i)
        j = int(j)
        k = int(k)
        m = int(m)
        v = float(v)
        # do your stuff
    else:
        raise Exception('unsupported')  # or, maybe just 'pass'
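If you would rather keep the blocks apart than put everything into one array, the separator line can be used to cut the body into blocks first. A rough sketch, assuming the header lines are already removed and that any three-field row starting with -1 -1 is a separator; `split_blocks` is a made-up name:

```python
def split_blocks(rows, separator=("-1", "-1")):
    """Group rows (lists of strings) into blocks, ending a block at each separator row.

    Hypothetical helper: a row counts as a separator when it has three fields
    and its first two fields are '-1' and '-1'.
    """
    blocks = [[]]
    for row in rows:
        if len(row) == 3 and tuple(row[:2]) == separator:
            blocks.append([])  # separator seen: start a new block
        else:
            blocks[-1].append(row)
    # drop empty blocks, e.g. the one opened by a trailing separator
    return [block for block in blocks if block]


# sample rows shaped like the body of the question's file (header lines removed)
sample = """0 0 1.00000000000
0 1 0.61356352337
-1 -1 0.00000000000
0 0 5.13787636499
0 1 0.97147643932
-1 -1 0.00000000000
0 0 0 0 5.13787636499
0 0 0 1 0.97147643932"""
rows = [line.split() for line in sample.splitlines()]
blocks = split_blocks(rows)
```

Each entry of `blocks` is then a list of still-unconverted string rows, so the int/float casts shown above are applied per block.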
Upvotes: 2
Reputation: 60924
To read lines from a file iteratively, you can use something like:
with open(filename, "r") as f:
    var1 = int(next(f))
    var2 = int(next(f))
    var3 = float(next(f))
    for line in f:
        pass  # do some stuff particular to the line we are on...
Just create some data structures outside the loop, and fill them in the loop above. To split strings into elements, you can use:
>>> "spam ham".split()
['spam', 'ham']
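Putting the two pieces together might look roughly like this. It is a sketch only: an in-memory `io.StringIO` stands in for the real file so it runs on its own, the sample values come from the question, and the dict `entries` is just one possible data structure.

```python
import io

# in-memory stand-in for the file, with values taken from the question's example
sample = io.StringIO(
    u"1\n6\n0.714285714286\n0\t0\t1.00000000000\n0\t1\t0.61356352337\n"
)

# the first three lines are read one at a time with next()
var1 = int(next(sample))
var2 = int(next(sample))
var3 = float(next(sample))

# the data structure is created outside the loop and filled inside it
entries = {}
for line in sample:
    fields = line.split()  # split() with no argument handles tabs and the newline
    i, j = int(fields[0]), int(fields[1])
    entries[(i, j)] = float(fields[2])
```

With a real file you would use `with open(filename, "r") as f:` in place of the `StringIO` object and decide what to do with the separator and five-field lines inside the loop.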
I also think you want to take a look at the numpy library for array data structures, and possibly the SciPy library for analysis.
Upvotes: 1