LordStryker
LordStryker

Reputation: 127

Read in file contents to Arrays

I am a bit familiar with Python. I have a file with information that I need to read in a very specific way. Below is an example...

1
6
0.714285714286
0    0    1.00000000000
0    1    0.61356352337
...
-1  -1    0.00000000000
0    0    5.13787636499
0    1    0.97147643932
...
-1  -1    0.00000000000
0    0    5.13787636499
0    1    0.97147643932
...
-1  -1    0.00000000000
0 0 0 0   5.13787636499
0 0 0 1   0.97147643932
....

So every file will have this structure (tab delimited).

What I need

Upvotes: 1

Views: 1335

Answers (3)

akaRem
akaRem

Reputation: 7618

I rewrote the code. Now it's almost what you need. You only need fine tuning.

I decided to leave the old answer - perhaps it would be helpful too. Because the new is feature-rich enough, and sometimes may not be clear to understand.

def the_function(filename):
    """
    returns tuple of list of independent values and list of sparsed arrays as dicts
    e.g. ( [1,2,0.5], [{(0.0):1,(0,1):2},...] )
    on fail prints the reason and returns None:
    e.g. 'failed on text.txt: invalid literal for int() with base 10: '0.0', line: 5'
    """

    # open file and read content
    try:
        with open(filename, "r") as f:
            data_txt = [line.split() for line in f]
    # no such file
    except IOError, e:
        print 'fail on open ' + str(e)

    # try to get the first 3 variables
    try:
        vars =[int(data_txt[0][0]), int(data_txt[1][0]), float(data_txt[2][0])]
    except ValueError,e:
        print 'failed on '+filename+': '+str(e)+', somewhere on lines 1-3'
        return

    # now get arrays
    arrays =[dict()]
    for lineidx, item in enumerate(data_txt[3:]):
        try:
            # for 2d array data
            if len(item) == 3:
                i, j = map(int, item[:2])
                val = float(item[-1])
                # check for 'block separator'
                if (i,j,val) == (-1,-1,0.0):
                    # make new array
                    arrays.append(dict())
                else:
                    # update last, existing
                    arrays[-1][(i,j)] = val
            # almost the same for 4d array data
            if len(item) == 5:
                i, j, k, m = map(int, item[:4])
                val = float(item[-1])
                arrays[-1][(i,j,k,m)] = val
        # if value is unparsable like '0.00' for int or 'text'
        except ValueError,e:
            print 'failed on '+filename+': '+str(e)+', line: '+str(lineidx+3)
            return
    return vars, arrays

Upvotes: 3

akaRem
akaRem

Reputation: 7618

As i anderstand what did you ask for..

# read data from file into list
parsed=[]
with open(filename, "r") as f:
    for line in f:
        # # you can exclude separator here with such code (uncomment) (1)
        # # be careful one zero more, one zero less and it wouldn work
        # if line == '-1  -1    0.00000000000':
        #     continue
        parsed.append(line.split())

# a simpler version
with open(filename, "r") as f:
    # # you can exclude separator here with such code (uncomment, replace) (2)
    # parsed = [line.split() for line in f if line != '-1  -1    0.00000000000']
    parsed = [line.split() for line in f]

# at this point 'parsed' is a list of lists of strings.
# [['1'],['6'],['0.714285714286'],['0', '0', '1.00000000000'],['0', '1', '0.61356352337'] .. ]

# ALT 1 -------------------------------
# we do know the len of each data block 

# get the first 3 lines:
head = parsed[:3]

# get the body:
body = parsed[3:-2]

# get the last 2 lines:
tail = parsed[-2:]

# now you can do anything you want with your data
# but remember to convert str to int or float

# first3 as unique:
unique0 = int(head[0][0])
unique1 = int(head[1][0])
unique2 = float(head[2][0])

# cast body:
# check each item of body has 3 inner items
is_correct = all(map(lambda item: len(item)==3, body))
# parse str and cast
if is_correct:
    for i, j, v in body:
        # # you can exclude separator here (uncomment) (3)
        # # * 1. is the same as float(1)
        # if (i,j,v) == (0,0,1.):
        #     # here we skip iteration for line w/ '-1  -1    0.0...'
        #     # but you can place another code that will be executed 
        #     # at the point where block-termination lines appear
        #     continue 

        some_body_cast_function(int(i), int(j), float(v))
else:
    raise Exception('incorrect body')


# cast tail
# check each item of body has 5 inner items
is_correct = all(map(lambda item: len(item)==5, tail))
# parse str and cast
if is_correct:
    for i, j, k, m, v in body: # 'l' is bad index, because similar to 1.
        some_tail_cast_function(int(i), int(j), int(k), int(m), float(v))
else:
    raise Exception('incorrect tail')

# ALT 2 -----------------------------------
# we do NOT know the len of each data block 

# maybe we have some array?
array = dict() # your array may be other type

v1,v2,v2 = parsed[:3]
unique0 = int(v1[0])
unique1 = int(v2[0])
unique2 = float(v3[0])

for item in parsed[3:]:
    if len(item) == 3:
        i,j,v = item
        i = int(i)
        j = int(j)
        v = float(v)

        # # yo can exclude separator here (uncomment) (4)
        # # * 1. is the same as float(1)
        # # logic is the same as in 3rd variant
        # if (i,j,v) == (0,0,1.):
        #     continue

        # do your stuff
        # for example,
        array[(i,j)]=v
        array[(j,i)]=v

    elif len(item) ==5:
        i, j, k, m, v = item
        i = int(i)
        j = int(j)
        k = int(k)
        m = int(m)
        v = float(v)

        # do your stuff

    else:
        raise Exception('unsupported') # or, maybe just 'pass'

Upvotes: 2

Paul Hiemstra
Paul Hiemstra

Reputation: 60924

To read lines from a file iteratively, you can use something like:

with open(filename, "r") as f:
  var1 = int(f.next())
  var2 = int(f.next())
  var3 = float(f.next())
  for line in f:
    do some stuff particular to the line we are on...

Just create some data structures outside the loop, and fill them in the loop above. To split strings into elements, you can use:

>>> "spam   ham".split()
['spam', 'ham']

I also think you want to take a look at the numpy library for array datastructures, and possible the SciPy library for analysis.

Upvotes: 1

Related Questions