user612041
user612041

Reputation: 215

Python - Parsing Columns and Rows

I am running into some trouble with parsing the contents of a text file into a 2D array/list. I cannot use built-in libraries, so have taken a different approach. This is what my text file looks like, followed by my code

1,0,4,3,6,7,4,8,3,2,1,0
2,3,6,3,2,1,7,4,3,1,1,0
5,2,1,3,4,6,4,8,9,5,2,1
def twoDArray():
    network = [[]]       

    filename = open('twoDArray.txt', 'r')
    for line in filename.readlines():
        col = line.split(line, ',')
        row = line.split(',')

    network.append(col,row)        

    print "Network = "
    print network        

if __name__ == "__main__":
    twoDArray()

I ran this code but got this error:

Traceback (most recent call last):
  File "2dArray.py", line 22, in <module>
    twoDArray()
  File "2dArray.py", line 8, in twoDArray
    col = line.split(line, ',')
TypeError: an integer is required

I am using the comma to separate both row and column as I am not sure how I would differentiate between the two - I am confused about why it is telling me that an integer is required when the file is made up of integers

Upvotes: 2

Views: 24012

Answers (7)

BarryPye
BarryPye

Reputation: 2082

Read the data from the file. Here's one way:

f = open('twoDArray.txt', 'r')
buffer = f.read()
f.close()

Parse the data into a table

table = [map(int, row.split(',')) for row in buffer.strip().split("\n")]
>>> print table
[[1, 0, 4, 3, 6, 7, 4, 8, 3, 2, 1, 0], [2, 3, 6, 3, 2, 1, 7, 4, 3, 1, 1, 0], [5, 2, 1, 3, 4, 6, 4, 8, 9, 5, 2, 1]]

Perhaps you want the transpose instead:

transpose = zip(*table)
>>> print transpose
[(1, 2, 5), (0, 3, 2), (4, 6, 1), (3, 3, 3), (6, 2, 4), (7, 1, 6), (4, 7, 4), (8, 4, 8), (3, 3, 9), (2, 1, 5), (1, 1, 2), (0, 0, 1)]

Upvotes: 0

jfs
jfs

Reputation: 414285

The input format is simple, so the solution should be simple too:

network = [map(int, line.split(',')) for line in open(filename)]
print network

csv module doesn't provide an advantage in this case:

import csv
print [map(int, row) for row in csv.reader(open(filename, 'rb'))]

If you need float instead of int:

print list(csv.reader(open(filename, 'rb'), quoting=csv.QUOTE_NONNUMERIC))

If you are working with numpy arrays:

import numpy
print numpy.loadtxt(filename, dtype='i', delimiter=',')

See Why NumPy instead of Python lists?

All examples produce arrays equal to:

[[1 0 4 3 6 7 4 8 3 2 1 0]
 [2 3 6 3 2 1 7 4 3 1 1 0]
 [5 2 1 3 4 6 4 8 9 5 2 1]]

Upvotes: 0

John Machin
John Machin

Reputation: 82934

"""I cannot use built-in libraries""" -- do you really mean "cannot" as in you have tried to use the csv module and failed? If so, say so. Do you mean that "may not" as in you are forbidden to use a built-in module by the terms of your homework assignment? If so, say so.

Here is an answer that works. It doesn't leave a newline attached to the end of the last item in each row. It converts the numbers to int so that you can use them for whatever purpose you have. It fixes other errors that nobody else has mentioned.

def twoDArray():
    network = []       
    # filename = open('twoDArray.txt', 'r')
    # "filename" is a very weird name for a file HANDLE
    f = open('twoDArray.txt', 'r')
    # for line in filename.readlines():
    # readlines reads the whole file into memory at once.
    # That is quite unnecessary.
    for line in f: # just iterate over the file handle
        line = line.rstrip('\n') # remove the newline, if any
        # col = line.split(line, ',')
        # wrong args, as others have said.
        # In any case, only 1 split call is necessary 
        row = line.split(',')
        # now convert string to integer
        irow = [int(item) for item in row]
        # network.append(col,row)        
        # list.append expects only ONE arg
        # indentation was wrong; you need to do this once per line
        network.append(irow)

    print "Network = "
    print network        

if __name__ == "__main__":
    twoDArray()

Upvotes: 2

Hugh Bothwell
Hugh Bothwell

Reputation: 56654

class TwoDArray(object):
    @classmethod
    def fromFile(cls, fname, *args, **kwargs):
        splitOn = kwargs.pop('splitOn', None)
        mode    = kwargs.pop('mode',    'r')
        with open(fname, mode) as inf:
            return cls([line.strip('\r\n').split(splitOn) for line in inf], *args, **kwargs)

    def __init__(self, data=[[]], *args, **kwargs):
        dataType = kwargs.pop('dataType', lambda x:x)
        super(TwoDArray,self).__init__()
        self.data = [[dataType(i) for i in line] for line in data]

    def __str__(self, fmt=str, endrow='\n', endcol='\t'):
        return endrow.join(
            endcol.join(fmt(i) for i in row) for row in self.data
        )

def main():
    network = TwoDArray.fromFile('twodarray.txt', splitOn=',', dataType=int)

    print("Network =")
    print(network)

if __name__ == "__main__":
    main()

Upvotes: 0

Wesley
Wesley

Reputation: 10852

To directly answer your question, there is a problem with the following line:

col = line.split(line, ',')

If you check the documentation for str.split, you'll find the description to be as follows:

str.split([sep[, maxsplit]])

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified, then there is no limit on the number of splits (all possible splits are made).

This is not what you want. You are not trying to specify the number of splits you want to make.


Consider replacing your for loop and network.append with this:

for line in filename.readlines():
    # line is a string representing the values for this row
    row = line.split(',')
    # row is the list of numbers strings for this row, such as ['1', '0', '4', ...]
    cols = [int(x) for x in row]
    # cols is the list of numbers for this row, such as [1, 0, 4, ...]
    network.append(row)
    # Put this row into network, such that network is [[1, 0, 4, ...], [...], ...]

Upvotes: 2

seriyPS
seriyPS

Reputation: 7102

Omg...

network = []
filename = open('twoDArray.txt', 'r')
for line in filename.readlines():
    network.append(line.split(','))

you take

[
[1,0,4,3,6,7,4,8,3,2,1,0],
[2,3,6,3,2,1,7,4,3,1,1,0],
[5,2,1,3,4,6,4,8,9,5,2,1]
]

or you neeed some other structure as output? Please add what do you need as output?

Upvotes: 0

Jeremy Whitlock
Jeremy Whitlock

Reputation: 3818

Well, I can explain the error. You're using str.split() and its usage pattern is:

str.split(separator, maxsplit)

You're using str.split(string, separator) and that isn't a valid call to split. Here is a direct link to the Python docs for this:

http://docs.python.org/library/stdtypes.html#str.split

Upvotes: 4

Related Questions