FongYu
FongYu

Reputation: 767

Python, need help parsing items from text file into list

I'm trying to parse items in a text file and store them into a list. The data looks something like this:

[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]
[(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)]
[(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)]
[(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]

I was able to strip the "[" and "]" but couldn't store the rest of information into list as such format: (x, y, z). Any help?

 def dataParser(fileName):
    zoneList=[]; zone=[]
    input=open(fileName,"r")

    for line in input:
        vals = line.strip("[")
        newVals = vals.strip("]\n")

        print newVals
        v=newVals[0:9]
        zone.append(v)

    input.close()
    return zone

Upvotes: 1

Views: 4042

Answers (5)

Michael David Watson
Michael David Watson

Reputation: 3071

The other answers work just fine and are a simple solution to this specific problem. But I am assuming that if you were having problems with string manipulation, then a simple eval() function won't help you out much the next time you have this problem.

As a general rule, the first thing you want to do when you are approached with a problem like this, is define your delimiters.

[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]

Here you can see that "), (" is a potential delimiter between groups and a simple comma (",") is your delimiter between values. Next you want to see what you need to remove, and as you pointed out, brackets ("[" and "]") provide little information. We can also see that because we are dealing with numeric values, all spacing gives us little information and needs to be removed.

Building on this information, I set up your dataParser function in a way that returns the values you were looking for:

fileName= "../ZoneFinding/outputData/zoneFinding_tofu_rs1000.txt"

def dataParser(fileName):
    with open(fileName,"r") as input
        zoneLst = []
        for line in input:
            #First remove white space and the bracket+parenthese combos on the end
            line = line.replace(" ","").replace("[(","").replace(")]","")

            #Now lets split line by "),(" to create a list of strings with the values
            lineLst = line.split("),(")

            # At this point lineLst = ["0,0,0" , "1,0,0", "2,0,0", ...]
            #Lastly, we will split each number by a "," and add each to a list
            zone = [group.split(",") for group in lineLst]

            zoneLst.append(zone)

        return zoneLst

In the example above, all of the values are stored as strings. You could also replace the definition of zone with the code below to store the values as floats.

zone = [ [float(val) for val in group.split(",")] for group in lineLst]

Upvotes: 0

Brenden Brown
Brenden Brown

Reputation: 3215

You can do it without eval, using the string split method and the tuple constructor:

>>> st = "[(0,0,0), (1,0,0)]"
>>> splits = st.strip('[').strip(']\n').split(', ')
>>> splits
['(0,0,0)', '(1,0,0)']
>>> for split in splits:
...   trimmed = split.strip('(').strip(')')
...   tup = tuple(trimmed.split(','))
...   print tup, type(tup)
...
('0', '0', '0') <type 'tuple'>
('1', '0', '0') <type 'tuple'>
>>>

From there, it's just appending to a list.

Upvotes: 3

Ashwini Chaudhary
Ashwini Chaudhary

Reputation: 250951

some might not like using eval() here, but you can do this in one line using it:

In [20]: lis=eval("[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]")
In [23]: lis
Out[23]: [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)]

using a text file:

In [44]: with open('data.txt') as f:
   ....:     lis=[eval(x.strip()) for x in f]
   ....:     print lis
   ....:     
   ....:     
[[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)], [(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)], [(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)], [(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]]

Upvotes: 2

DSM
DSM

Reputation: 353059

In this particular case, you can use ast.literal_eval:

>>> with open("list.txt") as fp:
...     data = [ast.literal_eval(line) for line in fp if line.strip()]
... 
>>> data
[[(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0), (6, 0, 0)], [(10, 3, 1), (11, 3, 1), (12, 3, 1), (13, 3, 1), (13, 4, 1)], [(10, 3, 5), (11, 3, 5), (12, 3, 5), (13, 3, 5), (13, 4, 5), (13, 5, 5), (13, 6, 5)], [(6, 13, 5), (7, 13, 5), (8, 13, 5), (8, 14, 5), (7, 14, 5), (6, 14, 5), (6, 14, 6)]]

It's the "safe" version of eval. It's not as general, though, for precisely that reason. If you're generating this input, you might want to look into a different way to save your data ("serialization"), whether using pickle or something like JSON -- there are lots of examples of using both you can find on SO and elsewhere.

Upvotes: 6

Wilduck
Wilduck

Reputation: 14106

The following is a bad idea if you're getting this data from any source that you don't trust completely, but if the data will always be in this format (and only contain numbers as the elements) something like this is quite straightforward:

collect = []
for line in input:
    collect.append(eval(line))

Upvotes: 0

Related Questions