user808545

Reputation: 1631

Clean up a nested list

I have a huge mess of a nested list that looks something like this, just longer:

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]

Ultimately I want something that looks like this:

neat_fruit = [['watermelon',0,1.0], ['apple',0,1.0], ['pineapple',0,1.0], ['strawberry, banana',0,1.0], ['peach plum pear',0,1.0], ['orange, grape',0,1.0]]

but I'm not sure how to deal with the double quotes inside the strings, or how to split the fruit names from the numbers, especially since some of the fruit names contain commas themselves. I've tried a bunch of things, but everything seems to make it even more of a mess. Any suggestions would be greatly appreciated.

Upvotes: 1

Views: 252

Answers (3)

Artsiom Rudzenka

Reputation: 29121

One more simple solution:

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]
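for i, x in enumerate(fruit_mess):
    # Split from the right so commas inside the fruit name are left alone.
    data = x[0].rstrip('\n').rsplit(',', 2)
    # Strip the surrounding double quotes from names like '"pineapple"'.
    fruit_mess[i] = [data[0].strip('"'), int(data[1]), float(data[2])]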

Upvotes: 1

unutbu

Reputation: 879919

Use the csv module (in the standard library) to handle the double-quoted fruits with commas in their names:

import csv
import io

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'], ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'], ['"orange, grape",0,1.0\n']]

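# Join the one-element sublists into a single newline-separated string:
data = '\n'.join(item[0].strip() for item in fruit_mess)
# On Python 3, csv.reader needs text, so wrap the string in io.StringIO
# (Python 2 would use io.BytesIO here):
reader = csv.reader(io.StringIO(data))
neat_fruit = [[fruit, int(num1), float(num2)] for fruit, num1, num2 in reader]

print(neat_fruit)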
# [['watermelon', 0, 1.0], ['apple', 0, 1.0], ['pineapple', 0, 1.0], ['strawberry, banana', 0, 1.0], ['peach plum pear', 0, 1.0], ['orange, grape', 0, 1.0]]
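
Since csv.reader accepts any iterable that yields strings, the join-and-wrap step can probably be skipped altogether. Here is a minimal sketch of that variant, assuming Python 3 and that every sublist holds exactly one string; the helper name tidy is only illustrative and not part of the csv module:

```python
import csv

fruit_mess = [['watermelon,0,1.0\n'], ['apple,0,1.0\n'], ['"pineapple",0,1.0\n'],
              ['"strawberry, banana",0,1.0\n'], ['peach plum pear,0,1.0\n'],
              ['"orange, grape",0,1.0\n']]


def tidy(rows):
    """Parse one-element sublists of CSV text into [name, int, float] rows.

    `tidy` is a hypothetical helper name used for illustration only.
    """
    # Feed the raw lines straight to csv.reader; it handles the quoting
    # and the trailing newline on each line.
    reader = csv.reader(item[0] for item in rows)
    return [[fruit, int(num1), float(num2)] for fruit, num1, num2 in reader]


neat_fruit = tidy(fruit_mess)
print(neat_fruit)
```

This keeps the quoting rules in the csv module's hands, so a name like "strawberry, banana" should come through as one field without any manual splitting.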

Upvotes: 6

Tim Pietzcker

Reputation: 336258

A regex-based solution:

>>> import re
>>> regex = re.compile(r'("[^"]*"|[^,]*),(\d+),([\d.]+)')
>>> neat_fruit = []
>>> for item in fruit_mess:
...     match = regex.match(item[0])
...     result = [match.group(1).strip('"'), int(match.group(2)), float(match.group(3))]
...     neat_fruit.append(result)
...
>>> neat_fruit
[['watermelon', 0, 1.0], ['apple', 0, 1.0], ['pineapple', 0, 1.0], ['strawberry, banana', 0, 1.0], ['peach plum pear', 0, 1.0], ['orange, grape', 0, 1.0]]
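
If the real list might contain lines that do not fit the pattern, regex.match returns None and the .group call fails with an AttributeError. A rough sketch of a guarded version follows, using the same pattern; the names FRUIT_LINE and parse_line are made up for illustration, and raising ValueError is just one possible way to handle a bad line:

```python
import re

# Same pattern as above: a quoted or unquoted name, an integer, a decimal number.
FRUIT_LINE = re.compile(r'("[^"]*"|[^,]*),(\d+),([\d.]+)')


def parse_line(line):
    """Turn one 'name,int,float' line into [name, int, float].

    `parse_line` is a hypothetical helper; it raises ValueError on lines
    the pattern does not recognize instead of failing on None.
    """
    match = FRUIT_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized line: %r" % line)
    name, num1, num2 = match.groups()
    return [name.strip('"'), int(num1), float(num2)]


fruit_mess = [['watermelon,0,1.0\n'], ['"strawberry, banana",0,1.0\n']]
neat_fruit = [parse_line(item[0]) for item in fruit_mess]
print(neat_fruit)
```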

Upvotes: 0
