Reputation: 1955
I have a variable named "inventory" that holds the following data. How do I load the data from this variable into a pandas DataFrame? If a field has the form key=value, I want to use the key as the column name.
print(inventory)
2017-05-01,pink,name=apple,quantity=6,orange,place=america
2017-05-03,pink,name=mango,quantity=1,orange,place=europe
2017-05-04,pink,name=apple,quantity=4,orange,place=africa
Upvotes: 1
Views: 4830
Reputation: 11
I tried to solve it like this:
import pandas as pd
inventory = \
"""2017-05-01,pink,name=apple,quantity=6,orange,place=america
2017-05-03,pink,name=mango,quantity=1,orange,place=europe
2017-05-04,pink,name=apple,quantity=4,orange,place=africa"""
content = [line.split(',') for line in inventory.splitlines()]
# collect the column names to rename and strip the "key=" prefix from the data
columns_to_be_rename = {}
for line in content:
    for i, s in enumerate(line):
        if '=' in s:
            columns_to_be_rename[i], line[i] = s.split('=')
df = pd.DataFrame(content)
df.rename(columns=columns_to_be_rename)
            0     1   name  quantity       4    place
0  2017-05-01  pink  apple         6  orange  america
1  2017-05-03  pink  mango         1  orange   europe
2  2017-05-04  pink  apple         4  orange   africa
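Note that df.rename returns a new DataFrame by default and leaves df itself unchanged, so the renamed columns are lost unless the result is assigned. Below is a minimal sketch of the same approach with the result kept. It assumes every line lists the same keys at the same positions. The integer conversion of quantity is an extra step that is not part of the code above.

```python
import pandas as pd

inventory = """2017-05-01,pink,name=apple,quantity=6,orange,place=america
2017-05-03,pink,name=mango,quantity=1,orange,place=europe
2017-05-04,pink,name=apple,quantity=4,orange,place=africa"""

content = [line.split(',') for line in inventory.splitlines()]

# Map column position -> key, and keep only the value in the data.
columns_to_be_renamed = {}
for line in content:
    for i, s in enumerate(line):
        if '=' in s:
            # maxsplit=1 keeps any further '=' inside the value
            columns_to_be_renamed[i], line[i] = s.split('=', 1)

# rename() returns a new frame, so assign the result.
df = pd.DataFrame(content).rename(columns=columns_to_be_renamed)

# Everything is parsed as text; convert quantity explicitly (extra step).
df['quantity'] = df['quantity'].astype(int)
print(df)
```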
Upvotes: 1
Reputation: 294258
Use pd.DataFrame with a comprehension:
import pandas as pd
inventory = """2017-05-01,pink,name=apple,quantity=6,orange,place=america
2017-05-03,pink,name=mango,quantity=1,orange,place=europe
2017-05-01,pink,name=apple,quantity=4,orange,place=africa"""
lol = [l.split(',') for l in inventory.splitlines()]
d1 = pd.DataFrame([[i for i in row if '=' not in i] for row in lol])
d2 = pd.DataFrame(
    [dict([tuple(i.split('=')) for i in row if '=' in i]) for row in lol]
)
d1.join(d2)
            0     1       2   name    place  quantity
0  2017-05-01  pink  orange  apple  america         6
1  2017-05-03  pink  orange  mango   europe         1
2  2017-05-01  pink  orange  apple   africa         4
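A possible variant of the same idea, as a sketch rather than the method shown above: build one dict per line in a single pass, using the key for key=value fields and the field position for bare values, then hand the list of dicts to pd.DataFrame. The helper name parse_line is hypothetical. Because each row carries its own keys, a key missing from some line should simply show up as NaN in that row.

```python
import pandas as pd

inventory = """2017-05-01,pink,name=apple,quantity=6,orange,place=america
2017-05-03,pink,name=mango,quantity=1,orange,place=europe
2017-05-04,pink,name=apple,quantity=4,orange,place=africa"""


def parse_line(line):
    """Turn one comma-separated line into a {column: value} dict (hypothetical helper)."""
    row = {}
    for position, field in enumerate(line.split(',')):
        key, sep, value = field.partition('=')
        if sep:
            # key=value field: the key becomes the column name
            row[key] = value
        else:
            # bare value: fall back to its position as the column label
            row[position] = field
    return row


df = pd.DataFrame([parse_line(line) for line in inventory.splitlines()])
print(df)
```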
Upvotes: 1