How to unpack an object of dictionaries to a range of Data Frames

Question

I am creating a function that grabs data from an ERP system to display to the end user.

I want to unpack an object of dictionaries and create a range of Pandas DataFrames with them.

For example, I have:

troRows

{0: [{'productID': 134336, 'price': '10.0000', 'amount': '1', 'cost': 0}],
 1: [{'productID': 142141, 'price': '5.5000', 'amount': '4', 'cost': 0}],
 2: [{'productID': 141764, 'price': '5.5000', 'amount': '1', 'cost': 0}],
 3: [{'productID': 81661, 'price': '4.5000', 'amount': '1', 'cost': 0}],
 4: [{'productID': 146761, 'price': '5.5000', 'amount': '1', 'cost': 0}],
 5: [{'productID': 143585, 'price': '5.5900', 'amount': '9', 'cost': 0}],
 6: [{'productID': 133018, 'price': '5.0000', 'amount': '1', 'cost': 0}],
 7: [{'productID': 146250, 'price': '13.7500', 'amount': '5', 'cost': 0}],
 8: [{'productID': 149986, 'price': '5.8900', 'amount': '2', 'cost': 0},
  {'productID': 149790, 'price': '4.9900', 'amount': '2', 'cost': 0},
  {'productID': 149972, 'price': '5.2900', 'amount': '2', 'cost': 0},
  {'productID': 149248, 'price': '2.0000', 'amount': '2', 'cost': 0},
  {'productID': 149984, 'price': '4.2000', 'amount': '2', 'cost': 0},

Each time the function will need to unpack x number of dictionaries which may have different number of rows into a range of DataFrames.

So for example, this range of Dictionaries would return DF0, DF1, DF2, DF3, DF4, DF5, DF6, DF7, DF8.

I can unpack a single Dictionary with:

pd.DataFrame(troRows[8])

which returns

 amount  cost   price  productID
0       2     0  5.8900     149986
1       2     0  4.9900     149790
2       2     0  5.2900     149972
3       2     0  2.0000     149248
4       2     0  4.2000     149984

How can I structure my code so that it does this for all the dictionaries for me?

jezrael · Accepted Answer

Solution for dictionary of DataFrames - use dictioanry comprehension and set index values to keys of dictionary:

dfs = {k: pd.DataFrame(v) for k, v in troRows.items()}
print (dfs)
{0:   amount  cost    price  productID
0      1     0  10.0000     134336, 1:   amount  cost   price  productID
0      4     0  5.5000     142141, 2:   amount  cost   price  productID
0      1     0  5.5000     141764, 3:   amount  cost   price  productID
0      1     0  4.5000      81661, 4:   amount  cost   price  productID
0      1     0  5.5000     146761, 5:   amount  cost   price  productID
0      9     0  5.5900     143585, 6:   amount  cost   price  productID
0      1     0  5.0000     133018, 7:   amount  cost    price  productID
0      5     0  13.7500     146250, 8:   amount  cost   price  productID
0      2     0  5.8900     149986
1      2     0  4.9900     149790
2      2     0  5.2900     149972
3      2     0  2.0000     149248
4      2     0  4.2000     149984}

print (dfs[8])
  amount  cost   price  productID
0      2     0  5.8900     149986
1      2     0  4.9900     149790
2      2     0  5.2900     149972
3      2     0  2.0000     149248
4      2     0  4.2000     149984

Solutions for one DataFrame:

Use list comprehension with flattening and pass it to DataFrame constructor:

troRows = pd.Series([[{'productID': 134336, 'price': '10.0000', 'amount': '1', 'cost': 0}],
  [{'productID': 142141, 'price': '5.5000', 'amount': '4', 'cost': 0}],
  [{'productID': 141764, 'price': '5.5000', 'amount': '1', 'cost': 0}],
 [{'productID': 81661, 'price': '4.5000', 'amount': '1', 'cost': 0}],
 [{'productID': 146761, 'price': '5.5000', 'amount': '1', 'cost': 0}],
  [{'productID': 143585, 'price': '5.5900', 'amount': '9', 'cost': 0}],
 [{'productID': 133018, 'price': '5.0000', 'amount': '1', 'cost': 0}],
  [{'productID': 146250, 'price': '13.7500', 'amount': '5', 'cost': 0}],
  [{'productID': 149986, 'price': '5.8900', 'amount': '2', 'cost': 0},
  {'productID': 149790, 'price': '4.9900', 'amount': '2', 'cost': 0},
  {'productID': 149972, 'price': '5.2900', 'amount': '2', 'cost': 0},
  {'productID': 149248, 'price': '2.0000', 'amount': '2', 'cost': 0},
  {'productID': 149984, 'price': '4.2000', 'amount': '2', 'cost': 0}]])

df = pd.DataFrame([y for x in troRows for y in x])

Another solution for flatten your data is use chain.from_iterable:

from  itertools import chain

df = pd.DataFrame(list(chain.from_iterable(troRows)))

print (df)
   amount  cost    price  productID
0       1     0  10.0000     134336
1       4     0   5.5000     142141
2       1     0   5.5000     141764
3       1     0   4.5000      81661
4       1     0   5.5000     146761
5       9     0   5.5900     143585
6       1     0   5.0000     133018
7       5     0  13.7500     146250
8       2     0   5.8900     149986
9       2     0   4.9900     149790
10      2     0   5.2900     149972
11      2     0   2.0000     149248
12      2     0   4.2000     149984

How to unpack an object of dictionaries to a range of Data Frames

Answers (1)

Related Questions