Reputation: 205
Trying to convert the dataframe
Into a dictionary with this specific structure:
sales = {
"clients": [
{"ID_client": "241341",
"purchases": [
"Item 101",
"Item 202",
"Item 324",
],
"payment": [
"visa", "master", "visa"
]
},
{"ID_client": "24356",
"purchases": [
"Item 2320",
"Item 2342",
"Item 5604",
],
"payment": [
"diners", "cash", "diners"
]
},
{"ID_client": "5534",
"purchases": [
"Item 50563",
"Item 52878",
"Item 54233",
],
"payment": [
"diners", "master", "visa"
]
}
]
}
I've been trying with some for loops like:
d = {"sales":[]}
for i in df1['ID_Client'].unique():/
clients = {"ID_client": df1['ID_client'][i]}
d[i] = [{df1['purchases'][j]: df1['payment'][j]} for j in
df1[df1['ID_Client']==i].index]
Any help will be greatly appreciated. Thanks in advance.
Upvotes: 0
Views: 579
Reputation: 30991
Assuming that your source DataFrame is df, your dictionary can be generated with a single instruction:
sales = { 'clients' : df.groupby(df.ID_client)\
.agg(list)\
.reset_index()\
.set_axis(['ID_client', 'purchases', 'payment'], axis=1, inplace=False)\
.to_dict('records')
}
As you can see, creation of the array - content of clients consists actually of the following chain:
The content of sales variable is:
{'clients': [{'ID_client': '241341',
'purchases': ['Item 101', 'Item 202', 'Item 324'],
'payment': ['visa', 'master', 'visa']},
{'ID_client': '24356',
'purchases': ['Item 2320', 'Item 2342', 'Item 5604'],
'payment': ['diners', 'cash', 'diners']},
{'ID_client': '5534',
'purchases': ['Item 50563', 'Item 52878', 'Item 54233'],
'payment': ['diners', 'master', 'visa']}]}
just as you wish.
Upvotes: 1
Reputation: 21
I think you were on the right track with the start of your loop, but you'll be better off using the results to filter the dataframe. Should be more efficient and more panda like (pandaren?) to do it that way.
Example:
clients = []
for c in df1['ID_client'].unique():
client = {
'ID_client': c,
'purchases': list(df1[df1.ID_client == c].loc[:, 'Purchase']),
'payment': list(df1[df1.ID_client == c].loc[:, 'Payment'])
}
clients.append(client)
Upvotes: 1
Reputation: 164693
Here's one way using np.repeat
and itertools.chain
:
import pandas as pd, numpy as np
from itertools import chain
df = pd.DataFrame(sales['clients'])
res = pd.DataFrame({'ID_client': np.repeat(df['ID_client'], df['payment'].map(len)),
'payment': list(chain.from_iterable(df['payment'])),
'purchases': list(chain.from_iterable(df['purchases']))})
print(res)
ID_client payment purchases
0 241341 visa Item 101
0 241341 master Item 202
0 241341 visa Item 324
1 24356 diners Item 2320
1 24356 cash Item 2342
1 24356 diners Item 5604
2 5534 diners Item 50563
2 5534 master Item 52878
2 5534 visa Item 54233
Note with this method each unique index aligns with an ID_client
, as per your input.
Upvotes: 1