Reputation: 745
I have a DataFrame df that I want to convert to a list of lists:
left_side right_side similarity
0114600043776001 loan payment receipt 0421209017073500 loan payment receipt 0.689008
0114600043776001 loan payment receipt 0421209017073500 loan payment receipt 0.689008
vat onverve*issuance fee*506108 vat onverve*issuance fee*5061087 0.743522
vat onverve*issuance fee*506108 verve*issuance fee*506108*********1112 0.684342
verve*issuance fee*506108 verve*issuance fee*506108*********8296 0.717817
verve*issuance fee*506108 vat onverve*issuance fee*506108** 0.684342
maint fee recovery jun 2018 vat maint fee recovery jun 2018 0.896607
maint fee recovery jun 2018 vat maint fee recovery jun 2018 0.896607
maint fee recovery jun 2018 vat maint fee recovery jun 2018 0.896607
Expected output should look like this:
[[0114600043776001 loan payment receipt, 0421209017073500 loan payment receipt,
0421209017073500 loan payment receipt],
[vat onverve*issuance fee*506108, vat onverve*issuance fee*5061087,
verve*issuance fee*506108*********1112],
[verve*issuance fee*506108*********8296, verve*issuance fee*506108
vat onverve*issuance fee*506108** ],...]
I would appreciate some help with this. I tried grouping df by the left_side column and converting the result to a list, but the output is not what I expected:
group_df = df.groupby(['left_side']).right_side.sum().to_frame()
group_df.values.tolist()
and the output looks like this:
['0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt0421209017073500 loan payment receipt']
['vat maint fee recovery jun 2018vat maint fee recovery jun 2018vat maint fee recovery jun 2018maint fee recovery jul 2018maint fee recovery oct 2018maint fee recovery jul 2018maint fee recovery jul 2018']
Upvotes: 0
Views: 59
Reputation: 15872
You can use df.groupby:
>>> [[k, *g] for k, g in df.groupby('left_side', sort=False)['right_side']]
[['0114600043776001 loan payment receipt',
'0421209017073500 loan payment receipt',
'0421209017073500 loan payment receipt'],
['vat onverve*issuance fee*506108',
'vat onverve*issuance fee*5061087',
'verve*issuance fee*506108*********1112'],
['verve*issuance fee*506108',
'verve*issuance fee*506108*********8296',
'vat onverve*issuance fee*506108**'],
['maint fee recovery jun 2018',
'vat maint fee recovery jun 2018',
'vat maint fee recovery jun 2018',
'vat maint fee recovery jun 2018']]
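For reference, here is a minimal self-contained sketch of the same idea. The short sample frame and the names `grouped`, `as_lists` and `alt` are made up for illustration and are not taken from the question. It also shows an alternative that aggregates with `list`. The attempt in the question used `sum()`, which on strings applies `+` and so joins the values into one long string per group instead of collecting them.

```python
import pandas as pd

# Hypothetical sample data, shaped like the frame in the question.
df = pd.DataFrame({
    "left_side": ["a", "a", "b", "b", "c"],
    "right_side": ["a1", "a2", "b1", "b2", "c1"],
    "similarity": [0.69, 0.69, 0.74, 0.68, 0.90],
})

# Iterating over the grouped column yields (key, Series) pairs.
# Unpacking the Series with * places its values after the group key.
grouped = [
    [key, *values]
    for key, values in df.groupby("left_side", sort=False)["right_side"]
]

# Alternative: collect each group's values into a list first,
# then prepend the group key.
as_lists = df.groupby("left_side", sort=False)["right_side"].agg(list)
alt = [[key, *values] for key, values in as_lists.items()]

print(grouped)
```

Passing `sort=False` keeps the groups in order of first appearance rather than sorting the keys.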
Upvotes: 1
Reputation: 878
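import pandas as pd

# Placeholder data with the same column names as in the question
dfold = {'left_side': ['string', 'string', 'string', 'string'],
         'right_side': ['string', 'string', 'string', 'string']
         }
df = pd.DataFrame(dfold, columns=['left_side', 'right_side'])
print(df)
# Convert every row of the DataFrame to a list, giving a list of lists
df_list = df.values.tolist()
print(df_list)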
Upvotes: 1
Reputation: 181
I believe you're looking for the to_records() method on a DataFrame.
Try df.to_records(); its behaviour is described in the pandas documentation.
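As a rough sketch of what that call gives you (the two-row frame and the names `records` and `rows` are invented for illustration): `to_records()` returns a NumPy record array, not a list, so it still needs `.tolist()`. The result is then a list of row tuples and is not grouped by left_side.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "left_side": ["a", "a"],
    "right_side": ["a1", "a2"],
})

# index=False leaves the row index out of the records.
records = df.to_records(index=False)

# A record array converts to a list of tuples, one per row.
rows = records.tolist()
print(rows)
```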
Upvotes: 0