Reputation: 7909
Given this DataFrame:
import pandas as pd
first=[0,1,2,3,4]
second=[10.2,5.7,7.4,17.1,86.11]
third=['a','b','c','d','e']
fourth=['z','zz','zzz','zzzz','zzzzz']
df=pd.DataFrame({'first':first,'second':second,'third':third,'fourth':fourth})
df=df[['first','second','third','fourth']]
   first  second third fourth
0      0   10.20     a      z
1      1    5.70     b     zz
2      2    7.40     c    zzz
3      3   17.10     d   zzzz
4      4   86.11     e  zzzzz
I can create a dictionary out of df using
a=df.set_index('first')['second'].to_dict()
so that I can choose which column supplies the keys and which supplies the values. But what if I want each value to be a list built from several columns, such as second AND third?
If I try this
b=df.set_index('first')[['second','third']].to_dict()
I get a weird dictionary of dictionaries, keyed by column name first:
{'second': {0: 10.199999999999999,
1: 5.7000000000000002,
2: 7.4000000000000004,
3: 17.100000000000001,
4: 86.109999999999999},
'third': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'}}
Instead, I want a dictionary of lists, keyed by first:
{0: [10.199999999999999, 'a'],
1: [5.7000000000000002, 'b'],
2: [7.4000000000000004, 'c'],
3: [17.100000000000001, 'd'],
4: [86.109999999999999, 'e']}
How can I get this?
Upvotes: 3
Views: 11420
Reputation: 12515
Someone else can probably chime in with a pure-pandas solution, but in a pinch I think this ought to work for you. You'd basically create the dictionary on-the-fly, indexing values in each row instead.
d = {df.loc[idx, 'first']: [df.loc[idx, 'second'], df.loc[idx, 'third']] for idx in df.index}
d
Out[5]:
{0: [10.199999999999999, 'a'],
1: [5.7000000000000002, 'b'],
2: [7.4000000000000004, 'c'],
3: [17.100000000000001, 'd'],
4: [86.109999999999999, 'e']}
Edit: You could also do this:
df['new'] = list(zip(df['second'], df['third']))
df
Out[25]:
   first  second third fourth         new
0      0   10.20     a      z   (10.2, a)
1      1    5.70     b     zz    (5.7, b)
2      2    7.40     c    zzz    (7.4, c)
3      3   17.10     d   zzzz   (17.1, d)
4      4   86.11     e  zzzzz  (86.11, e)
df = df[['first', 'new']]
df
Out[27]:
   first         new
0      0   (10.2, a)
1      1    (5.7, b)
2      2    (7.4, c)
3      3   (17.1, d)
4      4  (86.11, e)
df.set_index('first').to_dict()
Out[28]:
{'new': {0: (10.199999999999999, 'a'),
1: (5.7000000000000002, 'b'),
2: (7.4000000000000004, 'c'),
3: (17.100000000000001, 'd'),
4: (86.109999999999999, 'e')}}
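In this approach, you first create the list (or tuple) you want to keep and then "drop" the other columns. This is basically your original approach, modified. Note that the result is still nested under the 'new' key; selecting that column before converting, as in df.set_index('first')['new'].to_dict(), gives the flat dictionary.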
And if you really wanted lists instead of tuples, just map the list type onto that 'new' column:
df['new'] = list(map(list, zip(df['second'], df['third'])))
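As for the pure-pandas route mentioned at the top of this answer, one option (a sketch, not necessarily the only or fastest way) is to transpose the selected columns so that each value of first becomes a column label, and then ask to_dict for lists. The DataFrame is rebuilt here so the snippet runs on its own; the variable names wide and d are just illustrative.

```python
import pandas as pd

# Same data as in the question, rebuilt so the snippet is self-contained.
df = pd.DataFrame({
    'first': [0, 1, 2, 3, 4],
    'second': [10.2, 5.7, 7.4, 17.1, 86.11],
    'third': ['a', 'b', 'c', 'd', 'e'],
    'fourth': ['z', 'zz', 'zzz', 'zzzz', 'zzzzz'],
})

# After the transpose, the values of 'first' are the column labels,
# and each column holds that row's 'second' and 'third' entries.
wide = df.set_index('first')[['second', 'third']].T

# orient='list' maps each column label to the list of its values.
d = wide.to_dict('list')
print(d)
```

This leaves df itself untouched, at the cost of an object-dtype transpose, which may be slow on large frames.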
Upvotes: 3
Reputation: 393963
You can zip the values:
In [118]:
b=df.set_index('first')[['second','third']].values.tolist()
dict(zip(df['first'],b))
Out[118]:
{0: [10.2, 'a'], 1: [5.7, 'b'], 2: [7.4, 'c'], 3: [17.1, 'd'], 4: [86.11, 'e']}
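One thing to watch with the zip approach is where the keys come from. In the question's data the values of first happen to equal the default row index, so either gives the same result. The sketch below uses hypothetical data (df2, with first set to 10, 20, 30) to show the difference; by_column and by_index are illustrative names only.

```python
import pandas as pd

# Hypothetical data: 'first' deliberately differs from the default 0..n-1 index.
df2 = pd.DataFrame({
    'first': [10, 20, 30],
    'second': [10.2, 5.7, 7.4],
    'third': ['a', 'b', 'c'],
})

# One [second, third] list per row.
rows = df2[['second', 'third']].values.tolist()

# Keys taken from the column values: this is what the question asks for.
by_column = dict(zip(df2['first'], rows))

# Keys taken from the row index: only matches when 'first' equals the index.
by_index = dict(zip(df2.index, rows))

print(by_column)
print(by_index)
```

If first is meant to be the key, zipping against the column itself is the safer choice.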
Upvotes: 1