Reputation: 581
I have a dataframe df that I want to use to create a new dataframe df1.
Here is a slice of df (which is over 4 million rows):
xnum class/subclass
1 86963 004/665000
51 86963 004/342000
101 86963 004/392000
151 86963 004/437000
201 86963 004/480000
251 86963 004/526000
301 86963 004/255080
351 86939 004/231000
401 81868 029/603200
451 81868 004/665000
501 81868 029/890100
551 69931 029/603200
601 69931 015/199000
651 69931 015/230000
701 75047 029/603200
751 75047 123/653000
801 75047 123/1690TC
851 75047 123/185700
901 75047 004/665000
951 75047 123/190900
I would like to create a dictionary where the keys are class/subclass and the values are all the xnum values that appear on the rows with that class/subclass.
For the df above, one key: value pair would be "004/665000": ["86963", "81868", "75047"].
Note that the dtype for xnum and class/subclass is object, since I need to keep the leading zeros.
My question is: how should I create the dictionary from the dataframe? Thank you.
Upvotes: 0
Views: 92
Reputation: 79338
df.groupby('class/subclass').xnum.agg(lambda x:x if len(x)==1 else list(x)).to_dict()
Out[759]:
{'004/231000': 86939,
'004/255080': 86963,
'004/342000': 86963,
'004/392000': 86963,
'004/437000': 86963,
'004/480000': 86963,
'004/526000': 86963,
'004/665000': [86963, 81868, 75047],
'015/199000': 69931,
'015/230000': 69931,
'029/603200': [81868, 69931, 75047],
'029/890100': 81868,
'123/1690TC': 75047,
'123/185700': 75047,
'123/190900': 75047,
'123/653000': 75047}
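Note that the result above mixes bare values and lists. If downstream code expects every value to be a list, one possible follow-up step is to wrap the lone values. This is only a sketch: `mixed`, `as_list` and `normalized` are illustrative names, and the dictionary is a few entries copied from the output above.

```python
# A few entries copied from the output above: some values are scalars, some are lists
mixed = {
    '004/231000': 86939,
    '004/665000': [86963, 81868, 75047],
    '029/603200': [81868, 69931, 75047],
}

def as_list(value):
    """Wrap a lone value in a list; pass lists through unchanged."""
    return value if isinstance(value, list) else [value]

# Every value is now a list, so callers can always iterate over it
normalized = {key: as_list(value) for key, value in mixed.items()}
```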
Upvotes: 1
Reputation: 581
Qdf = df.groupby('class/subclass')['xnum'].apply(list)
Qdf.to_dict()
{'004/231000': [86939],
'004/255080': [86963],
'004/342000': [86963],
'004/392000': [86963],
'004/437000': [86963],
'004/480000': [86963],
'004/526000': [86963],
'004/665000': [86963, 81868, 75047],
'015/199000': [69931],
'015/230000': [69931],
'029/603200': [81868, 69931, 75047],
'029/890100': [81868],
'123/1690TC': [75047],
'123/185700': [75047],
'123/190900': [75047],
'123/653000': [75047]}
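For anyone who wants to try this without the full data, here is a minimal self-contained sketch of the same approach. It assumes both columns hold strings, as described in the question; the six rows are copied from the slice there, and `result` is just an illustrative name.

```python
import pandas as pd

# Six rows from the question's slice, stored as strings so leading zeros survive
df = pd.DataFrame({
    'xnum': ['86963', '86939', '81868', '81868', '75047', '75047'],
    'class/subclass': ['004/665000', '004/231000', '029/603200',
                       '004/665000', '029/603200', '004/665000'],
})

# Collect every xnum for each class/subclass into a list, then convert to a dict
result = df.groupby('class/subclass')['xnum'].apply(list).to_dict()
```

With this approach every value is a list, including groups with a single xnum, so the values can be handled uniformly.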
Upvotes: 1