Reputation: 634
I am trying to iterate through a pandas DataFrame and build a dictionary from its values. For each row, I want to capture the name of every column whose value equals 3. Given the DataFrame below:
Sample    Variable 1  Variable 2  Variable 3
Sample 1  1           3           1
Sample 2  3           0           3
Sample 3  3           3           3
Sample 4  2           1           3
I would like to create a dictionary that looks like this:
{'Sample 1': ['Variable 2'], 'Sample 2': ['Variable 1', 'Variable 3'], 'Sample 3': ['Variable 1', 'Variable 2', 'Variable 3'], 'Sample 4': ['Variable 3']}
Upvotes: 2
Views: 4804
Reputation: 2980
You can do this by converting your DataFrame into a dict and then applying a dictionary comprehension that collects, for each row, the column names whose value equals 3.
# Assumes 'Sample' is the index; if it is a regular column, call df.set_index('Sample') first.
df_dict = df.to_dict(orient="index")
{k: [k1 for (k1, v1) in v.items() if v1 == 3] for (k, v) in df_dict.items()}
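For reference, here is a minimal end-to-end sketch of the same approach. It assumes 'Sample' is a regular column (as in the question's table), so it is moved into the index before calling to_dict; the name `result` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Sample": ["Sample 1", "Sample 2", "Sample 3", "Sample 4"],
    "Variable 1": [1, 3, 3, 2],
    "Variable 2": [3, 0, 3, 1],
    "Variable 3": [1, 3, 3, 3],
})

# Assumption: "Sample" is a column, so make it the index first.
# orient="index" then yields {sample: {column: value, ...}, ...}.
df_dict = df.set_index("Sample").to_dict(orient="index")

# For each sample, keep the column names whose value equals 3.
result = {
    sample: [col for col, value in row.items() if value == 3]
    for sample, row in df_dict.items()
}
print(result)
```

Without the set_index step, the outer keys would be the default integer index (0, 1, 2, ...) rather than the sample names.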
Upvotes: 4
Reputation: 51185
Setup
df = pd.DataFrame({'Sample': ['Sample 1', 'Sample 2', 'Sample 3', 'Sample 4'], 'Variable 1': [1,3,3,2], 'Variable 2': [3,0,3,1], 'Variable 3': [1,3,3,3]})
Use set_index with unstack:
s = df.set_index('Sample').unstack().reset_index()
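To see why the later steps refer to 'level_0' and 0, it may help to inspect the intermediate frame. This is a small sketch using the same sample data; the column names shown are what pandas assigns by default when the column level is unnamed.

```python
import pandas as pd

df = pd.DataFrame({
    "Sample": ["Sample 1", "Sample 2", "Sample 3", "Sample 4"],
    "Variable 1": [1, 3, 3, 2],
    "Variable 2": [3, 0, 3, 1],
    "Variable 3": [1, 3, 3, 3],
})

# unstack() on a DataFrame with a single-level index returns a Series keyed by
# (column name, Sample); reset_index() turns both key levels into columns.
s = df.set_index("Sample").unstack().reset_index()

# 'level_0' holds the original column names, 'Sample' the row labels,
# and the integer column 0 holds the values.
print(s.columns.tolist())
print(s.head())
```

Each row of s is one (variable, sample, value) triple, which is what makes the filter on s[0] and the groupby on 'Sample' possible.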
Then filter for values equal to 3, use groupby and apply, and finally to_dict:
s[s[0].eq(3)].groupby('Sample')['level_0'].apply(list).to_dict()
{'Sample 1': ['Variable 2'],
'Sample 2': ['Variable 1', 'Variable 3'],
'Sample 3': ['Variable 1', 'Variable 2', 'Variable 3'],
'Sample 4': ['Variable 3']}
Upvotes: 2