Reputation: 780
I am iterating over a groupby column in a pandas dataframe in Python 3.6 with the help of a for loop. The problem with this is that it becomes slow if I have a lot of data. This is my code:
import pandas as pd
dataDict = {}
for metric, df_metric in frontendFrame.groupby('METRIC'):  # Creates one frame per metric
    dataDict[metric] = df_metric.to_dict('records')  # Converts the frame to a list of row dicts
frontendFrame is a dataframe containing two columns: VALUE and METRIC. My end goal is to create a dictionary with one key per metric, holding all the rows that belong to it. I know this should be possible with lambda or map, but I can't get it working with multiple arguments. This is what I tried:
frontendFrame.groupby('METRIC').apply(lambda x: print(x))
How can I solve this and make my script faster?
Upvotes: 0
Views: 141
Reputation: 323276
If you do not need any calculation after the groupby, do not group the data at all; you can use .loc to select the rows you need:
s = frontendFrame.METRIC.unique()  # array of the distinct metric names
frontendFrame.loc[frontendFrame.METRIC == s[0]]  # rows belonging to the first metric
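To build the whole dictionary from the question with this .loc idea, something like the sketch below might work. The sample frame and its metric names (cpu, mem, disk) are made up for illustration, and valuesByMetric is a hypothetical name. Whether this beats the original loop depends on how many distinct metrics there are, since every metric needs its own pass over the frame, so it is worth timing on real data. The second variant is an assumption about what you need: if the lists of VALUE alone are enough, skipping to_dict('records') avoids building one dict per row.

```python
import pandas as pd

# Hypothetical sample data standing in for frontendFrame
frontendFrame = pd.DataFrame({
    "METRIC": ["cpu", "mem", "cpu", "mem", "disk"],
    "VALUE": [0.5, 0.7, 0.9, 0.2, 0.1],
})

# Variant 1: one boolean mask per distinct metric, no groupby involved.
# Produces the same structure as the loop in the question.
dataDict = {
    metric: frontendFrame.loc[frontendFrame["METRIC"] == metric].to_dict("records")
    for metric in frontendFrame["METRIC"].unique()
}

# Variant 2 (assumes only the VALUE column is needed per metric):
# collect the values of each group into a plain list.
valuesByMetric = frontendFrame.groupby("METRIC")["VALUE"].apply(list).to_dict()
```

With the sample frame, dataDict["cpu"] holds the two cpu rows as dicts, and valuesByMetric["cpu"] holds just their two values.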
Upvotes: 1