Reputation: 361
I have a script where I do munging with dataframes and extract data like the following:
times = pd.Series(df.loc[df['sy_x'].str.contains('AA'), ('t_diff')].quantile([.1, .25, .5, .75, .9]))
I want to add the resulting data from quantile() to a data frame with separate columns for each of those quantiles. Let's say the columns are:
ID pt_1 pt_2 pt_5 pt_7 pt_9
AA
BB
CC
How might I add the quantiles to each row of ID?
new_df = None
for index, value in times.items():
    for col in df[['pt_1', 'pt_2', 'pt_5', 'pt_7', 'pt_9']]:
        ...  # unfinished attempt
...but that feels wrong and not idiomatic. Should I be using loc or iloc? I have a couple more Series that I'll need to add to other columns not shown, but I think I can figure that out once I know
EDIT:
Some of the output of times looks like:
0.1 -0.5
0.25 -0.3
0.5 0.0
0.75 2.0
0.90 4.0
Thanks in advance for any insight.
Upvotes: 1
Views: 55
Reputation: 743
Try something like:
pd.DataFrame(times.values.T, index=times.keys())
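If the goal is one row per ID with one column per quantile, a possible sketch is to turn the Series into a one-row frame and rename the labels. The times values below are copied from the question's EDIT; the quantile_cols mapping and the row variable are names made up for illustration, not something from the question.

```python
import pandas as pd

# Quantile Series as shown in the question's EDIT (index = quantile level)
times = pd.Series([-0.5, -0.3, 0.0, 2.0, 4.0],
                  index=[0.1, 0.25, 0.5, 0.75, 0.9])

# Hypothetical mapping from quantile level to the question's column names
quantile_cols = {0.1: 'pt_1', 0.25: 'pt_2', 0.5: 'pt_5', 0.75: 'pt_7', 0.9: 'pt_9'}

# rename() relabels the Series index; to_frame('AA') makes a single column
# named 'AA'; .T flips it into a single row labelled 'AA'
row = times.rename(quantile_cols).to_frame('AA').T
row.index.name = 'ID'
print(row)
```

Rows built this way for the other IDs could then be stacked with pd.concat.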
Upvotes: 1
Reputation: 150735
IIUC, you want a groupby():
import numpy as np
import pandas as pd

# toy data
np.random.seed(1)
df = pd.DataFrame({'sy_x': np.random.choice(['AA', 'BB', 'CC'], 100),
                   't_diff': np.random.randint(0, 100, 100)})
# one row per sy_x, one column per quantile
df.groupby('sy_x').t_diff.quantile((0.1, .25, .5, .75, .9)).unstack(1)
Output:
0.10 0.25 0.50 0.75 0.90
sy_x
AA 16.5 22.25 57.0 77.00 94.5
BB 9.1 21.00 58.5 80.25 91.3
CC 9.7 23.25 40.5 65.75 84.1
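To end up with the column names from the question (ID, pt_1 ... pt_9), the unstacked result can be relabelled and its index reset. This is only a sketch on the same toy data; the levels, names and result variables are illustrative and not part of the original question.

```python
import numpy as np
import pandas as pd

# Same toy data as above
np.random.seed(1)
df = pd.DataFrame({'sy_x': np.random.choice(['AA', 'BB', 'CC'], 100),
                   't_diff': np.random.randint(0, 100, 100)})

levels = [0.1, 0.25, 0.5, 0.75, 0.9]                 # quantile levels
names = ['pt_1', 'pt_2', 'pt_5', 'pt_7', 'pt_9']     # assumed column names

# One row per sy_x group, one column per quantile level
result = df.groupby('sy_x')['t_diff'].quantile(levels).unstack()

# Relabel the quantile columns positionally, then expose the group key as 'ID'
result.columns = names
result = result.rename_axis('ID').reset_index()
print(result)
```

Assigning the column names positionally relies on the levels being listed in ascending order, since unstack() returns the quantile columns sorted.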
Upvotes: 1