Reputation: 147
I have a dataframe(df):
id CI VaR
0 1 0.600 1000
1 1 0.650 1100
2 1 0.700 1200
3 1 0.750 1300
4 2 0.600 2500
5 2 0.650 2600
6 2 0.700 2700
7 2 0.750 2800
8 3 0.600 1500
9 3 0.650 1600
10 3 0.700 1700
11 3 0.750 1800
i have to create a subset dataframe from this dataframe im doing this;
for col in range(1,4):
df2 = df1.loc[df1["id"]==col]
print(df2)
output:
id CI VaR
0 1 0.600 1000
1 1 0.650 1100
2 1 0.700 1200
3 1 0.750 1300
and
4 2 0.600 2500
5 2 0.650 2600
6 2 0.700 2700
7 2 0.750 2800
and
8 3 0.600 1500
9 3 0.650 1600
10 3 0.700 1700
11 3 0.750 1800
this will give me separate dataframe for id 1,2,3 Now i want to VaR value of all dataframe(1 ,2 and 3) and add all the value in the order they are and append to the dataframe of respective. like:
obj = 0
for col in range(1,4):
df2 = df1.loc[df1["id"]==col]
obj = obj + df1["VaR"] # error is here
print(df2)
But this is not working for me
i need ouput like ;
id CI VaR capital
0 1 0.600 1000 5000
1 1 0.650 1100 5300
2 1 0.700 1200 5600
3 1 0.750 1300 5900
capital value 5000 came from adding 1000 + 2500 + 1500 ( these are all first value of respecti id) capital value 5300 came from adding 1100 + 2600 + 1600 ( these are all second value of respecti id) and so on ... and i need for all the id's like;
4 2 0.600 2500 5000
5 2 0.650 2600 5300
6 2 0.700 2700 5600
7 2 0.750 2800 5900
and
8 3 0.600 1500 5000
9 3 0.650 1600 5300
10 3 0.700 1700 5600
11 3 0.750 1800 5900
Thanks for your time :)
Upvotes: 0
Views: 68
Reputation: 153460
Let's use groupby
with transform
:
df['capital'] = df.groupby(df.groupby('id').cumcount())['VaR'].transform('sum')
Output:
id CI VaR capital
0 1 0.60 1000 5000
1 1 0.65 1100 5300
2 1 0.70 1200 5600
3 1 0.75 1300 5900
4 2 0.60 2500 5000
5 2 0.65 2600 5300
6 2 0.70 2700 5600
7 2 0.75 2800 5900
8 3 0.60 1500 5000
9 3 0.65 1600 5300
10 3 0.70 1700 5600
11 3 0.75 1800 5900
Details:
cumcount
to get position in each grouptransform
Upvotes: 2
Reputation: 35626
An option via np.tile
and a different way to divide the DataFrame via np.arraysplit
:
(Assumption: All id groups are equal length, and the total number of groups is equal to the number of rows per group)
from pprint import pprint
import numpy as np
import pandas as pd
df = pd.DataFrame({
'id': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
'CI': [0.6, 0.65, 0.7, 0.75, 0.6, 0.65, 0.7, 0.75, 0.6, 0.65, 0.7, 0.75],
'VaR': [1000, 1100, 1200, 1300, 2500, 2600, 2700, 2800, 1500, 1600, 1700,
1800]
})
unique_count = df['id'].nunique()
df['capital'] = np.tile(
df.groupby(df.groupby("id").cumcount())["VaR"].sum(),
unique_count
)
dfs = np.array_split(df, unique_count)
pprint(dfs)
dfs
:
[ id CI VaR capital
0 1 0.60 1000 5000
1 1 0.65 1100 5300
2 1 0.70 1200 5600
3 1 0.75 1300 5900,
id CI VaR capital
4 2 0.60 2500 5000
5 2 0.65 2600 5300
6 2 0.70 2700 5600
7 2 0.75 2800 5900,
id CI VaR capital
8 3 0.60 1500 5000
9 3 0.65 1600 5300
10 3 0.70 1700 5600
11 3 0.75 1800 5900]
Upvotes: 2
Reputation: 195418
I hope I've understood your question right. If you need to repeat the firt, second, third... sums of values in each group:
vals = df.groupby(df.groupby("id").cumcount())["VaR"].sum()
df["capital"] = [*vals] * df["id"].nunique()
print(df)
Prints:
id CI VaR capital
0 1 0.60 1000 5000
1 1 0.65 1100 5300
2 1 0.70 1200 5600
3 1 0.75 1300 5900
4 2 0.60 2500 5000
5 2 0.65 2600 5300
6 2 0.70 2700 5600
7 2 0.75 2800 5900
8 3 0.60 1500 5000
9 3 0.65 1600 5300
10 3 0.70 1700 5600
11 3 0.75 1800 5900
Upvotes: 2