Reputation: 103
I have this similar dataframe as below:-
WELL RESV TYPE X1 Y1 X2 Y2 TD2
0 W1 A OP 100 250 500 -5 495
1 W2 B INJ 120 255 700 -7 695
2 W3 B OBS 140 260 900 -9 895
3 W4 B OP 160 265 1100 -11 1095
4 W5 A OBS 180 270 1300 -13 1295
5 W6 B INJ 200 275 1500 -15 1495
6 W7 A OBS 220 280 1700 -17 1695
7 W8 B INJ 240 285 1900 -19 1895
8 W9 A OP 260 290 2100 -21 2095
Then i start to split this dataframe with unique values of 'TYPE' and 'RESV' columns. First, i start with TYPE == 'OP' and RESV == 'A'. Then with this sub dataframe, i do some re-arranging the sub dataframe to a certain format and to_csv as below.
df= df[(df.TYPE == 'OP') & (df.RESV == 'A')]
df1 = df[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
df2 = df[['WELL', 'X2', 'Y2']]
df2.columns = ['WELL', 'X1', 'Y1']
df = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill').reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
for i, x in df.groupby('WELL'):
x.to_csv({}, + 'csv')
results is something like this
WELL RESV TYPE X1 Y1 TD2
0 W1 A OP 100 250 495.0
1 W1 A OP 500 -5 495.0
2 W9 A OP 260 290 2095.0
3 W9 A OP 2100 -21 2095.0
Instead of running this code several time, each time changing the TYPE and RESV to different unique values
df= df[(df.TYPE == 'OP') & (df.RESV == 'A')]
What i really want to achieve is do a groupby() i.e.
df_gb = df.groupby(['TYPE','RESV'])
And then do a loop/iteration over each group to do operation as i did above.
How do i use groupby in combination of the operation below, to do a loop through each group one at a time?
df1 = df[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
df2 = df[['WELL', 'X2', 'Y2']]
df2.columns = ['WELL', 'X1', 'Y1']
df = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill').reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
for i, x in df.groupby('WELL'):
x.to_csv({}, + 'csv')
Upvotes: 0
Views: 39
Reputation: 9019
You can use pd.concat()
and apply()
after renaming some columns:
def reformat(x):
return pd.concat([x[['WELL','X1','Y1','TD2']], x[['WELL','X2','Y2','TD2']].rename(columns={'X2': 'X1', 'Y2': 'Y1'})], axis=0).sort_values('WELL')
out = df.groupby(['TYPE','RESV']).apply(reformat).reset_index().drop('level_2', axis=1)
Yields:
TYPE RESV WELL X1 Y1 TD2
0 INJ B W2 120 255 695
1 INJ B W2 700 -7 695
2 INJ B W6 200 275 1495
3 INJ B W6 1500 -15 1495
4 INJ B W8 240 285 1895
5 INJ B W8 1900 -19 1895
6 OBS A W5 180 270 1295
7 OBS A W5 1300 -13 1295
8 OBS A W7 220 280 1695
9 OBS A W7 1700 -17 1695
10 OBS B W3 140 260 895
11 OBS B W3 900 -9 895
12 OP A W1 100 250 495
13 OP A W1 500 -5 495
14 OP A W9 260 290 2095
15 OP A W9 2100 -21 2095
16 OP B W4 160 265 1095
17 OP B W4 1100 -11 1095
Upvotes: 0
Reputation: 1220
Try this:
for name_grp, df_grp in df.groupby(["TYPE", "RESV"]):
df1 = df_grp[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
df2 = df_grp[['WELL', 'X2', 'Y2']]
df2.columns = ['WELL', 'X1', 'Y1']
df3 = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill')
df3 = df3.reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
for i, x in df3.groupby('WELL'):
x.to_csv(str(i) + '.csv')
Upvotes: 1