ooikiam
ooikiam

Reputation: 103

Loop through groups to do operation on

I have this similar dataframe as below:-

   WELL RESV TYPE X1    Y1  X2   Y2    TD2
0   W1  A    OP   100   250 500   -5    495
1   W2  B   INJ   120   255 700   -7    695
2   W3  B   OBS   140   260 900   -9    895
3   W4  B   OP    160   265 1100  -11   1095
4   W5  A   OBS   180   270 1300  -13   1295
5   W6  B   INJ   200   275 1500  -15   1495
6   W7  A   OBS   220   280 1700  -17   1695
7   W8  B   INJ   240   285 1900  -19   1895
8   W9  A   OP    260   290 2100  -21   2095

Then i start to split this dataframe with unique values of 'TYPE' and 'RESV' columns. First, i start with TYPE == 'OP' and RESV == 'A'. Then with this sub dataframe, i do some re-arranging the sub dataframe to a certain format and to_csv as below.

df= df[(df.TYPE == 'OP') & (df.RESV == 'A')]
df1 = df[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
df2 = df[['WELL', 'X2', 'Y2']]
df2.columns = ['WELL', 'X1', 'Y1']
df = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill').reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
for i, x in df.groupby('WELL'):
    x.to_csv({}, + 'csv')

results is something like this

   WELL RESV  TYPE  X1   Y1     TD2
0   W1  A     OP    100  250    495.0
1   W1  A     OP    500  -5     495.0
2   W9  A     OP    260  290    2095.0
3   W9  A     OP    2100 -21    2095.0

Instead of running this code several time, each time changing the TYPE and RESV to different unique values

df= df[(df.TYPE == 'OP') & (df.RESV == 'A')]

What i really want to achieve is do a groupby() i.e.

df_gb = df.groupby(['TYPE','RESV'])

And then do a loop/iteration over each group to do operation as i did above.

How do i use groupby in combination of the operation below, to do a loop through each group one at a time?

df1 = df[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
df2 = df[['WELL', 'X2', 'Y2']]
df2.columns = ['WELL', 'X1', 'Y1']
df = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill').reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
for i, x in df.groupby('WELL'):
    x.to_csv({}, + 'csv')

Upvotes: 0

Views: 39

Answers (2)

rahlf23
rahlf23

Reputation: 9019

You can use pd.concat() and apply() after renaming some columns:

def reformat(x):

    return pd.concat([x[['WELL','X1','Y1','TD2']], x[['WELL','X2','Y2','TD2']].rename(columns={'X2': 'X1', 'Y2': 'Y1'})], axis=0).sort_values('WELL')

out = df.groupby(['TYPE','RESV']).apply(reformat).reset_index().drop('level_2', axis=1)

Yields:

   TYPE RESV WELL    X1   Y1   TD2
0   INJ    B   W2   120  255   695
1   INJ    B   W2   700   -7   695
2   INJ    B   W6   200  275  1495
3   INJ    B   W6  1500  -15  1495
4   INJ    B   W8   240  285  1895
5   INJ    B   W8  1900  -19  1895
6   OBS    A   W5   180  270  1295
7   OBS    A   W5  1300  -13  1295
8   OBS    A   W7   220  280  1695
9   OBS    A   W7  1700  -17  1695
10  OBS    B   W3   140  260   895
11  OBS    B   W3   900   -9   895
12   OP    A   W1   100  250   495
13   OP    A   W1   500   -5   495
14   OP    A   W9   260  290  2095
15   OP    A   W9  2100  -21  2095
16   OP    B   W4   160  265  1095
17   OP    B   W4  1100  -11  1095

Upvotes: 0

Kai Aeberli
Kai Aeberli

Reputation: 1220

Try this:

for name_grp, df_grp in df.groupby(["TYPE", "RESV"]):
    df1 = df_grp[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
    df2 = df_grp[['WELL', 'X2', 'Y2']]
    df2.columns = ['WELL', 'X1', 'Y1'] 
    df3 = pd.concat([df1, df2], sort=True).sort_values(['WELL', 'TD2']).fillna(method='ffill')
    df3 = df3.reset_index(drop = True)[['WELL', 'RESV', 'TYPE', 'X1', 'Y1', 'TD2']]
    for i, x in df3.groupby('WELL'):
        x.to_csv(str(i) + '.csv')

Upvotes: 1

Related Questions