Shreya

Reputation: 151

Difference of one element with all other elements after Groupby

I have a data set as shown below:

Date	      Lon_s	Lat_s	HLON_cv	HLAT_cv
1853-11-09	31	-10.4	293.85	5.2
1853-11-09	302.3	3.6	290.15	4.9
1853-12-01	85.5	-7.5	84.62	-6.88
1853-12-01	85.5	-7.5	78.2	-6.83
1853-12-01	88	-8.6	84.62	-6.88
1853-12-01	88	-8.6	78.2	-6.83
1853-12-01	86.6	-7.8	84.62	-6.88
1853-12-01	86.6	-7.8	78.2	-6.83

For each Date, I want to take each element of Lon_s and subtract every HLON_cv value of that date from it. For example, for 1853-11-09:
Lon_s 31 - HLON_cv 293.85
Lon_s 31 - HLON_cv 290.15
Lon_s 302.3 - HLON_cv 293.85
Lon_s 302.3 - HLON_cv 290.15
and likewise for the corresponding Lat_s - HLAT_cv values.

I have used for group, values in df.groupby(df['Date']): to group by date, but I need help with the rest.

Upvotes: 0

Views: 66

Answers (2)

Aditya Bhattacharya

Reputation: 1014

Please try the following code and let me know if it works:

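import pandas as pd

frames = []  # one DataFrame of differences per date
for group, values in df.groupby('Date'):
    lon_diff = []; lat_diff = []
    # Subtract every HLON_cv of this date from every Lon_s of this date
    for i in range(len(values['Lon_s'])):
        for j in range(len(values['HLON_cv'])):
            lon_diff.append(values['Lon_s'].iloc[i] - values['HLON_cv'].iloc[j])

    # Same pairing, in the same order, for the latitudes
    for m in range(len(values['Lat_s'])):
        for n in range(len(values['HLAT_cv'])):
            lat_diff.append(values['Lat_s'].iloc[m] - values['HLAT_cv'].iloc[n])
    # Collect the piece under a new name so the input df is not overwritten
    frames.append(pd.DataFrame({"Date": group, "Lon_diff": lon_diff, "Lat_diff": lat_diff}))

# DataFrame.append was removed in pandas 2.0; concatenate the pieces instead
diff_df = pd.concat(frames, ignore_index=True)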

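This should create a DataFrame laid out as below. The rows only illustrate the format; the loop emits one row for every Lon_s/HLON_cv pair within a date: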

       Date   Lon_diff  Lat_diff
0  1853-11-09    -262.85    -15.60
1  1853-11-09    -259.15    -15.30
2  1853-12-01       0.88     -0.62
3  1853-12-01       7.30     -0.67
4  1853-12-01       0.88     -0.62
5  1853-12-01       7.30     -0.67
6  1853-12-01       0.88     -0.62
7  1853-12-01       7.30     -0.67
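
The same pairwise differences can probably be had without explicit loops. Below is a minimal sketch, assuming the column names from the question (with Lat_s capitalised) and a sample frame retyped from the posted data. It merges the frame with itself on Date, which pairs every Lon_s/Lat_s row with every HLON_cv/HLAT_cv row of the same date. The names left, right and pairs are only illustrative.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "Date": ["1853-11-09"] * 2 + ["1853-12-01"] * 6,
    "Lon_s": [31, 302.3, 85.5, 85.5, 88, 88, 86.6, 86.6],
    "Lat_s": [-10.4, 3.6, -7.5, -7.5, -8.6, -8.6, -7.8, -7.8],
    "HLON_cv": [293.85, 290.15, 84.62, 78.2, 84.62, 78.2, 84.62, 78.2],
    "HLAT_cv": [5.2, 4.9, -6.88, -6.83, -6.88, -6.83, -6.88, -6.83],
})

# Split the frame into the two sides of the subtraction
left = df[["Date", "Lon_s", "Lat_s"]]
right = df[["Date", "HLON_cv", "HLAT_cv"]]

# A many-to-many merge on Date yields every left/right pairing within a date
pairs = left.merge(right, on="Date")

pairs["Lon_diff"] = pairs["Lon_s"] - pairs["HLON_cv"]
pairs["Lat_diff"] = pairs["Lat_s"] - pairs["HLAT_cv"]

print(pairs[pairs["Date"] == "1853-11-09"])
```

Since the sample repeats some rows, calling drop_duplicates() on left and right before merging is one option if repeated pairs are unwanted.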

Upvotes: 1

Valdi_Bo

Reputation: 30971

Define the following function (with another function inside):

def myFun(grp):
    def myDiff(col1, col2):
        return col1.iloc[0] - col2
    return pd.DataFrame({'Lon_diff': myDiff(grp.Lon_s, grp.HLON_cv),
                         'Lat_diff': myDiff(grp.Lat_s, grp.HLAT_cv)})

It generates a DataFrame with 2 columns, Lon_diff and Lat_diff. Each holds the first Lon_s (or Lat_s) value of the group minus every HLON_cv (or HLAT_cv) value in that group.

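Then apply this function to each Date group and join the result to the original DataFrame:

# group_keys=False keeps the original row index, so the join aligns row by row
result = df.join(df.groupby('Date', group_keys=False).apply(myFun))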

The result is:

         Date  Lon_s  Lat_s  HLON_cv  HLAT_cv  Lon_diff  Lat_diff
0  1853-11-09   31.0  -10.4   293.85     5.20   -262.85    -15.60
1  1853-11-09  302.3    3.6   290.15     4.90   -259.15    -15.30
2  1853-12-01   85.5   -7.5    84.62    -6.88      0.88     -0.62
3  1853-12-01   85.5   -7.5    78.20    -6.83      7.30     -0.67
4  1853-12-01   88.0   -8.6    84.62    -6.88      0.88     -0.62
5  1853-12-01   88.0   -8.6    78.20    -6.83      7.30     -0.67
6  1853-12-01   86.6   -7.8    84.62    -6.88      0.88     -0.62
7  1853-12-01   86.6   -7.8    78.20    -6.83      7.30     -0.67
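
Because myDiff only uses the first Lon_s and Lat_s of each group, the same two columns can also be sketched with transform('first'). This does not depend on how groupby().apply() indexes its result, which has varied between pandas versions. The sample frame is retyped from the question, and first_lon and first_lat are illustrative names.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "Date": ["1853-11-09"] * 2 + ["1853-12-01"] * 6,
    "Lon_s": [31, 302.3, 85.5, 85.5, 88, 88, 86.6, 86.6],
    "Lat_s": [-10.4, 3.6, -7.5, -7.5, -8.6, -8.6, -7.8, -7.8],
    "HLON_cv": [293.85, 290.15, 84.62, 78.2, 84.62, 78.2, 84.62, 78.2],
    "HLAT_cv": [5.2, 4.9, -6.88, -6.83, -6.88, -6.83, -6.88, -6.83],
})

# First Lon_s / Lat_s of each date, broadcast back onto every row of that date
first_lon = df.groupby("Date")["Lon_s"].transform("first")
first_lat = df.groupby("Date")["Lat_s"].transform("first")

# Row-aligned subtraction; the original index is preserved throughout
result = df.assign(Lon_diff=first_lon - df["HLON_cv"],
                   Lat_diff=first_lat - df["HLAT_cv"])
print(result)
```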

Upvotes: 0
