Medhusalem

Reputation: 65

Lists/DataFrames - Running a function over all values in Python

I am stuck at the moment and don't really know how to solve this problem. I want to apply this calculation to a list/dataframe:

Distance interpolation - inverse distance weighted

The equation itself is not really the problem for me, I am able to easily solve it manually, but that wouldn't do with the amount of data I have.

So basically this is for calculating/approximating a new temperature value for a position a certain distance away from the corners of the square:


import pandas as pd
import numpy as np
import xarray as xr
import math

filepath = r'F:\Data\data.nc' # just the path to the file
obj = xr.open_dataset(filepath)
# This is where I get the coordinates for each of the corners of the square
# from the netcdf4 file

lat = 9.7398
lon = 51.2695
xlat = obj['XLAT'].values
xlon = obj['XLON'].values           
p_1 = [xlat[0,0], xlon[0,0]]
p_2 = [xlat[0,1], xlon[0,1]]
p_3 = [xlat[1,0], xlon[1,0]]
p_4 = [xlat[1,1], xlon[1,1]]

p_rect = [p_1, p_2, p_3, p_4]
p_orig = [lat, lon]

#=================================================
# Calculates the distance between the points
# d = sqrt((x2-x1)^2 + (y2-y1)^2)
#=================================================
distance = []
for coord in p_rect:
    distance.append(math.hypot(coord[0] - p_orig[0], coord[1] - p_orig[1]))

# to get the values for the key 'WS', for example:
a = obj['WS'].values[:,0,0,0] # Array of floats for the first values
b = obj['WS'].values[:,0,0,1] # Array of floats for the second values
c = obj['WS'].values[:,0,1,0] # Array of floats for the third values
d = obj['WS'].values[:,0,1,1] # Array of floats for the fourth values

From here on, I have no idea how to continue. Should I do this:

df = pd.DataFrame()
df['a'] = a
df['b'] = b
df['c'] = c
df['d'] = d

Should I then work with DataFrames and drop a, b, c and d once I have the needed values, or should I do the calculation with lists first and add only the result to the DataFrame? I am a bit lost.

The only thing I have come up with so far is what it would look like if I did it manually:

For i from 0 to the end of the lists (a, b, c and d all have the same length):

     1/a[i]^2*distance[0] + 1/b[i]^2*distance[1] + 1/c[i]^2*distance[2] + 1/d[i]^2*distance[3]
v =  ------------------------------------------------------------------------------------------
                    1/a[i]^2 + 1/b[i]^2 + 1/c[i]^2 + 1/d[i]^2

This is the first time I have had to do such a (at least for me) complex calculation on a list/DataFrame. I hope you can help me solve this problem or at least nudge me in the right direction.

PS: here is the link to the file: LINK TO FILE

Upvotes: 1

Views: 67

Answers (1)

Parfait

Reputation: 107652

Simply vectorize your calculations. With DataFrames you can run arithmetic operations directly on whole columns as if they were scalars to generate another column, df['v']. The code below assumes distance is a list of four scalars. Also remember that in Python ^ does not mean power (it is bitwise XOR); use ** instead.

df = pd.DataFrame({'a':a, 'b':b, 'c':c, 'd':d})

df['v'] = (1/df['a']**2 * distance[0] +
           1/df['b']**2 * distance[1] + 
           1/df['c']**2 * distance[2] + 
           1/df['d']**2 * distance[3]) / (1/df['a']**2 + 
                                          1/df['b']**2 + 
                                          1/df['c']**2 + 
                                          1/df['d']**2)
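
If a DataFrame is not otherwise needed, the same arithmetic also works on plain NumPy arrays. The sketch below stacks the four series into one 2-D array and broadcasts distance across the rows. The helper name weighted_value and the sample numbers are made up for illustration, and the expression is reproduced exactly as written in the question.

```python
import numpy as np

def weighted_value(values, distance):
    """Evaluate the question's expression row by row.

    values:   array of shape (n, 4); columns are a, b, c, d
    distance: sequence of four scalars
    """
    values = np.asarray(values, dtype=float)
    distance = np.asarray(distance, dtype=float)
    inv_sq = 1.0 / values**2  # 1/a^2, 1/b^2, 1/c^2, 1/d^2 for every row
    # distance (shape (4,)) broadcasts across the rows of inv_sq (shape (n, 4))
    return (inv_sq * distance).sum(axis=1) / inv_sq.sum(axis=1)

# Made-up sample data standing in for obj['WS'].values[:, 0, j, k]
a = np.array([1.0, 2.0])
b = np.array([1.0, 2.0])
c = np.array([1.0, 4.0])
d = np.array([1.0, 4.0])
distance = [1.0, 2.0, 3.0, 4.0]

v = weighted_value(np.column_stack([a, b, c, d]), distance)
```

The result is a 1-D array with one value per row, so it can still be assigned to a column afterwards with df['v'] = v if needed.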

The same DataFrame calculation can also be written in functional form using pandas Series binary operator methods. The chained calls below follow the order of operations (Parentheses --> Exponents --> Multiplication/Division --> Addition/Subtraction):

df['v'] = (df['a'].pow(2).pow(-1).mul(distance[0]) +
           df['b'].pow(2).pow(-1).mul(distance[1]) + 
           df['c'].pow(2).pow(-1).mul(distance[2]) + 
           df['d'].pow(2).pow(-1).mul(distance[3])) / (df['a'].pow(2).pow(-1) + 
                                                       df['b'].pow(2).pow(-1) + 
                                                       df['c'].pow(2).pow(-1) + 
                                                       df['d'].pow(2).pow(-1))
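
One thing that may be worth double-checking: in the usual definition of inverse distance weighting, the weights come from the distances (w_k = 1/d_k^2) and are applied to the values, which is the reverse of the roles in the expression above. If that is what is intended, a sketch could look like the following. The function name idw, the power parameter, and the sample coordinates and values are assumptions for illustration only, and the code assumes no distance is zero.

```python
import numpy as np

def idw(values, distance, power=2):
    """Inverse distance weighting: sum(w_k * z_k) / sum(w_k), with w_k = 1 / d_k**power.

    values:   array of shape (n, 4), one column per corner
    distance: four distances from the target point to the corners (all non-zero)
    """
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(distance, dtype=float) ** power
    # Matrix-vector product gives the weighted sum for every row at once
    return values @ weights / weights.sum()

# Hypothetical corner coordinates and target point
corners = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]])
target = np.array([1.0, 1.0])
distance = np.hypot(corners[:, 0] - target[0], corners[:, 1] - target[1])

# Two time steps of made-up values at the four corners
values = np.array([[1.0, 2.0, 3.0, 4.0],
                   [5.0, 5.0, 5.0, 5.0]])
v = idw(values, distance)
```

With the DataFrame from above, this could be applied as df['v'] = idw(df[['a', 'b', 'c', 'd']].to_numpy(), distance).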

Upvotes: 1
