Shalva Esakia

The Accuracy of Kriging Interpolation

I am attempting to interpolate the normal strain results in a wall-type structure (dimensions: 800 x 500 mm) from approximately 150 scattered known points distributed over the whole element (see the picture of the sensor layout).

I used universal kriging with the PyKrige toolkit, but the lowest error I achieved relative to the FEM results was 37%, which seems very high. This was with a Gaussian variogram. I also tried creating a custom variogram model, but the results seemed incorrect: the effective range was 839.42, the sill and nugget were both 0, and the error was even higher.
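How the percentage is computed can change the figure considerably when some reference strains are close to zero. The sketch below contrasts two common definitions on made-up values; the function names and the numbers are illustrative assumptions, not the calculation or data behind the 37% quoted above.

```python
import numpy as np

def mean_relative_error(pred, ref, eps=1e-12):
    """Mean of |pred - ref| / |ref| in percent; inflated by near-zero ref values."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return 100.0 * np.mean(np.abs(pred - ref) / np.maximum(np.abs(ref), eps))

def range_normalized_rmse(pred, ref):
    """RMSE divided by the range of the reference values, in percent."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    rmse = np.sqrt(np.mean((pred - ref) ** 2))
    return 100.0 * rmse / (ref.max() - ref.min())

# Hypothetical strain values (not taken from the post); the third
# reference value is close to zero on purpose
fem = np.array([1.0e-3, 5.0e-4, 1.0e-6, -2.0e-4])
kriged = np.array([1.1e-3, 4.5e-4, 2.0e-5, -2.2e-4])

mre = mean_relative_error(kriged, fem)
nrmse = range_normalized_rmse(kriged, fem)
print(f"mean relative error: {mre:.1f} %")
print(f"range-normalized RMSE: {nrmse:.1f} %")
```

A single near-zero reference value dominates the pointwise relative error, while the range-normalized RMSE stays small for the same predictions.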

Am I creating the variogram model incorrectly? Is kriging suitable for this case, or should I use simpler interpolation methods? Should I add anything else to the interpolation, such as a trend model, zonal kriging (as I have three distinct zones of high, medium and low strain), or additional drift_terms?

Here is my Python code:

import numpy as np
import pandas as pd
from pykrige.uk import UniversalKriging
import matplotlib.pyplot as plt
import skgstat as skg

# Load the known strain data and x, y locations from the Excel file using pandas
excel_file = 'Sensor-Results(Exx).xlsx'
data = pd.read_excel(excel_file)

# Extract x, y, and strain values from the DataFrame by using their names on top of the columns
x_known = data['X'].values
y_known = data['Y'].values
strain_known = data['Exx'].values

# Define the grid dimensions
grid_width = 800  # in mm
grid_height = 500  # in mm

# Define the grid division
width1 = 37.5  # in mm
width2 = 50  # in mm
width3 = 282.5  # in mm
width4 = 60  # in mm

# Generate points using the provided divisions (For creating the identical mesh to FEM model)
x1 = np.linspace(0, width1, 1)  # Grid on 0 location
x2 = np.linspace(width1, width1 + width2, 1)  # Grid on 37.5 location
x3 = np.linspace(width1 + width2, 370, 8)  # 8 grid lines from 87.5 to 370 (second one at 127.857)
x4 = np.linspace(400, width1 + width2 + width3 + width4, 2)
x5 = np.linspace(width1 + width2 + width3 + width4 + width3 / 7, width1 + width2 + width3 * 2 + width4, 7)
x6 = np.linspace(width1 + width2 * 2 + width3 * 2 + width4, grid_width, 2)

# Combine all x points
x_grid = np.unique(np.concatenate((x1, x2, x3, x4, x5, x6)))
y_grid = np.linspace(0, grid_height, 14)

# Create a meshgrid of the grid points
X, Y = np.meshgrid(x_grid, y_grid)

# Flatten the meshgrid arrays
x_flat = X.flatten()
y_flat = Y.flatten()

# Create empirical variogram
coords = np.vstack((x_known, y_known)).T
strain_known = strain_known.flatten()
V = skg.Variogram(coords, strain_known, n_lags=15, normalize=True, model='gaussian', estimator='dowd')
fig = V.plot(show=False)
plt.show()
print(V)

# Extract variogram parameters
variogram_model_parameters = V.parameters
variogram_model = 'gaussian'
print("Variogram Model Parameters:", variogram_model_parameters)

# Perform Universal Kriging interpolation for the full grid with linear drift
uk = UniversalKriging(
    x_known, y_known, strain_known,
    variogram_model=variogram_model,
    variogram_parameters=variogram_model_parameters,
    drift_terms=['regional_linear']
)
strain_interpolated, _ = uk.execute('grid', x_grid, y_grid)

# Overwrite the grid node nearest to each sensor with the measured value
# (the sensor does not necessarily lie exactly on that node)
for x, y, strain in zip(x_known, y_known, strain_known):
    xi = np.abs(x_grid - x).argmin()
    yi = np.abs(y_grid - y).argmin()
    strain_interpolated[yi, xi] = strain

# Create a DataFrame with all grid points and interpolated strain values
interpolated_df = pd.DataFrame({
    'X': x_flat,
    'Y': y_flat,
    'Strain': strain_interpolated.flatten()
})

# Sort the DataFrame into zigzag order: even-indexed grid columns keep
# their row order, odd-indexed columns are reversed
pieces = []
for idx, x_val in enumerate(x_grid):
    temp_df = interpolated_df[interpolated_df['X'] == x_val]
    pieces.append(temp_df if idx % 2 == 0 else temp_df.iloc[::-1])
interpolated_sorted = pd.concat(pieces)

# Save the interpolated data
interpolated_sorted.to_csv('24_06_21_Kriging-Variogram(Exx).txt', sep='\t', index=False, header=True)

# Plot the heat map
strain_grid = strain_interpolated.reshape(X.shape)
plt.figure(figsize=(10, 6))
plt.contourf(X, Y, strain_grid, cmap='rainbow')
plt.colorbar()

# Scatter plot with connecting lines
plt.scatter(x_known, y_known, color='black', s=10)
plt.plot(x_known, y_known, color='red', linewidth=1, linestyle='-')

plt.xlabel('X (mm)')
plt.ylabel('Y (mm)')
plt.title('Strain Heatmap Exx')
plt.show()
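The regional_linear drift used in the script assumes a planar trend in x and y. One way to see how much of the strain field such a plane accounts for is to fit it by least squares and look at the residuals, since a variogram fitted to raw values mixes the trend with the small-scale variability. This is a minimal sketch on synthetic data; the point coordinates, trend slope, and noise level are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the sensor data (assumption): 150 random points
# on an 800 x 500 mm wall, strain varying linearly in y plus noise
x = rng.uniform(0.0, 800.0, 150)
y = rng.uniform(0.0, 500.0, 150)
strain = 2e-6 * y + 1e-5 * rng.standard_normal(150)

# Least-squares fit of the plane a + b*x + c*y
A = np.column_stack([np.ones_like(x), x, y])
coef, *_ = np.linalg.lstsq(A, strain, rcond=None)

# Residuals are what remains once the planar trend is removed
residual = strain - A @ coef
explained = 1.0 - residual.var() / strain.var()

print(f"plane coefficients (a, b, c): {coef}")
print(f"share of variance explained by the plane: {explained:.3f}")
```

If the plane explains most of the variance, the residuals rather than the raw strains are the natural input for an empirical variogram. If it explains little, a linear drift is unlikely to help much.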

The data I am analyzing are sampled randomly from the FEM model that I created. The contour plot shows distinct zones of high, medium and low strain, changing gradually in the y direction. With the default universal kriging settings, the toolkit gives me different parameters from its own variogram analysis:

Sill: 4.94245937e-06

Range: 1.85944481e+02

Nugget: 1.04982139e-06

These results are quite different from those I get when I create a custom variogram. For example, for the Gaussian variogram the effective range, sill and nugget are printed as follows:

[839.4157043511434, 1.234774627684911e-05, 0]

What could be the reason for such a difference between the custom and default variogram results?
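One thing worth checking is the parameter order. The list printed above is in the order effective range, sill, nugget, whereas PyKrige's documentation describes a Gaussian parameter list as sill, range, nugget (or a dict with those keys). The sketch below shows what happens if the list is passed through unchanged. The helper names are hypothetical, and the Gaussian formula is written out by hand following PyKrige's documented convention as I understand it, so it should be verified against the installed version.

```python
import numpy as np

def skgstat_to_pykrige_params(params):
    """Reorder [effective_range, sill, nugget] into a dict keyed by name."""
    effective_range, sill, nugget = params
    return {"sill": sill, "range": effective_range, "nugget": nugget}

def gaussian_model(d, sill, range_, nugget):
    """Gaussian variogram with the 4/7 range scaling (assumed PyKrige convention)."""
    psill = sill - nugget
    return psill * (1.0 - np.exp(-(d ** 2) / (range_ * 4.0 / 7.0) ** 2)) + nugget

# Values printed in the post, in the order range, sill, nugget
skg_params = [839.4157043511434, 1.234774627684911e-05, 0]
d = np.array([0.0, 100.0, 400.0, 800.0])

# Intended reading: sill ~1.2e-05, range ~839 mm
p = skgstat_to_pykrige_params(skg_params)
intended = gaussian_model(d, p["sill"], p["range"], p["nugget"])

# Misread: the list is taken as [sill, range, nugget] without reordering
misread = gaussian_model(d, skg_params[0], skg_params[1], skg_params[2])

print("intended:", intended)
print("misread: ", misread)
```

With the misread order the model jumps straight to a sill of about 839 at any nonzero distance, which behaves like pure nugget. The two libraries may also scale the Gaussian range differently, and settings such as the estimator, the number of lags, and whether a nugget is fitted differ between the two fits, so even correctly ordered values need not match the default parameters.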
