Reputation: 1090
I am trying to estimate the Schott coefficients of a glass material given only its n_e (refractive index at the e line) and V_e (Abbe number at the e line).
The Schott formula is one way to represent the dispersion of a material, i.e. how its index of refraction (RI) differs at different wavelengths.
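For reference, the model in question is the six-term Schott power series n^2(λ) = a0 + a1*λ^2 + a2*λ^-2 + a3*λ^-4 + a4*λ^-6 + a5*λ^-8, with λ in micrometers; the code below fits n, i.e. the square root of this polynomial.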
In the figure above, the horizontal axis is the wavelength (in micrometers) and the vertical axis is the index of refraction (the figure is based on the glass type named KZFH1).
Because glass dispersion curves have a common shape (higher at shorter wavelengths, then tapering off) and the RIs at key points (the Fraunhofer lines) have a stable relationship, my thought is that I can use the definition of the Abbe number and the general relations between the Fraunhofer-line RIs to create some data points, and then use them to fit a curve:
import numpy as np
from scipy.optimize import curve_fit

# Definition of the Schott function (solved for n rather than n^2)
def _InvSchott(x, a0, a1, a2, a3, a4, a5):
    return np.sqrt(a0 + a1*x**2 + a2*x**(-2) + a3*x**(-4) + a4*x**(-6) + a5*x**(-8))

# Sample input, material parameters from a Leica Summilux patent
n = 1.7899
V = 48

# 6 wavelengths; Fraunhofer symbols are not used as variable names because there is another version that uses n_d and V_d
shorter = 479.99   # F' line
short = 486.13     # F line
neighbor = 546.07  # e line
middle = 587.56    # d line
longc = 643.85     # C' line
longer = 656.27    # C line

# Refractive indices at the corresponding wavelengths.
# The linear relations come from external regressions over 2000 glass materials
n_long = 0.984 * n + 0.0246         # n_C'
n_shorter = ((n - 1) / V) + n_long  # n_F', from the definition of the Abbe number
n_short = 1.02 * n - 0.0272         # n_F
n_neighbor = n                      # n_e
n_mid = 1.013 * n - 0.0264          # n_d
n_longer = 0.982 * n + 0.0268       # n_C

# The /1000 converts the wavelengths from nanometers to micrometers
x_data = np.array([longer, longc, middle, neighbor, short, shorter]) / 1000.0
y_data = np.array([n_longer, n_long, n_mid, n_neighbor, n_short, n_shorter])

# The initial guess values are averages over the 2000 Schott glasses
popt, pcov = curve_fit(_InvSchott, x_data, y_data, p0=[2.75118, -0.01055, 0.02357, 0.00084, -0.00003, 0.00001])
The x_data and y_data in this case are as follows:
[0.65627 0.64385 0.58756 0.54607 0.48613 0.47999]
[1.7844818 1.7858616 1.7867687 1.7899 1.798498 1.80231785]
Running this gives the warning OptimizeWarning: Covariance of the parameters could not be estimated, and the fit result is nothing but [inf inf inf inf inf inf].
I know this question has been asked a lot, but I have not found a solution that works in this case yet. Six data points is certainly a bit sparse, but it does satisfy the minimum, and the Schott function is continuous, so I cannot figure out which part went wrong.
TLDR: How do I find the coefficients for the function
def _InvSchott(x, a0, a1, a2, a3, a4, a5):
    return np.sqrt(a0 + a1*x**2 + a2*x**(-2) + a3*x**(-4) + a4*x**(-6) + a5*x**(-8))
so that it fits the data below:
x: [0.65627 0.64385 0.58756 0.54607 0.48613 0.47999]
y: [1.7844818 1.7858616 1.7867687 1.7899 1.798498 1.80231785]
Upvotes: 1
Views: 84
Reputation: 15273
Don't use sqrt during fitting, and don't fit this as a nonlinear model. Fit it as a linear model:
import numpy as np
from matplotlib import pyplot as plt

# Powers of lambda in the Schott polynomial: 2, 0, -2, -4, -6, -8
schott_powers = np.arange(2, -9, -2)


def inv_schott(lambd: np.ndarray, a: np.ndarray) -> np.ndarray:
    return np.sqrt(inv_schott_squared(lambd, a))


def inv_schott_squared(lambd: np.ndarray, a: np.ndarray) -> np.ndarray:
    terms = np.power.outer(lambd, schott_powers)
    return terms @ a


def demo() -> None:
    # Sample input, material parameters from a Leica Summilux patent
    n = 1.7899
    V = 48

    # 6 wavelengths; Fraunhofer symbols are not used because there is another version that uses n_d and V_d
    shorter = 479.99
    short = 486.13
    neighbor = 546.07
    middle = 587.56
    longc = 643.85
    longer = 656.27

    # Refractive indices at the corresponding wavelengths.
    # The linear relations come from external regressions over 2000 glass materials
    n_long = 0.984*n + 0.0246       # n_C'
    n_shorter = (n - 1)/V + n_long  # n_F', from the definition of the Abbe number
    n_short = 1.02*n - 0.0272       # n_F
    n_neighbor = n                  # n_e
    n_mid = 1.013*n - 0.0264        # n_d
    n_longer = 0.982*n + 0.0268     # n_C

    lambda_nm = np.array((longer, longc, middle, neighbor, short, shorter))
    lambda_um = lambda_nm*1e-3
    n_all = np.array((n_longer, n_long, n_mid, n_neighbor, n_short, n_shorter))

    # Linear least squares on n^2 against the Schott basis terms
    a, residuals, rank, singular = np.linalg.lstsq(
        a=np.power.outer(lambda_um, schott_powers),
        b=n_all**2, rcond=None,
    )

    fig, ax = plt.subplots()
    lambda_hires = np.linspace(start=lambda_um.min(), stop=lambda_um.max(), num=501)
    ax.scatter(lambda_um, n_all, label='experiment')
    ax.plot(lambda_hires, inv_schott(lambda_hires, a), label='fit')
    ax.legend()
    plt.show()


if __name__ == '__main__':
    demo()
You can observe the effect of truncating the polynomial order, though whether that's advisable is very difficult to say unless you produce more data:
import numpy as np
from matplotlib import pyplot as plt


def inv_schott(lambd: np.ndarray, a: np.ndarray, powers: np.ndarray) -> np.ndarray:
    return np.sqrt(inv_schott_squared(lambd, a, powers))


def inv_schott_squared(lambd: np.ndarray, a: np.ndarray, powers: np.ndarray) -> np.ndarray:
    terms = np.power.outer(lambd, powers)
    return terms @ a


def demo() -> None:
    # Sample input, material parameters from a Leica Summilux patent
    n = 1.7899
    V = 48

    # 6 wavelengths; Fraunhofer symbols are not used because there is another version that uses n_d and V_d
    shorter = 479.99
    short = 486.13
    neighbor = 546.07
    middle = 587.56
    longc = 643.85
    longer = 656.27

    # Refractive indices at the corresponding wavelengths.
    # The linear relations come from external regressions over 2000 glass materials
    n_long = 0.984*n + 0.0246       # n_C'
    n_shorter = (n - 1)/V + n_long  # n_F', from the definition of the Abbe number
    n_short = 1.02*n - 0.0272       # n_F
    n_neighbor = n                  # n_e
    n_mid = 1.013*n - 0.0264        # n_d
    n_longer = 0.982*n + 0.0268     # n_C

    lambda_nm = np.array((longer, longc, middle, neighbor, short, shorter))
    lambda_um = lambda_nm*1e-3
    n_all = np.array((n_longer, n_long, n_mid, n_neighbor, n_short, n_shorter))

    fig, ax = plt.subplots()
    lambda_hires = np.linspace(start=lambda_um.min(), stop=lambda_um.max(), num=501)
    ax.scatter(lambda_um, n_all, label='experiment')

    # Refit with the polynomial truncated at successively lower powers
    for lowest_power in range(0, -9, -2):
        powers = np.arange(2, lowest_power - 1, -2)
        a, residuals, rank, singular = np.linalg.lstsq(
            a=np.power.outer(lambda_um, powers),
            b=n_all**2, rcond=None,
        )
        ax.plot(lambda_hires, inv_schott(lambda_hires, a, powers), label=f'powers to {lowest_power}')

    ax.legend()
    plt.show()


if __name__ == '__main__':
    demo()
Upvotes: 3
Reputation: 2749
There are several ways this can go wrong. Firstly, you are trying to determine 6 unknown parameters a0, a1, ..., a5 from 6 data points as constraints. This can result in a fit that hits every point exactly but oscillates wildly in between each one to do so (although that isn't what actually happens here). A minimal sketch of the exact-interpolation case follows.
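Linearised in n^2, six points and six powers give a square 6x6 system, so in principle an exact interpolant exists (a sketch using the question's data; whether it behaves between the points is another matter):

import numpy as np

# The question's wavelengths (um) and refractive indices
x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])
y = np.array([1.7844818, 1.7858616, 1.7867687, 1.7899, 1.798498, 1.80231785])

# Square design matrix: powers 0, 2, -2, -4, -6, -8 of the wavelength
A = np.power.outer(x, [0, 2, -2, -4, -6, -8])
coeffs = np.linalg.solve(A, y**2)  # hits every point exactly, in principle; ill-conditioning makes it fragile
print(coeffs)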
Sqrt is undefined if its argument becomes negative, and the sum of squares then goes haywire. This can happen if the starting guess is too far away from the real solution, or if an intermediate model step is. I provoked such bad behaviour by putting a guess of [0,0,0,0,0,0] into my solver, and it failed the same way.
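A minimal sketch of that failure mode with the question's model and data (exact behaviour may vary across SciPy versions):

import numpy as np
from scipy.optimize import curve_fit

def _InvSchott(x, a0, a1, a2, a3, a4, a5):
    return np.sqrt(a0 + a1*x**2 + a2*x**(-2) + a3*x**(-4) + a4*x**(-6) + a5*x**(-8))

x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])
y = np.array([1.7844818, 1.7858616, 1.7867687, 1.7899, 1.798498, 1.80231785])

# An all-zero guess starts the search right on the edge of sqrt's domain;
# the first step into negative territory yields NaN residuals and the fit derails.
popt, pcov = curve_fit(_InvSchott, x, y, p0=[0, 0, 0, 0, 0, 0])
print(popt)  # typically garbage, with RuntimeWarning/OptimizeWarning along the way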
I made a creative mistake at first which protected me from the weird behaviour that you were seeing. Initially I fitted

a0 + a1*x^-2 + a2*x^-4 + a3*x^-6 + ...

and found that with my solver it was fairly well behaved. Then, as I was writing this answer, I noticed that your fitted function is actually:

a0 + a1*x^2 + a2*x^-2 + a3*x^-4 + ...
The a1 term in x^2 (increasing with wavelength) causes severe numerical instability, since you can create a linear combination of it and all of the other x^-2n terms (decreasing with wavelength) that is approximately zero at all of the data points you have as constraints. This degenerate linear combination seems to be able to kill several classic optimisation algorithms.
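One way to see the near-degeneracy (a sketch using the six wavelengths from the question) is the conditioning of the design matrix once the x^2 column is included:

import numpy as np

x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])

# Columns: 1, x^2, x^-2, x^-4, x^-6, x^-8 evaluated at the six wavelengths
A = np.power.outer(x, [0, 2, -2, -4, -6, -8])
print(np.linalg.cond(A))                   # very large: the basis is close to degenerate on this narrow range
print(np.linalg.svd(A, compute_uv=False))  # note the tiny trailing singular value(s)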
After setting a1 = 0 and starting from [3,0,0,0,0,0], it had no trouble finding a solution, but the optimum solution based on these data is somewhat ad hoc. BTW, the first and last points of your dataset are not consistent with the polynomial model being fitted (i.e. you need a better model).
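A sketch of that variant (an assumed reconstruction, not the answerer's exact solver): drop the x^2 term and start a0 near n_e^2:

import numpy as np
from scipy.optimize import curve_fit

def schott_no_a1(x, a0, a2, a3, a4, a5):
    # Schott form with the troublesome a1*x^2 term removed
    return np.sqrt(a0 + a2*x**(-2) + a3*x**(-4) + a4*x**(-6) + a5*x**(-8))

x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])
y = np.array([1.7844818, 1.7858616, 1.7867687, 1.7899, 1.798498, 1.80231785])

# Starting a0 at ~3 (roughly n_e^2) keeps sqrt's argument positive throughout
popt, pcov = curve_fit(schott_no_a1, x, y, p0=[3, 0, 0, 0, 0])
print(popt)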
This probably isn't too surprising since any dispersion relation must necessarily contain poles and should probably be modelled as a rational polynomial.
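For illustration, one standard rational model with explicit poles is the two-term Sellmeier equation, n^2 = 1 + B1*λ^2/(λ^2 - C1) + B2*λ^2/(λ^2 - C2); a sketch (the starting guesses are assumptions, roughly one UV and one IR resonance):

import numpy as np
from scipy.optimize import curve_fit

def sellmeier2(lam, b1, c1, b2, c2):
    # Two-term Sellmeier dispersion model, lam in micrometers
    lam2 = lam**2
    return np.sqrt(1 + b1*lam2/(lam2 - c1) + b2*lam2/(lam2 - c2))

x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])
y = np.array([1.7844818, 1.7858616, 1.7867687, 1.7899, 1.798498, 1.80231785])

popt, pcov = curve_fit(sellmeier2, x, y, p0=[2.0, 0.01, 1.0, 100.0], maxfev=20000)
print(popt)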
The wide range of parameter sets that give a (locally optimal) fit for the original problem as asked is worrying; the best I found are listed below. I also took the liberty of fitting fewer free parameters to the equation. I arbitrarily set the coefficient of x^2 to zero, since it only ever made things worse and caused numerical instability in some solvers (including mine). Certain combinations were more sensitive to the initial guess than others. NB: your a1 = 0 in all cases.
| N | chisq | a0 | a2 | a3 | a4 | a5 | a6 |
|---|-------|----|----|----|----|----|----|
| 1 | 0.000271761 | 3.208772379 | | | | | |
| 2 | 1.55256E-05 | 3.113545357 | 0.029229403 | | | | |
| 3 | 3.29237E-06 | 3.246393531 | -0.054982546 | 0.012571877 | | | |
| 4 | 3.3514E-06 | 3.219760977 | -0.033412544 | 0.007112368 | 0.00043505 | | |
| 5 | 3.00226E-06 | 3.195383324 | -0.003670169 | -0.003906433 | 0.001628003 | 1.6941E-05 | |
| 6 | 3.39075E-06 | 3.195301293 | -0.000914088 | -0.001697281 | -0.002116599 | 0.001330161 | -0.00013867 |
| 6 | 2.93304E-06 | 3.196308907 | -0.009172953 | -0.00027767 | 0.001019308 | -2.69302E-05 | 1.29025E-05 |
| 6 | 2.07634E-06 | 3.052241994 | 0.079628152 | -0.000355528 | -0.004888907 | 1.23345E-05 | 0.000162729 |
| 6 | 1.92E-06 | 3.137812905 | 0.004385607 | 0.006955285 | 0.005498906 | -0.003516277 | 0.000491416 |
Note that the solution is sensitive to the initial guess provided and can easily get trapped at a local optimum, something which becomes a big problem when you try to fit too many parameters to not much data. Chi-squared here is the sum of the squares of the residuals with equal weighting. If I had to choose a model based on these data, I would go for N=3, which fits all the data pretty well (in as much as this model ever can). Here is the graph:

[Graph: the six data points with the N=3 model fit.]

N=4 was slightly worse than N=3.
N=5 was very sensitive to the choice of starting guess and could easily go wrong (as in fail with NaNs). N=6 was less inclined to go wrong, but could yield a wide range of possible "solutions" depending on the exact starting guess, some of which were very slight improvements (but not statistically significant). Note how coefficient a5 thrashes about. Each solution found was a stable minimum, in the sense that once found the optimiser was unable to move away from it.
You should always do some sort of basic ANOVA on any model fit and use the smallest number of model parameters necessary to fully explain the observed data. Likewise, a plot of the data, model fit and residuals can be very informative. It is all too common for people to overfit noisy data.
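As a sketch of that diagnostic, here is a fit-plus-residuals plot for the N=3 model from the table above (assuming, per the original Schott form, that a2 and a3 multiply x^-2 and x^-4):

import numpy as np
from matplotlib import pyplot as plt

x = np.array([0.65627, 0.64385, 0.58756, 0.54607, 0.48613, 0.47999])
y = np.array([1.7844818, 1.7858616, 1.7867687, 1.7899, 1.798498, 1.80231785])

# N=3 coefficients as reported in the table above
a0, a2, a3 = 3.246393531, -0.054982546, 0.012571877

def model(lam):
    return np.sqrt(a0 + a2*lam**(-2) + a3*lam**(-4))

lam_hires = np.linspace(x.min(), x.max(), 501)
fig, (top, bottom) = plt.subplots(2, sharex=True)
top.scatter(x, y, label='data')
top.plot(lam_hires, model(lam_hires), label='N=3 fit')
top.legend()
bottom.axhline(0, color='grey', lw=0.5)
bottom.scatter(x, y - model(x))  # residuals expose structure a single chi-squared number hides
bottom.set_xlabel('wavelength (um)')
plt.show()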
Upvotes: 2