Reputation: 91
I am having some issues with some fairly simple code I have written. I have 4 sets of data and want to generate polynomial best-fit lines using numpy polyfit. Three of the data sets yield numbers when using polyfit, but the third data set yields NaN. Below is the code and the printout. Any ideas?
Code:
import numpy as np
### The 'ind_#' and 'dep_#' names are the lists of data. The code below converts them into numpy arrays that can then be used to generate a polynomial best-fit line. ###
### Note: np.float was removed in NumPy 1.24; the builtin float is equivalent. ###
ind_1=np.array(ind_1, float)
dep_1=np.array(dep_1, float)
x_1=np.arange(min(ind_1)-1, max(ind_1)+1, .01)
ind_2=np.array(ind_2, float)
dep_2=np.array(dep_2, float)
x_2=np.arange(min(ind_2)-1, max(ind_2)+1, .01)
ind_3=np.array(ind_3, float)
dep_3=np.array(dep_3, float)
x_3=np.arange(min(ind_3)-1, max(ind_3)+1, .01)
ind_4=np.array(ind_4, float)
dep_4=np.array(dep_4, float)
x_4=np.arange(min(ind_4)-1, max(ind_4)+1, .01)
### The code below prints the arrays generated above, followed by the result of polyfit, which is normally the list of polynomial coefficients. For the third data set, every coefficient prints as NaN. ###
print(ind_1)
print(dep_1)
print(np.polyfit(ind_1,dep_1,2))
print(ind_2)
print(dep_2)
print(np.polyfit(ind_2,dep_2,2))
print(ind_3)
print(dep_3)
print(np.polyfit(ind_3,dep_3,2))
print(ind_4)
print(dep_4)
print(np.polyfit(ind_4,dep_4,2))
Print out:
[ 1.405 1.871 2.713 ..., 5.367 5.404 2.155]
[ 0.274 0.07 0.043 ..., 0.607 0.614 0.152]
[ 0.01391925 -0.00950728 0.14803846]
[ 0.9760001 2.067 8.8 ..., 1.301 1.625 2.007 ]
[ 0.219 0.05 0.9810001 ..., 0.163 0.161 0.163 ]
[ 0.00886807 -0.00868727 0.17793324]
[ 1.143 0.9120001 2.162 ..., 2.915 2.865 2.739 ]
[ 0.283 0.3 0.27 ..., 0.227 0.213 0.161]
[ nan nan nan]
[ 0.167 0.315 1.938 ..., 2.641 1.799 2.719]
[ 0.6810001 0.7140001 0.309 ..., 0.283 0.313 0.251 ]
[ 0.00382331 0.00222269 0.16940372]
Why are the polyfit coefficients for the third case listed as NaN? All the data sets contain the same type of data, and the code is the same for each. Please help.
Upvotes: 9
Views: 9873
Reputation: 23492
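Just looked at your data. This is happening because you have a NaN in dep_3 (element 713). That single NaN propagates through the least-squares computation, which is why all three coefficients come out as NaN. You can make sure that you only use finite values in the fit like this: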
idx = np.isfinite(ind_3) & np.isfinite(dep_3)
print(np.polyfit(ind_3[idx], dep_3[idx], 2))
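To see the masking approach work end to end, here is a minimal, self-contained sketch. The arrays x and y are synthetic stand-ins invented for illustration (they are not the asker's ind_3 and dep_3), and the quadratic coefficients are arbitrary. The unmasked fit is deliberately not called: in the question it returned NaN, but I believe some NumPy builds raise an error from the least-squares solver instead.

```python
import numpy as np

# Synthetic stand-ins for ind_3 / dep_3 (assumed data, not from the question).
x = np.linspace(0.0, 5.0, 50)
y = 0.5 * x**2 - 1.0 * x + 2.0   # an exact quadratic, so the fit is easy to check
y[7] = np.nan                     # inject one bad value, like element 713 of dep_3

# Boolean mask that is True only where BOTH arrays hold finite numbers.
idx = np.isfinite(x) & np.isfinite(y)

# Fit a degree-2 polynomial using only the finite pairs.
coeffs = np.polyfit(x[idx], y[idx], 2)
print(coeffs)
```

Masking both arrays with the same idx keeps the x and y values paired, which matters because polyfit requires the two inputs to have equal length.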
As for finding bad values in large datasets, numpy makes that really easy. You can find their indices like this:
print(np.where(~np.isfinite(dep_3)))
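Note that np.where with a single argument returns a tuple of index arrays, one per dimension. For a 1-D array, np.flatnonzero gives the same positions as a plain array, which can be easier to read. A small sketch, using made-up values in place of dep_3:

```python
import numpy as np

# Made-up values standing in for dep_3 (assumed data, not from the question).
dep = np.array([0.283, 0.3, np.nan, 0.27, np.inf, 0.161])

# Positions of every non-finite entry (NaN, +inf, or -inf) as a flat index array.
bad = np.flatnonzero(~np.isfinite(dep))
print(bad)        # the positions of the bad entries
print(dep[bad])   # the offending values themselves

# Count only the NaNs, if infinities should be treated separately.
n_nan = int(np.isnan(dep).sum())
print(n_nan)
```

np.isfinite flags infinities as well as NaN, so it is the safer check before a fit; np.isnan is useful when you only want to know how many values are missing.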
Upvotes: 21