The following code is meant to calculate the IQR of a dataframe column for follow-on regression in an ML data analysis. The code worked at first, but then I had to upgrade Anaconda to get libraries. The code is:
#Checking different percentiles
pd.DataFrame(MyData['Age']).describe(percentiles=(1,0.99,0.9,0.75,0.5,0.3,0.1,0.01))
The code for the IQR:
#Checking outliers by definition and treating outliers
#Getting median age
Age_col_df = pd.DataFrame(MyData['Age'])
Age_median = Age_col_df.median()
#Getting IQR of Age column
Q3 = Age_col_df.quantile(q=0.75)
Q1 = Age_col_df.quantile(q=0.25)
IQR = Q3-Q1
#Deriving boundaries of Outliers
IQR_LL = int(Q1 - 1.5*IQR)
IQR_UL = int(Q3 + 1.5*IQR)
#Finding and treating outliers - both lower and upper end
MyData.loc[MyData['Age']>IQR_UL , 'Age'] = int(Age_col_df.quantile(q=0.90))
MyData.loc[MyData['Age']<IQR_LL , 'Age'] = int(Age_col_df.quantile(q=0.01))
The warnings that I get are:
C:\Users\danie\AppData\Local\Temp\ipykernel_19008\1214848077.py:13: FutureWarning: Calling int on a single element Series is deprecated and will raise a TypeError in the future. Use int(ser.iloc[0]) instead
C:\Users\danie\AppData\Local\Temp\ipykernel_19008\1214848077.py:14: FutureWarning: Calling int on a single element Series is deprecated and will raise a TypeError in the future. Use int(ser.iloc[0]) instead
C:\Users\danie\AppData\Local\Temp\ipykernel_19008\1214848077.py:17: FutureWarning: Calling int on a single element Series is deprecated and will raise a TypeError in the future. Use int(ser.iloc[0]) instead
C:\Users\danie\AppData\Local\Temp\ipykernel_19008\1214848077.py:18: FutureWarning: Calling int on a single element Series is deprecated and will raise a TypeError in the future. Use int(ser.iloc[0]) instead
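For context, the warnings seem to come from `Age_col_df` being a one-column DataFrame: `DataFrame.quantile(q=...)` with a scalar `q` returns a one-element Series, and `int()` is then called on that Series. One possible way around this, sketched below, is to work on the `Age` column as a Series so that `quantile` returns a plain scalar. The sample values in `MyData` are made up for illustration, and the lower-case names (`age`, `q1`, `too_high`, and so on) are hypothetical, not from the original code. Following the warning's own suggestion, `int(Q1.iloc[0] - 1.5*IQR.iloc[0])` on the existing DataFrame-based code should also work.

```python
import pandas as pd

# Hypothetical stand-in for MyData; the real data is not shown in the question.
MyData = pd.DataFrame({"Age": [22, 25, 27, 30, 31, 33, 35, 38, 40, 95]})

# Selecting the column gives a Series, so quantile(q) returns a scalar
# instead of a one-element Series.
age = MyData["Age"]

q1 = age.quantile(0.25)
q3 = age.quantile(0.75)
iqr = q3 - q1

# int() now receives a float, so no FutureWarning is triggered.
iqr_ll = int(q1 - 1.5 * iqr)
iqr_ul = int(q3 + 1.5 * iqr)

# Replacement values, also taken as scalars.
upper_fill = int(age.quantile(0.90))
lower_fill = int(age.quantile(0.01))

# Build both masks before modifying the column.
too_high = age > iqr_ul
too_low = age < iqr_ll

MyData.loc[too_high, "Age"] = upper_fill
MyData.loc[too_low, "Age"] = lower_fill

print(iqr_ll, iqr_ul)
print(MyData["Age"].tolist())
```

Computing the masks up front is a design choice of this sketch, so that the second replacement does not depend on the result of the first.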