Outlier detection with z-score, but

Question

I wrote some Python code for outlier detection using the z-score method. My data and code are below.

import numpy as np
import pandas as pd
from scipy import stats

data = [5, 10, 15, 20, 25, 30, 36, 22]
data.append(180)
data = pd.DataFrame(data, columns=["Data"])
z = np.abs(stats.zscore(data))
print(z)
print(np.where(z > 1.5))

I wrote this code to detect outliers. Specifically, I wanted to get the indices of the values whose z-score is higher than 1.5, but I think something is wrong with the output.

Data
0  0.649600
1  0.551506
2  0.453412
3  0.355318
4  0.257224
5  0.159130
6  0.041417
7  0.316080
8  2.783688
(array([8], dtype=int64), array([0], dtype=int64))

The z-score of the element at index 8 is higher than 1.5, and it appears in the output, which is what I expect. But 0 also appears in the output, and the z-score at index 0 is only 0.64. What am I doing wrong?

pr94 · Accepted Answer

You could do something like this:

import numpy as np
from scipy import stats

data =[5,10,15,20,25,30,36,22]
data.append(180)

z = stats.zscore(data)

np.where(z > 1.5)[0]

output:

array([8])
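As for why the original output contained a 0: the most likely explanation is that the z-scores computed from a one-column DataFrame are two-dimensional (9 rows, 1 column), and for a 2-D input np.where returns a tuple of two arrays, the row indices and the column indices. The second array, [0], would then be the column index of the match, not a claim that row 0 is an outlier. The sketch below illustrates this; the names values, df, z_2d, z_1d and outlier_idx are only illustrative, and np.abs is kept (as in the question) so that large negative z-scores would also be flagged.

```python
import numpy as np
import pandas as pd
from scipy import stats

values = [5, 10, 15, 20, 25, 30, 36, 22, 180]
df = pd.DataFrame(values, columns=["Data"])

# A one-column DataFrame becomes a 2-D array of shape (9, 1),
# so the z-scores are 2-D as well.
z_2d = np.abs(stats.zscore(df.to_numpy()))

# For 2-D input, np.where returns (row_indices, column_indices).
rows, cols = np.where(z_2d > 1.5)
print(rows)  # row positions of the values above the threshold
print(cols)  # column positions (only column 0 exists here)

# Working on the single column gives 1-D z-scores,
# so np.where returns a one-element tuple and [0] is the index array.
z_1d = np.abs(stats.zscore(df["Data"].to_numpy()))
outlier_idx = np.where(z_1d > 1.5)[0]
print(outlier_idx)
```

If the data has to stay in a DataFrame, taking the first element of the tuple (the row indices) from the original np.where call should give the same row positions.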
