Implement numpy covariance matrix from scratch

Question

I'm trying to emulate the np.cov function by implementing the covariance matrix from scratch. However, my code doesn't give the same output as np.cov.

Code:

import math

import numpy as np
import pandas as pd

df = pd.read_csv('C:/Users/User/Downloads/Admission_Predict.csv')
X = df.values
N, M = X.shape

means = np.zeros(M)  # M many of them
stdevs = np.zeros(M)
Xcoeff = np.zeros((M, M))

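# Mean and sample standard deviation of each column
for i in range(M):
    means[i] = np.sum(X[:, i]) / N
    stdevs[i] = math.sqrt(sum(pow(x - means[i], 2) for x in X[:, i]) / (N - 1))

# Covariance matrix: center every column, then average the cross-products
mat0 = X - means  # broadcasting subtracts each column's mean from that column
covariance = (mat0.T @ mat0) / (N - 1)  # M x M, with N-1 in the denominator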

Desired matrix values

print(np.cov(df))

> [[14128.00654107 13533.16488393 13222.07435357 ... 13831.92691786
>   13050.78170893 13961.07189821]
>  [13533.16488393 12968.32105536 12670.19783929 ... 13249.25808929
>   12505.84390893 13372.93946964]
>  [13222.07435357 12670.19783929 12379.07033571 ... 12944.65915
>   12218.34000357 13065.526925  ]
>  ...
>  [13831.92691786 13249.25808929 12944.65915    ... 13542.10545
>   12777.00158214 13668.54191786]
>  [13050.78170893 12505.84390893 12218.34000357 ... 12777.00158214
>   12060.0142125  12896.28555179]
>  [13961.07189821 13372.93946964 13065.526925   ... 13668.54191786
>   12896.28555179 13796.19808393]]

My output matrix values

print(covariance)

> [ 3.47493270e+02  1.17319616e+02  2.64636910e+00  2.98987496e+00
>   3.04758394e+00  8.70463072e+00 -1.45646482e-01  4.87503509e-02]

Answers (1)
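
One likely reason for the mismatch, offered as a sketch rather than a definitive diagnosis: np.cov treats each row as a variable by default (rowvar=True). Called on a table with N rows and M columns, np.cov(df) therefore returns an N x N matrix over the rows, while a per-column computation produces an M x M matrix. Passing rowvar=False (or transposing the input) makes np.cov work on columns. The example below uses a small made-up array in place of df.values, and the helper name covariance_matrix is hypothetical:

```python
import numpy as np


def covariance_matrix(X):
    """Sample covariance of the columns of X (each row is one observation)."""
    X = np.asarray(X, dtype=float)
    n_rows = X.shape[0]
    centered = X - X.mean(axis=0)  # subtract each column's mean from that column
    return centered.T @ centered / (n_rows - 1)


# Hypothetical stand-in for df.values: 5 observations of 3 variables
X = np.array([
    [337.0, 118.0, 9.65],
    [324.0, 107.0, 8.87],
    [316.0, 104.0, 8.00],
    [322.0, 110.0, 8.67],
    [314.0, 103.0, 8.21],
])

cov = covariance_matrix(X)
print(cov.shape)                                  # (3, 3)
print(np.allclose(cov, np.cov(X, rowvar=False)))  # True
print(np.cov(X).shape)                            # (5, 5): rows treated as variables
```

With the real data, the equivalent comparison would be np.cov(df.values, rowvar=False) against the from-scratch result, assuming every column in the CSV is numeric.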