prof31

Reputation: 85

Constructing a Correlation Matrix With Only Certain Columns

I have a 60 column dataset.

I want to create a correlation matrix for only 10 of the columns, compared with the other 50 columns.

I don't want a 60x60 correlation matrix. I need a 10x50 correlation matrix.

Any help?

Upvotes: 0

Views: 881

Answers (1)

Ian Thompson

Reputation: 3275

Compute the full correlation matrix as you normally would, then use .loc to keep only the rows and columns you want.

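import numpy as np
import pandas as pd

# Example data: 100 rows, 60 columns of random values
df = pd.DataFrame(np.random.random(size=(100, 60)))

# Full 60x60 correlation matrix
correlation = df.corr()

# Labels of the 10 columns of interest, and of the remaining 50
first_10 = correlation.columns[:10]
exclude_10 = correlation.columns.drop(first_10)

# Keep the 10 chosen columns as rows and the other 50 as columns
correlation.loc[first_10, exclude_10]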

         10        11        12  ...        57        58        59
0 -0.075061  0.062559 -0.260992  ...  0.024617  0.005765 -0.077287
1 -0.065540 -0.079958  0.143195  ... -0.216650 -0.050884  0.117338
2  0.073335  0.132874  0.149404  ...  0.085404 -0.124058  0.011124
3 -0.137916 -0.173107 -0.075658  ... -0.084010 -0.286557 -0.073148
4 -0.040975  0.075740 -0.127664  ...  0.075596  0.030846  0.095129
5  0.034180 -0.084942  0.040704  ... -0.042057 -0.072879 -0.062279
6  0.172650  0.088127  0.063521  ... -0.095621 -0.162743 -0.056033
7  0.096467  0.103262 -0.088065  ... -0.257419  0.089628  0.108185
8 -0.088350  0.034066  0.047837  ... -0.069311  0.020804  0.075076
9  0.065377 -0.163597  0.059740  ... -0.001015 -0.181609  0.027455
[10 rows x 50 columns]
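
Slicing the full matrix is the simplest option at this size. If you would rather not build the 60x60 matrix at all, here is a minimal sketch that computes only the 10x50 block directly. The helper name cross_corr is hypothetical (it is not a pandas function), and the sketch assumes every column is numeric, has no missing values, and is not constant. Under those assumptions it should give the same numbers as slicing df.corr().

```python
import numpy as np
import pandas as pd

# Example data, seeded so the run is repeatable
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random(size=(100, 60)))

rows = df.columns[:10]        # the 10 columns of interest
cols = df.columns.drop(rows)  # the remaining 50 columns


def cross_corr(frame, rows, cols):
    """Pearson correlation of frame[rows] against frame[cols].

    Hypothetical helper. Assumes numeric data with no NaNs and
    no constant columns (a constant column has zero std).
    """
    a = frame[rows].to_numpy(dtype=float)
    b = frame[cols].to_numpy(dtype=float)
    # Standardize each column using the population std (ddof=0)
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    # Mean of the products of z-scores is the Pearson correlation
    values = a.T @ b / len(frame)
    return pd.DataFrame(values, index=rows, columns=cols)


cross = cross_corr(df, rows, cols)
```

With real data that contains NaNs, df.corr() drops missing values pairwise, so the sliced version above is the safer choice there.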

Upvotes: 1
