Reputation: 7490
Let's say I have a NumPy array of numbers, roughly 43,000 x 5,000. For example:
array([[-0.  , 0.02, 0.03, 0.05, 0.06, 0.05],
       [ 0.02, 0.  , 0.02, 0.05, 0.04, 0.04],
       [ 0.03, 0.02, 0.  , 0.06, 0.05, 0.05],
       [ 0.05, 0.05, 0.06, 0.  , 0.02, 0.01],
       [ 0.06, 0.04, 0.05, 0.02, -0.  , 0.01],
       [ 0.05, 0.04, 0.05, 0.01, 0.01, -0.  ]])
I want to print the result as a cross-tab of these values, with the same labels used both as column headers and as the row index. The array is a distance matrix of text documents, and I want a table showing the distance for each pair of documents, with the document names along both the columns and the index.
Something like below:
                   Austen_Emma  Austen_Pride  Austen_Sense  CBronte_Jane  CBronte_Professor  CBronte_Villette
Austen_Emma              -0.00          0.02          0.03          0.05               0.06              0.05
Austen_Pride              0.02          0.00          0.02          0.05               0.04              0.04
Austen_Sense              0.03          0.02          0.00          0.06               0.05              0.05
CBronte_Jane              0.05          0.05          0.06          0.00               0.02              0.01
CBronte_Professor         0.06          0.04          0.05          0.02              -0.00              0.01
CBronte_Villette          0.05          0.04          0.05          0.01               0.01             -0.00
I was thinking of converting the NumPy matrix to a pandas DataFrame and then adding the header and index. Are there any other suggestions?
Upvotes: 1
Views: 3806
Reputation: 46759
You could do the following using pandas:
import numpy as np
import pandas as pd
pd.set_option('display.width', 150)
header = ['Austen_Emma', 'Austen_Pride', 'Austen_Sense', 'CBronte_Jane', 'CBronte_Professor', 'CBronte_Villette']
a = np.array([[-0.  , 0.02, 0.03, 0.05, 0.06, 0.05],
              [ 0.02, 0.  , 0.02, 0.05, 0.04, 0.04],
              [ 0.03, 0.02, 0.  , 0.06, 0.05, 0.05],
              [ 0.05, 0.05, 0.06, 0.  , 0.02, 0.01],
              [ 0.06, 0.04, 0.05, 0.02, -0.  , 0.01],
              [ 0.05, 0.04, 0.05, 0.01, 0.01, -0.  ]])
frame = pd.DataFrame(a, index=header, columns=header)
print(frame)
This would give you the following output:
                   Austen_Emma  Austen_Pride  Austen_Sense  CBronte_Jane  CBronte_Professor  CBronte_Villette
Austen_Emma              -0.00          0.02          0.03          0.05               0.06              0.05
Austen_Pride              0.02          0.00          0.02          0.05               0.04              0.04
Austen_Sense              0.03          0.02          0.00          0.06               0.05              0.05
CBronte_Jane              0.05          0.05          0.06          0.00               0.02              0.01
CBronte_Professor         0.06          0.04          0.05          0.02              -0.00              0.01
CBronte_Villette          0.05          0.04          0.05          0.01               0.01             -0.00
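The array in the question is described as roughly 43,000 x 5,000, which is not square, so it would need separate label lists for the rows and the columns, and printing it whole is unlikely to be readable. Here is a minimal sketch of that case, assuming the labels are available as two lists. The names `row_names`, `col_names` and `dist` are made up for illustration, and the values are a small slice of the example matrix.

```python
import numpy as np
import pandas as pd

# Hypothetical labels: a non-square matrix needs one list per axis.
row_names = ['Austen_Emma', 'Austen_Pride', 'Austen_Sense']
col_names = ['CBronte_Jane', 'CBronte_Professor']

# Small stand-in for the large distance matrix (3 rows x 2 columns).
dist = np.array([[0.05, 0.06],
                 [0.05, 0.04],
                 [0.06, 0.05]])

frame = pd.DataFrame(dist, index=row_names, columns=col_names)

# Look up the distance for a single pair of documents by name.
pair = frame.loc['Austen_Pride', 'CBronte_Professor']

# Render only a labeled sub-block instead of the whole table,
# formatting every float with two decimal places.
block = frame.loc[['Austen_Emma', 'Austen_Sense'], ['CBronte_Jane']]
text = block.to_string(float_format='{:.2f}'.format)
print(text)
```

Selecting by label with `.loc` keeps the document names attached to whatever subset you print, so you do not have to track integer positions in the underlying array.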
Upvotes: 2