Reputation: 951
This is a little exercise in combinations and pandas.
I currently have a pandas data frame like:
| node1     | node2     | distance |
|-----------|-----------|----------|
| Sydney    | Canberra  | 100      |
| Melbourne | Sydney    | 200      |
| Canberra  | Melbourne | 150      |
I then create a new data frame whose column headers and row labels are the unique node names:
mat = pd.DataFrame(index=np.unique(df.node1), columns=np.unique(df.node1))
Either node1 or node2 would suffice here, since both columns contain the same set of nodes.
This produces the layout I want, but filled with NaN values.
How do I go about filling mat with the values from distance column without writing some ugly loop?
Thanks
Upvotes: 1
Views: 133
Reputation: 863291
First use pivot, or set_index with unstack:
mat = df.pivot(index='node1', columns='node2', values='distance')
print(mat)
node2      Canberra  Melbourne  Sydney
node1
Canberra        NaN      150.0     NaN
Melbourne       NaN        NaN   200.0
Sydney        100.0        NaN     NaN
mat = df.set_index(['node1', 'node2'])['distance'].unstack()
print(mat)
node2      Canberra  Melbourne  Sydney
node1
Canberra        NaN      150.0     NaN
Melbourne       NaN        NaN   200.0
Sydney        100.0        NaN     NaN
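The two calls above should give the same table for this data. Here is a minimal, self-contained sketch that rebuilds the question's frame and compares them; the names `by_pivot`, `by_unstack` and `same` are only illustrative and do not come from the original post.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "node1": ["Sydney", "Melbourne", "Canberra"],
    "node2": ["Canberra", "Sydney", "Melbourne"],
    "distance": [100, 200, 150],
})

# Option 1: reshape directly with pivot.
by_pivot = df.pivot(index="node1", columns="node2", values="distance")

# Option 2: move both node columns into the index, then unstack the last level.
by_unstack = df.set_index(["node1", "node2"])["distance"].unstack()

# Both routes should yield the same values and labels.
same = by_pivot.equals(by_unstack)
print(same)
```

Note that pivot raises a ValueError if the same (node1, node2) pair appears more than once, so this sketch assumes each pair is listed only once, as in the question.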
And then combine_first with T:
mat = mat.combine_first(mat.T)
print(mat)
           Canberra  Melbourne  Sydney
node1
Canberra        NaN      150.0   100.0
Melbourne     150.0        NaN   200.0
Sydney        100.0      200.0     NaN
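Putting both steps together, a runnable sketch of the whole approach might look like this. It assumes the sample data from the question; the name `sym` is only illustrative.

```python
import pandas as pd

# Sample data from the question: each pair of cities appears once.
df = pd.DataFrame({
    "node1": ["Sydney", "Melbourne", "Canberra"],
    "node2": ["Canberra", "Sydney", "Melbourne"],
    "distance": [100, 200, 150],
})

# Step 1: one-directional matrix (only node1 -> node2 cells are filled).
mat = df.pivot(index="node1", columns="node2", values="distance")

# Step 2: fill each missing cell from the transposed matrix, so that
# the distance A -> B is also available as B -> A.
sym = mat.combine_first(mat.T)

print(sym)
```

The diagonal stays NaN because the input has no row pairing a city with itself. combine_first aligns on the union of row and column labels, so this should also work when node1 and node2 do not hold exactly the same set of names.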
Upvotes: 1