Eran Moshe

Reputation: 3208

python - Converting pandas Matrix to DataFrame

I have created a matrix:

items = [0, 1, 2, 3]
item_to_item = pd.DataFrame(index=items, columns=items)

I've put values in it so that:

  1. It's symmetric
  2. Its diagonal is all 0's

For example:

   0  1  2  3
0  0  4  5  9
1  4  0  3  7
2  5  3  0  3
3  9  7  3  0

I want to create a DataFrame of all possible pairs (from [0, 1, 2, 3]) with no (x, x) pairs, and if (x, y) is included I don't want (y, x), because the matrix is symmetric and both entries hold the same value. In the end I want the following DataFrame (or NumPy 2D array):

item, item, value
 0     1     4
 0     2     5
 0     3     9
 1     2     3
 1     3     7
 2     3     3

Upvotes: 1

Views: 449

Answers (2)

Divakar

Reputation: 221514

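Here's a NumPy solution with np.triu_indices, which returns the row and column indices of the upper triangle; passing 1 as the second argument skips the main diagonal: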

In [453]: item_to_item
Out[453]: 
   0  1  2  3
0  0  4  5  9
1  4  0  3  7
2  5  3  0  3
3  9  7  3  0

In [454]: r,c = np.triu_indices(len(items),1)

In [455]: pd.DataFrame(np.column_stack((r,c, item_to_item.values[r,c])))
Out[455]: 
   0  1  2
0  0  1  4
1  0  2  5
2  0  3  9
3  1  2  3
4  1  3  7
5  2  3  3
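
For anyone running this outside an IPython session, the same steps can be sketched as a standalone script. The column names item_a, item_b and value are assumptions chosen for illustration; the question only asks for two item columns and a value column.

```python
import numpy as np
import pandas as pd

items = [0, 1, 2, 3]
# Example matrix from the question: symmetric, zero diagonal.
item_to_item = pd.DataFrame(
    [[0, 4, 5, 9],
     [4, 0, 3, 7],
     [5, 3, 0, 3],
     [9, 7, 3, 0]],
    index=items, columns=items,
)

# Row/column positions strictly above the main diagonal (k=1 skips it).
r, c = np.triu_indices(len(items), k=1)

# triu_indices returns positions, so map them back to the item labels.
labels = np.asarray(items)
pairs = pd.DataFrame({
    "item_a": labels[r],   # hypothetical column name
    "item_b": labels[c],   # hypothetical column name
    "value": item_to_item.values[r, c],
})
print(pairs)
```

Mapping positions back through labels only matters when the items are not simply 0..n-1; for the matrix in the question the positions and labels coincide.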

Upvotes: 1

user2285236

Reputation:

NumPy's np.triu gives you the upper triangle with all other elements set to zero. You can use that to construct a new DataFrame and replace the zeros with NaNs (so that they are dropped when you stack the columns). Here df is the item_to_item frame from the question:

pd.DataFrame(np.triu(df), index=df.index, columns=df.columns).replace(0, np.nan).stack()
Out: 
0  1    4.0
   2    5.0
   3    9.0
1  2    3.0
   3    7.0
2  3    3.0
dtype: float64

You can use reset_index at the end to convert indices to columns.
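
One caveat with replacing 0 by NaN is that a genuine zero above the diagonal would be dropped as well. A minimal sketch of a variant that masks by position instead is shown here, using a made-up matrix in which the pair (0, 2) really has value 0. The column names item_a, item_b and value are assumptions for illustration, and the explicit dropna() is there because newer pandas versions may not drop NaNs in stack() by default.

```python
import numpy as np
import pandas as pd

items = [0, 1, 2, 3]
# Hypothetical data: like the question's matrix, but pair (0, 2) is a real 0.
df = pd.DataFrame(
    [[0, 4, 0, 9],
     [4, 0, 3, 7],
     [0, 3, 0, 3],
     [9, 7, 3, 0]],
    index=items, columns=items,
)

# Boolean mask that is True strictly above the main diagonal.
upper = np.triu(np.ones(df.shape, dtype=bool), k=1)

result = (
    df.where(upper)                      # keep masked cells, NaN elsewhere
      .stack()                           # wide -> long, indexed by (row, col)
      .dropna()                          # drop the masked-out cells explicitly
      .astype(int)                       # where() upcast the values to float
      .rename_axis(["item_a", "item_b"])  # hypothetical names
      .reset_index(name="value")         # index levels -> columns
)
print(result)
```

Because the mask depends on position rather than on the cell value, the (0, 2) row survives with value 0.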

Another alternative is to stack first, reset the index, and then use a callable to keep only the rows where the first label is smaller than the second:

df.stack().reset_index()[lambda x: x['level_0'] < x['level_1']]
Out: 
    level_0  level_1  0
1         0        1  4
2         0        2  5
3         0        3  9
6         1        2  3
7         1        3  7
11        2        3  3

This one requires pandas 0.18.1 or later, where indexing with a callable was introduced.
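
The same filter can also be written without a callable, which may help on older pandas versions. This is only a sketch; the column names item_a, item_b and value are assumptions for illustration. Note that it compares labels rather than positions, so it assumes the labels are orderable and appear in sorted order in both the index and the columns.

```python
import pandas as pd

items = [0, 1, 2, 3]
df = pd.DataFrame(
    [[0, 4, 5, 9],
     [4, 0, 3, 7],
     [5, 3, 0, 3],
     [9, 7, 3, 0]],
    index=items, columns=items,
)

# Long format: one row per (row label, column label) cell.
long = (
    df.stack()
      .rename_axis(["item_a", "item_b"])  # hypothetical names
      .reset_index(name="value")
)

# Keep each unordered pair once and skip the diagonal.
pairs = long[long["item_a"] < long["item_b"]].reset_index(drop=True)
print(pairs)
```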

Upvotes: 2
