Reputation: 5177
I have a list of particle pairs in which each pair is identified by the chain index and intra-chain index of both particles. I have saved those in a DataFrame (let's call it index_array), and now I want to plot a matrix of all particle pairs, drawing the matrix elements that correspond to a pair in the list in one color and all others in another color. My idea was to produce a DataFrame (let's call it to_fill) with chain and intra-chain index as a MultiIndex for both rows and columns, which thus has two entries per pair, and then use index_array to index to_fill and change the corresponding values, so that I can plot the values of to_fill via matplotlib.pyplot.pcolormesh.
So, to break it down into a more or less well-defined problem: I have a boolean DataFrame to_fill whose rows and columns each carry a two-level MultiIndex and which contains only False values. I also have another DataFrame, index_array, with four columns containing the index values for the levels of both rows and columns. I want to set all elements pointed to by index_array to True. A toy version of both could, for example, be produced with the code below:
import numpy as np
import pandas as pd
lengths = pd.Series(data=[2, 4], index=[1, 2]) # Corresponds to the chains' lengths
index = pd.MultiIndex.from_tuples([(i, j) for i in lengths.index
                                   for j in np.arange(1, lengths.loc[i]+1)])
# np.bool was removed in NumPy 1.24, so build the all-False boolean frame directly
to_fill = pd.DataFrame(False, index=index, columns=index)
print(to_fill)
#          1             2
#          1      2      1      2      3      4
# 1 1  False  False  False  False  False  False
#   2  False  False  False  False  False  False
# 2 1  False  False  False  False  False  False
#   2  False  False  False  False  False  False
#   3  False  False  False  False  False  False
#   4  False  False  False  False  False  False
index_array = pd.DataFrame([[1, 1, 1, 1],
                            [1, 1, 1, 2],
                            [2, 3, 2, 3],
                            [2, 3, 2, 4]],
                           columns=["i_1", "j_1", "i_2", "j_2"])
print(index_array)
#    i_1  j_1  i_2  j_2
# 0    1    1    1    1
# 1    1    1    1    2
# 2    2    3    2    3
# 3    2    3    2    4
Now I want to set every entry in to_fill at row (i_1, j_1) and column (i_2, j_2), for each row of index_array, to True. In other words, index_array refers to the entries in to_fill that should be changed. The expected result would thus be:
print(to_fill)
#          1             2
#          1      2      1      2      3      4
# 1 1   True   True  False  False  False  False
#   2  False  False  False  False  False  False
# 2 1  False  False  False  False  False  False
#   2  False  False  False  False  False  False
#   3  False  False  False  False   True   True
#   4  False  False  False  False  False  False
But I did not manage to use index_array as an index. How can I tell to_fill to treat the columns i_1, j_1, i_2, and j_2 as the index values for the levels of the row and column MultiIndex, respectively?
Upvotes: 0
Views: 127
Reputation: 3009
This avoids an explicit loop, so it is a little better, though perhaps not by much:
tuples = [tuple(x) for x in index_array.values]
stacked = to_fill.stack(level=0).stack() # double stack carefully ordered
stacked.loc[tuples] = True
result = stacked.unstack(level=2).unstack().dropna(axis=1) #unstack and drop NaN cols
Upvotes: 1
Reputation: 3009
This is not great, as I would rather avoid iterrows() where possible, but it sets the entries one row of index_array at a time:
idx = pd.IndexSlice
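for row in index_array.iterrows():
    r = row[1]  # iterrows() yields (label, Series); keep the Series
    i_1 = r.loc['i_1']
    j_1 = r.loc['j_1']
    i_2 = r.loc['i_2']
    j_2 = r.loc['j_2']
    # row key (i_1, j_1), column key (i_2, j_2)
    to_fill.loc[idx[i_1, j_1], idx[i_2, j_2]] = True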
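If the loop turns out to be too slow, one possible loop-free variant is to translate the labels into integer positions first and then assign through NumPy. This is only a sketch under some assumptions: the row and column MultiIndex of to_fill are unique, and every pair in index_array actually exists in them. The names row_keys, col_keys, row_pos, col_pos, values and filled are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data from the question
lengths = pd.Series(data=[2, 4], index=[1, 2])
index = pd.MultiIndex.from_tuples(
    [(i, j) for i in lengths.index for j in range(1, lengths.loc[i] + 1)]
)
to_fill = pd.DataFrame(False, index=index, columns=index)
index_array = pd.DataFrame(
    [[1, 1, 1, 1], [1, 1, 1, 2], [2, 3, 2, 3], [2, 3, 2, 4]],
    columns=["i_1", "j_1", "i_2", "j_2"],
)

# Build the row and column keys as MultiIndex objects
row_keys = pd.MultiIndex.from_frame(index_array[["i_1", "j_1"]])
col_keys = pd.MultiIndex.from_frame(index_array[["i_2", "j_2"]])

# Translate labels to integer positions; get_indexer returns -1 for missing keys
row_pos = to_fill.index.get_indexer(row_keys)
col_pos = to_fill.columns.get_indexer(col_keys)
if (row_pos < 0).any() or (col_pos < 0).any():
    raise KeyError("index_array contains a pair that is not in to_fill")

# Element-wise assignment on a copy of the underlying array
values = to_fill.to_numpy(copy=True)
values[row_pos, col_pos] = True
filled = pd.DataFrame(values, index=to_fill.index, columns=to_fill.columns)
print(filled)
```

The check for -1 matters because NumPy would otherwise silently treat -1 as "last row/column". The result is written to a new frame rather than modifying to_fill in place, which keeps the sketch independent of whether to_numpy() returns a view.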
Upvotes: 1