Sinbud
Sinbud

Reputation: 40

Pandas, set value based on sequence of index+column pairs

I would like to fill an empty dataframe using sequence of specific index/column label pairs.

In [3]: test = pd.DataFrame(np.zeros([6,6]), index=list('ABCDEF'), columns=list('ABCDEF'))

In numpy you can fill an array pointwise when providing two disjoint lists of numerical indices for two axis.

In [4]: test.values[((0,1,2,3,4),(0,0,2,3,3))]=88
In [5]: test
Out[5]: 
      A    B     C     D    E    F
A  88.0  0.0   0.0   0.0  0.0  0.0
B  88.0  0.0   0.0   0.0  0.0  0.0
C   0.0  0.0  88.0   0.0  0.0  0.0
D   0.0  0.0   0.0  88.0  0.0  0.0
E   0.0  0.0   0.0  88.0  0.0  0.0
F   0.0  0.0   0.0   0.0  0.0  0.0

However, same notation in pandas returns a subset of the dataframe and fills whole quadrant.

In [7]: test.loc[('A','B','C','D','E'), ('A','A','B','C','C')]=88
In [8]: test # df.loc, df.iloc and df.at all return same result
Out[8]: 
      A     B     C    D    E    F
A  88.0  88.0  88.0  0.0  0.0  0.0
B  88.0  88.0  88.0  0.0  0.0  0.0
C  88.0  88.0  88.0  0.0  0.0  0.0
D  88.0  88.0  88.0  0.0  0.0  0.0
E  88.0  88.0  88.0  0.0  0.0  0.0
F   0.0   0.0   0.0  0.0  0.0  0.0

Currently I'm switching from alphabetic to numerical indices and use numpy elements access, but I wonder, is there a more clear and consise solution?

test.values[ [list(test.index).index(x) for x in ('A','B','C','D','E')],
         [list(test.columns).index(x) for x in ('A','A','B','C','C')]]=88

Upvotes: 0

Views: 890

Answers (2)

Scott Boston
Scott Boston

Reputation: 153460

Another way is to use dataframe reshaping:

t1 = test.stack()

t1.loc[zip(('A','B','C','D','E'), ('A','A','B','C','C'))] = 88

t1.unstack()

Output:

      A     B     C    D    E    F
A  88.0   0.0   0.0  0.0  0.0  0.0
B  88.0   0.0   0.0  0.0  0.0  0.0
C   0.0  88.0   0.0  0.0  0.0  0.0
D   0.0   0.0  88.0  0.0  0.0  0.0
E   0.0   0.0  88.0  0.0  0.0  0.0
F   0.0   0.0   0.0  0.0  0.0  0.0

Upvotes: 2

BENY
BENY

Reputation: 323226

I can only think of for loop here

for x , y in zip(('A','B','C','D','E'), ('A','A','B','C','C')):
    test.at[x, y] = 88 

test
Out[294]: 
      A     B     C    D    E    F
A  88.0   0.0   0.0  0.0  0.0  0.0
B  88.0   0.0   0.0  0.0  0.0  0.0
C   0.0  88.0   0.0  0.0  0.0  0.0
D   0.0   0.0  88.0  0.0  0.0  0.0
E   0.0   0.0  88.0  0.0  0.0  0.0
F   0.0   0.0   0.0  0.0  0.0  0.0

Upvotes: 2

Related Questions