Reputation: 40
I would like to fill an empty dataframe using sequence of specific index/column label pairs.
In [3]: test = pd.DataFrame(np.zeros([6,6]), index=list('ABCDEF'), columns=list('ABCDEF'))
In numpy you can fill an array pointwise when providing two disjoint lists of numerical indices for two axis.
In [4]: test.values[((0,1,2,3,4),(0,0,2,3,3))]=88
In [5]: test
Out[5]:
A B C D E F
A 88.0 0.0 0.0 0.0 0.0 0.0
B 88.0 0.0 0.0 0.0 0.0 0.0
C 0.0 0.0 88.0 0.0 0.0 0.0
D 0.0 0.0 0.0 88.0 0.0 0.0
E 0.0 0.0 0.0 88.0 0.0 0.0
F 0.0 0.0 0.0 0.0 0.0 0.0
However, same notation in pandas returns a subset of the dataframe and fills whole quadrant.
In [7]: test.loc[('A','B','C','D','E'), ('A','A','B','C','C')]=88
In [8]: test # df.loc, df.iloc and df.at all return same result
Out[8]:
A B C D E F
A 88.0 88.0 88.0 0.0 0.0 0.0
B 88.0 88.0 88.0 0.0 0.0 0.0
C 88.0 88.0 88.0 0.0 0.0 0.0
D 88.0 88.0 88.0 0.0 0.0 0.0
E 88.0 88.0 88.0 0.0 0.0 0.0
F 0.0 0.0 0.0 0.0 0.0 0.0
Currently I'm switching from alphabetic to numerical indices and use numpy elements access, but I wonder, is there a more clear and consise solution?
test.values[ [list(test.index).index(x) for x in ('A','B','C','D','E')],
[list(test.columns).index(x) for x in ('A','A','B','C','C')]]=88
Upvotes: 0
Views: 890
Reputation: 153460
Another way is to use dataframe reshaping:
t1 = test.stack()
t1.loc[zip(('A','B','C','D','E'), ('A','A','B','C','C'))] = 88
t1.unstack()
Output:
A B C D E F
A 88.0 0.0 0.0 0.0 0.0 0.0
B 88.0 0.0 0.0 0.0 0.0 0.0
C 0.0 88.0 0.0 0.0 0.0 0.0
D 0.0 0.0 88.0 0.0 0.0 0.0
E 0.0 0.0 88.0 0.0 0.0 0.0
F 0.0 0.0 0.0 0.0 0.0 0.0
Upvotes: 2
Reputation: 323226
I can only think of for loop here
for x , y in zip(('A','B','C','D','E'), ('A','A','B','C','C')):
test.at[x, y] = 88
test
Out[294]:
A B C D E F
A 88.0 0.0 0.0 0.0 0.0 0.0
B 88.0 0.0 0.0 0.0 0.0 0.0
C 0.0 88.0 0.0 0.0 0.0 0.0
D 0.0 0.0 88.0 0.0 0.0 0.0
E 0.0 0.0 88.0 0.0 0.0 0.0
F 0.0 0.0 0.0 0.0 0.0 0.0
Upvotes: 2