Reputation: 22686
I've got a numpy array (actually a pandas DataFrame, but the array will do) whose values I would like to permute. The catch is that it contains a number of NaNs at non-random positions that I need to keep in place. So far I have an iterative solution: build a list of the non-NaN indices, make a permuted copy of that list, and then assign each value from its original index to the permuted index. Any suggestions on how to do this more quickly? The matrix has millions of values, and ideally I'd like to run many permutations, but the iterative solution is prohibitively slow.
Here's the iterative solution:
import numpy, pandas
df = pandas.DataFrame(numpy.random.randn(3,3), index=list("ABC"), columns=list("abc"))
# Rows are labelled "A".."C", so select the NaN rows by label
df.loc[["A","C"], "a"] = numpy.nan
# Collect the (row, col) labels of every non-NaN cell
indices = []
for row in df.index:
    for col in df.columns:
        if not numpy.isnan(df.loc[row, col]):
            indices.append((row, col))
# Shuffle the list of positions, then move each value to its new position
permutedIndices = numpy.random.permutation(indices)
permuteddf = pandas.DataFrame(index=df.index, columns=df.columns)
for i in range(len(indices)):
    permuteddf.loc[permutedIndices[i][0], permutedIndices[i][1]] = df.loc[indices[i][0], indices[i][1]]
With results:
In [19]: df
Out[19]:
          a         b         c
A       NaN  0.816350 -1.187731
B  -0.58708 -1.054487 -1.570801
C       NaN -0.290624 -0.453697
In [20]: permuteddf
Out[20]:
          a          b          c
A       NaN  -0.290624  0.8163501
B -1.570801 -0.4536974  -1.054487
C       NaN -0.5870797  -1.187731
Upvotes: 3
Views: 237
Reputation: 353229
How about:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(5,5))
>>> df[df < 0.1] = np.nan
>>> df
          0         1         2         3         4
0       NaN  1.721657  0.446694       NaN  0.747747
1  1.178905  0.931979       NaN       NaN       NaN
2  1.547098       NaN       NaN       NaN  0.225014
3       NaN       NaN       NaN  0.886416  0.922250
4  0.453913  0.653732       NaN  1.013655       NaN
[5 rows x 5 columns]
>>> movers = ~np.isnan(df.values)
>>> df.values[movers] = np.random.permutation(df.values[movers])
>>> df
          0         1         2         3         4
0       NaN  1.013655  1.547098       NaN  1.721657
1  0.886416  0.446694       NaN       NaN       NaN
2  1.178905       NaN       NaN       NaN  0.453913
3       NaN       NaN       NaN  0.747747  0.653732
4  0.922250  0.225014       NaN  0.931979       NaN
[5 rows x 5 columns]
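Assigning through df.values only changes df when .values hands back a view of the underlying data. That may not hold for frames with mixed dtypes or with pandas' copy-on-write mode enabled. If that is a concern, the same masking idea can be wrapped in a function that works on an explicit copy and returns a new frame, which also makes it easy to run many permutations. A minimal sketch, assuming an all-numeric frame; the name permute_non_nan and the use of numpy.random.default_rng are my own choices for illustration, not part of the approach above:

```python
import numpy as np
import pandas as pd


def permute_non_nan(df, rng=None):
    """Return a new DataFrame with the non-NaN values shuffled; NaNs stay in place.

    Hypothetical helper for illustration; assumes every column is numeric.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Work on an explicit float copy so the original frame is never modified
    values = df.to_numpy(dtype=float, copy=True)
    # Boolean mask of the cells that are allowed to move
    movers = ~np.isnan(values)
    # Shuffle only the non-NaN values and write them back into the same slots
    values[movers] = rng.permutation(values[movers])
    return pd.DataFrame(values, index=df.index, columns=df.columns)


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 5)))
df[df < 0.1] = np.nan

# One permutation; call repeatedly (reusing rng) to draw as many as needed
shuffled = permute_non_nan(df, rng)
```

Each call is a single boolean-mask pass plus one shuffle of the non-NaN values, so there is no Python-level loop over cells.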
Upvotes: 4