Reputation: 1
I have two pandas dataframes (df1 and df2) with exactly the same number of columns and rows. (The column and index names are the same as well.) The values in these two dataframes may or may not differ.
I want to compare every value in df1 with the value in the corresponding position in df2, and if the value in df2 is equal to or bigger than the value in df1, I want to replace the value in df1 with a random integer.
So I thought I would want something like this (but preferably there wouldn't be any loops at all):
for every value in df1
df1.value - df2.value
if df1.value < 1
df1.value = np.random()
I tried looking at the pandas df.replace function in combination with the df.where function, but I just can't seem to get it to work.
Edit: I want to add something I forgot previously. When assigning my random int, I want it to be within a range based on the corresponding value. So it will be:
for every value in df1
df1.value - df2.value
if df1.value < 1
df1.value = np.random( in range (df1.value - 10, df1.value + 10))
I believe this is not possible with Pietro Tortella's answer, as I'm processing the dataframe as a whole.
Does anyone know how to solve this?
Upvotes: 0
Views: 1958
Reputation: 1114
If memory is not a concern, I would create a third DataFrame of random numbers, and make a substitution using the difference as a mask.
For instance, something like
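import numpy as np
import pandas as pd

# One random value per cell, labelled like df1 (randn draws floats from a standard normal)
randoms = pd.DataFrame(
    np.random.randn(*df1.values.shape),
    index=df1.index,
    columns=df1.columns
)
# Overwrite df1 only where df2 is greater than or equal to df1
mask = df2 >= df1
df1[mask] = randoms[mask]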
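For the range requirement added in the question's edit, the same masking idea can probably still be applied to the whole frame at once: draw one integer offset per cell and add it to df1, so each candidate depends on its own cell. The following is only a sketch under assumptions: the example data is made up, the bounds df1 - 10 and df1 + 10 are treated as inclusive, and the names `offsets`, `candidates`, and `result` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; any two frames with identical shape and labels work.
df1 = pd.DataFrame({"a": [5, 20, 7], "b": [30, 2, 9]})
df2 = pd.DataFrame({"a": [6, 10, 7], "b": [10, 8, 1]})

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch reproducible

# True wherever df2 is greater than or equal to df1.
mask = df2 >= df1

# One random integer offset in [-10, 10] per cell, so that df1 + offsets lies
# in [df1 - 10, df1 + 10]; endpoint=True makes the upper bound inclusive
# (an assumption, the question does not say whether the bounds are inclusive).
offsets = rng.integers(-10, 10, size=df1.shape, endpoint=True)
candidates = df1 + offsets

# Keep df1 where the mask is False, take the candidate where it is True.
result = df1.where(~mask, candidates)
print(result)
```

This returns a new frame instead of modifying df1 in place; assigning `result` back to `df1` gives the in-place behaviour if that is what is wanted.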
Upvotes: 2