Pandas, Python. Replace random subset of values in a column

Question

I have a data frame in which a specific column (y1) has 3 possible values: -9, 1 and 2.

I would like to change a random sample of 1000 values which originally were 2 to -9.

I have tried this:

df.loc[df.y1 == "2", 'y1'].sample(1000) =="-9"

but it does not work.

jezrael · Accepted Answer

I think you first need the index of the values to change, and then you can assign -9 at that index.

There is a problem if the number of filtered rows in a is less than 1000, because sample then raises an error. So min was added, which returns the length of a if that length is less than 1000:

a = df.loc[df.y1 == 2, 'y1']
df.loc[a.sample(min(len(a.index), 1000)).index, 'y1'] = -9
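
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the approach above. The toy data, the seed and the random_state argument are my own assumptions, added only so the run is reproducible; they are not part of the original question.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption for illustration): 5000 rows holding -9, 1 or 2.
rng = np.random.default_rng(0)
df = pd.DataFrame({"y1": rng.choice([-9, 1, 2], size=5000)})

n_twos_before = int((df["y1"] == 2).sum())

# Rows whose y1 is currently 2.
a = df.loc[df["y1"] == 2, "y1"]

# Sample at most 1000 of them, then overwrite those rows by index label.
n = min(len(a.index), 1000)
picked = a.sample(n, random_state=0).index
df.loc[picked, "y1"] = -9

n_twos_after = int((df["y1"] == 2).sum())
print(n_twos_before - n_twos_after)  # number of values changed, equal to n
```

Note that this relies on the index labels being unique. With a duplicated index, .loc would overwrite every row sharing a sampled label, so more than n values could change.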

Thank you, John Galt, for a better solution. It skips sampling when fewer than 1000 rows contain 2, in which case all of them are replaced and no 2 remains in column y1:

df.loc[(a if len(a.index) < 1000 else a.sample(1000)).index, 'y1'] = -9
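
If this is needed in more than one place, the same logic could be wrapped in a small helper. This is only a sketch: the function name replace_random_subset and its parameters are hypothetical, not a pandas API.

```python
import pandas as pd

def replace_random_subset(df, column, old, new, n, random_state=None):
    """Replace up to n randomly chosen occurrences of `old` in df[column]
    with `new`, in place. Returns how many values were changed.
    Hypothetical helper, assumes the index labels of df are unique."""
    a = df.loc[df[column] == old, column]
    # Fewer than n matches: take them all instead of sampling.
    chosen = a if len(a.index) < n else a.sample(n, random_state=random_state)
    df.loc[chosen.index, column] = new
    return len(chosen.index)

# Only three rows hold 2, so asking for 1000 replaces all three.
small = pd.DataFrame({"y1": [2, 1, 2, -9, 2]})
changed = replace_random_subset(small, "y1", old=2, new=-9, n=1000)
print(changed)                # 3
print(small["y1"].tolist())   # [-9, 1, -9, -9, -9]
```

Returning the count makes it easy to check afterwards whether the full 1000 values were available.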

Answers (2)