aspire57

Reputation: 1547

Pandas, Python. Replace random subset of values in a column

I have a data frame in which a specific column (y1) has 3 possible values: -9, 1 and 2.

I would like to change a random sample of 1000 of the values that are currently 2 to -9.

I have tried this:

df.loc[df.y1 == "2", 'y1'].sample(1000) =="-9"

but it does not work.

Upvotes: 2

Views: 2960

Answers (2)

jezrael

Reputation: 862681

I think you first need the index of the values you want to change, and then you can assign through it.

There is a problem if the filtered Series a has fewer than 1000 rows, so min was added: it falls back to the length of a when that length is below 1000:

a = df.loc[df.y1 == 2, 'y1']
df.loc[a.sample(min(len(a.index), 1000)).index, 'y1'] = -9

Thank you, John Galt, for a better solution, which also covers the case where column y1 may contain no 2 at all:

df.loc[(a if len(a.index) < 1000 else a.sample(1000)).index, 'y1'] = -9
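
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The toy data frame, the seed values and the names idx, n_before, n_after and n_changed are assumptions made up for the illustration; it also assumes y1 holds integers and that the index is unique.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 5000 rows with y1 drawn from {-9, 1, 2}
rng = np.random.default_rng(0)
df = pd.DataFrame({"y1": rng.choice([-9, 1, 2], size=5000)})

n_before = int((df["y1"] == 2).sum())

# Rows whose y1 is currently 2
a = df.loc[df["y1"] == 2, "y1"]

# Sample at most 1000 of them; random_state makes the draw reproducible
idx = a.sample(min(len(a), 1000), random_state=0).index

# Assign through the sampled index labels so that df itself is modified
df.loc[idx, "y1"] = -9

n_after = int((df["y1"] == 2).sum())
n_changed = n_before - n_after
print(n_before, n_after, n_changed)
```

If the index contained duplicate labels, df.loc[idx, 'y1'] could hit more rows than were sampled, so resetting the index first may be safer in that case.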

Upvotes: 2

Ando Jurai

Reputation: 1049

Because while you use "==" the right way for your filter, you should use "=" for the second one in order to assign the -9 value.
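
A small sketch of the difference, on made-up data (the toy frame and the names original, result, unchanged and picked are only for illustration, and y1 is assumed to hold integers). Note that sample() returns a copy, so in this sketch the assignment is done with .loc on the sampled index labels instead of directly on the sample() result.

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({"y1": [2, 2, 1, -9, 2, 2]})
original = df["y1"].copy()

# "==" only compares: it builds a new boolean Series and leaves df untouched
result = df.loc[df["y1"] == 2, "y1"].sample(2, random_state=0) == -9
unchanged = df["y1"].equals(original)

# "=" on a .loc selection assigns in place, here via the sampled index labels
picked = df.loc[df["y1"] == 2, "y1"].sample(2, random_state=0).index
df.loc[picked, "y1"] = -9

print(unchanged)
print(df["y1"].tolist())
```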

Upvotes: 0
