user2935184

Reputation: 113

Conditionally change the values in a data frame

I added a new variable (all zeros) to my existing data frame. Now, in this new data frame, I need to change the value from 0 to 1 for observations that meet a condition. The condition is on another variable.

For example, the new data frame has variables x, y, and z. z is the variable I just added, and all of its values are zero. If y equals some number a, I want z = 1.

I tried to use a simple for loop to accomplish this, but I have no idea what I did wrong.

for (i==999 in data$y) {
    {data$z==1} 
}

Upvotes: 0

Views: 4159

Answers (2)

Ari

Reputation: 1972

It would have helped if you had given us a reproducible example. I'll create one instead:

df = data.frame(x = sample.int(5, 5),
                y = sample.int(5, 5),
                z = rep(0, 5))

df
  x y z
1 3 3 0
2 4 5 0 
3 2 1 0
4 5 4 0
5 1 2 0

Your problem states that you are trying to change values of df$z when some condition on y is met. In R, the general way to do this is to use subscripts. I highly recommend John Cook's blog post 5 Kinds of Subscripts in R to help understand this; it's one of those things in R that works differently from most other languages, but once you get the hang of it, it becomes very handy.
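As a rough illustration of what subscripting looks like in practice, here is a minimal sketch on a small named vector. The vector `v` and its values are made up for this example and are not taken from the blog post or the question.

```r
# Hypothetical named vector used only for illustration
v <- c(a = 10, b = 20, c = 30, d = 40)

pos  <- v[c(1, 3)]                      # positive integers: keep elements 1 and 3
neg  <- v[-2]                           # negative integers: drop element 2
lgl  <- v[c(TRUE, FALSE, FALSE, TRUE)]  # logical vector: keep positions that are TRUE
cond <- v[v > 15]                       # logical vector built from a comparison
nm   <- v[c("b", "d")]                  # character vector: select by name

print(cond)
```

The comparison form (`v[v > 15]`) is the one that matters most for this question, since it selects elements by a condition rather than by position.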

So in this case:

# where is y==1?
df$y == 1
[1] FALSE FALSE  TRUE FALSE FALSE

We can feed this resulting logical vector into the row index of an expression like df[row, column]:

df[df$y == 1, ]
  x y z
3 2 1 0

And if we want to set the value of the "z" column in that row to something, just type:

df[df$y == 1, "z"] = 999
df
  x y   z
1 3 3   0
2 4 5   0
3 2 1 999
4 5 4   0
5 1 2   0
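
Applied to the case in the question (set z to 1 where y is 999), the same idea might look like the sketch below. The data frame here is a hypothetical stand-in for the asker's data, and the `NA` in `y` is included on purpose: wrapping the condition in `which()` is one way to skip rows where the comparison is missing, since a logical row index containing `NA` is generally not accepted in a data frame assignment.

```r
# Hypothetical data standing in for the asker's data frame
data <- data.frame(x = 1:5,
                   y = c(10, 999, 3, 999, NA),
                   z = 0)

# which() returns only the positions where the comparison is TRUE,
# so the row with a missing y is left untouched
data[which(data$y == 999), "z"] <- 1

print(data$z)
```

If y has no missing values, `data[data$y == 999, "z"] <- 1` without `which()` does the same thing.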

Upvotes: 2

josliber

Reputation: 44320

It seems like you're trying to set data$z to be 1 when data$y is 999, and set it to be 0 otherwise. This can be accomplished with:

data$z = as.numeric(data$y == 999)
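
To see why this works, here is a small sketch on made-up data (the values of `y` are assumptions for illustration). The comparison yields a logical vector, and `as.numeric()` converts TRUE to 1 and FALSE to 0. An `ifelse()` call is shown alongside as an equivalent, more explicit spelling; the column name `z2` is hypothetical.

```r
# Hypothetical data for illustration
data <- data.frame(y = c(1, 999, 5, 999))

is_match <- data$y == 999            # logical vector: FALSE TRUE FALSE TRUE

data$z  <- as.numeric(is_match)      # TRUE becomes 1, FALSE becomes 0
data$z2 <- ifelse(is_match, 1, 0)    # same result, written out explicitly

print(data)
```

Note that this overwrites the whole column, so any existing non-zero values in z would be replaced.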

Upvotes: 2
