Reputation: 4926
I have a dataframe with columns A, B and C. I want to apply a function to each row of the dataframe that checks the values of row$A and row$B and updates row$C based on those values. How can I achieve that?
Example:
A B C
1 1 10 10
2 2 20 20
3 NA 30 30
4 NA 40 40
5 5 50 50
Now I want to set C to B/2 in every row where A is NA. After the change, the dataframe would look like:
A B C
1 1 10 10
2 2 20 20
3 NA 30 15
4 NA 40 20
5 5 50 50
I would like to know if this can be done without using a for loop.
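For reproducibility, the example data above can be built as follows (a minimal sketch; the name `dat` is arbitrary and only used for illustration):

```r
# Example data from the question; `dat` is a placeholder name
dat <- data.frame(
  A = c(1, 2, NA, NA, 5),
  B = c(10, 20, 30, 40, 50),
  C = c(10, 20, 30, 40, 50)
)

# Rows where A is missing are the ones whose C should become B/2
rows_to_update <- which(is.na(dat$A))
rows_to_update
```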
Upvotes: 0
Views: 59
Reputation: 24535
Try:
> ddf$C = with(ddf, ifelse(is.na(A), B/2, C))
>
> ddf
A B C
1 1 10 10
2 2 20 20
3 NA 30 15
4 NA 40 20
5 5 50 50
Upvotes: 0
Reputation: 81683
Another approach:
dat <- transform(dat, C = B / 2 * (i <- is.na(A)) + C * !i)
# A B C
# 1 1 10 10
# 2 2 20 20
# 3 NA 30 15
# 4 NA 40 20
# 5 5 50 50
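The arithmetic works because logical values are coerced to 1 and 0 in numeric expressions, so each row keeps exactly one of the two terms. A minimal base R sketch of the same steps on the question's data (the intermediate names `i`, `new_C` and `caveat` are only for illustration):

```r
dat <- data.frame(A = c(1, 2, NA, NA, 5),
                  B = c(10, 20, 30, 40, 50),
                  C = c(10, 20, 30, 40, 50))

i <- is.na(dat$A)                        # TRUE where A is missing
new_C <- dat$B / 2 * i + dat$C * (!i)    # TRUE counts as 1, FALSE as 0
dat$C <- new_C

# Caveat: a missing value multiplied by 0 is still NA, so a row with
# A = NA and C = NA would not receive B/2 from this formula
caveat <- 30 / 2 * TRUE + NA * FALSE
```

One thing to keep in mind: since NA * 0 is NA, this formula propagates missing values in B or C, whereas ifelse only uses the branch it selects for each row.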
Upvotes: 0
Reputation: 92282
Or, if you want to update the column by reference (without copying the whole data set when updating the column), you could also try data.table:
library(data.table)
setDT(dat)[is.na(A), C := B/2]
dat
# A B C
# 1: 1 10 10
# 2: 2 20 20
# 3: NA 30 15
# 4: NA 40 20
# 5: 5 50 50
Edit: Regarding @arun's comment, checking the address before and after the change suggests the column was still updated by reference.
library(pryr)
address(dat$C)
## [1] "0x2f85a4f0"
setDT(dat)[is.na(A), C := B/2]
address(dat$C)
## [1] "0x2f85a4f0"
Upvotes: 2
Reputation: 506
Here is a simple example using library(dplyr).
Fictional dataset:
df <- data.frame(a=c(1, NA, NA, 2), b=c(10, 20, 50, 50))
You only want to change the rows where a is NA, so you can use ifelse together with is.na (a comparison such as a == NA does not work, because it always yields NA):
df <- mutate(df, c=ifelse(is.na(a), b/2, b))
Upvotes: 0
Reputation: 886938
Try
indx <- is.na(df$A)
df$C[indx] <- df$B[indx]/2
df
# A B C
#1 1 10 10
#2 2 20 20
#3 NA 30 15
#4 NA 40 20
#5 5 50 50
Upvotes: 0
Reputation: 121057
Try this:
your_data <- within(your_data, C[is.na(A)] <- B[is.na(A)] / 2)
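A self-contained run on the question's data, assuming `your_data` holds the example dataframe (it is rebuilt here so the snippet runs on its own):

```r
# Assumed stand-in for the asker's data
your_data <- data.frame(A = c(1, 2, NA, NA, 5),
                        B = c(10, 20, 30, 40, 50),
                        C = c(10, 20, 30, 40, 50))

# Inside within(), A, B and C refer to the columns of your_data
your_data <- within(your_data, C[is.na(A)] <- B[is.na(A)] / 2)

your_data$C
# [1] 10 20 15 20 50
```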
Upvotes: 1