exAres

Reputation: 4926

How to use an apply-like function on a data frame? [please see details below]

I have a data frame with columns A, B and C. I want to apply a function to each row that checks the values of row$A and row$B and updates row$C based on them. How can I achieve that?

Example:

   A  B  C
1  1 10 10
2  2 20 20
3 NA 30 30
4 NA 40 40
5  5 50 50

Now I want to set C to B/2 in every row where the value in column A is NA.

So the dataframe after changes would look like:

   A  B  C
1  1 10 10
2  2 20 20
3 NA 30 15
4 NA 40 20
5  5 50 50

I would like to know if this can be done without using a for loop.

Upvotes: 0


Answers (6)

rnso

Reputation: 24535

Try:

ddf$C <- with(ddf, ifelse(is.na(A), B/2, C))
ddf
#    A  B  C
# 1  1 10 10
# 2  2 20 20
# 3 NA 30 15
# 4 NA 40 20
# 5  5 50 50

Upvotes: 0

Sven Hohenstein

Reputation: 81683

Another approach:

dat <- transform(dat, C = B / 2 * (i <- is.na(A)) + C * !i)
#    A  B  C
# 1  1 10 10
# 2  2 20 20
# 3 NA 30 15
# 4 NA 40 20
# 5  5 50 50
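
To make the arithmetic in that one-liner easier to follow, here is a small base R sketch that rebuilds the question's example data and spells the steps out. The intermediate names `i`, `arith`, `arith_na` and `safe_na` are only illustrative. It also shows one caveat, assuming C itself may contain NA: since NA * 0 is NA, the multiplication trick then returns NA where ifelse would not.

```r
# Rebuild the example data from the question (the name `dat` follows the answer above).
dat <- data.frame(A = c(1, 2, NA, NA, 5),
                  B = c(10, 20, 30, 40, 50),
                  C = c(10, 20, 30, 40, 50))

i <- is.na(dat$A)                      # TRUE where A is missing
# TRUE/FALSE are treated as 1/0, so each row keeps exactly one of the two terms.
arith <- dat$B / 2 * i + dat$C * !i
arith
# [1] 10 20 15 20 50

# Caveat (assumption: C may hold NA): NA * 0 is NA, so a missing C
# propagates even on a row that should simply receive B / 2.
dat$C[3] <- NA
arith_na <- dat$B / 2 * i + dat$C * !i   # element 3 is NA
safe_na  <- ifelse(i, dat$B / 2, dat$C)  # element 3 is 15
```

For data like the example, where B and C have no missing values, both forms give the same result.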

Upvotes: 0

David Arenburg

Reputation: 92282

Or, if you want to update the column by reference (without copying the whole data set when updating the column), you could also try data.table:

library(data.table)
setDT(dat)[is.na(A), C := B/2]
dat
#     A  B  C
# 1:  1 10 10
# 2:  2 20 20
# 3: NA 30 15
# 4: NA 40 20
# 5:  5 50 50

Edit: Regarding @arun's comment, the address of the column is the same before and after the change, which suggests it was still updated by reference.

library(pryr)
address(dat$C)
## [1] "0x2f85a4f0"
setDT(dat)[is.na(A), C := B/2]
address(dat$C)
## [1] "0x2f85a4f0"

Upvotes: 2

adomasb

Reputation: 506

Here is a simple example using library(dplyr).

Fictional dataset:

df <- data.frame(a=c(1, NA, NA, 2), b=c(10, 20, 50, 50))

You want to change only the rows where a is NA, so you can use ifelse with is.na:

library(dplyr)
df <- mutate(df, c = ifelse(is.na(a), b/2, b))
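
The test is written with is.na() because comparing against NA with == never yields TRUE. A quick check in base R, reusing the values of the fictional dataset above (the names `eq_test`, `na_test` and `result` are only illustrative):

```r
a <- c(1, NA, NA, 2)
b <- c(10, 20, 50, 50)

eq_test <- a == NA      # every element is NA, so it cannot be used to select rows
na_test <- is.na(a)     # FALSE TRUE TRUE FALSE

# The same ifelse() expression as in the mutate() call, evaluated on plain vectors.
result <- ifelse(na_test, b / 2, b)
result
# [1] 10 10 25 50
```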

Upvotes: 0

akrun

Reputation: 886938

Try

 indx <- is.na(df$A)
 df$C[indx] <- df$B[indx]/2
 df
 #   A  B  C
 #1  1 10 10
 #2  2 20 20
 #3 NA 30 15
 #4 NA 40 20
 #5  5 50 50

Upvotes: 0

Richie Cotton

Reputation: 121057

Try this:

your_data <- within(your_data, C[is.na(A)] <- B[is.na(A)] / 2)
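
For completeness, a self-contained sketch of the same call on the question's example data. It assumes `your_data` has the columns A, B and C exactly as in the question:

```r
# Example data from the question.
your_data <- data.frame(A = c(1, 2, NA, NA, 5),
                        B = c(10, 20, 30, 40, 50),
                        C = c(10, 20, 30, 40, 50))

# Inside within(), A, B and C refer to the columns of your_data,
# so only the elements of C where A is NA are overwritten.
your_data <- within(your_data, C[is.na(A)] <- B[is.na(A)] / 2)
your_data$C
# [1] 10 20 15 20 50
```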

Upvotes: 1
