Reputation: 279
I have a huge data frame that looks like this:
df = data.frame(A = c(1,54,23,2), B=c(1,2,4,65), C=c("+","-","-","+"))
> df
   A  B C
1  1  1 +
2 54  2 -
3 23  4 -
4  2 65 +
I need to subtract one column from the other, row by row, depending on a condition, and store the result in a new column:
A - B if C == +
B - A if C == -
So, my output would be:
> new_df
   A  B C   D
1  1  1 +   0
2 54  2 - -52
3 23  4 - -19
4  2 65 + -63
Upvotes: 2
Views: 3339
Reputation: 47300
A base solution:
df$D = (df$B-df$A)*sign((df$C=="-")-0.5)
#    A  B C   D
# 1  1  1 +   0
# 2 54  2 - -52
# 3 23  4 - -19
# 4  2 65 + -63
This can also be written as:
df <- transform(df, D = (B-A)*sign((C=="-")-0.5))
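To see why the sign() expression works, it may help to look at the intermediate values. The sketch below rebuilds the question's data frame and stores the multiplier in a variable called mult, a name introduced here purely for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(A = c(1, 54, 23, 2), B = c(1, 2, 4, 65),
                 C = c("+", "-", "-", "+"))

# (C == "-") is TRUE/FALSE, which R coerces to 1/0 in arithmetic;
# subtracting 0.5 gives 0.5 or -0.5, and sign() turns that into 1 or -1
mult <- sign((df$C == "-") - 0.5)   # 'mult' is an illustrative name
mult
# [1] -1  1  1 -1

# Multiplying (B - A) by -1 flips it to (A - B) for the "+" rows
df$D <- (df$B - df$A) * mult
df$D
# [1]   0 -52 -19 -63
```

The trick relies on A - B being exactly -(B - A), so it only fits cases where the two branches differ by sign alone.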
Upvotes: 0
Reputation: 402
This answer should work for you https://stackoverflow.com/a/19000310/6395612
You can use with() like this:
df['D'] = with(df, ifelse(C=='+', A - B, B - A))
Upvotes: 0
Reputation: 1857
Alternatively, if you want to evaluate the arithmetic operator stored in column C (that is, compute A + B or A - B), you can use eval(parse(text = ...))
(more about that here: Evaluate expression given as a string).
## Transforming into a matrix (simplifies everything into characters)
df_mat <- as.matrix(df)
## Function for evaluating a single row
eval.row <- function(row) {
  eval(parse(text = paste(row[1], row[3], row[2])))
}
## For the first row
eval.row(df_mat[1,])
# [1] 2
## For the whole data frame
apply(df_mat, 1, eval.row)
# [1] 2 52 19 67
## Updating the data.frame
df$D <- apply(df_mat, 1, eval.row)
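Since eval(parse()) will run whatever text it is handed, a possible alternative is to look the operator up with match.fun() and apply it row by row. This is only a sketch under the same assumption as above, namely that C holds "+" or "-"; apply_op is a hypothetical helper name, not something from the answer.

```r
df <- data.frame(A = c(1, 54, 23, 2), B = c(1, 2, 4, 65),
                 C = c("+", "-", "-", "+"), stringsAsFactors = FALSE)

# Assumption: only these two operators are allowed; stop early otherwise
stopifnot(all(df$C %in% c("+", "-")))

# Hypothetical helper: fetch the function named by `op` and call it on a and b
apply_op <- function(op, a, b) match.fun(op)(a, b)

# mapply() walks the three columns in parallel, one row at a time
df$D <- mapply(apply_op, df$C, df$A, df$B, USE.NAMES = FALSE)
df$D
# [1]  2 52 19 67
```

This keeps A and B numeric throughout, so there is no round trip through a character matrix.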
Upvotes: 0
Reputation: 3722
Using dplyr: if there are definitely only "+" and "-" in the C column, you can do:
library(dplyr)
df2 <- df %>%
  mutate(D = ifelse(C == '+', A - B, B - A))
I would generally do:
df2 <- df %>%
  mutate(D = ifelse(C == '+', A - B,
                    ifelse(C == '-', B - A, NA)))
Just in case there are some rows that have neither "+" nor "-".
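To illustrate what the extra branch buys you, here is a small sketch in base R (no dplyr needed) with one made-up row whose C value is "*". The data frame df_extra, that extra row, and the names two_way and nested are assumptions added for the demonstration; they are not part of the question's data.

```r
# Question's data plus one hypothetical row with an unexpected symbol
df_extra <- data.frame(A = c(1, 54, 23, 2, 10), B = c(1, 2, 4, 65, 3),
                       C = c("+", "-", "-", "+", "*"))

# Two-way ifelse: anything that is not "+" silently gets B - A
two_way <- with(df_extra, ifelse(C == "+", A - B, B - A))
two_way
# [1]   0 -52 -19 -63  -7

# Nested ifelse: anything that is neither "+" nor "-" becomes NA
nested <- with(df_extra, ifelse(C == "+", A - B,
                                ifelse(C == "-", B - A, NA)))
nested
# [1]   0 -52 -19 -63  NA
```

The NA makes unexpected symbols visible afterwards, for example via which(is.na(nested)).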
Upvotes: 1
Reputation: 33772
It is better to add stringsAsFactors = FALSE when you create a data frame (this has been the default since R 4.0.0). Also, I don't like to use df as a variable name, since there is a df() function:
df1 <- data.frame(A = c(1, 54, 23, 2),
                  B = c(1, 2, 4, 65),
                  C = c("+", "-", "-", "+"),
                  stringsAsFactors = FALSE)
Assuming that C is only "+" or "-", you can use dplyr::mutate() and test using ifelse():
library(dplyr)
df1 %>%
  mutate(D = ifelse(C == "+", A - B, B - A))
Upvotes: 2
Reputation: 39154
This assumes that only two conditions, "+" and "-", are in column C.
df$D <- with(df, ifelse(C %in% "+", A - B, B - A))
df
#    A  B C   D
# 1  1  1 +   0
# 2 54  2 - -52
# 3 23  4 - -19
# 4  2 65 + -63
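One practical difference between %in% and == shows up when C contains missing values: a comparison with == returns NA, whereas %in% returns FALSE. The sketch below uses a made-up vector, sym, to show this; it and the result names are illustrative and not part of the original data.

```r
# Hypothetical symbol vector with a missing value
sym <- c("+", "-", NA)

eq_test <- sym == "+"
eq_test
# [1]  TRUE FALSE    NA

in_test <- sym %in% "+"
in_test
# [1]  TRUE FALSE FALSE

# With ==, ifelse() propagates the NA;
# with %in%, the NA element falls into the "no" branch
res_eq <- ifelse(eq_test, "A - B", "B - A")
res_in <- ifelse(in_test, "A - B", "B - A")
res_eq
# [1] "A - B" "B - A" NA
res_in
# [1] "A - B" "B - A" "B - A"
```

So with %in%, a row with a missing symbol would get B - A rather than NA; whether that is desirable depends on the data.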
Upvotes: 3