Reputation: 234
If I have a sample data frame like mtcars and want the differences between consecutive values of mtcars$qsec, I can call diff(mtcars$qsec). But is there a simple way to add diff(mtcars$qsec) as a new column in the original mtcars data frame? I'm finding it difficult because diff(mtcars$qsec) has one fewer element than mtcars has rows.
> head(mtcars, 3)
               mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
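To make the length mismatch concrete, here is a minimal sketch using only base R. The names df, d, and result are illustrative, and the work is done on a copy so the built-in dataset stays untouched.

```r
# Work on a copy of the built-in dataset.
df <- mtcars

# diff() returns one value per pair of consecutive rows.
d <- diff(df$qsec)
nrow(df)    # number of rows in the data frame
length(d)   # one fewer than nrow(df)

# Assigning the shorter vector as a column raises an error,
# which is caught here so the script keeps running.
result <- tryCatch({
  df$diff_qsec <- d
  "ok"
}, error = function(e) "error")
result
```

Because the assignment fails, df is left without a diff_qsec column.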
Upvotes: 3
Views: 7278
Reputation: 269694
Here are two approaches. Both put an NA in the first row of diff_qsec and put diff(qsec) in the remaining rows:
library(dplyr)
mtcars %>% mutate(diff_qsec = qsec - lag(qsec)) # dplyr has its own version of lag
transform(mtcars, diff_qsec = c(NA, diff(qsec)))
Also, on the general issue of padding see: How can I pad a vector with NA from the front?
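The padding idea generalizes to larger lags. The following is a sketch in base R only; pad_diff is a hypothetical helper name, not something from dplyr or base R, and it assumes a numeric vector with more elements than the lag.

```r
# Hypothetical helper (not part of any package): pad the result of diff()
# with leading NAs so it has the same length as the input vector.
pad_diff <- function(x, lag = 1) {
  c(rep(NA, lag), diff(x, lag = lag))
}

df <- mtcars
df$diff_qsec  <- pad_diff(df$qsec)           # equivalent to c(NA, diff(qsec))
df$diff2_qsec <- pad_diff(df$qsec, lag = 2)  # difference from two rows earlier

# Inspect the original column next to the two new ones.
head(df[, c("qsec", "diff_qsec", "diff2_qsec")], 3)
```

With lag = 2 the first two rows are NA, since there is no row two positions earlier to subtract.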
Upvotes: 10
Reputation: 91
You could use the base function within() like so:
mtcars <- within(mtcars, difference <- c(NA,diff(qsec)))
This creates a column called "difference" with the first element NA and the rest calculated by diff(qsec).
You could create more columns at the same time by wrapping commands in {}, such as:
mtcars <- within(mtcars, {
  difference <- c(NA, diff(qsec))
  multiple <- qsec * 2
})
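Note that in the single-expression form you must use <- for the assignment and not =, because difference = ... would be read as a named argument to within() rather than as an assignment.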
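As a quick sanity check, the within() form can be compared with the transform() form from the other answer. This is a sketch that works on copies (a and b are illustrative names) rather than overwriting mtcars.

```r
# Build the column two ways, each on its own copy of mtcars.
a <- within(mtcars, difference <- c(NA, diff(qsec)))
b <- transform(mtcars, difference = c(NA, diff(qsec)))

# Both approaches should yield the same column and keep the row names.
same_values   <- identical(a$difference, b$difference)
same_rownames <- identical(rownames(a), rownames(mtcars))
same_values
same_rownames
```

Both checks should return TRUE, so the choice between within() and transform() here is mostly a matter of taste.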
Upvotes: 2