Reputation: 13
I want to use one column of a data frame (age in my example) to compute a new column (experience, which I define as experience = age - years of school). The number of years in school can be worked out from age using three bands; e.g., in my example people below the age of 26 were in school for 14 years (the other bands can be seen in the code below). To find experience I would subtract the years of school from age.
I ran the code below, but got lots of errors, confusing prompt signs, and messages about parentheses that I did not understand. It just returned a vector with every entry being 'age-12', rather than the different bands I was hoping for (some age-10, others age-14).
if (df[,4] < 26){
df[,4] - 14
} elseif (df[,4] < 53) {
df[,4] - 12
} else (df[,4] - 10)
Any help would be hugely appreciated!
Upvotes: 0
Views: 32
Reputation: 173793
You could use cut plus some indexing to achieve this with a single line of code:
df$exp <- df$age - c(14, 12, 10)[as.numeric(cut(df$age, c(0, 25, 52, Inf)))]
Suppose your data looked like this:
set.seed(69)
df <- data.frame(age = sample(16:70, 10))
df
#> age
#> 1 47
#> 2 32
#> 3 42
#> 4 17
#> 5 63
#> 6 55
#> 7 28
#> 8 54
#> 9 62
#> 10 66
Then if you did
df$exp <- df$age - c(14, 12, 10)[as.numeric(cut(df$age, c(0, 25, 52, Inf)))]
Then df would look like this:
df
#> age exp
#> 1 47 35
#> 2 32 20
#> 3 42 30
#> 4 17 3
#> 5 63 53
#> 6 55 45
#> 7 28 16
#> 8 54 44
#> 9 62 52
#> 10 66 56
Created on 2020-11-10 by the reprex package (v0.3.0)
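To see why the one-liner works, it can help to look at the intermediate steps. The sketch below uses a small made-up age vector (not the data above) and hypothetical variable names: cut assigns each age to one of three intervals, as.numeric turns the interval into a band number from 1 to 3, and that number picks the matching years of school out of c(14, 12, 10).

```r
# Hypothetical ages chosen to sit on either side of each band boundary
age <- c(17, 25, 26, 52, 53, 70)

# cut() uses right-closed intervals by default: (0,25], (25,52], (52,Inf]
band <- cut(age, breaks = c(0, 25, 52, Inf))

# The underlying integer codes of the factor: 1, 2 or 3
band_index <- as.numeric(band)

# Look up the years of school for each band, then subtract from age
school_years <- c(14, 12, 10)[band_index]
experience <- age - school_years

data.frame(age, band, school_years, experience)
```

For whole-number ages these breaks agree with the < 26 and < 53 tests in the question. If ages can be fractional, cut(age, c(0, 26, 53, Inf), right = FALSE) should match those tests exactly.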
Upvotes: 0
Reputation: 123838
The issue is that if is not vectorized: it expects a single TRUE or FALSE, not one value per row. Instead you could use ifelse or dplyr::case_when.
Using some random example data:
set.seed(42)
df <- data.frame(
a = 1,
b = 2,
c = 3,
d = sample(20:100, 5)
)
df[,4]
#> [1] 68 84 44 93 37
ifelse(df[,4] < 26, df[,4] - 14, ifelse(df[,4] < 53, df[,4] - 12, df[,4] - 10))
#> [1] 58 74 32 83 25
dplyr::case_when(
df[,4] < 26 ~ df[,4] - 14,
df[,4] < 53 ~ df[,4] - 12,
TRUE ~ df[,4] - 10
)
#> [1] 58 74 32 83 25
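Either result can be stored as a new column. A minimal sketch, assuming the column is named age as in the question and calling the new column experience; the three ages are made up for illustration:

```r
# Hypothetical data: one age per band
df <- data.frame(age = c(20, 40, 60))

# Nested ifelse() is evaluated element-wise, so each row gets its own band
school_years <- ifelse(df$age < 26, 14,
                       ifelse(df$age < 53, 12, 10))

# Experience = age minus years of school, stored as a new column
df$experience <- df$age - school_years
df
```

Computing the years of school first and subtracting once keeps the band logic in one place, which may be easier to adjust if the bands change.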
Upvotes: 1