Justine

Reputation: 1

Create a categorical variable (age categories) and apply to a table

I'm working with a large table with many variables, including "age". Here is an example of what my table looks like:

  Age Var2  Var3  Var4     Var5
  32  John  Green Married  6'1
  47  Julia Stone Divorced 5'4
  72  Mike  White Divorced 5'8

...

I am trying to add a variable to this table that classifies age into 10-year categories, starting at 20 years old.

I have defined my criteria:

mydata$age[mydata$age>=20 & mydata$age<=29] <- "20-29"
mydata$age[mydata$age>=30 & mydata$age<=39] <- "30-39"
mydata$age[mydata$age>=40 & mydata$age<=49] <- "40-49"
mydata$age[mydata$age>=50 & mydata$age<=59] <- "50-59"
mydata$age[mydata$age>=60 & mydata$age<=69] <- "60-69"
mydata$age[mydata$age>=70 & mydata$age<=79] <- "70-79"

Now I want to add this as a variable in my table, so that every age listed in my data table is assigned the right age category. Here is an example of what it should look like:

  Age Var2  Var3  Var4     Var5 AgeClass
  32  John  Green Married  6'1  30-39
  47  Julia Stone Divorced 5'4  40-49
  72  Mike  White Divorced 5'8  70-79
  ...

Does anyone have an idea how to do that? Thank you!

Upvotes: 0

Views: 804

Answers (1)

tpbilton

Reputation: 173

How about the cut function? For example:

df = data.frame(Age=c(32,47,72), 
                Var2=c("John","Julia","Mike"), 
                Var3=c("Green","Stone","White"),
                Var4=c("Married","Divorced","Divorced"),
                Var5=c("6'1","5'4","5'8"))
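# right = FALSE makes the intervals [20,30), [30,40), ... so that age 30 is labelled "30-39"
df$AgeClass = cut(df$Age, breaks = seq(20,80,10), right = FALSE,
                  labels=paste0(seq(20,70,10),"-",seq(30,80,10)-1))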
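
To check how cut treats the edges of each category, here is a minimal sketch on made-up ages (the data frame and its values are hypothetical, chosen only to hit the boundaries). It assumes the categories should be closed on the left, which is what right = FALSE requests. By default cut closes intervals on the right, so an age of exactly 30 would land in "20-29".

```r
# Hypothetical sample data: boundary ages plus one value outside the range
mydata <- data.frame(Age = c(20, 29, 30, 47, 72, 85))

# Bin edges: 20, 30, ..., 80 (80 closes the last category, 70-79)
breaks <- seq(20, 80, by = 10)

# Labels built from the lower edge of each bin: "20-29", "30-39", ...
labels <- paste0(head(breaks, -1), "-", head(breaks, -1) + 9)

# right = FALSE gives half-open intervals [20,30), [30,40), ...
mydata$AgeClass <- cut(mydata$Age, breaks = breaks,
                       labels = labels, right = FALSE)

print(mydata)
```

Ages outside 20-79 (here 85) come back as NA, so it may be worth counting them with sum(is.na(mydata$AgeClass)) before relying on the new column.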

Upvotes: 1
