Reputation: 7745
The application here is grouping U.S. states into regions.
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")
My data looks like:
SomeVar | State
---------------
300 | AL
331 | GA
103 | MA
500 | FL
And I would like to add a "region" column to the data according to the groupings above, like so:
SomeVar | State | Region
------------------------
300 | AL | 2
331 | GA | 2
103 | MA | 1
500 | FL | 2
Is there a straightforward way to assign factors based on groupings?
Upvotes: 2
Views: 249
Reputation: 226077
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")
grouptab <- rbind(data.frame(State=group1,grp=1),
                  data.frame(State=group2,grp=2))
DF <- read.table(text="SomeVar State
300 AL
331 GA
103 MA
500 FL",header=TRUE)
merge(DF,grouptab)
Or more generally:
groupList <- list(group1,group2)
grouptab <- data.frame(State=unlist(groupList),
                       grp=rep(seq_along(groupList),
                               sapply(groupList,length)))
(There may be other ways to do this -- I tried mapply but couldn't figure it out quickly.)
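For what it's worth, here is one way the mapply route might look. This is only a sketch, not necessarily what was attempted above: it assumes the same group1/group2 vectors, and the helper name pieces is made up for illustration.

```r
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")
groupList <- list(group1, group2)

# mapply() walks over the groups and their indices in parallel.
# SIMPLIFY = FALSE keeps the result as a list of small data frames.
pieces <- mapply(function(states, id) data.frame(State = states, grp = id),
                 groupList, seq_along(groupList), SIMPLIFY = FALSE)

# Stack the per-group data frames into one lookup table.
grouptab <- do.call(rbind, pieces)
```

The result should have one row per state and can be passed to merge in the same way as the grouptab built with rep.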
I think suitable arguments to merge (e.g. all, all.x, all.y) would handle the missing-group cases in various ways.
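As a rough illustration of the missing-group case, the sketch below adds a hypothetical TX row (not in the original data) that belongs to neither group, and compares the default merge with all.x = TRUE. Note that merge sorts the result by the key column by default, so the row order differs from the input.

```r
grouptab <- data.frame(
  State = c("ME", "NH", "VT", "MA", "CT", "RI", "FL", "GA", "AL", "MS", "LA"),
  grp   = rep(c(1, 2), c(6, 5))
)

# Same data as above, plus a made-up TX row that is in no group.
DF <- read.table(text="SomeVar State
300 AL
331 GA
103 MA
500 FL
42 TX",header=TRUE)

inner <- merge(DF, grouptab)                # default: the unmatched TX row is dropped
left  <- merge(DF, grouptab, all.x = TRUE)  # TX is kept, with grp set to NA
```

Using all.y = TRUE instead would keep every state in grouptab, including those with no data, and all = TRUE would keep both kinds of unmatched rows.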
Upvotes: 3
Reputation: 51640
Assuming your data frame is called df and that every state is in either group 1 or group 2, you can do:
df$Region <- ifelse(df$State %in% group1, 1, 2)
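If that assumption might not hold, a nested ifelse can leave unmatched states as NA instead of silently putting them in group 2. A minimal sketch, using the column names from the question and a hypothetical TX row added purely for illustration:

```r
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")

# Question's data plus a made-up TX row that is in neither group.
df <- data.frame(SomeVar = c(300, 331, 103, 500, 42),
                 State   = c("AL", "GA", "MA", "FL", "TX"))

# Test group1 first, then group2; anything else becomes NA.
df$Region <- ifelse(df$State %in% group1, 1,
             ifelse(df$State %in% group2, 2, NA))
```

This gets unwieldy beyond a handful of groups, at which point a lookup table is probably easier to maintain.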
Upvotes: 1
Reputation: 132616
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")
DF <- read.table(text="SomeVar State
300 AL
331 GA
103 MA
500 FL",header=TRUE)
DF$Region <- NA
DF$Region[DF$State %in% group1] <- 1
DF$Region[DF$State %in% group2] <- 2
# SomeVar State Region
# 1 300 AL 2
# 2 331 GA 2
# 3 103 MA 1
# 4 500 FL 2
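The same idea could be extended to any number of groups with a loop, and the result converted to a factor, since the question asks about factors. This is only a sketch: groupList is an assumed name for a list holding the group vectors.

```r
group1 <- c("ME", "NH", "VT", "MA", "CT", "RI")
group2 <- c("FL", "GA", "AL", "MS", "LA")
groupList <- list(group1, group2)

DF <- read.table(text="SomeVar State
300 AL
331 GA
103 MA
500 FL",header=TRUE)

# Start with NA so states in no group stay missing.
DF$Region <- NA_integer_
for (i in seq_along(groupList)) {
  DF$Region[DF$State %in% groupList[[i]]] <- i
}

# Convert to a factor with one level per group.
DF$Region <- factor(DF$Region, levels = seq_along(groupList))
```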
Upvotes: 1