Reputation: 23
I'm trying to assign areas to observations in a dataframe in R, based on grid square IDs. I have the following dataframe (df):
year month square
1 2000 2 A1
2 2000 2 B2
3 2000 2 H5
4 2000 2 J9
5 2000 2 A2
6 2000 3 N8
7 2000 3 M9
8 2000 3 C7
I'd like to add another column for "area", assigning each observation to "North", "East", "South" or "West" based on the grid square. I've tried the following for loops, neither of which did anything:
for(i in 1:length(df$square)) {
for(j in 1:length(N)) {
if(df$square[i]==N[j]){
df$area[i]=="N"}
}
}
for(i in 1:length(df$square)) {
if(any(df$square==N)==T){
df$area[i]=="North"}
}
Where "N" is an object I created containing the squares located in the north, i.e.:
N <- c("A1","A2","B2")
I did find the following related question, but I'm wondering if it's different when characters are involved: Assign a group number based on another column by group in R
Any help would be appreciated. Thanks
Upvotes: 0
Views: 260
Reputation: 2489
In R it is usually best to avoid loops, and especially nested loops. For this case, I prefer sapply().
N <- c("A1","A2","B2")
# assume these are the other designations
S <- c("H5", "J9")
E <- c("N8", "M9")
W <- c("C7")
# mydat is the data frame from the question (called df there)
mydat$area <- sapply(mydat$square, function(x) {
  if (x %in% N) return("North")
  if (x %in% S) return("South")
  if (x %in% E) return("East")
  if (x %in% W) return("West")
  NA_character_  # square not listed in any area
})
mydat
year month square area
2000 2 A1 North
2000 2 B2 North
2000 2 H5 South
2000 2 J9 South
2000 2 A2 North
2000 3 N8 East
2000 3 M9 East
2000 3 C7 West
When you start having large data sets, *apply() functions tend to be tidier than explicit loops, though the biggest speed gains in R usually come from fully vectorised operations.
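For comparison, the same lookup can be done in one vectorised step with a named vector. This is only a sketch: the S, E and W contents are the assumptions made above, and `lookup` is a name I made up for illustration.

```r
# Example data matching the question's df (called mydat in this answer)
mydat <- data.frame(year = 2000,
                    month = c(2, 2, 2, 2, 2, 3, 3, 3),
                    square = c("A1", "B2", "H5", "J9", "A2", "N8", "M9", "C7"),
                    stringsAsFactors = FALSE)

N <- c("A1", "A2", "B2")
# Assumed designations, as above
S <- c("H5", "J9")
E <- c("N8", "M9")
W <- c("C7")

# Named character vector: names are squares, values are areas
# (lookup is a hypothetical helper name)
lookup <- c(setNames(rep("North", length(N)), N),
            setNames(rep("South", length(S)), S),
            setNames(rep("East",  length(E)), E),
            setNames(rep("West",  length(W)), W))

# Index by name for all rows at once; squares not in lookup become NA.
# This relies on square being character, not factor.
mydat$area <- unname(lookup[mydat$square])
```

There is no per-row function call here, so it should scale better than either a loop or sapply() on long columns.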
Upvotes: 0
Reputation: 161085
Instead of defining vectors like N, I recommend a second data.frame pairing squares with areas:
df <- data.frame(year = 2000,
month = c(2,2,2,2,2,3,3,3),
square = c("A1", "B2", "H5", "J9", "A2", "N8", "M9", "C7"),
stringsAsFactors = FALSE)
areas <- data.frame(square = c("A1", "A2", "B1", "H5", "J9", "M9", "N8"),
area = c("N", "N", "N", "W", "E", "S", "S"),
stringsAsFactors = FALSE)
With that, just do a merge:
merge(df, areas, by = "square", all.x = TRUE)
# square year month area
# 1 A1 2000 2 N
# 2 A2 2000 2 N
# 3 B2 2000 2 <NA>
# 4 C7 2000 3 <NA>
# 5 H5 2000 2 W
# 6 J9 2000 2 E
# 7 M9 2000 3 S
# 8 N8 2000 3 S
(The NAs are because of an incomplete areas definition.)
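If it helps, here is a rough sketch of two follow-ups using the same df and areas: listing which squares have no entry in areas, and doing the lookup with match() so that df keeps its original row order (merge() sorts on the key by default). The name `missing_squares` is just for illustration.

```r
df <- data.frame(year = 2000,
                 month = c(2, 2, 2, 2, 2, 3, 3, 3),
                 square = c("A1", "B2", "H5", "J9", "A2", "N8", "M9", "C7"),
                 stringsAsFactors = FALSE)
areas <- data.frame(square = c("A1", "A2", "B1", "H5", "J9", "M9", "N8"),
                    area = c("N", "N", "N", "W", "E", "S", "S"),
                    stringsAsFactors = FALSE)

# Squares that appear in df but have no row in areas; these cause the NAs
missing_squares <- setdiff(df$square, areas$square)

# match() gives, for each df$square, its row position in areas (NA if absent),
# so the new column lines up with df's existing row order
df$area <- areas$area[match(df$square, areas$square)]
```

Checking `missing_squares` before the join is a cheap way to spot typos or gaps in the areas table.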
Upvotes: 1
Reputation: 10671
d <- data.frame(year = rep(2000, 8), month = rep(3,8),
square = c("A1", "B2", "H5", "J9", "A2", "N8", "M9", "C7"))
N <- c("A1","A2","B2")
# create the column first so each row can be filled in
d$area <- NA_character_
for (i in seq_len(nrow(d))) {
  if (d$square[i] %in% N) {
    d$area[i] <- "North"
  } else {
    d$area[i] <- "Somewhere Else"
  }
}
Layer in else if () statements inside the for loop for the other cardinal-direction ID vectors.
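A minimal sketch of what that could look like. The S, E and W vectors are assumed here for illustration, since the question only defines N.

```r
d <- data.frame(year = rep(2000, 8), month = rep(3, 8),
                square = c("A1", "B2", "H5", "J9", "A2", "N8", "M9", "C7"),
                stringsAsFactors = FALSE)

N <- c("A1", "A2", "B2")
# Assumed contents, not given in the question
S <- c("H5", "J9")
E <- c("N8", "M9")
W <- c("C7")

# Create the column up front so each row can be assigned in the loop
d$area <- NA_character_

for (i in seq_len(nrow(d))) {
  if (d$square[i] %in% N) {
    d$area[i] <- "North"
  } else if (d$square[i] %in% S) {
    d$area[i] <- "South"
  } else if (d$square[i] %in% E) {
    d$area[i] <- "East"
  } else if (d$square[i] %in% W) {
    d$area[i] <- "West"
  } else {
    d$area[i] <- "Somewhere Else"  # square not in any vector
  }
}
```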
Upvotes: 0