Reputation: 1285
Here is my problem. I've got data on city codes (GeoCode) and zip codes (PostCode). Often several zip codes correspond to a single city code. If that's the case, I want to make a column with a string of zip codes corresponding to the same city:
ID<-1:10
GeoCode<-c("AA","BB","BB","CC","CC","CC","DD","DD","DD","DD")
PostCode<-c("01","10","11","20","21","22","30","31","32","33")
data<-data.frame(ID,GeoCode,PostCode)
I want to make such table. For example "20_21_22" belong to City code CC
ID GeoCode PostCode strPostcode
1 1 AA 01 01
2 2 BB 10 10_11
3 3 BB 11 10_11
4 4 CC 20 20_21_22
5 5 CC 21 20_21_22
6 6 CC 22 20_21_22
7 7 DD 30 30_31_32_33
8 8 DD 31 30_31_32_33
9 9 DD 32 30_31_32_33
10 10 DD 33 30_31_32_33
Upvotes: 1
Views: 56
Reputation: 1354
Or you can try this one
data2 <- data %>%
group_by(GeoCode) %>%
mutate(strPostCode = paste0(unique(PostCode), collapse = "_"))
# ID GeoCode PostCode strPostCode
# <int> <chr> <chr> <chr>
# 1 1 AA 01 01
# 2 2 BB 10 10_11
# 3 3 BB 11 10_11
# 4 4 CC 20 20_21_22
# 5 5 CC 21 20_21_22
# 6 6 CC 22 20_21_22
# 7 7 DD 30 30_31_32_33
# 8 8 DD 31 30_31_32_33
# 9 9 DD 32 30_31_32_33
# 10 10 DD 33 30_31_32_33
Upvotes: 1
Reputation: 101139
The base R option with ave
by @akrun is efficient. Here is another workaround
merge(data,
aggregate(PostCode ~ ., data[-1], paste0, collapse = "_"),
by = "GeoCode",
all = TRUE
)
which gives
GeoCode ID PostCode.x PostCode.y
1 AA 1 01 01
2 BB 2 10 10_11
3 BB 3 11 10_11
4 CC 4 20 20_21_22
5 CC 5 21 20_21_22
6 CC 6 22 20_21_22
7 DD 7 30 30_31_32_33
8 DD 8 31 30_31_32_33
9 DD 9 32 30_31_32_33
10 DD 10 33 30_31_32_33
Upvotes: 1
Reputation: 886968
We could group by 'GeoCode' and paste
all the unique
'PostCode' in mutate
library(dplyr)
library(stringr)
data %>%
group_by(GeoCode) %>%
mutate(strPostcode = str_c(unique(PostCode), collapse="_"))
# A tibble: 10 x 4
# Groups: GeoCode [4]
# ID GeoCode PostCode strPostcode
# <int> <chr> <chr> <chr>
# 1 1 AA 01 01
# 2 2 BB 10 10_11
# 3 3 BB 11 10_11
# 4 4 CC 20 20_21_22
# 5 5 CC 21 20_21_22
# 6 6 CC 22 20_21_22
# 7 7 DD 30 30_31_32_33
# 8 8 DD 31 30_31_32_33
# 9 9 DD 32 30_31_32_33
#10 10 DD 33 30_31_32_33
Or an option with base R
data$strPostcode <- with(data, ave(PostCode, GeoCode, FUN =
function(x) paste(unique(x), collapse="_")))
Upvotes: 2