stochastiq

Reputation: 269

Duplicate rows in R that satisfy a condition

I want to map one category to multiple categories in R.

I have a dataframe:

Region var1 var2
Texas  XX   XX 
Texas  XX   XX

I need to relabel Texas as both "Dallas" and "Houston". In other words, "Dallas" and "Houston" will share the same values for var1 and var2.

How do I create a dataframe like this:

Region var1 var2 Region2
Texas  XX   XX   Dallas
Texas  XX   XX   Dallas
Texas  XX   XX   Houston
Texas  XX   XX   Houston

Presumably this involves duplicating the rows that satisfy the condition Region == "Texas"?

Upvotes: 1

Views: 931

Answers (3)

Sraffa

Reputation: 1668

With dplyr, assuming you have a dataframe with subregions:

library(dplyr)

df <- data.frame(
    Region = c("Texas", "Texas"),
    var1 = c("XX", "XX"),
    var2 = c("XX", "XX")
)

# Lookup table mapping each region to its subregions
regions <- data.frame(
    Region = c("Texas", "Texas"),
    Region2 = c("Houston", "Dallas")
)

df %>% right_join(regions, by = "Region")
#   Region var1 var2 Region2
# 1  Texas   XX   XX Houston
# 2  Texas   XX   XX Houston
# 3  Texas   XX   XX  Dallas
# 4  Texas   XX   XX  Dallas
# (row order may differ depending on the dplyr version)

Upvotes: 0

akrun

Reputation: 887038

Another option, without a merge, is to use transform to create 'Region2' and then replicate the row indices to expand the dataset:

transform(df1, Region2 = c("Dallas", "Houston"))[rep(seq_len(nrow(df1)), each = 2), ]
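
Note that the one-liner above lines up because the example has exactly as many rows as there are new labels. A minimal sketch of a more general variant, assuming every row should be copied once per label, could look like this. The names `subregions` and `expanded` are illustrative and not from the original answer.

```r
# Example data mirroring the question (values are placeholders)
df1 <- data.frame(
  Region = c("Texas", "Texas"),
  var1 = c("XX", "XX"),
  var2 = c("XX", "XX")
)

# Hypothetical vector of new labels; any length works
subregions <- c("Dallas", "Houston")

n <- nrow(df1)

# Repeat the whole block of rows once per subregion
expanded <- df1[rep(seq_len(n), times = length(subregions)), ]

# Label each repeated block with one subregion
expanded$Region2 <- rep(subregions, each = n)

# Reset the row names, which would otherwise be 1, 2, 1.1, 2.1
rownames(expanded) <- NULL

expanded
```

This should give the Dallas rows first and the Houston rows after them, as in the layout asked for in the question.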

Upvotes: 1

thelatemail

Reputation: 93813

This is essentially a merge operation if you make a separate lookup table for your new regions:

# dat is the data frame from the question
big <- data.frame(Region = rep("Texas", 2), Region2 = c("Dallas", "Houston"))
merge(dat, big)
#  Region var1 var2 Region2
#1  Texas   XX   XX  Dallas
#2  Texas   XX   XX Houston
#3  Texas   XX   XX  Dallas
#4  Texas   XX   XX Houston
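
If the real data also contains regions that should not be split, a plain merge drops them because they have no match in the lookup table. A rough sketch of one way around that, assuming a hypothetical extra "Ohio" row and assuming unmatched rows should simply keep their original label in Region2:

```r
# Hypothetical data: one region to split, one to leave alone
dat <- data.frame(
  Region = c("Texas", "Texas", "Ohio"),
  var1 = c("XX", "XX", "YY"),
  var2 = c("XX", "XX", "YY")
)

big <- data.frame(
  Region = rep("Texas", 2),
  Region2 = c("Dallas", "Houston")
)

# all.x = TRUE keeps rows of dat that have no match in big
out <- merge(dat, big, by = "Region", all.x = TRUE)

# Fall back to the original label where no subregion was assigned
out$Region2 <- ifelse(
  is.na(out$Region2),
  as.character(out$Region),
  as.character(out$Region2)
)

out
```

Here only the Texas rows are duplicated; the Ohio row appears once. Whether unmatched rows should keep their own label or stay NA depends on what you do with Region2 afterwards.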

Upvotes: 3
