Martin

Reputation: 41

Create a new variable by pasting two factor variables

If the two columns in my dataframe are:

species <- c("Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue", "Dengue") 

And

strain <- c(1, NA, 2, NA, NA, 3, 4, 5)

How do I get a column that combines the two to say Dengue 1, etc.?

Upvotes: 1

Views: 594

Answers (2)

M--

Reputation: 29119

You can use ifelse to suppress NA in your final output:

paste0(species, ifelse(is.na(strain),"",strain))

 #>  [1] "Dengue1" "Dengue"  "Dengue2" "Dengue"  "Dengue"  "Dengue3" "Dengue4" "Dengue5"
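The output above has no space between the name and the number, while the question asks for "Dengue 1". Here is a minimal base R sketch, assuming a single space is the wanted separator. The as.character() calls are an added precaution in case the columns are factors (as the title suggests), since ifelse() drops factor attributes and would otherwise return the integer codes.

```r
# Sample data from the question
species <- c("Dengue", "Dengue", "Dengue", "Dengue",
             "Dengue", "Dengue", "Dengue", "Dengue")
strain <- c(1, NA, 2, NA, NA, 3, 4, 5)

# Convert to character first so factor columns behave like plain strings
sp <- as.character(species)
st <- as.character(strain)

# Keep the bare species name where strain is missing,
# otherwise join the two with a space (the default separator of paste)
combined <- ifelse(is.na(st), sp, paste(sp, st))
print(combined)
```

Rows with a missing strain keep only the species name, so no trailing space or "NA" text ends up in the result.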

Upvotes: 1

akrun

Reputation: 887571

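We can use unite from tidyr, which joins the columns with "_" by default (a missing strain is pasted as the text "NA")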

library(dplyr)
library(tidyr)
library(stringr)
df1 %>% 
     unite(species, species, strain)

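If a row with a missing strain should give NA rather than "Dengue_NA", use str_c, which propagates NA; the fill step then carries the last non-missing value down into those rows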

df1 %>%
   transmute(species = str_c(species, strain, sep="_")) %>%
   fill(species)

If the rows with a missing strain should be dropped instead, filter them out first

df1 %>%
   filter(!is.na(strain)) %>%
   transmute(species = str_c(species, strain, sep="_"))
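For comparison, the filter-then-combine step can be sketched in base R without loading any packages. This is only an illustration of the same idea, not part of the original answer; the names kept and result are made up for the example.

```r
# Same data frame as in the "data" section of this answer
df1 <- data.frame(
  species = rep("Dengue", 8),
  strain  = c(1, NA, 2, NA, NA, 3, 4, 5)
)

# Drop rows where strain is missing
kept <- df1[!is.na(df1$strain), ]

# Join the remaining values with "_" and keep only the combined column
result <- data.frame(species = paste(kept$species, kept$strain, sep = "_"))
print(result)
```

Because the missing rows are removed before pasting, the text "NA" never appears in the combined column.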

data

df1 <- data.frame(species, strain)

Upvotes: 1
