Reputation: 4232
I am providing a data frame to tidyr::separate() and getting unexpected results. I have a minimal working example below where I show how I am using it, what I expect it to produce, and what it is actually producing. Why is this not working?
library(magrittr)  # provides the %>% pipe used below
# Create toy data frame
dat <- data.frame(text = c("time_suffer|suffer_employ|suffer_sick"),
                  stringsAsFactors = FALSE)
# Separate variable into 3 columns a,b,c using | as a delimiter
dat %>% tidyr::separate(., col = "text", into = c("a","b","c"), sep = "|")
# What I'm expecting
data.frame(a = "time_suffer", b = "suffer_employ", c = "suffer_sick")
# What I'm actually getting:
data.frame(a = NA, b = "t", c = "1")
I am also getting the warning "Warning message: Expected 3 pieces. Additional pieces discarded in 1 rows [1]."
Upvotes: 1
Views: 379
Reputation: 15062
According to the documentation, the sep argument to separate is interpreted as a regular expression if it is a character (extremely useful if you have complicated separators). This does mean, however, that you need to escape characters with special meaning in regular expressions if you want to match on them literally. In a regular expression, | is the alternation operator, so an unescaped "|" matches the empty string and the text is not split at the pipes. Use "\\|" as your separator:
library(tidyverse)
dat <- data.frame(text = c("time_suffer|suffer_employ|suffer_sick"),
                  stringsAsFactors = FALSE)
dat %>%
  tidyr::separate(., col = "text", into = c("a","b","c"), sep = "\\|")
#>             a             b           c
#> 1 time_suffer suffer_employ suffer_sick
Created on 2019-04-02 by the reprex package (v0.2.1)
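As a supplementary sketch (not part of the original reprex), the same regex-versus-literal distinction can be seen in base R with strsplit(), which also treats its split argument as a regular expression unless fixed = TRUE is set. The variable names below are made up for illustration.

```r
# Illustrative only: variable names are hypothetical, not from the thread.
x <- "time_suffer|suffer_employ|suffer_sick"

# Unescaped "|" is regex alternation, so this does NOT split on the pipe.
unescaped <- strsplit(x, "|")[[1]]

# Three ways to match a literal pipe:
escaped    <- strsplit(x, "\\|")[[1]]             # backslash escape
char_class <- strsplit(x, "[|]")[[1]]             # character class
literal    <- strsplit(x, "|", fixed = TRUE)[[1]] # no regex at all

print(escaped)
```

The escaped form "\\|" and the character class "[|]" are both ordinary regular expressions, so either should also work as the sep argument to tidyr::separate().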
Upvotes: 4