Reputation: 672
I have a dataframe df
and the first column looks like this:
[1] "760–563" "01455–1" "4672–04" "11–31234" "22–12" "11111–53" "111–21" "17–356239" "14–22352" "531–353"
I want to split that column on -.
What I'm doing is
strsplit(df[,1], "-")
The problem is that it's not working: it returns a list in which the elements are not split. I already tried adding the parameter fixed = TRUE and passing a regular expression as the split parameter, but nothing worked.
What is weird is that if I replicate the column on my own, for example:
myVector <- c("760–563", "01455–1", "4672–04", "11–31234", "22–12", "11111–53", "111–21", "17–356239", "14–22352", "531–353")
and then apply strsplit, it works.
I already checked my column's class and type with class(df[,1]) and typeof(df[,1]); both return character, so that looks fine.
I was also using the data frame with dplyr, so it was of type tbl_df. I converted it back to a plain data frame, but that didn't work either.
I also tried apply(df, 2, function(x) strsplit(x, "-", fixed = T)), but that didn't work either.
Any clues?
Upvotes: 2
Views: 1756
Reputation: 93813
I don't know how you did it, but you have two different types of dashes:
charToRaw(substr("760–563", 4, 4))
#[1] 96
charToRaw("-")
#[1] 2d
So strsplit() is working just fine; it's just that the dash you are splitting on isn't the one in your original data. Adjust the split character, and away you go:
strsplit("760–563", "–")
#[[1]]
#[1] "760" "563"
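If it is unclear which dash a column holds, one option is to look at the Unicode code point of the separator and then split on a small character class that covers the common dash variants. This is only a sketch: the vector `x` is a made-up sample, and it assumes the strings are valid UTF-8 (the `\u` escapes guarantee that for the sample).

```r
# Hypothetical sample: the first value uses an en dash (U+2013),
# the second an ordinary hyphen-minus (U+002D).
x <- c("760\u2013563", "01455-1")

# Inspect the Unicode code point of the separator character.
utf8ToInt(substr(x[1], 4, 4))  # 8211, i.e. U+2013 (en dash)
utf8ToInt("-")                 # 45, i.e. U+002D (hyphen-minus)

# Split on hyphen-minus, en dash, or em dash in one pass.
# The leading "-" inside the brackets is treated as a literal hyphen.
parts <- strsplit(x, "[-\u2013\u2014]")
parts
```

Both elements of `parts` should come back as two-piece character vectors, whichever dash the value contained.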
Upvotes: 5
Reputation: 4024
You can just split on a non-numeric character:
library(dplyr)
library(tidyr)
data %>%
separate(your_column,
c("first_number", "second_number"),
sep = "[^0-9]")
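The same idea can be sketched in base R, without dplyr or tidyr. The data frame `df` and its column `id` below are hypothetical stand-ins for the question's data, and the sketch assumes every value contains exactly one separator.

```r
# Hypothetical stand-in for the question's data frame.
df <- data.frame(id = c("760\u2013563", "01455\u20131", "22-12"),
                 stringsAsFactors = FALSE)

# Split each value on any run of non-digit characters,
# so it does not matter which kind of dash is present.
parts <- strsplit(df$id, "[^0-9]+")

# Every value yields exactly two pieces here, so rbind()
# produces a two-column character matrix.
m <- do.call(rbind, parts)
df$first_number  <- m[, 1]
df$second_number <- m[, 2]
df
```

The pieces stay as character strings, which preserves leading zeros such as "01455"; wrap them in as.integer() only if numeric values are wanted.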
Upvotes: 2