user4791235

Data Frame of Factors: Split column into two and extract number

I have a data frame, df, whose first column, df[1], looks like this:

 Well and Depth  
   Black Peak 1000
   Black Peak 1001
   Black Peak 1002
   Black Peak 1003  

R is currently treating this column as a factor, but I want to split it into two data frame columns/vectors: one with the text as character strings and one with the numbers as numeric values, so that it looks something like this:

   Well            Depth
   "Black Peak"     1000
   "Black Peak"     1001
   "Black Peak"     1002
   "Black Peak"     1003  

The depth numbers are going to be plotted.

Upvotes: 0

Views: 92

Answers (3)

akrun

Reputation: 887098

We can use separate from tidyr

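library(tidyr)
# Split on the whitespace that precedes the digits; convert = TRUE makes Depth numeric
separate(df, `Well and Depth`, into = c("Well", "Depth"), sep = "\\s+(?=[0-9])", convert = TRUE)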
#        Well Depth
#1 Black Peak  1000
#2 Black Peak  1001
#3 Black Peak  1002
#4 Black Peak  1003

Upvotes: 0

Onyambu

Reputation: 79208

# Prototype data frame: one column per capture group, with the target types
HERE <- data.frame(WELL = character(), DEPTH = numeric())
strcapture("(.*)\\s(\\d+)$", as.character(df[, 1]), HERE)
        WELL DEPTH
1 Black Peak  1000
2 Black Peak  1001
3 Black Peak  1002
4 Black Peak  1003
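
For anyone reading later: strcapture() (in utils, R 3.4.0 or later) matches each string against the pattern and fills one column of the prototype data frame per capture group, coercing to that column's type. Below is a minimal self-contained sketch, assuming the input looks like the sample in the question. The names x, proto and out are only for illustration.

```r
# Sample input, assumed to match the factor column from the question
x <- factor(c("Black Peak 1000", "Black Peak 1001",
              "Black Peak 1002", "Black Peak 1003"))

# Prototype: one column per capture group, with the desired types.
# stringsAsFactors = FALSE keeps Well as character on R versions before 4.0.
proto <- data.frame(Well = character(), Depth = numeric(),
                    stringsAsFactors = FALSE)

# Group 1 = everything before the last whitespace, group 2 = trailing digits
out <- strcapture("(.*)\\s(\\d+)$", as.character(x), proto)

str(out)  # inspect the column types before plotting Depth
```

Rows that do not match the pattern should come back as NA, so it is worth checking anyNA(out$Depth) on real data.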

Upvotes: 1

pogibas

Reputation: 28339

You can try this:

df$Well  <- sub("(^.*) [0-9]+$", "\\1", df$`Well and Depth`)
df$Depth <- as.numeric(sub(".* ([0-9]+$)", "\\1", df$`Well and Depth`))

Data:

df <- structure(list(`Well and Depth` = structure(1:4, .Label = c("Black Peak 1000", 
"Black Peak 1001", "Black Peak 1002", "Black Peak 1003"), class = "factor")),
.Names = "Well and Depth", row.names = c(NA, -4L), class = "data.frame")
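
To check that the result has the types asked for in the question, here is a rough end-to-end sketch of the same sub() approach. It rebuilds the sample data with data.frame() rather than structure(), and the helper name well_depth is made up for this example.

```r
# Rebuild the sample data; check.names = FALSE keeps the space in the column name
df <- data.frame(`Well and Depth` = factor(c("Black Peak 1000", "Black Peak 1001",
                                             "Black Peak 1002", "Black Peak 1003")),
                 check.names = FALSE)

# Convert the factor to character once, then reuse it
well_depth <- as.character(df$`Well and Depth`)

# Keep everything before the final " <digits>" as the well name
df$Well <- sub("^(.*) [0-9]+$", "\\1", well_depth)

# Keep only the trailing digits and convert them to numeric
df$Depth <- as.numeric(sub("^.* ([0-9]+)$", "\\1", well_depth))

# Drop the original combined column
df$`Well and Depth` <- NULL

str(df)  # Well should be character, Depth numeric
```

Note that sub() returns the input unchanged when the pattern does not match, so as.numeric() would give NA (with a warning) for any row that does not end in a number.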

Upvotes: 2
