Mathias
Mathias

Reputation: 41

Separate One Column Into Five Columns

My dataset shows 6 variables in 153 observations, all saved in one column, which looks like this:

Ozone.Solar.R.Wind.Temp.Month.Day
1                   41,190,7.4,67,5,1
2                     36,118,8,72,5,2
3                  12,149,12.6,74,5,3
4                  18,313,11.5,62,5,4
5                   NA,NA,14.3,56,5,5

I am now looking for a way to separate those variables into 6 different columns, which in the end should look like this:

Ozone Solar Wind Temp Month Day
41    190   7.4  67   5     1    
36    118   8    72   5     2  
12    149   12.6 74   5     3  
18    313   11.5 62   5     4  
NA    NA    14.3 56   5     5 

Thanks in advance for your help!

Upvotes: 1

Views: 94

Answers (2)

akrun
akrun

Reputation: 886978

We can do this easily in base R, using read.csv

out <- read.csv(text = df[[1]], header = FALSE, col.names = scan(text = names(df), 
             what = "", sep=".", quiet = TRUE), stringsAsFactors = FALSE)
out
#  Ozone Solar Wind Temp Month Day
#1    41   190  7.4   67     5   1
#2    36   118  8.0   72     5   2
#3    12   149 12.6   74     5   3
#4    18   313 11.5   62     5   4
#5    NA    NA 14.3   56     5   5

data

df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = c("41,190,7.4,67,5,1", 
 "36,118,8,72,5,2", "12,149,12.6,74,5,3", "18,313,11.5,62,5,4", 
 "NA,NA,14.3,56,5,5")), class = "data.frame", row.names = c("1", 
  "2", "3", "4", "5"))

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388817

We can use separate in the following way without hard coding any of the values.

tidyr::separate(df, names(df), sep = ",", into = strsplit(names(df), "\\.")[[1]])

#  Ozone Solar Wind Temp Month Day
#1    41   190  7.4   67     5   1
#2    36   118    8   72     5   2
#3    12   149 12.6   74     5   3
#4    18   313 11.5   62     5   4
#5    NA    NA 14.3   56     5   5

Using only base R we know can split the strings on comma using strsplit and rbind the list together and assign names using setNames.

setNames(do.call(rbind.data.frame, strsplit(as.character(df[[1]]), ",")), 
                 strsplit(names(df), "\\.")[[1]]) 

data

df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = structure(c(4L, 
3L, 1L, 2L, 5L), .Label = c("12,149,12.6,74,5,3", "18,313,11.5,62,5,4", 
"36,118,8,72,5,2", "41,190,7.4,67,5,1", "NA,NA,14.3,56,5,5"), class = 
"factor")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))

Upvotes: 1

Related Questions