Reputation: 41
My dataset shows 6 variables in 153 observations, all saved in one column, which looks like this:
Ozone.Solar.R.Wind.Temp.Month.Day
1 41,190,7.4,67,5,1
2 36,118,8,72,5,2
3 12,149,12.6,74,5,3
4 18,313,11.5,62,5,4
5 NA,NA,14.3,56,5,5
I am now looking for a way to separate those variables into 6 different columns, which in the end should look like this:
Ozone Solar Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8 72 5 2
12 149 12.6 74 5 3
18 313 11.5 62 5 4
NA NA 14.3 56 5 5
Thanks in advance for your help!
Upvotes: 1
Views: 94
Reputation: 886978
We can do this easily in base R
, using read.csv
out <- read.csv(text = df[[1]], header = FALSE, col.names = scan(text = names(df),
what = "", sep=".", quiet = TRUE), stringsAsFactors = FALSE)
out
# Ozone Solar Wind Temp Month Day
#1 41 190 7.4 67 5 1
#2 36 118 8.0 72 5 2
#3 12 149 12.6 74 5 3
#4 18 313 11.5 62 5 4
#5 NA NA 14.3 56 5 5
df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = c("41,190,7.4,67,5,1",
"36,118,8,72,5,2", "12,149,12.6,74,5,3", "18,313,11.5,62,5,4",
"NA,NA,14.3,56,5,5")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
Upvotes: 1
Reputation: 388817
We can use separate
in the following way without hard coding any of the values.
tidyr::separate(df, names(df), sep = ",", into = strsplit(names(df), "\\.")[[1]])
# Ozone Solar Wind Temp Month Day
#1 41 190 7.4 67 5 1
#2 36 118 8 72 5 2
#3 12 149 12.6 74 5 3
#4 18 313 11.5 62 5 4
#5 NA NA 14.3 56 5 5
Using only base R we know can split the strings on comma using strsplit
and rbind
the list together and assign names using setNames
.
setNames(do.call(rbind.data.frame, strsplit(as.character(df[[1]]), ",")),
strsplit(names(df), "\\.")[[1]])
data
df <- structure(list(Ozone.Solar.Wind.Temp.Month.Day = structure(c(4L,
3L, 1L, 2L, 5L), .Label = c("12,149,12.6,74,5,3", "18,313,11.5,62,5,4",
"36,118,8,72,5,2", "41,190,7.4,67,5,1", "NA,NA,14.3,56,5,5"), class =
"factor")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
Upvotes: 1