Split lines and create new dataframe

Question

I have a file, file.txt, with data that looks as follows:

Not all lines contain all the required information. For example, the last line above does not have an entry for ReqPriority like the other lines. I split the data into a data frame using,

data.frame(do.call(rbind, strsplit(readLines('file.txt'), 'ows_', fixed = TRUE)))

but due to the missing entries in some of the lines the dataframe does not come out properly.

Any suggestions on how I can read this into a data frame and fill in the missing values with NA, like this?

Req_Name1   ReqPriority    ReqDate
John        High           2012-10-10
Jack        Low            2012-11-10
John        NA             2012-10-10

flodel · Accepted Answer

Since each row looks a lot like how data.frames are created in R, I thought it would be fun to work it this way:

x <- readLines('file.txt')
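# Assumes each line holds only the ows_key='value' pairs; adjust the pattern if your lines carry surrounding markup
x <- gsub("^(.*)$", "data.frame(\\1)", x)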
x <- gsub("ows_", "", x)
x <- gsub(" ", ", ", x)
x
# [1] "data.frame(Req_Name1='John', ReqPriority='High', ReqDate='2012-10-10')"
# [2] "data.frame(Req_Name1='Jack', ReqPriority='Low', ReqDate='2012-11-10')" 
# [3] "data.frame(Req_Name1='John', ReqDate='2012-12-10')"                    

library(plyr)
do.call(rbind.fill, lapply(x, function(z)eval(parse(text = z))))
#   Req_Name1 ReqPriority    ReqDate
# 1      John        High 2012-10-10
# 2      Jack         Low 2012-11-10
# 3      John        <NA> 2012-12-10

It comes with the usual warning about eval/parse, though: the contents of the file are executed as R code, so only use this on input you trust.
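If evaluating file contents as code is a concern, the same result can probably be reached by parsing the key/value pairs directly. What follows is only a minimal base-R sketch, not part of the approach above. It assumes every field has the form ows_key='value' (inferred from the gsub steps and the printed output), and the `lines` vector and the `parse_line` helper are hypothetical stand-ins for illustration.

```r
# Hypothetical sample lines, standing in for readLines('file.txt');
# the ows_key='value' layout is an assumption about the file format.
lines <- c(
  "ows_Req_Name1='John' ows_ReqPriority='High' ows_ReqDate='2012-10-10'",
  "ows_Req_Name1='Jack' ows_ReqPriority='Low' ows_ReqDate='2012-11-10'",
  "ows_Req_Name1='John' ows_ReqDate='2012-12-10'"
)

# Hypothetical helper: extract every ows_key='value' pair on one line
# and return the values as a character vector named by key.
parse_line <- function(line) {
  pairs <- regmatches(line, gregexpr("ows_[^=]+='[^']*'", line))[[1]]
  keys <- sub("^ows_([^=]+)=.*$", "\\1", pairs)
  vals <- sub("^[^=]+='([^']*)'$", "\\1", pairs)
  setNames(vals, keys)
}

rows <- lapply(lines, parse_line)

# Union of all keys, in order of first appearance
cols <- unique(unlist(lapply(rows, names)))

# Indexing a named vector by a name it lacks yields NA,
# which is what fills the gaps for incomplete lines.
df <- as.data.frame(
  do.call(rbind, lapply(rows, function(r) unname(r[cols]))),
  stringsAsFactors = FALSE
)
names(df) <- cols
df
```

All columns come back as character here; convert ReqDate with as.Date() afterwards if you need real dates.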
