Chabo

Reputation: 3000

Dealing with non-structured data in R

PSA: I am not sure whether this is on topic or belongs on Meta.

New users often post their data exactly as it is printed in their console, e.g.

"Here is my data:"

> data
   Num Data
 1   1    A
 2   2    B
 3   3    C
 4   4    D
 5   5    E

As far as I know, this is often a pain, or even impossible, to reproduce. Is there an obvious way I am missing to convert such printed, unstructured output back into reproducible data (besides asking the user to dput their data)?

If not, I would like to consider creating a package to do so. Below is an unreliable, non-robust example of a function that could exist in such a package.

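Dump_to_DF <- function(dump) {
  # Split the console dump into lines, trim them and drop empty ones
  lines <- trimws(strsplit(dump, "\n", fixed = TRUE)[[1]])
  lines <- lines[nzchar(lines)]

  # Split every line into fields on runs of whitespace
  fields <- strsplit(lines, split = "\\s+", perl = TRUE)

  # The first line holds the column names; the first field of each
  # data line is the printed row name, so drop it
  rows <- lapply(fields[-1], function(x) x[-1])

  # Bind the rows into a data frame (all columns are read as text)
  Fin_Data <- do.call(rbind.data.frame, rows)
  names(Fin_Data) <- fields[[1]]

  return(Fin_Data)
}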

data<-"Num Data
 1   1    A
 2   2    B
 3   3    C
 4   4    D
 5   5    E"

Data<-Dump_to_DF(data)

> Data
  Num Data
1   1    A
2   2    B
3   3    C
4   4    D
5   5    E
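A parser built on string splitting, like the one above, typically yields text columns, so the numeric column would still need converting. A minimal sketch of one way to do that with base R's type.convert(), assuming an all-character data frame (the names `raw` and `typed` are only illustrative):

```r
# Illustrative all-character data frame, like one a string-splitting parser returns
raw <- data.frame(Num = c("1", "2", "3"),
                  Data = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# type.convert() guesses a type for each column;
# as.is = TRUE keeps non-numeric columns as character instead of factor
typed <- type.convert(raw, as.is = TRUE)

# Inspect the resulting column types
str(typed)
```

Whether the guessed types match what the original poster had is not guaranteed; a printed data frame does not distinguish factors from strings, for instance.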

Is there anything that exists that already does something similar to what this function does?

For anyone wondering about my motive: I hate waiting and would rather be able to quickly edit a new question to include reproducible data, so that everyone can get to work on the answer faster. Eventually getting an SO bot to suggest edits using something like this would also be neat.

Upvotes: 1

Views: 84

Answers (1)

morgan121

Reputation: 2253

One way to read data like the sample you have given is:

data <- read.table(text="Num Data
         1   1    A
         2   2    B
         3   3    C
         4   4    D
         5   5    E")

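Here the heading line has one fewer field than the data lines, so read.table treats it as a header and uses the first column as row names automatically. If your headings are not picked up this way, add header=TRUE to the call.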
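To get from there to something that can be pasted into a question, the parsed data frame can be passed to dput(). A minimal sketch, assuming the console printout has been copied into a string (the names `printed` and `df` are only illustrative):

```r
# Console printout copied from a question (illustrative input)
printed <- "  Num Data
1   1    A
2   2    B
3   3    C
4   4    D
5   5    E"

# The header line has one fewer field than the data lines, so read.table()
# treats it as a header and uses the first column as row names
df <- read.table(text = printed, stringsAsFactors = FALSE)

# Print an R expression that recreates df when pasted into a session
dput(df)
```

This only works when the printout is regular; values containing spaces or wrapped wide data frames would need extra handling.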

Upvotes: 1
