andrewj
andrewj

Reputation: 3035

converting multiple lines of text into a data frame

I'm trying to find a way to convert multiple lines of text into a data frame. I'm not sure if there's a way where you can use read.delim() to read in multiple lines of text and create the following data frame with something akin to rehape()?.

The data is structured as follows:

A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35

I'd like to convert this data to something that looks like the following data frame:

A             B             C
1             2             10
34            20            6.7
2             78            35

Apologies if there is an obvious way to do this!

Upvotes: 6

Views: 7836

Answers (4)

andrewj
andrewj

Reputation: 3035

I posted this question on R-help as well, and got a response from Phil Spector suggesting unstack.

This is a modification of Leo Alekseyev's response

my.data <- "A: 1
            B: 2
            C: 10
            A: 34
            B: 20
            C: 6.7
            A: 2
            B: 78
            C: 35"   
df <- read.delim(textConnection(my.data),header=FALSE,sep=":",strip.white=TRUE)
unstack(df, V2 ~ V1)

This results in:

   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0

Some advantages of this approach compared to the other thoughtful answers is that you don't need to specify the number of columns ahead of time. It also doesn't require any additional packages.

Upvotes: 2

unutbu
unutbu

Reputation: 879113

How about :

s<-"A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
"
d<-read.delim(textConnection(s),header=FALSE,sep=":",strip.white=TRUE)
cols<-levels(d[,'V1'])
d<-data.frame(sapply(cols,function(x) {d['V2'][d['V1']==x]}, USE.NAMES=TRUE))

which yields:

   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0

Upvotes: 12

Leo Alekseyev
Leo Alekseyev

Reputation: 13483

Here is how to do it with the plyr package:

require("plyr")
my.data <- "A: 1
            B: 2
            C: 10
            A: 34
            B: 20
            C: 6.7
            A: 2
            B: 78
            C: 35"   
df <- read.delim(textConnection(my.data),header=FALSE,sep=":",strip.white=TRUE)

as.data.frame(dlply(df,.(V1),function(x) x[[2]]))

You get

   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0

You can see what magic plyr is doing just by playing with dlply(df,.(V1)) or dlply(df,.(V1),function(x) x)

Upvotes: 4

Jyotirmoy Bhattacharya
Jyotirmoy Bhattacharya

Reputation: 9587

Here is one solution using reshape

s<-"A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
"
d<-d<-read.delim(textConnection(s),header=FALSE,sep=":",strip.white=TRUE)
N<-nrow(d)%/%3
d$id<-rep(1:N,each=3)
reshape(d,dir="wide",timevar="V1",idvar="id")

Which produces

  id V2.A V2.B V2.C
1  1    1    2 10.0
4  2   34   20  6.7
7  3    2   78 35.0

Upvotes: 0

Related Questions