Reputation: 3035
I'm trying to find a way to convert multiple lines of text into a
data frame. Is there a way to use read.delim() to read in multiple
lines of text and then build the data frame shown below, perhaps
with something akin to reshape()?
The data is structured as follows:
A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
I'd like to convert this data to something that looks like the following data frame:
A B C
1 2 10
34 20 6.7
2 78 35
Apologies if there is an obvious way to do this!
Upvotes: 6
Views: 7836
Reputation: 3035
I posted this question on R-help as well, and got a response from Phil Spector suggesting unstack.
This is a modification of Leo Alekseyev's response:
my.data <- "A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35"
df <- read.delim(textConnection(my.data),header=FALSE,sep=":",strip.white=TRUE)
unstack(df, V2 ~ V1)
This results in:
   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0
Two advantages of this approach over the other thoughtful answers: you don't need to specify the number of columns ahead of time, and it doesn't require any additional packages.
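To illustrate the point about not fixing the number of columns, here is a minimal sketch that runs the same unstack call on a made-up input with four keys instead of three. The variable names (four.keys, long, wide) and the extra D values are assumptions for illustration only, and text= is used in place of textConnection() purely for brevity.

```r
# Hypothetical input: same "key: value" layout, but with a fourth key D
four.keys <- "A: 1
B: 2
C: 10
D: 5
A: 34
B: 20
C: 6.7
D: 8"

# Read into two columns: V1 holds the key, V2 holds the value
long <- read.delim(text = four.keys, header = FALSE, sep = ":", strip.white = TRUE)

# unstack() groups V2 by V1; the set of columns comes from the data itself
wide <- unstack(long, V2 ~ V1)
print(wide)
```

Note that unstack only returns a data frame when every key occurs the same number of times; with unequal counts it returns a list instead.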
Upvotes: 2
Reputation: 879113
How about:
s<-"A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
"
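d<-read.delim(textConnection(s),header=FALSE,sep=":",strip.white=TRUE)
# unique() works whether V1 is read as a factor or as character
# (levels() returns NULL for character columns, the default since R 4.0)
cols<-sort(unique(as.character(d[,'V1'])))
d<-data.frame(sapply(cols,function(x) {d['V2'][d['V1']==x]}, USE.NAMES=TRUE))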
which yields:
   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0
Upvotes: 12
Reputation: 13483
Here is how to do it with the plyr package:
require("plyr")
my.data <- "A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35"
df <- read.delim(textConnection(my.data),header=FALSE,sep=":",strip.white=TRUE)
as.data.frame(dlply(df,.(V1),function(x) x[[2]]))
You get
   A  B    C
1  1  2 10.0
2 34 20  6.7
3  2 78 35.0
You can see what magic plyr is doing just by playing with dlply(df,.(V1)) or dlply(df,.(V1),function(x) x).
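For readers without plyr installed, the grouping step can be approximated in base R with split(). This is only a sketch of the same idea (split the values by key, then bind the pieces as columns), not a description of how dlply is implemented; the name res is an assumption for illustration.

```r
my.data <- "A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35"

# Read into two columns: V1 holds the key, V2 holds the value
df <- read.delim(text = my.data, header = FALSE, sep = ":", strip.white = TRUE)

# split() returns a named list with one numeric vector per key
pieces <- split(df$V2, df$V1)

# Each list element becomes a column, as in the dlply version
res <- as.data.frame(pieces)
print(res)
```

Inspecting pieces on its own shows the intermediate list, much like calling dlply(df,.(V1),function(x) x[[2]]) without the final as.data.frame.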
Upvotes: 4
Reputation: 9587
Here is one solution using reshape
s<-"A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
"
d<-read.delim(textConnection(s),header=FALSE,sep=":",strip.white=TRUE)
N<-nrow(d)%/%3   # number of records; each record has 3 keys (A, B, C)
d$id<-rep(1:N,each=3)
reshape(d,direction="wide",timevar="V1",idvar="id")
Which produces
  id V2.A V2.B V2.C
1  1    1    2 10.0
4  2   34   20  6.7
7  3    2   78 35.0
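The version above hardcodes three keys per record. A possible variation, sketched here under the assumption that every key appears once per record, derives the record id from the position of each value within its key using ave(), and strips the "V2." prefix from the resulting column names. The name w is hypothetical.

```r
s <- "A: 1
B: 2
C: 10
A: 34
B: 20
C: 6.7
A: 2
B: 78
C: 35
"

d <- read.delim(text = s, header = FALSE, sep = ":", strip.white = TRUE)

# Number each occurrence within its key: the 1st A, 1st B, 1st C get id 1, etc.
d$id <- ave(seq_along(d$V1), d$V1, FUN = seq_along)

# Spread the keys into columns, one row per id
w <- reshape(d, direction = "wide", timevar = "V1", idvar = "id")

# Drop the "V2." prefix that reshape() adds to the value columns
names(w) <- sub("^V2\\.", "", names(w))
print(w)
```

If some records are missing a key, the ids computed this way no longer line up with records, so this sketch only suits regular input like the example.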
Upvotes: 0