Reputation: 325
I have scraped a large table from a web page using the rvest package, but it is reading it as a single vector:
foo<-c("A","B","C","Dog","1","2","3","Cat","4","5","6","Goat","7","8","9")
that I need to deal with as a dataframe that looks like this:
bar<-as.data.frame(cbind(Animal=c("Dog","Cat","Goat"),A=c(1,4,7),B=c(2,5,8),C=c(3,6,9)))
This might be a simple dilemma but I'd appreciate the help.
Upvotes: 1
Views: 150
Reputation: 2549
Here is a convenient tool to work with list,
seqList <-
function(character,by= 1,res=list()){
### sequence characters by
if (length(character)==0){
res
} else{
seqList(character[-c(1:by)],by=by,res=c(res,list(character[1:by])))
}
}
Once you convert your characters into lists it's easier to manipulate them for instance you can do.
options(stringsAsFactors=FALSE)
foo <-c("A","B","C","Dog","1","2","3","Cat","4","5","6","Goat","7","8","9")
foo <- c("Animal",foo)
df <- data.frame(t(do.call("rbind",
lapply(1:4,function(x) do.call("cbind",lapply(seqList(foo,4),"[[",x))))))
colnames(df) <- df[1,]
df <- df[-1,]
## > df
## Animal A B C
## 2 Dog 1 2 3
## 3 Cat 4 5 6
## 4 Goat 7 8 9
Note: I haven't tested the efficiency of the function. It might not be very efficient for large amount of characters. The use of matrices might a better tool for this job.
Upvotes: 0
Reputation: 99351
If you want the proper column types, you can try this. Split into a list, name the list, then convert the column types before coercing to data frame.
l <- setNames(split(tail(foo, -3), rep(1:4, 3)), c("Animal", foo[1:3]))
as.data.frame(lapply(l, type.convert)) ## stringsAsFactors=FALSE if desired
# Animal A B C
# 1 Dog 1 2 3
# 2 Cat 4 5 6
# 3 Goat 7 8 9
Upvotes: 1
Reputation: 32548
Just split
it into required number of rows and rbind
it. I added "Animal"
at the start of foo
to make the elements equal in each row when splitting
foo = c("Animal", foo)
df = data.frame(do.call(rbind, split(foo, ceiling(seq_along(foo)/4))),
stringsAsFactors = FALSE)
colnames(df) = df[1,]
df = df[-1,]
df
# Animal A B C
#2 Dog 1 2 3
#3 Cat 4 5 6
#4 Goat 7 8 9
Upvotes: 2
Reputation: 2939
you can create a matrix from your vector and turn it into a data frame:
foo<-c("A","B","C","Dog","1","2","3","Cat","4","5","6","Goat","7","8","9")
foo <- c("Animal" , foo)
m <- matrix(foo , ncol = 4 , byrow = TRUE)
df <- as.data.frame(m[-1,] , stringsAsFactors = FALSE)
colnames(df) <- m[1,]
# I assume you want numerics for your A,B,C columns:
df[,2:4]<-apply(df[,2:4],2,as.numeric)
lapply(df,class)
$Animal
[1] "character"
$A
[1] "numeric"
$B
[1] "numeric"
$C
[1] "numeric"
Upvotes: 5