Reputation: 351
I have 22 CSV files (850*2). I loaded them into R with this code:
setwd("D:/baseline")
file_2day <- list.files(pattern = "\\.csv$", ignore.case = TRUE)
d_2day <- do.call("rbind", sapply(file_2day, read.csv, simplify = FALSE))
The files follow a naming pattern such as T1_W1_base.CSV, T1_W10_base.CSV, etc. Below is a sample of my data:
feature.name value
w1.1 3ddim 100
w1.2 2ddim 80
w1.3 mean 5
w10.1 3ddim 90
w10.2 2ddim 70
w10.3 mean 3
I'd like to arrange my data like this:
Feature.name 3ddim 2ddim mean
w1 100 80 5
w10 90 70 3
In reality I have 850 features. Does anyone have a suggestion for achieving this format?
Upvotes: 0
Views: 114
Reputation: 11975
The sample data in your post currently shows duplicate values in the row names, which R doesn't allow. When I looked back through the earlier versions of your post, though, I saw that your real data has distinct row names, so this won't be an issue.
Assumption: given this, I have modified the sample data below accordingly (based on the earlier sample data you posted as an image).
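library(dplyr)
library(tidyr)
library(tibble)
df %>%
  # move the row names (e.g. "w1.1") into a regular column
  rownames_to_column("rowname_col") %>%
  # keep only the part before the dot, e.g. "w1.1" -> "w1"
  mutate(rowname_col = gsub("(\\S+)[.].*", "\\1", rowname_col)) %>%
  # one column per feature, filled with its value
  spread(feature_name, value) %>%
  rename(feature_name = rowname_col)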
Output is:
feature_name 2ddim 3ddim mean
1 w1 80 100 5
2 w10 70 90 3
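In recent versions of tidyr (1.0.0 and later), `spread()` is superseded by `pivot_wider()`. Here is a minimal sketch of the same reshaping with `pivot_wider()`, using the sample data from this answer. The column name `id` and the object name `wide` are just illustrative choices, and the `sub()` call assumes each row name contains a single dot.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same sample data as in this answer, rebuilt so the snippet runs on its own
df <- data.frame(
  feature_name = c("3ddim", "2ddim", "mean", "3ddim", "2ddim", "mean"),
  value = c(100L, 80L, 5L, 90L, 70L, 3L),
  row.names = c("w1.1", "w1.2", "w1.3", "w10.10", "w10.20", "w10.30"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # move the row names into a column called "id" (illustrative name)
  rownames_to_column("id") %>%
  # drop everything from the first dot onward: "w10.10" -> "w10"
  mutate(id = sub("[.].*$", "", id)) %>%
  # one column per distinct feature_name, filled from value
  pivot_wider(names_from = feature_name, values_from = value)

print(wide)
```

The result has one row each for `w1` and `w10`. Because the new column names start with a digit, refer to them with backticks or double brackets, for example `wide[["3ddim"]]`.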
Sample data:
df <- structure(list(feature_name = c("3ddim", "2ddim", "mean", "3ddim",
"2ddim", "mean"), value = c(100L, 80L, 5L, 90L, 70L, 3L)), .Names = c("feature_name",
"value"), class = "data.frame", row.names = c("w1.1", "w1.2",
"w1.3", "w10.10", "w10.20", "w10.30"))
feature_name value
w1.1 3ddim 100
w1.2 2ddim 80
w1.3 mean 5
w10.10 3ddim 90
w10.20 2ddim 70
w10.30 mean 3
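If relying on row names feels fragile, another option is to build the identifier from the file name while reading. The following is only a sketch: it assumes the files are named like T1_W1_base.CSV and have the columns `feature.name` and `value`, and it writes two small demo files to a temporary folder so it can run anywhere. `demo_dir`, `read_one` and `id` are hypothetical names, and the reshaping uses base R's `reshape()` instead of tidyr.

```r
# Hypothetical demo files so the sketch is runnable; replace demo_dir with your folder
demo_dir <- file.path(tempdir(), "baseline_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(feature.name = c("3ddim", "2ddim", "mean"), value = c(100, 80, 5)),
          file.path(demo_dir, "T1_W1_base.CSV"), row.names = FALSE)
write.csv(data.frame(feature.name = c("3ddim", "2ddim", "mean"), value = c(90, 70, 3)),
          file.path(demo_dir, "T1_W10_base.CSV"), row.names = FALSE)

# pattern is a regular expression; ignore.case covers both .csv and .CSV
files <- list.files(demo_dir, pattern = "\\.csv$", ignore.case = TRUE, full.names = TRUE)

# Read one file and tag every row with the W<n> part of its file name
read_one <- function(path) {
  d <- read.csv(path, stringsAsFactors = FALSE)
  # assumed naming scheme T<k>_W<n>_base.CSV: "T1_W10_base.CSV" -> "w10"
  d$id <- tolower(sub("^T[0-9]+_(W[0-9]+)_.*$", "\\1", basename(path)))
  d
}

long <- do.call(rbind, lapply(files, read_one))

# Long to wide: one row per id, one "value.<feature>" column per feature
wide <- reshape(long, idvar = "id", timevar = "feature.name", direction = "wide")

# Drop the "value." prefix that reshape() adds to the new columns
names(wide) <- sub("^value\\.", "", names(wide))
print(wide)
```

With an explicit `id` column the result does not depend on how `rbind` generates row names, and the same code should scale to all 22 files as long as they follow the assumed naming scheme.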
Upvotes: 1