Reputation: 247
I have multiple CSV files. They share some columns, but each file also has columns the others lack. For example:
#1st.csv
col1,col2
1,2
#2nd.csv
col1,col3,col4
1,2,3
#3rd.csv
col1,col2,col3,col5
1,2,3,4
I want to stack these files row-wise, matching on the shared column names. The result should contain every column that appears in any file, with NA in the cells for rows whose source file lacks that column.
So I expect to see:
col1,col2,col3,col4,col5
1,2,NA,NA,NA #this is 1st.csv
1,NA,2,3,NA #this is 2nd.csv
1,2,3,NA,4 #this is 3rd.csv
Here is the R code I tried (smartbind is from the gtools package), but it returns an error:
> Combine_data <- smartbind(1st,2nd,3rd)
Error in `[<-.data.frame`(`*tmp*`, , value = list(ID = c(1001, 1001, :
replacement element 1 has 143460 rows, need 143462
Does anyone know an alternative or more elegant way to get the expected result?
The R version is 3.3.2.
Upvotes: 0
Views: 86
Reputation: 1743
You should be able to accomplish this with the bind_rows function from dplyr, which matches columns by name and fills columns missing from a data frame with NA:
# Build the three example data frames from inline CSV text
df1 <- read.csv(text = "col1,col2
1,2", header = TRUE)
df2 <- read.csv(text = "col1,col3,col4
1,2,3", header = TRUE)
df3 <- read.csv(text = "col1,col2,col3,col5
1,2,3,4", header = TRUE)
library(dplyr)
# Stack the rows; columns are matched by name and gaps become NA
res <- bind_rows(df1, df2, df3)
> res
  col1 col2 col3 col4 col5
1    1    2   NA   NA   NA
2    1   NA    2    3   NA
3    1    2    3   NA    4
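If the data lives in actual CSV files rather than inline text, the same idea might look like the sketch below. It is only an illustration: the temporary directory, the file contents, and the "source" column name are assumptions made up for the example, and the real files would be read from wherever they are stored.

```r
library(dplyr)

# Write three small example files to a temporary directory
# (hypothetical stand-ins for the real CSV files).
tmp <- tempfile("csv_demo")
dir.create(tmp)
writeLines(c("col1,col2", "1,2"), file.path(tmp, "1st.csv"))
writeLines(c("col1,col3,col4", "1,2,3"), file.path(tmp, "2nd.csv"))
writeLines(c("col1,col2,col3,col5", "1,2,3,4"), file.path(tmp, "3rd.csv"))

# Collect the file paths and read each file into its own data frame.
paths <- sort(list.files(tmp, pattern = "\\.csv$", full.names = TRUE))
frames <- lapply(paths, read.csv, stringsAsFactors = FALSE)
names(frames) <- basename(paths)

# Stack all data frames; .id adds a column recording which file each row came from.
combined <- bind_rows(frames, .id = "source")
print(combined)
```

Passing a named list to bind_rows avoids typing each data frame by hand, and the .id column keeps track of the originating file, which replaces the "#this is 1st.csv" notes in the expected output.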
Upvotes: 2