Reputation: 307
I have two dataframes:
df_1 <- data.frame(c("a_b", "a_c", "a_d"))
df_2 <- data.frame(matrix(ncol = 2))
And I would like to loop over df_1 in order to fill df_2:
library(stringr)  # str_split_fixed() comes from stringr
for (i in (1:(length(df_1[,1])))){
  for (j in (1:2)) {
    df_2[i*j,] <- str_split_fixed(df_1[i,1], "_", 2)
  }
}
I would like df_2 to look like:
col1 col2
a b
a b
a c
a c
a d
a d
But instead I get:
col1 col2
a b
a c
a d
a c
NA NA
a d
I must be doing something wrong, but I cannot figure out what. I would also like to use apply (or something like it), but I am pretty new to R and not yet comfortable with the apply family.
Thanks for your help!
Upvotes: 2
Views: 103
Reputation: 307
I tried to post a solution that I found right after posting the question, but it was misunderstood and deleted:
"Sometimes posting a question helps:
I was reading from the right position in df_1, but I was saving the result in the wrong row of df_2.
The answer to my original question should be something like this:
n <- 1
for (i in (1:(length(df_1[,1])))){
for (j in (1:2)) {
df_2[n,] <-str_split_fixed(df_1[i,1], "_", 2)
n <- n+1
}
}"
Upvotes: 0
Reputation: 887971
We can use cSplit with a data.table approach:
library(splitstackshape)
cSplit(df_1, 'col1', '_')[rep(seq_len(.N), each =2)]
# col1_1 col1_2
#1: a b
#2: a b
#3: a c
#4: a c
#5: a d
#6: a d
Another option is the tidyverse:
library(tidyverse)
separate(df_1, col1, into=c("col_1", "col_2")) %>%
map_df(~rep(., each = 2))
# A tibble: 6 × 2
# col_1 col_2
# <chr> <chr>
#1 a b
#2 a b
#3 a c
#4 a c
#5 a d
#6 a d
NOTE: Both answers are one-liners. Data used:
df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))
Upvotes: 3
Reputation: 389325
This would be a combination of two answers. With cSplit we split the column by _ and then repeat each row twice, assuming your column is named V1.
library(splitstackshape)
df_2 <- cSplit(df_1, "V1", "_")
df_2[rep(seq_len(nrow(df_2)),each = 2), ]
# V1_1 V1_2
#1: a b
#2: a b
#3: a c
#4: a c
#5: a d
#6: a d
Or, as @Sotos mentioned in the comments, we can use expandRows to accommodate everything in one line.
expandRows(cSplit(df_1, "V1", "_"), 2, count.is.col = FALSE)
# V1_1 V1_2
#1: a b
#2: a b
#3: a c
#4: a c
#5: a d
#6: a d
data
df_1 <- data.frame(V1 = c("a_b", "a_c", "a_d"))
Upvotes: 3
Reputation: 1376
OK, I only started learning R this week, but if you want the result you presented, you can use your own code with this fix:
for (i in (1:(length(df_1[,1])))){
for (j in (1:2)) {
df_2[(i-1)*2+j,] <- str_split_fixed(df_1[i,1], "_", 2)
}
}
I changed the row index of df_2.
I guess there is a better way than two for loops, but that is all I can do for the moment.
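To see why the original index i*j misbehaves while (i-1)*2+j works, it may help to list the row numbers each formula produces. This is only a small base-R sketch; the names idx, original and fixed are mine (hypothetical), not from the question.

```r
# Enumerate the (i, j) pairs in the same order as the nested loops:
# i is the outer loop (varies slowest), j the inner loop (varies fastest).
idx <- expand.grid(j = 1:2, i = 1:3)

original <- idx$i * idx$j            # row index used in the question
fixed    <- (idx$i - 1) * 2 + idx$j  # row index used in the fix above

print(original)  # 1 2 2 4 3 6: row 2 is written twice, row 5 is never written
print(fixed)     # 1 2 3 4 5 6: every row is written exactly once
```

The first sequence matches the output in the question: row 2 gets overwritten by "a c", and row 5 stays NA.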
Upvotes: 0
Reputation: 70653
Another way would be
df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))
df_2 <- as.data.frame(do.call(rbind, strsplit(as.character(df_1$col1), split = "_", fixed = TRUE)))
df_2[rep(1:nrow(df_2), each = 2), ]
V1 V2
1 a b
1.1 a b
2 a c
2.1 a c
3 a d
3.1 a d
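Since the question also asked about the apply family, here is a minimal sketch of the same idea with vapply, using base R only. The column names col1 and col2 and the helper variable parts are my own assumptions for illustration; treat it as one possible variant rather than the canonical way.

```r
df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))

# strsplit returns a list with one character vector per input string.
parts <- strsplit(as.character(df_1$col1), "_", fixed = TRUE)

# vapply visits each list element and extracts one piece;
# character(1) declares that each call must return a single string.
first  <- vapply(parts, function(p) p[1], character(1))
second <- vapply(parts, function(p) p[2], character(1))

# rep(..., each = 2) duplicates every value in place, giving 6 rows.
df_2 <- data.frame(col1 = rep(first, each = 2),
                   col2 = rep(second, each = 2),
                   stringsAsFactors = FALSE)
print(df_2)
```

Unlike indexing with rep(1:nrow(df_2), each = 2), this builds fresh row names 1 to 6 instead of 1, 1.1, 2, 2.1 and so on.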
Upvotes: 4