Rivka

Reputation: 307

for loop acting weird

I have two dataframes:

df_1 <- data.frame(c("a_b", "a_c", "a_d"))
df_2 <- data.frame(matrix(ncol = 2))

And I would like to loop over df_1 in order to fill df_2:

for (i in (1:(length(df_1[,1])))){
  for (j in (1:2)) {
   df_2[i*j,] <-str_split_fixed(df_1[i,1], "_", 2)
  }
}

I would like df_2 to look like:

col1  col2
a     b
a     b
a     c
a     c
a     d
a     d

But instead I get:

col1  col2
a     b
a     c
a     d
a     c
NA    NA
a     d

I must be doing something wrong, but I cannot figure out what. I would also like to use apply (or something like it), but I am pretty new to R and not yet comfortable with the apply family.

Thanks for your help!

Upvotes: 2

Views: 103

Answers (5)

Rivka

Reputation: 307

I tried to post a solution I found right after posting the question, but it was misunderstood and deleted:

"sometimes posting a question helps:

I was asking for the right position in df_1, but I was saving the result in the wrong row of df_2.

The answer to my original question should be something like this:

n <- 1
for (i in (1:(length(df_1[,1])))){
  for (j in (1:2)) {
   df_2[n,] <-str_split_fixed(df_1[i,1], "_", 2)
   n <- n+1 
  }
}"
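
A small base-R sketch of the indexing may make the difference visible. It only records which row of df_2 each (i, j) pair would write to; the vector names idx_product and idx_counter are made up for illustration:

```r
# Illustration only: compare the target rows produced by i * j
# with those produced by a running counter n.
idx_product <- integer(0)
idx_counter <- integer(0)
n <- 1L
for (i in 1:3) {
  for (j in 1:2) {
    idx_product <- c(idx_product, i * j)  # index used in the question
    idx_counter <- c(idx_counter, n)      # index used in the fix
    n <- n + 1L
  }
}
idx_product  # 1 2 2 4 3 6: row 2 is written twice, row 5 never
idx_counter  # 1 2 3 4 5 6: every row is written exactly once
```

With i * j, the second write to row 2 replaces "a b" with "a c" and row 5 stays NA, which matches the unexpected output in the question.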

Upvotes: 0

akrun

Reputation: 887971

We can use cSplit from splitstackshape, which returns a data.table, and then repeat each row twice:

library(splitstackshape)
cSplit(df_1, 'col1', '_')[rep(seq_len(.N), each =2)]
#   col1_1 col1_2
#1:      a      b
#2:      a      b
#3:      a      c
#4:      a      c
#5:      a      d
#6:      a      d

Another option is the tidyverse:

library(tidyverse)
separate(df_1, col1, into = c("col_1", "col_2")) %>%
  map_df(~rep(., each = 2))
# A tibble: 6 × 2
#   col_1 col_2
#  <chr> <chr>
#1     a     b
#2     a     b
#3     a     c
#4     a     c
#5     a     d
#6     a     d

NOTE: Both approaches are one-liners.

data

df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))

Upvotes: 3

Ronak Shah

Reputation: 389325

This combines two of the other answers. With cSplit we split the column on _ and then repeat each row twice. This assumes your column is named V1.

library(splitstackshape)
df_2 <- cSplit(df_1, "V1", "_")
df_2[rep(seq_len(nrow(df_2)),each =  2), ]

#   V1_1 V1_2
#1:    a    b
#2:    a    b
#3:    a    c
#4:    a    c
#5:    a    d
#6:    a    d

Or, as @Sotos mentioned in the comments, we can use expandRows to fit everything into one line.

expandRows(cSplit(df_1, "V1", "_"), 2, count.is.col = FALSE)

#   V1_1 V1_2
#1:    a    b
#2:    a    b
#3:    a    c
#4:    a    c
#5:    a    d
#6:    a    d

data

df_1 <- data.frame(V1 = c("a_b", "a_c", "a_d"))

Upvotes: 3

franiis

Reputation: 1376

OK, I started learning R this week, but if you want the result you presented, you can use your own code with this fix:

for (i in (1:(length(df_1[,1])))){
    for (j in (1:2)) {
        df_2[(i-1)*2+j,] <- str_split_fixed(df_1[i,1], "_", 2)
    }
}

I changed the row index of df_2 from i*j to (i-1)*2+j, so each pair (i, j) writes to its own row.

I guess there is a better way than two for loops, but that is all I can do for the moment.
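
For a loop-free variant closer to the apply family mentioned in the question, here is a minimal sketch in base R. It assumes the column is named col1 (the question leaves it unnamed), and the helper names repeated and parts are chosen for illustration:

```r
# Sketch only: base R, no extra packages. The column name col1 is assumed.
df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"), stringsAsFactors = FALSE)

# Repeat every input string twice: a_b a_b a_c a_c a_d a_d
repeated <- rep(df_1$col1, each = 2)

# Split each string on the literal "_"; vapply checks that every
# split yields exactly two pieces and returns a 2 x 6 matrix.
parts <- vapply(
  repeated,
  function(x) strsplit(x, "_", fixed = TRUE)[[1]],
  character(2),
  USE.NAMES = FALSE
)

# Transpose so that each input string becomes one row.
df_2 <- as.data.frame(t(parts), stringsAsFactors = FALSE)
names(df_2) <- c("col1", "col2")
df_2
```

vapply is used rather than sapply so that a string without exactly one "_" raises an error instead of silently changing the shape of the result.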

Upvotes: 0

Roman Luštrik

Reputation: 70653

Another way, using only base R, would be:

df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))

df_2 <- as.data.frame(do.call(rbind, strsplit(as.character(df_1$col1), split = "_", fixed = TRUE)))
df_2[rep(1:nrow(df_2), each = 2), ]

    V1 V2
1    a  b
1.1  a  b
2    a  c
2.1  a  c
3    a  d
3.1  a  d
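
The row names 1, 1.1, 2, 2.1, ... appear because the same row index is selected twice. If plain sequential row names are preferred, a minimal sketch is to reset them afterwards; the name df_rep is hypothetical:

```r
# Sketch only: same split as above, then reset the duplicated row names.
df_1 <- data.frame(col1 = c("a_b", "a_c", "a_d"))

df_2 <- as.data.frame(
  do.call(rbind, strsplit(as.character(df_1$col1), split = "_", fixed = TRUE))
)

# Repeat each row twice; row names become 1, 1.1, 2, 2.1, 3, 3.1.
df_rep <- df_2[rep(seq_len(nrow(df_2)), each = 2), ]

# Assigning NULL restores the default row names 1..6.
rownames(df_rep) <- NULL
df_rep
```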

Upvotes: 4
