Sven

Reputation: 301

Creating a data frame with a for loop and if else statement

I have been searching for a solution on here and nothing seems to work for me, so I apologize in advance if this happens to be a duplicate question. If it is, and the example used is also the same, please let me know where I should be looking. This was not an easy one to title. Thank you in advance for any time you spend on this.

Problem: I have a data set that is in 1 column and has a few hundred thousand rows. I need to arrange it into 178 columns and however many rows that requires.

The order in which the data is provided is the order it must stay.

For example, the first 178 rows of the data set all need to become row 1. The next 178 rows of the data then need to become row 2. This continues until the end of the data frame.

The below code will create a sample with letters that contains 1000 rows. It also contains what I have tried thus far.

I feel like I am close, but the results are not what I am expecting. It gets a little weird when the data turns to row 3 in the new row column. It also gets weird when it gets past its first 178 columns. It repeats column 1 two times in a row.

Any assistance would be great. If further clarification is needed, please just ask. However, when you run the code and look around where row 1 turns to row 2 and where row 2 turns to row 3, you should see the strange results.

Edit 1: I do need to add that this will need to be in a long format and not wide. Essentially, it needs to be in the format that the example provides. I do apologize for the misstatement in the title. I have changed the title to remove the word matrix.

What I have tried and sample data:

rna = data.frame(sample(letters, size=1000, replace=TRUE))

x = 1
row = 1
y = 0
column = 1

for (i in 1:nrow(rna)) {

  if (x < 178) {
    rna$rowNum[i] = row
    x = x + 1
    } else {
      row = row + 1
      x = 1
    }

  if (y < 178) {
    rna$colNum[i] = column
    column = column + 1
    y = y + 1
  } else {
    column = 1
    y = 0
  }
}
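
One possible reading of the symptoms described above, offered as a sketch rather than a definitive diagnosis: each `else` branch resets its counter but never assigns `rowNum[i]` or `colNum[i]` for that element, so the element keeps whatever value the column already held, and the two counters (one starting at 1, the other at 0) fall out of step. A minimal version of the same loop that assigns on every iteration and uses a single counter might look like this. The column name `value`, the variable `width`, and the fixed seed are assumptions added for illustration.

```r
# Sample data shaped like the question's: one column, 1000 rows.
set.seed(1)  # assumption: fixed seed only so the sample is reproducible
rna <- data.frame(value = sample(letters, size = 1000, replace = TRUE))

width <- 178  # number of values that make up one output row

# Create the index columns up front so no element is left unassigned.
rna$rowNum <- NA_integer_
rna$colNum <- NA_integer_

row <- 1L
column <- 1L
for (i in seq_len(nrow(rna))) {
  # Always record the current position first ...
  rna$rowNum[i] <- row
  rna$colNum[i] <- column
  # ... then advance: wrap to the next row after the last column.
  if (column == width) {
    row <- row + 1L
    column <- 1L
  } else {
    column <- column + 1L
  }
}
```

With this ordering, element 178 is the last column of row 1 and element 179 is the first column of row 2.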

Upvotes: 0

Views: 86

Answers (2)

Ian Campbell

Reputation: 24770

Edit: OK, now that you have clarified your question, I think you want a three-column data.frame with value, rowNum, and colNum. One approach is to use rep and then assemble the data.frame from component vectors.

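rna = data.frame(sample(letters, size=1000, replace=TRUE))
# Number of output rows needed to hold every value, 178 per row
n.rows <- ceiling(nrow(rna) / 178)
# Column index cycles 1..178; row index repeats each row number 178 times
col.vector <- rep(1:178, times = n.rows)
row.vector <- rep(seq(1, n.rows), each = 178)
# rna[[1]] extracts the single column as a plain vector;
# both index vectors are trimmed to the length of the data
result <- data.frame(value = rna[[1]], rowNum = row.vector[1:nrow(rna)], colNum = col.vector[1:nrow(rna)])
head(result, 10)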
    value rowNum colNum
1       h      1      1
2       w      1      2
3       v      1      3
4       g      1      4
5       l      1      5
6       y      1      6
7       t      1      7
8       n      1      8
9       q      1      9
10      d      1     10
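
As an alternative sketch of the same idea, the two index columns can also be computed arithmetically from each value's position, using integer division for the row and the remainder for the column. The column name `value`, the variable `width`, and the fixed seed are assumptions for illustration only.

```r
# Sample data shaped like the question's: one column, 1000 rows.
set.seed(1)  # assumption: fixed seed only so the sample is reproducible
rna <- data.frame(value = sample(letters, size = 1000, replace = TRUE))

width <- 178                   # values per output row
pos <- seq_len(nrow(rna)) - 1  # zero-based position of each value

result <- data.frame(
  value  = rna$value,
  rowNum = pos %/% width + 1,  # integer division gives the row
  colNum = pos %% width + 1    # remainder gives the column within the row
)
```

This avoids building index vectors longer than the data and then trimming them, and the original order of the values is left untouched.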

Upvotes: 2

Ronak Shah

Reputation: 388807

We can group the values into blocks of 178 and convert the data into wide format.

library(dplyr)

rna %>%
 group_by(grp = rep(seq_along(temp), each  = 178, length.out = n())) %>%
 mutate(col = paste0('col', row_number())) %>%
 tidyr::pivot_wider(names_from = col, values_from = temp)

data

rna = data.frame(temp = sample(letters, size=1000, replace=TRUE))
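
If a dependency-free version is preferred, a similar wide layout can be sketched in base R by filling a matrix row by row. This is only an illustration under stated assumptions: the last row is padded with `NA` because 1000 is not a multiple of 178, and the names `width`, `n_rows`, `padded`, and `wide`, as well as the fixed seed, are hypothetical.

```r
# Same sample data as above, with the column named temp.
set.seed(1)  # assumption: fixed seed only so the sample is reproducible
rna <- data.frame(temp = sample(letters, size = 1000, replace = TRUE))

width <- 178
n_rows <- ceiling(nrow(rna) / width)

# Pad with NA so the length is an exact multiple of the row width.
padded <- c(rna$temp, rep(NA, n_rows * width - nrow(rna)))

# byrow = TRUE fills row 1 with values 1..178, row 2 with 179..356, etc.
wide <- as.data.frame(matrix(padded, ncol = width, byrow = TRUE))
names(wide) <- paste0("col", seq_len(width))
```

Here `wide` has one row per block of 178 values, in the original order, with trailing `NA` in the final row.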

Upvotes: 1
