Reputation:
I have a dataframe DF1 consisting of 168 file names:
DF1$FileName <- c("File1.csv", "File2.csv",..... "File168.csv")
Using:
filez <- NULL
for (i in 1:168){
  filez[i] <- paste0("file", i, ".csv")  # paste0() has no sep argument
}
filesz <- as.data.frame(filez)
I have another dataframe DF2 as follows:
DF2 <- data.frame(RowNumber = rep(1:512000, times = 168))
This means DF2 has a column "RowNumber" in which the numbers 1 through 512000 are repeated 168 times (i.e. 86016000 rows in total).
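To make the layout concrete, here is a scaled-down sketch of the same construction, assuming 3 rows per file and 2 files instead of 512000 and 168. The names n_rows, n_files and small are only for illustration.

```r
# Scaled-down stand-ins for 512000 rows per file and 168 files
n_rows <- 3
n_files <- 2

# times = n_files repeats the whole sequence 1..n_rows once per file
small <- data.frame(RowNumber = rep(seq_len(n_rows), times = n_files))

print(small$RowNumber)
# [1] 1 2 3 1 2 3
```

The full-size column follows the same pattern, only with 512000 * 168 = 86016000 rows.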
What I want to do is:
Select a file name (one at a time) -> DF1$FileName[i]
Paste it into DF2$FileName for rows 1 to 512000
Repeat the above until all 86016000 rows have been filled in
The end result should look like:
DF2
RowNumber FileName
1 File1.txt
2 File1.txt
3 File1.txt
. .
. .
. .
. .
512000 File1.txt
1 File2.txt
2 File2.txt
3 File2.txt
. .
. .
512000 File2.txt
1 File3.txt
2 File3.txt
3 File3.txt
. .
. .
512000 File3.txt
. .
. .
512000 File167.txt
1 File168.txt
2 File168.txt
3 File168.txt
. .
. .
512000 File168.txt
I tried this, but I know there is a logical mistake in it that causes my system to hang:
for (i in 1:nrow(m)){
while(m$RowNumber[i] != 512000) {m$FileName[i] <- filez[[i]]}
}
Can someone please suggest a better and easier way to resolve my issue?
I am sure R has some package to perform such operations, but I don't know which one.
Upvotes: 2
Views: 398
Reputation: 83275
There is no need for a for loop in this case. You can use functions designed specifically for this, like:
1) expand.grid from base R:
filenames <- paste0("file", 1:168, ".csv")
rownumbers <- 1:512000
d <- expand.grid(rownumbers = rownumbers, filenames = filenames)
which gives:
> head(d)
rownumbers filenames
1 1 file1.csv
2 2 file1.csv
3 3 file1.csv
4 4 file1.csv
5 5 file1.csv
6 6 file1.csv
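To check the ordering without building all 86016000 rows, the same call can be tried on a small input. This is only a sketch: filenames_small, rownumbers_small and d_small are illustrative names, and stringsAsFactors = FALSE is an optional choice (by default expand.grid turns character inputs into factors).

```r
# Scaled-down stand-ins for the 168 file names and 512000 row numbers
filenames_small <- paste0("file", 1:3, ".csv")
rownumbers_small <- 1:4

# stringsAsFactors = FALSE keeps the file names as character instead of factor
d_small <- expand.grid(rownumbers = rownumbers_small,
                       filenames = filenames_small,
                       stringsAsFactors = FALSE)

# The first argument varies fastest, so each file name
# is paired with a complete run of row numbers before the next one starts
print(d_small)
```

Because the first argument varies fastest, passing rownumbers first is what produces the layout asked for in the question.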
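2) The CJ (cross join) function from the data.table package:
library(data.table)
d <- CJ(filenames = filenames, rownumbers = rownumbers, sorted = FALSE)
which gives the same rows in the same order, with filenames as the first column. In CJ the first argument varies slowest and the result is sorted by default, hence the swapped arguments and sorted = FALSE.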
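3) The crossing function from the tidyr package:
library(tidyr)
d <- crossing(filenames = filenames, rownumbers = rownumbers)
which gives the same combinations. Note that crossing sorts its inputs, so the file names come out in sorted order rather than from 1 to 168; use expand_grid with the same arguments if you want to keep the original order.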
Upvotes: 1
Reputation: 86
The simplest way to do this would be with integer division:
for (i in 1:nrow(m)) {
  # rows 1..512000 map to file 1, rows 512001..1024000 to file 2, and so on
  filenum <- 1 + floor((i - 1) / 512000)
  filename <- paste0("File", filenum, ".txt")
  ## instead of m$FileName[i] <- filenum, use:
  m$FileName[i] <- filename  ## it works!
}
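The same integer-division idea can also be applied to the whole column at once, which avoids assigning 86016000 elements one by one. A minimal sketch on a scaled-down data frame (m_small and rows_per_file are illustrative names, not from the question):

```r
# Scaled-down stand-in for m: 2 files with 3 rows each instead of 168 x 512000
rows_per_file <- 3
m_small <- data.frame(RowNumber = rep(seq_len(rows_per_file), times = 2))

# %/% is integer division; this computes the file number for every row in one step
filenum <- 1 + (seq_len(nrow(m_small)) - 1) %/% rows_per_file
m_small$FileName <- paste0("File", filenum, ".txt")

print(m_small)
```

For the real data, rows_per_file would be 512000 and m_small would be replaced by m.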
Hope this helps
Upvotes: 1