Reputation: 99
I'm looking to create multiple data frames using a for loop and then stitch them together with merge().
I'm able to create my data frames using assign(paste(), blah). But then, in the same for loop, I need to delete the first column of each of these data frames.
Here's the relevant bits of my code:
for (j in 1:3)
{
    # This is to create each data frame
    # This works
    assign(paste(platform, j, "df", sep = "_"), read.csv(file = paste(masterfilename, extension, sep = "."), header = FALSE, skip = 1, nrows = 100))
    # This is to delete first column
    # This does not work
    assign(paste(platform, j, "df$V1", sep = "_"), NULL)
}
In the first situation I'm assigning my variables to a data frame, so they inherit that type. But in the second situation, I'm assigning it to NULL.
Does anyone have any suggestions on how I can work this out? Also, is there a more elegant solution than assign(), which seems to bog down my code? Thanks,
n.i.
Upvotes: 2
Views: 7029
Reputation: 14366
As @joran pointed out in his comment, the proper way of doing this would be to use a list. But if you want to stick with assign, you can replace your second statement with:
assign(paste(platform, j, "df", sep = "_"),
       get(paste(platform, j, "df", sep = "_"))[
           2:length(get(paste(platform, j, "df", sep = "_")))])
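To make the pattern easier to see, here is a minimal self-contained sketch. It uses a toy data frame instead of read.csv(), the value of platform is made up for illustration, and it drops the first column with [-1] rather than 2:length(...), which does the same thing here. The last lines show what the original assign(..., "df$V1", ...) call actually does: it creates a separate variable whose name literally contains "$".

```r
# Hypothetical stand-in for the question's `platform` variable
platform <- "p"

for (j in 1:3) {
  nm <- paste(platform, j, "df", sep = "_")
  # Toy data frame in place of read.csv(); V1..V3 mimic header = FALSE
  assign(nm, data.frame(V1 = 1:2, V2 = 3:4, V3 = 5:6))
  # Fetch the object by name, drop its first column, store it back
  assign(nm, get(nm)[-1])
}

names(p_1_df)   # "V2" "V3"

# assign() treats the whole string as a variable name, "$" included
assign("p_1_df$V2", NULL)
exists("p_1_df$V2")   # TRUE: a new variable with an odd name
ncol(p_1_df)          # 2: the data frame itself is untouched
```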
If you wanted to use a list instead, your code to read the data frames would look like
dfs <- replicate(3,
read.csv(file = paste(masterfilename, extension, sep = "."),
header = FALSE, skip = 1, nrows = 100), simplify = FALSE)
Note that you can use replicate because your call to read.csv does not depend on j in the loop. Then you can remove the first column of each data frame:
dfs <- lapply(dfs, function(d) d[-1])
Or, combining everything in one command:
dfs <- replicate(3,
read.csv(file = paste(masterfilename, extension, sep = "."),
header = FALSE, skip = 1, nrows = 100)[-1], simplify = FALSE)
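Since the question mentions stitching the data frames together with merge(), here is one possible sketch of that step once everything is in a list. The toy data and the shared id column are assumptions for illustration only. merge() needs a common key, so that column would have to be kept rather than dropped.

```r
# Toy list standing in for `dfs`; the shared `id` column is an assumption
dfs <- list(
  data.frame(id = 1:3, a = c(10, 20, 30)),
  data.frame(id = 1:3, b = c("x", "y", "z")),
  data.frame(id = 2:3, flag = c(TRUE, FALSE))
)

# Merge pairwise from left to right on the common column
merged <- Reduce(function(x, y) merge(x, y, by = "id"), dfs)

names(merged)   # "id" "a" "b" "flag"
nrow(merged)    # 2, because only ids 2 and 3 appear in all three
```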
Upvotes: 1
Reputation: 206546
assign can be used to build variable names, but "name$V1" isn't a variable name. The $ is an operator in R, so you're trying to build a function call, and you can't do that with assign. In fact, in this case it's best to avoid assign completely. You don't need to create a bunch of different variables. If your data.frames are related, just keep them in a list.
mydfs <- lapply(1:3, function(j) {
    df <- read.csv(file = paste(masterfilename, extension, sep = "."),
                   header = FALSE, skip = 1, nrows = 100)
    df$V1 <- NULL  # drop the first column
    df
})
Now you can access them with mydfs[[1]], mydfs[[2]], etc. And you can run functions over all the data sets with any of the *apply family of functions.
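As a rough illustration of that last point, here is a sketch with a toy list in place of the real mydfs. The element names are made up; naming the list is optional, but it gives you back the readable labels that the assign() approach was providing.

```r
# Toy list standing in for `mydfs` after the first column was dropped
mydfs <- list(
  data.frame(V2 = 1:3, V3 = 4:6),
  data.frame(V2 = 7:8, V3 = 9:10)
)

# Optional: hypothetical names so elements can be looked up by label
names(mydfs) <- paste("platform", seq_along(mydfs), "df", sep = "_")

sapply(mydfs, nrow)         # number of rows in each data frame
lapply(mydfs, colMeans)     # column means, one vector per data frame
mydfs[["platform_2_df"]]    # same element as mydfs[[2]]
```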
Upvotes: 4