Reputation: 88
I have a dataframe with 1000 columns of data
str(MT)
'data.frame': 1356 obs. of 1000 variables:
$ Date : Factor w/ 1356 levels "Apr-1900","Apr-1901",..: 453 340 792 1 905 679 566 114 1244 1131 ...
$ Year : int 1900 1900 1900 1900 1900 1900 1900 1900 1900 1900 ...
$ X1 : num -27.4 -27.8 -17 1.7 7.9 ...
$ X2 : num -27.21 -27.99 -17.05 1.69 7.75 ...
$ X3 : num -26.67 -27.84 -16.75 2.24 7.82 ...
$ X4 : num -26.64 -27.98 -16.83 2.46 7.97 ...
.....
$ X1000 : num -29.13 -30.61 -20.47 -0.46 6.5
I would like to split this dataframe into three columns ( Date, Year and Xn) using a loop so that the end of it all I will have 1000 separate csv files with 3 columns of data. My codes thus far is
for (i in ncol(MT)) {
x[[i]]<-data.frame(MT$Date, Year, MT$[[i]]) }
However, is giving me errors. Your guidance would be appreciated as this I am new to R
Upvotes: 0
Views: 2965
Reputation: 4686
Reusing the sample data created by @Molx, and doing some reshaping as @Neal Fultz suggested in comments, using tidyr
# generate sample data
MT <- data.frame(Date = rnorm(5), Year = rnorm(5), X1 = rnorm(5), X2 = rnorm(5), X3 = rnorm(5))
Then fit all variables and values excluding Date
and Year
as key-value column pairs
> require(tidyr)
> MTg <- gather(MT, var, value, -c(Date, Year))
> MTg
Date Year var value
1 -1.5356474 -1.0963886 X1 -0.74075807
2 -1.1346928 0.2925819 X1 1.42787059
3 0.7031032 0.3361561 X1 -0.27112156
4 1.0140557 1.2587298 X1 0.85693377
5 0.2529787 -3.0113663 X1 0.12686607
6 -1.5356474 -1.0963886 X2 0.21406288
7 -1.1346928 0.2925819 X2 -1.11363330
8 0.7031032 0.3361561 X2 -0.30324978
9 1.0140557 1.2587298 X2 0.48954893
10 0.2529787 -3.0113663 X2 0.85898166
11 -1.5356474 -1.0963886 X3 -0.44394680
12 -1.1346928 0.2925819 X3 -0.86942530
13 0.7031032 0.3361561 X3 -1.62344294
14 1.0140557 1.2587298 X3 0.09880026
15 0.2529787 -3.0113663 X3 -0.76091871
Then run through all possible variable names, exporting them into individual csv files with same name as var
.
varnames <- levels(MTg$var) # get variable names
dummy <- lapply(varnames, function(x)
write.csv(MTg[MTg$var==x,], file=paste0(x, ".csv"))
Upvotes: 0
Reputation: 6931
Your code has some syntax and algorithm errors:
i = ncol(MT)
, it should be (i in 1:ncol(MT))
;Xn
, so (i in 1:(ncol(MT)-2))
;x
before trying to allocate data to it, preferably with its final size;MT$
to select the Year
column;$
and [[
to subset the Xn
column. You should use just [
instead, because this way you get to use i
and keep the column name.Fixing all these, with some example data, you get:
MT <- data.frame(Date = rnorm(5), Year = rnorm(5), X1 = rnorm(5), X2 = rnorm(5), X3 = rnorm(5))
nX <- ncol(MT)-2
listofdf <- lapply(1:nX, function(x) NULL)
for (i in 1:nX) {
listofdf[[i]] <- data.frame(MT$Date, MT$Year, MT[i+2])
}
listofdf
# [[1]]
# MT.Date MT.Year X1
# 1 -0.94184053 1.0241134 -0.4329728
# 2 0.59637577 -0.6195477 -1.3011527
# 3 0.33474278 1.0628674 -0.8957239
# 4 -0.04328685 0.4275993 -0.7840214
# 5 0.78799652 0.5707058 -0.4243622
#
# [[2]]
# MT.Date MT.Year X2
# 1 -0.94184053 1.0241134 2.2380838
# 2 0.59637577 -0.6195477 -0.9995170
# 3 0.33474278 1.0628674 0.3452450
# 4 -0.04328685 0.4275993 -1.0453718
# 5 0.78799652 0.5707058 -0.6292885
#
# [[3]]
# MT.Date MT.Year X3
# 1 -0.94184053 1.0241134 -0.05293727
# 2 0.59637577 -0.6195477 0.84947635
# 3 0.33474278 1.0628674 1.17748809
# 4 -0.04328685 0.4275993 1.73233398
# 5 0.78799652 0.5707058 -0.61874653
If you're just going to save them as .csv files, it's not necessary to store in a list though. Instead, you can use:
for (i in 1:nX) {
tempdf <- data.frame(MT$Date, MT$Year, MT[i+2])
write.csv(tempdf, paste0("MT_subset_X", i, ".csv"))
}
Upvotes: 1