Reputation: 2763
I have a data frame with monthly temperature data for several locations:
> df4[1:36,]
location variable cut month year freq
1 Adamantina temperature 10 Jan 1981 21.0
646 Adamantina temperature 10 Feb 1981 20.5
1291 Adamantina temperature 10 Mar 1981 21.5
1936 Adamantina temperature 10 Apr 1981 21.5
2581 Adamantina temperature 10 May 1981 24.0
3226 Adamantina temperature 10 Jun 1981 21.5
3871 Adamantina temperature 10 Jul 1981 22.5
4516 Adamantina temperature 10 Aug 1981 23.5
5161 Adamantina temperature 10 Sep 1981 19.5
5806 Adamantina temperature 10 Oct 1981 21.5
6451 Adamantina temperature 10 Nov 1981 23.0
7096 Adamantina temperature 10 Dec 1981 19.0
2 Adolfo temperature 10 Jan 1981 24.0
647 Adolfo temperature 10 Feb 1981 20.0
1292 Adolfo temperature 10 Mar 1981 24.0
1937 Adolfo temperature 10 Apr 1981 23.0
2582 Adolfo temperature 10 May 1981 18.0
3227 Adolfo temperature 10 Jun 1981 21.0
3872 Adolfo temperature 10 Jul 1981 22.0
4517 Adolfo temperature 10 Aug 1981 19.0
5162 Adolfo temperature 10 Sep 1981 19.0
5807 Adolfo temperature 10 Oct 1981 24.0
6452 Adolfo temperature 10 Nov 1981 24.0
7097 Adolfo temperature 10 Dec 1981 24.0
3 Aguai temperature 10 Jan 1981 24.0
648 Aguai temperature 10 Feb 1981 20.0
1293 Aguai temperature 10 Mar 1981 22.0
1938 Aguai temperature 10 Apr 1981 20.0
2583 Aguai temperature 10 May 1981 21.5
3228 Aguai temperature 10 Jun 1981 20.5
3873 Aguai temperature 10 Jul 1981 24.0
4518 Aguai temperature 10 Aug 1981 23.5
5163 Aguai temperature 10 Sep 1981 18.5
5808 Aguai temperature 10 Oct 1981 21.0
6453 Aguai temperature 10 Nov 1981 22.0
7098 Aguai temperature 10 Dec 1981 23.5
What I need to do is to programmatically split this data frame by location and create a .Rdata file for every location.
In the example above, I would have three different files - Adamantina.Rdata, Adolfo.Rdata and Aguai.Rdata - containing all the columns but only the rows corresponding to those locations.
It needs to be efficient and programmatic, because in my actual data I have about 700 different locations and about 50 years of data for every location.
Thanks in advance.
Upvotes: 3
Views: 9615
Reputation: 4818
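To split the data frame, use split(df4, df4$location). It returns a named list with one data frame per location (Adamantina, Adolfo, Aguai, etc.).
To save individual data frames into a locations.RData file, use save(Adamantina, Adolfo, Aguai, file="locations.RData"). This requires objects with those names to exist in the workspace, which split() alone does not create; list2env(split(df4, df4$location), envir = .GlobalEnv) would create them. save.image(file="filename.RData") will save everything in the current R session into filename.RData.
You can read more about save and save.image in their help pages (?save and ?save.image).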
Edit:
If the number of splits is too large to list by hand, then use this approach:
locations <- split(df4, df4$location)
save(locations, file = "locations.RData")
Calling load("locations.RData") will then restore locations as a list of data frames.
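As a rough illustration of the single-file approach, the sketch below saves the split list and reads it back. The toy df4 (with made-up values and only a few columns) and the temporary file path are assumptions for demonstration, not the asker's real data.

```r
# Toy stand-in for df4; values and column subset are hypothetical
df4 <- data.frame(
  location = rep(c("Adamantina", "Adolfo", "Aguai"), each = 2),
  month    = rep(c("Jan", "Feb"), times = 3),
  year     = 1981,
  freq     = c(21.0, 20.5, 24.0, 20.0, 24.0, 20.0),
  stringsAsFactors = FALSE
)

# One list element per location, named after the location
locations <- split(df4, df4$location)

# Save the whole list into a single file (note the named 'file' argument)
rdata_file <- file.path(tempdir(), "locations.RData")
save(locations, file = rdata_file)

# Remove the object, then restore it from disk under its original name
rm(locations)
load(rdata_file)

# Pull out one location's rows from the restored list
adamantina <- locations[["Adamantina"]]
```

This keeps everything in one file, so it does not by itself produce one file per location as the question asks.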
Upvotes: 3
Reputation: 1190
This borrows from a previous answer, but I don't believe that answer does what you want.
First, as they suggest, you want to split up your data set.
splitData <- split(df4, df4$location)
Now go through this list and save each data set one by one, using the names of the list elements:
allNames <- names(splitData)
for (thisName in allNames) {
  # saveRDS() writes one object in RDS format; read it back with readRDS(), not load()
  saveName <- paste0(thisName, '.Rdata')
  saveRDS(splitData[[thisName]], file = saveName)
}
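If the files need to be readable with load(), as the .Rdata extension in the question suggests, a variant using save() instead of saveRDS() might look like the sketch below. The toy df4, the output directory under tempdir(), and the object name location_data are all assumptions chosen for illustration.

```r
# Toy stand-in for df4; values and column subset are hypothetical
df4 <- data.frame(
  location = rep(c("Adamantina", "Adolfo", "Aguai"), each = 2),
  month    = rep(c("Jan", "Feb"), times = 3),
  year     = 1981,
  freq     = c(21.0, 20.5, 24.0, 20.0, 24.0, 20.0),
  stringsAsFactors = FALSE
)

# Hypothetical output directory; replace with a real path as needed
out_dir <- file.path(tempdir(), "locations")
dir.create(out_dir, showWarnings = FALSE)

split_data <- split(df4, df4$location)

for (loc in names(split_data)) {
  # save() stores the object under its variable name, so every file
  # will contain a data frame called 'location_data'
  location_data <- split_data[[loc]]
  save(location_data, file = file.path(out_dir, paste0(loc, ".Rdata")))
}

# Load one file into its own environment to avoid overwriting objects
env <- new.env()
loaded <- load(file.path(out_dir, "Adolfo.Rdata"), envir = env)
adolfo <- env$location_data
```

Because load() restores objects under the names they were saved with, each file here yields an object called location_data. Location names that contain characters not allowed in file names would need cleaning before being used as file names.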
Upvotes: 6