Orpheus
Orpheus

Reputation: 329

How to plot multiple semi-variogram from a single dataset efficiently in R?

I have a dataframe named seoul1to7 contains the hourly PM10 concentration data from 1 march to 7 march,2012.please,download.In this dataset, time is in yyyymmddhr format.for example, 2012030101 means 1 march 2012, 1.00 a.m.

Data is Look like:

      ID       time PM10      LAT     LON
1 111121 2012030101   42 37.56464 126.976
2 111121 2012030102   36 37.56464 126.976
3 111121 2012030103   46 37.56464 126.976
4 111121 2012030104   40 37.56464 126.976
.
.

My ultimate goal is to plot semi-variogram for every hour. for example, for 1 march 2012,1.00 am(2012030101) there are 107 PM10 data. And I want to plot semivariogram for 2012030101 to 2012030723 (total 7*24 semivariogram). I wrote some code in R:

seoul1to7<-read.csv("seoul1to7.csv", row.names=1)
rownames(seoul1to7)<-NULL

seoul311<-subset(seoul1to7, time==2012030101)
seoul312<-subset(seoul1to7, time==2012030102)
.
.
.
seoul3723<-subset(seoul1to7,time==2012030724)

at first, I tried to make my desired (7*24) dataframes by subset() function then I wanted to plot semivariogram for each dataframe.For example, I have plotted semivariogram for seoul311(for 2012030101) by following code:

library(sp)
library(gstat)
library(rgdal)
seoul311<-read.csv("seoul311.csv",row.name=1)
seoul311<-na.omit(seoul311)

coordinates(seoul311)=~LON+LAT
proj4string(seoul311) =  "+proj=longlat +datum=WGS84" 
seoul311<-spTransform(seoul311, CRS("+proj=utm +north +zone=52 +datum=WGS84"))

#plot Omnidirectional Variogram
seoul311.var<-variogram(PM10~1,data=seoul311,cutoff=66000, width=6000)
seoul311.var
plot(seoul311.var, col="black", pch=16,cex=1.3,
     xlab="Distance",ylab="Semivariance",
     main="Omnidirectional Variogram for seoul 311")

#Model fit
model.311<- fit.variogram(seoul311.var,vgm(psill=250,model="Gau",range=40000,nugget=100),
                          fit.method = 2)
plot(seoul311.var,model=model.311, col="black", pch=16,cex=1.3,
     xlab="Distance",ylab="Semivariance",
     main="Omnidirectional Variogram for seoul 3112")


#Directional Variogram
seoul311.var1<-variogram(PM10~1,data=seoul311,width=6000,cutoff=66000,
                         alpha=seq(0,135,45),tol.hor=15)
seoul311.var1
plot(seoul311.var1,model=model.311, cex=1.1,pch=16,col=1,
     main="ANisotropic Variogram for PM10")    


#anisotropy corrected variograms
model.3112.anis<- fit.variogram(seoul311.var1,vgm(250,"Gau",40000,100,anis=c(45,0.80)),
                                fit.method = 2)

#Final isotropic variogram for kriging
plot(seoul311.var,model=model.3112.anis, col="black", pch=16,cex=1.3,
     xlab="Distance",ylab="Semivariance",
     main="Final Isotropic Variogram")

But I understand that my code is very inefficient! I am writing (7*24) times subset(seoul1to7, time==2012030101) this code. and then again (7*24) times the code for plotting semivariogram! I think this is very inappropriate way.

So, How can I plotting these (7*24) semi variogram very efficiently from my dataset seoul1to7 (by using loop or any other function)? If you have need any further information please let me know.

Upvotes: 1

Views: 1539

Answers (1)

Orpheus
Orpheus

Reputation: 329

library(sp)
library(gstat)
library(rgdal)
library(automap)

seoul1to7<-read.csv("seoul1to7.csv", row.names=1)
seoul1to7 <- na.omit(seoul1to7)
seoul1to7_split<-split(seoul1to7,seoul1to7$time) #a list contain 161 dataframe
seq(seoul1to7_split)

### now we loop (using lapply()) over each seoul1to7_split entry and calculate 
### variogram using autofitVariogram and return the variogram plot

vars<-lapply(seq(seoul1to7_split), function(i)
{
  dat<-seoul1to7_split[[i]]  # for list element [[]]
  coordinates(dat)<-~LON+LAT
  proj4string(dat) <- "+proj=longlat +datum=WGS84" 
  dat <- spTransform(dat, CRS("+proj=utm +north +zone=52 +datum=WGS84"))
  variogram<-autofitVariogram(log(PM10)~1,dat)
  plot<- plot(variogram,plotit=FALSE, asp=1)

  ### in case you do not want to fix xlim and ylim to be identical 
  ### for each plot just comment out the following line or change
  ### values as you see fit
  #plt <- update(plt, xlim = c(-1000, 35000), ylim = c(0, 1000))

  return(plot)
})

### now we actually have 23 * 7 variogram plots which we will combine 
### into 23 hourly plots using latticeCombineGrid()
library(raster)
library(devtools)
#install.packages("Rcpp")
install_github("environmentalinformatics-marburg/Rsenal")
library(Rsenal)

names(seoul1to7_split)
hours<-substr(names(seoul1to7_split),9,10) #substructuring 9th to 10th digit of every element
hours
unique(hours)
class(unique(hours))   #character
seq(unique(hours))

var7_per_plot<-lapply(seq(unique(hours)), function(j)
  {
  index<- hours %in% unique(hours)[j]  
  plot.hours <- vars[index]

  return(latticeCombineGrid(plot.hours, layout =c(3,3)))
})
var7_per_plot[[1]]
var7_per_plot[[2]]
.
.
.
var7_per_plot[[23]]

Special thanks to Tim Appelhans for taught me.

Upvotes: 1

Related Questions