Lerner54

Reputation: 31

accessing large dataset from NetCDF4 file using R

I am trying to use data from this file (MCD13.A2010.unaccum.nc4) in a project. I have installed and loaded the ncdf4, raster, ggplot2, and viridis packages. I open the file successfully using:

mcd_file <- nc_open("C:\\Program Files\\RStudio\\R\\MCD13.A2010.unaccum.nc4")

and can access attributes of mcd_file by highlighting mcd_file and clicking the RUN button and by using:

vars <- names(mcd_file$var)
print(vars)

This indicates that mcd_file has 5 variables, with time_bnds being the second variable, and the dataset NDVI as the third variable.

I can access the correct information for the time_bnds variable using this:

time.layers <- ncvar_get(mcd_file, "time_bnds")

but when I try this:

NDVI <- ncvar_get(mcd_file, "NDVI")

I get an error message that reads:

Error: cannot allocate vector of size 49.5 Gb

I looked up the meaning of the error message, and it means that I don't have enough RAM to hold all the information in NDVI, but then I don't know anyone with over 49.5 Gb of RAM. Yet people do analyze this file using R so the information in NDVI should be accessible using R somehow.

I know that raster objects from the raster library can be used to access data in files that are too big to fit in RAM. But I can't figure out how to extract NDVI from the original file and write it to its own file, without the other variables, so that I can then use raster objects to access it. This:

write.csv(mcd_file$NDVI, file = "NDVI.csv")

creates a file called NDVI.csv, but it is empty.

This:

write.csv(as.raster(ncvar_get(mcd_file, "NDVI")), file = "NDVI2010.csv")

generates this error message:

Error: cannot allocate vector of size 49.5 Gb

Can anyone help me out?

Upvotes: 1

Views: 1173

Answers (1)

Lerner54

Reputation: 31

Solved.

There is no need to pull the whole variable into memory with ncvar_get at all. Just let the raster package do its magic.

library(raster)
f <- "C:\\Program Files\\RStudio\\R\\MCD13.A2010.unaccum.nc4"
b <- brick(f, varname = "NDVI")  # pick NDVI explicitly, since the file holds several variables

The raster package opens the file and reads the metadata of the data set, but leaves the cell values on disk and pulls them in chunks only when they are needed.
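To make the "stays on disk, read in chunks" behaviour concrete, here is a minimal sketch. It does not use the NDVI file; it assumes a small stand-in raster written to a temporary file (the 100 x 100 size and the .grd format are my choices for illustration only), then reads it back block by block.

```r
library(raster)

# Build a small stand-in raster and write it to a temporary file,
# so the example runs without the original NetCDF file.
r_mem <- raster(nrows = 100, ncols = 100)
values(r_mem) <- seq_len(ncell(r_mem))
tmp <- tempfile(fileext = ".grd")  # raster's native file format
writeRaster(r_mem, tmp)

# Re-open from disk: only the metadata is read at this point.
r_disk <- raster(tmp)
on_disk <- !inMemory(r_disk)  # TRUE while the cell values are still on disk

# Process the layer in blocks of rows instead of all at once.
bs <- blockSize(r_disk)
total <- 0
for (i in seq_len(bs$n)) {
  v <- getValues(r_disk, row = bs$row[i], nrows = bs$nrows[i])
  total <- total + sum(v, na.rm = TRUE)
}
total
```

The same loop should work on a single layer taken from the brick, for example b[[1]], with only one block of rows held in memory at a time.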

Plotting a raster layer uses similar syntax:

x <- desiredBandToPlot  # placeholder: index of the band (layer) you want
r <- raster(f, band = x, varname = "NDVI")
plot(r)

Because each layer has 28,000,000 pixels of data to plot, it took my computer (AMD A8 CPU) about 5 minutes to downsample and display the plot, but it works.
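If you would rather stay with ncdf4, ncvar_get can also read part of a variable through its start and count arguments, so only one slice is allocated at a time. The sketch below is an illustration under assumptions: it builds a tiny stand-in file with a made-up lon x lat x time layout rather than using the real one, so check the actual dimension order of NDVI (for example by printing the result of nc_open) before adapting it.

```r
library(ncdf4)

# Create a tiny stand-in NetCDF file (hypothetical dimensions and units).
lon  <- ncdim_def("lon",  "degrees_east",  seq(0, 9))
lat  <- ncdim_def("lat",  "degrees_north", seq(0, 4))
time <- ncdim_def("time", "days since 2010-01-01", 0:2)
ndvi_def <- ncvar_def("NDVI", "1", list(lon, lat, time), missval = -9999)

tmp <- tempfile(fileext = ".nc")
nc <- nc_create(tmp, ndvi_def)
ncvar_put(nc, ndvi_def, array(seq_len(10 * 5 * 3), dim = c(10, 5, 3)))
nc_close(nc)

# Read only the second time step: start gives the first index along each
# dimension, count gives how many values to read (-1 means "all").
nc <- nc_open(tmp)
slice <- ncvar_get(nc, "NDVI", start = c(1, 1, 2), count = c(-1, -1, 1))
nc_close(nc)

dim(slice)
```

Looping the third element of start over the time steps would let you process the whole variable one layer at a time.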

Upvotes: 2
