Reputation: 85
I have a data frame T_mod with 150 observations and 2920 variables, containing subsurface temperature values in °C over one year. It looks like this:
> T_mod[1:10, 1:6]
t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
z=0.1 9.000187 9.004622 9.009004 9.013332 9.017607 9.021829
z=0.2 8.587763 8.592795 8.597776 8.602705 8.607583 8.612410
z=0.3 8.179728 8.185313 8.190848 8.196334 8.201770 8.207157
z=0.4 7.776561 7.782655 7.788702 7.794702 7.800653 7.806558
z=0.5 7.378704 7.385267 7.391785 7.398256 7.404682 7.411062
z=0.6 6.986564 6.993556 7.000504 7.007408 7.014268 7.021084
z=0.7 6.600512 6.607894 6.615235 6.622533 6.629789 6.637003
z=0.8 6.220886 6.228623 6.236319 6.243975 6.251591 6.259166
z=0.9 5.847995 5.856050 5.864068 5.872046 5.879986 5.887887
z=1 5.482113 5.490454 5.498759 5.507026 5.515257 5.523450
The row names stand for depth, in 10 cm increments from 0.1 m to 15 m underground. The column names indicate time in elapsed seconds. The cell values are temperatures in °C for each point in time at a given depth.
I want to create a heatmap showing temperature, with time on the x-axis and depth on the y-axis. The plot below is created with the image.plot function (from the fields package, which builds on base R graphics) using the following code:
image.plot(z = t(as.matrix(T_mod[150:1,])), legend.lab = "Temperature (°C)",
ylab = "Depth (m)", xlab = "Time")
The x-axis represents time (one year in 3 h intervals) and the y-axis represents depth (0 to 15 m in 10 cm increments). Z values are temperatures for a given point in time and a specific depth. Obviously, the axis ticks and tick labels make little sense as of now. The problem is that the image and image.plot functions are somewhat rigid and do not seem to allow adjusting axis ticks, labels, etc.
Now, someone has pointed me towards ggplot2 for greater flexibility in adjusting plot parameters but I have not used ggplot so far. Consequently, the code below does not work.
ggplot(T_mod, aes(x=time, y=Depth, z=Temperature)) +
geom_tile(aes(fill=Temperature)) +
theme(panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
panel.border = element_rect(colour = "black", fill=NA, size=2))+
ylab("Depth")+
xlab("Time")+
# possibly use stat_contour(binwidth = 0.1,aes(colour = ..level..),size=0.1) +
# ... and scale_fill_gradient(low = "red", high = "Green") +
# ... and scale_colour_gradient(low = "black", high = "black",guide = "none")+
scale_y_continuous(expand = c(0,0),breaks=seq(20, 140, 20),limits=c(20,140),labels=lbl_y)+
scale_x_continuous(expand = c(0,0),breaks=seq(124, 2796, 240),limits=c(124,2796),labels=lbl_x)+
coord_cartesian(ylim=c(1,150),xlim=c(1,2920))+
theme(axis.text.x = element_text(size = 15),axis.text.y = element_text(size = 15),axis.title.x = element_text(size = 15),axis.title.y = element_text(size = 15),plot.title = element_text(size=15))+
ggtitle("Main title")
> lbl_y
[1] -2 -4 -6 -8 -10 -12 -14
> lbl_x
[1] "01 Sep" "01 Okt" "01 Nov" "01 Dez" "01 Jan" "01 Feb" "01 Mrz" "01 Apr" "01 Mai"
[10] "01 Jun" "01 Jul" "01 Aug"
The basic issue, I believe, is that I do not know how to assign depth, time, and temperature from the data frame to the aes() call in the first row. Other examples use columns to specify that, but each column in my data frame holds the temperatures at one point in time, and I want all temperatures plotted as the fill. Any suggestions on how to plot this with ggplot2, or how to change the image.plot call above so that the axes can be set, are greatly appreciated.
Upvotes: 1
Views: 7452
Reputation: 17810
I agree with the other statements that the data need to be reshaped to be in tidy format. I just wanted to add that geom_raster() rather than geom_tile() is generally the better option for large heatmaps. It is a high-performance special case for when all tiles are the same size, and it is much faster. An example follows below (using the built-in volcano data, since I don't have your dataset).
library(ggplot2)
library(viridis)
# create tidy version of volcano data
nx = 87
ny = 61
volcano_data <- data.frame(height = c(volcano), x = rep(1:nx, ny), y = rep(1:ny, each = nx))
# take a look at the dataset. it's indeed tidy.
head(volcano_data)
# height x y
# 1 100 1 1
# 2 101 2 1
# 3 102 3 1
# 4 103 4 1
# 5 104 5 1
# 6 105 6 1
# plot
ggplot(volcano_data, aes(x, y, fill=height)) +
geom_raster() +
coord_fixed(expand = FALSE) +
scale_fill_viridis()
geom_raster() also allows you to interpolate between adjacent colors for a smoother appearance. This may or may not be useful to you:
ggplot(volcano_data, aes(x, y, fill=height)) +
geom_raster(interpolate = TRUE) +
coord_fixed(expand = FALSE) +
scale_fill_viridis()
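To apply the same idea to a wide data frame shaped like T_mod (depth in the row names, time in the column names), something along these lines might work. It is only a sketch: T_sim is a small made-up stand-in for the real data, its values are invented, and the "z=" and "t=" prefixes are assumed to be exactly as printed in the question.

```r
library(ggplot2)

# Made-up stand-in for T_mod: 5 depths x 4 time steps (NOT real data)
depths <- seq(0.1, 0.5, by = 0.1)
times  <- seq(-24548400, by = 10800, length.out = 4)
m <- outer(depths, times, function(z, t) 9 - 4 * z + 1e-6 * (t - times[1]))
dimnames(m) <- list(
  paste0("z=", depths),
  paste0("t=", format(times, scientific = FALSE, trim = TRUE))
)
T_sim <- as.data.frame(m)

# Reshape to long format with base R; c(as.matrix()) unrolls column by column,
# so depth varies fastest and each time value is repeated once per depth
long <- data.frame(
  depth = rep(as.numeric(sub("z=", "", rownames(T_sim), fixed = TRUE)),
              times = ncol(T_sim)),
  time  = rep(as.numeric(sub("t=", "", colnames(T_sim), fixed = TRUE)),
              each = nrow(T_sim)),
  temp  = c(as.matrix(T_sim))
)

# Depth increases downwards, as in a soil profile
p <- ggplot(long, aes(x = time, y = depth, fill = temp)) +
  geom_raster() +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_reverse(expand = c(0, 0)) +
  labs(x = "Time (s)", y = "Depth (m)", fill = "Temperature (°C)")
print(p)
```

With the real data, T_sim would simply be replaced by T_mod; the regular 10 cm and 3 h spacing described in the question is what makes geom_raster() applicable here.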
Upvotes: 2
Reputation: 15072
I mentioned in the comment that I think you need to gather your data, at least if it is presented as shown, with time in columns and depth in rows. ggplot2 is designed to work with tidy data, where each row is an observation and variables are stored in columns. Here, that means you want just three columns, one each for depth, temp and time, and each row is then a single measurement. You can do this with the code below.
- gather to combine all the time columns into a single one
- separate to split up the time and row values into just the numeric part
- select to drop unneeded variables
- mutate_at to convert the values stored as strings into numbers
Then, ggplot becomes easy to use. geom_tile is designed for three main aesthetics, x, y, and fill. We just call geom_tile and map its aesthetics to the variables we want, and produce the plot below. I include scale_fill_viridis, which changes the colours to perceptually uniform ones, but that isn't necessary. You might not need all these steps if your data isn't stored exactly as shown.
As far as the axis ticks go, you probably do want scale_x_continuous, but I am not sure what units your time values are in right now.
For more info on tidy data and on ggplot, try these chapters.
library(tidyverse)
library(viridis)
tbl <- read_table2(
"depth t=-24548400 t=-24537600 t=-24526800 t=-24516000 t=-24505200 t=-24494400
z=0.1 9.000187 9.004622 9.009004 9.013332 9.017607 9.021829
z=0.2 8.587763 8.592795 8.597776 8.602705 8.607583 8.612410
z=0.3 8.179728 8.185313 8.190848 8.196334 8.201770 8.207157
z=0.4 7.776561 7.782655 7.788702 7.794702 7.800653 7.806558
z=0.5 7.378704 7.385267 7.391785 7.398256 7.404682 7.411062
z=0.6 6.986564 6.993556 7.000504 7.007408 7.014268 7.021084
z=0.7 6.600512 6.607894 6.615235 6.622533 6.629789 6.637003
z=0.8 6.220886 6.228623 6.236319 6.243975 6.251591 6.259166
z=0.9 5.847995 5.856050 5.864068 5.872046 5.879986 5.887887
z=1 5.482113 5.490454 5.498759 5.507026 5.515257 5.523450"
)
tidy_tbl <- tbl %>%
  gather(key = "time", value = "temp", starts_with("t=")) %>%
  separate(depth, c("z", "depth"), sep = "=") %>%
  # split at "=" rather than "-" so the minus sign of the elapsed seconds is kept
  separate(time, c("t", "time"), sep = "=") %>%
  select(-z, -t) %>%
  mutate_at(vars(depth, time), as.numeric) %>%
  print()
# A tibble: 60 x 3
depth time temp
<dbl> <dbl> <dbl>
1 0.100 -24548400 9.00
2 0.200 -24548400 8.59
3 0.300 -24548400 8.18
4 0.400 -24548400 7.78
5 0.500 -24548400 7.38
6 0.600 -24548400 6.99
7 0.700 -24548400 6.60
8 0.800 -24548400 6.22
9 0.900 -24548400 5.85
10 1.00 -24548400 5.48
# ... with 50 more rows
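In more recent tidyr versions, gather() is superseded by pivot_longer(), which can strip the "t=" prefix and convert to numeric in one step. A minimal sketch of the same reshaping, assuming tidyr 1.1.0 or later; wide_demo is a made-up two-row, two-column subset of the table above, used only for illustration:

```r
library(tidyr)

# Two rows and two time columns copied from the example table
wide_demo <- data.frame(
  depth = c("z=0.1", "z=0.2"),
  "t=-24548400" = c(9.000187, 8.587763),
  "t=-24537600" = c(9.004622, 8.592795),
  check.names = FALSE  # keep the "t=..." column names unchanged
)

long_demo <- pivot_longer(
  wide_demo,
  cols = starts_with("t="),
  names_to = "time",
  names_prefix = "t=",                        # drop the prefix, keep the sign
  names_transform = list(time = as.numeric),  # "-24548400" -> -24548400
  values_to = "temp"
)

# Strip the "z=" prefix from the depth labels
long_demo$depth <- as.numeric(sub("z=", "", long_demo$depth, fixed = TRUE))
print(long_demo)
```

The result has the same three columns (depth, time, temp) as tidy_tbl and can be passed to ggplot in the same way.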
ggplot(data = tidy_tbl) +
theme_bw() +
geom_tile(aes(x = time, y = depth, fill = temp)) +
scale_fill_viridis(name = "Temp") +
labs(x = "Time", y = "Depth")
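Regarding the axis ticks: the question states that the column names are elapsed seconds, so one option is to convert them to date-times and let scale_x_datetime place the labels. This is only a sketch under assumptions: t0 (the date that corresponds to t = 0) is hypothetical and has to be replaced by the real reference date of the model, and demo is a made-up grid with invented temperatures standing in for the tidy data.

```r
library(ggplot2)

# Hypothetical reference date for t = 0; NOT taken from the original data
t0 <- as.POSIXct("2018-06-01 00:00:00", tz = "UTC")

# Made-up grid: 10 depths x 60 days of 3-hourly steps (values are NOT real)
demo <- expand.grid(
  depth = seq(0.1, 1, by = 0.1),
  time  = seq(-24548400, by = 10800, length.out = 8 * 60)
)
demo$temp <- 9 - 4 * demo$depth + 1e-6 * (demo$time - min(demo$time))

# Elapsed seconds -> date-time (POSIXct arithmetic is in seconds)
demo$date <- t0 + demo$time

p <- ggplot(demo, aes(x = date, y = depth, fill = temp)) +
  theme_bw() +
  geom_tile() +
  # one tick every two weeks, labelled like "01 Sep"
  scale_x_datetime(date_breaks = "2 weeks", date_labels = "%d %b",
                   expand = c(0, 0)) +
  # depth increases downwards
  scale_y_reverse(breaks = seq(0.2, 1, by = 0.2), expand = c(0, 0)) +
  labs(x = "Time", y = "Depth (m)", fill = "Temp (°C)")
print(p)
```

For a full year of data, date_breaks = "1 month" would give labels similar to the lbl_x vector in the question; the month abbreviations follow the locale of the R session.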
Upvotes: 3