Reputation: 13
I'm trying to make a density plot of intensity in a circle, from radial intensity measurements of a bacterial population.
I've looked into various 3D and 2D options to best represent the data.
I effectively want to create something like this:
R code:
library(tibble)
library(ggplot2)
df <- tibble(x_variable = rnorm(5000), y_variable = rnorm(5000))
ggplot(df, aes(x = x_variable, y = y_variable)) + stat_density2d(aes(fill = after_stat(density)), contour = FALSE, geom = 'tile')
However, what I want to create is a plot of my own data (this is a rough extract), where x and y are radial distances from the center point, with density, contours, or shading representing the intensity z.
I'm OK with it being a contour plot, or a 3D plot viewed from above. I just feel like I've tried everything. With other examples I'll have surfaces which might overlap, 'submerge', and reappear at different distances, so any advice on how to tackle that would be great.
Any help would be greatly appreciated.
Upvotes: 1
Views: 759
Reputation: 173793
It sounds as though you are trying to represent x, y, z data as a contour or 2D density plot. In order to do this, your data have to be organized into a regular grid.
My understanding is that the x value in your data represents the radial distance from the center of the bacterial colony and the z value represents the density at that point. This gives us enough information to create a 3D surface, providing that we can assume the colony is perfectly radially symmetrical.
We start by loading in the data and arranging it by increasing x value:
df <- read.csv('my_data.csv')
df <- df[order(df$x),]
Now we create a sequence of radial distances from the centre which encompass our data set and will be used as the x and y co-ordinates of our grid. We need enough points to make the grid smooth, so we will use 300 points along each side of our grid:
radii <- seq(min(df$x), max(df$x), length = 300)
Now we can create a grid of all x, y combinations of these points using expand.grid, and then find the Euclidean distance from each point to the center:
plot_df <- expand.grid(x = radii, y = radii)
plot_df$radius <- sqrt(plot_df$x^2 + plot_df$y^2)
To get the z value at each grid point, we can use findInterval, which will identify the row of our original data frame whose x value lies at or just below the radius of each grid point. The z value of that row will be the z value of the grid point:
plot_df$z <- df$z[findInterval(plot_df$radius, df$x)]
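To make the lookup step concrete, here is a minimal sketch using made-up numbers (the profile and grid_radius objects are hypothetical and are not the asker's data). It shows that findInterval returns, for each radius, the index of the last row whose x value does not exceed it:

```r
# Hypothetical radial profile: x is distance from the centre, z is intensity
profile <- data.frame(x = c(0, 0.5, 1, 1.5, 2),
                      z = c(9, 8, 6, 3, 1))

# Radii of a few grid points (made-up values)
grid_radius <- c(0.2, 0.5, 1.4, 2.7)

# For each radius, the index of the last profile row with x <= radius
idx <- findInterval(grid_radius, profile$x)
idx
# 1 2 3 5

# Look up the intensity for each grid point
profile$z[idx]
# 9 8 6 1
```

Note that a radius beyond the largest measured x (2.7 here) simply takes the z value of the last row, which is what happens in the corners of the square grid. A radius below the smallest x would give index 0, so it is worth checking that the measured profile starts at or below the smallest grid radius.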
Now we can plot the result using geom_raster:
library(ggplot2)
p <- ggplot(plot_df, aes(x, y, fill = z)) +
  geom_raster() +
  coord_equal() +
  scale_fill_viridis_c()
p
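Since my_data.csv is not available to other readers, the following is a self-contained sketch of the same idea on synthetic data. The Gaussian-shaped profile is purely an assumption for illustration, and it swaps the findInterval lookup for linear interpolation with approx, which may give a smoother result when the measured radii are sparse. Grid points further out than the largest measured radius become NA and are drawn transparent, so the result is a disc rather than a filled square:

```r
library(ggplot2)

# Synthetic stand-in for my_data.csv (assumed shape: intensity decays with distance)
df <- data.frame(x = seq(0, 2, length.out = 50))
df$z <- 10 * exp(-df$x^2)

# Symmetric grid so that the colony centre sits in the middle of the plot
radii <- seq(-max(df$x), max(df$x), length.out = 300)
plot_df <- expand.grid(x = radii, y = radii)
plot_df$radius <- sqrt(plot_df$x^2 + plot_df$y^2)

# Linear interpolation of z between the measured radii;
# radii outside the measured range come back as NA
plot_df$z <- approx(df$x, df$z, xout = plot_df$radius)$y

p_demo <- ggplot(plot_df, aes(x, y, fill = z)) +
  geom_raster() +
  coord_equal() +
  scale_fill_viridis_c(na.value = "transparent")
p_demo
```

Whether interpolation or the step-wise lookup is more appropriate depends on how densely the radial profile was sampled.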
If you want to add labelled contour lines you could do:
library(geomtextpath)
p + geom_textcontour(aes(z = z, label = after_stat(level)), breaks = 5:9)
Update for three data sets
This shows which colony has the highest density at which point:
plot_list <- lapply(paste0("my_data_", letters[1:3], ".csv"),
  function(x) {
    df <- read.csv(x)
    df <- df[order(df$x),]
    radii <- seq(-2, 2, length = 300)
    plot_df <- expand.grid(x = radii, y = radii)
    plot_df$radius <- sqrt(plot_df$x^2 + plot_df$y^2)
    plot_df$z <- df$z[findInterval(plot_df$radius, df$x)]
    plot_df
  })
plot_list[[1]]$fill <- c("Col1", "Col2", "Col3")[apply(cbind(plot_list[[1]]$z,
  plot_list[[2]]$z, plot_list[[3]]$z), 1, which.max)]
ggplot(plot_list[[1]], aes(x, y, fill = fill)) +
  geom_raster() +
  coord_equal()
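The row-wise which.max step can be seen in isolation with a tiny example. The numbers in z_mat below are invented for illustration only: each row stands for one grid point and each column for one colony.

```r
# Made-up intensities at four grid points for three colonies
z_mat <- cbind(Col1 = c(5, 2, 7, 1),
               Col2 = c(3, 6, 7, 0),
               Col3 = c(4, 1, 2, 9))

# which.max returns the position of the first maximum in each row
winner <- colnames(z_mat)[apply(z_mat, 1, which.max)]
winner
# "Col1" "Col2" "Col1" "Col3"
```

In the third row Col1 and Col2 are tied, and which.max reports the first of them, so ties in the real plot will always be coloured as the earlier colony.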
Upvotes: 2