Reputation: 33
I have my mortality data in two formats. One is the list form you get from the Human Mortality Database, with the Female, Male and combined (Total) rates all in columns. The other is separated into Male and Female matrices, each holding just age, year and the mortality rate.
The first format is along the lines of
Year Age Female Male Total
1961 99 0.3 0.4 0.3
1961 98 0.4 0.5 0.4
etc.
For the second format, I separated the data out into this form:
Age 1961 1962 1963 .....
0 0.02 0.02 0.02 ...
1 0.002 0.002 0.002....
etc.
I would like to be able to plot a heatmap so I can look at the cohort effects etc.
I have tried various methods found by searching online but these aren't working for the way my data is presented. The heatmaps I've produced come out completely red. Can anyone help?
I've tried this:
rnames <- France[,1] #assign labels in column 1 to "rnames"
mat_data <- data.matrix(France[,2:ncol(France)])
rownames(mat_data) <- rnames #assign row names
col_breaks = c(seq(-1,0,length=100), # for red
               seq(0,0.8,length=100), # for yellow
               seq(0.8,1,length=100)) # for green
my_palette <- colorRampPalette(c("red", "yellow", "green"))(n = 299)
png("location", # create PNG for the heat map
    width = 5*300, # 5 x 300 pixels
    height = 5*300,
    res = 300, # 300 pixels per inch
    pointsize = 8) # smaller font size
heatmap.2(mat_data,
          cellnote=mat_data,
          main="Correlation",
          notecol="black",
          trace="none",
          margins =c(12,9),
          col=my_palette,
          breaks=col_breaks,
          dendrogram="row",
          Colv="NA")
dev.off()
This creates a solid red heatmap with the years listed along the bottom, the word Age next to the years, and the actual ages listed along the y-axis. It also gives me an error:
Error in seq.default(min.raw, max.raw, by = min(diff(breaks)/4)) :
invalid (to - from)/by in seq(.)
Does anyone know of a better way of producing the heatmap or what I've done wrong here?
Upvotes: 1
Views: 621
Reputation: 13149
Is this in any way helpful? I based it on what your data looks like, and generated some data to match. Then I started with a plot with 'year' on the x-axis and 'age' on the y-axis and a square (geom_tile) for each point. Those squares are coloured according to the 'total'. It doesn't have any polygons like the example you gave, but I think with your real data it would enable you to look for cohort effects.
#generate some data ranging from 0 to 0.1
set.seed(1000)
France <- expand.grid(Year=1961:2000,Age=20:98)
France$Female <- runif(nrow(France),0,0.05)
France$Male <- runif(nrow(France),0,0.05)
France$Total <- France$Male + France$Female
library(ggplot2)
p1 <- ggplot(France, aes(x=Year,y=Age,fill=Total)) +
  geom_tile()+
  scale_fill_gradientn(colours=rainbow(10))
p1
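If you would rather start from one of your separated matrices (ages down the rows, one column per year), it needs stacking into long form before geom_tile can use it. Here is a rough sketch in base R. The matrix female_mat and its values are made up for illustration, and I'm assuming the ages are stored as row names and the years as column names:

```r
# Hypothetical wide matrix: rows = ages, columns = years (made-up values)
female_mat <- matrix(seq(0.001, 0.015, length.out = 15),
                     nrow = 5,
                     dimnames = list(0:4, 1961:1963))

# Recover the numeric ages and years from the dimnames
ages  <- as.numeric(rownames(female_mat))
years <- as.numeric(colnames(female_mat))

# Stack into long form: one row per (Age, Year) pair.
# as.vector() reads the matrix column by column, so Age cycles fastest.
female_long <- data.frame(
  Age  = rep(ages, times = length(years)),
  Year = rep(years, each = length(ages)),
  Rate = as.vector(female_mat)
)

head(female_long)
```

female_long can then go into the same kind of ggplot call as above, with fill=Rate instead of fill=Total.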
Upvotes: 1
Reputation: 7170
From the source code:
z <- seq(min.raw, max.raw, by=min(diff(breaks)/4))
The heatmap.2 code internally calls the seq function, and that call produces the error you're experiencing:
Error in seq.default(min.raw, max.raw, by = min(diff(breaks)/4)) :
invalid (to - from)/by in seq(.)
What are min.raw and max.raw, though? Scroll up a bit (line 640) and you'll see they are the min and max of the breaks arg you passed in (which in this case are -1 and 1 respectively). The by parameter in the internal seq call evaluates to 0, because your col_breaks repeats 0 and 0.8 where the three seq() pieces join, so diff(breaks) contains zeros:
min(diff(breaks)/4)
In fact, you can replicate this error if you call seq with these parameters:
> seq(-1, 1, by=0)
Error in seq.default(-1, 1, by = 0) : invalid (to - from)/by in seq(.)
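You can check this against the col_breaks from the question without running heatmap.2 at all. This is only a sketch of the arithmetic in that one line of the source. The names by_value and result are mine:

```r
# col_breaks built the same way as in the question
col_breaks <- c(seq(-1, 0, length = 100),
                seq(0, 0.8, length = 100),
                seq(0.8, 1, length = 100))

# 0 and 0.8 each appear twice, where one piece ends and the next begins
col_breaks[duplicated(col_breaks)]

# so the smallest gap between neighbouring breaks is zero ...
by_value <- min(diff(col_breaks) / 4)
by_value

# ... and seq() refuses a zero step
result <- try(seq(min(col_breaks), max(col_breaks), by = by_value),
              silent = TRUE)
inherits(result, "try-error")
```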
There are two implications here. First, you've uncovered a corner case that breaks that code, and this is a bug that should probably be reported on the GitHub repository (i.e., if this evaluates to 0, use some pre-defined by param). Second, you could use a uniform breaks parameter or just not define it. It is, after all, an optional parameter. From the documentation:
breaks
(optional) Either a numeric vector indicating the splitting points for binning x into colors, or a integer number of break points to be used, in which case the break points will be spaced equally between min(x) and max(x).
By leaving breaks blank or providing a single value, you shouldn't encounter this problem.
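If you do want explicit breaks, one option is to make them strictly increasing and spread them over the range the data actually covers. Mortality rates can't be negative, so with breaks running from -1 to 1 the third of the palette assigned to the -1 to 0 piece can never be used. Below is a sketch with made-up data. The names mat_demo and n_col are invented for the example, and it relies on heatmap.2 wanting one more break than colours:

```r
set.seed(1)
# Made-up rates standing in for the real matrix
mat_demo <- matrix(runif(50, 0, 0.05), nrow = 10)

n_col <- 99
my_palette <- colorRampPalette(c("red", "yellow", "green"))(n_col)

# n_col + 1 evenly spaced, strictly increasing breaks over the data range
col_breaks <- seq(min(mat_demo), max(mat_demo), length.out = n_col + 1)

# no zero gaps, so min(diff(breaks) / 4) is positive
min(diff(col_breaks)) > 0

# then, with gplots loaded, something along these lines:
# heatmap.2(mat_demo, col = my_palette, breaks = col_breaks,
#           trace = "none", dendrogram = "none", Rowv = FALSE, Colv = FALSE)
```

Whether an even spacing suits real mortality data is a separate question. Since most rates sit close to 0, you may get a more readable map by plotting log rates, but that is a judgment call.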
Upvotes: 1