Reputation: 17
I'm getting the error:
TypeError: Image data of dtype object cannot be converted to float
when I try to run the heatmap function in the code below:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Read the data
df = pd.read_csv("gapminder-FiveYearData.csv")
print(df.head(10))
# Reshape the year and country columns into 12 x 142 arrays (used for the annotations below)
year = ((np.asarray(df['year'])).reshape(12, 142))
country = ((np.asarray(df['country'])).reshape(12, 142))
print(year)
print(country)
# Create a pivot table
result = df.pivot(index='year',columns='country',values='lifeExp')
print(result)
# Create an array to annotate the heatmap
labels = (np.asarray(["{1:.2f} \n {0}".format(year,value)
for year, value in zip(year.flatten(),
country.flatten())])
).reshape(12, 142)
# Define the plot
fig, ax = plt.subplots(figsize=(15, 9))
# Add title to the Heat map
title = "GapMinder Heat Map"
# Set the font size and the distance of the title from the plot
plt.title(title,fontsize=18)
ttl = ax.title
ttl.set_position([0.5,1.05])
# Hide ticks for X & Y axis
ax.set_xticks([])
ax.set_yticks([])
# Remove the axes
ax.axis('off')
# Use the heatmap function from the seaborn package
hmap = sns.heatmap(result,annot=labels,fmt="",cmap='RdYlGn',linewidths=0.30,ax=ax)
# Display the Heatmap
plt.imshow(hmap)
Here is a link to the CSV file.
The objective of the activity is as follows:
The data file is a dataset with 6 columns, namely: country, year, pop, continent, lifeExp and gdpPercap.
Create a pivot table dataframe with year along the x-axis, country along the y-axis and lifeExp filled within the cells.
Plot a heatmap using seaborn for the pivot table that was just created.
Upvotes: 1
Views: 13776
Reputation: 312
Thanks for providing your data with this question. I believe your TypeError is coming from the labels array your code creates for the annotation. Given the heatmap function's built-in annotation option (annot=True), I don't think you need this extra work, and it modifies your data in a way that errors out when plotting.
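Separately, the wording of the traceback matches what plt.imshow raises when it is handed something other than numeric image data, so the final plt.imshow(hmap) line in the question is worth checking as well. sns.heatmap returns the matplotlib Axes it drew on, not an image array. Here is a minimal sketch, assuming a bare Axes as a stand-in for that return value (no seaborn or CSV needed):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Stand-in (assumption): sns.heatmap(..., ax=ax) hands back this same Axes object
hmap = ax

# imshow expects array-like image data; an Axes is neither numeric nor an image
try:
    plt.imshow(hmap)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
plt.close(fig)
```

Ending the script with plt.show() instead avoids this call, which is what the rewritten code below does.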
I took a stab at rewriting your project to produce a heatmap that shows the pivot table of country and year of lifeExp. I'm also assuming that it is important for you to keep this number a float.
import numpy as np
import pandas as pd
import seaborn as sb
import matplotlib.pyplot as plt
## ** UNCHANGED FROM ABOVE **
# Read in the data
df = pd.read_csv('https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv')
df.head()
## ** UNCHANGED FROM ABOVE **
# Reshape the year and country columns into 12 x 142 arrays
year = ((np.asarray(df['year'])).reshape(12,142))
country = ((np.asarray(df['country'])).reshape(12,142))
print('show year\n', year)
print('\nshow country\n', country)
# Create a pivot table
result = df.pivot(index='country',columns='year',values='lifeExp')
# Note: This index and columns order is reversed from your code.
# This will put the year on the X axis of our heatmap
result
I removed the labels code block.
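If per-cell text like the original labels array is still wanted, note that the pivot is now country by year (142 x 12), so the old 12 x 142 array would no longer line up with it. A minimal sketch, assuming a made-up three-country frame in place of the CSV (the values are invented for illustration), builds the labels from the pivot itself so the two shapes always agree:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the gapminder file; the numbers are made up
df = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C"],
    "year": [1952, 1957, 1952, 1957, 1952, 1957],
    "lifeExp": [30.0, 32.5, 60.1, 61.0, 45.2, 47.9],
})

# Same orientation as above: countries as rows, years as columns
result = df.pivot(index="country", columns="year", values="lifeExp")

# One label per cell, taken row by row from the pivot so the shapes match
labels = np.array([
    [f"{value:.2f}\n{country}" for value in row]
    for country, row in zip(result.index, result.to_numpy())
])

print(result.shape, labels.shape)
```

An array built this way can be passed to the heatmap as annot=labels together with fmt="", as in the question's original call.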
Notes on the sb.heatmap function:
I used plt.cm.get_cmap() to restrict the number of colors in your mapping. If you want to use the entire colormap spectrum, just remove it and pass cmap the way you had it originally. On recent matplotlib releases where plt.cm.get_cmap is no longer available, plt.get_cmap accepts the same arguments.
fmt = "f": this is for float, your lifeExp values.
cbar_kws: you can use this to play around with the size, label and orientation of your color bar.
# Define the plot - feel free to modify however you want
plt.figure(figsize=[20, 50])
# Set the title and its font size
title = 'GapMinder Heat Map'
plt.title(title, fontsize=24)
# plt.get_cmap(name, 7) resamples the colormap to 7 colors; it takes the same
# arguments as plt.cm.get_cmap, which newer matplotlib releases no longer provide
ax = sb.heatmap(result, annot=True, fmt='f', linewidths=.5,
                cmap=plt.get_cmap('RdYlGn', 7),
                cbar_kws={'label': 'Life Expectancy', 'shrink': 0.5})
# This sets the color bar label to font size 20
ax.figure.axes[-1].yaxis.label.set_size(20)
plt.show()
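To see what the second argument to the colormap lookup does, here is a small sketch using plt.get_cmap, which takes a colormap name and an optional number of colors. It only inspects the colormap objects and does not draw anything:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Full-resolution colormap versus one resampled down to 7 discrete colors
full = plt.get_cmap("RdYlGn")
binned = plt.get_cmap("RdYlGn", 7)

# N is the number of distinct colors the colormap holds
print(full.N, binned.N)

# The 7 RGBA tuples the binned heatmap cells are colored from
discrete_colors = [binned(i) for i in range(binned.N)]
```

With fewer colors the heatmap reads as bands of life expectancy. Passing cmap='RdYlGn' on its own keeps the smooth gradient.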
[Screenshot: partial view of the heatmap, limited only because the plot is so large]
[Screenshot: the bottom of the plot showing the year axis, slightly zoomed in on my browser]
Upvotes: 6