chandra sekhar T V

Reputation: 17

TypeError: Image data of dtype object cannot be converted to float - Issue with HeatMap Plot using Seaborn

I'm getting the error:

TypeError: Image data of dtype object cannot be converted to float

when I try to run the heatmap function in the code below:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Read the data
df = pd.read_csv("gapminder-FiveYearData.csv")
print(df.head(10))

# Create an array of n-dimensional array of life expectancy changes for countries over the years. 
year = ((np.asarray(df['year'])).reshape(12, 142))
country = ((np.asarray(df['country'])).reshape(12, 142))

print(year)
print(country)

# Create a pivot table
result = df.pivot(index='year',columns='country',values='lifeExp')
print(result)

# Create an array to annotate the heatmap
labels = (np.asarray(["{1:.2f} \n {0}".format(year,value)
                      for year, value in zip(year.flatten(),
                                               country.flatten())])
         ).reshape(12, 142)

# Define the plot
fig, ax = plt.subplots(figsize=(15, 9))

# Add title to the Heat map
title = "GapMinder Heat Map"

# Set the font size and the distance of the title from the plot
plt.title(title,fontsize=18)
ttl = ax.title
ttl.set_position([0.5,1.05])

# Hide ticks for X & Y axis
ax.set_xticks([]) 
ax.set_yticks([]) 

# Remove the axes
ax.axis('off')

# Use the heatmap function from the seaborn package
hmap = sns.heatmap(result,annot=labels,fmt="",cmap='RdYlGn',linewidths=0.30,ax=ax)

# Display the Heatmap
plt.imshow(hmap)

Here is a link to the CSV file.

The objectives of the activity are:

  1. Load the data file, a dataset with 6 columns: country, year, pop, continent, lifeExp and gdpPercap.

  2. Create a pivot table dataframe with year along the x-axis, country along the y-axis and lifeExp filled in the cells.

  3. Plot a heatmap using seaborn for the pivot table that was just created.

Upvotes: 1

Views: 13776

Answers (1)

jwho

Reputation: 312

Thanks for providing your data with this question. I believe your TypeError is coming from the labels array your code creates for the annotation. Given the heatmap function's built-in annot option, I don't think you need this extra work, and it modifies your data in a way that errors out when plotting.
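If custom annotations are still wanted, here is a minimal sketch of one way to build them, under a few assumptions: the `sample` frame and its numbers are made up for illustration, and the label layout (year on top, value below) is just one possible choice. The point is that the labels array is built from the pivot table itself, so it has the same shape as the data, and the float format is applied to the numeric value only.

```python
import numpy as np
import pandas as pd

# Made-up sample in the same long format as the gapminder CSV
# (illustrative numbers, not taken from the real file).
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [1952, 1957, 1952, 1957],
    "lifeExp": [30.0, 32.5, 60.0, 61.5],
})
result = sample.pivot(index="country", columns="year", values="lifeExp")

# One label per cell: the year on the first line, the value below it.
# Only the numeric value gets the float format.
labels = np.array([
    [f"{year}\n{value:.2f}" for year, value in zip(result.columns, row)]
    for row in result.to_numpy()
])

# Same shape as the pivot table, which is what annot= expects
# when it is given an array instead of True.
print(labels.shape == result.shape)
```

With seaborn available, an array like this could then be passed as `sns.heatmap(result, annot=labels, fmt="")`.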

I took a stab at rewriting your project to produce a heatmap of the pivot table of lifeExp by country and year. I'm also assuming that it is important for you to keep this number a float.

import numpy as np
import pandas as pd
import seaborn as sb
import matplotlib.pyplot as plt

## ** UNCHANGED FROM ABOVE **
# Read in the data
df = pd.read_csv('https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv')
df.head()

[image: output of df.head()]

## ** UNCHANGED FROM ABOVE ** 
# Create an array of n-dimensional array of life expectancy changes for countries over the years.
year = ((np.asarray(df['year'])).reshape(12,142))
country = ((np.asarray(df['country'])).reshape(12,142))

print('show year\n', year)
print('\nshow country\n', country)
# Create a pivot table
result = df.pivot(index='country',columns='year',values='lifeExp')
# Note: This index and columns order is reversed from your code. 
# This will put the year on the X axis of our heatmap
result

[image: the resulting pivot table]
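To make the effect of swapping index and columns concrete, here is a small sketch on a made-up frame (the `sample` data is an assumption for illustration, not the real CSV). The index of the pivot table becomes the heatmap rows (y-axis) and the columns become the heatmap columns (x-axis).

```python
import pandas as pd

# Made-up sample in the same long format as the gapminder CSV.
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [1952, 1957, 1952, 1957],
    "lifeExp": [30.0, 32.5, 60.0, 61.5],
})

# Countries as rows, years as columns: year ends up on the x-axis.
by_country = sample.pivot(index="country", columns="year", values="lifeExp")

# The orientation used in the question: years as rows, countries as columns.
by_year = sample.pivot(index="year", columns="country", values="lifeExp")

print(by_country)
print(by_year)
```

The two tables hold the same values; one is the transpose of the other.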

I removed the labels code block. Notes on the sb.heatmap function:

  • I used plt.cm.get_cmap() to restrict the number of colors in your mapping. If you want to use the entire colormap spectrum, just remove it and pass cmap as you had it originally. (Matplotlib 3.9 and later no longer provide plt.cm.get_cmap; plt.get_cmap('RdYlGn', 7) is the equivalent call there.)
  • fmt = "f": this is for float, the type of your lifeExp values.
  • cbar_kws: you can use this to play around with the size, label and orientation of your color bar.
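As a quick illustration of the fmt note above: the fmt string is an ordinary Python format specification, so its effect can be previewed with the built-in format(). The value below is made up, and '.1f' is only a suggested alternative, not something the original code uses.

```python
value = 43.828  # made-up life expectancy value

default_f = format(value, "f")      # plain 'f' keeps six decimal places
one_decimal = format(value, ".1f")  # a shorter alternative for crowded cells

print(default_f)    # 43.828000
print(one_decimal)  # 43.8
```

On a plot with this many cells, a shorter format such as '.1f' may keep the annotations readable.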
# Define the plot - feel free to modify however you want
plt.figure(figsize = [20, 50])

# Set the font size and the distance of the title from the plot
title = 'GapMinder Heat Map'
plt.title(title,fontsize=24)

ax = sb.heatmap(result, annot = True, fmt='f', linewidths = .5,
                 cmap = plt.cm.get_cmap('RdYlGn', 7), cbar_kws={
                     'label': 'Life Expectancy', 'shrink': 0.5})

# This sets a label, size 20 to your color bar
ax.figure.axes[-1].yaxis.label.set_size(20)
plt.show()
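One more possible contributor to the original error, offered as an assumption rather than something confirmed in this thread: sns.heatmap returns a Matplotlib Axes, and the question passes that return value to plt.imshow, which expects image data. Converting an arbitrary Python object to a NumPy array yields dtype object, which cannot be cast to float. The sketch below shows only that conversion step, using a hypothetical stand-in class (`FakeAxes`) instead of a real Axes so that it runs with NumPy alone.

```python
import numpy as np


class FakeAxes:
    """Hypothetical stand-in for the Axes object returned by a plotting call."""


# Converting a non-array object gives a 0-d array of dtype object.
not_an_image = np.asarray(FakeAxes())
print(not_an_image.dtype)

# An object array cannot be cast to float under the "same_kind" rule.
castable = np.can_cast(not_an_image.dtype, float, "same_kind")
print(castable)
```

Ending with plt.show(), as in the code above, avoids handing the Axes to imshow at all.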

Here is a limited screenshot, only because the plot is so large: [image: final plot]

Another one, of the bottom of the plot to show the year axis, slightly zoomed in on my browser: [image: additional screenshot]

Upvotes: 6
