Thedone

Reputation: 33

Seaborn heatmap from group of columns?

I need to create a heatmap with seaborn, but I can't seem to get there or fully grasp how.

Each component (row) needs to be present on the heatmap. The y-axis (left) should show the EID of each component; there are a lot of them, so labelling only 1 in every 10-20 is fine. The x-axis should show ROTATION1, ROTATION2, ROTATION3, ROTATION4 and ROTATION5, which are the 5 columns of the dataset to plot. The EXTRA column is irrelevant for the heatmap.

The values that the heatmap should represent are either ROT, STILL, FLIP, or any even number from 160 to 180 (so 160, 162, 164, etc.).

Some rows are blank for all columns ROTATION1 - ROTATION5 but the components should still be included in the heatmap (and show no colours for them).

+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EID    | EXTRA | ROTATION1 | ROTATION2 | ROTATION3 | ROTATION4 | ROTATION5 |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| AB1178 | POS   | FLIP      |           | STILL     | 172       |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EC8361 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| QS7229 | POS   |           |           | 160       |           | ROT       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SE0447 | NEG   | ROT       | STILL     |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YT5489 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SZ2548 | NEG   | 164       |           |           | FLIP      |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| OT6892 | POS   | FLIP      |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PL3811 | POS   |           |           |           | STILL     |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| WQ0893 | POS   |           |           | ROT       |           | 170       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| TY3551 | NEG   | 160       | FLIP      |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PC6466 | POS   |           | 180       | 176       |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YH5912 | POS   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BK6245 | NEG   |           |           |           | STILL     |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GQ2081 | POS   |           |           |           | 162       | FLIP      |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GF8633 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FJ4895 | NEG   |           | 174       |           | ROT       |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YD2504 | POS   |           |           |           |           | 162       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RF3510 | POS   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PN6167 | NEG   |           | 168       | FLIP      |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RB9747 | POS   | FLIP      |           | STILL     | 178       | STILL     |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BQ0841 | NEG   |           | ROT       |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| HJ5187 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BP2359 | POS   | 168       | STILL     |           |           | ROT       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FN6198 | POS   | ROT       |           |           | 172       | FLIP      |
+--------+-------+-----------+-----------+-----------+-----------+-----------+

What I have tried:

df = pd.read_csv('DATA.csv')
df = pd.DataFrame(df, columns = ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'])

in_range = list(range(160,181, 2))
direction = ['ROT', 'FLIP', 'STILL']
elements = direction + ([str(num) for num in num_range])

sensing = sns.load_dataset("df")
sensing = sensing.pivot("EID", ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'], elements)

heatmap = sns.heatmap(sensing)

This does not work, I think because the "x-axis" elements should be in the form of a column, not multiple rows? If anyone can tell me how to get around that, it would be great!

Outcome wanted:

A heatmap whose "colour legend bar" on the right shows ROT, STILL, FLIP and the even numbers from 160 to 180, in that order if at all possible. As previously said, the y-axis on the left should show the EID, but the actual dataset has about 200 rows, so labelling one in every 10 or 20 is fine. There should be 5 columns in the heatmap, one for each of ROTATION1 to ROTATION5.

I am inexperienced and just need a bit of help.

Using Python 2.7, pandas 0.24.2 and seaborn 0.9.1.

Upvotes: 3

Views: 2647

Answers (1)

Zephyr

Reputation: 12496

First of all, you need to convert all values in your data to a numeric type, for example int:

replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}

df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values

df = df.replace(replacements).astype(int)
        ROTATION1  ROTATION2  ROTATION3  ROTATION4  ROTATION5
EID                                                          
AB1178        182        157        184        172        157
EC8361        157        157        157        157        157
QS7229        157        157        160        157        187
SE0447        187        184        157        157        157
YT5489        157        157        157        157        157
SZ2548        164        157        157        182        157
OT6892        182        157        157        157        157
PL3811        157        157        157        184        157
WQ0893        157        157        187        157        170
TY3551        160        182        157        157        157
PC6466        157        180        176        157        157
YH5912        157        157        157        157        157
BK6245        157        157        157        184        157
GQ2081        157        157        157        162        182
GF8633        157        157        157        157        157
FJ4895        157        174        157        187        157
YD2504        157        157        157        157        162
RF3510        157        157        157        157        157
PN6167        157        168        182        157        157
RB9747        182        157        184        178        184
BQ0841        157        187        157        157        157
HJ5187        157        157        157        157        157
BP2359        168        184        157        157        187
FN6198        187        157        157        172        182
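
The same encoding can also be written with an explicit per-cell function, which some readers may find easier to follow than `replace(...).astype(int)`. This is only a sketch under assumptions: the file is taken to be a plain comma-separated CSV, three rows from the question's table are inlined in place of `data.csv`, and `encode`, `labels_to_codes` and `BLANK_CODE` are hypothetical names introduced for illustration.

```python
import io

import pandas as pd

# Three rows copied from the question's table (assumed CSV layout);
# empty fields are read as NaN.
csv_text = u"""EID,EXTRA,ROTATION1,ROTATION2,ROTATION3,ROTATION4,ROTATION5
AB1178,POS,FLIP,,STILL,172,
EC8361,NEG,,,,,
QS7229,POS,,,160,,ROT
"""

# Same codes as in the answer: 157 for blanks, 182/184/187 for the labels.
labels_to_codes = {"FLIP": 182, "STILL": 184, "ROT": 187}
BLANK_CODE = 157


def encode(cell):
    """Hypothetical helper: turn one raw cell into its integer code."""
    if pd.isna(cell):
        return BLANK_CODE
    if cell in labels_to_codes:
        return labels_to_codes[cell]
    return int(cell)  # numeric strings such as "172"


# dtype=str keeps every non-empty cell as text, so encode() sees one type only.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str)
raw = raw.drop("EXTRA", axis=1).set_index("EID")
encoded = raw.apply(lambda column: column.map(encode))
print(encoded)
```

An unexpected value (for example a misspelled label) makes `int(cell)` raise a `ValueError`, which surfaces bad data early instead of silently plotting it.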

Then you should map each numerical value to its respective label and prepare a colormap:

values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)

I chose the 'tab20' colormap because you need 15 different colors (11 numbers, 3 labels and the blank) and this colormap is one of the few that contains enough colors.
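
The codes 157, 182, 184 and 187 appear to be chosen so that, when the 15 colors are spread evenly over the range 157 to 187, every code lands in its own color band. A small arithmetic check of that reading in plain Python; `color_index` is a hypothetical helper that only mimics picking a band from a linearly scaled value, it is not a seaborn or matplotlib function:

```python
# Codes used in the answer: 157 = blank, 160..180 = angles,
# 182/184/187 = FLIP/STILL/ROT.
values = sorted([157, 182, 184, 187] + list(range(160, 181, 2)))
n = len(values)                     # 15 colors
vmin, vmax = values[0], values[-1]  # 157 and 187


def color_index(value):
    """Hypothetical helper: band index of a value under linear scaling."""
    position = (value - vmin) / float(vmax - vmin)  # 0.0 .. 1.0
    return min(int(position * n), n - 1)            # top edge goes to the last band


indices = [color_index(v) for v in values]
print(indices)
```

Each band is 2 units wide, so if the codes are changed (say 186 instead of 187 for ROT), two values can end up sharing a color.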
Then you can draw the heatmap:

ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')
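
For the roughly 200 rows mentioned in the question, `sns.heatmap` accepts an integer for `yticklabels`, which draws only every n-th row label. A minimal sketch, assuming made-up data: the EIDs and the random codes below are hypothetical stand-ins for the real table, and the `Agg` backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use

import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in for the real table: 200 hypothetical EIDs, random codes.
rng = np.random.RandomState(0)
codes = [157, 182, 184, 187] + list(range(160, 181, 2))
columns = ["ROTATION%d" % i for i in range(1, 6)]
index = pd.Index(["E%04d" % i for i in range(200)], name="EID")
demo = pd.DataFrame(rng.choice(codes, size=(200, 5)), index=index, columns=columns)

cmap = sns.color_palette("tab20", len(codes))
cmap[0] = (1, 1, 1)  # white for the blank code, as in the answer

# yticklabels=10 labels every 10th row; vmin/vmax pin the color limits
# so they do not depend on which codes happen to occur in the data.
ax = sns.heatmap(demo, cmap=cmap, vmin=157, vmax=187, yticklabels=10)
```

Annotations are left out here because 200 annotated rows would likely be unreadable; `yticklabels=20` thins the labels further.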

Finally, you need to tune the colorbar so that each tick sits in the middle of its color band and carries the right label:

colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
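
The tick formula puts one tick at the center of each color band. A quick check of that arithmetic in plain Python, assuming the color limits are 157 and 187 as in the sample data. It also builds the labels from the sorted list of values rather than from `vmap.values()`, because dictionaries only guarantee insertion order from Python 3.7 onwards and the question mentions Python 2.7. The label `'blank'` for code 157 is an arbitrary choice made here.

```python
# Code -> label for the non-numeric entries ('blank' is an assumed label for 157).
special_labels = {157: 'blank', 182: 'FLIP', 184: 'STILL', 187: 'ROT'}

values = sorted(list(special_labels) + list(range(160, 181, 2)))
n = len(values)

vmin, vmax = 157.0, 187.0  # assumed color limits (colorbar.vmin / colorbar.vmax)
r = vmax - vmin

# Same formula as in the answer: half a band in, then one band per step.
ticks = [vmin + 0.5 * r / n + r * i / n for i in range(n)]

# Labels follow the sorted codes, independent of dictionary ordering.
labels = [special_labels.get(v, str(v)) for v in values]

for tick, label in zip(ticks, labels):
    print(tick, label)
```

These lists can then be handed to `colorbar.set_ticks(ticks)` and `colorbar.set_ticklabels(labels)`. If some code never occurs in the data (for instance no ROT at all), the limits inferred from the data shift, so passing `vmin=157, vmax=187` to `sns.heatmap` should keep the bands aligned.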

Complete Code

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np


replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}

df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values

df = df.replace(replacements).astype(int)


values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)


ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')

colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
 
plt.show()

  • heatmap with annotations, for checking:

    [image: heatmap with annotations]

  • heatmap without annotations:

    [image: heatmap without annotations]


I don't recommend using a continuous colormap: it can be difficult to distinguish one value from the next.
However, if you want, you can use a continuous colormap, either for all values or for the numerical ones only
(and of course you can keep or remove the annotations).

  • colormap 'plasma' for the numerical values only, white for NaNs, and red, green and blue for the categorical ones:

    cmap = sns.color_palette('plasma', n - 4)
    cmap.insert(0, (1, 1, 1, 1))
    cmap.append((1, 0, 0, 1))
    cmap.append((0, 1, 0, 1))
    cmap.append((0, 0, 1, 1))
    

    [image: heatmap with 'plasma' for numerical values only]

  • colormap 'plasma' for all values:

    cmap = sns.color_palette('plasma', n - 1)
    cmap.insert(0, (1, 1, 1, 1))
    

    [image: heatmap with 'plasma' for all values]

Upvotes: 1
