Reputation: 33
I need to create a heatmap with seaborn, but I can't seem to get there or fully grasp how.
Each component (row) needs to be present on the heatmap. The y-axis (left) should show the EID of each component. There are a lot of them, so labelling only 1 in every 10-20 is fine. The x-axis should show ROTATION1, ROTATION2, ROTATION3, ROTATION4 and ROTATION5, which are the 5 relevant columns of the dataset. Column EXTRA is irrelevant for the heatmap.
The values that the heatmap should represent are either ROT, STILL, FLIP, or an even number from 160 to 180 (160, 162, 164, etc.).
Some rows are blank for all columns ROTATION1 - ROTATION5, but those components should still be included in the heatmap (and show no colour).
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EID | EXTRA | ROTATION1 | ROTATION2 | ROTATION3 | ROTATION4 | ROTATION5 |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| AB1178 | POS | FLIP | | STILL | 172 | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EC8361 | NEG | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| QS7229 | POS | | | 160 | | ROT |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SE0447 | NEG | ROT | STILL | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YT5489 | NEG | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SZ2548 | NEG | 164 | | | FLIP | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| OT6892 | POS | FLIP | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PL3811 | POS | | | | STILL | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| WQ0893 | POS | | | ROT | | 170 |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| TY3551 | NEG | 160 | FLIP | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PC6466 | POS | | 180 | 176 | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YH5912 | POS | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BK6245 | NEG | | | | STILL | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GQ2081 | POS | | | | 162 | FLIP |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GF8633 | NEG | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FJ4895 | NEG | | 174 | | ROT | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YD2504 | POS | | | | | 162 |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RF3510 | POS | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PN6167 | NEG | | 168 | FLIP | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RB9747 | POS | FLIP | | STILL | 178 | STILL |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BQ0841 | NEG | | ROT | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| HJ5187 | NEG | | | | | |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BP2359 | POS | 168 | STILL | | | ROT |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FN6198 | POS | ROT | | | 172 | FLIP |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
What I have tried:
df = pd.read_csv('DATA.csv')
df = pd.DataFrame(df, columns = ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'])
in_range = list(range(160,181, 2))
direction = ['ROT', 'FLIP', 'STILL']
elements = direction + ([str(num) for num in num_range])
sensing = sns.load_dataset("df")
sensing = sensing.pivot("EID", ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'], elements)
heatmap = sns.heatmap(sensing)
This does not work because I think the "x-axis" elements should be in the form of a column, not multiple rows? If anyone can tell me how to get round that would be great!
Outcome wanted:
A heatmap with the "colour legend bar" on the right showing ROT, STILL, FLIP and the numbers between 160-180 in steps of 2, in that order if at all possible.
As previously said, the y-axis on the left should have the EID, but the actual dataset is about 200 rows, so labelling one in every 10 or 20 is fine.
There should be 5 columns in the heatmap, each representing one of ROTATION1 to ROTATION5.
I am inexperienced and just need a bit of help.
Using Python 2.7, pandas 0.24.2 and seaborn 0.9.1.
Upvotes: 3
Views: 2647
Reputation: 12496
First of all, you need to convert all values in your data to a numeric type, int for example:
replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}
df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values
df = df.replace(replacements).astype(int)
ROTATION1 ROTATION2 ROTATION3 ROTATION4 ROTATION5
EID
AB1178 182 157 184 172 157
EC8361 157 157 157 157 157
QS7229 157 157 160 157 187
SE0447 187 184 157 157 157
YT5489 157 157 157 157 157
SZ2548 164 157 157 182 157
OT6892 182 157 157 157 157
PL3811 157 157 157 184 157
WQ0893 157 157 187 157 170
TY3551 160 182 157 157 157
PC6466 157 180 176 157 157
YH5912 157 157 157 157 157
BK6245 157 157 157 184 157
GQ2081 157 157 157 162 182
GF8633 157 157 157 157 157
FJ4895 157 174 157 187 157
YD2504 157 157 157 157 162
RF3510 157 157 157 157 157
PN6167 157 168 182 157 157
RB9747 182 157 184 178 184
BQ0841 157 187 157 157 157
HJ5187 157 157 157 157 157
BP2359 168 184 157 157 187
FN6198 187 157 157 172 182
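If you want to try the conversion without the CSV file, here is a minimal sketch on three rows copied from the sample table. Note that it swaps in an element-wise `Series.map` with a small helper (`encode`, a name I made up) instead of `df.replace`, and it assumes the numeric cells arrive as strings such as '172'; the codes 157/182/184/187 are the same as above.

```python
import numpy as np
import pandas as pd

# Same integer codes as in the answer; the constant names are illustrative only.
LABEL_CODES = {"FLIP": 182, "STILL": 184, "ROT": 187}
BLANK_CODE = 157

# Three rows copied from the sample table. Blank cells are NaN and numbers
# are kept as strings (an assumption about how the mixed columns are read).
df = pd.DataFrame({
    "EID": ["AB1178", "EC8361", "QS7229"],
    "EXTRA": ["POS", "NEG", "POS"],
    "ROTATION1": ["FLIP", np.nan, np.nan],
    "ROTATION2": [np.nan, np.nan, np.nan],
    "ROTATION3": ["STILL", np.nan, "160"],
    "ROTATION4": ["172", np.nan, np.nan],
    "ROTATION5": [np.nan, np.nan, "ROT"],
})


def encode(cell):
    """Turn one raw cell into its integer code."""
    if pd.isna(cell):
        return BLANK_CODE
    if cell in LABEL_CODES:
        return LABEL_CODES[cell]
    return int(cell)  # numeric readings such as '172' or 172.0


encoded = (
    df.drop("EXTRA", axis=1)
      .set_index("EID")
      .apply(lambda col: col.map(encode))  # encode every cell, column by column
      .astype(int)
)
print(encoded)
```

The three encoded rows should match the first three rows of the table printed above.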
Then you should map each numerical value to its respective label and prepare a colormap:
values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)
I chose the 'tab20' colormap because you need 15 different colors, and it is one of the few that contains enough colors.
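It may help to see why the placeholder codes 157, 182, 184 and 187 fit a list of 15 colors. The sketch below assumes the colors are spread linearly between the smallest and largest value in the data (my understanding of the default behaviour when no explicit limits are passed), so the range 157 to 187 splits into 15 bands of width 2. The helper `color_index` is hypothetical and only approximates that mapping in plain Python.

```python
# Integer codes used in the answer: 157 for blank cells, 160..180 in steps
# of 2 for the numeric readings, then 182/184/187 for FLIP/STILL/ROT.
codes = [157] + list(range(160, 181, 2)) + [182, 184, 187]
n = len(codes)                      # 15 colors are needed
vmin, vmax = min(codes), max(codes)


def color_index(value):
    """Approximate index of the color band a value falls into."""
    fraction = (value - vmin) / float(vmax - vmin)
    return min(int(fraction * n), n - 1)  # the top value belongs to the last band


indices = [color_index(v) for v in codes]
print(indices)
```

Under that assumption every code lands in its own band, so no two values share a color. If you pick different placeholder codes, it is worth re-running a check like this.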
Then you can draw the heatmap:
ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')
Finally you need to tune the colorbar, so that each tick sits in the middle of its color band and carries the right label:
colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
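To make the tick formula less opaque, here is the same arithmetic in plain Python, assuming the colorbar limits end up at 157 and 187 (the smallest and largest code in the data) and n is 15. The label list is rebuilt by hand for illustration, with the blank code shown as 'nan'.

```python
vmin, vmax, n = 157, 187, 15   # assumed colorbar limits and number of colors
r = vmax - vmin                # total range covered by the colorbar

# Each band is r / n wide; a tick goes half a band above the band's lower edge.
ticks = [vmin + 0.5 * r / n + r * i / float(n) for i in range(n)]

# Labels in the same sorted order as the codes they describe.
labels = ["nan"] + [str(v) for v in range(160, 181, 2)] + ["FLIP", "STILL", "ROT"]

for tick, label in zip(ticks, labels):
    print(tick, label)
```

With these numbers each band is 2 units wide, so the ticks fall at 158, 160, 162 and so on up to 186, one per color.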
Complete code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}
df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values
df = df.replace(replacements).astype(int)
values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)
ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')
colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
plt.show()
I don't recommend using a continuous colormap: it can be difficult to distinguish one value from the next.
However, if you want, you can use a continuous colormap, either for all values or for the numerical ones only (of course, you can keep or remove the annotations).
Colormap 'plasma' only for numerical values, white for NaNs, RGB for the categorical ones:
cmap = sns.color_palette('plasma', n - 4)
cmap.insert(0, (1, 1, 1, 1))
cmap.append((1, 0, 0, 1))
cmap.append((0, 1, 0, 1))
cmap.append((0, 0, 1, 1))
Colormap 'plasma' for all values:
cmap = sns.color_palette('plasma', n - 1)
cmap.insert(0, (1, 1, 1, 1))
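One point from the question that the code above leaves open is labelling only every 10th or 20th EID on the real dataset of about 200 rows. As far as I know, `sns.heatmap` accepts an integer for `yticklabels` and then draws only every n-th row label. The sketch below uses made-up data (random codes and invented EIDs such as EID000), so treat it as an illustration rather than your actual plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the real ~200-row dataset.
rng = np.random.RandomState(0)
allowed = [157] + list(range(160, 181, 2)) + [182, 184, 187]
columns = ["ROTATION%d" % i for i in range(1, 6)]
index = pd.Index(["EID%03d" % i for i in range(200)], name="EID")
demo = pd.DataFrame(rng.choice(allowed, size=(200, 5)), index=index, columns=columns)

fig, ax = plt.subplots(figsize=(6, 10))
# yticklabels=10 asks seaborn to label only every 10th row.
sns.heatmap(demo, cmap=sns.color_palette("tab20", len(allowed)),
            yticklabels=10, ax=ax)

shown = [t.get_text() for t in ax.get_yticklabels()]
print(len(shown), shown[:3])
```

In the complete code this would amount to adding `yticklabels = 10` (or 20) to the existing `sns.heatmap` call. With annotations switched on, 200 rows will probably be too dense to read, so you may want to drop `annot` there.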
Upvotes: 1