Reputation: 464
I have a 2880x2880 similarity matrix (about 8.3 million points). My attempt with HoloViews resulted in a 500 MB HTML file which never finishes opening. So how do I make a radial heatmap of the matrix?
I had data from 10 different places, measured over 1 whole year. The hours of each month were turned into arrays, so each month had 24 arrays (one for all 00:00, one for all 01:00 ... 22:00, 23:00).
These were about 28-31 cells long, and each cell had the measurement of the thing I'm trying to analyze. So there are these 24 arrays for each month of 1 whole year, i.e. 24x12 = 288 arrays per place. And there are measurements from 10 places. So a total of 2880 arrays were created and all compared to each other, and saved in a 2880x2880 matrix with similarity coefficients.
I'm trying to turn it into a radial similarity matrix like the one from holoviews, but without the ticks and tags (since the format Place01Jan0800 would be cumbersome to look at for 2880 rows), just the shape and colors and divisions:
I managed to create the HTML file itself, but it ended up being 500 MB big, so it never shows up when I open it up. It's just blank. I've added a minimal example below of what I have, and replaced the loading of the datafile with some randomly generated data.
import sys
sys.setrecursionlimit(10000)
import random
import numpy as np
import pandas as pd
import holoviews as hv
from holoviews import opts
from bokeh.plotting import show
import gc
# Function creating dummy data for this example
def transformer():
    dimension = 2880
    dummy_matrix = [[random.random() for i in range(dimension)] for j in range(dimension)]  # Fake, similar data
    col_vals = [str(i) for i in range(dimension * dimension)]  # Placeholder
    row_vals = [str(i) for i in range(dimension * dimension)]  # Placeholder
    val_vals = np.reshape(np.array(dummy_matrix), -1).tolist()  # Flatten the matrix into a 1D list
    idx_vals = [i for i in range(dimension * dimension)]  # Placeholder
    return idx_vals, val_vals, row_vals, col_vals
idx_arr, val_arr, row_arr, col_arr = transformer()
df = pd.DataFrame({"values": val_arr, "x-label": row_arr, "y-label": col_arr}, index=idx_arr)
hv.extension('bokeh')
heatmap = hv.HeatMap(df, ["x-label", "y-label"])
heatmap.opts(opts.HeatMap(cmap="viridis", radial=True))
gc.collect() # Attempt to save memory, because this thing is huge
show(hv.render(heatmap))
I had a look at datashader to see if it would help, but I have no idea how to plug it in (if it's possible for this case) to this radial heatmap, since it seems like the radial heatmap doesn't have that datashade-feature.
So I have no idea how to tackle this. I would be content with a broad overview too, I don't need the details nor the hover-infobox nor ability to zoom or any fancy extra features, I just need the general overview for a presentation. I'm open to any solution really.
Upvotes: 0
Views: 681
Reputation: 464
Plain Matplotlib seems to be able to handle it, based on answers from here: How do I create radial heatmap in matplotlib?
import matplotlib.pyplot as plt
import numpy as np

n = 2880  # number of angular steps
m = 2880  # number of radial steps
rad = np.linspace(0, 10, m)
a = np.linspace(0, 2 * np.pi, n)
r, th = np.meshgrid(rad, a)  # both arrays have shape (n, m)
dummy_matrix = np.random.rand(n, m)  # fake data; replace with the real similarity matrix

fig = plt.figure()
ax = fig.add_subplot(projection="polar")
# shading="auto" lets the coordinate arrays have the same shape as the data
mesh = ax.pcolormesh(th, r, dummy_matrix, cmap="Blues", shading="auto")
ax.grid(False)  # hide the polar grid lines so they do not cover the data
fig.colorbar(mesh)
fig.savefig("custom_radial_heatmap.png")
plt.show()
And it didn't even take an eternity, only about 20 seconds at most.
You would think it would turn out monstrous like that
But the sheer amount of points drowns out the jaggedness, WOOHOO!
There are still some things left to be desired, like tags and ticks, but I think I'll figure that out.
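For the tags and ticks, here is a minimal sketch of one way it might be done, assuming the rows are ordered place by place so that each of the 10 places takes up an equal angular sector. The function name `radial_heatmap_with_place_ticks` and the `Place01`-style labels are made up for illustration. It passes cell edges (one more than the number of cells) to `pcolormesh`, so sector boundaries line up exactly with the data.

```python
import matplotlib.pyplot as plt
import numpy as np


def radial_heatmap_with_place_ticks(matrix, n_places=10):
    """Polar heatmap with one angular tick label per place.

    Assumption: rows of `matrix` are grouped place by place, so every
    place covers an equal share of the circle. Labels are hypothetical.
    """
    n_rows, n_cols = matrix.shape
    # Cell edges: one more edge than cells in each direction.
    theta_edges = np.linspace(0, 2 * np.pi, n_rows + 1)
    radius_edges = np.linspace(0, 10, n_cols + 1)
    r, th = np.meshgrid(radius_edges, theta_edges)  # (n_rows + 1, n_cols + 1)

    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    mesh = ax.pcolormesh(th, r, matrix, cmap="Blues", shading="flat")

    # Put one label at the angular centre of each place's sector.
    sector = 2 * np.pi / n_places
    centres = (np.arange(n_places) + 0.5) * sector
    ax.set_xticks(centres)
    ax.set_xticklabels([f"Place{i + 1:02d}" for i in range(n_places)])
    ax.set_yticks([])  # no radial tick labels
    ax.grid(False)
    fig.colorbar(mesh, ax=ax)
    return fig, ax
```

Calling it with the full matrix, e.g. `fig, ax = radial_heatmap_with_place_ticks(np.random.rand(2880, 2880))` followed by `fig.savefig("radial_with_ticks.png")`, should give the same picture as above with ten labels around the rim instead of thousands.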
Upvotes: 0
Reputation: 3074
I recommend using a heatmap instead of a radial heatmap to show the similarity matrix. The reasons are:
The following is a simple example that shows a heatmap.
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import numpy as np

n = 2880
m = 2880
dummy_matrix = np.random.rand(m, n)  # fake data; replace with the real similarity matrix
fig = plt.figure(figsize=(50, 50))  # change the figsize to control the resolution
ax = fig.add_subplot(111)
cmap = plt.get_cmap("Blues")  # you may use another built-in colormap or define your own
# If your data is not in the range [0, 1], use a normalization. Here it is normalized by the min and max values.
norm = Normalize(vmin=np.amin(dummy_matrix), vmax=np.amax(dummy_matrix))
image = ax.imshow(dummy_matrix, cmap=cmap, norm=norm)
plt.colorbar(image)
plt.show()
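Since the question asks for "divisions" rather than 2880 individual labels, one option might be to draw a thin line at each place boundary and label only the 10 blocks. This is a rough sketch, assuming the matrix is ordered place by place with equally sized blocks (288 rows each for 2880x2880); the function name and the `Place01`-style labels are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def heatmap_with_place_divisions(matrix, n_places=10):
    """Square heatmap with lines separating equally sized place blocks.

    Assumption: rows and columns are grouped place by place.
    """
    size = matrix.shape[0]
    block = size // n_places  # rows per place

    fig, ax = plt.subplots(figsize=(8, 8))
    image = ax.imshow(matrix, cmap="Blues")

    # imshow draws pixel i over [i - 0.5, i + 0.5], so the boundary
    # between two blocks sits half a pixel before the next block starts.
    boundaries = np.arange(1, n_places) * block - 0.5
    for b in boundaries:
        ax.axhline(b, color="k", linewidth=0.5)
        ax.axvline(b, color="k", linewidth=0.5)

    # One tick per block, placed at the block centre.
    centres = (np.arange(n_places) + 0.5) * block - 0.5
    labels = [f"Place{i + 1:02d}" for i in range(n_places)]
    ax.set_xticks(centres)
    ax.set_xticklabels(labels, rotation=90)
    ax.set_yticks(centres)
    ax.set_yticklabels(labels)
    fig.colorbar(image, ax=ax)
    return fig, ax, boundaries
```

Used as `fig, ax, _ = heatmap_with_place_divisions(dummy_matrix)` and then `fig.savefig("heatmap_divisions.png", dpi=300)`, this keeps the overview readable for a presentation without any per-row tags.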
Another idea: perhaps computing the similarity matrix is unnecessary. You could plot the original 288 * 10 data using a radial heatmap or just a normal heatmap, and read the similarity directly from the color distribution.
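To illustrate that last idea, here is a minimal sketch. Each of the original arrays has 28 to 31 values, so to get a single 288 * 10 grid this assumes every array is reduced to one number, here its mean; that reduction is my assumption, not something stated in the question. The helper names `summarise` and `plot_summary_heatmap` and the nested-list input layout are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def summarise(arrays_by_place):
    """Reduce every array to its mean.

    Assumption: arrays_by_place[p][k] holds the measurements of place p
    for month-hour slot k (288 slots per place in the question's setup).
    Arrays may have different lengths (28 to 31 values).
    """
    return np.array([[np.mean(a) for a in place] for place in arrays_by_place])


def plot_summary_heatmap(summary):
    """Plot one row per place and one column per month-hour slot."""
    fig, ax = plt.subplots(figsize=(12, 4))
    # aspect="auto" stretches the cells so a wide, short grid fills the axes.
    image = ax.imshow(summary, cmap="Blues", aspect="auto")
    ax.set_xlabel("month-hour slot")
    ax.set_ylabel("place")
    fig.colorbar(image, ax=ax)
    return fig, ax
```

Places with similar rows of color would then stand out as similar without going through the 2880x2880 comparison at all, at the cost of hiding the day-to-day variation inside each array.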
Upvotes: 1