Reputation: 763
I need to save a figure (with 8 subplots on it) generated with matplotlib in Python 3.2. I need to save the figure on a single PDF page. Each subplot may have 240k to 400k data points.
My code:
from matplotlib.backends.backend_pdf import PdfPages
plt.show(block=False)
pp = PdfPages('multipage.pdf')
fig = plt.figure()
fig.savefig('figure_1.pdf', dpi = fig.dpi)
pp.close()
But only an empty PDF file was created, with no figures in it. Any help would be appreciated.
UPDATE: This is demo code:
def plot_pdf_example():
    fig = plt.figure()
    # I create subplots here
    #x = np.random.rand(50)
    #y = np.random.rand(50)
    plt.plot(x, y, '.')
    fig.savefig('figure_b.pdf')
if __name__ == '__main__':
    r = plot_pdf_example()
    # the return value of r is not 0 for my case
    print("donne")
If I use plt.show() to get the figure in a pop-up window, some titles and legends overlap between subplots. How can I adjust the pop-up figure so that I get all subplots without any overlaps while also keeping all subplots square? Keeping them square is very important for me.
Upvotes: 1
Views: 1185
Reputation: 2707
Your code does save the single, empty figure fig to the file figure_1.pdf, without making any use of PdfPages. It is also normal that the PDF file is empty, since you are not plotting anything in fig. Below is an MWE that shows how to save a single figure to a single PDF file. I've removed everything related to PdfPages, which was not necessary.
Update (2015-07-27): When there is some problem saving a fig to pdf because there is too much data to render or in the cases of complex and detailed colormaps, it may be a good idea to rasterize some of the elements of the plot that are problematic. The MWE below has been updated to reflect that.
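import matplotlib.pyplot as plt
import numpy as np
import time

plt.close("all")
fig = plt.figure()

# Random data: N points, each with a random color
N = 400000
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = 3  # marker size in points^2

ax0 = fig.add_axes([0.1, 0.1, 0.85, 0.85])
scater = ax0.scatter(x, y, s=area, c=colors)
# Rasterize only the scatter; axes and text stay as vector graphics
scater.set_rasterized(True)
plt.show(block=False)

# Time the save (time.clock() was removed in Python 3.8;
# time.perf_counter() is its replacement in Python 3.3+)
ts = time.perf_counter()
fig.savefig('figure_1.pdf')
te = time.perf_counter()
print('t = %f sec' % (te-ts))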
On my machine, the code above took about 6.5 sec to save the PDF when rasterized was set to True for scater, while it took 61.5 sec when it was set to False.
By default, when saving to PDF, the figure is saved in a vector format. This means that every point is saved as a set of parameters (color, size, position, etc.). This is a lot of information to store when there is a lot of data (8 * 400k points in the OP's case). When some elements of the plot are converted to a raster format, the number of points plotted does not matter, because the image is saved as a fixed number of pixels instead (as in a PNG). By rasterizing only scater, the rest of the figure (axes, labels, text, legend, etc.) still remains in vector format. So overall, the loss in quality is barely noticeable for some types of graphs (like colormaps or scatter plots), but it will be visible for graphs with sharp lines.
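For the 8-subplot case in the question, the same idea could be applied per subplot. The following is only a sketch under a few assumptions: the function name save_square_subplots, the 2x4 grid, the figure size and the random data are all hypothetical placeholders, the Agg backend is selected so that no window is required, and Axes.set_box_aspect needs matplotlib 3.3 or newer (so it would not be available on an old Python 3.2 setup). It keeps each subplot square, rasterizes only the data points, and calls tight_layout to reduce overlaps between titles and neighbouring subplots.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no pop-up window
import matplotlib.pyplot as plt
import numpy as np


def save_square_subplots(filename, n_points=400_000, seed=0):
    """Save a 2x4 grid of square, rasterized point plots to one PDF page.

    Hypothetical helper for illustration; the data are random placeholders.
    Returns the number of subplots drawn.
    """
    rng = np.random.default_rng(seed)
    fig, axes = plt.subplots(2, 4, figsize=(16, 8.5))
    for i, ax in enumerate(axes.flat):
        x = rng.random(n_points)
        y = rng.random(n_points)
        # rasterized=True stores the points as an image inside the PDF
        ax.plot(x, y, ".", markersize=1, rasterized=True, label="data")
        # Force a square plotting box regardless of the data limits
        # (requires matplotlib >= 3.3)
        ax.set_box_aspect(1)
        ax.set_title("Subplot %d" % (i + 1))
        # A fixed location avoids the slow "best" search with many points
        ax.legend(loc="upper right")
    # Adjust spacing so titles and tick labels do not overlap
    fig.tight_layout()
    # dpi sets the resolution of the rasterized parts only
    fig.savefig(filename, dpi=150)
    plt.close(fig)
    return axes.size


if __name__ == "__main__":
    save_square_subplots("figure_8_subplots.pdf")
```

Because only the point markers are rasterized, the titles, tick labels and legends should remain selectable vector text in the resulting PDF. If the legends still collide with the data, a larger figsize or a different loc value may be worth trying.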
Upvotes: 1