Reputation: 2713
I could not describe what the plot looks like, so I just call it "strange"; I have no idea why gnuplot gives me such a plot. Here is what I am trying to do.
I have a data file with two columns: the first column is the file name and the second is the size of each file. The file has more than 2 million rows. I just want to plot the distribution of file sizes. Here is my code:
set terminal postscript landscape enhanced mono dashed lw 2 "Times" 18
outputfile = "sizedist.ps"
set output outputfile
binwidth = 0.05
bin(x,width)=width*floor(x/width)
plot [0:3.5][]'sizedist.out' using (bin(log10($2/1024),binwidth)):(1.0) smooth freq with boxes t "Binsize=0.05 dex"
set terminal x11
Ideally, it should be a single Gaussian-like bar plot, but it has many other plots overlaid on it (see my attachment). Does any gnuplot expert know why this happens?
Upvotes: 0
Views: 265
Reputation: 2000
This happens if some of the data going into the frequency plot do not have well-defined values (such as NaN or inf).
Since you are taking a logarithm in the plot, you have to be careful with data values <= 0. I guess you have files with size 0. In these cases log10 does not return a finite number (the result is undefined/NaN),
and this messes up the counting procedure of the frequency plot.
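To confirm that such entries are really present, you could count them first. This is only a sketch: it assumes gnuplot 4.6 or later (where the stats command is available) and the same whitespace-separated 'sizedist.out' layout as in the question.

```gnuplot
# Assumption: gnuplot >= 4.6, which provides the stats command.
# Each row contributes 1 if its size is <= 0 and 0 otherwise,
# so the sum of that expression is the number of problematic rows.
stats 'sizedist.out' using ($2 <= 0 ? 1 : 0) nooutput
print "rows with size <= 0: ", STATS_sum
print "rows read in total:  ", STATS_records
```

If the first number is greater than zero, those rows are the ones for which log10 has no finite value.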
Add a condition to your plot command to fix this. For example:
plot [0:3.5][]'sizedist.out' using ($2>0?bin(log10($2/1024),binwidth):0):(1.0) smooth freq with boxes t "Binsize=0.05 dex"
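One side effect of the command above is that every zero-size file is counted in the bin at x=0, which is also where files of about 1 KB land (log10(1)=0). If that matters, a possible variant is to keep the x value defined but give those rows a weight of zero. This is a sketch, not a tested drop-in: valid() is a helper name introduced here for illustration, and the other definitions are copied from the question.

```gnuplot
binwidth = 0.05
bin(x,width) = width*floor(x/width)

# Hypothetical helper (not in the original post):
# weight 1 for positive sizes, weight 0 for everything else.
valid(s) = (s > 0) ? 1.0 : 0.0

# x is always a defined number, so smooth freq never sees an undefined
# point; rows with size <= 0 add 0 to the bin at x=0 instead of 1.
plot [0:3.5][] 'sizedist.out' \
     using ($2 > 0 ? bin(log10($2/1024), binwidth) : 0):(valid($2)) \
     smooth freq with boxes t "Binsize=0.05 dex"
```

Both forms rely on the ternary operator evaluating only the selected branch, so log10 is never called on a non-positive size.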
Upvotes: 1