Reputation: 131
I am trying to plot the frequency count of the following values (stored in test_plot_bins.txt) in gnuplot:
-0.355534673
0.13217762
-0.842048585
-0.131302223
0.03265043
0.190827265
-0.25680709
-0.072149448
-0.156086803
0.065009468
-0.003330263
-0.059023393
-0.017556788
-0.06090761
-0.205948548
-0.18360047
using this script
#!/usr/bin/gnuplot -persist
clear
reset
set key off
set border 3
set boxwidth 0.05 absolute
set style fill solid 1.0 noborder
bin_width = 0.01;
bin_number(x) = floor(x/bin_width)
rounded(x) = bin_width * ( bin_number(x) + 0.5 )
plot 'test_plot_bins.txt' using (rounded($1)) :(1) smooth frequency with boxes
Here is the result. How do I change it so that it becomes a normal line plot?
Secondly, when I duplicate a value so that its count becomes 2 and then plot it, that value stands out and dwarfs the others. Here is what it looks like.
How do I scale the plot so that the other values, with a count of 1, are also visible?
Upvotes: 0
Views: 494
Reputation: 15093
To get a line plot you can replace "with boxes" by "with lines". However, this is probably not the best way to display this sort of binned data. I suggest using the online demo bins.dem as a guide. I copy it here for reference.
#
# Demo illustrating the relationship between
# a binned histogram and a kernel density model of the same data.
#
$DATA << EOD
1 1
2 1
8 1
9 1
17 1
17 1
9 1
9 1
5 1
7 1
7 1
8 1
8 1
8 1
10 1
11 1
11 1
12 1
14 1
3 1
3 1
3 1
8 7
15 1
17 1
17 1
18 1
19 1
20 1
EOD
set title "Comparison of a binned histogram and\na kernel density model of the same data"
set style data lines
set xtics 1 norangelimit nomirror
set grid y
set yrange [0:5.5]
set style fill solid 0.5 noborder
set jitter spread 0.5
plot $DATA using 1 bins=20 with boxes title '20 bins', \
'' using 1:(1) smooth kdensity bandwidth .5 lw 2 title 'smooth kdensity', \
'' using 1:(.9) with impulse lc "black" title 'jittered data'
Notes:
bins=20 will automatically divide the data into 20 equal-width bins along x. There is an optional keyword binrange, but if no binrange is given, the range is taken from the extremes of the x values found in the data.
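As a rough sketch of how the bins option could be applied to the file from the question: the bin count and the binrange limits below are assumed values, chosen only so that they bracket the sample data, and should be adjusted to taste.

```gnuplot
# Sketch: built-in binning applied to the question's data file.
# bins=20 and binrange [-1.0:0.5] are assumed values, not taken from the question.
set key off
set style fill solid 0.5 noborder
# Start the y axis at zero so that bins holding a single value still have visible height.
set yrange [0:*]
plot 'test_plot_bins.txt' using 1 bins=20 binrange [-1.0:0.5] with boxes
```

Leaving out binrange falls back to the extremes of the data, as described above.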
The set jitter command adds a small displacement to points, or in this case impulses, that would otherwise lie exactly on top of each other. It has no effect on the boxes or lines.
smooth kdensity is one of many options for producing a smooth line; see the gnuplot documentation for full details. It generates a function f(x) in which each point in the data set acts as the center of a Gaussian weighted by distance from x; the resulting curve is the sum of these contributions. Selecting a bandwidth smaller than the spacing between points ensures that each local maximum gets its own peak.
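For the data in the question, a kernel density curve might be drawn along these lines. The bandwidth of 0.05 is an assumption picked for illustration; a smaller value gives a spikier curve, a larger one a smoother curve.

```gnuplot
# Sketch: kernel density curve for the question's data file.
# The bandwidth 0.05 is an assumed value; tune it to the spacing of your data.
set key off
set yrange [0:*]
plot 'test_plot_bins.txt' using 1:(1) smooth kdensity bandwidth 0.05 with lines lw 2
```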
Each of the three representations (boxes, lines, and impulses) could be plotted independently. This plot combines all three modes for comparison.
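For completeness, here is a minimal sketch of the script from the question with boxes swapped for lines. The set yrange line is my own addition and rests on an assumption: that the bars with a count of 1 seem to vanish because the autoscaled y axis starts at 1 rather than 0.

```gnuplot
# Sketch: the question's script, drawn as a line plot instead of boxes.
reset
set key off
set border 3
# Assumption: the count-1 bins disappear because autoscaling puts the bottom
# of the y axis at 1; forcing it to start at 0 keeps them visible.
set yrange [0:*]
bin_width = 0.01
bin_number(x) = floor(x/bin_width)
rounded(x) = bin_width * (bin_number(x) + 0.5)
plot 'test_plot_bins.txt' using (rounded($1)):(1) smooth frequency with lines
```

The same yrange setting can be kept if you go back to boxes.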
Upvotes: 3