Reputation: 318
(Edit: Histogram removed, not relevant and confusing.)
I want a boxplot that visualizes statistical data. I made two data files, one for each type of data. The first column holds the level (the x value), the second column the value. There is one line per data point and several points per level. I want the same levels from the two files shown next to each other for comparison. I came up with the following code:
Tournament5 = "#99ffff"; Sigmascaling = "#4671d5"
set terminal pngcairo
set output "generations_dev.png"
set yrange [0:17.5]
set ylabel "Maximum Compactness of Best Solutions"
set xlabel "Number of Generations"
set autoscale fix
set style fill solid 0.25 border -1
set style boxplot nooutliers pointtype 7 separation 3
set style data boxplot
set boxwidth 1
plot "generation_tour.data" using (1.0):2:(0):1, "generation_sig.data" using (2.0):2:(0):1
which gives me the following picture:
Now my problems/questions are:
Thank you for any help!
Data File generation_tour.data
Upvotes: 2
Views: 1444
Reputation: 48390
Ok, that seems to be a bit tricky.
Two things. First, gnuplot seems to fail to produce a correct autoscale for the x-values in your plotting case, so you would need to set an explicit xrange, as you already do for the yrange. Second, gnuplot always seems to use the values given in the levels column as xticlabel, without giving you the chance to suppress them.
Here is a possible solution. It relies on the data file being organized in blocks: lines with equal values in the first column are kept together, and each block is separated from the next one by two empty lines. You can then access each block via the index keyword and iterate over the blocks:
...
"0" 14.49786677484523
"0" 14.691225516174955
"20" 10.28997920528754
"20" 8.764312035687594
...
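To check that a file really has this layout before plotting, you can count its blocks from the shell. This is only a sketch: `sample.data` and its values are made up for illustration, and the awk rule assumes that a run of two or more empty lines separates blocks, which is how the index keyword splits a file.

```shell
# Hypothetical sample file: two blocks separated by two empty lines.
printf '%s\n' '"0" 14.4' '"0" 14.6' '' '' '"20" 10.2' '"20" 8.7' > sample.data

# Count blocks: a new block starts at the first data line of the file and
# at every data line that follows a run of at least two empty lines.
blocks=$(awk '
    NF == 0 { empty++; next }
    seen == 0 || empty >= 2 { n++ }
    { seen = 1; empty = 0 }
    END { print n + 0 }
' sample.data)

echo "$blocks"
```

If the count is 1 for a file that contains several levels, the separating empty lines are probably missing.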
Then you can use the following script to plot all those boxplots in the positions you want:
Tournament5 = "#99ffff"; Sigmascaling = "#4671d5"
set terminal pngcairo
set output "generations_dev.png"
set yrange [0:17.5]
set ylabel "Maximum Compactness of Best Solutions"
set xlabel "Number of Generations"
set autoscale xfix
set style fill solid 0.25 border -1
set style boxplot nooutliers pointtype 7
set style data boxplot
set boxwidth 1
stats "generation_sig.data" using 2 nooutput
plot for [i=0:STATS_blocks-1] "generation_sig.data" using (3*i):2 index i lt 1 title (i==0 ? 'Sigmascaling' : ''),\
for [i=0:STATS_blocks-1] "generation_tour.data" using (3*i+1):2 index i lt 2 title (i==0 ? 'Tournament 5' : ''),\
for [i=0:STATS_blocks-1] "generation_sig.data" using (3*i+0.5):(-1):xticlabel(1) index i w l notitle
The stats call is used to count the number of blocks contained in the data file. The third plot is explicitly placed outside the defined yrange; it only produces the xtics in the middle between the two boxplots. You could also have used
plot for [i=0:STATS_blocks-1] "generation_sig.data" using (3*i):2:(0):1 index i lt 1,\
for [i=0:STATS_blocks-1] "generation_tour.data" using (3*i+1):2 index i lt 2
that would give you xtics centered under the first of the two boxplots.
The output is
If you don't want to change the data files and you can use awk, then you could also add the empty lines on-the-fly with
cmd(file) = '< awk ''{if (NR != 1 && $1 != prev) print "\n"; prev=$1; print}'' '.file
plot for [i=0:STATS_blocks-1] cmd("generation_sig.data") using (3*i):2 index i lt 1 # ....
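Before wiring the awk filter into gnuplot, it can help to run it on its own and look at what it emits. The sketch below uses a made-up file `flat.data` with invented values; the awk program is the one from cmd(file), written with ordinary shell quoting instead of gnuplot's doubled single quotes. If the data files are left unchanged, the stats call presumably has to read through the same filter as well (for example `stats cmd("generation_sig.data") using 2 nooutput`), because the raw file without empty lines contains only a single block.

```shell
# Hypothetical flat file: levels are grouped, but there are no empty lines.
printf '%s\n' '"0" 14.4' '"0" 14.6' '"20" 10.2' '"20" 8.7' > flat.data

# Same awk program as in cmd(file): whenever the first column changes,
# print "\n" (plus awk's own trailing newline), i.e. two empty lines.
awk '{if (NR != 1 && $1 != prev) print "\n"; prev=$1; print}' flat.data > blocked.data

# Show the result and count its lines (4 data lines + 2 empty lines).
cat blocked.data
lines=$(wc -l < blocked.data | tr -d ' ')
echo "$lines"
```

Note that the filter only compares each line with the previous one, so it assumes that lines with the same level are already adjacent in the file.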
Upvotes: 2